git-svn-id: file:///home/ksmith/gitmigration/svn/xmpp/trunk@1842 4b5297f7-1745-476d-ba37-a9c6900126ab
Peter Saint-Andre 2008-05-13 01:50:30 +00:00
parent 27dd68e062
commit 6782945240
1 changed files with 72 additions and 40 deletions

@ -22,14 +22,20 @@
</dependencies>
<supersedes>None</supersedes>
<supersededby>None</supersededby>
<shortname>TO BE ASSIGNED</shortname>
<shortname>NOT_YET_ASSIGNED</shortname>
&ianpaterson;
&stpeter;
<revision>
<version>0.9</version>
<date>2008-05-12</date>
<initials>psa</initials>
<remark><p>Specified that cid field is required; more clearly defined structure of challenge stanza; added several security considerations; corrected several errors in the text and examples.</p></remark>
</revision>
<revision>
<version>0.8</version>
<date>2008-05-12</date>
<initials>psa</initials>
<remark><p>Move text regarding labels to new internationalization considerations section; removed necessity that ID of IQ-set shall match ID of challenge, since this is not consistent with existing usage that IDs are generated by the sender of an IQ.</p></remark>
<remark><p>Moved text regarding labels to new internationalization considerations section; removed necessity that ID of IQ-set shall match ID of challenge, since this is not consistent with existing usage that IDs are generated by the sender of an IQ.</p></remark>
</revision>
<revision>
<version>0.7</version>
@ -83,9 +89,10 @@
<section1 topic='Introduction' anchor='intro'>
<p>The appearance of large public IM services based on &rfc3920; and &rfc3921; makes it desirable to implement protocols that <em>discourage</em> the sending of large quantities of instant messaging spam (a.k.a. "spim") or, in general, abusive traffic. Abusive stanzas could be generated by XMPP clients connected to legitimate servers or by XMPP servers with virtual clients, where the malicious entities are hosted on networks of "zombie" machines. Such abusive stanzas could take many forms; a full taxonomy is outside the scope of this document.</p>
<p>Several of the most effective techniques developed to combat abusive messages and behavior via non-XMPP technologies require humans to be differentiated from bots using a "Completely Automated Public Turing Test to Tell Computers and Humans Apart" or CAPTCHA (see &lt;<link url='http://www.captcha.net/'>http://www.captcha.net/</link>&gt;). These challenge techniques are easily adapted to discourage XMPP abuse. The very occasional inconvenience of responding to a CAPTCHA (e.g., when creating an IM account or sending a message to a new correspondent) is small and perfectly acceptable -- especially when compared to the countless robot-generated interruptions people might otherwise have to filter every day.</p>
<p>An alternative technique to CAPTCHAs requires Desktop PC clients to undertake a <span class='ref'>Hashcash</span> <note>Hashcash &lt;<link url='http://hashcash.org/'>http://hashcash.org/</link>&gt;.</note> challenge. These are completely transparent to PC users. They require clients to perform specified CPU-intensive work, making it difficult to send large amounts of spim.</p>
<p>The generic challenge protocol described in this document is designed for incorporation into protocols such as &xep0077;, &xep0045;, and &xep0159;.</p>
<p>One technique developed to combat abusive messages and behavior via non-XMPP technologies requires humans to be differentiated from bots using a "Completely Automated Public Turing Test to Tell Computers and Humans Apart" or CAPTCHA (see &lt;<link url='http://www.captcha.net/'>http://www.captcha.net/</link>&gt;). These challenge techniques are easily adapted to discourage XMPP abuse. The very occasional inconvenience of responding to a CAPTCHA (e.g., when creating an IM account or sending a message to a new correspondent) is small and perfectly acceptable -- especially when compared to the countless robot-generated interruptions people might otherwise have to filter every day.</p>
<p>An alternative technique to CAPTCHAs requires Desktop PC clients to undertake a <span class='ref'>Hashcash</span> <note>Hashcash &lt;<link url='http://hashcash.org/'>http://hashcash.org/</link>&gt;.</note> challenge. These are completely transparent to PC users. They require clients to perform specified CPU-intensive work, making it difficult to send large amounts of abusive traffic.</p>
<p>Both CAPTCHAs and hashcash have been criticized regarding their effectiveness (or lack thereof). Therefore, the challenge protocol specified herein provides a great deal of flexibility, so that challenges can include CAPTCHAs, hashcash, word puzzles, so-called kitten authentication, and any other mechanism that may be developed in the future.</p>
<p>The generic challenge protocol described in this document is designed for incorporation into protocols such as &xep0077;, &xep0045;, &xep0016;, and &xep0159;.</p>
</section1>
<section1 topic='Requirements' anchor='require'>
@ -97,10 +104,11 @@
</section2>
</section1>
<section1 topic='Protocol Usage' anchor='protocol'>
<section1 topic='Protocol' anchor='protocol'>
<section2 topic='Simple Challenge' anchor='protocol-simple'>
<p>An entity (client or server) MAY send a challenge immediately after receiving a stanza from another entity. An entity MUST NOT send challenges under any other circumstances. Hereafter, the entity that generates the stanza that triggers the challenge is called the "sender" and the entity that sends the challenge is called the "challenger".</p>
<example caption='Sender Generates Stanza'><![CDATA[
<section3 topic='Triggering Stanza' anchor='protocol-trigger'>
<p>A "triggering stanza" is an XMPP &MESSAGE;, &PRESENCE;, or &IQ; stanza that is deemed abusive by the receiving entity (e.g., a client) or an intermediate router (e.g., a server). The entity that generates a triggering stanza is called a "sender".</p>
<example caption='Sender Generates Triggering Stanza'><![CDATA[
<message from='robot@abuser.com/zombie'
to='innocent@victim.com'
xml:lang='en'
@ -110,10 +118,24 @@
<url>http://www.abuser.com/lovepills.html</url>
</x>
</message>
]]></example>
]]></example>
</section3>
<section3 topic='Challenge Stanza' anchor='protocol-challenge'>
<p>The challenge consists of a message containing a form for the sender to fill out, formatted according to &xep0004;. Each of the challenge form's &lt;field/&gt; elements that are not hidden MAY contain a different challenge and any media required for the challenge (see &xep0221;). The hidden 'from' field MUST contain the value of the 'to' attribute of the sender's triggering stanza. If the stanza from the sender included an 'id' attribute then the hidden 'sid' field MUST be set to that value. The 'xml:lang' attribute of the challenge stanza SHOULD be the same as the one received from the sender. In accordance with &xep0068;, the hidden 'FORM_TYPE' field MUST have a value of "urn:xmpp:tmp:challenge" &NSNOTE;.</p>
<p>The challenger SHOULD include an explanation (in the &BODY; element) for clients that do not support this protocol. The challenger MAY also include a URL (typically a Web page with instructions) using &xep0066; as an alternative for clients that do not support the challenge form. Note: Even if it provides a URL, a challenger MUST always provide a challenge form. <note>A constrained client, like a mobile phone, cannot present a Web page to its user.</note></p>
<p>Upon receiving a triggering stanza, an entity MAY send a "challenge stanza". An entity MUST NOT send a challenge stanza under any other circumstances. The entity that generates the challenge stanza is called the "challenger".</p>
<p>The challenge stanza consists of an XMPP &MESSAGE; stanza containing a data form for the sender to fill out, formatted according to &xep0004;, optionally along with a &BODY; and other elements. The following rules apply to the challenge stanza.</p>
<ol>
<li>The challenge stanza MUST include an 'id' attribute set to the challenge ID (i.e., a unique identifier for this challenge within the challenger's application).</li>
<li>The challenge stanza SHOULD include a &BODY; element that provides an explanation of the challenge for clients that do not yet support challenge forms.</li>
<li>The challenge stanza MAY include a URL (typically a Web page with instructions) using &xep0066; as an alternative for clients that do not yet support challenge forms.</li>
<li>The 'xml:lang' attribute of the challenge stanza SHOULD be the same as the one received from the sender, if any.</li>
<li>The challenge stanza MUST include a challenge form, i.e., a data form of type "form" containing one or more challenges. <note>Inclusion of a challenge form not only makes it possible to flexibly support or require a large number of challenge types, but also enables constrained clients to respond to challenges (e.g., mobile phone clients that cannot present web pages, or clients on XMPP-only networks).</note></li>
<li>The challenge form MUST include a hidden field named "FORM_TYPE" (in accordance with &xep0068;) whose value MUST be "urn:xmpp:tmp:challenge" &NSNOTE;.</li>
<li>The challenge form MUST include a hidden field named "cid" set to the challenge ID.</li>
<li>The challenge form MUST include a hidden field named "from" set to the value of the 'to' attribute from the triggering stanza.</li>
<li>If the triggering stanza included an 'id' attribute, then the challenge form MUST include a hidden field named "sid" set to that value.</li>
<li>Each of the challenge form's non-hidden &lt;field/&gt; elements MAY contain a different challenge.</li>
<li>Each challenge field MAY contain a media element (see &xep0221;) that in turn contains media (and/or a pointer to media) that the sender shall use in solving puzzles, performing optical character recognition, identifying audio or video samples, etc. When the sender replies to a media element via a data form of type "submit", the field type SHOULD be "text-single" (which is the default for data form fields) but MAY in turn include a media element if acceptable to the challenger application.</li>
</ol>
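<p>The hidden-field rules above can be checked mechanically. The following is a minimal sketch, not part of the protocol, that assumes the challenge stanza is available as an XML string; the helper name <tt>check_challenge</tt> and its parameters are hypothetical, and only the 'id' attribute, the form type, and the FORM_TYPE, cid, from, and sid fields are inspected.</p>
<example caption='Illustrative Check of a Challenge Stanza (Python sketch)'><![CDATA[
```python
import xml.etree.ElementTree as ET

X_DATA = "{jabber:x:data}"            # data forms namespace (XEP-0004)
FORM_TYPE = "urn:xmpp:tmp:challenge"  # temporary namespace used in this document


def check_challenge(stanza_xml, trigger_to, trigger_id=None):
    """Return a list of rule violations found in a challenge stanza.

    Hypothetical helper: trigger_to and trigger_id are the 'to' and 'id'
    attributes of the triggering stanza (trigger_id may be None).
    """
    msg = ET.fromstring(stanza_xml)
    problems = []
    challenge_id = msg.get("id")
    if not challenge_id:
        problems.append("message lacks an 'id' attribute")
    # Look for the data form anywhere below the message element.
    form = msg.find(f".//{X_DATA}x")
    if form is None or form.get("type") != "form":
        problems.append("no data form of type 'form'")
        return problems
    # Collect the hidden fields as {var: first value}.
    hidden = {
        field.get("var"): field.findtext(f"{X_DATA}value")
        for field in form.findall(f"{X_DATA}field")
        if field.get("type") == "hidden"
    }
    expected = {"FORM_TYPE": FORM_TYPE, "cid": challenge_id, "from": trigger_to}
    if trigger_id is not None:
        expected["sid"] = trigger_id
    for var, value in expected.items():
        if var not in hidden or hidden[var] != value:
            problems.append(f"hidden field '{var}' is missing or wrong")
    return problems
```
]]></example>
<p>An empty list means that none of the checked rules is violated; it does not imply that the stanza satisfies the rules that this sketch leaves out (for example, the presence of at least one challenge field).</p>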
<example caption='Challenger Offers a Choice of Challenges to Sender'><![CDATA[
<message from='victim.com'
to='robot@abuser.com/zombie'
@ -132,8 +154,9 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field type='hidden' var='from'><value>innocent@victim.com</value></field>
<field type='hidden' var='cid'><value>F3A6292C</value></field>
<field type='hidden' var='sid'><value>spam1</value></field>
<field var='ocr'>
<field var='ocr' label='Enter the text you see'>
<media xmlns='urn:xmpp:tmp:media-element'
height='80'
width='290'>
@ -144,7 +167,7 @@
type='image/jpeg'> ** Base64 encoded image ** </data>
</media>
</field>
<field var='picture_recog'>
<field var='picture_recog' label='Identify the picture'>
<media xmlns='urn:xmpp:tmp:media-element'
height='150'
width='150'>
@ -155,7 +178,7 @@
type='image/jpeg'> ** Base64 encoded image ** </data>
</media>
</field>
<field var='speech_recog'>
<field var='speech_recog' label='Enter the words you hear'>
<media xmlns='urn:xmpp:tmp:media-element'>
<uri type='audio/x-wav'>
http://www.victim.com/challenges/speech.wav?F3A6292C
@ -165,7 +188,7 @@
</uri>
</media>
</field>
<field var='video_recog'>
<field var='video_recog' label='Identify the video'>
<media xmlns='urn:xmpp:tmp:media-element'
height='150'
width='150'>
@ -225,6 +248,7 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field var='from'><value>innocent@victim.com</value></field>
<field var='cid'><value>F3A6292C</value></field>
<field var='sid'><value>spam1</value></field>
<field var='ocr'><value>7nHL3</value></field>
</x>
@ -272,7 +296,7 @@
</section3>
</section2>
<section2 topic='Multiple Challenges' anchor='protocol-multiple'>
<p>The challenger MAY demand responses to more than one of the challenges it is offering; this is done by including an 'answers' &lt;field/&gt; element in the form. The challenger also MAY require responses to particular challenges; this is done by including &lt;required/&gt; elements in the compulsory fields.</p>
<p>The challenger MAY demand responses to more than one of the challenges it is offering; this is done by including an 'answers' &lt;field/&gt; element in the form, which specifies how many answers the sender needs to include. The challenger also MAY require responses to particular challenges; this is done by including a &lt;required/&gt; element in the compulsory fields.</p>
<example caption='Challenger Sets Multiple Challenges'><![CDATA[
<message from='victim.com'
to='robot@abuser.com/zombie'
@ -286,9 +310,10 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field type='hidden' var='from'><value>innocent@victim.com</value></field>
<field type='hidden' var='cid'><value>73DE28A2</value></field>
<field type='hidden' var='sid'><value>spam2</value></field>
<field type='hidden' var='answers'><value>2</value></field>
<field var='ocr'>
<field var='ocr' label='Enter the text you see'>
<media xmlns='urn:xmpp:tmp:media-element'
height='80'
width='290'>
@ -297,7 +322,7 @@
</uri>
</media>
</field>
<field var='audio_recog'>
<field var='audio_recog' label='Describe the sound you hear'>
<media xmlns='urn:xmpp:tmp:media-element'>
<uri type='audio/x-wav'>
http://www.victim.com/challenges/audio.wav?F3A6292C
@ -326,9 +351,10 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field var='from'><value>innocent@victim.com</value></field>
<field var='cid'><value>73DE28A2</value></field>
<field var='sid'><value>spam2</value></field>
<field var='answers'><value>2</value></field>
<field var='qa'><value>divad</value></field>
<field var='qa'><value>red</value></field>
<field var='SHA-256'><value>innocent@victim.com2450F06C173B05E3</value></field>
</x>
</challenge>
@ -355,8 +381,9 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field type='hidden' var='cid'><value>F3A6292C</value></field>
<field type='hidden' var='sid'><value>reg1</value></field>
<field type='hidden' var='answers'><value>3</value></field>
<field var='ocr'>
<field var='ocr' label='Enter the text you see'>
<media xmlns='urn:xmpp:tmp:media-element'
height='80'
width='290'>
@ -382,7 +409,6 @@
</query>
</iq>
]]></example>
<p>The server MAY include an &lt;instructions/&gt; element and a URL using <cite>Out-of-Band Data</cite> (e.g., a web page) in the &QUERY; element (see example above). <cite>In-Band Registration</cite> recommends that the challenger SHOULD submit the completed x:data form; however, if it does not understand the form, then it MAY present the instructions and the included URL to the user instead of providing the required information in-band.</p>
<example caption='Entity Provides Required Information In-Band'><![CDATA[
<iq type='set' xml:lang='en' id='reg2'>
@ -392,6 +418,7 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field var='cid'><value>F3A6292C</value></field>
<field var='sid'><value>reg1</value></field>
<field var='answers'><value>3</value></field>
<field var='ocr'><value>7nHL3</value></field>
<field var='username'><value>bill</value></field>
@ -406,10 +433,10 @@
<p>A service that hosts multi-user chat rooms in accordance with <cite>XEP-0045</cite> MAY challenge unknown entities that seek to join such rooms or that send messages in such rooms.</p>
<example caption='Sender Attempts to Join Chat Room'><![CDATA[
<presence from='robot@abuser.com/zombie'
to='friendly-chat@muc.victim.com'/>
to='friendly-chat@muc.victim.com/robot101'/>
]]></example>
<example caption='Challenger Offers a Choice of Challenges to Sender'><![CDATA[
<message from='muc.victim.com'
<message from='friendly-chat@muc.victim.com'
to='robot@abuser.com/zombie'
id='A4C7303D'>
<body>
@ -424,9 +451,9 @@
<field type='hidden' var='FORM_TYPE'>
<value>urn:xmpp:tmp:challenge</value>
</field>
<field type='hidden' var='from'><value>muc.victim.com</value></field>
<field type='hidden' var='sid'><value>spam3</value></field>
<field var='ocr'>
<field type='hidden' var='from'><value>friendly-chat@muc.victim.com</value></field>
<field type='hidden' var='cid'><value>A4C7303D</value></field>
<field var='ocr' label='Enter the text you see'>
<media xmlns='urn:xmpp:tmp:media-element'
height='80'
width='290'>
@ -437,7 +464,7 @@
type='image/jpeg'> ** Base64 encoded image ** </data>
</media>
</field>
<field var='picture_recog'>
<field var='picture_recog' label='Identify the picture'>
<media xmlns='urn:xmpp:tmp:media-element'
height='150'
width='150'>
@ -448,7 +475,7 @@
type='image/jpeg'> ** Base64 encoded image ** </data>
</media>
</field>
<field var='speech_recog'>
<field var='speech_recog' label='Enter the words you hear'>
<media xmlns='urn:xmpp:tmp:media-element'>
<uri type='audio/x-wav'>
http://www.victim.com/challenges/speech.wav?A4C7303D
@ -458,7 +485,7 @@
</uri>
</media>
</field>
<field var='video_recog'>
<field var='video_recog' label='Identify the video'>
<media xmlns='urn:xmpp:tmp:media-element'
height='150'
width='150'>
@ -484,7 +511,7 @@
<p>Challenge types are distinguished by the 'var' attribute of each &lt;field/&gt; element. Several types of challenges are described below. More challenges MAY be documented elsewhere and registered with the XMPP Registrar (see <link url='#registrar-formtypes'>Field Standardization</link>).</p>
</section2>
<section2 topic='SHA-256 Hashcash' anchor='challenge-hashcash'>
<p>The SHA-256 Hashcash challenge is transparent to average PC users. It is indicated when the value of the 'var' attribute is 'SHA-256'. It forces clients to perform CPU-intensive work, making it difficult to send large amounts of spim. This significantly reduces spim, but alone it will not completely stop abusive stanzas from being sent through large collections of 'zombie' computers. <note>The hope is that the extra CPU usage will often be noticed by the owners of the zombie machines, who will be more likely to fix them.</note></p>
<p>The SHA-256 Hashcash challenge is transparent to average PC users. It is indicated when the value of the 'var' attribute is 'SHA-256'. It forces clients to perform CPU-intensive work, making it difficult to send large amounts of abusive traffic. This significantly reduces abusive traffic, but alone it will not completely stop abusive stanzas from being sent through large collections of 'zombie' computers. <note>The hope is that the extra CPU usage will often be noticed by the owners of the zombie machines, who will be more likely to fix them.</note></p>
<p>The challenger MUST set the 'label' attribute of the &lt;field/&gt; element to a hexadecimal random number containing a configured number of bits (e.g., 2<span class='super'>20</span> &#8804; label &lt; 2<span class='super'>21</span>).</p>
<p>To pass the test, the sender MUST return a text string that starts with the JID the sender sent the first stanza to (i.e., the stanza that triggered the challenge). The least significant bits of the SHA-256 hash (see &nistfips180-2;) of the string MUST equal the hexadecimal value specified by the challenger (in the 'label' attribute of the &lt;field/&gt; element). For example, if the 'label' attribute is the 20-bit value 'e03d7' then the following string would be correct:</p>
<code>innocent@victim.com2450F06C173B05E3</code>
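<p>The test described above can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a normative algorithm: the number of bits to compare is taken to be four per hexadecimal digit of the label, the hash is read as a big-endian integer, and the function names and the counter-based search are hypothetical choices that the text does not prescribe.</p>
<example caption='Illustrative SHA-256 Hashcash Solver and Verifier (Python sketch)'><![CDATA[
```python
import hashlib
from itertools import count


def low_bits_match(text: str, label: str) -> bool:
    """Check whether the low-order bits of SHA-256(text) equal the hex label."""
    bits = 4 * len(label)  # assumption: each hex digit of the label is 4 bits
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    value = int.from_bytes(digest, "big")  # assumption: big-endian reading
    return value & ((1 << bits) - 1) == int(label, 16)


def solve(jid: str, label: str) -> str:
    """Sender side: brute-force a string that starts with jid and passes."""
    for n in count():
        candidate = f"{jid}{n:X}"  # illustrative suffix: an uppercase hex counter
        if low_bits_match(candidate, label):
            return candidate


def verify(answer: str, jid: str, label: str) -> bool:
    """Challenger side: the answer must start with the JID and match the label."""
    return answer.startswith(jid) and low_bits_match(answer, label)
```
]]></example>
<p>The expected work for the sender doubles with every additional bit in the label, whereas the challenger computes a single hash to verify the answer; this asymmetry is what makes the challenge costly to answer in bulk.</p>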
@ -500,7 +527,7 @@
<th>Name</th>
<th>Media type</th>
<th>MIME-type</th>
<th>Example generic instructions *</th>
<th>Suggested generic instructions *</th>
</tr>
<tr>
<td>audio_recog</td>
@ -514,7 +541,7 @@
<td>Optical Character Recognition</td>
<td>image</td>
<td>image/jpeg</td>
<td>Enter the code you see</td>
<td>Enter the text you see</td>
</tr>
<tr>
<td>picture_q</td>
@ -589,6 +616,7 @@
<value>urn:xmpp:tmp:challenge</value>
</field>
<field type='hidden' var='from'><value>innocent@victim.com</value></field>
<field type='hidden' var='cid'><value>F3A6292C</value></field>
<field type='hidden' var='sid'><value>spam1</value></field>
<field label='Type the color of a stop light' type='text-single' var='qa'/>
<field label='93C7A' type='text-single' var='SHA-256'/>
@ -626,21 +654,21 @@
<section1 topic='Discontinuation Policy' anchor='stop'>
<p>It is RECOMMENDED that entities employ other techniques to combat abusive stanzas in addition to those described in this document (e.g., see <cite>XEP-0161</cite> and &xep0205;).</p>
<p>It is expected that this protocol will be an important and successful tool for discouraging spim. However, much of its success is dependent on the quality of the CAPTCHAs employed by a particular implementation.</p>
<p>The administrator of a challenger MUST discontinue the use of Robot Challenges under the following circumstances:</p>
<p>It is expected that this protocol will be an important and successful tool for discouraging abusive traffic. However, much of its success is dependent on the quality of the CAPTCHAs and other puzzles employed by a particular implementation.</p>
<p>The administrator of an application that functions as a challenger SHOULD discontinue the use of Robot Challenges under the following circumstances:</p>
<ul>
<li>If he realises that the challenger's challenges are largely ineffective in combating spim, and that the reduction in abuse does not compensate for the inconvenience to humans of responding to the challenger's challenges.</li>
<li>If other, <em>more transparent</em>, techniques being employed by the challenger are so successful that challenges are offering only negligible additional protection against spim.</li>
<li>If the challenger needs no protection at all because it receives only a negligible amount of spim.</li>
<li>If he realises that the challenger's challenges are largely ineffective in combating abusive traffic, and that the reduction in abuse does not compensate for the inconvenience to humans of responding to the challenger's challenges.</li>
<li>If other, <em>more transparent</em>, techniques being employed by the challenger are so successful that challenges are offering only negligible additional protection against abusive traffic.</li>
<li>If the challenger needs no protection at all because it receives only a negligible amount of abusive traffic.</li>
</ul>
</section1>
<section1 topic='Internationalization Considerations' anchor='i18n'>
<p>Each form field SHOULD include a 'label' attribute. If the sender did not include an 'xml:lang' attribute, then the challenger may not know the correct language for the labels. Therefore, depending on user preferences the client that receives a challenge MAY present generic but localized text instead of label text that would not be understood by the user. Recommended generic text (to be suitably localized) is provided by <link url='#table-1'>Table 1</link> in the <link url='#challenge-captcha'>CAPTCHAs</link> section of this document.</p>
<p>Each form field SHOULD include a 'label' attribute. If the sender did not include an 'xml:lang' attribute, then the challenger may not know the correct language for the labels. Therefore, depending on user preferences, the client that receives a challenge MAY present generic but localized text instead of label text that would not be understood by the user. Suggested generic text (to be suitably localized) is provided by <link url='#table-1'>Table 1</link> in the <link url='#challenge-captcha'>CAPTCHAs</link> section of this document.</p>
</section1>
<section1 topic='Security Considerations' anchor='sec'>
<p>This document introduces no security considerations above and beyond those described in <cite>RFC 3920</cite> and <cite>RFC 3921</cite>.</p>
<p>The use of robot challenges is not a panacea, and should be combined with other anti-abuse mechanisms, such as those described in <cite>XEP-0161</cite> and <cite>XEP-0205</cite>. For example, the task of finding solutions to CAPTCHAs and other computational puzzles is becoming easier for computer programs, and in any case can be farmed out to third parties. Therefore challengers should limit the number of triggering stanzas (e.g., registration attempts, subscription requests, or chatroom joins) allowed per JabberID or IP address during any given time period, and may simply refuse repeated stanzas by terminating an XML stream with a &policy; stream error or returning a &notacceptable; stanza error as appropriate. In addition, a challenger should feel free to deploy additional anti-abuse mechanisms as needed.</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
@ -756,6 +784,10 @@
var='SHA-256'
type='text-single'
label='least significant bits of SHA-256 hash of text should equal hexadecimal label'/>
<field
var='sid'
type='hidden'
label='stanza ID'/>
<field
var='speech_q'
type='text-single'
@ -802,6 +834,6 @@
</section1>
<section1 topic='Open Issues' anchor='open'>
<p>Another protocol could allow users to edit the challenges their server will make on their behalf. For example, the number of SHA-256 bits, a personal or original question and answer, a picture, a video, or a sound recording. Of course Aunt Tillie would typically use this feature only if she was plagued by spim.</p>
<p>Another protocol could allow users to edit the challenges their server will make on their behalf. For example, the number of SHA-256 bits, a personal or original question and answer, a picture, a video, or a sound recording. Of course Aunt Tillie would typically use this feature only if she was plagued by abusive traffic.</p>
</section1>
</xep>