xeps/xep-0287.xml

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
  <!ENTITY % ents SYSTEM 'xep.ent'>
%ents;
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
  <title>Spim Markers and Reports</title>
  <abstract>This document defines an XMPP protocol extension that enables XMPP entities to interact with spim filters by marking unsolicited or suspicious XMPP stanzas.</abstract>
  &LEGALNOTICE;
  <number>0287</number>
  <status>Deferred</status>
  <type>Standards Track</type>
  <sig>Standards</sig>
  <approver>Council</approver>
  <dependencies>
    <spec>XMPP Core</spec>
    <spec>XEP-0001</spec>
    <spec>XEP-0030</spec>
  </dependencies>
  <supersedes/>
  <supersededby/>
  <shortname>NOT_YET_ASSIGNED</shortname>
  <author>
    <firstname>Evgeniy</firstname>
    <surname>Khramtsov</surname>
    <email>ekhramtsov@process-one.net</email>
    <jid>xram@jabber.ru</jid>
  </author>
  <revision>
    <version>0.1</version>
    <date>2010-10-04</date>
    <initials>psa</initials>
    <remark><p>Initial published version.</p></remark>
  </revision>
  <revision>
    <version>0.1</version>
    <date>2010-09-13</date>
    <initials>evk</initials>
    <remark><p>Initial version.</p></remark>
  </revision>
</header>
<section1 topic='Introduction' anchor='intro'>
  <p>There are various spim protection methods exist in XMPP: &xep0016;, &xep0158;, &xep0191;, &xep0268; and &xep0275;. But they may not be sufficient enough:</p>
  <ul>
    <li>&xep0016; and &xep0191; define blocking mechanism only which is not always appropriate.</li>
    <li>&xep0158; interacts badly with automated software such as gateways.</li>
    <li>&xep0268; implies trusted network of servers.</li>
    <li>&xep0275; concentrates on ranking only.</li>
  </ul>
  <p>Service administrators might want to deploy server-based spim recognition software to fill in the gaps. However, every automated spim recognition suffers from <em>false positives</em> - situations where a stanza incorrectly qualified as spim. To avoid them, a spim filter doesn't block suspicious stanza, but marks it and sends to a client in a regular manner. A client software doesn't need to interrupt a user when processing such marked stanzas: for example, it may put them silently in "SPAM" folder, so a user can look through them at any time later. Furthermore, a spim filter may take user's experience into account. When a user receives an unsolicited stanza, he or she can mark it as spim. In this case a client software sends an automatic complaint to a server-based spim filter. This specification deals with both cases. Thus, in contrast to &xep0159;, it doesn't introduce any spim blocking techniques. Also, the various spim recognition procedures that may be employed by the server are beyond the scope of this document.</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
  <p>An implementation compliant with this document MUST support spim markers as described in <link url='#spim-marker'>Spim Marker</link> use case. Support for spim reports, as described in <link url='#spim-report'>Spim Report</link> use case, is RECOMMENDED.</p>
</section1>
<section1 topic='Glossary' anchor='glossary'>
  <p>The following terms are used throughout this document:</p>
  <dl>
    <di><dt>Filtering Entity</dt><dd>An XMPP entity which performs spim recognitions, blocks or marks suspicious stanzas and accepts spim reports. Example: a server or an external component with built-in spim recognition module.</dd></di>
    <di><dt>Receiving Entity</dt><dd>An XMPP entity which directly receives marked stanzas and sends spim reports. Example: a client or a conference (&xep0045;).</dd></di>
  </dl>
</section1>
<section1 topic='Use Cases' anchor='usecases'>
  <section2 topic='Spim Marker' anchor='spim-marker'>
    <p>The filtering entity marks abusive stanza by adding &lt;mark/&gt; child element qualified by the 'urn:xmpp:spim-marker:0' namespace. The element MUST possess the 'filter' attribute whose value MUST be a full jid of the filtering entity. The &lt;mark/&gt; element MAY contain character data which SHOULD be a human-readable description of the reason to mark. The filtering entity MUST NOT add more than one &lt;mark/&gt; element and MUST delete all other &lt;mark/&gt; elements matching itself before adding a new one. The filtering entity MAY remove any &lt;mark/&gt; elements matching itself even if it doesn't add a new one.</p>
    <example caption="User's Server Marked Abusive Message"><![CDATA[
<message from='robot@abuser.com/zombie'
         to='innocent@victim.com/laptop'
         id='spam1'>
  <body>Love pills - 75% OFF</body>
  <mark xmlns='urn:xmpp:spim-marker:0'
	filter='victim.com'/>
    Unsolicited advertising
  </mark>
</message>
]]></example>
    <example caption="Several Services Marked Abusive Message"><![CDATA[
<message from='robot@abuser.com/zombie'
         to='innocent@victim.com/laptop'
         id='spam1'>
  <subject>You won $1,000,000!</subject>
  <body>Visit http://www.abuser.com/</body>
  <mark xmlns='urn:xmpp:spim-marker:0'
	filter='dnsbl-filter.victim.com'>
    Blocked by too many DNSBLs
  </mark>
  <mark xmlns='urn:xmpp:spim-marker:0'
	filter='bayes-filter.victim.com'/>
</message>
]]></example>
    <p>Processing rules of marked stanzas taken by the receiving entity are beyond the scope of this document. One possible solution is to put such stanzas silently in so-called "SPAM" folder.</p>
  </section2>
  <section2 topic='Spim Report' anchor='spim-report'>
    <p>If the filtering entity wishes to receive abuse report for the stanza, it MUST add &lt;report/&gt; child element qualified by the 'urn:xmpp:spim-report:0' namespace and MUST possess the 'key' and the 'filter' attributes. A value of the 'key' attribute is arbitrary, but SHOULD have at least 128 bits of randomness. The 'key' attribute is needed to match the corresponding complaint (if any) with the sender. The value of the 'filter' attribute MUST be a full jid of the filtering entity. The filtering entity MUST NOT add more than one &lt;report/&gt; element and MUST delete all other &lt;report/&gt; elements matching itself before adding a new one. The filtering entity MAY remove any &lt;report/&gt; elements matching itself even if it doesn't add a new one.</p>
    <example caption="Multiple Filters Wishes to Receive Abuse Report"><![CDATA[
<presence type='subscribe'
          from='robot@abuser.com'
          to='innocent@victim.com'
          id='spam2'>
  <report xmlns='urn:xmpp:spim-report:0'
	  key='571c9641d8442920'
	  filter='filter.victim.com'/>
  <report xmlns='urn:xmpp:spim-report:0'
	  key='b258acbcb4bb8e66ac'
	  filter='victim.com'/>
</presence>
]]></example>
    <p>The receiving entity MAY complain by sending an IQ-set containing the &lt;query/&gt; child element qualified by the 'urn:xmpp:spim-report:0' namespace. A value of the 'filter' attribute MUST be copied in the 'to' attribute of the IQ-set stanza. The element MUST possess 'key' attribute copied from the original stanza.</p>
    <p>The receiving entity MUST ignore any &lt;report/&gt; elements generated by untrusted filtering entities. If there are more than one &lt;report/&gt; element matching the same filtering entity, all of them MUST be ignored.</p>
    <example caption="Receiver Sends Complaint"><![CDATA[
<iq type='set'
    from='innocent@victim.com/laptop'
    to='filter.victim.com'
    id='complaint1'>
  <query xmlns='urn:xmpp:spim-report:0'
	 key='571c9641d8442920'/>
</iq>

<iq type='set'
    from='innocent@victim.com/laptop'
    to='victim.com'
    id='complaint2'>
  <query xmlns='urn:xmpp:spim-report:0'
	 key='b258acbcb4bb8e66ac'/>
</iq>
]]></example>
    <p>The filtering entity MUST respond with an empty IQ-result stanza upon successful completion of the request:</p>
    <example caption="Complaint Was Accepted"><![CDATA[
<iq type='result'
    from='filter.victim.com'
    to='innocent@victim.com/laptop'
    id='complaint1'/>

<iq type='result'
    from='victim.com'
    to='innocent@victim.com/laptop'
    id='complaint2'/>
]]></example>
  </section2>
</section1>
<section1 topic='Business Rules' anchor='rules'>
  <p>A filtering entity SHOULD only add &lt;mark/&gt; or &lt;report/&gt; elements and a receiving entity SHOULD only process those elements if the corresponding stanza envolves an interaction with a human user: subscription requests, messages, conference invites, voice calls, etc. For example, it doesn't make a lot of sense to mark &xep0232; stanzas.</p>
  <p>To avoid obvious false positives and user confusions, a filtering entity SHOULD NOT add &lt;mark/&gt; or &lt;report/&gt; elements to a stanza and a receiving entity SHOULD ignore &lt;mark/&gt; and &lt;report/&gt; elements of a stanza if:</p>
  <ul>
    <li>The receiving entity has the sender's subscription information of the type "both", "from" or "to".</li>
    <li>The receiving entity has pending subscription to the sender, i.e. subscription of type "none" and ask='subscribe'.</li>
    <li>The receiving entity has sent direct presence to the sender.</li>
  </ul>
</section1>
<section1 topic='Determining Support' anchor='support'>
  <p>If an entity supports the spim markers, it MUST report that by including a service discovery feature of "urn:xmpp:spim-marker:0" in response to a &xep0030; information request. If an entity supports the spim reports, it MUST report that by including a service discovery feature of "urn:xmpp:spim-report:0" in response to a &xep0030; information request:</p>
  <example caption="Service Discovery Information Request"><![CDATA[
<iq type='get'
    from='juliet@capulet.lit/balcony'
    to='capulet.lit'
    id='disco1'>
  <query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
]]></example>
  <example caption="Service Discovery Information Response"><![CDATA[
<iq type='result'
    from='capulet.lit'
    to='juliet@capulet.lit/balcony'
    id='disco1'>
  <query xmlns='http://jabber.org/protocol/disco#info'>
    ...
    <feature var='urn:xmpp:spim-marker:0'/>
    <feature var='urn:xmpp:spim-report:0'/>
    ...
  </query>
</iq>
]]></example>
</section1>
<!--
<section1 topic='Implementation Notes' anchor='impl'>
  <p>OPTIONAL.</p>
</section1>
<section1 topic='Accessibility Considerations' anchor='access'>
  <p>OPTIONAL.</p>
</section1>
<section1 topic='Internationalization Considerations' anchor='i18n'>
  <p>OPTIONAL.</p>
</section1>
-->
<section1 topic='Security Considerations' anchor='security'>
  <section2 topic='CAPTCHA challenges' anchor='captcha-challenges'>
    <p>Care should be taken if a receiving entity chooses to generate a CAPTCHA challenge (&xep0158;) in response to a marked stanza. A spim recognition system rarely has more than 5-10% of false positives. Thus, producing CAPTCHA images or audio/video samples is likely a waste of system resources and also may overload the receiving entity at high rate of spim stanzas.</p>
  </section2>
  <section2 topic='Fake &lt;mark/&gt; element' anchor='fake-mark-element'>
    <p>A rogue server may add fake &lt;mark/&gt; elements to compromise filtering entities: a user may decide to remove such entities from the trusted list because, for example, he or she thinks they produce too many false positives. To avoid such situation, a filtering entity MUST remove any &lt;mark/&gt; elements matching itself before adding new &lt;mark/&gt; element as described in <link url='#spim-marker'>Spim Marker</link> use case. Also, a filtering entity MAY remove any &lt;mark/&gt; elements matching itself even if it doesn't add a new one.</p>
  </section2>
  <section2 topic='Fake &lt;report/&gt; element' anchor='fake-report-element'>
    <p>An attacker may add fake &lt;report/&gt; element. For example, it may do that for checking an activity of the user. To avoid such situation, a receiving entity MUST send spim reports to the trusted filtering entities only as desribed in <link url='#spim-report'>Spim Report</link> use case.</p>
  </section2>
  <section2 topic='Multiple fake &lt;report/&gt; elements' anchor='multiple-fake-reports'>
    <section3 topic='Single filtering entity' anchor='multiple-reports-single'>
      <p>An attacker may add thousands of fake &lt;report/&gt; elements matching the single trusted filtering entity in one stanza. A poorly written receiving entity may generate a complaint for all of them. As an effect, a distributed DoS attack on the filtering entity is performed if there are multiple receiving entities envolved. To avoid such situation, a receiving entity MUST ignore multiple &lt;report/&gt; elements matching the same filtering entity as desribed in <link url='#spim-report'>Spim Report</link> use case.</p>
      <p>In its turn, a filtering entity MUST remove any &lt;report/&gt; elements matching itself before adding new &lt;report/&gt; element as described in <link url='#spim-report'>Spim Report</link> use case. Thus, it is guaranteed that the element will not be ignored by the receiving entity.</p>
    </section3>
    <section3 topic='Several filtering entities' anchor='multiple-reports-several'>
      <p>An attacker may gain an information about user's trusted filtering entities. In this case he or she may add the &lt;report/&gt; element per every such entity in one stanza. If there are too many filtering entities in the list, a user may generate enormous traffic when generating spim reports. Although this attack is not very effective, a client software MUST not generate spim reports without user's acknowledgement.</p>
    </section3>
  </section2>
  <section2 topic='Fake IQ-set report' anchor='fake-iq-set-report'>
    <p>An attacker may try to mark an innocent user as a spimmer by producing several IQ-set stanzas  qualified by "urn:xmpp:spim-report:0" containing different value of the 'key' attribute each (so-called "dictionary attack"). As a protection, sanity checks MUST be performed when processing such reports. For example, if a filtering entity doesn't store any information about a receiving entity, the value of the 'key' attribute SHOULD have at least 128 bits of randomness.</p>
  </section2>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
  <p>This document requires no interaction with &IANA;.</p>
</section1>
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
  <section2 topic='Protocol Namespaces' anchor='registrar-ns'>
    <p>This specification defines the following XML namespaces:</p>
    <ul>
      <li>urn:xmpp:spim-marker:0</li>
      <li>urn:xmpp:spim-report:0</li>
    </ul>
    <p>Upon advancement of this specification from a status of Experimental to a status of Draft, the &REGISTRAR; shall add the foregoing namespace to the registry located at &NAMESPACES;, as described in Section 4 of &xep0053;.</p>
  </section2>
</section1>
<section1 topic='XML Schema' anchor='schema'>
  <section2 topic='urn:xmpp:spim-marker:0' anchor='schema-marker'>
    <code><![CDATA[
<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='urn:xmpp:spim-marker:0'
    xmlns='urn:xmpp:spim-marker:0'
    elementFormDefault='qualified'>

  <xs:annotation>
    <xs:documentation>
      The protocol documented by this schema is defined in
      XEP-xxxx: http://www.xmpp.org/extensions/xep-xxxx.html
    </xs:documentation>
  </xs:annotation>

  <xs:element name='mark'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='xs:string'>
          <xs:attribute
             name='filter'
             type='xs:string'
             use='required'/>
	  <xs:attribute
	     name='reason'
	     type='xs:string'
	     use='optional'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
]]></code>
  </section2>
  <section2 topic='urn:xmpp:spim-report:0' anchor='schema-report'>
    <code><![CDATA[
<?xml version='1.0' encoding='UTF-8'?>

<xs:schema
    xmlns:xs='http://www.w3.org/2001/XMLSchema'
    targetNamespace='urn:xmpp:spim-report:0'
    xmlns='urn:xmpp:spim-report:0'
    elementFormDefault='qualified'>

  <xs:annotation>
    <xs:documentation>
      The protocol documented by this schema is defined in
      XEP-xxxx: http://www.xmpp.org/extensions/xep-xxxx.html
    </xs:documentation>
  </xs:annotation>

  <xs:element name='query'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='xs:string'>
	  <xs:attribute
	     name='key'
	     type='xs:string'
	     use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

  <xs:element name='report'>
    <xs:complexType>
      <xs:simpleContent>
        <xs:extension base='xs:string'>
          <xs:attribute
             name='filter'
             type='xs:string'
             use='required'/>
	  <xs:attribute
	     name='key'
	     type='xs:string'
	     use='required'/>
        </xs:extension>
      </xs:simpleContent>
    </xs:complexType>
  </xs:element>

</xs:schema>
]]></code>
  </section2>
</section1>
<section1 topic='Acknowledgements' anchor='ack'>
  <p>Thanks to Sergei Golovan for the feedback.</p>
</section1>
</xep>