ecaps2 (XEP-0390) to version 0.1

* Some editing
* Explicitly mention the possible need for rate-limiting of caps
  processing in Security Considerations
* Reference UTF-8 and Base64 RFCs
* Reference U+002E properly
* Clarify error condition for Hash Function Input algorithm
Jonas Wielicki 2017-03-23 17:25:13 +01:00 committed by Sam Whited
parent 3d35b24f57
commit 5b588a1876
1 changed files with 25 additions and 16 deletions

@ -15,6 +15,7 @@
<!ENTITY sepl3 "0x1e (ASCII Record Separator)">
<!ENTITY sepl2 "0x1d (ASCII Group Separator)">
<!ENTITY sepl1 "0x1c (ASCII File Separator)">
<!ENTITY fullstop 'FULL STOP character (U+002E, ".")'>
<!ENTITY procent "<em>Processing Entity</em>">
<!ENTITY procents "<em>Processing Entities</em>">
<!ENTITY genent "<em>Generating Entity</em>">
@ -50,6 +51,12 @@
<email>jonas@wielicki.name</email>
<jid>jonas@wielicki.name</jid>
</author>
<revision>
<version>0.1</version>
<date>2017-03-23</date>
<initials>jwi</initials>
<remark><ul><li>Add security consideration for caps spamming.</li><li>Properly reference UTF-8 and Base64 RFCs.</li><li>Remove namespace prefixes from examples.</li><li>Properly reference &fullstop;.</li><li>Clarify the first error condition in the Hash Function Input algorithm.</li></ul></remark>
</revision>
<revision>
<version>0.0.1</version>
<date>2017-02-28</date>
@ -62,10 +69,10 @@
<p>XMPP applications often face choices based on the disco#info (see &xep0030;) exposed by other entities. For example, for a client, knowledge about whether a roster entry is a &xep0369; entity or a normal client is important for user experience. It may also be desirable to provide indicators on the type of client a contact is using (mobile or not).</p>
<p>The canonical way to do so has been issuing <cite>XEP-0030</cite> requests to the entities emitting presence. This, with the ever-growing feature set of XMPP, induces a lot of traffic for all involved parties, especially during startup. This is a waste of resources, as <cite>XEP-0030</cite> information rarely changes and, moreover, common client configurations and versions share exactly the same information.</p>
<p>&xep0115; has provided the XMPP ecosystem with a way to share this information with less bandwidth. Entities using that protocol send a hash of their disco#info result along with presence or stream features. As those hashes can be cached, entities receiving these hashes only need to query the information for each hash once, greatly reducing the Service Discovery traffic.</p>
<p>However, <cite>XEP-0115</cite> has several flaws:</p>
<p>However, <cite>XEP-0115</cite> has two main flaws:</p>
<ul>
<li>It has no working hash agility mechanism; change of the hash function requires a new specification or change of namespace.</li>
<li>The algorithm to generate the input for the hash function has flaws as pointed out in by Waqas Hussain &mlwaqas1;. Even though these flaws have partially been fixed and worked around, the fundamental problem that the structural information of the individual strings from the disco response is lost persists.</li>
<li>The hash agility mechanism is underspecified. While it is possible to change the hash function, there is no clearly defined way to send multiple hashes at once to allow for a transition period. Even though it is technically not forbidden to send multiple <cite>XEP-0115</cite> &lt;c/&gt; elements with different hashes at once, it is unclear how implementations behave when this happens. Possible issues lie in the use of caps optimization, as well as clients expecting only one &lt;c/&gt; element.</li>
<li>The algorithm to generate the input for the hash function has flaws as pointed out by Waqas Hussain &mlwaqas1;. Even though these flaws have partially been fixed and worked around, the fundamental problem that the structural information of the individual strings from the disco response is lost persists.</li>
</ul>
</section1>
@ -102,9 +109,9 @@
<section2 topic='Hash Function Input' anchor='algorithm-input'>
<p>The input to this algorithm is a &xep0030; disco#info &lt;query/&gt; response. The output is an octet string which can be used as input to a hash function or an error.</p>
<p>General remarks: The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.</p>
<p>General remarks: The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding as specified in &rfc3629; is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.</p>
<ol>
<li>If the &lt;query/&gt; element contains any elements except &lt;identity/&gt;, &lt;feature/&gt; or &xep0128; data forms, abort with an error.</li>
<li>If the &lt;query/&gt; element contains any elements except &lt;identity/&gt;, &lt;feature/&gt; (both from the &xep0030; disco#info namespace) or &xep0128; data forms, abort with an error.</li>
<li>If any &xep0128; &lt;x/&gt; element contains a data form which contains a &lt;reported/&gt; or &lt;item/&gt; element, abort with an error.</li>
<li>If any &xep0128; &lt;x/&gt; element does not adhere to the "FORM_TYPE" protocol specified by &xep0068;, abort with an error.</li>
<li><p>Processing of &lt;feature/&gt; elements:</p>
@ -164,13 +171,13 @@
<p>The &hashnode; is obtained from a &hash; with the following simple algorithm:</p>
<ol>
<li>To the namespace prefix "urn:xmpp:caps#", append the name of the hash function as per &xep0300;.</li>
<li>Append a ".".</li>
<li>Append the base64 encoded hash value.</li>
<li>Append a &fullstop;.</li>
<li>Append the Base64 encoded (as specified in &rfc3548;) hash value.</li>
</ol>
<p>The &hashnode; can be decomposed into its original components with the following algorithm:</p>
<ol>
<li>Remove the namespace prefix "urn:xmpp:caps#" from the input.</li>
<li>From the <em>end</em> of the string, start searching for the "." separator.</li>
<li>From the <em>end</em> of the string, start searching for the &fullstop; separator.</li>
<li>Split the string into the hash function and the Base64-encoded hash value at the position found in the previous step.</li>
</ol>
</section2>
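The two algorithms in this hunk can be sketched as follows. This is a minimal illustration, not part of the specification: the function names `build_hashnode` and `split_hashnode` are hypothetical, and the sketch assumes the hash function name is already the XEP-0300 name (for example "sha-256"). Splitting from the end is safe because the Base64 alphabet does not contain the FULL STOP character, whereas a hash function name may.

```python
import base64
from typing import Tuple

# Namespace prefix taken from the hunk above.
CAPS_PREFIX = "urn:xmpp:caps#"


def build_hashnode(algo: str, digest: bytes) -> str:
    """Prefix + hash function name + FULL STOP + Base64-encoded hash value."""
    encoded = base64.b64encode(digest).decode("ascii")
    return f"{CAPS_PREFIX}{algo}.{encoded}"


def split_hashnode(node: str) -> Tuple[str, bytes]:
    """Decompose a hashnode into (hash function name, raw hash value)."""
    if not node.startswith(CAPS_PREFIX):
        raise ValueError("not a caps hashnode")
    rest = node[len(CAPS_PREFIX):]
    # Search for the separator from the END of the string, so that
    # hash function names containing "." are handled correctly.
    algo, sep, encoded = rest.rpartition(".")
    if not sep or not algo:
        raise ValueError("missing separator or hash function name")
    return algo, base64.b64decode(encoded, validate=True)
```

For instance, `split_hashnode(build_hashnode("x.y", b"\x00\x01"))` returns the name `"x.y"` intact, which a search from the start of the string would not.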
@ -316,9 +323,9 @@
000001d9</code>
<p>Running this octet string through the hash functions leads to the following &hashset;:</p>
<code><![CDATA[
<c xmlns="urn:xmpp:caps" xmlns:hashes="urn:xmpp:hashes:2">
<hashes:hash algo="sha-256">kzBZbkqJ3ADrj7v08reD1qcWUwNGHaidNUgD7nHpiw8=</hashes:hash>
<hashes:hash algo="sha3-256">79mdYAfU9rEdTOcWDO7UEAt6E56SUzk/g6TnqUeuD9Q=</hashes:hash>
<c xmlns="urn:xmpp:caps">
<hash xmlns="urn:xmpp:hashes:2" algo="sha-256">kzBZbkqJ3ADrj7v08reD1qcWUwNGHaidNUgD7nHpiw8=</hash>
<hash xmlns="urn:xmpp:hashes:2" algo="sha3-256">79mdYAfU9rEdTOcWDO7UEAt6E56SUzk/g6TnqUeuD9Q=</hash>
</c>]]></code>
</section3>
<section3 topic='Complex Example' anchor='algorithm-examples-complex'>
@ -551,9 +558,9 @@ cDp0aW1lHxw=</code>
00000543</code>
<p>Feeding the concatenated octet string as input to the hash functions yields the following &hashset;:</p>
<code><![CDATA[
<c xmlns="urn:xmpp:caps" xmlns:hashes="urn:xmpp:hashes:2">
<hashes:hash algo="sha-256">u79ZroNJbdSWhdSp311mddz44oHHPsEBntQ5b1jqBSY=</hashes:hash>
<hashes:hash algo="sha3-256">XpUJzLAc93258sMECZ3FJpebkzuyNXDzRNwQog8eycg=</hashes:hash>
<c xmlns="urn:xmpp:caps">
<hash xmlns="urn:xmpp:hashes:2" algo="sha-256">u79ZroNJbdSWhdSp311mddz44oHHPsEBntQ5b1jqBSY=</hash>
<hash xmlns="urn:xmpp:hashes:2" algo="sha3-256">XpUJzLAc93258sMECZ3FJpebkzuyNXDzRNwQog8eycg=</hash>
</c>]]></code>
</section3>
</section2>
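The last step of both examples, turning the concatenated octet string into the Base64 values shown in the hash elements, can be sketched like this. It is an illustration under assumptions: the helper name `compute_hashset` is hypothetical, the mapping covers only the two hash functions used in the examples, and the input below is a placeholder rather than the octet string from the examples (which is not reproduced in this diff).

```python
import base64
import hashlib
from typing import Dict

# Hash function names as used in the examples -> hashlib constructors.
# Only the two functions shown in the examples are covered here.
HASH_FUNCTIONS = {
    "sha-256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}


def compute_hashset(hash_input: bytes) -> Dict[str, str]:
    """Map each hash function name to the Base64-encoded digest of the input."""
    return {
        algo: base64.b64encode(func(hash_input).digest()).decode("ascii")
        for algo, func in HASH_FUNCTIONS.items()
    }


if __name__ == "__main__":
    # Placeholder input, NOT the octet string from the examples above.
    for algo, value in compute_hashset(b"placeholder\x1f\x1c").items():
        print(algo, value)
```

Each resulting value is what would be placed as the text content of a hash element with the matching algo attribute.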
@ -723,7 +730,7 @@ cDp0aW1lHxw=</code>
<section2 topic='Rules for Processing Entities' anchor='rules-processing'>
<ul>
<li>Entities MAY limit the rate at which they process incoming &hashsets;.</li>
<li>Entities MUST be able to process &hashnodes; which use a hash function whose name includes the "." character.</li>
<li>Entities MUST be able to process &hashnodes; which use a hash function whose name includes the &fullstop;.</li>
<li>Entities MAY verify incoming &hashsets;.</li>
<li>Entities MUST NOT expect to receive &hashsets; on every presence sent by an entity supporting &caps;.</li>
</ul>
@ -758,13 +765,14 @@ cDp0aW1lHxw=</code>
<section1 topic='Security Considerations' anchor='security'>
<section2 topic='Hash Function Input Data Separators' anchor='security-separators'>
<p>The codepoints used for separating the different parts in the <link url='#algorithm-input'>Hash Function Input Algorithm</link> (&sepl4; through &sepl1;) are not allowed in well-formed XML character data. As entities are, per &xmppcore;, required to close a stream if non-well-formed XML data is received, these codepoints cannot occur in the input to the algorithm and their use as separators is safe.</p>
<p>The codepoints used for separating the different parts in the <link url='#algorithm-input'>Hash Function Input Algorithm</link> (&sepl1; through &sepl4;) are not allowed in well-formed XML character data. As entities are, per &xmppcore;, required to close a stream if non-well-formed XML data is received, these codepoints cannot occur in the input to the algorithm and their use as separators is safe.</p>
</section2>
<section2 topic='Caching' anchor='security-caching'>
<p>If the algorithm for constructing the input to the hash function or the used hash function itself allow for cheap collisions, caching the hashes will become dangerous as it allows for cache poisoning. This in turn allows entities to effectively fake disco#info responses of other entities.</p>
<p>This was an issue with &xep0115; and has been addressed with a new algorithm for generating the hash function input which keeps the structural information of the disco#info input.</p>
<p>An entity MUST NOT ever use disco#info which has not been verified to belong to a &hash; obtained from a cache using that &hash;. Using cache contents from a trusted source (at the discretion of the entity) counts as verifying.</p>
<p>A malicious entity could send a large number of &hashsets; in short intervals, while making sure that it provides matching disco#info responses. If a &procent; uses caching, this can overflow or thrash the caches. &procents; should be aware of this risk and apply proper rate-limiting for processing &hashsets;. To reduce the attack surface, an entity MAY choose to not cache &hashes; obtained from entities not in its roster.</p>
</section2>
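One possible shape for the rate-limiting mentioned in the new paragraph is a per-sender token bucket. The specification does not prescribe any particular mechanism, so everything here is an assumption for illustration: the class name `HashsetRateLimiter`, the `rate` and `burst` parameters, and the choice of keying buckets by sender address.

```python
import time
from typing import Callable, Dict, Tuple


class HashsetRateLimiter:
    """Per-sender token bucket (hypothetical helper, not from the XEP)."""

    def __init__(self, rate: float, burst: int,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._rate = rate            # tokens refilled per second
        self._burst = float(burst)   # maximum number of stored tokens
        self._clock = clock          # injectable for testing
        self._buckets: Dict[str, Tuple[float, float]] = {}

    def allow(self, sender: str) -> bool:
        """Return True if a hashset from this sender may be processed now."""
        now = self._clock()
        tokens, last = self._buckets.get(sender, (self._burst, now))
        # Refill according to elapsed time, capped at the burst size.
        tokens = min(self._burst, tokens + (now - last) * self._rate)
        if tokens < 1.0:
            self._buckets[sender] = (tokens, now)
            return False
        self._buckets[sender] = (tokens - 1.0, now)
        return True
```

A processing entity would call `allow()` before looking up or querying a received hashset and skip processing when it returns False. Note that the bucket map in this sketch grows with the number of distinct senders, so a real implementation would also need to bound or expire it.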
<section2 topic='Directed Presence' anchor='security-directed-presence'>
@ -837,6 +845,7 @@ cDp0aW1lHxw=</code>
<p>Thanks to the authors of &xep0115; for coming up with the original idea of using presence broadcast to convey service discovery information, as well as the optimization strategies.</p>
<p>The note below the example in <link url='#usecases-stream-feature'>Advertisement of Support and Capabilities by Servers</link> has been copied verbatim from <cite>XEP-0115</cite>.</p>
<p>Thanks to Waqas Hussain for originally (to my knowledge) pointing out the security flaws in <cite>XEP-0115</cite> (see &mlwaqas1;).</p>
<p>Thanks to Georg Lukas, Link Mauve, Sebastian Riese, Florian Schmaus and Sam Whited for their input, editorial and otherwise.</p>
</section1>
</xep>