<!ENTITY mlwaqas1 "<note>org.jabber.security Mailing List Archive: '[Security] Trivial preimage attack against the entity capabilities protocol' from 2009-07-22, &lt;<link url='https://mail.jabber.org/pipermail/security/2009-July/000812.html'>https://mail.jabber.org/pipermail/security/2009-July/000812.html</link>&gt;.</note>">
<abstract>This document overhauls the XMPP protocol extension Entity Capabilities (XEP-0115). It defines an XMPP protocol extension for broadcasting and dynamically discovering client, device, or generic entity capabilities. In order to minimize network impact, the transport mechanism is standard XMPP presence broadcast (thus forestalling the need for polling related to service discovery data), the capabilities information can be cached either within a session or across sessions, and the format has been kept as small as possible.</abstract>
<remark><ul><li>Add security consideration for caps spamming.</li><li>Properly reference UTF-8 and Base64 RFCs.</li><li>Remove namespace prefixes from examples.</li><li>Properly reference &fullstop;.</li><li>Clarify the first error condition in the Hash Function Input algorithm.</li></ul></remark>
<p>XMPP applications often face choices based on the disco#info (see &xep0030;) exposed by other entities. For example, for a client, knowledge about whether a roster entry is a &xep0369; entity or a normal client is important for user experience. It may also be desirable to provide indicators on the type of client a contact is using (mobile or not).</p>
<p>The canonical way to do so has been issuing <cite>XEP-0030</cite> requests to the entities emitting presence. With the ever-growing feature set of XMPP, this induces a lot of traffic for all involved parties, especially during startup. This is a waste of resources, as <cite>XEP-0030</cite> information rarely changes and, moreover, common client configurations and versions share exactly the same information.</p>
<p>&xep0115; has provided the XMPP ecosystem with a way to share this information using less bandwidth. Entities using that protocol send a hash of their disco#info result along with presence or stream features. As those hashes can be cached, entities receiving these hashes only need to query the information for each hash once, greatly reducing the Service Discovery traffic.</p>
<li>The hash agility mechanism is underspecified. While it is possible to change the hash function, there is no clearly defined way to send multiple hashes at once to allow for a transition period. Even though it is technically not forbidden to send multiple <cite>XEP-0115</cite> <c/> elements with different hashes at once, it is unclear how implementations behave when this happens. Possible issues lie in the use of caps optimization, as well as clients expecting only one <c/> element.</li>
<li>The algorithm to generate the input for the hash function has flaws, as pointed out by Waqas Hussain &mlwaqas1;. Even though these flaws have been partially fixed and worked around, the fundamental problem persists: the structural information of the individual strings from the disco response is lost.</li>
<p>The &caps; protocol aims to satisfy the following requirements:</p>
<ol>
<li>Entities must be able to participate even if they support only &xmppcore;, &xmppim; and &xep0030;<note>While elements of XEP-0300 are re-used here, full support of XEP-0300 is not formally required to implement this specification.</note>.</li>
<li>Entities must be able to participate without connectivity to services except their own XMPP server and without connectivity to specialized XMPP services, including cached information from those services.</li>
<li>Entities should be able to learn Service Discovery information without actively querying for it.</li>
<li>The bandwidth consumption should be as minimal as possible, while reusing existing specifications.</li>
<li>It must be possible to write &xep0045; and &xep0369; implementations which can forward this protocol with negligible extra work.</li>
<li>Entities must be able to update their published information arbitrarily often in a single presence session.</li>
<li>Entities must be able to be confident that the information obtained from the broadcast is equivalent to the information which would be obtained from querying the generating entity directly at the time the broadcast was generated.</li>
<li>The protocol must be able to coexist (but not necessarily exchange information) with &xep0115;.</li>
<li>No special XML features beyond what is needed to implement <cite>XMPP Core</cite> itself should be required.</li>
<li>Obsoletion of hash functions should not need a new version of the specification.</li>
<di><dt>Capability Hash</dt><dd>A tuple of hash function and hash value generated as described in the <link url='#algorithm'>Hash Function Input</link> section.</dd></di>
<di><dt>Capability Hash Cache</dt><dd>A mapping which maps &hashes; to disco#info <query/> responses with an empty 'node' attribute.</dd></di>
<di><dt>Capability Hash Node</dt><dd>The name of a <cite>XEP-0030</cite> 'node' for a given &hash;. See <link url='#algorithm-hashnodes'>Construction of Capability Hash Nodes</link>.</dd></di>
<di><dt>Capability Hash Set</dt><dd>A set of &hashes; which cover the same <cite>XEP-0030</cite> response, possibly in the form of a <c/> element with &xep0300; <hash/> children.</dd></di>
<di><dt>Generating Entity</dt><dd>An entity which emits a &hashset; to other entities.</dd></di>
<di><dt>Processing Entity</dt><dd>An entity which receives and processes a &hashset; from a &genent;.</dd></di>
<di><dt>Query Interception</dt><dd>Server-side processing of disco#info queries directed to a resource based on the &hashsets; published by that resource.</dd></di>
<di><dt>Gratuitous Capabilities</dt><dd>The sending of a &hashset; to a server before initial presence has been sent and without being asked by the server.</dd></di>
<p>The following algorithms provide data which is sent using this protocol.</p>
<section2 topic='Hash Function Input' anchor='algorithm-input'>
<p>The input to this algorithm is a &xep0030; disco#info <query/> response. The output is an octet string which can be used as input to a hash function or an error.</p>
<li><p>The algorithm strictly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding as specified in &rfc3629; is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.</p></li>
<li><p>The algorithm uses the <tt>xml:lang</tt> attribute. Implementations must take implicit values for the <tt>xml:lang</tt> attribute into account, for example those inherited from the disco#info, the IQ element, or from the root <stream> tag.</p></li>
<li>If the <query/> element contains any elements except <identity/>, <feature/> (both from the &xep0030; disco#info namespace) or &xep0128; data forms, abort with an error.</li>
<li>If any &xep0128; <x/> element contains a data form which contains a <reported/> or <item/> element, abort with an error.</li>
<li>If any &xep0128; <x/> element does not adhere to the "FORM_TYPE" protocol specified by &xep0068;, abort with an error.</li>
<li><p>Processing of <feature/> elements:</p>
<ol>
<li>For each <feature/> element: Encode the character data of the 'var' attribute and append an octet of value &sepl4;.</li>
<li>Join the resulting octet strings together, ordered from lesser to greater.</li>
<li>Append an octet of value &sepl1;.</li>
</ol>
<p>The result of this step is referenced as <em>Features String</em> later.</p>
</li>
<li><p>Processing of <identity/> nodes:</p>
<ol>
<li><p>For each <identity/> node:</p>
<ol>
<li>Encode the character data of the 'category', 'type', 'xml:lang' and 'name' attributes.</li>
<li>Append an octet of value &sepl4; to each resulting octet string.</li>
<li>Join the resulting octet strings together, in the order of 'category', 'type', 'xml:lang' and 'name', resulting in a single octet string for the <identity/> node.</li>
<li>Append an octet of value &sepl3;.</li>
</ol>
</li>
<li>Join the resulting octet strings together, ordered from lesser to greater.</li>
<li>Append an octet of value &sepl1;.</li>
</ol>
<p>The result of this step is referenced as <em>Identities String</em> later.</p>
</li>
<li><p>Processing of &xep0128; <x/> elements:</p>
<ol>
<li><p>For each <x/> element:</p>
<ol>
<li><p>For each <field/> element:</p>
<ol>
<li>Encode the character data of each <value/> element and append an octet of value &sepl4;.</li>
<li>Join the resulting octet strings together, ordered from lesser to greater.</li>
<li>Encode the character data of the 'var' attribute and append an octet of value &sepl4; and the result from the previous step.</li>
<li>Append an octet of value &sepl3;.</li>
</ol>
</li>
<li>Join the resulting octet strings together, ordered from lesser to greater.</li>
<li>Append an octet of value &sepl2;.</li>
</ol>
</li>
<li>Join the resulting octet strings together, ordered from lesser to greater.</li>
<li>Append an octet of value &sepl1;.</li>
</ol>
<p>The result of this step is referenced as <em>Extensions String</em> later.</p>
</li>
<li>Join the <em>Features String</em>, <em>Identities String</em> and <em>Extensions String</em> together, in this order. Return the resulting string as result of the algorithm.</li>
</ol>
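<p>The steps above can be sketched as follows. This is a minimal illustration, not a reference implementation: it assumes the disco#info response has already been parsed and has passed the error checks, and that implicit <tt>xml:lang</tt> values have been made explicit. All function names are hypothetical. The octet values 0x1f and 0x1e for &sepl4; and &sepl3; follow the hexdumps in the examples; 0x1d and 0x1c for &sepl2; and &sepl1; are assumptions.</p>

```python
# Separator octets. 0x1f and 0x1e match the hexdumps in the examples;
# 0x1d and 0x1c are ASSUMED values for the two remaining levels.
SEP_L4 = b"\x1f"
SEP_L3 = b"\x1e"
SEP_L2 = b"\x1d"
SEP_L1 = b"\x1c"


def _enc(text):
    """Encode character data as UTF-8 and append the level-4 separator."""
    return text.encode("utf-8") + SEP_L4


def _join_sorted(parts):
    """Sort octet strings and concatenate them.

    Python compares bytes objects octet by octet (unsigned), which is
    the ordering the i;octet collation prescribes.
    """
    return b"".join(sorted(parts))


def features_string(features):
    """features: iterable of 'var' attribute values."""
    return _join_sorted(_enc(var) for var in features) + SEP_L1


def identities_string(identities):
    """identities: iterable of (category, type, xml_lang, name) tuples."""
    parts = []
    for category, type_, lang, name in identities:
        parts.append(
            _enc(category) + _enc(type_) + _enc(lang) + _enc(name) + SEP_L3
        )
    return _join_sorted(parts) + SEP_L1


def extensions_string(forms):
    """forms: iterable of dicts mapping a field's 'var' to its values."""
    form_parts = []
    for form in forms:
        field_parts = []
        for var, values in form.items():
            joined_values = _join_sorted(_enc(v) for v in values)
            field_parts.append(_enc(var) + joined_values + SEP_L3)
        form_parts.append(_join_sorted(field_parts) + SEP_L2)
    return _join_sorted(form_parts) + SEP_L1


def hash_input(features, identities, forms):
    """Concatenate the three parts in the order given by the algorithm."""
    return (
        features_string(features)
        + identities_string(identities)
        + extensions_string(forms)
    )
```

<p>Keeping each nesting level on its own separator octet is what preserves the structure of the input: two different disco#info responses cannot collapse into the same octet string merely by moving text between attributes.</p>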
</section2>
<section2 topic='Construction of Capability Hash Sets' anchor='algorithm-hashsets'>
<p>The entity picks a set of hash functions it wishes to use. The set of hash functions MUST include at least one hash function which MUST be implemented according to &xep0300; and SHOULD NOT include any hash functions which MUST NOT be supported according to <cite>XEP-0300</cite>.</p>
<p>Using the algorithm from the previous subsection, the entity calculates the input for the hash functions. It then runs the input through each hash function individually. The resulting tuples of hash algorithm and hash values constitute the &hashset;.</p>
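<p>As a rough sketch of this step, the following runs one octet string through several hash functions. The selection of SHA-256 and SHA3-256 mirrors the examples in this document; the function name, the dictionary layout and the choice to store the values Base64-encoded are assumptions made for illustration.</p>

```python
import base64
import hashlib

# Hash function names as used by XEP-0300, mapped to hashlib constructors.
# Which functions to offer is an implementation choice (assumed here).
HASH_FUNCTIONS = {
    "sha-256": hashlib.sha256,
    "sha3-256": hashlib.sha3_256,
}


def capability_hash_set(hash_function_input, algos=("sha-256", "sha3-256")):
    """Return a dict mapping hash function name to Base64 hash value."""
    result = {}
    for algo in algos:
        digest = HASH_FUNCTIONS[algo](hash_function_input).digest()
        result[algo] = base64.b64encode(digest).decode("ascii")
    return result
```

<p>The input is computed once and reused for every hash function, so adding a function to the set costs one extra digest and nothing else.</p>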
</section2>
<section2 topic='Construction of Capability Hash Nodes' anchor='algorithm-hashnodes'>
<p>The &hashnode; is obtained from a &hash; with the following simple algorithm:</p>
<ol>
<li>To the namespace prefix "urn:xmpp:caps#", append the name of the hash function as per &xep0300;.</li>
<li>Split the string into the hash function and the Base64-encoded hash value at the position found in the previous step.</li>
</ol>
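<p>A possible sketch of building and splitting a &hashnode; is shown below. The steps above do not spell out the separator between the hash function name and the hash value; the sketch assumes a single full stop (".") and standard Base64, and both function names are hypothetical.</p>

```python
import base64

NODE_PREFIX = "urn:xmpp:caps#"
SEPARATOR = "."  # ASSUMPTION: full stop between function name and value


def build_hash_node(algo, digest):
    """Build a node name from a hash function name and raw hash octets."""
    return NODE_PREFIX + algo + SEPARATOR + base64.b64encode(digest).decode("ascii")


def split_hash_node(node):
    """Split a node name back into (hash function name, raw hash octets)."""
    if not node.startswith(NODE_PREFIX):
        raise ValueError("not a capability hash node")
    algo, sep, encoded = node[len(NODE_PREFIX):].partition(SEPARATOR)
    if not sep or not algo or not encoded:
        raise ValueError("malformed capability hash node")
    # validate=True rejects characters outside the Base64 alphabet.
    return algo, base64.b64decode(encoded, validate=True)
```

<p>Splitting at the first separator is unambiguous under these assumptions, because the standard Base64 alphabet does not contain a full stop.</p>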
</section2>
<section2 topic='Verification of a Capability Hash Set' anchor='algorithm-verify'>
<p>The algorithm takes a &hashset; as input and returns successfully if the hash matches and an error otherwise.</p>
<ol>
<li>Pick a &hash; from the &hashset;.</li>
<li>Query the &genent; for disco#info on the &hashnode; of the chosen hash, constructed as described above. If the entity returns an error, abort with an error.</li>
<li>Locally calculate the &hash; of the response, using the Hash Function Input algorithm and the hash function of the chosen hash. If the algorithm exits with an error, abort with an error.</li>
<li>If the hashes do not match, abort with an error.</li>
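<p>The verification steps might be sketched as follows. The network query and the Hash Function Input algorithm are passed in as callables, since both depend on the surrounding implementation; <tt>verify_hash_set</tt>, <tt>VerificationError</tt> and the shape of the arguments are hypothetical names chosen for this illustration.</p>

```python
import hashlib
import hmac

SUPPORTED = {"sha-256": hashlib.sha256, "sha3-256": hashlib.sha3_256}


class VerificationError(Exception):
    """Raised when a Capability Hash Set cannot be verified."""


def verify_hash_set(hash_set, query_disco_info, build_hash_input):
    """Verify one hash from hash_set and return the verified disco#info.

    hash_set:         dict mapping hash function name to raw hash octets
    query_disco_info: callable(algo, digest) returning the disco#info
                      response for the matching node, or raising
    build_hash_input: callable(disco_info) returning the octet string
                      from the Hash Function Input algorithm, or raising
    """
    # Step 1: pick a hash, here the first one with a supported function.
    for algo, expected in hash_set.items():
        if algo in SUPPORTED:
            break
    else:
        raise VerificationError("no supported hash function in set")

    # Steps 2 and 3: query, then recompute locally; any failure aborts.
    try:
        disco_info = query_disco_info(algo, expected)
        data = build_hash_input(disco_info)
    except Exception as exc:
        raise VerificationError(str(exc)) from exc

    # Step 4: compare the recomputed value with the advertised one.
    actual = SUPPORTED[algo](data).digest()
    if not hmac.compare_digest(actual, expected):
        raise VerificationError("hash mismatch")
    return disco_info
```

<p>Returning the verified response lets the caller store it in a cache only after the comparison has succeeded.</p>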
<p>The two examples walk through the process of constructing a &hashset; for SHA-256 and SHA3-256. The full algorithm for generating the hash function input is explained.</p>
<p>The data from the example was the first entry in the &capsdb; hashes subdirectory which had no data forms at the time of writing. The features have been shuffled to show the sorting step in the algorithm.</p>
<p>The algorithm starts by constructing the <em>Features String</em>. For this, the values of the 'var' attributes of the feature nodes are encoded as UTF-8 and suffixed with &sepl4;. The first three of those features are shown as a hexdump below:</p>
<p>Note the appended 0x1f octet for each of the three strings. Now the strings are ordered using the i;octet collation and concatenated. The result is suffixed with &sepl1;, which gives the following hexdump of the final <em>Features String</em>:</p>
<p>For the <em>Identities String</em>, first the character data of the 'category', 'type', 'xml:lang' and 'name' attributes is encoded as UTF-8 and suffixed with &sepl4;. The resulting individual strings have the following hexdumps:</p>
<p>Normally, a sorting step would occur here. As the example only has a single string, the sorting and joining is a no-op. The string is now suffixed with &sepl1; to get the <em>Identities String</em>:</p>
<p>The <em>Extensions String</em> is simply the &sepl1; used to terminate it as no extensions are contained in the example. Thus, the final input for the hash function is, as hexdump:</p>
<p>The data from the example is the shortest entry from the &capsdb; hashes subdirectory which had data forms and multiple identities at the time of writing. The features have been shuffled to show the sorting step in the algorithm.</p>
<p>We skip over the process for the <em>Features String</em> and only present the final result encoded as base64 for reference:</p>
<p>In the previous example, it was already shown how the individual parts of each <identity/> element are combined. We get the following octet strings as hexdumps:</p>
00000010 ba d0 b0 d0 b1 d0 b1 d0 b5 d1 80 1f 1e |.............|
0000001d
00000000 63 6c 69 65 6e 74 1f 70 63 1f 65 6e 1f 54 6b 61 |client.pc.en.Tka|
00000010 62 62 65 72 1f 1e |bber..|
00000016</code>
<p>The second string is ordered before the first string in the i;octet collation and afterwards the strings are joined and the result is suffixed with &sepl1; to close the identities part of the input. The final <em>Identities String</em> is thus, as hexdump:</p>
<code>
00000000 63 6c 69 65 6e 74 1f 70 63 1f 65 6e 1f 54 6b 61 |client.pc.en.Tka|
00000010 62 62 65 72 1f 1e 63 6c 69 65 6e 74 1f 70 63 1f |bber..client.pc.|
<p>The example has a &xep0128; form. For each field, a string consisting of the character data of the 'var' attribute and the values is created as per the algorithm:</p>
00000000 6f 73 1f 57 69 6e 64 6f 77 73 1f 1e |os.Windows..|
0000000c
</code>
<p>The strings need to be sorted using i;octet and joined together. The result is suffixed with &sepl2;, which closes the form. As this is the only form, the resulting <em>Extensions String</em> is obtained by adding a &sepl1; to close the extensions section of the hash input:</p>
<p>Note the "os" field is now before the other fields but after "FORM_TYPE", due to the sorting.</p>
<p>The final hash function input is obtained by concatenating the <em>Features String</em>, <em>Identities String</em> and <em>Extensions String</em>:</p>
<p>When a connected client or peer server sends a service discovery information request to determine the entity capabilities of a server that advertises capabilities via the stream feature, the requesting entity MUST send the disco#info request to the server's JID as provided in the 'from' attribute of the response stream header. To enable this functionality, a server that advertises support for entity capabilities MUST provide a 'from' address in its response stream headers, in accordance with &rfc6120;.</p>
</section2>
<section2 topic='Advertising Support of Caps Optimizations' anchor='usecases-support'>
<p>If a server supports Caps Optimizations, it MUST advertise the fact by returning a feature of "urn:xmpp:caps:optimize".</p>
<example caption='Response to a disco#info request'><![CDATA[
<p>The <hash/> element is specified by &xep0300; and is used to transport the &hashes;.</p>
</section2>
<section2 topic='Service Discovery Query for a Specific Hash Value' anchor='usecases-hash-query'>
<p>To query the &xep0030; information for a specific &hash; value, an entity MUST query a Service Discovery node equal to the &hashnode;<note>As outlined in the Business Rules, this statement does not oblige an entity to actually perform this query.</note>.</p>
<p>An entity is free to choose for which &hash; of a &hashset; the request is sent.</p>
<example caption='Service Discovery request in response to a broadcast Capability Hash Set'><![CDATA[
<p>A server MAY support pushing of &hashes; from clients before sending initial presence. This allows servers to discover capabilities of clients before those have sent initial presence, which may be useful or important for some protocols (such as &xep0369;). This feature is called &gratcaps;.</p>
<p>To advertise support, the server publishes the <tt>urn:xmpp:caps:gratuitous</tt> feature:</p>
<example caption='Response to a disco#info request if the server supports Gratuitous Capabilities'><![CDATA[
<p>The server replies with an empty result on success.</p>
<p>The server MUST NOT broadcast the &hashes; submitted via &gratcaps; using presence.</p>
<p>Clients SHOULD NOT send &gratcaps; after they have sent initial presence; instead, they SHOULD re-send presence to update the &hashes;. Otherwise, entities subscribed to the presence will not receive the updated &hashes;.</p>
<section2 topic='Rules for Generating Entities' anchor='rules-generating'>
<ul>
<li>Entities MUST respond to disco#info queries for all &hashnodes; of at least the most recent 3 &hashsets; emitted.</li>
<li>Entities MUST broadcast the &hashset; of the current disco#info it publishes in every non-directed "available" <presence/> they send and SHOULD do so for directed "available" <presence/>.</li>
<li>After initial presence has been sent, entities MUST re-broadcast the &hashset; after their disco#info response changes, but MAY limit the rate at which presences are emitted solely for the purpose of sending new &hashsets;.</li>
<li>Before initial presence has been sent and if the server supports &gratcaps;, entities SHOULD send &gratcaps; after their disco#info response changes, but MAY limit the rate at which &gratcaps; are sent. (For example, a client may load and enable additional functionality (thus changing its features) based on server support and only send &gratcaps; once all functionality has been set up, not after each individual feature.)</li>
<p>Instead of issuing a &xep0030; disco#info <query/> with absent 'node' attribute to a target entity, an entity MAY use a &hashcache; to obtain the response. To look up the disco#info response in the &hashcache;, an entity MUST use a hash from the &hashset; which was most recently received from the entity to which the <query/> would have been sent otherwise. If none of the most recently received &hashes; are found in the &hashcache;, the entity MUST fall back to sending the request.</p>
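<p>A minimal sketch of this lookup, assuming the &hashcache; is a plain dictionary keyed by (hash function name, hash value) tuples and that <tt>send_query</tt> is a caller-supplied function performing the actual disco#info request; both names are hypothetical:</p>

```python
def lookup_disco_info(latest_hash_set, cache, send_query):
    """Answer a disco#info lookup from the cache when possible.

    latest_hash_set: iterable of (hash function name, hash value) tuples
                     most recently received from the target entity
    cache:           dict mapping such tuples to disco#info responses
    send_query:      callable performing the real query as a fallback
    """
    for capability_hash in latest_hash_set:
        cached = cache.get(capability_hash)
        if cached is not None:
            return cached
    # None of the most recent hashes is cached: fall back to the request.
    return send_query()
```

<p>Only the most recently received set is consulted, so a stale cache entry for an older set from the same entity is never returned.</p>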
<p>An entity MUST ensure that implicit values for <tt>xml:lang</tt> attributes are preserved when disco#info data is cached. One way to achieve this is to make the implicit values explicit in the storage.</p>
<li>Servers MUST ensure that the first presence notification sent to each subscriber contains the most recent <c/> element, if any were sent in the current presence session.</li>
<li>Servers MUST ensure that every change in the <c/> element is sent to all subscribers.</li>
<li>Clients MAY omit the <c/> element if it has not changed since the last presence <em>iff</em> they determined that their server supports Caps Optimization.</li>
<li>Servers MAY answer disco#info requests for &hashnodes; on behalf of their own clients and those of others if the disco#info response belonging to that &hash; is known to them.</li>
<p>Servers MAY implement &queryintercept; to further optimise bandwidth consumption. The idea is that servers intercept &xep0030; disco#info queries sent to clients if they already know the answer from &hashes; published by the client. The rules for &queryintercept; are the following (to be applied in this order):</p>
<ul>
<li>Servers MUST NOT intercept disco#info queries except those with empty <tt>node</tt> or a <tt>node</tt> which refers to a &hashnode; known to the server.</li>
<li>Servers MUST NOT intercept disco#info queries on behalf of the resource unless the query would be forwarded to the resource otherwise.</li>
<li>Servers MUST NOT intercept disco#info queries to resources which do not support &caps; (clients not implementing &caps; may legitimately use disco#info nodes matching the format of &hashnodes; for different purposes).</li>
<li>Servers SHOULD intercept disco#info queries with empty <tt>node</tt> and answer them with the disco#info of the most recent &hashset; published by the client.</li>
<li>Servers SHOULD intercept disco#info queries with a valid &hashnode; as <tt>node</tt>, if the server knows the disco#info for the &hashnode;. Otherwise, the query MUST be forwarded to the addressed resource. Note that it is valid for a server to reply for &hashnodes; which have not been published by the resource.</li>
<p>It is RECOMMENDED that entities use the caching mechanisms outlined in <link url='#rules-processing-caches'>the Caching Business Rules</link>. Entities MAY share caches among connections and accounts.</p>
</section2>
<section2 topic='Upgrading from XEP-0115' anchor='impl-upgrade'>
<p>&genents; are encouraged to also emit &xep0115; <c/> elements in their presence updates (as specified in <cite>XEP-0115</cite>) for a reasonable transition period.</p>
<p>When receiving a &hashset; along with <cite>XEP-0115</cite> capabilities, a &procent; MAY obtain the disco#info <query/> for verification from a <cite>XEP-0115</cite> based cache instead of querying the &genent; directly. A &procent; MUST NOT use disco#info data from a <cite>XEP-0115</cite> cache without verification if a &caps; <c/> element is available.</p>
<p>The codepoints used for separating the different parts in the <link url='#algorithm-input'>Hash Function Input Algorithm</link> (&sepl1; through &sepl4;) are not allowed in well-formed XML character data. As entities are, per &xmppcore;, required to close a stream if non-well-formed XML data is received, these codepoints cannot occur in the input to the algorithm and their use as separators is safe.</p>
<p>If the algorithm for constructing the input to the hash function, or the hash function itself, allows for cheap collisions, caching the hashes becomes dangerous because it enables cache poisoning. This in turn allows entities to effectively fake disco#info responses of other entities.</p>
<p>This was an issue with &xep0115; and has been addressed with a new algorithm for generating the hash function input which keeps the structural information of the disco#info input.</p>
<p>An entity MUST NOT use disco#info obtained from a cache using a &hash; unless that disco#info has been verified to belong to that &hash;. Using cache contents from a trusted source (at the discretion of the entity) counts as verifying.</p>
<p>A malicious entity could send a large number of &hashsets; in short intervals, while making sure that it provides matching disco#info responses. If a &procent; uses caching, this can overflow or thrash the caches. &procents; should be aware of this risk and apply proper rate-limiting for processing &hashsets;. To reduce the attack surface, an entity MAY choose to not cache &hashes; obtained from entities not in its roster.</p>
<p>As mentioned earlier, when storing disco#info data in a cache for later retrieval, implementations MUST ensure that implicit values for <tt>xml:lang</tt> attributes are reconstructed correctly when the disco#info is restored.</p>
<p>Entities MAY choose to not send &hashsets; with directed presence (for example to increase privacy). In that case, entities SHOULD also refuse direct &xep0030; queries.</p>
<p>The server replies to certain disco#info queries on behalf of the client. This means that the client has no choice as to whom it replies. Otherwise, a client could choose to reply with <tt><service-unavailable/></tt> to mask its existence. We consider two effects of this:</p>
<ul>
<li>
<p>A remote entity could attempt to detect that an entity exists behind a resource. For this, they send a disco#info query to the resource since nearly everyone implements disco#info. As the client responds with <tt><service-unavailable/></tt>, it looks as if no client was present at this resource.</p>
<p>With &queryintercept;, the server would reply on behalf of the client. However, the consensus in the community is that by measuring the difference between the reply from the server of the resource and the reply from the actual resource, it would generally be possible to detect the existence of a resource.</p>
</li>
<li>
<p>A remote entity can obtain the disco#info information of any resource which supports &caps; and of which the entity knows the resource.</p>
<p>This cannot be mitigated with &queryintercept;. The risk is deemed acceptable considering that resources should generally be chosen randomly.</p>
<p>A common way to canonicalize XML which could be used is &w3canon;. It was decided not to use <cite>Canonical XML</cite> for the following reasons:</p>
<ul>
<li>Implementing it is quite some effort and not all XML libraries come with an implementation.</li>
<li>It is sensitive to the relative ordering of the elements. The relative ordering of children in disco#info <query/> elements, however, does not matter.</li>
<li>Several children of &xep0128; data forms are deliberately ignored, like instructions and other descriptive text. The descriptive text is not relevant to the information being conveyed.</li>
</ul>
<p>Thus, using <cite>Canonical XML</cite> would require additional, non-trivial software support and still require non-trivial additional canonicalization rules.</p>
<p>Thanks to the authors of &xep0115; for coming up with the original idea of using presence broadcast to convey service discovery information, as well as the optimization strategies.</p>
<p>The note below the example in <link url='#usecases-stream-feature'>Advertisement of Support and Capabilities by Servers</link> has been copied verbatim from <cite>XEP-0115</cite>.</p>
<p>Thanks to Waqas Hussain for originally (to my knowledge) pointing out the security flaws in <cite>XEP-0115</cite> (see &mlwaqas1;).</p>