Merge branch 'feature/xep0390-0.2' of https://github.com/horazont/xeps into feature/xep-0390

This commit is contained in:
Jonas Wielicki 2017-08-23 16:25:45 +02:00
commit 062611e227
1 changed files with 91 additions and 4 deletions

@ -20,6 +20,8 @@
<!ENTITY procents "<em>Processing Entities</em>">
<!ENTITY genent "<em>Generating Entity</em>">
<!ENTITY genents "<em>Generating Entities</em>">
<!ENTITY queryintercept "<em>Query Interception</em>">
<!ENTITY gratcaps "<em>Gratuitous Capabilities</em>">
<!ENTITY mlwaqas1 "<note>org.jabber.security Mailing List Archive: '[Security] Trivial preimage attack against the entity capabilities protocol)' from 2009-07-22, &lt;<link url='https://mail.jabber.org/pipermail/security/2009-July/000812.html'>https://mail.jabber.org/pipermail/security/2009-July/000812.html</link>&gt;.</note>">
<!ENTITY capsdb "<span class='ref'><link url='https://github.com/xnyhps/capsdb/'>capsdb</link></span> <note><link url='https://github.com/xnyhps/capsdb/'>https://github.com/xnyhps/capsdb/</link></note>">
]>
@ -51,6 +53,18 @@
<email>jonas@wielicki.name</email>
<jid>jonas@wielicki.name</jid>
</author>
<revision>
<version>0.2</version>
<date>2017-06-14</date>
<initials>jwi</initials>
<remark>
<ul>
<li>Clearly specify handling of xml:lang attributes.</li>
<li>Add Query Interception.</li>
<li>Add Gratuitous Caps.</li>
</ul>
</remark>
</revision>
<revision>
<version>0.1</version>
<date>2017-03-23</date>
@ -85,11 +99,12 @@
<li>The bandwidth consumption should be as minimal as possible, while reusing existing specifications.</li>
<li>It must be possible to write &xep0045; and &xep0369; implementations which can forward this protocol with negligible extra work.</li>
<li>Entities must be able to update their published information arbitrarily often in a single presence session.</li>
<li>Server infrastructure beyond <cite>XMPP Core</cite> and <cite>XMPP IM</cite> must not be required for this to work.</li>
<li>Server infrastructure beyond <cite>XMPP Core</cite> and <cite>XMPP IM</cite> must not be required for this to work (but may be beneficial).</li>
<li>Entities must be able to be confident that the information obtained from the broadcast is equivalent to the information which would be obtained from querying the generating entity directly at the time the broadcast was generated.</li>
<li>The protocol must be able to coexist (but not necessarily exchange information) with &xep0115;.</li>
<li>No special XML features beyond what is needed to implement <cite>XMPP Core</cite> itself should be required.</li>
<li>Obsoletion of hash functions should not need a new version of the specification.</li>
<li>Support for pushing Entity Capabilities to the client's server without sending presence.</li>
</ol>
</section1>
@ -101,6 +116,8 @@
<di><dt>Capability Hash Set</dt><dd>A set of &hashes; which cover the same <cite>XEP-0030</cite> response, possibly in the form of a &lt;c/&gt; element with &xep0300; &lt;hash/&gt; children.</dd></di>
<di><dt>Generating Entity</dt><dd>An entity which emits a &hashset; to other entities.</dd></di>
<di><dt>Processing Entity</dt><dd>An entity which receives and processes a &hashset; from a &genent;.</dd></di>
<di><dt>Query Interception</dt><dd>Server-side processing of disco#info queries directed to a resource based on the &hashsets; published by that resource.</dd></di>
<di><dt>Gratuitous Capabilities</dt><dd>The sending of a &hashset; to a server before initial presence has been sent and without being asked by the server.</dd></di>
</dl>
</section1>
@ -109,7 +126,11 @@
<section2 topic='Hash Function Input' anchor='algorithm-input'>
<p>The input to this algorithm is a &xep0030; disco#info &lt;query/&gt; response. The output is an octet string which can be used as input to a hash function or an error.</p>
<p>General remarks: The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding as specified in &rfc3629; is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.</p>
<div class="box"><p>General remarks:</p>
<ul>
<li><p>The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding as specified in &rfc3629; is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.</p></li>
<li><p>The algorithm uses the <tt>xml:lang</tt> attribute. Implementations must take implicit values for the <tt>xml:lang</tt> attribute into account, for example those inherited from the disco#info or the IQ element.</p></li>
</ul></div>
<ol>
<li>If the &lt;query/&gt; element contains any elements except &lt;identity/&gt;, &lt;feature/&gt; (both from the &xep0030; disco#info namespace) or &xep0128; data forms, abort with an error.</li>
<li>If any &xep0128; &lt;x/&gt; element contains a data form which contains a &lt;reported/&gt; or &lt;item/&gt; element, abort with an error.</li>
@ -714,6 +735,44 @@ cDp0aW1lHxw=</code>
</iq>
]]></example>
</section2>
<section2 topic='Gratuitous Capabilities' anchor='usecases-gratuitous'>
<p>A server MAY allow clients to push &hashes; before they send initial presence. This lets the server discover the capabilities of clients which have not yet sent initial presence, which may be useful or important for some protocols (such as &xep0369;). This feature is called &gratcaps;.</p>
<p>To advertise support, the server publishes the <tt>urn:xmpp:caps:gratuitous</tt> feature:</p>
<example caption='Response to a disco#info request if the server supports Gratuitous Capabilities'><![CDATA[
<iq from='montague.lit'
id='disco3'
to='romeo@montague.lit/chamber'
type='result'>
<query xmlns='http://jabber.org/protocol/disco#info'>
...
<feature var='urn:xmpp:caps'/>
<feature var='urn:xmpp:caps:gratuitous'/>
...
</query>
</iq>
]]></example>
<p>After determining server support, a client can send &hashes; via &gratcaps; before sending initial presence:</p>
<example caption='Sending Gratuitous Capabilities'><![CDATA[
<iq from='romeo@montague.lit/chamber'
to='montague.lit'
id='grat1'
type='set'>
<c xmlns="urn:xmpp:caps">
<hash xmlns="urn:xmpp:hashes:2" algo="sha-256">u79ZroNJbdSWhdSp311mddz44oHHPsEBntQ5b1jqBSY=</hash>
<hash xmlns="urn:xmpp:hashes:2" algo="sha3-256">XpUJzLAc93258sMECZ3FJpebkzuyNXDzRNwQog8eycg=</hash>
</c>
</iq>
<iq from='montague.lit'
to='romeo@montague.lit/chamber'
id='grat1'
type='result'>
</iq>
]]></example>
<p>The server replies with an empty result on success.</p>
<p>The server MUST NOT broadcast the &hashes; submitted via &gratcaps; using presence.</p>
<p>Clients SHOULD NOT send &gratcaps; after they have sent initial presence; instead, they SHOULD re-send presence to update the &hashes;. Otherwise, entities subscribed to the presence will not receive the updated &hashes;.</p>
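<p>As a hedged illustration of the last rule, a client which has already sent initial presence would update its &hashes; by re-sending presence instead of sending &gratcaps;. In the following sketch, the hash values are merely reused from the example above as placeholders; after a real change of the disco#info response they would differ.</p>
<example caption='Updating the hashes via presence after initial presence has been sent (illustrative)'><![CDATA[
<presence>
<c xmlns="urn:xmpp:caps">
<hash xmlns="urn:xmpp:hashes:2" algo="sha-256">u79ZroNJbdSWhdSp311mddz44oHHPsEBntQ5b1jqBSY=</hash>
<hash xmlns="urn:xmpp:hashes:2" algo="sha3-256">XpUJzLAc93258sMECZ3FJpebkzuyNXDzRNwQog8eycg=</hash>
</c>
</presence>
]]></example>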
</section2>
</section1>
<section1 topic='Business Rules' anchor='rules'>
@ -721,7 +780,8 @@ cDp0aW1lHxw=</code>
<ul>
<li>Entities MUST respond to disco#info queries for all &hashnodes; of at least the most recent 3 &hashsets; emitted.</li>
<li>Entities MUST broadcast the &hashset; of the current disco#info they publish in every non-directed "available" &lt;presence/&gt; they send and SHOULD do so for directed "available" &lt;presence/&gt;.</li>
<li>Entities MUST re-broadcast the &hashset; after their disco#info response changes, but MAY limit the rate at which presences are emitted solely for the purpose of sending new &hashsets;.</li>
<li>After initial presence has been sent, entities MUST re-broadcast the &hashset; after their disco#info response changes, but MAY limit the rate at which presences are emitted solely for the purpose of sending new &hashsets;.</li>
<li>Before initial presence has been sent and if the server supports &gratcaps;, entities SHOULD send &gratcaps; after their disco#info response changes, but MAY limit the rate at which &gratcaps; are sent. (For example, a client may load and enable additional functionality (thus changing its features) based on server support and only send &gratcaps; once all functionality has been set up, not after each individual feature.)</li>
<li>Entities MAY assume that another entity supports &caps; after receiving a &hashset; from that entity.</li>
<li>Entities MAY also send &xep0115; capabilities to support legacy entities.</li>
</ul>
@ -739,6 +799,7 @@ cDp0aW1lHxw=</code>
<p>Instead of issuing a &xep0030; disco#info &lt;query/&gt; with absent 'node' attribute to a target entity, an entity MAY use a &hashcache; to obtain the response. To look up the disco#info response in the &hashcache;, an entity MUST use a hash from the &hashset; which was most recently received from the entity to which the &lt;query/&gt; would have been sent otherwise. If none of the most recently received &hashes; are found in the &hashcache;, the entity MUST fall back to sending the request.</p>
<p>An entity MUST NOT use &hashes; which were not included in the most recent &hashset; received from the target entity.</p>
<p>An entity MAY use external data sources to fill the &hashcache;.</p>
<p>An entity MUST ensure that implicit values for <tt>xml:lang</tt> attributes are preserved when disco#info data is cached. This can be achieved, for example, by making the implicit values explicit in the storage.</p>
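<p>The following sketch illustrates one way of making an implicit value explicit; it is an assumption about a possible storage layout, not a required format. The JIDs, the stanza id, the identity and the feature are hypothetical. In the received response, the <tt>xml:lang</tt> value is declared only on the IQ element:</p>
<example caption='Received disco#info response with an inherited xml:lang (illustrative)'><![CDATA[
<iq from='romeo@montague.lit/chamber'
to='juliet@capulet.lit/balcony'
id='disco5'
type='result'
xml:lang='en'>
<query xmlns='http://jabber.org/protocol/disco#info'>
<identity category='client' type='pc' name='Example Client'/>
<feature var='http://jabber.org/protocol/disco#info'/>
</query>
</iq>
]]></example>
<p>If only the &lt;query/&gt; element were stored, the inherited value would be lost. A cache could therefore copy the value onto the stored element:</p>
<example caption='Cached disco#info with the xml:lang made explicit (illustrative)'><![CDATA[
<query xmlns='http://jabber.org/protocol/disco#info' xml:lang='en'>
<identity category='client' type='pc' name='Example Client'/>
<feature var='http://jabber.org/protocol/disco#info'/>
</query>
]]></example>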
</section3>
</section2>
@ -751,6 +812,17 @@ cDp0aW1lHxw=</code>
<li>Servers MAY answer disco#info requests for &hashnodes; on behalf of their own and others' clients if the disco#info response belonging to that &hash; is known to them.</li>
</ul>
</section2>
<section2 topic='Query Interception' anchor='rules-servers'>
<p>Servers MAY implement &queryintercept; to further optimise bandwidth consumption. The idea is that servers intercept &xep0030; disco#info queries sent to clients if they already know the answer from &hashes; published by the client. The rules for &queryintercept; are the following (to be applied in this order):</p>
<ul>
<li>Servers MUST NOT intercept disco#info queries except those with empty <tt>node</tt> or a <tt>node</tt> which refers to a &hashnode; known to the server.</li>
<li>Servers MUST NOT intercept disco#info queries on behalf of the resource unless the query would be forwarded to the resource otherwise.</li>
<li>Servers MUST NOT intercept disco#info queries to resources which do not support &caps; (clients not implementing &caps; may legitimately use disco#info nodes matching the format of &hashnodes; for different purposes).</li>
<li>Servers SHOULD intercept disco#info queries with empty <tt>node</tt> and answer them with the disco#info of the most recent &hashset; published by the client.</li>
<li>Servers SHOULD intercept disco#info queries whose <tt>node</tt> is a valid &hashnode;, if the server knows the disco#info for the &hashnode;. Otherwise, the query MUST be forwarded to the addressed resource. Note that it is valid for a server to reply for &hashnodes; which have not been published by the resource.</li>
</ul>
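<p>As a non-normative sketch of the rule for queries with empty <tt>node</tt>: the query below is addressed to a resource which has published a &hashset;, and the server answers it without forwarding it to the client. The querying JID, the stanza id and the listed identity and features are assumptions chosen for illustration.</p>
<example caption='disco#info query answered by the server via Query Interception (illustrative)'><![CDATA[
<iq from='juliet@capulet.lit/balcony'
to='romeo@montague.lit/chamber'
id='disco4'
type='get'>
<query xmlns='http://jabber.org/protocol/disco#info'/>
</iq>
<iq from='romeo@montague.lit/chamber'
to='juliet@capulet.lit/balcony'
id='disco4'
type='result'>
<query xmlns='http://jabber.org/protocol/disco#info'>
<identity category='client' type='pc'/>
<feature var='http://jabber.org/protocol/disco#info'/>
<feature var='urn:xmpp:caps'/>
</query>
</iq>
]]></example>
<p>In this sketch, the reply carries the address of the resource in the 'from' attribute, as it would if the client had answered the query itself.</p>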
</section2>
</section1>
<section1 topic='Implementation Notes' anchor='impl'>
@ -773,11 +845,26 @@ cDp0aW1lHxw=</code>
<p>This was an issue with &xep0115; and has been addressed with a new algorithm for generating the hash function input which keeps the structural information of the disco#info input.</p>
<p>An entity MUST NOT ever use disco#info which has not been verified to belong to a &hash; obtained from a cache using that &hash;. Using cache contents from a trusted source (at the discretion of the entity) counts as verifying.</p>
<p>A malicious entity could send a large amount of &hashsets; in short intervals, while making sure that it provides matching disco#info responses. If a &procent; uses caching, this can overflow or thrash the caches. &procents; should be aware of this risk and apply proper rate-limiting for processing &hashsets;. To reduce the attack surface, an entity MAY choose to not cache &hashes; obtained from entities not in its roster.</p>
<p>As mentioned earlier, when storing disco#info data in a cache for later retrieval, implementations MUST ensure that implicit values for <tt>xml:lang</tt> attributes are reconstructed correctly when the disco#info is restored.</p>
</section2>
<section2 topic='Directed Presence' anchor='security-directed-presence'>
<p>Entities MAY choose to not send &hashsets; with directed presence (for example to increase privacy). In that case, entities SHOULD also refuse direct &xep0030; queries.</p>
</section2>
<section2 topic='Query Interception' anchor='security-query-interception'>
<p>The server replies to certain disco#info queries on behalf of the client. This means that the client cannot choose to whom it replies. Without &queryintercept;, a client could choose to reply with <tt>&lt;service-unavailable/&gt;</tt> to mask its existence. We consider two effects of this:</p>
<ul>
<li>
<p>A remote entity could attempt to detect whether an entity exists behind a resource. To do so, it sends a disco#info query to the resource, since nearly every entity implements disco#info. If the client responds with <tt>&lt;service-unavailable/&gt;</tt>, it looks as if no client were present at this resource.</p>
<p>With &queryintercept;, the server would reply on behalf of the client. However, the consensus in the community is that the existence of a resource can generally be detected anyway, by measuring the difference between the reply from the server of the resource and the reply from the actual resource.</p>
</li>
<li>
<p>A remote entity can obtain the disco#info information of any resource which supports &caps;, provided that the entity knows the resource.</p>
<p>This cannot be mitigated when &queryintercept; is in use. The risk is deemed acceptable considering that resources should generally be chosen randomly.</p>
</li>
</ul>
</section2>
</section1>
<section1 topic='Design Considerations' anchor='design'>
@ -871,7 +958,7 @@ cDp0aW1lHxw=</code>
<p>Thanks to the authors of &xep0115; for coming up with the original idea of using presence broadcast to convey service discovery information, as well as the optimization strategies.</p>
<p>The note below the example in <link url='#usecases-stream-feature'>Advertisement of Support and Capabilities by Servers</link> has been copied verbatim from <cite>XEP-0115</cite>.</p>
<p>Thanks to Waqas Hussain for originally (to my knowledge) pointing out the security flaws in <cite>XEP-0115</cite> (see &mlwaqas1;).</p>
<p>Thanks to Georg Lukas, Link Mauve, Sebastian Riese, Florian Schmaus and Sam Whithed for their input, editorial and otherwise.</p>
<p>Thanks to Dave Cridland, Georg Lukas, Link Mauve, Sebastian Riese, Florian Schmaus and Sam Whited for their input, editorial and otherwise.</p>
</section1>
</xep>