The input to this algorithm is a &xep0030; disco#info <query/> response. The output is an octet string which can be used as input to a hash function or an error.
-General remarks: The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 encoding as specified in &rfc3629; is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.
+General remarks:
+The algorithm strongly distinguishes between character data (sequences of Unicode code points) and octet strings (sequences of 8-bit bytes). Whenever character data is encoded to octet strings in the following algorithm, the UTF-8 as specified in &rfc3629; encoding is used. Whenever octet strings are sorted in the following algorithm, the i;octet collation as specified in &rfc4790; is used.
The algorithm uses the xml:lang attribute. Implementations must take implicit values for the xml:lang attribute into account, for example those inherited from the disco#info or the IQ element.
A server MAY support pushing of &hashes; from clients before sending initial presence. This allows servers to discover capabilities of clients before those have sent initial presence, which may be useful or important for some protocols (such as &xep0369;). This feature is called &gratcaps;.
+To advertise support, the server publishes the urn:xmpp:caps:gratuitous feature:
+After determining server support, a client can send &hashes; via &gratcaps; before sending initial presence:
+The server replies with an empty result on success.
+The server MUST NOT broadcast the &hashes; submitted via &gratcaps; using presence.
+Clients SHOULD NOT send &gratcaps; after they have sent initial presence; instead, they SHOULD re-send presence to update the &hashes;. Otherwise, entities subscribed to the presence will not receive the updated &hashes;.
+Instead of issuing a &xep0030; disco#info <query/> with absent 'node' attribute to a target entity, an entity MAY use a &hashcache; to obtain the response. To look up the disco#info response in the &hashcache;, an entity MUST use a hash from the &hashset; which was most recently received from the entity to which the <query/> would have been sent otherwise. If none of the most recently received &hashes; are found in the &hashcache;, the entity MUST fall back to sending the request.
An entity MUST NOT use &hashes; which were not included in the most recent &hashset; received from the target entity.
An entity MAY use external data sources to fill the &hashcache;.
+An entity MUST ensure that implicit values for xml:lang attributes is preserved when disco#info data is cached. This can for example happen by making the implicit values explicit in the storage.
@@ -751,6 +812,17 @@ cDp0aW1lHxw=Servers MAY implement &queryintercept; to further optimise bandwidth consumption. The idea is that servers intercept &xep0030; disco#info queries sent to clients if they already know the answer from &hashes; published by the client. The rules for &queryintercept; are the following (to be applied in this order):
+This was an issue with &xep0115; and has been addressed with a new algorithm for generating the hash function input which keeps the structural information of the disco#info input.
An entity MUST NOT ever use disco#info which has not been verified to belong to a &hash; obtained from a cache using that &hash;. Using cache contents from a trusted source (at the discretion of the entity) counts as verifying.
A malicious entity could send a large amount of &hashsets; in short intervals, while making sure that it provides matching disco#info responses. If a &procent; uses caching, this can overflow or thrash the caches. &procents; should be aware of this risk and apply proper rate-limiting for processing &hashsets;. To reduce the attack surface, an entity MAY choose to not cache &hashes; obtained from entities not in its roster.
+As mentioned earlier, when storing disco#info data in a cache for later retrieval, implementations MUST ensure that implicit values for xml:lang attributes are reconstructed correctly when the disco#info is restored.
Entities MAY choose to not send &hashsets; with directed presence (for example to increase privacy). In that case, entities SHOULD also refuse direct &xep0030; queries.
The server replies to certain disco#info queries on behalf of the client. This means that the client has no choice on to whom they reply. Otherwise, a client could choose to reply with <service-unavailable/> to mask its existence. We consider two effects of this:
+A remote entity could attempt to detect that an entity exists behind a resource. For this, they send a disco#info query to the resource since nearly everyone implements disco#info. As the client responds with <service-unavailable/>, it looks as if no client was present at this resource.
+With &queryintercept;, the server would reply on behalf of the client. However, the consensus in the community is that by measuring the difference between the reply from the server of the resource and the reply from the actual resource, it would generally be possible to detect the existence of a resource.
+A remote entity can obtain the disco#info information of any resource which supports ∩︀ and of which the entity knows the resource.
+This cannot be mitigated with &queryintercept;. The risk is deemed acceptable considering that resources should generally be chosen randomly.
+Thanks to the authors of &xep0115; for coming up with the original idea of using presence broadcast to convey service discovery information, as well as the optimization strategies.
The note below the example in Advertisement of Support and Capabilities by Servers has been copied verbatimly from XEP-0115.
Thanks to Waqas Hussain for originally (to my knowledge) pointing out the security flaws in XEP-0115 (see &mlwaqas1;).
-Thanks to Georg Lukas, Link Mauve, Sebastian Riese, Florian Schmaus and Sam Whithed for their input, editorial and otherwise.
+Thanks to Dave Cridland, Georg Lukas, Link Mauve, Sebastian Riese, Florian Schmaus and Sam Whited for their input, editorial and otherwise.