<abstract>This document defines an XMPP protocol extension for broadcasting and dynamically discovering client, device, or generic entity capabilities. In order to minimize network impact, the transport mechanism is standard XMPP presence broadcast (thus forestalling the need for polling related to service discovery data), the capabilities information can be cached either within a session or across sessions, and the format has been kept as small as possible.</abstract>
<remark><p>In response to persistent security concerns over caps poisoning, redefined ver attribute to be a hash of the service discovery identity and features in a way that is backwards-compatible with the legacy format.</p></remark>
<remark><p>Added developer-friendly introduction; specified that ext names must be stable across application versions; further clarified examples; added stream feature use case; removed message example (send directed presence instead).</p></remark>
<remark><p>Added several items to the Security Considerations; clarified naming requirements regarding 'node', 'ver', and 'ext' attributes.</p></remark>
<remark><p>Specified that the protocol can be used whenever presence is used (e.g., by gateways); improved the XML schema; made several editorial adjustments.</p></remark>
<remark><p>Add more clarifying assumptions and requirements, make it clear that clients don't have to send capabilities every time if the server is optimizing.</p></remark>
<p>It is often desirable for an XMPP application (commonly but not necessarily a client) to take different actions depending on the capabilities of another application from which it receives presence information. Examples include:</p>
<p>In the past, after logging in, some Jabber clients sent one &xep0030; and one &xep0092; request to each entity from which they received presence. That "disco/version flood" resulted in an excessive use of bandwidth and was impractical on a larger scale, particularly for users with large rosters. Therefore this document defines a more robust and scalable solution: namely, a presence-based mechanism <note>This proposal is not limited to clients, and can be used by any entity that exchanges presence with another entity, e.g., a gateway. However, this document uses the example of clients throughout.</note> for exchanging information about entity capabilities. Clients should not engage in the older "disco/version flood" behavior and instead should use Entity Capabilities as specified herein.</p>
<p>Imagine that you are a Shakespearean character named Juliet and one of your contacts, a handsome fellow named Romeo, becomes available. His client wants to publish its capabilities, and does this by adding to its presence packets a <c/> element with special attributes. As a result, your client receives the following presence packet:</p>
<p>The 'node' attribute represents the client software Romeo is using. The optional 'v' attribute represents the specific version of that client software (it is only an "FYI" and is not used further in entity capabilities). The 'ver' attribute is a specially-constructed string that represents the identity (see &DISCOCATEGORIES;) and supported features (see &DISCOFEATURES;) of the entity.</p>
<p>At this point, your client has no idea what the capabilities are of someone with a version string '8RovUdtOmiAjzj+xI7SK5BCw3A8='. Your client therefore sends a service discovery query to Romeo, asking what his client can do.</p>
<p>At this point, your client knows that anyone advertising a version string of '8RovUdtOmiAjzj+xI7SK5BCw3A8=' has a client that can do &xep0045; and the other features returned by Romeo's client (the string can be relied upon because of how it is generated and checked, as explained later in this document). Your client remembers this information, so that it does not need to explicitly query the capabilities of a contact with the same version string. For example, Benvolio may send you the following presence:</p>
<p>... you have no information about what this contact's client is capable of unless you have cached previous entity capabilities information; therefore you need to query for capabilities explicitly again via service discovery.</p>
<li>Members of a community tend to cluster around a small set of clients with a small set of capabilities. More specifically, multiple people in my roster use the same client, and they upgrade versions relatively slowly (commonly a few times a year, perhaps once a week at most, certainly not once a minute).</li>
<li>Clients must be able to participate even if they support only &xmppcore;, &xmppim;, and <cite>XEP-0030</cite>.</li>
<li>Clients must be able to participate even if they are on networks without connectivity to other XMPP servers, services offering specialized XMPP extensions, or HTTP servers.<note>These first two requirements effectively eliminated &xep0060; as a possible implementation of entity capabilities.</note></li>
<li>It must be possible to write a XEP-0045 server implementation that passes the given information along.</li>
<li>It must be possible to publish a change in capabilities within a single presence session.</li>
<li>Server infrastructure above and beyond that defined in <cite>XMPP Core</cite> and <cite>XMPP IM</cite> must not be required for this approach to work, although additional server infrastructure may be used for optimization purposes.</li>
<p>Entity capabilities are encapsulated in a <c/> element qualified by the 'http://jabber.org/protocol/caps' namespace. The attributes of the <c/> element are as follows.</p>
<td>A set of nametokens specifying additional feature bundles; this attribute is deprecated (see the <link url='#legacy'>Legacy Format</link> section of this document).</td>
<td>The hashing algorithm used to generate the 'ver' attribute; see <link url='#security-mti'>Mandatory-to-Implement Technologies</link> regarding supported hashing algorithms.</td>
<p>* Note: It is RECOMMENDED for the value of the 'node' attribute to be an HTTP URL at which a user could find further information about the software product, such as "http://psi-im.org/" for the Psi client; this enables a processing application to also determine a unique string for the generating application, which it could maintain in a list of known software implementations (e.g., associating the name received via the disco#info reply with the URL found in the caps data).</p>
<p>*** Note: Before version 1.4 of this specification, the 'ver' attribute was used to specify the released version of the software; while the values of the 'ver' attribute that result from use of the algorithm specified herein are backwards-compatible, applications SHOULD appropriately handle the <link url='#legacy'>Legacy Format</link>.</p>
<p>In order to help prevent poisoning of entity capabilities information, the value of the 'ver' attribute MUST be generated according to the following method.</p>
<p>Note: All sorting operations MUST be performed using "i;octet" collation as specified in Section 9.3 of &rfc4790;.</p>
<li>Sort the service discovery identities <note>A registry of service discovery identities is located at &DISCOCATEGORIES;.</note> by category and then by type (if it exists), formatted as 'category' '/' 'type'.</li>
<li>Compute ver by hashing S using the algorithm specified in the 'hash' attribute (e.g., SHA-1 as defined in &rfc3174;). The hashed data MUST be generated with binary output and encoded using Base64 as specified in Section 4 of &rfc4648; (note: the Base64 output MUST NOT include whitespace and MUST set padding bits to zero). <note>The OpenSSL command for producing such output with SHA-1 is "echo -n 'S' | openssl dgst -binary -sha1 | openssl enc -nopad -base64".</note></li>
<p>For example, consider an entity whose service discovery category is "client", whose service discovery type is "pc", and whose supported features are "http://jabber.org/protocol/disco#info", "http://jabber.org/protocol/disco#items", and "http://jabber.org/protocol/muc". Using the SHA-1 algorithm, the value of the 'ver' attribute would be generated as follows:</p>
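<p>The generation method can also be sketched in Python. This is a minimal illustration rather than a normative implementation. It assumes that each sorted identity and feature is appended to S followed by a "&lt;" delimiter (the steps that define how S is assembled are not reproduced in the text above, so the delimiter is exposed as a parameter), and the function names are hypothetical.</p>

```python
import base64
import hashlib


def build_s(identities, features, delimiter="<"):
    """Build the string S from (category, type) pairs and feature names.

    ASSUMPTION: each entry is appended to S followed by `delimiter`.
    """
    # "i;octet" collation compares raw octets, so sort on the UTF-8 bytes.
    sorted_identities = sorted(
        identities,
        key=lambda pair: (pair[0].encode("utf-8"), (pair[1] or "").encode("utf-8")),
    )
    sorted_features = sorted(features, key=lambda name: name.encode("utf-8"))
    parts = [f"{category}/{type_ or ''}" for category, type_ in sorted_identities]
    parts.extend(sorted_features)
    return "".join(part + delimiter for part in parts)


def compute_ver(identities, features, hash_name="sha1"):
    """Hash S and return the Base64 text used as the 'ver' value."""
    s = build_s(identities, features)
    digest = hashlib.new(hash_name, s.encode("utf-8")).digest()  # binary output
    return base64.b64encode(digest).decode("ascii")  # single line, no whitespace


identities = [("client", "pc")]
features = [
    "http://jabber.org/protocol/muc",
    "http://jabber.org/protocol/disco#items",
    "http://jabber.org/protocol/disco#info",
]
S = build_s(identities, features)
ver = compute_ver(identities, features)
print(S)
print(ver)
```

<p>The features are deliberately supplied out of order to show that the result depends only on the sorted set, not on the order in which an entity happens to list them.</p>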
<p>Each time a generating entity sends presence, it annotates that presence with an entity identifier ('node' attribute) and identity and feature identifier ('ver' attribute). So that servers can remember the last presence for use in responding to probes, a client SHOULD include entity capabilities with every presence notification it sends.</p>
<p>If the supported features change during a generating entity's presence session (e.g., a user installs an updated version of a client plugin), the application MUST recompute the 'ver' attribute and SHOULD send a new presence broadcast.</p>
<example caption='Presence with recomputed ver attribute'><![CDATA[
<p>An application (the "requesting entity") can learn what features another entity supports by sending a disco#info request (see <cite>XEP-0030</cite>) to the entity that generated the caps information (the "generating entity").</p>
<p>The disco#info request is sent by the requesting entity to the generating entity. The value of the 'to' attribute MUST be the exact JID of the generating entity, which in the case of a client will be the full JID (&FULLJID;).</p>
<p>The disco 'node' attribute MUST be included for backwards-compatibility. The value of the 'node' attribute SHOULD be generated by concatenating the value of the caps 'node' attribute (e.g., "http://code.google.com/p/exodus") as provided by the generating entity, the "#" character, and the value of the caps 'ver' attribute (e.g., "8RovUdtOmiAjzj+xI7SK5BCw3A8=") as provided by the generating entity.</p>
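<p>As a rough illustration, the disco 'node' value and the request that carries it could be assembled as follows. The JIDs, the stanza id, and the helper names are hypothetical, and writing the namespace as a literal 'xmlns' attribute is a shortcut for this sketch rather than how a real XMPP library would handle namespaces.</p>

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"


def disco_node(caps_node, caps_ver):
    """Concatenate the caps 'node', the "#" character, and the caps 'ver'."""
    return f"{caps_node}#{caps_ver}"


def build_disco_request(from_jid, to_jid, caps_node, caps_ver, stanza_id):
    """Build a disco#info 'get' addressed to the generating entity's exact JID."""
    iq = ET.Element(
        "iq", {"from": from_jid, "to": to_jid, "type": "get", "id": stanza_id}
    )
    # Shortcut: the namespace is emitted as a plain attribute for readability.
    ET.SubElement(
        iq, "query", {"xmlns": DISCO_INFO_NS, "node": disco_node(caps_node, caps_ver)}
    )
    return iq


request = build_disco_request(
    from_jid="juliet@capulet.lit/balcony",   # hypothetical requesting entity
    to_jid="romeo@montague.lit/orchard",     # hypothetical generating entity
    caps_node="http://code.google.com/p/exodus",
    caps_ver="8RovUdtOmiAjzj+xI7SK5BCw3A8=",
    stanza_id="disco1",
)
print(ET.tostring(request, encoding="unicode"))
```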
<p>The generating entity then returns all of the capabilities it supports.</p>
<p>The requesting entity MUST check the identities and supported features against the 'ver' value by calculating the hash as described under <link url='#ver'>Generation of the ver Attribute</link> and making sure that the values match. If the values do not match, the requesting entity MUST NOT accept or cache the 'ver' value as reliable and SHOULD check the service discovery identity and supported features of another generating entity who advertises that value (if any). This helps to prevent poisoning of entity capabilities information.</p>
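<p>The check might look like the following sketch, which recomputes the hash from a disco#info result and compares it with the advertised value. It again assumes a "&lt;" delimiter when assembling S, and the sample reply and the "urn:example:extra" feature are invented for the demonstration.</p>

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

DISCO_INFO = "{http://jabber.org/protocol/disco#info}"


def ver_from_disco(query, hash_name="sha1", delimiter=b"<"):
    """Recompute 'ver' from a disco#info <query/> element.

    ASSUMPTION: sorted entries are joined into S with a trailing `delimiter`.
    """
    # Sorting bytes objects compares octet by octet ("i;octet" collation).
    identities = sorted(
        (i.get("category", "").encode("utf-8"), i.get("type", "").encode("utf-8"))
        for i in query.findall(DISCO_INFO + "identity")
    )
    features = sorted(
        f.get("var", "").encode("utf-8") for f in query.findall(DISCO_INFO + "feature")
    )
    s = b"".join(c + b"/" + t + delimiter for c, t in identities)
    s += b"".join(f + delimiter for f in features)
    return base64.b64encode(hashlib.new(hash_name, s).digest()).decode("ascii")


def caps_match(query, advertised_ver, hash_name="sha1"):
    """Return True only if the disco#info result hashes to the advertised value."""
    return ver_from_disco(query, hash_name) == advertised_ver


RESULT = """
<query xmlns='http://jabber.org/protocol/disco#info'>
  <identity category='client' type='pc'/>
  <feature var='http://jabber.org/protocol/disco#info'/>
  <feature var='http://jabber.org/protocol/disco#items'/>
  <feature var='http://jabber.org/protocol/muc'/>
</query>
""".strip()

query = ET.fromstring(RESULT)
honest_ver = ver_from_disco(query)

# A poisoned reply claims an extra feature while reusing the same 'ver'.
poisoned = ET.fromstring(RESULT)
ET.SubElement(poisoned, DISCO_INFO + "feature", {"var": "urn:example:extra"})

print(caps_match(query, honest_ver))     # the honest reply verifies
print(caps_match(poisoned, honest_ver))  # the altered reply does not
```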
<p>A server MAY include its entity capabilities in a stream feature element so that connecting clients and peer servers do not need to send service discovery requests each time they connect.</p>
<p>When a connected client or peer server sends a service discovery information request to determine the entity capabilities of a server that advertises capabilities via the stream feature, the requesting entity MUST send the disco#info request to the server's JID as provided in the 'from' attribute of the response stream header (the 'from' attribute was recommended by &rfc3920; and is required by &rfc3920bis;). To enable this functionality, a server that advertises support for entity capabilities MUST provide a 'from' address in its response stream headers, in accordance with <cite>rfc3920bis</cite>.</p>
<p>If an entity supports the entity capabilities protocol, it MUST advertise that fact by returning a feature of <strong>'http://jabber.org/protocol/caps'</strong> in response to a service discovery information request.</p>
<example caption="Service discovery information request"><![CDATA[
<p>If a server supports the <link url='#optimization'>Server Optimization</link> functionality, it MUST also return a feature of <strong>'http://jabber.org/protocol/caps#optimize'</strong> in response to service discovery information requests.</p>
<example caption="Service discovery information request"><![CDATA[
<p>An application should maintain a list of hashing algorithms it supports, which MUST include the <link url='#security-mti'>Mandatory-to-Implement Technologies</link>. If the application receives a caps notification that was generated using one of its supported hashing algorithms, then it SHOULD verify the hash and cache the value globally. If the application receives a caps notification generated using a hash it does not support, then it SHOULD NOT attempt to verify the hash but SHOULD cache it on a per-JID basis (i.e., it SHOULD send a service discovery information request to the JID and cache the results for that JID only).</p>
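<p>One possible way to organize the two cache scopes is sketched below. The class and method names are hypothetical, and the algorithm name "sha-1" is assumed to be the form in which the 'hash' attribute carries SHA-1.</p>

```python
class CapsCache:
    """Feature cache that is global for verifiable hashes, per JID otherwise."""

    def __init__(self, supported_hashes=("sha-1",)):
        self.supported_hashes = set(supported_hashes)
        self.by_ver = {}  # (hash, ver) -> features, shared across all JIDs
        self.by_jid = {}  # (jid, hash, ver) -> features, trusted for that JID only

    def store(self, jid, hash_name, ver, features, verified):
        """Cache a disco#info result and report which scope was used."""
        if hash_name in self.supported_hashes:
            if not verified:
                return "rejected"  # hash mismatch: do not cache the value
            self.by_ver[(hash_name, ver)] = frozenset(features)
            return "global"
        # Unsupported algorithm: cannot verify, so scope the result to this JID.
        self.by_jid[(jid, hash_name, ver)] = frozenset(features)
        return "per-jid"

    def lookup(self, jid, hash_name, ver):
        """Return cached features, or None if a disco#info query is needed."""
        if hash_name in self.supported_hashes:
            return self.by_ver.get((hash_name, ver))
        return self.by_jid.get((jid, hash_name, ver))


cache = CapsCache()
muc = ["http://jabber.org/protocol/muc"]
scope_known = cache.store("romeo@montague.lit/orchard", "sha-1", "verA", muc, verified=True)
scope_unknown = cache.store("romeo@montague.lit/orchard", "x-other", "verB", muc, verified=False)
scope_bad = cache.store("mallory@example.com/evil", "sha-1", "verC", muc, verified=False)
print(scope_known, scope_unknown, scope_bad)
```

<p>With this layout, a verified value learned from one contact is reused for every other contact that advertises it, while a value produced by an unsupported algorithm never leaks beyond the JID that sent it.</p>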
</section2>
<section2 topic='Caching' anchor='impl-cache'>
<p>It is RECOMMENDED for an application that processes entity capabilities information to cache associations between the 'ver' attribute and discovered features within the scope of one presence session. This obviates the need for extensive service discovery requests within a session.</p>
<p>It is OPTIONAL for an application to cache associations across presence sessions. However, since this obviates the need for extensive service discovery requests at the beginning of a session, such caching is strongly encouraged, especially in bandwidth-constrained environments.</p>
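<p>A cross-session cache can be as simple as a small file that is read at startup and rewritten when new values are learned. The sketch below uses JSON; the file name, the on-disk layout, and the function names are assumptions made for illustration only.</p>

```python
import json
import os
import tempfile


def save_cache(path, cache):
    """Write a {(hash, ver): features} mapping to disk as JSON."""
    entries = [
        {"hash": hash_name, "ver": ver, "features": sorted(features)}
        for (hash_name, ver), features in cache.items()
    ]
    tmp_path = path + ".tmp"
    with open(tmp_path, "w", encoding="utf-8") as handle:
        json.dump(entries, handle)
    os.replace(tmp_path, path)  # swap in the new file in one step


def load_cache(path):
    """Read the mapping back; a missing file simply means an empty cache."""
    try:
        with open(path, encoding="utf-8") as handle:
            entries = json.load(handle)
    except FileNotFoundError:
        return {}
    return {(e["hash"], e["ver"]): frozenset(e["features"]) for e in entries}


with tempfile.TemporaryDirectory() as directory:
    path = os.path.join(directory, "caps-cache.json")  # hypothetical location
    empty = load_cache(path)  # first session: nothing stored yet
    save_cache(
        path,
        {("sha-1", "8RovUdtOmiAjzj+xI7SK5BCw3A8="): {"http://jabber.org/protocol/muc"}},
    )
    restored = load_cache(path)  # next session: no disco#info query needed

print(restored)
```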
<p>If two entities exchange messages but they do not normally exchange presence (i.e., via presence subscription), the entities MAY choose to send directed presence to each other, where the presence information SHOULD be annotated with the same capabilities information as each entity sends in broadcasted presence. Until and unless capabilities information has been received from another entity, an application MUST assume that the other entity does not support capabilities.</p>
<p>A server that is managing a connected client's presence session MAY optimize presence notification traffic sent through the server by stripping off redundant capabilities annotations. Because of this, receivers of presence notifications MUST NOT expect an annotation on every presence notification they receive. If the server performs caps optimization, it MUST ensure that the first presence notification each subscriber receives contains the annotation. The server MUST also ensure that any changes in the caps information (e.g., an updated 'ver' attribute) are sent to all subscribers.</p>
<p>If a connected client determines that its server supports caps optimization, it MAY choose to send the capabilities annotation only on the first presence packet, as well as whenever its capabilities change.</p>
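<p>On the server side, the two guarantees above (the first notification to each subscriber carries the annotation, and every change is delivered) could be met by remembering the last annotation delivered per sender and subscriber. The following is only a sketch of that bookkeeping; the class name, the method names, and the JIDs are hypothetical.</p>

```python
class CapsOptimizer:
    """Track which caps annotation each subscriber has already been sent."""

    def __init__(self):
        # (sender full JID, subscriber JID) -> last (node, ver) delivered
        self._last_sent = {}

    def should_include_caps(self, sender, subscriber, caps):
        """Return True if the annotation is new to this subscriber."""
        key = (sender, subscriber)
        if self._last_sent.get(key) == caps:
            return False  # redundant: the server may strip the annotation
        self._last_sent[key] = caps  # first notification or a changed value
        return True

    def forget_sender(self, sender):
        """Drop state when the sender's presence session ends."""
        for key in [k for k in self._last_sent if k[0] == sender]:
            del self._last_sent[key]


optimizer = CapsOptimizer()
romeo = "romeo@montague.lit/orchard"
juliet = "juliet@capulet.lit"
old_caps = ("http://code.google.com/p/exodus", "verA")
new_caps = ("http://code.google.com/p/exodus", "verB")

first = optimizer.should_include_caps(romeo, juliet, old_caps)
repeat = optimizer.should_include_caps(romeo, juliet, old_caps)
changed = optimizer.should_include_caps(romeo, juliet, new_caps)
optimizer.forget_sender(romeo)
after_relogin = optimizer.should_include_caps(romeo, juliet, new_caps)
print(first, repeat, changed, after_relogin)
```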
<p>The 'name' attribute of the service discovery <identity/> element is not included in the hash generation method. The primary reason for excluding it is that it is human-readable text and therefore may be provided in different localized versions. As a result, its inclusion would needlessly multiply the number of possible hash values and thus the time and resources required to validate values of the 'ver' attribute.</p>
<p>The SHA-1 hashing algorithm is mandatory to implement. All implementations MUST support SHA-1.</p>
<p>An implementation MAY support other algorithms. Any such algorithm SHOULD be registered in the &ianahashes;.</p>
<p>In the future, the &COUNCIL; may, at its discretion, modify the mandatory-to-implement hashing algorithm if it determines that SHA-1 has become practically (not just theoretically) vulnerable to <link url='#security-preimage'>Preimage Attacks</link>. If the Council </p>
<p>Theoretically it may become possible to launch a "preimage" attack (see &rfc4270;) against the hashes used in the 'ver' attribute, i.e., if the hashing algorithm used is found to be vulnerable to such attacks. However, such attacks are not currently practical against the SHA-1 algorithm, and may not become practical in the foreseeable future. The XMPP Council shall monitor the state of cryptanalysis regarding the SHA-1 algorithm. If and when preimage attacks become practical against SHA-1, this specification shall be updated to change the mandatory-to-implement hashing algorithm from SHA-1 to a safer algorithm (e.g., SHA-256).</p>
<p>Adherence to the method defined in the <link url='#ver'>Generation of the ver Attribute</link> section of this document for both generation and checking of the 'ver' attribute helps to guard against poisoning of entity capabilities information by malicious or improperly implemented entities.</p>
<p>If the value of the 'ver' attribute is a hash as defined herein (i.e., if the 'ver' attribute is not generated according to the <link url='#legacy'>Legacy Format</link>), inclusion of the 'hash' attribute is REQUIRED. Knowing explicitly that the value of the 'ver' attribute is a hash enables the recipient to avoid spurious notification of invalid or poisoned hashes.</p>
<p>Use of entity capabilities might make it easier for an attacker to launch certain application-specific attacks, since the attacker could more easily determine the type of client being used as well as its capabilities. However, since most clients respond to Service Discovery and Software Version requests without performing access control checks, there is no new vulnerability. Entities that wish to restrict access to capabilities information SHOULD use &xep0016; to define appropriate communications blocking (e.g., an entity MAY choose to allow IQ requests only from "trusted" entities, such as those with whom it has a presence subscription of "both"); note, however, that such restrictions may be incompatible with the recommendation regarding <link url='#directed'>Directed Presence</link>.</p>
<p>A client SHOULD enable a human user to disable inclusion of the 'v' attribute, which specifies a version of the software. If the 'v' attribute is not included, the receiver MUST assume that the version is intended to be private, and MUST NOT automatically send Software Version requests to the sender.</p>
<p>The XMPP Registrar shall include "http://jabber.org/protocol/caps#optimize" in its registry of service discovery features (see &DISCOFEATURES;).</p>
<p>Before Version 1.4 of this specification, the 'ver' attribute was generated differently, the 'ext' attribute was used more extensively, and the 'hash' and 'v' attributes were absent. For historical purposes, Version 1.3 of this specification is archived at &lt;<link url='http://www.xmpp.org/extensions/attic/xep-0115-1.3.html'>http://www.xmpp.org/extensions/attic/xep-0115-1.3.html</link>&gt;. For backwards-compatibility with the legacy format, the 'node' attribute is REQUIRED and the 'ext' attribute MAY be included.</p>
<p>An application can determine if the legacy format is in use by checking for the presence of the 'hash' attribute, which is REQUIRED in the current format.</p>
<p>If a caps-processing application supports the legacy format, it SHOULD check the 'node', 'ver', and 'ext' combinations as specified in the archived version 1.3 of this specification, and MAY cache the results.</p>
<p>If a caps-processing application does not support the legacy format, it SHOULD ignore the 'ver' value entirely (since the value cannot be verified) and SHOULD NOT cache it, since the application cannot validate the identity and features by checking the hash.</p>
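<p>The format check and the resulting handling might be sketched as follows. The function name, the returned labels, the sample stanzas, and the assumption that 'ext' holds whitespace-separated tokens are all illustrative; processing of the legacy combinations themselves is left to the archived version 1.3 rules and is not shown.</p>

```python
import xml.etree.ElementTree as ET

CAPS_TAG = "{http://jabber.org/protocol/caps}c"


def classify_caps(presence_xml, supports_legacy=False):
    """Classify the caps annotation on a presence stanza.

    Returns a (label, data) pair where label is one of
    "none", "current", "legacy", or "ignored".
    """
    presence = ET.fromstring(presence_xml)
    caps = presence.find(CAPS_TAG)
    if caps is None:
        return ("none", None)
    if caps.get("hash") is not None:
        # Current format: 'ver' is a hash that can be verified and cached.
        return ("current", {k: caps.get(k) for k in ("hash", "node", "ver")})
    if supports_legacy:
        # Legacy format: 'ext' is assumed to be whitespace-separated names.
        ext = (caps.get("ext") or "").split()
        return ("legacy", {"node": caps.get("node"), "ver": caps.get("ver"), "ext": ext})
    # Legacy format without legacy support: ignore 'ver', cache nothing.
    return ("ignored", None)


CURRENT = (
    "<presence from='romeo@montague.lit/orchard'>"
    "<c xmlns='http://jabber.org/protocol/caps' hash='sha-1' "
    "node='http://code.google.com/p/exodus' ver='8RovUdtOmiAjzj+xI7SK5BCw3A8='/>"
    "</presence>"
)
LEGACY = (
    "<presence from='benvolio@capulet.lit/home'>"
    "<c xmlns='http://jabber.org/protocol/caps' "
    "node='http://example.org/client' ver='0.9' ext='a b'/>"
    "</presence>"
)

print(classify_caps(CURRENT))
print(classify_caps(LEGACY))
print(classify_caps(LEGACY, supports_legacy=True))
```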
<p>Thanks to Rachel Blackman, Dave Cridland, Richard Dobson, Olivier Goffart, Sergei Golovan, Justin Karneges, Ian Paterson, Kevin Smith, Tomasz Sterna, Michal Vaner, and Matt Yacobucci for comments and suggestions.</p>