Stream Management

This specification defines an XMPP protocol extension for active management of an XML stream between two XMPP entities, including features for stanza acknowledgements, stream resumption, and throttling notifications.

&LEGALNOTICE;

0198 Draft Standards Track Standards XMPP Core None None sm http://xmpp.org/schemas/sm.xsd

&infiniti; &hildjj; &stpeter; &fabio;

1.1 2010-03-05 psa/jk

Corrected value of 'h' so that zero means no stanzas have yet been handled; clarified distinction between a cleanly closed stream and an unfinished stream.

1.0 2009-06-17 psa

Per a vote of the XMPP Council, advanced specification from Experimental to Draft.

0.10 2009-06-11 psa

Editorial review.

0.9 2009-06-03 psa
  • Specified that the value of the 'h' attribute starts at zero, not one.
  • Specified that the 'h' attribute is an unsignedInt and that it loops back to zero when reaching 2^32.
  • Added security consideration regarding session resumption and removed security consideration regarding proxies.
  • Clarified the meaning of handled as defining which entity has responsibility for a stanza.
  • Corrected schema and examples.
0.8 2009-04-09 ff/jk/jjh/psa
  • Added <t/> element for throttling notifications, including 'stanzas' attribute for dynamic adjustment of the stanzas window.
  • Simplified protocol by making the sequence number increment per stanza and removing the u attribute (which is now unnecessary).
  • Removed 'h' attribute from <r/> element.
  • Added 'max' and 'stanzas' attributes to both <enable/> and <enabled/> elements, and removed those attributes from the <sm/> element.
  • Incremented the protocol version from 1 to 2.
  • Added in-depth usage scenarios.
0.7 2009-03-30 jjh/psa

Removed pings (use XEP-0199, whitespace pings, or TCP keepalives instead); removed section on throttling, since it is unworkable.

0.6 2009-03-19 psa
  • Incremented protocol version from 0 to 1.
  • Changed attribute names from c (?) and b (?) to u (unacknowledged) and h (handled).
  • Added stanzas attribute to specify maximum number of stanzas between acking requests.
  • More clearly defined error handling using <failed/> element plus stanza error conditions.
  • Defined error handling for <ping/> element by allowing stanza error conditions in <pong/> element.
  • More clearly specified maximum reconnect time.
  • Added detailed scenarios for basic acking and for outbound and inbound throttling.
0.5 2008-09-29 psa

Removed recommendation to use namespace prefixes; modified namespace to incorporate namespace versioning.

0.4 2008-09-08 jjh/jk/psa

Added support for session resumption; re-organized the document; changed name to stream management; changed provisional namespace.

0.3 2007-10-03 jk

Updates per devcon discussion.

0.2 2007-04-05 jk

Require c attribute on <r/> element. Describe minimal implementation. Switch to standard temporary namespace.

0.1 2006-11-21 psa

Initial published version.

0.0.3 2006-11-08 jk

New version, using sequence numbers.

0.0.2 2004-12-11 jk

Further clarification, allow acking many stanzas at once.

0.0.1 2004-08-09 jk

First draft.

&xmppcore; defines the fundamental streaming XML technology used by XMPP (i.e., stream establishment and termination including authentication and encryption). However, the core XMPP specification does not provide tools for actively managing a "live" XML stream.

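The basic concept behind stream management is that the initiating entity (either a client or a server) and the receiving entity (a server) can exchange "commands" for active management of the stream. The following stream management features are of particular interest because they are expected to improve network reliability and the end-user experience:

  • Stanza Acknowledgements: the ability to verify that a stanza or series of stanzas has been handled by one's peer.
  • Stream Resumption: the ability to quickly resume a stream that has been terminated unexpectedly.
  • Throttling Notifications: the ability to inform a peer that one is throttling the stream.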

Stream management implements these features using short XML elements at the root stream level. These elements are not "stanzas" in the XMPP sense (i.e., not &IQ;, &MESSAGE;, or &PRESENCE; stanzas as defined in &rfc3920;) and are not counted or acked in stream management, since they exist for the purpose of managing stanzas themselves.

Stream management is used at the level of an XML stream. To check TCP connectivity underneath a given stream, it is RECOMMENDED to use &xep0199;, whitespace keepalives (see Section 5.7.3 of &rfc3920bis;), or TCP keepalives. By contrast with stream management, &xep0079; and &xep0184; define acks that are sent end-to-end over multiple streams; these facilities are useful in special scenarios but are unnecessary for checking a direct stream between two XMPP entities.

(Examples prepended by "C:" are sent by a client and examples prepended by "S:" are sent by a server. Stream management can be used server-to-server but most of the examples in this specification show its use between a client and a server.)

After negotiating use of TLS and authenticating via SASL, the receiving entity returns a new stream header to the initiating entity along with stream features, where the features include an <sm/> element qualified by the 'urn:xmpp:sm:2' namespace &VNOTE;.

The stream management feature MUST NOT be offered unless the initiating entity has been authenticated (e.g., by means of SASL, &xep0078;, or &xep0220;).


To enable use of stream management, the initiating entity sends an <enable/> command to the receiving entity.


If the initiating entity wants to be allowed to resume the stream, it includes a boolean 'resume' attribute, which defaults to false &BOOLEANNOTE;. For information about resuming a previous session, see the Resumption section of this document.

The <enable/> element MAY include a 'max' attribute to specify the initiating entity's preferred maximum resumption time in seconds.

The <enable/> element MAY include a 'stanzas' attribute to specify the initiating entity's preferred number of stanzas between acks.
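As a rough illustration of the attributes described above, the following Python sketch serializes an <enable/> element. The helper name `build_enable` and its parameter names are hypothetical and are not defined by this specification; only the element name, the namespace, and the 'resume', 'max', and 'stanzas' attributes come from the text.

```python
import xml.etree.ElementTree as ET

SM_NS = "urn:xmpp:sm:2"  # namespace defined by this specification


def build_enable(resume=False, max_seconds=None, stanzas=None):
    """Serialize an <enable/> command (hypothetical helper, for illustration only)."""
    elem = ET.Element("enable", {"xmlns": SM_NS})
    if resume:
        # 'resume' defaults to false, so it is only written when true.
        elem.set("resume", "true")
    if max_seconds is not None:
        # Preferred maximum resumption time, in seconds.
        elem.set("max", str(max_seconds))
    if stanzas is not None:
        # Preferred number of stanzas between acks.
        elem.set("stanzas", str(stanzas))
    return ET.tostring(elem, encoding="unicode")
```

Calling `build_enable()` with no arguments yields the bare command; passing `resume=True` adds the attribute needed to ask for a resumable session.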

Upon receiving the enable request, the receiving entity MUST reply with an <enabled/> element or a <failed/> element qualified by the 'urn:xmpp:sm:2' namespace. The <failed/> element indicates that there was a problem establishing the stream management "session". The <enabled/> element indicates successful establishment of the stream management session.


The parties can then use the stream management features defined below.

If the receiving entity allows session resumption, it MUST include a 'resume' attribute set to a value of "true" or "1".


The <enabled/> element MAY include a 'max' attribute to specify the receiving entity's preferred maximum resumption time.

The <enabled/> element MAY include a 'stanzas' attribute to specify the receiving entity's preferred number of stanzas between acks.

For client-to-server connections, the client SHOULD NOT attempt to enable stream management until after it has completed Resource Binding unless it is resuming a previous session (see Resumption). The server MAY enforce this order and return a <failed/> element in response (see Error Handling).


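After enabling stream management, the initiating or receiving entity can send ack elements at any time over the stream. An ack element is one of the following:

  • The <a/> element is used to answer a request for acknowledgement or to send an unrequested ack.
  • The <r/> element is used to request acknowledgement of received stanzas.

The following attribute is defined:

  • The 'h' attribute identifies the last handled stanza (i.e., the last stanza that the receiver will acknowledge as having been handled).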

An <a/> element MUST possess an 'h' attribute.

An <r/> element SHOULD NOT possess any attributes.

Definition: Acknowledging a previously-received ack element indicates that the stanza(s) sent since then have been "handled" by the receiver. By "handled" we mean that the receiver has accepted responsibility for a stanza or stanzas (e.g., to process the stanza(s) directly, deliver the stanza(s) to a local entity such as another connected client on the same server, or route the stanza(s) to a remote entity at a different server); until a stanza has been affirmed as handled by the receiver, that stanza is the responsibility of the sender (e.g., to resend it or generate an error if it is never affirmed as handled by the receiver).

Note: The value of 'h' starts at zero before any stanzas are handled, is incremented to one for the first stanza handled, and is incremented again with each subsequent stanza handled. In the unlikely case that the number of stanzas handled during a stream management session exceeds the largest value that can be represented by the unsignedInt datatype as specified in &w3xmlschema2; (i.e., 2^32 - 1), the value of 'h' shall be reset from 2^32 - 1 back to zero (rather than being incremented to 2^32).
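The counting rule in the note above can be sketched in a few lines of Python. The class name `HandledCounter` is an assumption made for illustration; the specification only defines the behaviour of the 'h' value itself.

```python
H_MODULUS = 2 ** 32  # unsignedInt holds the values 0 .. 2**32 - 1


class HandledCounter:
    """Tracks the 'h' value for stanzas handled on one stream (hypothetical name)."""

    def __init__(self):
        # Zero means that no stanzas have been handled yet.
        self.h = 0

    def stanza_handled(self):
        # Increment once per handled stanza, wrapping from 2**32 - 1 back to zero.
        self.h = (self.h + 1) % H_MODULUS
        return self.h
```

Only stanzas are counted; stream management elements such as <r/> and <a/> never pass through `stanza_handled()`.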

The following example shows a message sent by the client, a request for acknowledgement, and an ack of the stanza.

C: <message>
     <body>I'll send a friar with speed, to Mantua, with my letters to thy lord.</body>
   </message>

C: <r xmlns='urn:xmpp:sm:2'/>

S: <a xmlns='urn:xmpp:sm:2' h='1'/>

When an <r/> element ("request") is received, the recipient MUST acknowledge it by sending an <a/> element to the sender containing a value of 'h' that is equal to the number of stanzas handled by the recipient of the <r/> element. The response SHOULD be sent as soon as possible after receiving the <r/> element, and MUST NOT be withheld for any condition other than a timeout. For example, a client with a slow connection might want to collect many stanzas over a period of time before acking, and a server might want to throttle incoming stanzas. The sender does not have to wait for an ack to continue sending stanzas. Because acks indicate stanza acceptance, a server that is throttling stanzas MUST delay the response until the client is no longer being penalized (but SHOULD notify the client that it is throttling incoming stanzas, as described under Throttling).

When a party returns an ack in response to an <r/> element or receives such an ack, it SHOULD keep a record of the 'h' value returned as the sequence number of the last handled stanza for the current stream (and discard the previous 'h' value).
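A minimal sketch of the two sides of this exchange follows, written in Python under the assumption that each stream management element arrives as a standalone XML string. The function names `answer_request` and `read_ack` are hypothetical.

```python
import xml.etree.ElementTree as ET

SM_NS = "urn:xmpp:sm:2"


def answer_request(element_xml, handled_count):
    """Return the <a/> reply for an incoming <r/>, or None for any other element."""
    elem = ET.fromstring(element_xml)
    if elem.tag != "{%s}r" % SM_NS:
        return None
    # 'h' is the number of stanzas handled so far by the recipient of the <r/>.
    return "<a xmlns='%s' h='%d'/>" % (SM_NS, handled_count)


def read_ack(element_xml):
    """Return the integer 'h' carried by an incoming <a/> element."""
    elem = ET.fromstring(element_xml)
    if elem.tag != "{%s}a" % SM_NS or "h" not in elem.attrib:
        raise ValueError("not an <a/> element with an 'h' attribute")
    return int(elem.attrib["h"])
```

The value returned by `read_ack` is what a party would record as the sequence number of the last handled stanza, replacing any previously stored value.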

If a stream ends and it is not resumed within the time specified in the original <enabled/> element, the sequence number and any associated state MAY be discarded by both parties. Before the session state is discarded, implementations SHOULD take alternative action regarding any unhandled stanzas (i.e., stanzas sent after the most recent 'h' value):

  • A server SHOULD treat unacknowledged stanzas in the same way that it would treat a stanza sent to an unavailable resource, by either returning an error to the sender or committing the stanza to offline storage.
  • A user-oriented client SHOULD try to silently resend the stanzas upon reconnection or inform the user of the failure via appropriate user-interface elements.

It can happen that an XML stream is terminated unexpectedly (e.g., because of network outages). In this case, it is desirable to quickly resume the former stream rather than complete the tedious process of stream establishment, roster retrieval, and presence broadcast.

To request that the stream be resumable, the initiating entity MUST add a 'resume' attribute with a value of "true" or "1" &BOOLEANNOTE; to the <enable/> element when enabling stream management.


If the receiving entity will allow the stream to be resumed, it MUST include a 'resume' attribute set to "true" or "1" on the <enabled/> element and MUST include an 'id' attribute that specifies an identifier for the stream.


Definition: The 'id' attribute defines a unique identifier for purposes of stream management (an "SM-ID"). The SM-ID MUST be generated by the receiving entity (server). The initiating entity MUST consider the SM-ID to be opaque and therefore MUST NOT assign any semantic meaning to the SM-ID. The receiving entity MAY encode any information it deems useful into the SM-ID, such as the full JID &LOCALFULL; of a connected client (e.g., the full JID plus a nonce value). Any characters allowed in an XML attribute are allowed. The SM-ID MUST NOT be reused for simultaneous or subsequent sessions (but the server need not ensure that SM-IDs are unique for all time, only for as long as the server is continuously running). The SM-ID SHOULD NOT be longer than 4000 bytes.
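One possible way for a server to mint SM-IDs that satisfy the constraints above is sketched below in Python. This is only an assumption about how an implementation might do it: the function name `new_sm_id` and the in-memory set of issued identifiers are hypothetical, and a server is free to encode other information in the SM-ID.

```python
import secrets

MAX_SM_ID_BYTES = 4000  # recommended upper bound from the definition above


def new_sm_id(issued):
    """Return a fresh opaque SM-ID that is not already in the set `issued`."""
    while True:
        # 32 random bytes rendered as URL-safe ASCII text, far below the size bound.
        sm_id = secrets.token_urlsafe(32)
        if sm_id not in issued:
            # Remember it so it is never handed out again while the server runs.
            issued.add(sm_id)
            return sm_id
```

Because the identifier is random, the initiating entity has nothing to interpret, which fits the requirement that it treat the SM-ID as opaque.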

If the stream is terminated unexpectedly, the initiating entity would then open a TCP connection to the receiving entity. The order of events is as follows:

  1. Initiating entity sends initial stream header.
  2. Receiving entity sends response stream header.
  3. Receiving entity sends stream features.
  4. Initiating entity sends STARTTLS request.
  5. Receiving entity informs initiating entity to proceed with the TLS negotiation.
  6. The parties complete a TLS handshake. (Note: When performing session resumption and also utilizing TLS, it is RECOMMENDED to take advantage of TLS session resumption to further optimize the resumption of the XML stream.)
  7. Initiating entity sends new initial stream header.
  8. Receiving entity sends response stream header.
  9. Receiving entity sends stream features, requiring SASL negotiation and offering appropriate SASL mechanisms. (Note: If the server considers the information provided during TLS session resumption to be sufficient authentication, it MAY offer the SASL EXTERNAL mechanism; for details, refer to &sasltls;.)
  10. The parties complete SASL negotiation.
  11. Initiating entity sends new initial stream header.
  12. Receiving entity sends response stream header.
  13. Receiving entity sends stream features, offering the SM feature.
  14. Initiating entity requests resumption of the former stream.

To request resumption of the former stream, the initiating entity sends a <resume/> element qualified by the 'urn:xmpp:sm:2' namespace. The <resume/> element MUST include a 'previd' attribute whose value is the SM-ID of the former stream and MAY include an 'h' attribute that identifies the sequence number of the last handled stanza sent over the former stream from the receiving entity to the initiating entity (if stream management was being used in both directions); if there is no such sequence number for the former stream, the 'h' attribute MUST NOT be included.


If the receiving entity can resume the former stream, it MUST return a <resumed/> element that includes a 'previd' attribute set to the SM-ID of the former stream. The <resumed/> element MAY also include an 'h' attribute set to the sequence number of the last handled stanza sent over the former stream from the initiating entity to the receiving entity; if there is no such sequence number for the former stream, the 'h' attribute MUST NOT be included.


If the receiving entity does not support session resumption, it MUST return a <failed/> element, which SHOULD include an error condition of &feature;. If the receiving entity does not recognize the 'previd' as an earlier session (e.g., because the former session has timed out), it MUST return a <failed/> element, which SHOULD include an error condition of &notfound;. In both of these failure cases, the receiving entity SHOULD allow the initiating entity to bind a resource at this point rather than forcing the initiating entity to restart the stream negotiation process and re-authenticate.

If the former stream is resumed and the receiving entity still has the stream for the previously-identified session open at this time, the old stream SHOULD be terminated.

When a session is resumed, the parties proceed as follows:

  • Both parties SHOULD retransmit any stanzas that were not handled during the previous session, based on the sequence number reported by the peer.
  • A reconnecting client SHOULD NOT request the roster, because any roster changes that occurred while the client was disconnected will be sent to the client after the stream management session resumes.
  • The client SHOULD NOT resend presence stanzas in an attempt to restore its former presence state, since this state will have been retained by the server.
  • Both parties SHOULD NOT try to re-establish state information (e.g., &xep0030; information).
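The retransmission step in the first item of the list above can be sketched as follows in Python. The function name `stanzas_to_retransmit` is hypothetical, and the sketch assumes that the sender kept every stanza of the session in sending order and that 'h' has not wrapped around.

```python
def stanzas_to_retransmit(sent, peer_h):
    """Return the stanzas the peer has not handled.

    `sent` lists the stanzas of the previous session in sending order and
    `peer_h` is the 'h' value reported by the peer on resumption. The first
    `peer_h` stanzas were handled by the peer, so only the rest are resent.
    """
    if peer_h < 0 or peer_h > len(sent):
        raise ValueError("peer reported an 'h' outside the range of stanzas sent")
    return sent[peer_h:]
```

For instance, if three stanzas were sent and the peer reports h='1', the second and third stanzas are the ones to send again.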

When a server acts as a receiving entity for an XML stream, it might throttle the stream (i.e., impose rate limiting) if the initiating entity (a client or a server) attempts to send too much traffic over the stream (e.g., a very large number of stanzas, or a lesser number of stanzas that are relatively large). The formulas for determining when rate limiting shall be imposed are implementation-specific; however, nearly all XMPP server implementations include support for such throttling (often called "karma"). Therefore it would be helpful if the receiving entity could inform the initiating entity that the stream has been voluntarily throttled by the receiving entity. (Some forms of "throttling" can occur naturally at the TCP layer without being voluntarily imposed by the receiving entity; the receiving entity cannot inform the initiating entity about such throttling.) The receiving entity signals voluntary throttling by sending a <t/> element to the initiating entity.


Note: Sending a throttling notification to the stream peer does not necessarily indicate that the entity is throwing away all stanzas, only that the entity has voluntarily slowed its processing of incoming stanzas.

The throttling notification MAY include a 'stanzas' attribute so that the receiving entity can inform the initiating entity of changes to the maximum number of stanzas between acks. This enables the receiving entity to dynamically adjust stanza acking in response to network conditions or stream usage.


If the number of unacknowledged stanzas is greater than or equal to the value of the 'stanzas' attribute, a throttled peer MUST NOT send any further stanzas.

If an entity sets the value of 'stanzas' to zero, it has stopped reading from the stream entirely.
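The sending rule for a throttled peer reduces to a single comparison, sketched here in Python. The function name `may_send` is hypothetical, and treating a missing 'stanzas' value as "no limit" is an assumption of this sketch rather than something the specification states.

```python
def may_send(unacked_count, stanzas_limit):
    """Return True if a throttled peer may send another stanza."""
    if stanzas_limit is None:
        # Assumption: no 'stanzas' value was announced, so no limit is applied.
        return True
    # Sending stops once the unacknowledged count reaches the limit;
    # a limit of zero therefore blocks all sending.
    return unacked_count < stanzas_limit
```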

For as long as the initiating entity is throttled, the receiving entity SHOULD periodically send a throttling notification to the initiating entity (e.g., every 30 seconds) to obviate the need for pings generated by the initiating entity (which the receiving entity will ignore because the initiating entity is throttled).

If an error occurs with regard to an <enable/> or <resume/> element, the receiving entity MUST return a <failed/> element. This element SHOULD contain an error condition, which MUST be one of the stanza error conditions defined in &rfc3920bis;.
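For illustration, a <failed/> element carrying a stanza error condition could be produced as in the following Python sketch. The helper name `build_failed` is hypothetical; the stanza error namespace 'urn:ietf:params:xml:ns:xmpp-stanzas' and the condition name used in the usage note come from the XMPP core specification rather than from this document.

```python
import xml.etree.ElementTree as ET

SM_NS = "urn:xmpp:sm:2"
STANZA_ERROR_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"


def build_failed(condition=None):
    """Serialize a <failed/> element, optionally wrapping one error condition."""
    failed = ET.Element("failed", {"xmlns": SM_NS})
    if condition is not None:
        # The condition is a child element qualified by the stanza error namespace.
        ET.SubElement(failed, condition, {"xmlns": STANZA_ERROR_NS})
    return ET.tostring(failed, encoding="unicode")
```

A server that rejects an <enable/> sent before resource binding might, for example, call `build_failed("unexpected-request")`.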


Stream management errors SHOULD be considered recoverable; however, misuse of stream management MAY result in termination of the stream.

A cleanly closed stream differs from an unfinished stream. If a client wishes to cleanly close its stream and end its session, it MUST send a </stream:stream> so that the server can send unavailable presence on the client's behalf.

If the stream is not cleanly closed then the server SHOULD consider the stream to be unfinished (even if the client closes its TCP connection to the server) and SHOULD maintain the session on behalf of the client for a limited amount of time. The client can send whatever presence it wishes before leaving the stream in an unfinished state.

The following scenarios illustrate several different uses of stream management. The examples are that of a client and a server, but stream management can also be used for server-to-server streams.

The Stream Management protocol can be used to improve reliability using acks without the ability to resume a session. A basic implementation would do the following:

  • As an initiating entity, send <enable/> with no attributes, and ignore the attributes on the <enabled/> response.
  • As a receiving entity, ignore the attributes on the <enable/> element received, and respond via <enabled/> with no attributes.
  • When receiving an <r/> element, immediately respond via an <a/> element where the value of 'h' returned is the sequence number of the last handled stanza.
  • Keep an integer X for this stream session, initially set to zero. When about to send a stanza, first put the stanza (paired with the current value of X) in an "unacknowledged" queue. Then send the stanza over the wire with <r/> to request acknowledgement of that outbound stanza, and increment X by 1. When receiving an <a/> element, which carries the 'h' attribute, all stanzas whose paired value (X at the time of queueing) is less than the value of 'h' can be removed from the unacknowledged queue. (The comparison is strict because X starts at zero: the stanza paired with X is the (X+1)th stanza sent, so it is covered only once 'h' has reached X+1.)

This is enough of an implementation to minimally satisfy the peer, and allows basic tracking of each outbound stanza. If the stream connection is broken, the application has a queue of unacknowledged stanzas that it can choose to handle appropriately (e.g., warn a human user or silently send after reconnecting).
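The outbound bookkeeping described in the list above might look like the following Python sketch. The class name `OutboundTracker` and the `write` callback are hypothetical, and wraparound of 'h' is ignored for brevity. Note the strict comparison in `on_ack`: a stanza queued while X was n is the (n+1)th stanza sent, so an ack covers it only when n is less than 'h'.

```python
from collections import deque


class OutboundTracker:
    """Minimal unacknowledged-stanza queue for one stream (hypothetical name)."""

    def __init__(self):
        self.x = 0              # number of stanzas sent so far
        self.unacked = deque()  # (X at time of queueing, stanza) pairs

    def send(self, stanza, write):
        # Queue first, so the stanza is not lost if the write fails midway.
        self.unacked.append((self.x, stanza))
        write(stanza)
        write("<r xmlns='urn:xmpp:sm:2'/>")  # request an ack for this stanza
        self.x += 1

    def on_ack(self, h):
        # Drop every stanza that the peer's 'h' value covers.
        while self.unacked and self.unacked[0][0] < h:
            self.unacked.popleft()
```

Whatever remains in `unacked` when the connection breaks is the set of stanzas the application still has to deal with, for example by warning the user or resending after reconnecting.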

The following examples illustrate basic acking (here the client automatically acks each stanza it has received from the server, without first being prompted via an <r/> element).

First, after authentication and resource binding, the client enables stream management.


The server then enables stream management.


The client then retrieves its roster and immediately sends an <r/> element to request acknowledgement.


The server immediately sends an <a/> element to acknowledge handling of the stanza and then returns the roster.


The client then acknowledges receipt of the server's stanza, sends initial presence, and immediately sends an <r/> element to request acknowledgement, incrementing by one its internal representation of how many stanzas have been handled by the server.


The server immediately sends an <a/> element to acknowledge handling of the stanza and then broadcasts the user's presence (including to the client itself as shown below).


The client then acks the server's second stanza and sends an outbound message followed by an <r/> element.

C: <a xmlns='urn:xmpp:sm:2' h='2'/>
   <message>
     <body>ciao!</body>
   </message>

C: <r xmlns='urn:xmpp:sm:2'/>

The server immediately sends an <a/> element to acknowledge handling of the stanza and then routes the stanza to the remote contact (not shown here because the server does not send a stanza to the client).


And so on.

The basic acking scenario is wasteful because the client requested an ack for each stanza. A more efficient approach is to periodically request acks (e.g., every 5 stanzas) in accordance with the 'stanzas' attribute value provided by the receiving entity on the <enabled/> element. This is shown schematically in the following pseudo-XML.

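S: <enabled stanzas='5'/>
C: <message/>
C: <message/>
C: <message/>
C: <message/>
C: <message/>
C: <r/>
S: <a h='5'/>
C: <message/>
C: <message/>
C: <message/>
C: <message/>
C: <message/>
C: <r/>
S: <a h='10'/>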
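The pacing of ack requests shown schematically above can be expressed as a short Python sketch. The function name `interleave_ack_requests` is hypothetical; the interval would normally be taken from the 'stanzas' attribute of the <enabled/> element.

```python
def interleave_ack_requests(stanzas, stanzas_between_acks):
    """Return the elements to write: each stanza, plus an <r/> after every Nth one."""
    if stanzas_between_acks < 1:
        raise ValueError("stanzas_between_acks must be at least 1")
    out = []
    for count, stanza in enumerate(stanzas, start=1):
        out.append(stanza)
        if count % stanzas_between_acks == 0:
            # Every Nth stanza is followed by a request for acknowledgement.
            out.append("<r xmlns='urn:xmpp:sm:2'/>")
    return out
```

With an interval of 5, ten stanzas produce two ack requests instead of the ten that the basic acking scenario would send.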

As mentioned, many servers will impose rate limiting on clients that send large amounts of traffic. In the following scenario, we assume that the first three messages sent by the client are rather large, so the server voluntarily throttles the client. The server then sends throttling notifications every 30 seconds, dynamically adjusting the maximum number of stanzas between acks as a rough indicator of how serious the throttling is.

S: <enabled stanzas='10'/>
C: <message/>
C: <message/>
C: <message/>
[throttling kicks in]
S: <t/>
[client requests an ack for the first three messages, but does not send any more messages until throttling ends]
C: <r/>
[client still throttled, server ignores <r/> for now]
[30 seconds go by]
S: <t/>
[30 seconds go by]
S: <t/>
[backlog starts to ease, server adjusts 'stanzas' value]
S: <t stanzas='5'/>
[client sends another message just because it can]
C: <message/>
[server has handled the first 3 messages so it finally replies to <r/>]
S: <a h='3'/>
C: <message/>
C: <message/>
C: <message/>
C: <message/>
[client has sent 5 messages so requests an ack]
C: <r/>
[throttling is over, server replies to 2nd <r/> and sets 'stanzas' back to 10]
S: <a h='8'/>
S: <t stanzas='10'/>

As noted, a receiving entity MUST NOT allow an initiating entity to resume a stream management session until after the initiating entity has authenticated (for some value of "authentication"); this helps to prevent session hijacking.

This XEP requires no interaction with &IANA;.

This specification defines the following XML namespace:

  • urn:xmpp:sm:2

The &REGISTRAR; includes the foregoing namespace in its registry at &NAMESPACES;, as described in Section 4 of &xep0053;.

&NSVER;

The XMPP Registrar includes 'urn:xmpp:sm:2' in its registry of stream features at &STREAMFEATURES;.

The protocol documented by this schema is defined in XEP-0198: http://www.xmpp.org/extensions/xep-0198.html

Thanks to Dave Cridland, Jack Erwin, Philipp Hancke, Curtis King, Tobias Markmann, Alexey Melnikov, Pedro Melo, Robin Redeker, Mickaël Rémond, and Matthew Wild for their feedback.