diff --git a/xep-0166.xml b/xep-0166.xml index 73534e54..3e241fc0 100644 --- a/xep-0166.xml +++ b/xep-0166.xml @@ -27,6 +27,20 @@ &robmcqueen; &seanegan; &hildjj; + + 0.35 + 2009-03-06 + psa + + + + 0.34 2009-02-17 @@ -322,10 +336,15 @@ -

The purpose of Jingle is to enable one-to-one, peer-to-peer media sessions between XMPP entities, where the negotiation occurs over the XMPP "channel" and the media is exchanged outside the XMPP channel using technologies such as the Real-time Transport Protocol (RTP; &rfc3550;), the User Datagram Protocol (UDP; &rfc0768;), and &ice;.

-

One target application for Jingle is simple voice and video chat (see &xep0167;). We stress the word "simple". The purpose of the core Jingle technology is not to build a full-fledged telephony application that supports call waiting, call forwarding, call transfer, hold music, IVR systems, find-me-follow-me functionality, conference calls, and the like. These features are of interest to some user populations, but adding support for them to the core Jingle layer would introduce unnecessary complexity into a technology that is designed for basic multimedia interaction.

-

In addition, the purpose of Jingle is not to supplant or replace existing Internet technologies based on the Session Initiation Protocol (SIP; &rfc3261;). Because dual-stack XMPP+SIP clients are difficult to build, Jingle was designed as a pure XMPP signalling protocol. However, Jingle is at the same time designed to interwork with SIP so that the millions of deployed XMPP clients can be added onto existing Voice over Internet Protocol (VoIP) networks, rather than limiting XMPP users to a separate and distinct network.

-

Jingle is designed in a modular way so that developers can easily add support for multimedia session types other than voice and video chat, such as application sharing, file transfer, collaborative editing, whiteboarding, and torrent broadcasting. The transport methods are also modular, so that Jingle implementations can use any appropriate media transport (including proprietary methods not standardized through the XMPP Standards Foundation).

+

The purpose of Jingle is to enable one-to-one, peer-to-peer media sessions between XMPP entities, where the negotiation occurs over the XMPP signalling channel and the media is exchanged over a data channel that is usually a dedicated non-XMPP transport. Jingle is designed in a modular way:

+ +

It is expected that most application types, transport methods, and security preconditions will be documented in specifications produced by the &XSF; or the &IETF;; however, developers can also define proprietary methods for custom functionality.

+

Although Jingle provides a general framework for session management, the original target application for Jingle was simple voice and video chat. We stress the word "simple". The purpose of Jingle was not to build a full-fledged telephony application that supports call waiting, call forwarding, call transfer, hold music, IVR systems, find-me-follow-me functionality, conference calls, and the like. These features are of interest to some user populations, but adding support for them to the core Jingle layer would introduce unnecessary complexity into a technology that is designed for simple but generalized session negotiation.

+

Furthermore, Jingle is not intended to supplant or replace existing Internet technologies based on the Session Initiation Protocol (SIP; &rfc3261;). Because dual-stack XMPP+SIP clients are difficult to build, Jingle was designed as a pure XMPP signalling protocol. However, Jingle is at the same time designed to interwork with SIP so that the millions of deployed XMPP clients can be added onto existing Voice over Internet Protocol (VoIP) networks, rather than limiting XMPP users to a separate and distinct network.

This section provides a friendly introduction to Jingle.

@@ -349,49 +368,50 @@ Romeo Juliet |---------------------------->| | | ]]> -

To illustrate the basic flow, we show a truncated example with a "stub" application format and transport method (skipping non-essential steps to enforce the most important concepts).

+

To illustrate the basic flow, we show a truncated example with a "stub" application format and transport method (skipping non-essential steps to reinforce the most essential concepts and ignoring security preconditions for now).

- + action='session-initiate' initiator='romeo@montague.lit/orchard' sid='a73sjjvkla37jfea'> - + ]]> -

After the responder acknowledges receipt of the session-initiate and the parties negotiate some parameters (not shown here), the responder would eventually send a session-accept to the initiator.

+

In this example, the initiator (romeo@montague.lit/orchard) sends a session initiation offer to the responder (juliet@capulet.lit/balcony), where the session is defined as the exchange of "stub" media over a "stub" transport.

+

After the responding client acknowledges receipt of the session-initiate message, it prompts the responding user (if any) to choose whether she wants to proceed with the session; it does not need to prompt the user if, for example, she has configured her client to automatically accept session requests from this particular initiator. If she wants to proceed, she selects the appropriate interface element and her client sends a session-accept message to the initiator.

- - + ]]> -

The initiator acknowledges receipt of the session-accept (not shown here) and the parties can exchange "stub" media data over the "stub" transport.

+

The initiating client acknowledges receipt of the session-accept message (not shown here) and the parties can exchange "stub" media data over the "stub" transport.

Eventually, one of the parties (here the responder) will terminate the session.

- @@ -401,14 +421,14 @@ Romeo Juliet ]]> -

The recipient acknowledges receipt of the session-terminate (not shown here) and the session is ended.

+

The initiating client acknowledges receipt of the session-terminate message (not shown here) and the session is ended.

We now "fill in the blanks" for the &DESCRIPTION; and &TRANSPORT; elements with a more complex example: a voice chat session, where the application type is a Jingle RTP session (with several different codec possibilities) and the transport method is &xep0176;.

- + action='session-initiate' initiator='romeo@montague.lit/orchard' sid='a73sjjvkla37jfea'> @@ -451,20 +471,20 @@ Romeo Juliet ]]> -

Upon receiving the session-initiate stanza, the responder determines whether it can proceed with the negotiation. If there is no error, the responder acknowledges the session initiation request.

+

Upon receiving the session-initiate message, the responder determines whether it can proceed with the negotiation. If there is no error, the responder acknowledges the session initiation request.

]]> -

After successful transport negotiation (not shown here), the responder accepts the session by sending a session-accept action to the initiator (including the negotiated transport details and the subset of offered codecs that the responder supports).

+

When the responding user affirms that she would like to proceed with the session, the responding client sends a session-accept message to the initiator (including in this example the subset of offered codecs that the responding client supports and one or more transport candidates generated by the responder).

- ]]> -

And the initiator acknowledges session acceptance:

+

And the initiating client acknowledges session acceptance:

]]> -

The initiator and responder would then exchange media using any of the acceptable codecs.

+

Once the parties finish the transport negotiation, they exchange media using any of the acceptable codecs.

Eventually, one of the parties (here the responder) will terminate the session.

- @@ -518,7 +538,7 @@ Romeo Juliet

The other party then acknowledges termination of the session:

]]> @@ -530,15 +550,16 @@ Romeo Juliet
  • Make it possible to manage a wide variety of peer-to-peer sessions (including but not limited to voice and video) within XMPP.
  • When a peer-to-peer connection cannot be negotiated, make it possible to fall back to relayed communications.
  • Clearly separate the signalling channel (XMPP) from the data channel.
  • -
  • Clearly separate the application formats (e.g., audio) from the transport methods (e.g., RTP).
  • +
  • Clearly separate the application format (e.g., RTP audio) from the transport method (e.g., UDP).
  • Make it possible to add, modify, and remove both application types and transport methods in an existing session.
  • Make it relatively easy to implement support for the protocol in standard Jabber/XMPP clients.
  • Where communication with non-XMPP entities is needed, push as much complexity as possible onto server-side gateways between the XMPP network and the non-XMPP network.
  • This document defines the signalling protocol only. Additional documents specify the following:

      -
    • Various application formats (audio, video, etc.) and, where possible, mapping of those types to the Session Description Protocol (SDP; see &rfc4566;); examples include Jingle RTP Sessions and &xep0234;.

    • -
    • Various transport methods; examples include Jingle ICE-UDP Transport and &xep0177;.

    • +
    • Various application formats (audio, video, etc.) and, where possible, mapping of those types to the Session Description Protocol (SDP; see &rfc4566;); examples include Jingle RTP Sessions and Jingle File Transfer.

    • +
    • Various transport methods; examples include Jingle ICE-UDP Transport Method, Jingle Raw UDP Transport Method, Jingle In-Band Bytestreams Transport Method, and Jingle SOCKS5 Bytestreams Transport Method.

    • +
    • Various methods of securing the transport before using it to send application data; the only method defined so far is Transport Layer Security as described in &xmppe2e;.

    • Procedures for mapping the Jingle signalling protocol to existing signalling standards such as the IETF's Session Initiation Protocol (SIP) and the ITU's H.323 protocol (see &h323;); see for example &xmppsipmedia;.

    @@ -567,15 +588,15 @@ Romeo Juliet Transport Method - The method for establishing data stream(s) between entities. Possible transports might include ICE-UDP, ICE-TCP, Raw UDP, inband data, etc. This is the 'how' of the session. In Jingle XML syntax this is the namespace of the &TRANSPORT; element. The transport method defines how to transfer bits from one host to another. Each transport method MUST specify whether it is "datagram" or "streaming". + The method for establishing data stream(s) between entities. Possible transports might include ICE-UDP, ICE-TCP, Raw UDP, In-Band Bytestreams, SOCKS5 Bytestreams, etc. This is the 'how' of the session. In Jingle XML syntax this is the namespace of the &TRANSPORT; element. The transport method defines how to transfer bits from one host to another. Each transport method MUST specify whether it is "datagram" or "streaming" as described in the Transport Types section of this document.

    In diagrams, the following conventions are used:

    @@ -589,15 +610,15 @@ Romeo Juliet

    This document defines the semantics and syntax for overall session management. It also provides pluggable "slots" for application formats and transport methods, which are specified in separate documents.

    At the most basic level, the process for initial negotiation of a Jingle session is as follows:

      -
    1. One user (the "initiator") sends to another user (the "responder") a session initiation request with at least one content definition.
    2. -
    3. If the responder wants to proceed, it acknowledges the session initiation request by sending an IQ result.
    4. -
    5. The parties attempt to set up data transmission over the designated transport method as defined in the relevant specification, which might involve the exchange of transport-info actions.
    6. -
    7. Optionally, either party can add or remove content definitions, or change the direction of the media flow.
    8. -
    9. Optionally, either party can send session-info actions (to inform the other party that it is attempting transport negotiation, that its device is ringing, etc.).
    10. -
    11. As soon as the responder determines that data can flow over the designated transport, it sends to the initiator a session-accept action.
    12. -
    13. The parties start sending data over the transport.
    14. +
    15. One user (the "initiator") sends to another user (the "responder") a session-initiate message containing at least one content definition, each of which defines one application type, one transport method, and optionally one security precondition.
    16. +
    17. If the responder wishes to proceed, it sends a session-accept message to the initiator, optionally including one or more transport candidates (depending on the transport method specified in the session-initiate message).
    18. +
    19. The parties attempt to establish connectivity over the offered transport method as defined in the relevant specification, which might involve the exchange of transport-info messages for additional transport candidates; if connectivity cannot be established then the parties might attempt to fall back to another transport method using the transport-replace and transport-accept messages.
    20. +
    21. Optionally, the parties attempt to establish security for the transport method before using it to exchange application data.
    22. +
    23. Optionally, either party can add or remove content definitions, or change the direction of the media flow, using the content-add, content-remove, and content-modify messages.
    24. +
    25. Optionally, either party can send session-info messages (e.g., to inform the other party that its device is ringing).
    26. +
    27. As soon as the parties determine that data can flow over the negotiated transport (potentially only after a security precondition has been met), they start sending application data over the transport.
    -

    After the initial session negotiation has been completed and the session is in the ACTIVE state, the parties can adjust the session definition by sending additional Jingle actions, such as content-modify, content-remove, content-add, description-info, and transport-replace. In addition, certain transport methods allow continued sending of transport-info actions while in the ACTIVE state. And naturally the parties can send session-info actions at any time.

    +

    Even after application data is being exchanged, the parties can adjust the session definition by sending additional Jingle messages, such as content-modify, content-remove, content-add, description-info, security-info, session-info, and transport-replace.

    The state machine for overall session management (i.e., the state per Session ID) is as follows:

    @@ -649,7 +670,8 @@ PENDING o----------------------+ |
  • ACTIVE
  • ENDED
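
    The three session states and the session-level actions named in this document can be sketched as a small state tracker. This is a minimal illustrative sketch in Python, not part of the specification: the `JingleSession` class and its `handle` method are hypothetical names, and the transition rules shown (session-initiate creates a session in PENDING, session-accept moves it to ACTIVE, session-terminate ends it from either state, and every other action leaves the state unchanged) are assumptions drawn from the surrounding text.

```python
from enum import Enum


class State(Enum):
    PENDING = "PENDING"
    ACTIVE = "ACTIVE"
    ENDED = "ENDED"


class JingleSession:
    """Hypothetical helper that tracks the overall state for one Session ID."""

    def __init__(self, sid):
        self.sid = sid
        self.state = None  # no session exists until session-initiate is seen

    def handle(self, action):
        """Apply a Jingle action name and return the resulting state."""
        if self.state is State.ENDED:
            raise ValueError("session has already ended")
        if self.state is None:
            # Assumption: only session-initiate can create a session.
            if action != "session-initiate":
                raise ValueError("no session exists for this Session ID")
            self.state = State.PENDING
        elif action == "session-initiate":
            raise ValueError("session already exists")
        elif action == "session-terminate":
            # Either party can terminate from PENDING or ACTIVE.
            self.state = State.ENDED
        elif action == "session-accept":
            if self.state is not State.PENDING:
                raise ValueError("session-accept is only valid while PENDING")
            self.state = State.ACTIVE
        # Any other action (transport-info, content-add, session-info, ...)
        # leaves the overall session state unchanged.
        return self.state


session = JingleSession("a73sjjvkla37jfea")
for action in ("session-initiate", "transport-info", "session-accept",
               "content-add", "session-terminate"):
    print(action, "->", session.handle(action).value)
```

    The sketch keeps one state value per Session ID, matching the note that the state machine applies to the overall session rather than to individual content definitions.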
  • -

    The actions related to management of the overall Jingle session are as follows (detailed definitions are provided in under Action Attribute).

    +

    Note: Although it is allowed to send all actions while in the PENDING state, typically the responder will send a session-accept message as quickly as possible in order to expedite the transport negotiation; see the Security Considerations section of this document regarding information exposure when the responder sends transport candidates to the initiator.

    +

    The actions related to management of the overall Jingle session are as follows (detailed definitions are provided in the Action Attribute section of this document).