mirror of https://github.com/moparisthebest/xeps synced 2024-12-21 23:28:51 -05:00

this is 0.11

git-svn-id: file:///home/ksmith/gitmigration/svn/xmpp/trunk@151 4b5297f7-1745-476d-ba37-a9c6900126ab
Peter Saint-Andre 2006-10-31 22:48:05 +00:00
parent 51ab061ce8
commit ae0c987203


@ -154,18 +154,23 @@
</revision>
</header>
<section1 topic='Introduction' anchor='intro'>
<p>There exists no widely-adopted standard for initiating and managing peer-to-peer (p2p) interactions (such as voice, video, or file sharing exchanges) from within Jabber/XMPP clients. Although several large service providers and Jabber/XMPP clients have written and implemented their own proprietary XMPP extensions for p2p signalling (usually only for voice), those technologies are not open and do not always take into account requirements to interoperate with the Public Switched Telephone Network (PSTN) or emerging SIP-based Internet voice networks. By contrast, the only existing open protocol has been &xep0111;, which made it possible to initiate and manage p2p sessions, but which did not provide enough of the key signalling semantics to be easily implemented in Jabber/XMPP clients. <note>It is true that TINS made it relatively easy to implement an XMPP-to-SIP gateway; however, in line with the long-time Jabber philosophy of "simple clients, complex servers", it would be better to force complexity onto the server-side gateway and to keep the client as simple as possible.</note></p>
<p>There exists no widely-adopted standard for initiating and managing peer-to-peer (p2p) interactions (such as voice, video, or file sharing exchanges) from within Jabber/XMPP clients. Although several large service providers and Jabber/XMPP clients have written and implemented their own proprietary XMPP extensions for p2p signalling (usually only for voice), those technologies are not open and do not always take into account requirements to interoperate with the Public Switched Telephone Network (PSTN) or Voice over Internet Protocol (VoIP) networks based on the IETF's Session Initiation Protocol (SIP) as specified in &rfc3261; and its various extensions.</p>
<p>By contrast, the only existing open protocol has been &xep0111;, which made it possible to initiate and manage p2p sessions, but which did not provide enough of the key signalling semantics to be easily implemented in Jabber/XMPP clients. <note>It is true that TINS made it relatively easy to implement an XMPP-to-SIP gateway; however, in line with the long-time Jabber philosophy of "simple clients, complex servers", it would be better to force complexity onto the server-side gateway and to keep the client as simple as possible.</note></p>
<p>The result has been an unfortunate fragmentation within the XMPP community regarding signalling protocols. There are, essentially, two approaches to solving the problem:</p>
<ol>
<li>Recommend that all client developers implement a dual-stack (XMPP + SIP) solution.</li>
<li>Define a full-featured protocol for XMPP signalling.</li>
</ol>
<p>Implementation experience indicates that a dual-stack approach may not be feasible on all the computing platforms for which Jabber clients have been written, or even desirable on platforms where it is feasible. <note>For example, one large ISP recently decided to switch to a pure XMPP approach after having implemented and deployed a dual-stack client for several years.</note> Therefore, it seems reasonable to define an XMPP signalling protocol that can provide the necessary signalling semantics while also making it possible to interoperate with existing Internet standards.</p>
<p>As a result of feedback received on <cite>XEP-0111</cite>, the second and fourth authors of this document began to define such a signalling protocol, code-named Jingle. Upon communication with members of the Google Talk team, <note>Google Talk is a messaging and voice chat service and client provided by Google; see &lt;<link url='http://www.google.com/talk/'>http://www.google.com/talk/</link>&gt;.</note> it was discovered that the emerging Jingle approach was conceptually (and even syntactically) quite similar to the signalling protocol used in the Google Talk application. Therefore, in the interest of interoperability and adoption, we decided to harmonize the two approaches. The signalling protocol specified therein is, therefore, substantially equivalent to the existing Google Talk protocol, with several adjustments based on feedback received from implementors as well as for publication within the Jabber Software Foundation's standards process.</p>
<p>Implementation experience indicates that a dual-stack approach may not be feasible on all the computing platforms for which Jabber clients have been written, or even desirable on platforms where it is feasible. Therefore, it seems reasonable to define an XMPP signalling protocol that can provide the necessary signalling semantics while also making it relatively straightforward to interoperate with existing Internet standards.</p>
<p>As a result of feedback received on <cite>XEP-0111</cite>, the original authors of this document (Joe Hildebrand and Peter Saint-Andre) began to define such a signalling protocol, code-named Jingle. Upon communication with members of the Google Talk team, <note>Google Talk is a messaging and voice chat service and client provided by Google; see &lt;<link url='http://www.google.com/talk/'>http://www.google.com/talk/</link>&gt;.</note> it was discovered that the emerging Jingle approach was conceptually (and even syntactically) quite similar to the signalling protocol used in the Google Talk application. Therefore, in the interest of interoperability and adoption, we decided to harmonize the two approaches. The signalling protocol specified herein is, therefore, substantially equivalent to the existing Google Talk protocol, with several adjustments based on feedback received from implementors as well as for publication within the Jabber Software Foundation's standards process.</p>
<p>The purpose of Jingle is not to supplant or replace SIP. Because dual-stack XMPP+SIP clients are difficult to build, given that they essentially have two centers of program control, <note>For example, one large ISP recently decided to switch to a pure XMPP approach after having implemented and deployed a dual-stack client for several years.</note> we have designed Jingle as a pure XMPP signalling protocol. However, Jingle is intended to interwork with SIP so that the millions of deployed XMPP clients can be added onto the existing open VoIP networks, rather than limiting XMPP users to a separate and distinct VoIP network.</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<p>The protocol defined herein is designed to meet the following requirements:</p>
<ol>
<li>Make it possible to manage a wide variety of peer-to-peer sessions (not limited to voice and video) within XMPP. <note>Possible other content description formats include file sharing, application casting, application sharing, whiteboarding, torrent broadcasting, shared real-time editing, and distributed musical performance, to name but a few.</note>.</li>
<li>Clearly separate the signalling channel (XMPP) from the data channel (e.g., Real-time Transport Protocol as specified in &rfc3550;).</li>
<li>Clearly separate the content description formats (e.g., for voice chat) from the content transport methods (e.g., User Datagram Protocol as specified in &rfc0768;).</li>
<li>Make it possible to manage a wide variety of peer-to-peer sessions (not limited to voice and video) within XMPP. <note>Possible other content description formats include file sharing, application casting, application sharing, whiteboarding, torrent broadcasting, shared real-time editing, and distributed musical performance, to name but a few.</note></li>
<li>Make it possible to add, modify, and remove content types from an existing session.</li>
<li>Make it relatively easy to implement support for the protocol in standard Jabber/XMPP clients.</li>
@ -173,9 +178,9 @@
</ol>
<p>This document defines the signalling protocol only. Additional documents specify the following:</p>
<ul>
<li><p>Various content description formats (audio, video, etc.) and, where possible, mapping those types to the Session Description Protocol (SDP; see &rfc4566;); one example is &xep0167;.</p></li>
<li><p>Various content transport methods.</p></li>
<li><p>Procedures for mapping the Jingle signalling protocol to existing signalling standards such as the IETF's Session Initiation Protocol (SIP; see &rfc3261;) and the ITU's H.323 protocol (see &h323;).</p></li>
<li><p>Various content description formats (audio, video, etc.) and, where possible, mapping those types to the Session Description Protocol (SDP; see &rfc4566;); examples include &xep0167; and &xep0180;.</p></li>
<li><p>Various content transport methods; examples include &xep0176; and &xep0177;.</p></li>
<li><p>Procedures for mapping the Jingle signalling protocol to existing signalling standards such as the IETF's Session Initiation Protocol (SIP; see &rfc3261;) and the ITU's H.323 protocol (see &h323;); these documents have not yet been written.</p></li>
</ul>
</section1>
<section1 topic='Glossary' anchor='glossary'>
@ -186,15 +191,15 @@
</tr>
<tr>
<td>Session</td>
<td>A number of pairs of negotiated content transport methods and content description formats connecting two entities. It is delimited in time by an initiate request and session ending events. During the lifetime of a session, pairs of content descriptions and content transport methods can be added or removed.</td>
<td>A number of pairs of negotiated content transport methods and content description formats connecting two entities. It is delimited in time by an initiate request and session ending events. During the lifetime of a session, pairs of content descriptions and content transport methods can be added or removed. A session consists of at least one active negotiated content type at a time.</td>
</tr>
<tr>
<td>Content Type</td>
<td>A formal declaration of the purpose(s) of the session. Common sessions might include types such as "voice", both "voice" and "video", and "file sharing". A session consists of at least one active negotiated content type at a time. Depending on the content type and the content description, one content description may require multiple components to be communicated by the transport. This is the 'what' of the session. In Jingle XML syntax the content type is the namespace of the &DESCRIPTION; element.</td>
<td>The combination of one content description and at least one content transport method.</td>
</tr>
<tr>
<td>Content Description</td>
<td>The details of the content type being established. For instance, this might describe the acceptable codecs when establishing a voice conversation. The XML elements for the content description are qualified by the namespace of the content type. The content description defines the bits to be transferred.</td>
<td>The format of the content type being established, which formally declares one purpose of the session (e.g., "voice" or "video"). This is the 'what' of the session (i.e., the bits to be transferred), such as the acceptable codecs when establishing a voice conversation. In Jingle XML syntax the content type is the namespace of the &DESCRIPTION; element.</td>
</tr>
<tr>
<td>Transport Method</td>
@ -202,7 +207,7 @@
</tr>
<tr>
<td>Component</td>
<td>A component is a numbered stream of data which needs to be transmitted between the endpoints for a given content type in the context of a given session. It is up to the transport to negotiate the details of each component. For instance, the voice content type might use two components, one to transmit an RTP stream, and another to transmit RTCP timing information.</td>
<td>A component is a numbered stream of data which needs to be transmitted between the endpoints for a given content type in the context of a given session. It is up to the transport to negotiate the details of each component. Depending on the content type and the content description, one content description may require multiple components to be communicated (e.g., the audio content type might use two components: one to transmit an RTP stream and another to transmit RTCP timing information).</td>
</tr>
</table>
</section1>
@ -224,7 +229,6 @@
<li>The parties start sending media over the negotiated candidate.</li>
</ol>
<p>If the parties later discover a better candidate, they perform a "transport-modify" or "content-modify" negotiation and then switch to the better candidate. Naturally they can also modify various other parameters related to the session (e.g., adding video to a voice chat).</p>
>>>>>>> 1.6
<section2 topic='Overall Session Management' anchor='concepts-session'>
<p>The state machine for overall session management (i.e., the state per Session ID) is as follows:</p>
<code>
@ -266,7 +270,7 @@ PENDING o---------------------+ |
<li>ACTIVE</li>
<li>ENDED</li>
</ol>
<p>The actions related to management of the overall Jingle session are:</p>
<p>The actions related to management of the overall Jingle session are as follows:</p>
<ol start='1'>
<li>content-accept</li>
<li>content-add</li>
@ -326,7 +330,7 @@ PENDING o---------------------+ |
<li>MODIFYING</li>
<li>ENDED</li>
</ol>
<p>The actions related to management of content description formats are:</p>
<p>The actions related to management of content description formats are as follows:</p>
<ol>
<li>description-accept</li>
<li>description-decline</li>
@ -336,7 +340,7 @@ PENDING o---------------------+ |
<p>These actions are defined in greater detail under <link url='#actions'>Jingle Actions</link>.</p>
</section3>
<section3 topic='Content Transport Methods' anchor='concepts-transport'>
<p>As with the content description formats, the content transport methods are specified in separate specifications. Possible content transport methods include Real-time Transport Protocol (RTP) with Interactive Connectivity Establishment (ICE) and raw UDP. Those specifications will also define the state chart for the content transport method in question.</p>
<p>As with the content description formats, the content transport methods are defined in separate specifications. Possible content transport methods include Real-time Transport Protocol (RTP) with Interactive Connectivity Establishment (ICE) and raw UDP. The relevant specifications also define the state chart for the content transport method in question.</p>
<p>The generic state machine for any given content transport method is as follows:</p>
<code>
START
@ -373,7 +377,7 @@ PENDING o---------------------+ |
<li>MODIFYING</li>
<li>ENDED</li>
</ol>
<p>The actions related to management of content transport methods are:</p>
<p>The actions related to management of content transport methods are as follows:</p>
<ol>
<li>transport-accept</li>
<li>transport-decline</li>
@ -384,8 +388,8 @@ PENDING o---------------------+ |
</section3>
</section2>
</section1>
<section1 topic='Protocol' anchor='protocol'>
<section2 topic='Resource Determination' anchor='protocol-resource'>
<section1 topic='Session Flow' anchor='session'>
<section2 topic='Resource Determination' anchor='session-resource'>
<p>In order to initiate a Jingle session, the initiating entity must determine which of the receiver's XMPP resources is best for the desired content description format. If a contact has only one XMPP resource, this task MUST be completed using &xep0030; or the presence-based profile of service discovery specified in &xep0115;.</p>
<p>Naturally, instead of sending service discovery requests to every contact in a user's roster, it is more efficient to use <cite>Entity Capabilities</cite>, whereby support for Jingle and various Jingle content description formats and content transport methods is determined for a client version in general (rather than on a per-JID basis) and then cached. Refer to <cite>XEP-0115</cite> for details.</p>
<p>If a contact has more than one XMPP resource, it may be that only one of the resources supports Jingle and the desired content description format, in which case the user MUST initiate the Jingle signalling with that resource.</p>
@ -522,13 +526,13 @@ PENDING o---------------------+ |
<p>Both entities MUST now consider the original session to be in the ENDED state, and if the initiating entity wishes to initiate a session with the redirected address it MUST do so by sending a session initiation request to that address with a new session ID.</p>
</section2>
<section2 topic='Decline' anchor='protocol-decline'>
<p>In order to decline the session initiation request, the receiver MUST acknowledge receipt of the session initiation request, then terminate the session as described in the <link url='#protocol-terminate'>Termination</link> section of this document.</p>
<p>In order to decline the session initiation request, the receiver MUST acknowledge receipt of the session initiation request, then terminate the session as described under <link url='#session-terminate'>Termination</link>.</p>
</section2>
<section2 topic='Negotiation' anchor='protocol-negotiation'>
<section2 topic='Negotiation' anchor='session-negotiation'>
<p>In general, negotiation will be necessary before the parties can agree on an acceptable set of content types, content description formats, and content transport methods. The potential combinations of parameters to be negotiated are many, and not all are shown herein (some are shown in the relevant specifications for various content description formats and content transport methods).</p>
<p>One session-level negotiation is to <em>remove</em> a content type. For example, let us imagine that Juliet is having a bad hair day. She certainly does not want to include video in her Jingle session with Romeo, so she sends a "content-remove" request to Romeo:</p>
<example caption="Content Type Removal"><![CDATA[
<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='reduce1' type='set'>
<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='reduce1' type='set'>
<jingle xmlns='http://jabber.org/protocol/jingle'
action='content-remove'
initiator='romeo@montague.net/orchard'
@ -544,7 +548,7 @@ PENDING o---------------------+ |
<p>If the removal results in no more content types for the session, the entity that receives the "content-remove" action SHOULD send a "session-terminate" action to the other party (since a session with no content types is void).</p>
<p>Another session-level negotiation is to <em>add</em> a content type; however, this MUST NOT be done while the session is in the PENDING state and is allowed only while the session is in the ACTIVE state (see below).</p>
</section2>
<section2 topic='Acceptance' anchor='protocol-acceptance'>
<section2 topic='Acceptance' anchor='session-acceptance'>
<p>If (after negotiation of content transport methods and content description formats) the receiver determines that it will be able to establish a connection, it sends a definitive acceptance to the initiating entity:</p>
<example caption="Receiver Definitively Accepts the Call"><![CDATA[
<iq type='set' from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='accept1'>
@ -585,12 +589,12 @@ PENDING o---------------------+ |
<p>Now the initiating entity and receiver can begin sending content over the negotiated connection.</p>
<p>If one of the parties cannot find a suitable content transport method, it SHOULD terminate the session as described below.</p>
</section2>
<section2 topic='Modifying an Active Session' anchor='protocol-modify'>
<section2 topic='Modifying an Active Session' anchor='session-modify'>
<p>In order to modify an active session, either party may send a "content-remove", "content-add", "content-modify", "description-modify", or "transport-modify" action to the other party. The receiving party then sends an appropriate "-accept" or "-decline" action, and may first send an appropriate "-info" action.</p>
<p>If both parties send modify messages at the same time, the modify message from the session initiator MUST trump the modify message from the recipient and the initiator SHOULD return an &unexpected; error to the other party.</p>
<p>One example of modifying an active session is to <em>add</em> a content type. For example, let us imagine that Juliet gets her hair in order and now wants to add video. She now sends a "content-add" request to Romeo:</p>
<example caption="Adding a Content Type"><![CDATA[
<iq to='juliet@capulet.com/balcony' from='romeo@montague.net/orchard' id='add1' type='set'>
<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='add1' type='set'>
<jingle xmlns='http://jabber.org/protocol/jingle'
action='content-add'
initiator='romeo@montague.net/orchard'
@ -638,7 +642,7 @@ PENDING o---------------------+ |
<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='add2' type='result'/>
]]></example>
</section2>
<section2 topic='Termination' anchor='protocol-terminate'>
<section2 topic='Termination' anchor='session-terminate'>
<p>In order to gracefully end the session (which MAY be done at any point after acknowledging receipt of the initiation request, including immediately thereafter in order to decline the request), either the receiver or the initiating entity MUST send a "terminate" action to the other party:</p>
<example caption="Receiver Terminates the Session"><![CDATA[
<iq from='juliet@capulet.com/balcony'
@ -666,8 +670,8 @@ PENDING o---------------------+ |
]]></example>
<p>Naturally, in this case there is nothing for the initiating entity to acknowledge.</p>
</section2>
<section2 topic='Informational Messages' anchor='protocol-info'>
<p>At any point after initiation of a Jingle session, either entity MAY send an informational message to the other party, for example to change a content transport method or content description format parameter, inform the other party that a session initiation request is queued, that a device is ringing, or that a scheduled event has occurred or will occur. An information message MUST be an IQ-set containing a &JINGLE; element whose 'action' attribute is set to a value of "session-info", "description-info", or "transport-info"; the &JINGLE; element MUST further contain a payload child element (specific to the session, content description format, or content transport method) that specifies the information being communicated. If an empty "session-info" message is received for an active session, the receiving entity MUST send an empty IQ result. This way, an empty "session-info" message may be used as a "ping" to determine session vitality. (A future version of this specification will define payloads related to the "session-info" action.)</p>
<section2 topic='Informational Messages' anchor='session-info'>
<p>At any point after initiation of a Jingle session, either entity MAY send an informational message to the other party, for example to change a content transport method or content description format parameter, or to inform the other party that a session initiation request is queued, that a device is ringing, or that a scheduled event has occurred or will occur. An informational message MUST be an IQ-set containing a &JINGLE; element whose 'action' attribute is set to a value of "session-info", "description-info", or "transport-info"; the &JINGLE; element MUST further contain a payload child element (specific to the session, content description format, or content transport method) that specifies the information being communicated. If an empty "session-info" message is received for an active session, the receiving entity MUST send an empty IQ result. This way, an empty "session-info" message may be used as a "ping" to determine session vitality. (A future version of this specification will define payloads related to the "session-info" action.)</p>
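<p>As a non-normative sketch, an empty "session-info" message used as a ping, followed by the empty IQ result that acknowledges it, might look as follows. The 'id' and 'sid' values shown here are hypothetical placeholders chosen for illustration and are not defined by this specification.</p>
<example caption="Empty session-info Used as a Ping (Illustrative)"><![CDATA[
<!-- Illustrative only: the 'id' and 'sid' values are hypothetical. -->
<iq from='romeo@montague.net/orchard' to='juliet@capulet.com/balcony' id='ping1' type='set'>
  <jingle xmlns='http://jabber.org/protocol/jingle'
          action='session-info'
          initiator='romeo@montague.net/orchard'
          sid='example-session-id'/>
</iq>

<!-- The receiver acknowledges the ping with an empty IQ result. -->
<iq from='juliet@capulet.com/balcony' to='romeo@montague.net/orchard' id='ping1' type='result'/>
]]></example>
<p>Because the acknowledgement is an ordinary IQ result carrying the same 'id', the sender could treat the absence of a result (or receipt of an IQ error) as a sign that the session is no longer live.</p>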
</section2>
</section1>
@ -726,7 +730,6 @@ PENDING o---------------------+ |
</section2>
<section2 topic='transport-modify' anchor='actions-transport-modify'>
<p>This action enables a party to request a change to the content transport methods.</p>
>>>>>>> 1.6
</section2>
</section1>