<abstract>This document defines methods for negotiating Jingle audio sessions that use the Real-time Transport Protocol (RTP) for media exchange.</abstract>
<remark><p>Renamed to mention RTP as the associated transport; corrected negotiation flow to be consistent with SIP/SDP (each party specifies a list of the payload types it can receive); added profile attribute to content element in order to specify RTP profile in use.</p></remark>
<remark><p>Specified how to include SDP parameters and codec-specific parameters; clarified negotiation process; added Speex examples; removed queued info message.</p></remark>
<remark><p>Defined info message for busy; added info message examples; recommended use of Speex; updated schema and XMPP Registrar considerations.</p></remark>
<p>&xep0166; can be used to initiate and negotiate a wide range of peer-to-peer sessions. One session type of interest is audio chat. This document specifies a format for describing Jingle audio sessions.</p>
<p>A Jingle audio session is described by one or more encodings contained within a wrapper <description/> element. In the language of <cite>RFC 4566</cite> these encodings are payload-types; therefore, each <payload-type/> element specifies an encoding that can be used for the audio stream. In Jingle Audio, these encodings are used in the context of RTP. The most common encodings for the Audio/Video Profile (AVP) of RTP are listed in &rfc3551; (these "static" types are reserved from payload ID 0 through payload ID 95), although other encodings are allowed (these "dynamic" types use payload IDs 96 to 127) in accordance with the dynamic assignment rules described in Section 3 of <cite>RFC 3551</cite>.</p>
<p>Each <payload-type/> element MAY contain one or more child elements that specify particular parameters related to the payload. For example, as described in <cite>draft-ietf-avt-rtp-speex</cite><note>This Internet-Draft has expired; see <<linkurl='http://www.watersprings.org/pub/id/draft-ietf-avt-rtp-speex-00.txt'>http://www.watersprings.org/pub/id/draft-ietf-avt-rtp-speex-00.txt</link>> for an archived version.</note>, the "ebw", "eng", "mode", "sr", and "vbr" parameters may be specified in relation to usage of the Speex <note>See <<linkurl='http://www.speex.org/'>http://www.speex.org/</link>>.</note> codec. Where such parameters are encoded via the "fmtp" SDP attribute, they shall be represented in Jingle via the following format:</p>
<p>Note: The parameter names are effectively guaranteed to be unique, since &IANA; maintains a registry of SDP parameters (see <<linkurl='http://www.iana.org/assignments/sdp-parameters'>http://www.iana.org/assignments/sdp-parameters</link>>).</p>
<section1topic='Negotiating a Jingle Audio Session'anchor='negotiation'>
<p>When the initiator sends a session-initiate stanza to the receiver, the &DESCRIPTION; element includes all of the payload types that the initiator can receive for Jingle audio (each one encapsulated in a separate &PAYLOADTYPE; element):</p>
<p>Upon receiving the session-initiate stanza, the receiver determines whether it can provisionally accept the session and proceed with the negotiation. The general Jingle error cases are specified in <cite>XEP-0166</cite>. In addition, the receiver must determine if it supports any of the payload types advertised by the initiator; if it does not, it MUST reject the session by sending a <unsupported-codecs/> error:</p>
<examplecaption="Receiver Does Not Support Codecs"><![CDATA[
<p>The receiver then should send a list of the payload types that it can receive via a Jingle "content-accept" (or "session-accept") action. The list that the receiver sends MAY include any payload types (not a subset of the payload types sent by the initiator) but SHOULD retain the ID numbers and order specified by the initiator.</p>
<p>If the payload type is static (payload-type IDs 0 through 95 inclusive), it MUST be mapped to a media field defined in <cite>RFC 4566</cite>. The generic format for the media field is as follows:</p>
<p>In the context of Jingle audio sessions, the <media> is "audio", the <port> is the preferred port for such communications (which may be determined dynamically), the <transport> is whatever transport method is negotiated via the Jingle negotiation (e.g., "RTP/AVT"), and the <fmt list> is the payload-type ID.</p>
<p>If the payload type is dynamic (payload-type IDs 96 through 127 inclusive), it SHOULD be mapped to an SDP media field plus an SDP attribute field named "rtpmap".</p>
<p>As noted, if additional parameters are to be specified, they shall be represented as attributes of the <payload-type/> element of the child <parameter/> element, as in the following example.</p>
<p>Informational messages may be sent by either party within the context of Jingle to communicate the status of a Jingle audio session, device, or principal. The informational message MUST be an IQ-set containing a &JINGLE; element of type "session-info", where the informational message is a payload element qualified by the 'http://www.xmpp.org/extensions/xep-0167.html#ns-info' namespace; the following payload elements are defined: <note>A <trying/> element (equivalent to the SIP 100 Trying response code) is not necessary, since each session-level action is acknowledged via XMPP IQ semantics.</note></p>
<p>Note: Because the informational message is sent in an IQ-set, the receiving party MUST return either an IQ-result or an IQ-error (normally only an IQ-result to acknowledge receipt; no error flows are defined or envisioned at this time).</p>
<p>If an entity supports Jingle audio exchanges via RTP, it MUST advertise that fact by returning a feature of "http://www.xmpp.org/extensions/xep-0167.html#ns" &NSNOTE; in response to &xep0030; information requests.</p>
<examplecaption="Service Discovery Information Request"><![CDATA[
<p>When the Jingle Audio content is accepted via a 'content-accept' action, both initiator and responder SHOULD start listening for audio as defined by the negotiated transport method and audio description. For interoperability with telephony systems, each entity SHOULD both play any audio received and send a ringing tone at this time (i.e., before the receiver sends a 'session-accept' action).</p>
<p>In order to secure the data stream, implementations SHOULD use encryption methods appropriate to the transport method and media being exchanged; for example, in the case of UDP, that would include Datagram Transport Layer Security (DTLS) as specified in &rfc4347;. &sdpdtls; defines such methods for the Session Description Protocol; the relevant RTP profile (e.g., "UDP/TLS/RTP/AVP" for transporting the RTP stream over DTLS with UDP) shall be specified as the value of the &CONTENT; element's 'profile' attribute.</p>
<p>Until this specification advances to a status of Draft, its associated namespaces shall be "http://www.xmpp.org/extensions/xep-0167.html#ns" and "http://www.xmpp.org/extensions/xep-0167.html#ns-info"; upon advancement of this specification, the ®ISTRAR; shall issue permanent namespaces in accordance with the process defined in Section 4 of &xep0053;.</p>