<p>&xep0167; defines the &xep0166; signalling exchanges needed to establish voice chat and other audio sessions using the Real-time Transport Protocol &rfc3550;; however, it does not say which audio codecs are mandatory-to-implement, since the state of codec technologies is more fluid than the signalling interactions. This document fills that gap by providing guidance to Jingle developers regarding audio codecs.</p>
<p>Because codec technologies are typically subject to patents, the topics discussed here are controversial. This document attempts to steer a middle path between (1) specifying mandatory-to-implement technologies that realistically will not be implemented and deployed and (2) providing guidelines that, while realistic, do not encourage the implementation and deployment of patent-clear technologies.</p>
<p>Ideally, an audio codec recommended for Jingle would meet the following criteria:</p>
<di><dt>Quality</dt><dd>The encoding quality is acceptable for deployment among XMPP users.</dd></di>
<di><dt>Packetization</dt><dd>The specification of the codec clearly defines packetization of data for sending over RTP.</dd></di>
<di><dt>Availability</dt><dd>The codec can be implemented on a wide variety of computing platforms and is commonly used in Internet or other systems.</dd></di>
<di><dt>Patents</dt><dd>The codec is patent-clear. <note>The term patent-clear does not necessarily mean that no patents have ever been applied for or granted regarding a technology, or that the technology is completely free from patents (since such a judgment is nearly impossible to make, and is outside the purview of the XMPP developer community and the XMPP Standards Foundation); the term means only that those who implement the technology are generally understood to be relatively safe from the threat of patent litigation, because any relevant patents have expired, were filed in a defensive manner, or are made available under suitable royalty-free licenses.</note> (Although most XMPP developers would prefer to implement codecs that are patent-clear, such options are not always widely implemented and deployed.)</dd></di>
<p>Unfortunately, not all codecs meet those criteria. In the remainder of this document we discuss the audio codecs that are most appropriate for implementation in Jingle RTP applications.</p>
<p>G.711 refers to the Pulse Code Modulation (PCM) codec defined in &ITU; recommendation G.711, which is widely used on the public switched telephone network (PSTN) and by many voice over Internet Protocol (VoIP) providers. There are two versions: the μ-law ("U-law") version, carried in RTP under the payload name PCMU, is widely deployed in North America and Japan, whereas the A-law version, carried under the payload name PCMA, is widely deployed in the rest of the world. The following table summarizes the available information about G.711.</p>
<table caption='Codec Considerations for G.711'>
<tr>
<th>Quality</th>
<th>Packetization</th>
<th>Availability</th>
<th>Patents</th>
</tr>
<tr>
<td>Good quality; no wide-band mode.</td>
<td>See &rfc5391;.</td>
<td>Commonly deployed in both PSTN and VoIP systems.</td>
</tr>
</table>
<p>The Opus codec is under development within the IETF's <link url='http://tools.ietf.org/wg/codec/'>Codec Working Group</link>. In essence it combines the best features of CELT (developed by Jean-Marc Valin, the creator of Speex) and SILK (created by Skype and widely used in its service). The following table summarizes the available information about Opus.</p>
<table caption='Codec Considerations for Opus'>
<tr>
<th>Quality</th>
<th>Packetization</th>
<th>Availability</th>
<th>Patents</th>
</tr>
<tr>
<td>Extremely high quality; can be used for wide-band audio.</td>
<td>See <link url='http://tools.ietf.org/html/draft-ietf-codec-opus'>http://tools.ietf.org/html/draft-ietf-codec-opus</link> for details.</td>
<td>Covered under IETF IPR rules; the intent is for the codec to be covered under a simplified BSD license. Not commonly deployed yet, but the SILK codec on which it is partly based is very widely deployed.</td>
<td>Designed to be patent-clear, but IPR claims have been filed.</td>
</tr>
</table>
<p>According to the speex.org website, the Speex codec is "an Open Source/Free Software patent-free audio compression format designed for speech". Speex was developed by Jean-Marc Valin and is maintained by the <link url='http://www.xiph.org/'>Xiph.org Foundation</link>. The following table summarizes the available information about Speex.</p>
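<table caption='Codec Considerations for Speex'>
<tr>
<th>Quality</th>
<th>Packetization</th>
<th>Availability</th>
<th>Patents</th>
</tr>
<tr>
<td>Good quality; optimized for voice; can be used for wide-band audio.</td>
<td>See &rfc5574;.</td>
<td>Freely downloadable under a revised BSD license at &lt;<link url='http://speex.org/'>http://speex.org/</link>&gt; and commonly deployed on Internet (VoIP) systems; not commonly deployed on non-Internet systems.</td>
</tr>
</table>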
<p>Given that both Speex and G.711 are patent-clear, freely implementable, and commonly deployed, implementers are encouraged to consider including support for both codecs in audio applications of Jingle RTP sessions. Discussion on the jingle@xmpp.org mailing list indicates a slight preference for G.711 because it is easily available and so widely deployed (e.g., in SIP networks and the PSTN). The Opus codec is not yet widely deployed (or even fully developed), but it might become the "codec of the future" for audio applications over the Internet.</p>
<p>As of June 2011, this document makes the following recommendations:</p>
<ol>
<li>Jingle clients MUST implement both PCMU and PCMA.</li>
<li>Gateways between Jingle networks and other networks (e.g., SIP networks and the PSTN) MUST implement either PCMU or PCMA (and preferably both).</li>
</ol>
<p>Naturally, clients and gateways can implement additional codecs, such as those listed in this document.</p>
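<p>As a non-normative sketch of how these recommendations might appear on the wire, the following RTP description (using the namespace defined in <cite>XEP-0167</cite>) offers both G.711 variants plus Speex as an additional codec. The payload-type ids 0 and 8 are the static assignments for PCMU and PCMA in RFC 3551; the dynamic ids 96 and 97 used for Speex are arbitrary values chosen for illustration, and the ordering of the payload types is likewise an assumption rather than a requirement of this document.</p>
<example caption='Illustrative Payload-Type Offer (Non-Normative)'><![CDATA[
<description xmlns='urn:xmpp:jingle:apps:rtp:1' media='audio'>
  <!-- G.711 variants recommended above (static payload types) -->
  <payload-type id='0' name='PCMU' clockrate='8000'/>
  <payload-type id='8' name='PCMA' clockrate='8000'/>
  <!-- optional additional codec; ids 96 and 97 are illustrative dynamic values -->
  <payload-type id='96' name='speex' clockrate='16000'/>
  <payload-type id='97' name='speex' clockrate='8000'/>
</description>
]]></example>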
<p>For security considerations related to Jingle RTP sessions, refer to <cite>XEP-0167</cite>. This document introduces no new security considerations. See also the security considerations described in the relevant codec specifications.</p>
<p>Thanks to Olivier Crête, Dave Cridland, Florian Jensen, Justin Karneges, Evgeniy Khramtsov, Marcus Lundblad, Tobias Markmann, Pedro Melo, Jack Moffitt, Jeff Muller, Jehan Pagès, Arc Riley, Kevin Smith, Remko Tronçon, Justin Uberti, and Paul Witty for their feedback.</p>