mirror of https://github.com/moparisthebest/xeps synced 2024-12-01 05:32:15 -05:00

XEP-0106: Update to use PRECIS and newer XMPP RFCs

Sam Whited 2016-07-08 12:47:48 -05:00
parent ee6bd64538
commit d3e38e0426
2 changed files with 37 additions and 28 deletions


@ -7,7 +7,7 @@
<xep> <xep>
<header> <header>
<title>JID Escaping</title> <title>JID Escaping</title>
<abstract>This specification defines a mechanism that enables the display in Jabber Identifiers (JIDs) of characters disallowed by the Nodeprep profile of stringprep. Although these characters -- space, double quote, ampersand, single quote, forward slash, colon, less than, greater than, and at-sign -- cannot be included in XMPP node identifiers, JID Escaping provides a native XMPP escaping mechanism for these characters so that the displayed version of a Jabber Identifier can appear to include these characters. This mechanism can also be used to translate non-XMPP addresses into XMPP syntax, for example when gatewaying between XMPP and a non-XMPP communications technology such as email.</abstract> <abstract>This specification defines a mechanism that enables the display in Jabber Identifiers (JIDs) of characters normally disallowed in localparts. Although these characters &mdash; spaces, double quote, ampersand, single quote, forward slash, colon, less than, greater than, and at-sign &mdash; cannot be included in XMPP localparts, JID Escaping provides a native XMPP escaping mechanism for these characters so that the displayed version of a Jabber Identifier can appear to include these characters. This mechanism can also be used to translate non-XMPP addresses into XMPP syntax, for example when gatewaying between XMPP and a non-XMPP communications technology such as email.</abstract>
&LEGALNOTICE; &LEGALNOTICE;
<number>0106</number> <number>0106</number>
<status>Draft</status> <status>Draft</status>
@ -22,6 +22,12 @@
<shortname>jid\20escaping</shortname> <shortname>jid\20escaping</shortname>
&hildjj; &hildjj;
&stpeter; &stpeter;
<revision>
<version>1.1.1</version>
<date>2016-07-08</date>
<initials>ssw</initials>
<remark><p>Update references to the node identifier to localpart, replace stringprep references with PRECIS, and update JID RFC references.</p></remark>
</revision>
<revision> <revision>
<version>1.1</version> <version>1.1</version>
<date>2007-06-18</date> <date>2007-06-18</date>
@ -78,9 +84,8 @@
</revision> </revision>
</header> </header>
<section1 topic='Introduction' anchor='intro'> <section1 topic='Introduction' anchor='intro'>
<p>&rfc3920; defines the Nodeprep profile of stringprep (&rfc3454;), which specifies that the following nine Unicode code points are disallowed in the node identifier portion of a Jabber Identifier (JID):</p> <p>&rfc7622; specifies that the following eight Unicode code points are disallowed in the localpart of a Jabber Identifier (JID):</p>
<ul> <ul>
<li>U+0020 (" ") <note>In fact all ASCII and non-ASCII space characters are disallowed, since the Nodeprep profile of stringprep prohibits all the characters specified in Appendices C.1.1 and C.1.2 of <cite>RFC 3454</cite>; however, all of these characters reduce to U+0020, also called SP.</note></li>
<li>U+0022 (")</li> <li>U+0022 (")</li>
<li>U+0026 (&amp;)</li> <li>U+0026 (&amp;)</li>
<li>U+0027 (')</li> <li>U+0027 (')</li>
@ -90,16 +95,17 @@
<li>U+003E (&gt;)</li> <li>U+003E (&gt;)</li>
<li>U+0040 (@)</li> <li>U+0040 (@)</li>
</ul> </ul>
<p>This restriction is an inconvenience for users who have one or more of these "disallowed characters" in their desired usernames, particularly in the case of the ' character, which is common in names like O'Hara and D'Artagnan. The restriction is a positive hardship if existing email addresses are mapped to JIDs, since some of the disallowed characters are allowed in the username portion of an email address (specifically, the characters &amp; ' / as described in Sections 3.2.4 and 3.2.5 of &rfc2822;).</p> <p>Furthermore, since localparts use the UsernameCaseMapped profile (&rfc7613;) of PRECIS, any space character disallowed by category N (section 9.14) of the &rfc7564; IdentifierClass is also forbidden.</p>
<p>This restriction is an inconvenience for users who have one or more of these "disallowed characters" in their desired usernames, particularly in the case of the apostrophe character, which is common in names like O'Hara and D'Artagnan. The restriction is a positive hardship if existing email addresses are mapped to JIDs, since some of the disallowed characters are allowed in the username portion of an email address (specifically, the characters &amp; ' / as described in Sections 3.2.4 and 3.2.5 of &rfc2822;).</p>
<p>To overcome this restriction, we define a way to escape the disallowed characters in JIDs. An escaped JID contains none of the disallowed characters and therefore can be transported by native XMPP implementations without modification (e.g., existing XMPP servers do not require modification in order to handle escaped JIDs). The escaped JID is unescaped only for presentation to a human user (typically by an XMPP client) or for gatewaying to a non-XMPP system (such as an LDAP database or a messaging system that does not use XMPP).</p> <p>To overcome this restriction, we define a way to escape the disallowed characters in JIDs. An escaped JID contains none of the disallowed characters and therefore can be transported by native XMPP implementations without modification (e.g., existing XMPP servers do not require modification in order to handle escaped JIDs). The escaped JID is unescaped only for presentation to a human user (typically by an XMPP client) or for gatewaying to a non-XMPP system (such as an LDAP database or a messaging system that does not use XMPP).</p>
</section1> </section1>
<section1 topic='Requirements' anchor='reqs'> <section1 topic='Requirements' anchor='reqs'>
<p>This document addresses the following requirements:</p> <p>This document addresses the following requirements:</p>
<ol> <ol>
<li><p>The escaping mechanism shall apply to the node identitier portion of a JID only, and MUST NOT be applied to domain identifiers or resource identifiers.</p></li> <li><p>The escaping mechanism shall apply to the localpart of a JID only, and MUST NOT be applied to domainparts or resourceparts.</p></li>
<li><p>Escaped JIDs MUST conform to the definition of a Jabber ID as specified in <cite>RFC 3920</cite>, including the Nodeprep profile of stringprep. In particular this means that even after passing through Nodeprep, the JID MUST be valid, with the result that Unicode look-alikes like U+02BC (Modifier Letter Apostrophe) MUST NOT be used.</p></li> <li><p>Escaped JIDs MUST conform to the definition of a Jabber ID as specified in <cite>RFC 7622</cite>, including the UsernameCaseMapped profile of PRECIS. In particular this means that even after passing through the enforcement step of the UsernameCaseMapped profile, the JID MUST be valid, with the result that Unicode look-alikes like U+02BC (Modifier Letter Apostrophe) MUST NOT be used.</p></li>
<li><p>It MUST NOT be possible for clients to use this escaping mechanism to avoid the goal of stringprep; namely, that JIDs that look alike should have the same character representation after being processed by stringprep. Therefore, this mechanism MUST NOT be applied to any characters other than the disallowed characters (with the exception that, in certain circumstances, the escaping character itself ("\") might also be escaped).</p></li> <li><p>It MUST NOT be possible for clients to use this escaping mechanism to avoid the goal of PRECIS; namely, that JIDs that look alike should have the same character representation after being processed by PRECIS. Therefore, this mechanism MUST NOT be applied to any characters other than the disallowed characters (with the exception that, in certain circumstances, the escaping character itself ("\") might also be escaped).</p></li>
<li><p>Existing JIDs that include portions of the escaping mechanism MUST continue to be valid.</p></li> <li><p>Existing JIDs that include portions of the escaping mechanism MUST continue to be valid.</p></li>
<li><p>The escaping mechanism MUST NOT break commonly deployed Jabber/XMPP software implementations such as servers, components, gateways, and clients.</p></li> <li><p>The escaping mechanism MUST NOT break commonly deployed Jabber/XMPP software implementations such as servers, components, gateways, and clients.</p></li>
<li><p>The escaping mechanism SHOULD NOT place undue strain upon server implementations; implementations or deployments that do not need to unescape SHOULD be able to ignore the escaping mechanism.</p></li> <li><p>The escaping mechanism SHOULD NOT place undue strain upon server implementations; implementations or deployments that do not need to unescape SHOULD be able to ignore the escaping mechanism.</p></li>
@ -108,16 +114,16 @@
<section1 topic='Transformations' anchor='transforms'> <section1 topic='Transformations' anchor='transforms'>
<section2 topic='Concepts' anchor='concepts'> <section2 topic='Concepts' anchor='concepts'>
<p>This document specifies that each disallowed character shall be escaped as \hexhex -- where "hexhex" is the hexadecimal value of the Unicode code point in question, ignoring the leading "00" in the code point (e.g., 27 for the ' character, resulting in an escaping of \27).</p> <p>This document specifies that each disallowed character shall be escaped as \hexhex &mdash; where "hexhex" is the hexadecimal value of the Unicode code point in question, ignoring the leading "00" in the code point (e.g., 27 for the apostrophe character, resulting in an escaping of \27).</p>
<p>If the &amp; character had not been in the list of disallowed characters, then normal XML escaping conventions (as specified in &w3xml;) could have been used, with the result that D'Artagnan (for example) could have been rendered as D&amp;apos;artagnan [sic].</p> <p>If the &amp; character had not been in the list of disallowed characters, then normal XML escaping conventions (as specified in &w3xml;) could have been used, with the result that D'Artagnan (for example) could have been rendered as D&amp;apos;artagnan [sic].</p>
<p>It might have been desirable to use percent-encoding (e.g., %27 for the ' character) as specified in Section 2.1 of &rfc3986;. However, that approach was rejected since the % character is an often-used character in existing JIDs (e.g., to replace the @ character in gateway addresses) and the resulting ambiguity would have caused misdelivered or undeliverable messages.</p> <p>It might have been desirable to use percent-encoding (e.g., %27 for the apostrophe character) as specified in Section 2.1 of &rfc3986;. However, that approach was rejected since the % character is an often-used character in existing JIDs (e.g., to replace the @ character in gateway addresses) and the resulting ambiguity would have caused misdelivered or undeliverable messages.</p>
<p>To avoid the problems associated with using &amp; or % as the escaping character, this document specifies a new escaping mechanism that uses the backslash character ("\") followed by "hexhex" (the hexadecimal value of the Unicode code point in question). This escaping method is quite similar to that used for disallowed characters in LDAP distinguished names (see &rfc2253;) but is used only for the characters that are disallowed in XMPP node identifiers (as well as the escaping character itself in certain special situations).</p> <p>To avoid the problems associated with using &amp; or % as the escaping character, this document specifies a new escaping mechanism that uses the backslash character ("\") followed by "hexhex" (the hexadecimal value of the Unicode code point in question). This escaping method is quite similar to that used for disallowed characters in LDAP distinguished names (see &rfc2253;) but is used only for the characters that are disallowed in XMPP localparts (as well as the escaping character itself in certain special situations).</p>
<p>Here is an example of an escaped JID (this would be displayed but never natively transported as "d'artagnan@musketeers.lit"):</p> <p>Here is an example of an escaped JID (this would be displayed but never natively transported as "d'artagnan@musketeers.lit"):</p>
<code> <code>
d\27artagnan@musketeers.lit d\27artagnan@musketeers.lit
</code> </code>
<p>This document describes full escaping and unescaping transformations for all nine disallowed characters. In addition, escaping and unescaping transformations are shown for the \ character in case it also needs to be escaped when it occurs in a JID or non-XMPP address as part of a character sequence that corresponds to one of the escaped characters.</p> <p>This document describes full escaping and unescaping transformations for all disallowed characters. In addition, escaping and unescaping transformations are shown for the \ character in case it also needs to be escaped when it occurs in a JID or non-XMPP address as part of a character sequence that corresponds to one of the escaped characters.</p>
<p>Note: All transformations are exactly as specified below. CASE IS SIGNIFICANT. Lowercase was selected since Nodeprep will case fold to lowercase for US-ASCII characters such as A, C, E, and F.</p> <p>Note: All transformations are exactly as specified below. CASE IS SIGNIFICANT. Lowercase was selected since the Case-Mapping Rule of the UsernameCaseMapped profile will case fold to lowercase.</p>
</section2> </section2>
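The \hexhex derivation described in the Concepts section above could be sketched in Python roughly as follows. This is an illustration only, not part of the specification or of this commit; the names DISALLOWED, escape_sequence, and ESCAPE_TABLE are hypothetical, and the character list is taken from the abstract.

```python
# Hypothetical helper names; the character list follows the abstract:
# space, double quote, ampersand, apostrophe, slash, colon, <, >, at-sign.
DISALLOWED = " \"&'/:<>@"


def escape_sequence(ch: str) -> str:
    """Return a backslash followed by the two lowercase hex digits of ch."""
    code_point = ord(ch)
    if code_point > 0xFF:
        # The scheme drops the leading "00", so it only fits two hex digits.
        raise ValueError("only code points that fit in two hex digits are escaped")
    return "\\{:02x}".format(code_point)


# Mapping for the disallowed characters plus the escaping character itself.
ESCAPE_TABLE = {ch: escape_sequence(ch) for ch in DISALLOWED + "\\"}

for ch, seq in ESCAPE_TABLE.items():
    print(repr(ch), "->", seq)
```

Lowercase hex digits are produced deliberately, since the text notes that case is significant and that case mapping folds to lowercase.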
<section2 topic='Escaping Transformations' anchor='escaping'> <section2 topic='Escaping Transformations' anchor='escaping'>
<p>The escaping transformations are defined in the following table, whereas the rules that define when to apply these transformations are specified in the <link url='#bizrules'>Business Rules</link> section of this specification. Typically, escaping is performed only by a client that is processing information provided by a human user in unescaped form, or by a gateway to some external system (e.g., email or LDAP) that needs to generate a JID.</p> <p>The escaping transformations are defined in the following table, whereas the rules that define when to apply these transformations are specified in the <link url='#bizrules'>Business Rules</link> section of this specification. Typically, escaping is performed only by a client that is processing information provided by a human user in unescaped form, or by a gateway to some external system (e.g., email or LDAP) that needs to generate a JID.</p>
@ -134,7 +140,7 @@
<tr><td>@</td><td>\40</td></tr> <tr><td>@</td><td>\40</td></tr>
<tr><td>\</td><td>\5c</td></tr> <tr><td>\</td><td>\5c</td></tr>
</table> </table>
<p>* Note: The character sequence \20 MUST NOT be the first or last character of an escaped node identifier. <note>For a similar restriction, see Section 2.4 of <cite>RFC 2253</cite>.</note></p> <p>* Note: The character sequence \20 MUST NOT be the first or last character of an escaped localpart. <note>For a similar restriction, see Section 2.4 of <cite>RFC 2253</cite>.</note></p>
<p>In the following example, Porthos starts a chat with D'Artagnan, typing into his client the string "d'artagnan@musketeers.lit" (which is escaped by his client to "d\27artagnan@musketeers.lit").</p> <p>In the following example, Porthos starts a chat with D'Artagnan, typing into his client the string "d'artagnan@musketeers.lit" (which is escaped by his client to "d\27artagnan@musketeers.lit").</p>
<example caption="JID Escaping"><![CDATA[ <example caption="JID Escaping"><![CDATA[
<message <message
@ -175,12 +181,12 @@
<section2 topic='Native Processing' anchor='bizrules-processing'> <section2 topic='Native Processing' anchor='bizrules-processing'>
<p>The following processing rules apply to native XMPP implementations:</p> <p>The following processing rules apply to native XMPP implementations:</p>
<ol> <ol>
<li><p>A compliant client MUST render an escaped character as its unescaped equivalent when presenting it to a human user (e.g., present \27 as the ' character), but MAY provide a way for the user to view the escaped JID in its wire format (e.g., to compare two JIDs).</p></li> <li><p>A compliant client MUST render an escaped character as its unescaped equivalent when presenting it to a human user (e.g., present \27 as the apostrophe character), but MAY provide a way for the user to view the escaped JID in its wire format (e.g., to compare two JIDs).</p></li>
<li><p>A server or gateway MAY unescape an escaped character for communication with external systems (e.g. LDAP), but only <em>after</em> the Nodeprep profile of stringprep has been applied.</p></li> <li><p>A server or gateway MAY unescape an escaped character for communication with external systems (e.g. LDAP), but only <em>after</em> the UsernameCaseMapped profile of PRECIS has been applied.</p></li>
<li><p>An entity MUST unescape only the specified sequences and MUST NOT unescape sequences that do not match the specified sequences.</p></li> <li><p>An entity MUST unescape only the specified sequences and MUST NOT unescape sequences that do not match the specified sequences.</p></li>
<li><p>An entity MUST NOT include the unescaped version of a disallowed character over the wire in any XML stanzas sent to another entity (since by definition the unescaped version of a disallowed character violates Nodeprep).</p></li> <li><p>An entity MUST NOT include the unescaped version of a disallowed character over the wire in any XML stanzas sent to another entity.</p></li>
<li><p>An entity MUST NOT use the unescaped version of a disallowed character when comparing two JIDs.</p></li> <li><p>An entity MUST NOT use the unescaped version of a disallowed character when comparing two JIDs.</p></li>
<li><p>The character sequence \20 MUST NOT be the first or last character of an escaped node identifier.</p></li> <li><p>The character sequence \20 MUST NOT be the first or last character of an escaped localpart.</p></li>
<li><p>If there are any instances of character sequences that correspond to escapings of the disallowed characters (e.g., the character sequence "\27") or the escaping character (i.e., the character sequence "\5c") in the unescaped address, the leading backslash character MUST be escaped to the character sequence "\5c" (e.g., resulting in the character sequences "\5c27" or "\5c5c"). <note>It is possible that some existing JIDs already contain character sequences matching "\5chexhex" (where "hexhex" is the hexadecimal value of the Unicode code point for a disallowed character or the backslash character), which may result in confusion between escaped JIDs and their presentation in a client; however, a survey of one large XMPP deployment yielded no instances of such sequences or even of the character sequence "\5c".</note></p></li> <li><p>If there are any instances of character sequences that correspond to escapings of the disallowed characters (e.g., the character sequence "\27") or the escaping character (i.e., the character sequence "\5c") in the unescaped address, the leading backslash character MUST be escaped to the character sequence "\5c" (e.g., resulting in the character sequences "\5c27" or "\5c5c"). <note>It is possible that some existing JIDs already contain character sequences matching "\5chexhex" (where "hexhex" is the hexadecimal value of the Unicode code point for a disallowed character or the backslash character), which may result in confusion between escaped JIDs and their presentation in a client; however, a survey of one large XMPP deployment yielded no instances of such sequences or even of the character sequence "\5c".</note></p></li>
</ol> </ol>
</section2> </section2>
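As a rough illustration of the escaping rules listed above, a client-side helper might look like the following Python sketch. It is not a reference implementation: escape_localpart and the other names are hypothetical, PRECIS enforcement is left out entirely, and rejecting leading or trailing spaces with an exception is an assumption about how the \20 rule could be enforced.

```python
import re

# Escapings for the disallowed characters (lowercase hex; case is significant).
ESCAPES = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}

# A backslash is escaped only when it begins a sequence that would otherwise
# be read as one of the defined escapings (including "\5c" itself).
_AMBIGUOUS_BACKSLASH = re.compile(r"\\(?=20|22|26|27|2f|3a|3c|3e|40|5c)")


def escape_localpart(text: str) -> str:
    """Escape an unescaped localpart for use on the wire (sketch only)."""
    if text.startswith(" ") or text.endswith(" "):
        # Assumption: refuse input that would yield a leading or trailing \20.
        raise ValueError("escaped localpart must not begin or end with \\20")
    # First protect backslashes that would be mistaken for escapings ...
    protected = _AMBIGUOUS_BACKSLASH.sub(lambda m: "\\5c", text)
    # ... then replace each disallowed character with its escaping.
    return "".join(ESCAPES.get(ch, ch) for ch in protected)


print(escape_localpart("d'artagnan"))
print(escape_localpart(r"\3and\2is\5cool"))
```

Backslashes are protected before the disallowed characters are replaced, so the backslashes introduced by the second pass are never re-escaped.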
@ -194,7 +200,7 @@
</ol> </ol>
<p>While the fourth step should be clear from the foregoing text and the second step is necessary since XMPP addresses are not URIs, the meaning of the first and third steps may not be obvious.</p> <p>While the fourth step should be clear from the foregoing text and the second step is necessary since XMPP addresses are not URIs, the meaning of the first and third steps may not be obvious.</p>
<p>Regarding step one, many non-XMPP messaging systems use URIs to identify addresses (examples include the mailto:, sip:, sips:, im:, pres:, and wv: URI schemes) or follow some other encoding rules for an identifier (e.g., an LDAP distinguished name). Before transforming a non-XMPP address or identifier into a JID, the address or identifier MUST first be decoded according to the rules specified for that type of address or identifier in order to ensure that the proper characters are transformed.</p> <p>Regarding step one, many non-XMPP messaging systems use URIs to identify addresses (examples include the mailto:, sip:, sips:, im:, pres:, and wv: URI schemes) or follow some other encoding rules for an identifier (e.g., an LDAP distinguished name). Before transforming a non-XMPP address or identifier into a JID, the address or identifier MUST first be decoded according to the rules specified for that type of address or identifier in order to ensure that the proper characters are transformed.</p>
<p>Regarding step three, it is possible for some non-XMPP addresses to contain character sequences that correspond to JID-escaped characters (e.g., "\27"). Consider a Wireless Village address of &lt;wv:\3and\2is\5cool@example.com&gt; -- if that address were directly converted into a JID, the resulting XMPP address would be \3and\2is\5cool@example.com, which could be construed as :nd\2is\ool@example.com if JID escaping logic is applied. Therefore the leading \ character and the \ character before the character sequence 5c MUST be converted to the character sequence "\5c" during the transformation, leading to a JID of \5c3and\2is\5c5cool@example.com (which would be presented to a human user as \3and\2is\5cool@example.com).</p> <p>Regarding step three, it is possible for some non-XMPP addresses to contain character sequences that correspond to JID-escaped characters (e.g., "\27"). Consider a Wireless Village address of &lt;wv:\3and\2is\5cool@example.com&gt; &mdash; if that address were directly converted into a JID, the resulting XMPP address would be \3and\2is\5cool@example.com, which could be construed as :nd\2is\ool@example.com if JID escaping logic is applied. Therefore the leading \ character and the \ character before the character sequence 5c MUST be converted to the character sequence "\5c" during the transformation, leading to a JID of \5c3and\2is\5c5cool@example.com (which would be presented to a human user as \3and\2is\5cool@example.com).</p>
</section2> </section2>
<section2 topic='Exceptions' anchor='bizrules-exceptions'> <section2 topic='Exceptions' anchor='bizrules-exceptions'>
<p>In order to maintain as much backward compatibility as possible, partial escape sequences and escape sequences corresponding to characters not on the list of disallowed characters MUST be ignored (with the exception of the escaping character '\' itself in the rare case when the source address includes the sequence '\5c').</p> <p>In order to maintain as much backward compatibility as possible, partial escape sequences and escape sequences corresponding to characters not on the list of disallowed characters MUST be ignored (with the exception of the escaping character '\' itself in the rare case when the source address includes the sequence '\5c').</p>
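The exception rule above (only the defined sequences are unescaped; partial or unknown sequences pass through untouched) could be sketched in Python as follows. The function name unescape_localpart is hypothetical and the snippet is meant as an illustration for display purposes only, not as the specification's algorithm.

```python
import re

# Only these exact, lowercase sequences are unescaped; anything else
# (partial sequences, uppercase hex, other code points) is left alone.
_ESCAPE = re.compile(r"\\(20|22|26|27|2f|3a|3c|3e|40|5c)")


def unescape_localpart(escaped: str) -> str:
    """Turn an escaped localpart into its display form (sketch only)."""
    # A single left-to-right pass: the backslash produced from "\5c" is
    # never re-examined, so "\5c27" displays as "\27", not as "'".
    return _ESCAPE.sub(lambda m: chr(int(m.group(1), 16)), escaped)


print(unescape_localpart(r"d\27artagnan"))
print(unescape_localpart(r"\5c3and\2is\5c5cool"))
```

A single regular-expression pass is used here rather than chained string replacements, since replacing "\5c" first and then "\27" would decode some sequences twice.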
@ -222,7 +228,7 @@
<li>LDAP distinguished names.</li> <li>LDAP distinguished names.</li>
</ul> </ul>
<section2 topic='Jabber Identifiers' anchor='examples-xmpp'> <section2 topic='Jabber Identifiers' anchor='examples-xmpp'>
<p>The following table shows user input, the escaped JID for sending over the wire, and client display (same as user input) for node identifiers that might possibly be used in native JIDs. The examples are numbered for easy reference. Naturally, a client that does not perform JID escaping would display the JIDs in their escaped form (e.g., "space\20cadet" instead of "space cadet").</p> <p>The following table shows user input, the escaped JID for sending over the wire, and client display (same as user input) for the localpart that might possibly be used in native JIDs. The examples are numbered for easy reference. Naturally, a client that does not perform JID escaping would display the JIDs in their escaped form (e.g., "space\20cadet" instead of "space cadet").</p>
<table caption='JID Examples'> <table caption='JID Examples'>
<tr><th>#</th><th>User Input</th><th>Escaped JID</th><th>Client Display</th></tr> <tr><th>#</th><th>User Input</th><th>Escaped JID</th><th>Client Display</th></tr>
<tr><td>1</td><td>space&#160;cadet@example.com</td><td>space\20cadet@example.com</td><td>space&#160;cadet@example.com</td></tr> <tr><td>1</td><td>space&#160;cadet@example.com</td><td>space\20cadet@example.com</td><td>space&#160;cadet@example.com</td></tr>
@ -241,7 +247,7 @@
</section2> </section2>
<section2 topic='Email Addresses' anchor='examples-email'> <section2 topic='Email Addresses' anchor='examples-email'>
<p>The address format for an Internet mailbox is specified in <cite>RFC 2822</cite>. The identifier of interest in this context is the "addr-spec" address and more particularly the "dot-atom-text" rule specified in Section 3.2.4, i.e., the email address shorn of angle brackets, display names, comments, quoted strings, and the like. Because some deployments of XMPP messaging systems may want to re-use existing email addresses as JIDs, it is helpful to define how to transform an email address into a JID.</p> <p>The address format for an Internet mailbox is specified in <cite>RFC 2822</cite>. The identifier of interest in this context is the "addr-spec" address and more particularly the "dot-atom-text" rule specified in Section 3.2.4, i.e., the email address shorn of angle brackets, display names, comments, quoted strings, and the like. Because some deployments of XMPP messaging systems may want to re-use existing email addresses as JIDs, it is helpful to define how to transform an email address into a JID.</p>
<p>In general, it is straightforward to transform an email address (i.e., a "dot-atom-text") into a JID, since traditional email addresses allow US-ASCII characters only rather than the nearly full range of Unicode code points allowed in a JID. <note>This specification does not cover recent efforts to define internationalized email addresses.</note> However, there are three characters allowed in the local-part of an email address that are not allowed in the node identifier portion of a JID: namely, the characters &amp; ' / as described in Sections 3.2.4 and 3.2.5 of <cite>RFC 2822</cite>. In order to transform these characters, a compliant implementation MUST use the methods specified herein.</p> <p>In general, it is straightforward to transform an email address (i.e., a "dot-atom-text") into a JID, since traditional email addresses allow US-ASCII characters only rather than the nearly full range of Unicode code points allowed in a JID. <note>This specification does not cover recent efforts to define internationalized email addresses.</note> However, there are three characters allowed in the localpart of an email address that are not allowed in the localpart portion of a JID: namely, the characters &amp; ' / as described in Sections 3.2.4 and 3.2.5 of <cite>RFC 2822</cite>. In order to transform these characters, a compliant implementation MUST use the methods specified herein.</p>
<example caption='An Email Address Containing JID-Disallowed Characters'><![CDATA[ <example caption='An Email Address Containing JID-Disallowed Characters'><![CDATA[
here's_a_wild_&_/cr%zy/_address@example.com here's_a_wild_&_/cr%zy/_address@example.com
]]></example> ]]></example>
@ -280,7 +286,7 @@ mailto:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com
]]></example> ]]></example>
</section2> </section2>
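The transformation of a URI-form address such as the mailto: example into a JID could be sketched as below. The four processing steps are only partly visible in this diff, so the sketch assumes that step one is percent-decoding and step two is removal of the URI scheme, as the surrounding commentary suggests; uri_to_jid and escape_localpart are hypothetical names, and PRECIS enforcement of the result is not shown.

```python
import re
from urllib.parse import unquote

ESCAPES = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
_AMBIGUOUS_BACKSLASH = re.compile(r"\\(?=20|22|26|27|2f|3a|3c|3e|40|5c)")


def escape_localpart(text: str) -> str:
    """Steps three and four: protect ambiguous backslashes, then escape."""
    protected = _AMBIGUOUS_BACKSLASH.sub(lambda m: "\\5c", text)
    return "".join(ESCAPES.get(ch, ch) for ch in protected)


def uri_to_jid(uri: str) -> str:
    """Convert a simple user@host URI (mailto:, sip:, im:, wv:) to a JID."""
    # Step one (assumed): undo percent-encoding; invalid sequences such as
    # "%zy" are left unchanged by urllib.parse.unquote.
    decoded = unquote(uri)
    # Step two (assumed): drop the scheme, i.e. everything up to the first ":".
    _scheme, _, address = decoded.partition(":")
    # The domain follows the last "@"; everything before it is the localpart.
    local, _, domain = address.rpartition("@")
    return escape_localpart(local) + "@" + domain


print(uri_to_jid("mailto:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com"))
```

URI parameters, headers, and IMPS private resources are deliberately out of scope for this sketch.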
<section2 topic='SIP Addresses' anchor='examples-sip'> <section2 topic='SIP Addresses' anchor='examples-sip'>
<p>As specified in &rfc3261;, a SIP address (i.e., a sip: or sips: URI) can be quite complex if URI parameters or headers are included. However, a basic SIP address (the combination of the optional "userinfo" and required "hostport" constructions) is essentially similar to an email address (e.g., the same characters &amp; ' / allowed in an email address but disallowed in an XMPP node identifier are also allowed in a basic SIP address).</p> <p>As specified in &rfc3261;, a SIP address (i.e., a sip: or sips: URI) can be quite complex if URI parameters or headers are included. However, a basic SIP address (the combination of the optional "userinfo" and required "hostport" constructions) is essentially similar to an email address (e.g., the same characters &amp; ' / allowed in an email address but disallowed in an XMPP localpart are also allowed in a basic SIP address).</p>
<example caption='A Basic sip: URI Containing JID-Disallowed Characters'><![CDATA[ <example caption='A Basic sip: URI Containing JID-Disallowed Characters'><![CDATA[
sip:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com sip:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com
]]></example> ]]></example>
@ -305,7 +311,7 @@ sip:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com
]]></example> ]]></example>
</section2> </section2>
<section2 topic='IM and Presence Addresses' anchor='examples-im'> <section2 topic='IM and Presence Addresses' anchor='examples-im'>
<p>The im: and pres: URI schemes are specified in &rfc3860; and &rfc3859; respectively. With the exception of headers, an im: or pres: URI is simply a mailbox (as specified in <cite>RFC 2822</cite>) prepended with the im: or pres: scheme. Thus a basic IM or PRES address (not including optional headers) is essentially similar to an email address (e.g., the same characters &amp; ' / allowed in an email address but disallowed in an XMPP node identifier are also allowed in a basic IM or PRES address).</p> <p>The im: and pres: URI schemes are specified in &rfc3860; and &rfc3859; respectively. With the exception of headers, an im: or pres: URI is simply a mailbox (as specified in <cite>RFC 2822</cite>) prepended with the im: or pres: scheme. Thus a basic IM or PRES address (not including optional headers) is essentially similar to an email address (e.g., the same characters &amp; ' / allowed in an email address but disallowed in an XMPP localpart are also allowed in a basic IM or PRES address).</p>
<example caption='A Basic im: URI Containing JID-Disallowed Characters'><![CDATA[ <example caption='A Basic im: URI Containing JID-Disallowed Characters'><![CDATA[
im:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com im:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com
]]></example> ]]></example>
@ -331,7 +337,7 @@ pres:here%27s_a_wild_%26_%2Fcr%zy%2F_address@example.com
</section2> </section2>
<section2 topic='IMPS Addresses' anchor='examples-imps'> <section2 topic='IMPS Addresses' anchor='examples-imps'>
<p>The Instant Messaging and Presence Service (IMPS) protocol was originally defined by the Wireless Village consortium and is now maintained by the &OMA;. An IMPS address is formatted as a wv: URI, as specified in &wv-csp;. A basic address (not including a private resource) is of the form &lt;wv:user-id@domain&gt; and an address with a private resource is of the form &lt;wv:user-id/resource@domain&gt;.</p> <p>The Instant Messaging and Presence Service (IMPS) protocol was originally defined by the Wireless Village consortium and is now maintained by the &OMA;. An IMPS address is formatted as a wv: URI, as specified in &wv-csp;. A basic address (not including a private resource) is of the form &lt;wv:user-id@domain&gt; and an address with a private resource is of the form &lt;wv:user-id/resource@domain&gt;.</p>
<p>The "User-ID" construction is either a mobile phone number (beginning with "+1" for international numbers and a digit for national numbers) or an "Internet-Identity". An "Internet-Identity" may contain any US-ASCII character other than / @ + SP TAB and thus may include the following characters that are disallowed in the node identifier portion of a JID: " &amp; ' / : &lt; &gt; (which characters MUST be escaped when transforming an IMPS address into a JID). However, some of those characters are also reserved in URI syntax (namely the &amp; ' / characters) so those characters will be found in escaped form within a wv: URI.</p> <p>The "User-ID" construction is either a mobile phone number (beginning with "+1" for international numbers and a digit for national numbers) or an "Internet-Identity". An "Internet-Identity" may contain any US-ASCII character other than / @ + SP TAB and thus may include the following characters that are disallowed in the localpart of a JID: " &amp; ' / : &lt; &gt; (which characters MUST be escaped when transforming an IMPS address into a JID). However, some of those characters are also reserved in URI syntax (namely the &amp; ' / characters) so those characters will be found in escaped form within a wv: URI.</p>
<example caption='A Basic wv: URI Containing JID-Disallowed Characters'><![CDATA[ <example caption='A Basic wv: URI Containing JID-Disallowed Characters'><![CDATA[
wv:here%27s_a_wild_%26_%2Fcr%zy%2F_address_for%3A%3Cwv%3E%28%22IMPS%22%29@example.com wv:here%27s_a_wild_%26_%2Fcr%zy%2F_address_for%3A%3Cwv%3E%28%22IMPS%22%29@example.com
]]></example> ]]></example>
@ -344,7 +350,7 @@ here\27s_a_wild_\26_\2fcr%zy\2f_address_for\3a\3cwv\3e(\22IMPS\22)@example.com
<example caption='The JID as Presented to a User'><![CDATA[ <example caption='The JID as Presented to a User'><![CDATA[
here's_a_wild_&_/cr%zy/_address_for:<wv>("IMPS")@example.com here's_a_wild_&_/cr%zy/_address_for:<wv>("IMPS")@example.com
]]></example> ]]></example>
<p>Unlike the foregoing address types, IMPS addresses are allowed to contain backslashes. This implies that it is possible for an IMPS address to contain a character sequence that corresponds to one of the escaped character representations for code points that are disallowed in XMPP node identifiers. An example would be the IMPS address &lt;wv:\3and\2is\5cool@example.com&gt;, where the character sequence "\3a" could be interpreted as the : character (and the character sequence "\5c" as "\") if that IMPS address is directly converted into a JID. Therefore, the leading \ character MUST be transformed to "\5c" (and the source character sequence "\5c" to "\5c5c") in order to avoid possible ambiguity. Thus the transformed JID would be &lt;\5c3and\2is\5c5cool@example.com&gt;, which would be presented to a user as &lt;\3and\2is\5cool@example.com&gt;.</p> <p>Unlike the foregoing address types, IMPS addresses are allowed to contain backslashes. This implies that it is possible for an IMPS address to contain a character sequence that corresponds to one of the escaped character representations for code points that are disallowed in XMPP localparts. An example would be the IMPS address &lt;wv:\3and\2is\5cool@example.com&gt;, where the character sequence "\3a" could be interpreted as the : character (and the character sequence "\5c" as "\") if that IMPS address is directly converted into a JID. Therefore, the leading \ character MUST be transformed to "\5c" (and the source character sequence "\5c" to "\5c5c") in order to avoid possible ambiguity. Thus the transformed JID would be &lt;\5c3and\2is\5c5cool@example.com&gt;, which would be presented to a user as &lt;\3and\2is\5cool@example.com&gt;.</p>
<p>If an IMPS address contains a private resource, a gateway between XMPP and IMPS should process the resource and append it to the end of the JID; however, such gateway behavior is out of scope for this document.</p> <p>If an IMPS address contains a private resource, a gateway between XMPP and IMPS should process the resource and append it to the end of the JID; however, such gateway behavior is out of scope for this document.</p>
<p>The foregoing example showed how to transform a wv: URI into a JID. However, it also may be necessary to convert a JID into a wv: URI, as shown in the following example.</p> <p>The foregoing example showed how to transform a wv: URI into a JID. However, it also may be necessary to convert a JID into a wv: URI, as shown in the following example.</p>
<example caption='User Enters Address, Including Disallowed Characters'><![CDATA[ <example caption='User Enters Address, Including Disallowed Characters'><![CDATA[
@ -390,7 +396,7 @@ CN=D'Artagnan Saint-Andr&#xe9;,O=Example &amp; Company, Inc.,DC=example,DC=com
</example> </example>
</section2> </section2>
<section2 topic='IRC Addresses' anchor='examples-irc'> <section2 topic='IRC Addresses' anchor='examples-irc'>
<p>&rfc2812; defines the address format for Internet Relay Chat (IRC) entities, which can be servers, channels, or users. The "user" portion of an IRC address may contain any octet except NUL, CR, LF, SP, and "@"; this includes the characters " &amp; ' / : &lt; &gt; \ (which are disallowed in XMPP node identifiers and therefore MUST be escaped when transforming an IRC address into a JID).</p> <p>&rfc2812; defines the address format for Internet Relay Chat (IRC) entities, which can be servers, channels, or users. The "user" portion of an IRC address may contain any octet except NUL, CR, LF, SP, and "@"; this includes the characters " &amp; ' / : &lt; &gt; \ (which are disallowed in XMPP localparts and therefore MUST be escaped when transforming an IRC address into a JID).</p>
<example caption='A Basic IRC address Containing JID-Disallowed Characters'><![CDATA[ <example caption='A Basic IRC address Containing JID-Disallowed Characters'><![CDATA[
somenick!user"&'/:<>\3address@example.com somenick!user"&'/:<>\3address@example.com
]]></example> ]]></example>
@ -400,7 +406,7 @@ somenick!user\22\26\27\2f\3a\3c\3e\5c3address@example.com
<example caption='The JID as Presented to a User'><![CDATA[ <example caption='The JID as Presented to a User'><![CDATA[
somenick!user"&'/:<>\3address@example.com somenick!user"&'/:<>\3address@example.com
]]></example> ]]></example>
<p>Like IMPS addresses, IRC addresses are allowed to contain backslashes. This implies that it is possible for an IRC address to contain a character sequence that corresponds to one of the escaped character representations for code points that are disallowed in XMPP node identifiers. An example is shown above.</p> <p>Like IMPS addresses, IRC addresses are allowed to contain backslashes. This implies that it is possible for an IRC address to contain a character sequence that corresponds to one of the escaped character representations for code points that are disallowed in XMPP localparts. An example is shown above.</p>
</section2> </section2>
</section1> </section1>
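The escaping behaviour that the IMPS and IRC examples above rely on can be illustrated with a short Python sketch. It is not part of this commit or of the XEP text. The function names (`escape_localpart`, `unescape_localpart`, `wv_user_to_localpart`) are hypothetical, the sketch assumes lowercase hex digits in escape sequences, and it does not attempt PRECIS enforcement or any special handling of leading and trailing spaces.

```python
import re
from urllib.parse import unquote

# Escape sequences for the characters that the examples treat as
# disallowed in a localpart (assumption: lowercase hex only).
ESCAPES = {
    " ": "\\20", '"': "\\22", "&": "\\26", "'": "\\27", "/": "\\2f",
    ":": "\\3a", "<": "\\3c", ">": "\\3e", "@": "\\40",
}
_CODES = "20|22|26|27|2f|3a|3c|3e|40|5c"

# Matches either a backslash that starts something that already looks
# like an escape sequence (e.g. "\3a" or "\5c"), or a disallowed character.
_TO_ESCAPE = re.compile(
    r"\\(?=" + _CODES + ")|[" + re.escape("".join(ESCAPES)) + "]"
)
# Matches a complete escape sequence, capturing its two hex digits.
_ESCAPED = re.compile(r"\\(" + _CODES + ")")


def escape_localpart(raw: str) -> str:
    """Escape a raw (displayed) localpart in a single left-to-right pass."""
    def repl(match: re.Match) -> str:
        ch = match.group()
        # An ambiguous backslash becomes "\5c"; other backslashes are not
        # matched by the pattern and are left as they are.
        return "\\5c" if ch == "\\" else ESCAPES[ch]
    return _TO_ESCAPE.sub(repl, raw)


def unescape_localpart(escaped: str) -> str:
    """Turn an escaped localpart back into the form presented to a user."""
    return _ESCAPED.sub(lambda m: chr(int(m.group(1), 16)), escaped)


def wv_user_to_localpart(user_id: str) -> str:
    """Percent-decode the user part of a wv: URI, then apply JID escaping."""
    return escape_localpart(unquote(user_id))
```

Under these assumptions, `escape_localpart` applied to the IMPS user part `\3and\2is\5cool` yields `\5c3and\2is\5c5cool`, and `unescape_localpart` reverses it. The sequence `\2i` is left alone because it does not correspond to any escape sequence.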

@ -46,6 +46,7 @@ THE SOFTWARE.
<!ENTITY copy "&#169;"> <!ENTITY copy "&#169;">
<!ENTITY reg "&#174;"> <!ENTITY reg "&#174;">
<!ENTITY sect "&#167;"> <!ENTITY sect "&#167;">
<!ENTITY mdash "&#x2014;">
<!-- shortcuts for stanza types and children --> <!-- shortcuts for stanza types and children -->
@ -645,9 +646,11 @@ THE SOFTWARE.
<!ENTITY rfc6920 "<span class='ref'><link url='http://tools.ietf.org/html/rfc6920'>RFC 6920</link></span> <note>RFC 6920: Naming Things with Hashes &lt;<link url='http://tools.ietf.org/html/rfc6920'>http://tools.ietf.org/html/rfc6920</link>&gt;.</note>" > <!ENTITY rfc6920 "<span class='ref'><link url='http://tools.ietf.org/html/rfc6920'>RFC 6920</link></span> <note>RFC 6920: Naming Things with Hashes &lt;<link url='http://tools.ietf.org/html/rfc6920'>http://tools.ietf.org/html/rfc6920</link>&gt;.</note>" >
<!ENTITY rfc7081 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7081'>RFC 7081</link></span> <note>RFC 7081: CUSAX: Combined Use of the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP) &lt;<link url='http://tools.ietf.org/html/rfc7081'>http://tools.ietf.org/html/rfc7081</link>&gt;.</note>" > <!ENTITY rfc7081 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7081'>RFC 7081</link></span> <note>RFC 7081: CUSAX: Combined Use of the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP) &lt;<link url='http://tools.ietf.org/html/rfc7081'>http://tools.ietf.org/html/rfc7081</link>&gt;.</note>" >
<!ENTITY rfc7572 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7572'>RFC 7572</link></span> <note>RFC 7572: Interworking between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP): Instant Messaging &lt;<link url='http://tools.ietf.org/html/rfc7572'>http://tools.ietf.org/html/rfc7572</link>&gt;.</note>" > <!ENTITY rfc7572 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7572'>RFC 7572</link></span> <note>RFC 7572: Interworking between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP): Instant Messaging &lt;<link url='http://tools.ietf.org/html/rfc7572'>http://tools.ietf.org/html/rfc7572</link>&gt;.</note>" >
<!ENTITY rfc7613 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7613'>RFC 7613</link></span> <note>RFC 7613: Preparation, Enforcement, and Comparison of Internationalized Strings Representing Usernames and Passwords &lt;<link url='http://tools.ietf.org/html/rfc7613'>http://tools.ietf.org/html/rfc7613</link>&gt;.</note>" >
<!ENTITY rfc7622 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7622'>RFC 7622</link></span> <note>RFC 7622: Extensible Messaging and Presence Protocol (XMPP): Address Format &lt;<link url='http://tools.ietf.org/html/rfc7622'>http://tools.ietf.org/html/rfc7622</link>&gt;.</note>" > <!ENTITY rfc7622 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7622'>RFC 7622</link></span> <note>RFC 7622: Extensible Messaging and Presence Protocol (XMPP): Address Format &lt;<link url='http://tools.ietf.org/html/rfc7622'>http://tools.ietf.org/html/rfc7622</link>&gt;.</note>" >
<!ENTITY rfc7395 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7395'>RFC 7395</link></span> <note>RFC 7395: An Extensible Messaging and Presence Protocol (XMPP) Subprotocol for WebSocket &lt;<link url='http://tools.ietf.org/html/rfc7395'>http://tools.ietf.org/html/rfc7395</link>&gt;.</note>" > <!ENTITY rfc7395 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7395'>RFC 7395</link></span> <note>RFC 7395: An Extensible Messaging and Presence Protocol (XMPP) Subprotocol for WebSocket &lt;<link url='http://tools.ietf.org/html/rfc7395'>http://tools.ietf.org/html/rfc7395</link>&gt;.</note>" >
<!ENTITY rfc7693 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7693'>RFC 7693</link></span> <note>RFC 7693: The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC) &lt;<link url='http://tools.ietf.org/html/rfc7693'>http://tools.ietf.org/html/rfc7693</link>&gt;.</note>" > <!ENTITY rfc7693 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7693'>RFC 7693</link></span> <note>RFC 7693: The BLAKE2 Cryptographic Hash and Message Authentication Code (MAC) &lt;<link url='http://tools.ietf.org/html/rfc7693'>http://tools.ietf.org/html/rfc7693</link>&gt;.</note>" >
<!ENTITY rfc7564 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7564'>RFC 7564</link></span> <note>RFC 7564: PRECIS Framework: Preparation, Enforcement, and Comparison of Internationalized Strings in Application Protocols &lt;<link url='http://tools.ietf.org/html/rfc7564'>http://tools.ietf.org/html/rfc7564</link>&gt;.</note>" >
<!ENTITY rfc7712 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7712'>RFC 7712</link></span> <note>RFC 7712: Domain Name Associations (DNA) in the Extensible Messaging and Presence Protocol (XMPP)&lt;<link url='http://tools.ietf.org/html/rfc7712'>http://tools.ietf.org/html/rfc7712</link>&gt;.</note>" > <!ENTITY rfc7712 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7712'>RFC 7712</link></span> <note>RFC 7712: Domain Name Associations (DNA) in the Extensible Messaging and Presence Protocol (XMPP)&lt;<link url='http://tools.ietf.org/html/rfc7712'>http://tools.ietf.org/html/rfc7712</link>&gt;.</note>" >
<!-- Internet-Drafts --> <!-- Internet-Drafts -->