git-svn-id: file:///home/ksmith/gitmigration/svn/xmpp/trunk@2137 4b5297f7-1745-476d-ba37-a9c6900126ab
Peter Saint-Andre 2008-08-06 22:27:56 +00:00
parent 933875ec2c
commit 76d24423d4
1 changed files with 123 additions and 114 deletions


@ -6,8 +6,8 @@
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>Data Element</title>
<abstract>This specification defines an XMPP protocol extension for including small bits of binary data in an XML stanza.</abstract>
<title>Bits of Binary</title>
<abstract>This specification defines an XMPP protocol extension for including or referring to small bits of binary data in an XML stanza.</abstract>
&LEGALNOTICE;
<number>0231</number>
<status>Proposed</status>
@ -26,6 +26,18 @@
<shortname>NOT_YET_ASSIGNED</shortname>
&stpeter;
&pavlix;
<revision>
<version>0.7</version>
<date>2008-08-06</date>
<initials>psa/ps</initials>
<remark><p>Simplified the protocol; removed fetch element because the cid: URI uniquely identifies the data; changed the name of the protocol to something more catchy.</p></remark>
</revision>
<revision>
<version>0.6</version>
<date>2008-08-06</date>
<initials>psa/ps</initials>
<remark><p>More clearly described recommended protocol and usage; added fetch element to disambiguate data from reference; cleaned up text throughout.</p></remark>
</revision>
<revision>
<version>0.5</version>
<date>2008-08-06</date>
@ -83,81 +95,65 @@
</header>
<section1 topic='Introduction' anchor='intro'>
<p>Sometimes it is desirable to include a small bit of binary data in an XMPP stanza. Typical use cases might be inclusion of an icon or emoticon in a message, a thumbnail in a file transfer request, a rasterized image in a whiteboarding session, or a small bit of media in a data form. At present, there is no lightweight method for including such data in an XMPP stanza, since existing methods (e.g., &xep0047;) are designed for larger blobs of data and therefore require some form of negotiation (e.g., via &xep0096; or &xep0234;). Therefore, this document specifies just such a lightweight method, using a &lt;data/&gt; element that provides semantics similar to the data: URL scheme defined in &rfc2397;.</p>
<p>Sometimes it is desirable to include a small bit of binary data in an XMPP stanza. Typical use cases might be to include an icon or emoticon in a message, a thumbnail in a file transfer request, a rasterized image in a whiteboarding session, or a small bit of media in a data form. Currently, there is no lightweight method for including such data in an XMPP stanza, since existing methods (e.g., &xep0047;) are designed for larger blobs of data and therefore require some form of negotiation (e.g., via &xep0096; or &xep0234;).</p>
<p>This document specifies just such a lightweight method. The key building blocks are:</p>
<ol>
<li>A Content-ID ("cid") that uniquely identifies the data.</li>
<li>A &lt;data/&gt; element (similar to the data: URL scheme defined in &rfc2397;) that enables the sender and recipient to exchange the data identified by the cid.</li>
</ol>
</section1>
<section1 topic='Format' anchor='format'>
<p>The format for including binary data is straightforward: the data is encapsulated as the XML character data of a &lt;data/&gt; element qualified by the 'urn:xmpp:tmp:data-element' namespace &NSNOTE;, where the data MUST be encoded as Base64 in accordance with Section 4 of &rfc4648; (note: the Base64 output MUST NOT include whitespace and MUST set the number of pad bits to zero).</p>
<p>The &lt;data/&gt; element MUST be used only to encapsulate small bits of binary data and MUST NOT be used for large data transfers. Naturally the definitions of "small" and "large" are rather loose. In general, the data SHOULD NOT be more than 8 kilobytes, and dedicated file transfer methods (e.g., &xep0065; or &xep0047;) SHOULD be used for exchanging blobs of data larger than 8 kilobytes. Naturally, implementations or deployments may impose their own limits.</p>
<p>The following attributes are defined for the &lt;data/&gt; element.</p>
<table caption='Defined Attributes'>
<tr>
<th>Attribute</th>
<th>Description</th>
<th>Inclusion</th>
</tr>
<tr>
<td>cid</td>
<td>A Content-ID that can be mapped to a cid: URL as specified in &rfc2111;. The 'cid' value MUST be generated so that the local-part is a UUID as specified in &rfc4122; and the domain is the XMPP domain identifier portion of the sending entity's JabberID.</td>
<td>RECOMMENDED</td>
</tr>
<tr>
<td>max-age</td>
<td>A suggestion regarding how long (in seconds) to cache the data; the meaning matches the Max-Age attribute from &rfc2965;.</td>
<td>RECOMMENDED when sending a &lt;data/&gt; element containing XML character data</td>
</tr>
<tr>
<td>type</td>
<td>The value of the 'type' attribute MUST match the syntax specified in &rfc2045;. That is, the value MUST include a top-level media type, the "/" character, and a subtype; in addition, it MAY include one or more optional parameters (e.g., the "audio/ogg" MIME type in the example shown below includes a "codecs" parameter as specified in &rfc4281;). The "type/subtype" string SHOULD be registered in the &ianamedia;, but MAY be an unregistered or yet-to-be-registered value.</td>
<td>REQUIRED</td>
</tr>
</table>
<p>The following example illustrates the format (line endings are provided for readability only).</p>
<example caption='Data element format'><![CDATA[
<data xmlns='urn:xmpp:tmp:data-element'
cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
max-age='86400'
type='image/png'>
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==
</data>
]]></example>
<p>When the &lt;data/&gt; element is included in an XMPP &MESSAGE; or &PRESENCE; stanza, it SHOULD be included as a first-level child of the stanza.</p>
<p>When the &lt;data/&gt; element is included in an XMPP &IQ; stanza for data retrieval, it MUST be included as a first-level child of the stanza.</p>
<p>When the &lt;data/&gt; element is included in an XMPP &IQ; stanza to refer to the data, it MUST be included as a second-level child of the stanza.</p>
</section1>
<section1 topic='Protocol' anchor='proto'>
<section1 topic='Caching of Data' anchor='caching'>
<p>When one party sends a data element to another party, it SHOULD NOT include the data itself unless the data is particularly small (e.g., less than 1k) or is ephemeral and therefore will never be used again. Instead, it SHOULD send an empty &lt;data/&gt; element with a 'cid' attribute, then depend on the receiving party to retrieve the data if not cached.</p>
<p>As a hint regarding the suggested period for caching the data, the sending party SHOULD include a 'max-age' attribute whenever it sends a non-empty &lt;data/&gt; element. The semantics of the 'max-age' attribute exactly match those of the Max-Age attribute from <cite>RFC 2965</cite>.</p>
<p>It is RECOMMENDED for the receiving entity to cache data; however, the receiving entity MAY opt not to cache data, for example because it runs on a device that does not have sufficient space for data storage.</p>
<p>The default behavior is for the receiving entity to cache the data only for the life of the entity's application session (not a controlling user's presence session with the server or the controlling user's communication session with the contact from whom the user received the data); that is, the receiving entity would clear the cache when the application is terminated or restarted.</p>
<p>If it is not suggested to cache the data (e.g., because it is ephemeral), the value of the 'max-age' attribute MUST be "0" (the number zero).</p>
<p>A receiving entity MUST cache based on the JID of the sending entity; this helps to prevent certain data poisoning attacks.</p>
</section1>
<section2 topic='Data Exchange' anchor='exchange'>
<p>The RECOMMENDED approach is for the sender to include the cid when communicating with the recipient. The recipient SHOULD then check its cache of data to determine if the data identified by that cid is cached. If the data is cached, the recipient would then load its cached data. If the data is not cached, the recipient would then retrieve the data by sending an IQ-get to the sender (or potentially some other entity) containing an empty &lt;data/&gt; element whose 'cid' attribute specifies the data to be retrieved, to which the sender would reply with an IQ-result containing a &lt;data/&gt; element whose XML character data provides the binary data.</p>
<p>The &lt;data/&gt; element MUST be used only to encapsulate small bits of binary data and MUST NOT be used for large data transfers. Naturally the definitions of "small" and "large" are rather loose. In general, the data SHOULD NOT be more than 8 kilobytes, and dedicated file transfer methods (e.g., &xep0065; or &xep0047;) SHOULD be used for exchanging blobs of data larger than 8 kilobytes. However, implementations or deployments MAY impose their own limits.</p>
<p>If the data to be shared is particularly small (e.g., less than 1k), then the sender MAY send it directly by including a &lt;data/&gt; element directly in a &MESSAGE;, &PRESENCE;, or &IQ; stanza. The following rules apply:</p>
<ol>
<li>When the &lt;data/&gt; element is directly included in an XMPP &MESSAGE; or &PRESENCE; stanza, it SHOULD be a first-level child of the stanza.</li>
<li>When the &lt;data/&gt; element is directly included in an XMPP &IQ; stanza, it MUST be a child of the appropriate first-level child (since the IQ stanza must not include more than one first-level child).</li>
<li>When the &lt;data/&gt; element is used to retrieve the data from the sender as described under <link url='#retrieving'>Retrieving Uncached Data</link>, it MUST be a first-level child of the stanza.</li>
</ol>
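<p>As a sketch of the first rule above, a message that carries the data directly might look like the following. The addresses, cid, and PNG payload are reused from the other examples in this document; line endings in the Base64 data are provided for readability only.</p>
<example caption='Including data directly in a message (illustrative sketch)'><![CDATA[
<message from='ladymacbeth@shakespeare.lit/castle'
         to='macbeth@chat.shakespeare.lit'
         type='groupchat'>
  <body>Yet here's a spot.</body>
  <html xmlns='http://jabber.org/protocol/xhtml-im'>
    <body xmlns='http://www.w3.org/1999/xhtml'>
      <p>
        Yet here's a spot.
        <img alt='A spot'
             src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/>
      </p>
    </body>
  </html>
  <data xmlns='urn:xmpp:tmp:bob'
        cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
        max-age='86400'
        type='image/png'>
    iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
    C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
    AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
    REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
    ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
    vr4MkhoXe0rZigAAAABJRU5ErkJggg==
  </data>
</message>
]]></example>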
</section2>
<section1 topic='Retrieving Uncached Data' anchor='retrieve'>
<p>Data can be requested and transferred using the XMPP &IQ; stanza type by making reference to the 'cid' of the data to be retrieved. In particular, the requesting entity can request data by sending an IQ-get containing an empty &lt;data/&gt; element with a 'cid' attribute.</p>
<example caption='Requesting data'><![CDATA[
<section2 topic='Referencing Data' anchor='referencing'>
<p>The sender can refer to data that it hosts by including a cid in the data it sends. The following example shows how to include the cid in &xep0071;, but any appropriate format can be used, such as &xep0221;.</p>
<example caption='An XHTML-IM message with a cid'><![CDATA[
<message from='ladymacbeth@shakespeare.lit/castle'
to='macbeth@chat.shakespeare.lit'
type='groupchat'>
<body>Yet here's a spot.</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>
Yet here's a spot.
<img alt='A spot'
src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/></p>
</body>
</html>
</message>
]]></example>
<p>The recipient can then retrieve the data from the sender.</p>
</section2>
<section2 topic='Retrieving Uncached Data' anchor='retrieving'>
<p>Data is requested and transferred using the XMPP &IQ; stanza type by making reference to the cid. In particular, the recipient requests the binary data by sending an IQ-get containing an empty &lt;data/&gt; element with a 'cid' attribute that matches the cid URI previously communicated.</p>
<example caption='Requesting data'><![CDATA[
<iq from='doctor@shakespeare.lit/pda'
id='get-data-1'
to='gentlewoman@shakespeare.lit/phone'
to='ladymacbeth@shakespeare.lit/castle'
type='get'>
<data xmlns='urn:xmpp:tmp:data-element'
<data xmlns='urn:xmpp:tmp:bob'
cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/>
</iq>
]]></example>
<p>The responding entity then would either return an error (e.g., &notfound; if it does not have data matching the Content-ID) or return the data.</p>
<example caption='Returning data'><![CDATA[
<iq from='gentlewoman@shakespeare.lit/phone'
]]></example>
<p>The sender (that is, the entity that receives the IQ-get) then either returns an error (e.g., &notfound; if it does not have data matching the Content-ID) or returns the data.</p>
<example caption='Returning data'><![CDATA[
<iq from='ladymacbeth@shakespeare.lit/castle'
id='get-data-1'
to='doctor@shakespeare.lit/pda'
type='result'>
<data xmlns='urn:xmpp:tmp:data-element'
<data xmlns='urn:xmpp:tmp:bob'
cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
max-age='86400'
type='image/png'>
@ -169,56 +165,63 @@
vr4MkhoXe0rZigAAAABJRU5ErkJggg==
</data>
</iq>
]]></example>
<p>This specification does not place limits on the entities from which data can be requested. In particular, such an entity need not be the "owner" of the data (e.g., it could be a peer in a chatroom or whiteboarding session, or the chatroom or whiteboarding service itself).</p>
<p>In addition, bits of data could be hosted by XMPP servers, distributed via &xep0060; nodes, or included in data collections that are available via HTTP (e.g., emoticon sets). Such data could be identified by the value of the 'cid' attribute, but methods for specifying those values are out of scope for this specification.</p>
</section1>
]]></example>
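<p>For illustration only, an error reply to the request above might look like the following sketch. It assumes the standard &lt;item-not-found/&gt; stanza error condition with an error type of "cancel", as defined for XMPP stanza errors in <cite>RFC 3920</cite>; the exact shape of the error is an assumption and is not mandated by the text above.</p>
<example caption='Returning an error (illustrative sketch)'><![CDATA[
<iq from='ladymacbeth@shakespeare.lit/castle'
    id='get-data-1'
    to='doctor@shakespeare.lit/pda'
    type='error'>
  <error type='cancel'>
    <item-not-found xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
  </error>
</iq>
]]></example>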
</section2>
<section2 topic='Caching Data' anchor='caching'>
<p>It is RECOMMENDED for the recipient to cache data; however, the recipient MAY opt not to cache data, for example because it runs on a device that does not have sufficient space for data storage.</p>
<p>The default behavior is for the recipient to cache the data only for the life of the entity's application session (not a client's presence session with the server or the controlling user's communication session with the contact from whom the user received the data); that is, the recipient would clear the cache when the application is terminated or restarted.</p>
<p>As a hint regarding the suggested period for caching the data, the sender MAY include a 'max-age' attribute whenever it sends a &lt;data/&gt; element. The meaning of the 'max-age' attribute exactly matches that of the Max-Age attribute from <cite>RFC 2965</cite>.</p>
<p>If it is not suggested to cache the data (e.g., because it is ephemeral), the value of the 'max-age' attribute MUST be "0" (the number zero).</p>
<p>A recipient MUST cache based on the JID of the sender; this helps to prevent certain data poisoning attacks.</p>
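<p>To illustrate the 'max-age' rule above, a sender that does not want the data to be cached might mark it as in the following sketch. The cid and PNG payload are reused from the other examples in this document, and line endings are provided for readability only.</p>
<example caption='Data that is not to be cached (illustrative sketch)'><![CDATA[
<data xmlns='urn:xmpp:tmp:bob'
      cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
      max-age='0'
      type='image/png'>
  iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
  C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
  AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
  REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
  ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
  vr4MkhoXe0rZigAAAABJRU5ErkJggg==
</data>
]]></example>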
</section2>
<section2 topic='Format of the &lt;data/&gt; Element' anchor='format'>
<p>To exchange binary data, the data is encapsulated as the XML character data of a &lt;data/&gt; element qualified by the 'urn:xmpp:tmp:bob' namespace &NSNOTE;, where the data MUST be encoded as Base64 in accordance with Section 4 of &rfc4648; (note: the Base64 output MUST NOT include whitespace and MUST set the number of pad bits to zero).</p>
<p>The following attributes are defined for the &lt;data/&gt; element.</p>
<table caption='Attributes of the data Element'>
<tr>
<th>Attribute</th>
<th>Description</th>
<th>Inclusion</th>
</tr>
<tr>
<td>cid</td>
<td>A Content-ID that can be mapped to a cid: URL as specified in &rfc2111;. The 'cid' value MUST be generated so that the local-part is a UUID as specified in &rfc4122; and the domain is the XMPP domain identifier portion of the sender's JabberID.</td>
<td>REQUIRED</td>
</tr>
<tr>
<td>max-age</td>
<td>A suggestion regarding how long (in seconds) to cache the data; the meaning matches the Max-Age attribute from &rfc2965;.</td>
<td>RECOMMENDED</td>
</tr>
<tr>
<td>type</td>
<td>The value of the 'type' attribute MUST match the syntax specified in &rfc2045;. That is, the value MUST include a top-level media type, the "/" character, and a subtype; in addition, it MAY include one or more optional parameters (e.g., the "audio/ogg" MIME type in the example shown below includes a "codecs" parameter as specified in &rfc4281;). The "type/subtype" string SHOULD be registered in the &ianamedia;, but MAY be an unregistered or yet-to-be-registered value.</td>
<td>REQUIRED</td>
</tr>
</table>
<p>The following example illustrates the format (line endings are provided for readability only).</p>
<example caption='Data element format'><![CDATA[
<data xmlns='urn:xmpp:tmp:bob'
cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
max-age='86400'
type='image/png'>
iVBORw0KGgoAAAANSUhEUgAAAAoAAAAKCAYAAACNMs+9AAAABGdBTUEAALGP
C/xhBQAAAAlwSFlzAAALEwAACxMBAJqcGAAAAAd0SU1FB9YGARc5KB0XV+IA
AAAddEVYdENvbW1lbnQAQ3JlYXRlZCB3aXRoIFRoZSBHSU1Q72QlbgAAAF1J
REFUGNO9zL0NglAAxPEfdLTs4BZM4DIO4C7OwQg2JoQ9LE1exdlYvBBeZ7jq
ch9//q1uH4TLzw4d6+ErXMMcXuHWxId3KOETnnXXV6MJpcq2MLaI97CER3N0
vr4MkhoXe0rZigAAAABJRU5ErkJggg==
</data>
]]></example>
</section2>
<section1 topic='Examples' anchor='examples'>
<p>As an example, consider the use of the &lt;data/&gt; element in conjunction with &xep0071;. Here the cid: URL scheme points to a data element within a &MESSAGE; stanza.</p>
<example caption='A message with included data'><![CDATA[
<message from='ladymacbeth@shakespeare.lit/castle'
to='macbeth@chat.shakespeare.lit'
type='groupchat'>
<body>Yet here's a spot.</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>
Yet here's a spot.
<img alt='A spot'
src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/>
</p>
</body>
</html>
<data xmlns='urn:xmpp:tmp:data-element'
cid='f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'
max-age='86400'
type='image/png'/>
</message>
]]></example>
<p>Once the data element is communicated, a subsequent message in the same session can refer to the data again (via a cid: URI) without including the data element itself.</p>
<example caption='A message with referenced data'><![CDATA[
<message from='ladymacbeth@shakespeare.lit/castle'
to='macbeth@chat.shakespeare.lit'
type='groupchat'>
<body>Out, damned spot!</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>
Out damned spot!
<img alt='A spot'
src='cid:f81d4fae-7dec-11d0-a765-00a0c91e6bf6@shakespeare.lit'/>
</p>
</body>
</html>
</message>
]]></example>
<p>If the receiving entity has not cached the data, it can request the data as described in the <link url='#retrieve'>Retrieving Data</link> section of this document.</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>
<p>The ability to include arbitrary binary data implies that it is possible to send scripts, applets, images, and executable code, any of which might be harmful. To reduce the risk of such exposure, an implementation MAY choose not to display or process such data but instead either completely ignore the data, show only the value of the 'alt' attribute, or prompt a human user for approval (either explicitly via user action or implicitly via a list of approved entities from whom the user will accept binary data without per-event approval).</p>
<p>The receiving entity SHOULD cache data based on the sender's JabberID; this helps to avoid data poisoning attacks.</p>
<p>The recipient MUST cache data based on the sender's JabberID; this helps to avoid data poisoning attacks.</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
@ -227,7 +230,7 @@
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
<section2 topic='Protocol Namespaces' anchor='ns'>
<p>Until this specification advances to a status of Draft, its associated namespace shall be "urn:xmpp:tmp:data-element"; upon advancement of this specification, the &REGISTRAR; shall issue a permanent namespace in accordance with the process defined in Section 4 of &xep0053;.</p>
<p>Until this specification advances to a status of Draft, its associated namespace shall be "urn:xmpp:tmp:bob"; upon advancement of this specification, the &REGISTRAR; shall issue a permanent namespace in accordance with the process defined in Section 4 of &xep0053;. The namespace 'urn:xmpp:bob' is requested and is thought to be unique per the XMPP Registrar's requirements.</p>
</section2>
</section1>
@ -237,15 +240,15 @@
<xs:schema
xmlns:xs='http://www.w3.org/2001/XMLSchema'
targetNamespace='urn:xmpp:tmp:data-element'
xmlns='urn:xmpp:tmp:data-element'
targetNamespace='urn:xmpp:tmp:bob'
xmlns='urn:xmpp:tmp:bob'
elementFormDefault='qualified'>
<xs:element name='data'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='xs:base64Binary'>
<xs:attribute name='cid' type='xs:string' use='optional'/>
<xs:attribute name='cid' type='xs:string' use='required'/>
<xs:attribute name='max-age' type='xs:nonNegativeInteger' use='optional'/>
<xs:attribute name='type' type='xs:string' use='required'/>
</xs:extension>
@ -253,6 +256,12 @@
</xs:complexType>
</xs:element>
<xs:simpleType name='empty'>
<xs:restriction base='xs:string'>
<xs:enumeration value=''/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
]]></code>
</section1>