<remark>More clearly specified that XHTML-IM content is intended for use only within message stanzas, that it may be included in IQ stanzas (usage undefined) and that it may be included in presence stanzas (usage: formatted version of status); specified security consideration regarding hyperlinks.</remark>
</revision>
<revision>
<version>1.0</version>
<date>2004-09-29</date>
<initials>psa</initials>
<remark>Per a vote of the Jabber Council, advanced status to Draft.</remark>
</revision>
<revision>
<version>0.19</version>
<date>2004-09-27</date>
<initials>psa</initials>
<remark>Per list discussion, removed recommendation to preserve whitespace in XHTML bodies (instead, use of <br/> and non-breaking spaces is recommended); noted that nesting of elements is not recommended except within <div/> elements; and switched from padding-left to margin-left for indentation.</remark>
</revision>
<revision>
<version>0.18</version>
<date>2004-09-15</date>
<initials>psa</initials>
<remark>Added recommendation to preserve whitespace in XHTML bodies.</remark>
</revision>
<revision>
<version>0.17</version>
<date>2004-09-08</date>
<initials>psa</initials>
<remark>Simplified the recommended profiles based on list discussion; changed the examples accordingly.</remark>
</revision>
<revision>
<version>0.16</version>
<date>2004-08-30</date>
<initials>psa</initials>
<remark>Specified the scope of the proposal; clarified reasons for the choice of technology; clarified one business rule; corrected several typographical errors.</remark>
</revision>
<revision>
<version>0.15</version>
<date>2004-07-29</date>
<initials>psa</initials>
<remark>Based on W3C feedback, added content model and refactored the text to ensure separation between the XHTML 1.0 Integration Set itself and the JSF's recommended profile of the Integration Set; also split the requirements out from the Concepts and Approach section, added several more examples, and showed renderings of the examples.</remark>
</revision>
<revision>
<version>0.14</version>
<date>2004-05-19</date>
<initials>psa</initials>
<remark>Clarified relationship between wrapper element and XHTML content.</remark>
</revision>
<revision>
<version>0.13</version>
<date>2004-05-18</date>
<initials>psa</initials>
<remark>Initial version of XHTML modularization.</remark>
</revision>
<revision>
<version>0.12</version>
<date>2004-03-10</date>
<initials>psa</initials>
<remark>Clarified and corrected several points in the text; improved and added to the examples.</remark>
</revision>
<revision>
<version>0.11</version>
<date>2003-12-05</date>
<initials>psa</initials>
<remark>Defined XHTML 1.0 Integration Set conformance; removed schema pending work on XHTML modularization with W3C.</remark>
</revision>
<revision>
<version>0.10</version>
<date>2003-11-25</date>
<initials>psa</initials>
<remark>Cleaned up the schema; added W3C considerations.</remark>
</revision>
<revision>
<version>0.9</version>
<date>2003-09-30</date>
<initials>psa</initials>
<remark>Changed status to Deferred pending discussion with the W3C regarding XHTML modularization.</remark>
</revision>
<revision>
<version>0.8</version>
<date>2003-09-16</date>
<initials>psa</initials>
<remark>Changed MUST to SHOULD for support of the Style Attribute Module; clarified relationship of XHTML-IM schema to XHTML schema; slight text cleanup.</remark>
</revision>
<revision>
<version>0.7</version>
<date>2003-08-19</date>
<initials>psa</initials>
<remark>Added the <code/> element.</remark>
</revision>
<revision>
<version>0.6</version>
<date>2003-06-24</date>
<initials>psa</initials>
<remark>Made image support recommended (not mandatory); removed references to conversation threads; fixed some issues in the schema; made small editorial changes throughout.</remark>
</revision>
<revision>
<version>0.5</version>
<date>2003-04-29</date>
<initials>psa</initials>
<remark>Fixed the schema, made several small editorial changes.</remark>
</revision>
<revision>
<version>0.4</version>
<date>2003-02-20</date>
<initials>psa</initials>
<remark>Brought back several content-based elements; added preliminary schema.</remark>
</revision>
<revision>
<version>0.3</version>
<date>2003-02-19</date>
<initials>psa</initials>
<remark>Defined the attributes and style properties required by this document.</remark>
</revision>
<revision>
<version>0.2</version>
<date>2003-02-17</date>
<initials>psa</initials>
<remark>Described the requirements more fully; added additional restrictions above and beyond the standard XHTML 1.0 Modules; added disco examples.</remark>
<p>This document defines methods for exchanging instant messages that contain lightweight text markup. In the context of this document, "lightweight text markup" is to be understood as a combination of minimal structural elements and presentational styles that can easily be rendered on a wide variety of devices without requiring a full rich-text rendering engine such as a web browser. Examples of lightweight text markup include basic text blocks (e.g., paragraphs), lists, hyperlinks, image references, and font styles (e.g., sizes and colors).</p>
<section1 topic='Choice of Technology' anchor='tech'>
<p>In the past, there have existed several incompatible methods within the Jabber community for exchanging instant messages that contain lightweight text markup. The most notable such methods have included derivatives of &w3xhtml; as well as of &rtf;.</p>
<p>Although it is sometimes easier for client developers to implement RTF support (this is especially true on certain Microsoft Windows operating systems), there are several reasons (consistent with the &xep0134;) for the &JSF; to avoid the use of RTF in developing a protocol for lightweight text markup. Specifically:</p>
<ol type='1' start='1'>
<li>RTF is not a structured vocabulary derived from SGML (as is &w3html;) or, more relevantly, from XML (as is XHTML 1.0).</li>
<li>RTF is under the control of the Microsoft Corporation and thus is not an open standard maintained by a recognized standards development organization; therefore the JSF is unable to contribute to or influence its development if necessary, and any protocol the JSF developed using RTF would introduce unwanted dependencies.</li>
</ol>
<p>Conversely, there are several reasons to prefer XHTML for lightweight text markup:</p>
<ol type='1' start='1'>
<li>XHTML is a structured format that is defined as an application of &w3xml;, making it especially appropriate for sending over Jabber/XMPP, which is at root a technology for streaming XML (see &xmppcore;).</li>
<li>XHTML is an open standard developed by the &W3C;, a recognized standards development organization.</li>
</ol>
<p>Therefore, this document defines support for lightweight text markup in the form of an XMPP extension that encapsulates content defined by an XHTML 1.0 Integration Set that we label "XHTML-IM". The remainder of this document discusses lightweight text markup in terms of XHTML 1.0 only and does not further consider RTF or other technologies.</p>
<p>HTML was originally designed for authoring and presenting structured documents on the World Wide Web, and was subsequently extended to handle more advanced functionality such as image maps and interactive forms. However, the requirements for publishing documents (or developing transactional websites) for presentation by dedicated XHTML clients on traditional computers or small-screen devices are fundamentally different from the requirements for lightweight text markup of instant messages; for this reason, only a reduced set of XHTML features is needed for XHTML-IM. In particular:</p>
<ol type='1' start='1'>
<li><p>IM clients are not XHTML clients: their primary purpose is not to read pre-existing XHTML documents, but to read <em>and generate</em> relatively large numbers of fairly small instant messages.</p></li>
<li><p>The underlying context for XHTML content in Jabber/XMPP instant messaging is provided not by a full XHTML document, but by an XML stream, and specifically by a message stanza within that stream. Thus the <head/> element and all its children are unnecessary. Only the <body/> element and some of its children are appropriate for use in instant messaging.</p></li>
<li><p>The XHTML content that is read by one's IM client is normally generated on the fly by one's conversation partner (or, to be precise, by his or her IM client). Thus there is an inherent limit to the sophistication of the XHTML markup involved. Even in normal XHTML documents, fairly basic structural and rendering elements such as definition lists, abbreviations, addresses, and computer input handling (e.g., <kbd/> and <var/>) are relatively rare. There is little or no foreseeable need for such elements within the context of instant messaging.</p></li>
<li><p>The foregoing is doubly true of more advanced markup such as tables, frames, and forms (however, there exists an XMPP extension that provides an instant messaging equivalent of the latter, as defined in &xep0004;).</p></li>
<li><p>Although ad-hoc styles are useful for messaging (by means of the 'style' attribute), full support for &w3css; (defined by the <style/> element or a standalone .css file, and implemented via the 'class' attribute) would be overkill since many CSS1 properties (e.g., box, classification, and text properties) were developed especially for sophisticated page layout.</p></li>
<li><p>Background images, audio, animated text, layers, applets, scripts, and other multimedia content types are unnecessary, especially given the existence of XMPP extensions such as &xep0096;.</p></li>
<li><p>Content transformations such as those defined by &w3xslt; must not be necessary in order for an instant messaging application to present lightweight text markup to an end user.</p></li>
</ol>
<p>As explained below, some of these requirements are addressed by the definition of the XHTML-IM Integration Set itself, while others are addressed by a recommended "profile" for that Integration Set in the context of instant messaging applications.</p>
</section1>
<section1 topic='Concepts and Approach' anchor='concepts'>
<p>This document defines an adaptation of XHTML 1.0 (specifically, an XHTML 1.0 Integration Set) that makes it possible to provide lightweight text markup of instant messages (mainly for Jabber/XMPP instant messages, although the Integration Set defined herein could be used by other protocols). This pattern is familiar from email, wherein the HTML-formatted version of the message supplements but does not supersede the text-only version of the message. <note>The XHTML is merely an alternative version of the message body or bodies, and the semantic meaning is to be derived from the textual message body or bodies rather than the XHTML version.</note></p>
<p>In Jabber/XMPP communications, the meaning (as opposed to markup) of the message MUST always be represented as best as possible in the normal <body/> child element or elements of the &MESSAGE; stanza qualified by the 'jabber:client' (or 'jabber:server') namespace. Lightweight text markup is then provided within an <html/> element qualified by the 'http://jabber.org/protocol/xhtml-im' namespace. <note>It might have been better to use an element name other than <html/> for the wrapper element; however, changing it would not be backwards-compatible with the older protocol and existing implementations.</note> However, this <html/> element is used solely as a "wrapper" for the XHTML content itself, which content is encapsulated via one or more <body/> elements qualified by the 'http://www.w3.org/1999/xhtml' namespace, along with appropriate child elements thereof.</p>
<p>The following example illustrates this approach.</p>
<example caption='A simple example'><![CDATA[
<message>
<body>hi!</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p style='font-weight:bold'>hi!</p>
</body>
</html>
</message>
]]></example>
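<p>As a rough illustration of the structure shown above, the following Python sketch builds the same stanza with the standard library's ElementTree. The function name build_message and its parameters are hypothetical and chosen only for this illustration; the sketch also leaves the &MESSAGE; element unqualified, on the assumption that the surrounding stream supplies the 'jabber:client' default namespace.</p>

```python
import xml.etree.ElementTree as ET

XHTML_IM_NS = "http://jabber.org/protocol/xhtml-im"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_message(plain_text, style):
    """Return a <message/> holding a plain body and an XHTML-IM alternative.

    Hypothetical helper: the name and signature are assumptions made
    for this sketch, not part of any XMPP library.
    """
    message = ET.Element("message")
    # The plain-text <body/> carries the meaning of the message.
    ET.SubElement(message, "body").text = plain_text
    # The <html/> wrapper is qualified by the XHTML-IM namespace ...
    html = ET.SubElement(message, "{%s}html" % XHTML_IM_NS)
    # ... while the XHTML <body/> and its children use the XHTML namespace.
    xhtml_body = ET.SubElement(html, "{%s}body" % XHTML_NS)
    paragraph = ET.SubElement(xhtml_body, "{%s}p" % XHTML_NS, {"style": style})
    paragraph.text = plain_text
    return message


stanza = build_message("hi!", "font-weight:bold")
print(ET.tostring(stanza, encoding="unicode"))
```

<p>ElementTree may serialize the two namespaces with generated prefixes rather than as default namespace declarations; the result is equivalent XML, although an implementation might prefer to control the prefixes explicitly.</p>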
<p>Technically speaking, there are three aspects to the approach taken herein:</p>
<ol type='1' start='1'>
<li>Definition of the <html/> "wrapper" element, which functions as an XMPP extension within XMPP <message/> stanzas.</li>
<li>Definition of the XHTML-IM Integration Set itself in terms of supported XHTML 1.0 modules, using the concepts defined in &w3xhtmlmod;.</li>
<li>A recommended "profile" regarding the specific XHTML 1.0 elements and attributes to be supported from each XHTML 1.0 module.</li>
</ol>
<p>These three aspects are defined in the three document sections that follow.</p>
</section1>
<section1 topic='Wrapper Element' anchor='wrapper'>
<p>The root element for including XHTML content within XMPP stanzas is <html/>. This element is qualified by the 'http://jabber.org/protocol/xhtml-im' namespace. From the perspective of XMPP, it functions as an XMPP extension element; from the perspective of XHTML, it functions as a wrapper for XHTML 1.0 content qualified by the 'http://www.w3.org/1999/xhtml' namespace. Such XHTML content MUST be contained in one or more <body/> elements qualified by the 'http://www.w3.org/1999/xhtml' namespace and MUST conform to the XHTML-IM Integration Set defined in the following section. If more than one <body/> element is included in the <html/> wrapper element, each <body/> element MUST possess an 'xml:lang' attribute with a distinct value, where the value of that attribute MUST adhere to the rules defined in &rfc3066;. A formal definition of the <html/> element is provided in the <link url="#schemas-wrapper">XHTML-IM Wrapper Schema</link>.</p>
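<p>The wrapper rules in the preceding paragraph can be checked mechanically. The following Python sketch is one possible check, not a normative algorithm: the function name check_wrapper and the wording of the problem strings are hypothetical, and language tags are compared case-insensitively on the assumption that RFC 3066 tags are case-insensitive. It does not validate the syntax of the tags themselves.</p>

```python
import xml.etree.ElementTree as ET

XHTML_IM_NS = "http://jabber.org/protocol/xhtml-im"
XHTML_NS = "http://www.w3.org/1999/xhtml"
# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def check_wrapper(html):
    """Return a list of problems found in an <html/> wrapper element.

    Hypothetical helper for illustration; an empty list means the
    wrapper satisfies the structural rules checked here.
    """
    problems = []
    if html.tag != "{%s}html" % XHTML_IM_NS:
        problems.append("wrapper is not <html/> in the XHTML-IM namespace")
    bodies = html.findall("{%s}body" % XHTML_NS)
    if not bodies:
        problems.append("no XHTML <body/> child")
    if len(bodies) > 1:
        langs = [body.get(XML_LANG) for body in bodies]
        if None in langs:
            problems.append("a <body/> lacks xml:lang")
        present = [lang.lower() for lang in langs if lang is not None]
        if len(set(present)) != len(present):
            problems.append("duplicate xml:lang value")
    return problems


two_langs = ET.fromstring(
    "<html xmlns='http://jabber.org/protocol/xhtml-im'>"
    "<body xmlns='http://www.w3.org/1999/xhtml' xml:lang='en'><p>hi</p></body>"
    "<body xmlns='http://www.w3.org/1999/xhtml' xml:lang='de'><p>hallo</p></body>"
    "</html>"
)
print(check_wrapper(two_langs))
```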
<p>Note: The XHTML <body/> element is not to be confused with the XMPP <body/> element, which is a child of a &MESSAGE; stanza and is qualified by the 'jabber:client' or 'jabber:server' namespace as described in &xmppim;. The <html/> wrapper element is intended for inclusion only as a direct child element of the XMPP &MESSAGE; stanza and only in order to specify a marked-up version of the message &BODY; element or elements, but MAY be included elsewhere in accordance with the "extended namespace" rules defined in the <cite>XMPP IM</cite> specification.</p>
<p>Until and unless (1) additional integration sets are defined and (2) mechanisms are specified for discovering or negotiating which integration sets are supported, the XHTML markup contained within the <html/> wrapper element MUST NOT include elements and attributes that are not part of the XHTML-IM Integration Set defined in the following section, and any such elements and attributes MUST be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document).</p>
<p>This section defines an XHTML 1.0 Integration Set for use in the context of instant messaging. Given its intended usage, we label it "XHTML-IM".</p>
<p><cite>Modularization of XHTML</cite> provides the ability to formally define subsets of XHTML 1.0 via the concept of "modularization" (which may be familiar from &w3xhtmlbasic;). Many of the defined modules are not necessary or useful in the context of instant messaging, and in the context of Jabber/XMPP instant messaging specifically some modules have been superseded by well-defined XMPP extensions. This document specifies that XHTML-IM shall be based on the following XHTML 1.0 modules:</p>
<p><cite>Modularization of XHTML</cite> defines many additional modules, such as Table Modules, Form Modules, Object Modules, and Frame Modules. None of these modules is part of the XHTML-IM Integration Set. If support for such modules is desired, it MUST be defined in a separate and distinct integration set.</p>
<p>The Structure Module is defined as including the following elements and attributes: <note>The 'style' attribute is specified herein where appropriate because the Style Attribute Module is included in the definition of the XHTML-IM Integration Set, whereas the event-related attributes (e.g., 'onclick') are not specified because the Intrinsic Events Module is not included.</note></p>
<table caption='Defined Structure Module Elements and Attributes'>
<p>Even within the restricted set of modules specified as defining the XHTML-IM Integration Set (see preceding section), some elements and attributes are inappropriate or unnecessary for the purpose of instant messaging; although such elements and attributes MAY be included in accordance with the XHTML-IM Integration Set, further recommended restrictions regarding which elements and attributes to include in XHTML content are specified below.</p>
<p>The intent of the protocol defined herein is to support lightweight text markup of XMPP message bodies only. Therefore the <head/>, <html/>, and <title/> elements are NOT RECOMMENDED to be generated by a compliant implementation, and SHOULD be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document). However, the <body/> element is REQUIRED, since it is the root element for all XHTML content.</p>
<p>Not all of the Text Module elements are appropriate in the context of instant messaging, since the XHTML content that one views is generated by one's conversation partner in what is often a rapid-fire conversation thread. Only the following elements are RECOMMENDED in XHTML-IM:</p>
<ul>
<li><br/></li>
<li><p/></li>
<li><span/></li>
</ul>
<p>The other Text Module elements SHOULD NOT be generated by a compliant implementation, and MAY be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document).</p>
<p>The only recommended attributes of the <a/> element are specified in the <link url="#profile-attributes">Recommended Attributes</link> section of this document.</p>
<p>Because it is unlikely that an instant messaging user would generate a definition list, only ordered and unordered lists are RECOMMENDED. Definition lists SHOULD NOT be generated by a compliant implementation, and MAY be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document).</p>
<p>The only recommended attributes of the <img/> element are specified in the <link url="#profile-attributes">Recommended Attributes</link> section of this document. In addition, for security reasons or because of display constraints, a compliant client MAY choose to display 'alt' text only, not the image itself.</p>
<p>This module MUST be supported in XHTML-IM if possible; although clients written for certain platforms (e.g., console clients, mobile phones, and handheld computers) or for certain classes of users (e.g., text-to-speech clients) may not be able to support all of the recommended styles directly, they SHOULD attempt to emulate or translate the defined style properties into text or other presentation styles that are appropriate for the platform or user base in question.</p>
<p>A full list of recommended style properties is provided below.</p>
<p><cite>CSS1</cite> defines 42 "atomic" style properties (which are categorized into font, color and background, text, box, and classification properties) as well as 11 "shorthand" properties ("font", "background", "margin", "padding", "border-width", "border-top", "border-right", "border-bottom", "border-left", "border", and "list-style"). Many of these properties are not appropriate for use in text-based instant messaging, for one or more of the following reasons:</p>
<ol type='1' start='1'>
<li>The property applies to or depends on the inclusion of images other than those handled by the XHTML Image Module (e.g., the "background-image", "background-repeat", "background-attachment", "background-position", and "list-style-image" properties).</li>
<li>The property is intended for advanced document layout (e.g., the "line-height" property and most of the box properties, with the exception of "margin-left", which is useful for indenting text, and "margin-right", which can be useful when dealing with images).</li>
<li>The property is unnecessary since it can be emulated via user input or recommended XHTML structural elements (e.g., the "text-transform" property can be emulated by the user's keystrokes or use of the caps lock key).</li>
<li>The property is otherwise unlikely to ever be used in the context of rapid-fire conversations (e.g., the "font-variant", "word-spacing", "letter-spacing", and "list-style-position" properties).</li>
<li>The property is a shorthand property but some of the properties it includes are not appropriate for instant messaging applications according to the foregoing considerations (in fact this applies to all of the shorthand properties).</li>
</ol>
<p>Unfortunately, <cite>CSS1</cite> does not include mechanisms for defining profiles thereof (as does XHTML 1.0 in the form of XHTML Modularization). While there exist reduced sets of CSS2, these introduce more complexity than is desirable in the context of XHTML-IM. Therefore we simply provide a list of recommended CSS1 style properties.</p>
<p>XHTML-IM stipulates that only the following style properties are RECOMMENDED:</p>
<p>Although a compliant implementation MAY generate or process other style properties defined in CSS1, such behavior is NOT RECOMMENDED by this document.</p>
<p>Section 5.1 of <cite>Modularization of XHTML</cite> describes several "common" attribute collections: a "Core" collection ('class', 'id', 'title'), an "I18N" collection ('xml:lang', not shown below since it is implied in XML), an "Events" collection (not included in the XHTML-IM Integration Set because the Intrinsic Events Module is not selected), and a "Style" collection ('style'). The following table summarizes the recommended profile of these common attributes within the XHTML 1.0 content itself:</p>
<table caption='Recommended Usage of Common Attributes'>
<td>External stylesheets (which 'class' would typically reference) are not recommended.</td>
</tr>
<tr>
<td>id</td>
<td>NOT RECOMMENDED</td>
<td>Internal links and message fragments are not recommended in IM content, nor are external stylesheets (which also make use of the 'id' attribute).</td>
</tr>
<tr>
<td>title</td>
<td>NOT RECOMMENDED</td>
<td>Granting of titles to elements in IM content seems unnecessary.</td>
</tr>
<tr>
<td>style</td>
<td>REQUIRED</td>
<td>The 'style' attribute is required since it is the vehicle for presentational styles.</td>
</tr>
<tr>
<td>xml:lang</td>
<td>NOT RECOMMENDED</td>
<td>Differentiation of language identification should occur at the level of the <body/> element only.</td>
<p>Beyond the "common" attributes, certain elements within the modules selected for the XHTML-IM Integration Set are allowed to possess other attributes, such as eight attributes for the <a/> element and five attributes for the <img/> element. The recommended profile for such attributes is provided in the following table:</p>
<table caption='Recommended Usage of Specialized Attributes'>
<p>Other XHTML 1.0 attributes SHOULD NOT be generated by a compliant implementation, and SHOULD be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document).</p>
</section3>
</section2>
<section2 topic='Summary of Recommendations' anchor='profile-summary'>
<p>The following table summarizes the elements and attributes that are recommended within the XHTML-IM Integration Set.</p>
<table caption='Recommended Elements and Attributes'>
<tr>
<th>Element</th>
<th>Attributes</th>
</tr>
<tr>
<td><a/></td>
<td>href, style, type</td>
</tr>
<tr>
<td><body/></td>
<td>style, xml:lang <note>When contained within the <html xmlns='http://jabber.org/protocol/xhtml-im'> element, a <body/> element is qualified by the 'http://www.w3.org/1999/xhtml' namespace; naturally, this is a namespace declaration rather than an attribute per se, and therefore is not mentioned in the attribute enumeration.</note></td>
</tr>
<tr>
<td><br/></td>
<td>-none-</td>
</tr>
<tr>
<td><img/></td>
<td>alt, height, src, style, width</td>
</tr>
<tr>
<td><li/></td>
<td>style</td>
</tr>
<tr>
<td><ol/></td>
<td>style</td>
</tr>
<tr>
<td><p/></td>
<td>style</td>
</tr>
<tr>
<td><span/></td>
<td>style</td>
</tr>
<tr>
<td><ul/></td>
<td>style</td>
</tr>
</table>
<p>Any other elements and attributes defined in the XHTML 1.0 modules that are included in the XHTML-IM Integration Set SHOULD NOT be generated by a compliant implementation, and SHOULD be ignored if received (where the meaning of "ignore" is defined by the conformance requirements of <cite>Modularization of XHTML</cite>, as summarized in the <link url="#w3c-conformance">User Agent Conformance</link> section of this document).</p>
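<p>One way a receiving client might apply the summary table is sketched below in Python. The names RECOMMENDED, sanitize_body, and the private helpers are hypothetical; the allow-list simply transcribes the table above. The sketch assumes that "ignoring" an unrecognized element means dropping the element itself while still processing its children and text, and that "ignoring" an attribute means dropping it together with its value. It does not inspect 'href' or 'src' values or individual style properties, so it is not a complete security filter.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Allow-list transcribed from the summary table (element -> attributes).
RECOMMENDED = {
    "a": {"href", "style", "type"},
    "body": {"style", XML_LANG},
    "br": set(),
    "img": {"alt", "height", "src", "style", "width"},
    "li": {"style"},
    "ol": {"style"},
    "p": {"style"},
    "span": {"style"},
    "ul": {"style"},
}


def _local(tag):
    """Return the local name of an XHTML-qualified tag, else None."""
    prefix = "{%s}" % XHTML_NS
    return tag[len(prefix):] if tag.startswith(prefix) else None


def _append_text(parent, text):
    """Append character data at the current end of parent's content."""
    if not text:
        return
    if len(parent):
        parent[-1].tail = (parent[-1].tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _copy_children(src, dst):
    """Copy src's content into dst, keeping only recommended markup."""
    _append_text(dst, src.text)
    for child in src:
        name = _local(child.tag)
        if name in RECOMMENDED:
            attrs = {k: v for k, v in child.attrib.items()
                     if k in RECOMMENDED[name]}
            _copy_children(child, ET.SubElement(dst, child.tag, attrs))
        else:
            # Unrecognized element: skip it but keep processing its children.
            _copy_children(child, dst)
        _append_text(dst, child.tail)


def sanitize_body(body):
    """Return a filtered copy of an XHTML <body/> element."""
    attrs = {k: v for k, v in body.attrib.items() if k in RECOMMENDED["body"]}
    clean = ET.Element(body.tag, attrs)
    _copy_children(body, clean)
    return clean


received = ET.fromstring(
    "<body xmlns='http://www.w3.org/1999/xhtml'>"
    "<p class='x' style='color:red'>The <acronym title='t'>XHTML</acronym> spec</p>"
    "</body>"
)
cleaned = sanitize_body(received)
```

<p>Building a filtered copy rather than editing the tree in place keeps the handling of text that follows a dropped element simple, at the cost of one extra pass over the content.</p>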
</section2>
</section1>
<section1 topic='Business Rules' anchor='bizrules'>
<p>The following rules apply to the generation and processing of XHTML content by Jabber clients or other XMPP entities.</p>
<ol type='1' start='1'>
<li><p>XHTML-IM content is designed to provide a formatted version of the XML character data provided in the &BODY; of an XMPP &MESSAGE; stanza; if such content is included in an XMPP message, the <html/> element MUST be a direct child of the &MESSAGE; stanza and the XHTML-IM content MUST be understood as a formatted version of the message body. XHTML-IM content MAY be included within XMPP &IQ; stanzas (or children thereof), but any such usage is undefined. In order to preserve bandwidth, XHTML-IM content SHOULD NOT be included within XMPP &PRESENCE; stanzas; however, if it is so included, the <html/> element MUST be a direct child of the &PRESENCE; stanza and the XHTML-IM content MUST be understood as a formatted version of the XML character data provided in the &STATUS; element.</p></li>
<li><p>The sending client MUST ensure that, if XHTML content is sent, its meaning is the same as that of the plaintext version, and that the two versions differ only in markup rather than meaning.</p></li>
<li><p>XHTML-IM is a reduced set of <cite>XHTML 1.0</cite> and thus also of <cite>XML 1.0</cite>. Therefore all opening tags MUST be completed by inclusion of an appropriate closing tag.</p></li>
<li><p><cite>XMPP Core</cite> specifies that an XMPP &MESSAGE; MAY contain more than one <body/> child as long as each <body/> possesses an 'xml:lang' attribute with a distinct value. In order to ensure correct internationalization, if an XMPP &MESSAGE; stanza contains more than one <body/> child and is also sent as XHTML-IM, the <html/> element SHOULD also contain more than one <body/> child, with one such element for each <body/> child of the &MESSAGE; stanza (distinguished by an appropriate 'xml:lang' attribute).</p></li>
<li><p>Section 11.1 of <cite>XMPP Core</cite> stipulates that character entities other than the five general entities defined in Section 4.6 of the XML specification (i.e., &lt;, &gt;, &amp;, &apos;, and &quot;) MUST NOT be sent over an XML stream. Therefore implementations of XHTML-IM MUST NOT include predefined XHTML 1.0 entities such as &nbsp; -- instead, implementations MUST use the equivalent character references as specified in Section 4.1 of the XML specification (even in non-obvious places such as URIs that are included in the 'href' attribute).</p></li>
<li><p>For elements and attributes qualified by the 'http://www.w3.org/1999/xhtml' namespace, user agent conformance is guided by the requirements defined in <cite>Modularization of XHTML</cite>; for details, refer to the <link url="#w3c-conformance">User Agent Conformance</link> section of this document.</p></li>
<li><p>The use of structural elements is NOT RECOMMENDED where presentational styles are desired, which is why very few structural elements are specified herein. Implementations SHOULD use appropriate 'style' attributes (e.g., <span style='font-weight: bold'>this is bold</span> and <p style='margin-left: 5%'>this is indented</p>) rather than XHTML structural elements (e.g., <strong/> and <blockquote/>) wherever possible.</p></li>
<li><p>Nesting of block structural elements (<p/>) and list elements (<dl/>, <ol/>, <ul/>) is NOT RECOMMENDED, except within <div/> elements.</p></li>
<li><p>It is RECOMMENDED for implementations to replace line breaks with the <br/> element and to replace significant whitespace with the appropriate number of non-breaking spaces (via the NO-BREAK SPACE character or its equivalent), where "significant whitespace" means whitespace that makes some material difference (e.g., one or more spaces at the beginning of a line or more than one space anywhere else within a line), not "normal" whitespace separating words or punctuation.</p></li>
</ol>
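<p>The whitespace rule in the list above lends itself to a short illustration. The following Python sketch converts a plain-text body into a single XHTML paragraph, replacing line breaks with <br/> elements and significant whitespace with NO-BREAK SPACE characters. The function names are hypothetical, and the sketch assumes that every space in a run of two or more spaces is replaced; an implementation could equally keep one ordinary space per run to permit line wrapping.</p>

```python
import re
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
NBSP = "\u00a0"  # NO-BREAK SPACE


def _protect_spaces(line):
    """Replace significant whitespace in one line with NBSP characters."""
    stripped = line.lstrip(" ")
    # One or more spaces at the beginning of a line are significant.
    lead = NBSP * (len(line) - len(stripped))
    # So is any run of more than one space elsewhere in the line.
    rest = re.sub(r" {2,}", lambda m: NBSP * len(m.group()), stripped)
    return lead + rest


def plain_to_paragraph(text):
    """Return an XHTML <p/> holding text, with <br/> for each line break.

    Hypothetical helper for illustration only.
    """
    paragraph = ET.Element("{%s}p" % XHTML_NS)
    lines = text.splitlines()
    if not lines:
        return paragraph
    paragraph.text = _protect_spaces(lines[0])
    for line in lines[1:]:
        br = ET.SubElement(paragraph, "{%s}br" % XHTML_NS)
        # Text following a <br/> is stored as that element's tail.
        br.tail = _protect_spaces(line)
    return paragraph


para = plain_to_paragraph("a  b\n  c d")
print(ET.tostring(para))
```

<p>With its default ASCII output encoding, ElementTree writes the NO-BREAK SPACE as a numeric character reference rather than as the predefined XHTML entity, which is consistent with the rule on entities given above.</p>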
</section1>
<section1 topic='Examples' anchor='examples'>
<p>The following examples provide an insight into the inclusion of XHTML content in XMPP &MESSAGE; stanzas but are by no means exhaustive or definitive.</p>
<p>(Note: The examples may not render correctly in all web browsers, since not all web browsers comply fully with the XHTML 1.0 and CSS1 standards. Markup in the examples may include line breaks for readability. Example renderings are shown with a colored background to set them off from the rest of the text.)</p>
<example caption='Bold, italic, font colors'><![CDATA[
<message>
<body>OMG, I'm green with envy!</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p style='font-size:large'>
<span style='font-style: italic'>OMG</span>,
I'm <span style='color:green'>green</span>
with <span style='font-weight: bold'>envy</span>!
</p>
</body>
</html>
</message>
]]></example>
<p>This could be rendered as follows:</p>
<div class='example'>
<p style='font-size: large'><span style='font-style: italic'>OMG</span>, I'm <span style='color:green'>green</span> with <span style='font-weight: bold'>envy</span>!</p>
</div>
<example caption='Indentation'><![CDATA[
<message>
<body>As Emerson said in his essay Self-Reliance:
"A foolish consistency is the hobgoblin of little minds."
</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>As Emerson said in his essay <span style='font-style: italic'>Self-Reliance</span>:</p>
<p style='margin-left: 5%'>
"A foolish consistency is the hobgoblin of little minds."
</p>
</body>
</html>
</message>
]]></example>
<p>This could be rendered as follows:</p>
<div class='example'>
<p>As Emerson said in his essay <span style='font-style: italic'>Self-Reliance</span>:</p>
<p style='margin-left: 5%'>
"A foolish consistency is the hobgoblin of little minds."
</p>
</div>
<example caption='An image and a hyperlink'><![CDATA[
<message>
<body>Hey, are you licensed to Jabber?
http://www.jabber.org/images/psa-license.jpg
</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>Hey, are you licensed to <a href='http://www.jabber.org/'>Jabber</a>?</p>
<p>Hey, are you licensed to <link url='http://www.jabber.org/'>Jabber</link>?</p>
<p><img src='http://www.jabber.org/images/psa-license.jpg' alt='A License to Jabber' height='261' width='537'/></p>
</div>
<p>Note the large size of the image. Including the 'height' and 'width' attributes is therefore quite friendly, since it gives the receiving application hints as to whether the image is too large to fit into the current interface (naturally, these are hints only and cannot necessarily be relied upon in determining the size of the image).</p>
<p>Rendering the 'alt' value rather than the image would yield something like the following:</p>
<div class='example'>
<p>Hey, are you licensed to <link url='http://www.jabber.org/'>Jabber</link>?</p>
<p>How multiple bodies would best be rendered will depend on the user agent and relevant application. For example, a specialized Jabber client that is used in foreign language instruction might show two languages side by side, whereas a dedicated IM client might show content only in a human user's preferred language as captured in the client configuration.</p>
<example caption='Unrecognized Elements and Attributes'><![CDATA[
<message>
<body>
The XHTML user agent conformance requirements say to ignore
elements and attributes you don't understand, to wit:
4. If a user agent encounters an element it does
not recognize, it must continue to process the
children of that element. If the content is text,
the text must be presented to the user.
5. If a user agent encounters an attribute it does
not recognize, it must ignore the entire attribute
specification (i.e., the attribute and its value).
</body>
<html xmlns='http://jabber.org/protocol/xhtml-im'>
<body xmlns='http://www.w3.org/1999/xhtml'>
<p>The <acronym>XHTML</acronym> user agent conformance
requirements say to ignore elements and attributes
you don't understand, to wit:</p>
<ol type='1' start='4'>
<li><p>
If a user agent encounters an element it does
not recognize, it must continue to process the
children of that element. If the content is text,
the text must be presented to the user.
</p></li>
<li><p>
If a user agent encounters an attribute it does
not recognize, it must ignore the entire attribute
specification (i.e., the attribute and its value).
</p></li>
</ol>
</body>
</html>
</message>
]]></example>
<p>Let us assume that the recipient's user agent recognizes neither the <acronym/> element (which is discouraged in XHTML-IM) nor the 'type' and 'start' attributes of the <ol/> element (which, after all, were deprecated in HTML 4.0), and that it does not render nested elements (e.g., the <p/> elements within the <li/> elements); in this case, it could render the content as follows (note that the element value is shown as text and the attribute value is not rendered):</p>
<div class='example'>
<p>The <acronym>XHTML</acronym> user agent conformance requirements say to ignore elements and attributes you don't understand, to wit:</p>
<ol>
<li>If a user agent encounters an element it does not recognize, it must continue to process the children of that element. If the content is text, the text must be presented to the user.</li>
<li>If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).</li>
</ol>
</div>
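<p>The fallback behavior described above can be sketched in Python. This is a minimal illustration under stated assumptions, not part of the specification: the names ALLOWED and sanitize are hypothetical, and the allow-list is deliberately abbreviated (the actual element and attribute sets are those of the Integration Set and Recommended Profile). An unrecognized element is dropped but its character data and children are kept; an unrecognized attribute is dropped together with its value.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical, abbreviated allow-list (element -> permitted attributes).
# It is an assumption for this sketch, not the normative profile.
ALLOWED = {
    "body": {"style"},
    "p": {"style"},
    "span": {"style"},
    "ol": {"style"},
    "li": {"style"},
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def _append_text(parent, text):
    """Append character data to parent, after its last child if any."""
    if not text:
        return
    if len(parent):
        last = parent[-1]
        last.tail = (last.tail or "") + text
    else:
        parent.text = (parent.text or "") + text


def _copy_children(src, dst):
    """Copy the content of src into dst, unwrapping unrecognized elements."""
    _append_text(dst, src.text)
    for child in src:
        name = local_name(child.tag)
        if child.tag.startswith("{" + XHTML_NS + "}") and name in ALLOWED:
            kept = ET.SubElement(dst, child.tag)
            for attr, value in child.attrib.items():
                # Unrecognized attributes are ignored together with their value.
                if attr in ALLOWED[name]:
                    kept.set(attr, value)
            _copy_children(child, kept)
        else:
            # Unrecognized element: skip the element, keep processing its content.
            _copy_children(child, dst)
        _append_text(dst, child.tail)


def sanitize(body):
    """Return a filtered copy of an XHTML body element."""
    clean = ET.Element(body.tag)
    _copy_children(body, clean)
    return clean
```

<p>Applied to the example above, such a filter would keep the text "XHTML" while discarding the acronym element, and would keep the ordered list while discarding its 'type' and 'start' attributes.</p>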
</section1>
<section1 topic='Discovering Support for XHTML-IM' anchor='discovery'>
<p>This section describes methods for discovering whether a Jabber client or other XMPP entity supports the protocol defined herein.</p>
<p>If the queried entity supports XHTML-IM, it MUST return a <feature/> element with a 'var' attribute set to a value of "http://jabber.org/protocol/xhtml-im" in the IQ result.</p>
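<p>A client-side check for this feature might look like the following Python sketch. It assumes the IQ result is available as a string and that the feature element is carried in a query element qualified by the 'http://jabber.org/protocol/disco#info' namespace (the Service Discovery namespace); the function name supports_xhtml_im is hypothetical and is used here only for illustration.</p>

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"
XHTML_IM_FEATURE = "http://jabber.org/protocol/xhtml-im"


def supports_xhtml_im(iq_xml):
    """Return True if a disco#info IQ result advertises XHTML-IM support."""
    iq = ET.fromstring(iq_xml)
    # Only a successful result can advertise features.
    if iq.get("type") != "result":
        return False
    query = iq.find("{%s}query" % DISCO_INFO_NS)
    if query is None:
        return False
    return any(
        feature.get("var") == XHTML_IM_FEATURE
        for feature in query.findall("{%s}feature" % DISCO_INFO_NS)
    )
```

<p>An entity whose result lacks the feature, or that answers with an IQ error, is treated as not supporting XHTML-IM in this sketch.</p>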
<example caption='Contact Returns Disco Info Results'><![CDATA[
<p>A Jabber user's client MAY send XML &MESSAGE; stanzas containing XHTML-IM extensions without first discovering whether the conversation partner's client supports XHTML-IM. If the user's client sends a message that includes XHTML-IM markup and the conversation partner's client replies to that message but does not include XHTML-IM markup, the user's client SHOULD NOT continue sending XHTML-IM markup.</p>
<p>The exclusion of scripts, applets, and other multimedia elements reduces the risk of exposure to harmful or malicious objects caused by inclusion of XHTML content. Because of security concerns related to images, an implementation MAY choose not to show images but instead show only the 'alt' text. Because of security concerns related to hyperlinks, an implementation MAY choose not to make them clickable.</p>
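<p>The two optional mitigations can be sketched in Python as follows. This is an illustrative assumption about one way a receiving client might apply them, not a required algorithm: the function name defuse is hypothetical, images are replaced by a span carrying the 'alt' text, and links are made inert by removing the 'href' attribute.</p>

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"
IMG = "{%s}img" % XHTML_NS
A = "{%s}a" % XHTML_NS
SPAN = "{%s}span" % XHTML_NS


def defuse(body):
    """Replace images with their alt text and make links non-clickable, in place."""
    for elem in body.iter():
        if elem.tag == IMG:
            # Show only the 'alt' text; the image itself is never fetched.
            alt = elem.get("alt", "")
            elem.tag = SPAN
            elem.attrib.clear()
            elem.text = alt
        elif elem.tag == A:
            # Keep the link text but drop the target so it cannot be followed.
            elem.attrib.pop("href", None)
    return body
```

<p>A client could equally well keep the link target and display it as plain text next to the link text; the sketch simply shows the most conservative choice.</p>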
</section1>
<section1 topic='W3C Considerations' anchor='w3c'>
<p>The usage of XHTML 1.0 defined herein meets the requirements for XHTML 1.0 Integration Set document type conformance as defined in Section 3 ("Conformance Definition") of <cite>Modularization of XHTML</cite>.</p>
<section2 topic='Document Type Name' anchor='w3c-doctype'>
<p>The Formal Public Identifier (FPI) for the XHTML-IM document type definition is:</p>
<code><![CDATA[
-//JSF//DTD Instant Messaging with XHTML//EN
]]></code>
<p>The fields of this FPI are as follows:</p>
<ol type='1' start='1'>
<li>The leading field is "-", which indicates that this is a privately-defined resource.</li>
<li>The second field is "JSF" (an abbreviation for Jabber Software Foundation), which identifies the organization that maintains the named item.</li>
<li>The third field contains two constructs:
<ol type='1' start='1'>
<li>The public text class is "DTD", which adheres to ISO 8879 Clause 10.2.2.1.</li>
<li>The public text description is "Instant Messaging with XHTML", which contains but does not begin with the string "XHTML" (as recommended for an XHTML 1.0 Integration Set).</li>
</ol>
</li>
<li>The fourth field is "EN", which identifies the language (English) in which the item is defined.</li>
</ol>
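<p>The field breakdown above can be reproduced mechanically. The following Python sketch splits the identifier on the "//" delimiter and then separates the public text class from the public text description; the function name parse_fpi and the dictionary keys are hypothetical labels chosen for this illustration.</p>

```python
FPI = "-//JSF//DTD Instant Messaging with XHTML//EN"


def parse_fpi(fpi):
    """Split a Formal Public Identifier into the four fields described above."""
    registration, owner, public_text, language = fpi.split("//")
    # The third field holds the public text class, then the description.
    text_class, _, description = public_text.partition(" ")
    return {
        "registration": registration,
        "owner": owner,
        "public_text_class": text_class,
        "public_text_description": description,
        "language": language,
    }
```

<p>For the identifier defined in this section, the description field contains the string "XHTML" without beginning with it, which matches the naming recommendation cited above.</p>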
<p>A user agent that implements this specification MUST conform to Section 3.5 ("XHTML Family User Agent Conformance") of <cite>Modularization of XHTML</cite>. Many of the requirements defined therein are already met by Jabber clients simply because they already include XML parsers.</p>
<p>However, "ignore" has a special meaning in XHTML modularization (different from its meaning in XMPP). Specifically, criteria 4 through 6 of Section 3.5 of <cite>Modularization of XHTML</cite> state:</p>
<ol type='1' start='4'>
<li>
<p><em>W3C TEXT:</em> If a user agent encounters an element it does not recognize, it must continue to process the children of that element. If the content is text, the text must be presented to the user.</p>
<p><em>JSF COMMENT:</em> This behavior is different from that defined by <cite>XMPP Core</cite>, and in the context of XHTML-IM implementations applies only to XML elements qualified by the 'http://www.w3.org/1999/xhtml' namespace as defined herein. This criterion MUST be applied to all XHTML 1.0 elements except those explicitly included in XHTML-IM as described in the <link url="#def">XHTML-IM Integration Set</link> and <link url='#profile'>Recommended Profile</link> sections of this document. Therefore, an XHTML-IM implementation MUST process all XHTML 1.0 child elements of the XHTML-IM <html/> element even if such child elements are not included in the XHTML 1.0 Integration Set defined herein, and MUST present to the recipient the XML character data contained in such child elements.</p>
</li>
<li>
<p><em>W3C TEXT:</em> If a user agent encounters an attribute it does not recognize, it must ignore the entire attribute specification (i.e., the attribute and its value).</p>
<p><em>JSF COMMENT:</em> This criterion MUST be applied to all XHTML 1.0 attributes except those explicitly included in XHTML-IM as described in the <link url="#def">XHTML-IM Integration Set</link> and <link url='#profile'>Recommended Profile</link> sections of this document. Therefore, an XHTML-IM implementation MUST ignore all attributes of elements qualified by the 'http://www.w3.org/1999/xhtml' namespace if such attributes are not explicitly included in the XHTML 1.0 Integration Set defined herein.</p>
</li>
<li>
<p><em>W3C TEXT:</em> If a user agent encounters an attribute value it doesn't recognize, it must use the default attribute value.</p>
<p><em>JSF COMMENT:</em> Because none of the attributes included in XHTML-IM has a default value defined for it in <cite>XHTML 1.0</cite>, in practice this criterion does not apply to XHTML-IM implementations.</p>
</li>
</ol>
<p>For information regarding XHTML modularization in XML schema for the XHTML 1.0 Integration Set defined in this specification, refer to the <link url="#schemas-driver">Schema Driver</link> section of this document.</p>
</section2>
<section2topic='W3C Review'anchor='w3c-review'>
<p>The XHTML 1.0 Integration Set defined herein has been reviewed informally by an editor of the XHTML Modularization in XML Schema specification but has not undergone formal review by the W3C; before this specification proceeds to a status of Final within the Jabber Software Foundation's standards process, it should undergo a formal review through communication with the Hypertext Coordination Group within the W3C.</p>
<p>The W3C is actively working on &w3xhtml2; and may produce additional versions of XHTML in the future. This specification addresses XHTML 1.0 only, but it may be superseded or supplemented in the future by an XMPP Extension Protocol specification that defines methods for encapsulating XHTML 2.0 content in XMPP.</p>
<p>This specification formalizes and extends earlier work by Jeremie Miller and Julian Missig on XHTML formatting of Jabber messages. Many thanks to Shane McCarron for his assistance regarding XHTML modularization and conformance issues. Thanks also to contributors on the Standards-JIG list for their feedback and suggestions.</p>