<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
<!ENTITY % ents SYSTEM 'xep.ent'>
%ents;
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>In-Band Real Time Text</title>
<abstract>This is a specification for real-time text transmitted in-band over an XMPP session.</abstract>
<legal>
<copyright>This XMPP Extension Protocol is copyright (c) 1999 - 2011 by the XMPP Standards Foundation (XSF).</copyright>
<permissions>Permission is hereby granted, free of charge, to any person obtaining a copy of this specification (the &quot;Specification&quot;), to make use of the Specification without restriction, including without limitation the rights to implement the Specification in a software program, deploy the Specification in a network service, and copy, modify, merge, publish, translate, distribute, sublicense, or sell copies of the Specification, and to permit persons to whom the Specification is furnished to do so, subject to the condition that the foregoing copyright notice and this permission notice shall be included in all copies or substantial portions of the Specification. Unless separate permission is granted, modified works that are redistributed shall not contain misleading information regarding the authors, title, number, or publisher of the Specification, and shall not claim endorsement of the modified works by the authors, any organization or project to which the authors belong, or the XMPP Standards Foundation.</permissions>
<warranty>## NOTE WELL: This Specification is provided on an &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. In no event shall the XMPP Standards Foundation or the authors of this Specification be liable for any claim, damages, or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the Specification or the implementation, deployment, or other use of the Specification. ##</warranty>
<liability>In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall the XMPP Standards Foundation or any author of this Specification be liable for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising out of the use or inability to use the Specification (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if the XMPP Standards Foundation or such author has been advised of the possibility of such damages.</liability>
<conformance>This XMPP Extension Protocol has been contributed in full conformance with the XSF's Intellectual Property Rights Policy (a copy of which may be found at &lt;<link url='http://www.xmpp.org/extensions/ipr-policy.shtml'>http://www.xmpp.org/extensions/ipr-policy.shtml</link>&gt; or obtained by writing to XSF, P.O. Box 1641, Denver, CO 80201 USA).</conformance>
</legal>
<number>xxxx</number>
<status>ProtoXEP</status>
<type>Standards Track</type>
<sig>Standards</sig>
<approver>Council</approver>
<dependencies>
<spec>XMPP Core</spec>
<spec>XEP-0020</spec>
</dependencies>
<supersedes/>
<supersededby/>
<shortname>NOT_YET_ASSIGNED</shortname>
<author>
<firstname>Mark</firstname>
<surname>Rejhon</surname>
<email>markybox@gmail.com</email>
<jid>markybox@gmail.com</jid>
<org>Rejhon Technologies Inc.</org>
<uri>http://www.rejtech.com</uri>
</author>
<revision>
<version>0.0.1</version>
<date>2011-02-21</date>
<initials>MDR</initials>
<remark><p>First draft.</p></remark>
</revision>
</header>
<section1 topic='Introduction' anchor='intro'>
<p>This document introduces a specification for real-time text transmitted in-band over an XMPP session.</p>
<section2 topic='What is Real-Time Text?' anchor='intro-what'>
<p>Real-Time Text is text transmission that is sent as it is produced. In a real-time text chat conversation, the recipient can watch the sender type. Real-time text can be displayed in any suitable manner, for example in normal IM format, or as a split-screen chat.</p>
<p>Real-time text can lead to a more natural conversation format than the traditional instant messaging format, by allowing both parties to read each other's text "as written words are typed" -- much in the same way they can listen to a telephone conversation "as words are spoken". Although real-time text benefits everyone in many situations, it is especially favored by people who depend on text for communication (including deaf people who cannot use the telephone).</p>
</section2>
<section2 topic='Precedent' anchor='intro-precedent'>
<p>Real-time text is not new. It has been around for decades. Early chat software commonly utilized real-time text in a character-by-character format.</p>
<ul>
<li><p>The 'talk' command on UNIX systems has been using real-time text for many years, since the 1970s.</p></li>
<li><p>ICQ, the first major instant messaging application, had a split screen mode with real-time text back in 1996-1999 before this feature was removed.</p></li>
<li><p>Hobby BBS chat programs from the 1990s often utilized real-time text.</p></li>
<li><p>In 2008, it was implemented as part of AOL AIM 6.8 and higher as <span class='ref'>AOL Real-Time IM</span> <note>AOL AIM Real Time Text: &lt;<link url='http://help.aol.com/help/microsites/microsite.do?cmd=displayKC&amp;externalId=223568'>http://help.aol.com/help/microsites/microsite.do?cmd=displayKC&amp;externalId=223568</link>&gt;</note>.</p></li>
<li><p>In SIP calls, real-time text is defined and used with presentation coding as specified in <span class='ref'>ITU-T T.140</span> <note>ITU-T T.140: Protocol for multimedia application text conversation.</note> and transport specified in <span class='ref'>IETF RFC 4103</span> <note>IETF RFC 4103: RTP Payload for Text Conversation.</note>.</p></li>
</ul>
<p>This specification re-introduces real-time text to modern instant messaging networks based on XMPP. It is fully backwards compatible with XMPP chat clients and servers and allows for alternate chat representations, including split-screen chat. Real-time text provides improved convenience for people preferring a more real-time form of text when communicating over XMPP networks.</p>
<p>In addition, the introduction of a real-time text format in XMPP sessions may enable interoperability with real-time text in SIP, including next generation emergency services such as NG 911 and NG 112. Real-time text is also useful for telecommunications relay services for the deaf where operators translate between text for the deaf user, and speech for the voice phone user in a call.</p>
</section2>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<section2 topic='Goals' anchor='reqs-goals'>
<ol>
<li><p>Provide reliable real time text over today's XMPP networks.</p></li>
<li><p>Be backwards compatible with XMPP chat clients that do not support real time text.</p></li>
<li><p>Make it easy for developers/implementors to implement Real Time Text in steps, with a minimum of code.</p></li>
<li><p>Utilize in-band XMPP messaging, which makes delivery of real time text very simple.</p></li>
<li><p>Be compatible with multi-user chat, even when only some chat participants support real time text.</p></li>
<li><p>Allow multiple modes of chat, including traditional IM user interface, split-screen chat, and other modes.</p></li>
<li><p>Support real-time editing of chat text, since most non-real-time clients allow editing the message before sending.</p></li>
<li><p>Meet the quality requirements for real-time text. This is specified in <span class='ref'>ITU-T F.703</span> <note>ITU-T F.703: Multimedia conversational services.</note> with an end-to-end delay of less than two seconds and transmission loss of less than 0.2%.</p></li>
<li><p>Balance backwards compatibility and XMPP Server loading issues.</p></li>
</ol>
</section2>
<section2 topic='Data Efficiency' anchor='reqs-efficiency'>
<p>In-band real-time text over XMPP is less data-efficient than out-of-band real-time text (such as using XEP-0166 &xep0166; to negotiate an out-of-band connection that uses RFC 4103 / T.140). Potentially, a whole XMPP &lt;message&gt; could be transmitted for every character typed, which seems excessive for many chat network architectures. This specification therefore includes several techniques to maximize the quality of real-time text while minimizing the load on XMPP servers. The in-band approach is nevertheless attractive: it is much easier to implement real-time text completely within the scope of XMPP, and real-time performance improves the user experience of the IM service. Experiments with a working implementation (see below) have shown that it is easy to extend preexisting code bases to handle the real-time text extension, often with no modification to preexisting XMPP libraries (including available open-source XMPP libraries) used by existing XMPP chat clients.</p>
</section2>
<section2 topic='Server Performance' anchor='reqs-serverperf'>
<p>Many XMPP servers have sufficiently high performance to handle massive in-band data operations. Modern XMPP servers can handle in-band file transfers and VoIP via rapid consecutive &lt;message&gt; transfers encoding the binary data in base64 format, despite its obvious inefficiency. The transmission of short fragments of text through separate &lt;message&gt; transfers at regular intervals is a relatively light load on these modern XMPP servers (circa 2010).</p>
</section2>
<section2 topic='Real-Time Message Editing' anchor='reqs-editing'>
<p>Message editing is the process of editing a message before sending it. Nearly all instant messaging clients allow you to edit your message before you send it. The addition of real-time text capability requires that the edits be transmitted too. (In fact, AOL AIM's Real Time Text mode already does this.) The recipient can watch the sender edit the message before it is officially sent. This includes watching the other person do backspacing, mid-message deletes, cut and paste, insertions of typed text in the middle of text, etc.</p>
<p>Real-time message editing is included in this specification in a platform-independent manner. Although it adds extra requirements to the specification, it actually simplifies user interface modifications to preexisting instant messaging clients. This is because the preexisting chat entry text box can be used, without preventing the user from editing their messages when real-time text is enabled. The traditional instant-messaging user interface can be retained, except that the last message from the remote user is continually refreshed with real-time typing and edits.</p>
</section2>
<section2 topic='Low Software Complexity' anchor='reqs-complexity'>
<p>A major baseline goal of this specification is to provide real-time text functionality with minimal complexity for software developers. Software developers are able to stay within one architecture (XMPP and XMPP servers) and one data format (XML). This standard is designed to allow developers to continue utilizing their preexisting XMPP libraries when extending chat software to support Real Time Text.</p>
</section2>
<section2 topic='Multi-User Chat' anchor='reqs-muc'>
<p>This specification is designed to be compatible with group chats using XEP-0045 &xep0045;. Support for XEP-0045 is OPTIONAL. Chat is supported between mixed clients (that is, some clients supporting real-time text and some not). Situations warranting real-time text are mainly one-on-one chats, so clients supporting real-time text may be designed to support it only for one-on-one chats, even if the client supports multi-user chat. The full JID is used to distinguish between multiple real-time text streams, including multiple logins by the same user. Please note that, at this time, support for multi-user chat is extremely experimental, especially with regard to Feature Negotiation behavior.</p>
</section2>
<section2 topic='User Interface Considerations' anchor='reqs-ui'>
<p>Chat software SHOULD provide a UI mechanism to enable/disable real-time text, since the user may not want to send their conversation typing and edits in a real-time manner. Clients that implement real-time-text should have an option to turn on/off the real-time feature globally and/or per-conversation. Clients that turn on real-time-text globally by default as an intrinsic feature of a specific chat software, should clearly indicate the real-time nature of the chat software to the user in order to set end user expectations.</p>
<p>There are several ways to present an instant messaging conversation. Special considerations are needed to incorporate real time text into an existing instant messaging conversation. Several display presentations have been found to be practical for XMPP conversations with real-time text:</p>
<ol>
<li><p>Traditional IM, also known as IRC-style.</p></li>
<li><p>Split screen.</p></li>
<li><p>Hybrid IM user interface, same as traditional but with an extra text box for other people's real time text.</p></li>
<li><p>Vertical side by side columns, with the vertical position of messages based on relative time of entry.</p></li>
</ol>
<p>Implementors of Real Time Text are not required to provide all presentations of chat and can continue to only use the traditional IM user interface, for programming simplicity.</p>
<p>The author of this specification, Mark Rejhon, has submitted a working demo Jabber chat client implementing this specification to help accelerate understanding and adoption of the specification and to test experimental modifications to it. It supports traditional IM, split screen, and hybrid modes of text display. This client runs successfully on many XMPP networks including Google Talk. For more information, see: [Web link TBD -- contact Mark Rejhon for demo of software]</p>
</section2>
</section1>
<section1 topic='Protocol' anchor='protocol'>
<section2 topic='Functional Goals of the Specification'>
<ol>
<li><p>Provide a method of real time text chat transmission over XMPP networks in an in-band manner.</p></li>
<li><p>Provide a method of real time editing of messages.</p></li>
<li><p>Provide a method of detection of lost and/or out-of-order delivery of messages.</p></li>
<li><p>Provide a method of negotiation of the real-time text feature.</p></li>
<li><p>Provide basic information for enabling interoperability with real-time text in other protocol environments.</p></li>
</ol>
</section2>
<section2 topic='Quick Start' anchor='protocol-quickstart'>
<p>Real-Time Text (RTT) is transmitted via the &lt;rtt&gt; element in &lt;message&gt; transmission. Each &lt;rtt&gt; element has a message number, a sequence number, and a type attribute that indicates a state such as start of a new real time message. The element is reasonably compact, while maintaining readability.</p>
<p>Example transmission of "Hello World!" in four real time blocks sent separately, at a typical three key presses per block:</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hel</t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
<rtt msg='0' seq='1'><t>lo </t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
<rtt msg='0' seq='2'><t>Wor</t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
<rtt msg='0' seq='3'><t>ld!</t></rtt>
<body>Hello World!</body>
</message>
]]></code>
<p>Typical chat clients that do not support RTT will ignore the first three messages because they contain no &lt;body&gt; data. Such clients receive the full message only once, at the end, when the whole message is sent in the widely-adopted &lt;body&gt; element of a &lt;message&gt; transmission. Receiving clients that DO support real-time text SHOULD show the text immediately in real-time as the &lt;rtt&gt; messages are received. The number of characters in each &lt;rtt&gt; element can vary, though three characters per &lt;rtt&gt; message is illustrated here as an example. Characters SHOULD be buffered and then transmitted at regular intervals. For senders, the default interval SHOULD be 1 second (1000 milliseconds) in order to balance user perceptions of real-time text and possible overloading of the XMPP network. For more information about intervals, please see section <link url='#protocol-interval'>Interval of Transmission of Real-Time Text</link>.</p>
</section2>
<section2 topic='Format of the &lt;rtt&gt; Element' anchor='protocol-format'>
<code><![CDATA[
<message from='Alice' to='Bob' id='12345678' type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello</t></rtt>
</message>
]]></code>
<p>An &lt;rtt&gt; element is transmitted at a regular interval, once per second by default, and carries all key presses buffered since the last &lt;rtt&gt; element. There MAY also be multiple &lt;rtt&gt; elements in a single &lt;message&gt;. If there are no additions or changes to the text during the transmission interval, then no &lt;rtt&gt; element is transmitted.</p>
<section3 topic='msg Attribute'>
<p><em>Format: 32-bit unsigned integer as a string, incrementing after a message is completed and sent.</em></p>
<p>The <strong>msg</strong> attribute is the current message number of a specific participant's messages in a chat/group conversation. When a user starts a chat (or joins a group chat), the msg attribute MUST be 0 for the first chat message from that participant, and MUST increment every time a completed message is sent (i.e. via Enter/Send, where the full message is transmitted as a &lt;body&gt;).</p>
</section3>
<section3 topic='seq Attribute'>
<p><em>Format: 32-bit unsigned integer as a string, incrementing on every single &lt;rtt&gt; transmission.</em></p>
<p>The <strong>seq</strong> attribute is the session sequence number for &lt;rtt&gt; messages by the chat participant, since the chat session started. Every time a chat session starts, "seq" MUST be 0 for that participant, and MUST continually increment until the participant leaves chat.</p>
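<p>As an illustrative sketch (reusing the placeholder Alice and Bob addresses from the Quick Start; the text fragments are made up), the following shows how the two counters might evolve across two short messages. The <strong>seq</strong> value keeps counting for the whole session, while <strong>msg</strong> advances only after the message carrying the &lt;body&gt; has been sent.</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='0' type='new'><t>Hi</t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='1'><t>!</t></rtt>
  <body>Hi!</body>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='1' seq='2' type='new'><t>How</t></rtt>
</message>
]]></code>
<p>The third stanza starts the next message, so it carries msg='1' together with type='new', while seq simply continues from 1 to 2.</p>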
</section3>
<section3 topic='type Attribute'>
<p><em>Format: Predefined string value:</em><br/>
<em>'new' to indicate a new message, the first &lt;rtt&gt; element of a new message. (msg value increments)</em><br/>
<em>'reset' to clear the current message and start it over. (msg value does not increment)</em></p>
<p>The <strong>type</strong> attribute defines a special type of &lt;rtt&gt; element. Upon encountering either type='new' or type='reset', any preexisting message in the real-time text buffer MUST be cleared to make way for the message. An attribute value of 'reset' behaves like 'new' except that it replaces the current message instead of beginning the next message. The next &lt;rtt&gt; element in a subsequent &lt;message&gt; after a message containing a &lt;body&gt; MUST have type='new'. Other values for type may be defined in the future. To maintain backwards compatibility, an unrecognized value for the "type" attribute should be considered an out-of-sync condition until the next &lt;rtt&gt; element with type='new' or type='reset' is received. The purpose of 'reset' is to provide a means of error recovery from out-of-sync conditions, which is useful during long periods of typing in the same message on an error-prone medium. Senders are NOT required to use type='reset'. Recipients MUST be able to process &lt;rtt&gt; elements with type='reset'.</p>
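<p>For illustration, assume Alice has so far typed "Hello Wor" in message 0 using seq values 0 through 2, and that her client chooses to resynchronize. These values are assumptions made for this sketch only. A reset could then retransmit the whole current text without advancing msg:</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='3' type='reset'><t>Hello Wor</t></rtt>
</message>
]]></code>
<p>On receipt, the recipient discards whatever it had buffered for message 0 and displays "Hello Wor"; later &lt;rtt&gt; elements continue from that text.</p>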
</section3>
<section3 topic='Inner Text of &lt;rtt&gt; Element'>
<p>This is defined as the text fragment, akin to a small fragment of a &lt;body&gt;, normally to be appended at the end of the conversation text. The same coding rules apply for these fragments as for the &lt;body&gt;. The coding is specified to be of MIME type text/plain in UNICODE UTF-8 transform as specified in XEP-0071 &xep0071;. In order to allow backspacing, deletes and text editing in real time, as well as cursor movements, Edit Codes are used. The above example uses Edit Code &lt;t&gt; which represents Insert Text. For more information about Edit Codes, please see the following section <link url='#protocol-editcodes'>Edit Codes For Real-Time Text</link>. Further extensions at a later time may define how real-time text transmission of HTML coded messages can be supported.</p>
</section3>
<section3 topic='Summary Of Attribute Rules'>
<ul>
<li><p><strong>type='new'</strong> MUST be used for the first &lt;rtt&gt; element of a new message, to indicate the start of a new message.</p></li>
<li><p><strong>msg</strong> MUST increment every time after a completed message transmission (&lt;body&gt; transmitted in a &lt;message&gt;).<br/>
The first &lt;rtt&gt; element after a completed message transmission MUST also have a <strong>type='new'</strong></p></li>
<li><p><strong>seq</strong> MUST increment during every &lt;rtt&gt; transmission from the beginning of the XMPP session.</p></li>
<li><p><strong>type='reset'</strong> MUST clear the contents of the real time message buffer. The <strong>msg</strong> value MUST NOT change.</p></li>
<li><p><strong>type</strong> with an unrecognized value MUST be considered an "out of sync" condition. The easiest recovery technique is to simply pause real-time text until recovered with the next &lt;rtt&gt; with <strong>type='new'</strong> or <strong>type='reset'</strong>.</p></li>
<li><p><strong>seq</strong> MUST be monitored to detect a non-incrementing number, which signals missing or out-of-order &lt;rtt&gt; elements. This indicates an "out of sync" condition, and real-time text is paused until recovered.</p></li>
<li><p><strong>seq</strong> MUST be used to ignore duplicate &lt;rtt&gt; elements. The easiest and minimal technique is to ignore &lt;rtt&gt; messages containing an identical or lesser sequence number than the previous &lt;rtt&gt; element.</p></li>
<li><p><strong>seq</strong> MAY be used to recover from out-of-order &lt;rtt&gt; elements, via OPTIONAL buffering and sorting of previously delivered &lt;rtt&gt; elements.</p></li>
<li><p><strong>msg</strong> MAY be used as supplementary info for missing-message detection, and to manipulate an array of messages by the user.</p></li>
</ul>
<p>For more information about improved recovery techniques for missing, duplicate, and out-of-order &lt;rtt&gt; elements, please see section <link url='#protocol-errorrecovery'>Error Recovery of Real Time Text</link>.</p>
</section3>
</section2>
<section2 topic='Usage of &lt;body&gt; and/or &lt;html&gt;' anchor='protocol-body'>
<p>In normal real-time-text transmissions, &lt;message&gt; transmissions containing &lt;rtt&gt; elements do not contain a &lt;body&gt; until the user actually commits/completes the message (i.e. by clicking a Send button or hitting Enter in a chat application). For a message considered "complete", there MUST be a &lt;body&gt; element containing the entire message of text.</p>
<p>Data in a &lt;body&gt; is essentially redundant due to all the fragments of text having been transmitted in real time in &lt;rtt&gt; elements. However, the transmission of full messages in traditional &lt;body&gt; and/or &lt;html&gt; elements, have several useful purposes:</p>
<ol>
<li><p>Maintain backwards compatibility with chat clients that do not support this specification;</p></li>
<li><p>Allow group chats to function in a mixed manner (not all users in a group chat are required to have software that supports real-time text);</p></li>
<li><p>Signal to the recipient(s) that the message is complete;</p></li>
<li><p>Allow recovery from lost real-time messages and out-of-sync conditions;</p></li>
<li><p>Allow verification of the correctness of real-time messages for debugging purposes;</p></li>
<li><p>Allow continued compatibility with XEP-0071: XHTML-IM, for all completed messages.</p></li>
</ol>
<p>Upon receipt of &lt;body&gt; (or &lt;html&gt;), the real-time text SHOULD immediately be replaced by the final text within &lt;body&gt; (or &lt;html&gt;). In a functioning implementation with no loss of &lt;rtt&gt; messages, the final real time text string should be exactly the same as the text string embedded in &lt;body&gt;.</p>
<p>Although &lt;html&gt; can optionally be used along with &lt;body&gt;, the real-time portion &lt;rtt&gt; is essentially small fragments of the plain text &lt;body&gt; transmitted at regular intervals, and does not currently support HTML. Real time HTML may be added to this specification in the future as an OPTIONAL feature, negotiated via XEP-0020 feature negotiation.</p>
</section2>
<section2 topic='Rules for &lt;rtt&gt; sent with &lt;message&gt;' anchor='protocol-rules'>
<ol>
<li><p>A &lt;message&gt; containing the final &lt;rtt&gt; update completing a message, MUST also contain a &lt;body&gt; element transmitting the full message.</p></li>
<li><p>An &lt;rtt&gt; element MAY also be empty, in which case, it should be considered to have a blank text string (no change to real time text). This situation can happen if sending a &lt;message&gt; with a &lt;body&gt; that has no real time text changes since the last &lt;rtt&gt; transmission.</p></li>
<li><p>A &lt;message&gt; SHOULD contain only one &lt;rtt&gt; element. For backwards compatibility with any potential future redundancy extension (not yet documented), clients MUST process only the first &lt;rtt&gt; element of a &lt;message&gt;.</p></li>
<li><p>For outgoing messages, software on both ends of a two-person conversation MUST keep track of their respective "msg" and "seq" values, as they are relative to the sending software's own messages. All &lt;rtt&gt; messages MUST contain both of these 2 attributes.</p></li>
<li><p>For incoming messages, software that implements this specification MUST process "seq", but is NOT REQUIRED to process "msg" attributes. "seq" is necessary to maintain sync accuracy. However, "msg" serves mainly to enhance the functionality and reliability of Real Time Text.</p></li>
<li><p>All numeric attributes ("msg" and "seq") SHOULD be 32-bit unsigned integers. In the unlikely event of hitting the maximum integer value (4294967295), they will wrap to 0. Software SHOULD check for and recover from this condition, however unlikely it is to occur in a single conversation session between humans. When the relative values of attributes are evaluated, counters that have recently wrapped around shall be considered higher than those approaching the maximum value.</p></li>
</ol>
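<p>The following sketch illustrates rules 1, 2 and 6 together; the counter values are chosen purely for illustration. The user presses Send without having typed anything since the previous transmission, so the final &lt;rtt&gt; element is empty, and seq happens to wrap from its maximum value to 0.</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='7' seq='4294967295'><t>ld!</t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='7' seq='0'/>
  <body>Hello World!</body>
</message>
]]></code>
<p>A recipient would treat seq='0' here as the successor of 4294967295 rather than as a duplicate or a gap, and would replace its real-time text with the contents of &lt;body&gt;.</p>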
<p>Legacy chat clients do not recognize and will ignore the &lt;rtt&gt; elements of &lt;message&gt; transmissions, and behave normally with the &lt;body&gt; element of &lt;message&gt; transmissions. That is, they will behave in the usual message-at-a-time manner.</p>
</section2>
<section2 topic='Error Recovery of Real Time Text' anchor='protocol-errorrecovery'>
<p>Overloaded XMPP servers are known to fail to deliver messages, potentially randomly. In addition, it has been observed that some XMPP servers deliver messages out-of-order. This happens when congestion occurs, and then consecutive messages get suddenly delivered all at once, with the same time stamp, and not all servers compensate for this. Therefore, this specification documents a method of detecting and optionally recovering from this condition.</p>
<section3 topic='Ignoring Duplicate &lt;rtt&gt; Elements'>
<p>An &lt;rtt&gt; element is considered a duplicate when the <strong>seq</strong> attribute is equal to, or less than, that of the last successfully processed &lt;rtt&gt; element. The duplicate should be ignored and not displayed, and processing should resume on the next &lt;rtt&gt; element that is received.</p>
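<p>For example, if the same element were delivered twice (the values below are illustrative only), the second copy would carry a seq that is not greater than the last one processed, so it would be dropped:</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='1'><t>lo </t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='1'><t>lo </t></rtt>
</message>
]]></code>
<p>Without this check, the recipient would append "lo " to the displayed text twice.</p>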
</section3>
<section3 topic='Entering an &quot;OUT OF SYNC&quot; Condition'>
<p>The real time text is considered "OUT OF SYNC" when any of the following happens with an &lt;rtt&gt; element:</p>
<ol>
<li><p>(REQUIRED) The <strong>seq</strong> attribute increases by 2 or more. (missing message)</p></li>
<li><p>(REQUIRED) The <strong>type</strong> attribute contains an unrecognized value.</p></li>
<li><p>(RECOMMENDED) The <strong>msg</strong> attribute changes without an attribute <strong>type='new'</strong>.</p></li>
</ol>
</section3>
<section3 topic='Exiting the &quot;OUT OF SYNC&quot; Condition'>
<p>When "OUT OF SYNC" occurs, a client MUST immediately pause updating of real time text, until getting back into sync through any ONE of the following:</p>
<ol>
<li><p>(REQUIRED) The next &lt;rtt&gt; element containing a <strong>type='new'</strong> or <strong>type='reset'</strong>. This MUST clear the real-time text message buffer and replace it with the specified real-time text.</p></li>
<li><p>(OPTIONAL) Any &lt;rtt&gt; element of same <strong>msg</strong> value, that begins with a &lt;r/&gt; clear-message edit code. (This is considered an equivalent of a <strong>type='reset'</strong> in behaviour). This is because the whole message state is reset, and the message state is now determinate, regardless of missed &lt;rtt&gt; messages.</p></li>
<li><p>(OPTIONAL) A client MAY optionally buffer unprocessed &lt;rtt&gt; in an attempt to capture missing &lt;rtt&gt; messages (via collecting messages from out-of-order message delivery) and attempt to successfully catch up. While doing this, it should not preclude other methods of getting back into sync. If another method of getting back into sync occurs, this buffer should then be discarded immediately.</p></li>
</ol>
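<p>The following sequence sketches one way into and out of the condition, using illustrative values. It assumes that an earlier element with seq='0' delivered "Hel" and that the element with seq='2' is never delivered.</p>
<code><![CDATA[
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='1'><t>lo </t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='3'><t>ld!</t></rtt>
</message>
<message from='Alice' to='Bob' type='chat'>
  <rtt msg='0' seq='4' type='reset'><t>Hello World!</t></rtt>
</message>
]]></code>
<p>After the first stanza the recipient shows "Hello ". The second stanza is not displayed because seq jumped by 2, which puts the recipient "OUT OF SYNC". The third stanza clears the buffer and replaces it with "Hello World!", which restores sync.</p>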
</section3>
<section3 topic='Client Guidelines For Visual Representation of an &quot;OUT OF SYNC&quot; Condition'>
<ol>
<li><p>During an "OUT OF SYNC" condition, the client MUST cease to visually display subsequent &lt;rtt&gt; messages until the "OUT OF SYNC" condition is resolved.</p></li>
<li><p>A client MAY optionally display an indicator or message to indicate an "OUT OF SYNC" condition. In this case, a client MUST subsequently clear the indicator, when the "OUT OF SYNC" condition is resolved for the message.</p></li>
<li><p>Upon receipt of the next &lt;message&gt; containing a &lt;body&gt;, the stalled real time text should be replaced with this complete message. This is a handy form of error-correction for lost real time text.</p></li>
</ol>
<p>In practice, in an "OUT OF SYNC" condition, the end-user of an XMPP chat client would simply experience the real-time-text being temporarily disabled until the full message is sent. This is usually harmless because the message is re-transmitted as a normal &lt;body&gt; when the sender hits Enter or clicks Send. In extreme situations, where conversation is significantly disrupted, it is already likely that regular messages containing a &lt;body&gt; will be dropped, even in conversations without real-time text.</p>
<p>For better reliability after an "OUT OF SYNC" condition, clients MAY optionally buffer unprocessed &lt;rtt&gt; elements, until the missing &lt;rtt&gt; elements are delivered, such as delayed or out-of-order &lt;message&gt; text. For simplicity of programming this is NOT required, as most well-written XMPP servers do not deliver messages out-of-order, and many high-performance XMPP servers such as those run by Google, are now reliable enough to handle real-time text.</p>
<p>For software that implements XEP-0184 &xep0184; in conjunction with this specification, it is NOT RECOMMENDED to use XEP-0184 for &lt;message&gt; stanzas containing only &lt;rtt&gt; (and no other information) due to bandwidth and server loading considerations.</p>
</section3>
</section2>
<section2 topic='Feature Negotiation' anchor='protocol-negotiation'>
<p>The mechanism of feature negotiation is XEP-0020 &xep0020;. FORM_TYPE must be set to &quot;realtimetext&quot;.</p>
<section3 topic='Available Features'>
<table caption='Available Features' border='0'>
<tr>
<th>Field</th>
<th>Requirement</th>
<th>Entity To Query</th>
<th>Type</th>
<th>Default</th>
</tr>
<tr>
<td>enable</td>
<td>REQUIRED</td>
<td>Client</td>
<td>boolean</td>
<td>1</td>
</tr>
<tr>
<td>typingdelays</td>
<td>RECOMMENDED</td>
<td>Client</td>
<td>boolean</td>
<td>0</td>
</tr>
<tr>
<td>interval</td>
<td>OPTIONAL</td>
<td>Server,&nbsp;Client</td>
<td>text-single</td>
<td>1000</td>
</tr>
</table>
<section4 topic='enable'>
<p>This REQUIRED field is used to initiate a real time text session. To save bandwidth and server resources, clients SHOULD NOT begin sending &lt;rtt&gt; elements until a reply of 1 is received for 'enable'.</p>
<ul>
<li>
<p><em>Initiating a Real Time Text session</em><br/>
Clients that wish to initiate a real time text session, MUST send a value of 1.<br/>
Clients that wish to accept a sender's request to initiate real time text, MUST reply with a value of 1.<br/>
Clients that wish to reject a sender's request to initiate real time text, MUST reply with a value of 0.<br/>
Clients that do not support real time text will reply with an error (XEP-0020).</p>
</li>
<li>
<p><em>Terminating a Real Time Text session</em><br/>
Clients that wish to end a real time text session in progress, MUST send a value of 0.<br/>
Clients that acknowledge a request to end a real time text session, MUST reply with a value of 0.</p>
</li>
<li>
<p><em>Querying for Real Time Text support</em><br/>
Clients that wish to query whether real time text is supported, MUST send no value (send a blank string).<br/>
Clients that support real time text, MUST reply with the current or preferred value.<br/>
Clients that do not support real time text will reply with an error (XEP-0020).<br/>
It is NOT REQUIRED to query first before attempting to initiate a real time text session.</p>
</li>
</ul>
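<p>As an illustration only, the two cases not covered by the general query format (terminating a session and querying for support) might look as follows. These sketches reuse the field layout of the &quot;Format of Query&quot; section; the addresses and the 'id' values are placeholders rather than values defined by this specification.</p>
<code><![CDATA[
<!-- Terminating a session in progress: 'enable' is sent with a value of 0 -->
<iq type='set' from='Alice' to='Bob' id='rttend'>
  <feature xmlns='http://jabber.org/protocol/feature-neg'>
    <x xmlns='jabber:x:data' type='form'>
      <field var='FORM_TYPE' type='hidden'>
        <value>realtimetext</value>
      </field>
      <field var='enable' type='boolean'>
        <value>0</value>
      </field>
    </x>
  </feature>
</iq>

<!-- Querying for support: 'enable' is sent with a blank value -->
<iq type='set' from='Alice' to='Bob' id='rttsupport'>
  <feature xmlns='http://jabber.org/protocol/feature-neg'>
    <x xmlns='jabber:x:data' type='form'>
      <field var='FORM_TYPE' type='hidden'>
        <value>realtimetext</value>
      </field>
      <field var='enable' type='boolean'>
        <value/>
      </field>
    </x>
  </feature>
</iq>
]]></code>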
</section4>
<section4 topic='typingdelays'>
<p>This RECOMMENDED field is to announce client support for <link url="#naturaltyping">Natural Typing Mode</link> (transmission of inter-character delays between key presses). Delay escape codes are used to transmit key press delays, for more natural playback of typing, even during long intervals between &lt;rtt&gt; elements.</p>
<ul>
<li><p>Clients that support the delay escape code MUST respond with a value of 1 (&quot;true&quot;).</p></li>
<li><p>Unless a client responds with a value of 1 (&quot;true&quot;), it is RECOMMENDED that delay escape codes not be sent to this client.</p></li>
</ul>
</section4>
<section4 topic='interval'>
<p>This OPTIONAL field is for a server to announce a preferred &lt;rtt&gt; sending interval, if the interval is other than the default of 1000 milliseconds. A server being queried for this value MAY respond with a numeric value in milliseconds between &lt;rtt&gt; packets. Clients SHOULD NOT send messages containing only &lt;rtt&gt; elements more frequently than this interval value (in milliseconds). A client being queried for this value MAY respond with a numeric value in milliseconds between &lt;rtt&gt; packets.</p>
<p>Different interval values could be used in different network scenarios, such as:</p>
<ul>
<li><p>Reduced intervals: LAN XMPP servers, specialized servers, vendors who operate high-performance servers.</p></li>
<li><p>Increased intervals: Clients running on mobile devices, Servers that are overloaded.</p></li>
<li><p>An interval of 0 is allowed, to allow clients to send every single key press in individual &lt;rtt&gt; elements, regardless of typing speed. This is useful for LAN-based XMPP servers that want to maximize responsiveness of real time text.</p></li>
<li><p>Intervals may be renegotiated by the server at any time during the middle of an XMPP conversation. For more information, please see section <link url='#protocol-interval'>Interval of Transmission of Real-Time Text</link>.</p></li>
</ul>
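<p>As a hedged illustration of non-default values, a client that supports Natural Typing Mode and prefers a 300 ms interval might answer a negotiation request as sketched below. The sketch assumes the reply is returned as an IQ of type 'result' carrying a data form of type 'submit', as in XEP-0020; the addresses, the 'id' and the chosen values are placeholders.</p>
<code><![CDATA[
<iq type='result' from='Bob' to='Alice' id='rttquery'>
  <feature xmlns='http://jabber.org/protocol/feature-neg'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='FORM_TYPE' type='hidden'>
        <value>realtimetext</value>
      </field>
      <field var='enable' type='boolean'>
        <value>1</value>
      </field>
      <!-- Delay codes are understood by this client -->
      <field var='typingdelays' type='boolean'>
        <value>1</value>
      </field>
      <!-- Preferred minimum spacing between <rtt> transmissions, in milliseconds -->
      <field var='interval' type='text-single'>
        <value>300</value>
      </field>
    </x>
  </feature>
</iq>
]]></code>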
</section4>
</section3>
<section3 topic='Format of Query'>
<code><![CDATA[
<iq type='set' from='Alice' to='Bob' id='rttquery'>
  <feature xmlns='http://jabber.org/protocol/feature-neg'>
    <x xmlns='jabber:x:data' type='form'>
      <field var='FORM_TYPE' type='hidden'>
        <value>realtimetext</value>
      </field>
      <field var='enable' type='boolean'>
        <value>1</value>
        <required/>
      </field>
      <field var='typingdelays' type='boolean'>
        <value>0</value>
      </field>
      <field var='interval' type='text-single'>
        <value>1000</value>
      </field>
    </x>
  </feature>
</iq>
]]></code>
</section3>
<section3 topic='Format of Query Reply'>
<code><![CDATA[
<iq type='result' from='Bob' to='Alice' id='rttquery'>
  <feature xmlns='http://jabber.org/protocol/feature-neg'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='FORM_TYPE' type='hidden'>
        <value>realtimetext</value>
      </field>
      <field var='enable' type='boolean'>
        <value>1</value>
      </field>
      <field var='typingdelays' type='boolean'>
        <value>0</value>
      </field>
      <field var='interval' type='text-single'>
        <value>1000</value>
      </field>
    </x>
  </feature>
</iq>
]]></code>
</section3>
<section3 topic='Other Features Under Discussion'>
<ol>
<li><p>OPTIONAL negotiation may be done to permit retroactive real time editing of previously-sent messages of chat, i.e. allowing cursor up/down in a split-screen chat.</p></li>
<li><p>OPTIONAL negotiation may be done to enable retrieval of missing messages, for either error-recovery, or to re-populate the local chat buffer from the remote end after a disconnection/crash, or to allow late-joining participants of a group chat (which may be split-screen based) to read the contents of other participants' chat buffers (if retroactive history is enabled/permitted for that particular conversation/group/room).</p></li>
<li><p>OPTIONAL negotiation may be done to limit scope of editing. This may be needed to make interoperability with other forms of real-time text feasible. Some T.140 systems may only allow editing of the last 800 characters typed, with erasure of the last character of the message as the only allowable edit operation on already transmitted text (e.g. T.140, TTY and other legacy text telephones).</p></li>
</ol>
</section3>
</section2>
<section2 topic='Edit Codes For Real-Time Text' anchor='protocol-editcodes'>
<section3 topic='Purpose of Edit Codes'>
<p>With Real Time Text, you can watch the remote user edit their own message of chat text, including backspacing, deleting, inserting of text, and cursor movements.</p>
<p>Most chat clients, including both XMPP and proprietary instant messengers, allow you to edit your message before sending the message. The introduction of real time functionality to existing chat client software must not degrade a user's preexisting ability to edit their messages before sending.</p>
<p>Edit Codes are XML elements that define edit operations including backspacing, deleting of text, inserting of text, cursor movements, and inter-key press delays. For efficiency, Edit Codes are kept as short as possible with single character tags and attributes. XMPP Real Time Text is based on a series of zero or more Edit Codes. Even plain typing at the end of a message is expressed as an Edit Code: a &lt;t&gt; element defining a single Insert Text operation, as illustrated in the preceding examples.</p>
</section3>
<section3 topic='Editing Bounds'>
<p>The message editing boundary is confined to the current instant message being composed. Backspaces at the beginning of the message are ignored, and do not concatenate the current message with the previous message. Cursor movements are constrained to the current message only and not to previous messages. Attributes within XML tags can be defined in any sequence.</p>
</section3>
<section3 topic='List of Edit Codes'>
<table caption='List of Edit Codes'>
<tr>
<th>Action</th>
<th>Code</th>
<th>Description</th>
</tr>
<tr>
<td><p>Insert&nbsp;Text</p></td>
<td><p>&lt;t&nbsp;p='#'&gt;text&lt;/t&gt;</p></td>
<td><p>Insert specified <strong>text</strong> at position <strong>p</strong> in message.<br/> <strong>p</strong> may be omitted to append text to the end of the message.</p>
</td>
</tr>
<tr>
<td><p>Backspace</p></td>
<td><p>&lt;e&nbsp;p='#'&nbsp;n='#'/&gt;</p></td>
<td width="374">
<p>This deletes <strong>n</strong> characters of text to the <u>left</u> of position <strong>p</strong> in message.<br/>
Either or both of <strong>n</strong> and <strong>p</strong> may be omitted.<br/>
Default (<strong>n</strong> = 1) to backspace 1 character.<br/>
Default (<strong>p</strong> = length of message) to backspace from end of message.</p>
</td>
</tr>
<tr>
<td><p>Delete</p></td>
<td><p>&lt;d&nbsp;p='#'&nbsp;n='#'/&gt;</p></td>
<td width="374">
<p>This deletes <strong>n</strong> characters of text to the <u>right</u> of position <strong>p</strong> in message.<br/>
Attribute <strong>n</strong> may be omitted, while <strong>p</strong> is required.<br/>
Default (<strong>n</strong> = 1) to delete 1 character.</p>
</td>
</tr>
<tr>
<td><p>Reset</p></td>
<td><p>&lt;r/&gt;</p></td>
<td><p>Clears the entire message.</p></td>
</tr>
<tr>
<td><p>Cursor&nbsp;Position</p></td>
<td><p>&lt;c&nbsp;p='#'/&gt;</p></td>
<td><p>Move cursor to position <strong>p</strong> in message.</p></td>
</tr>
<tr>
<td><p>Delay</p></td>
<td><p>&lt;w&nbsp;n='#'/&gt;</p></td>
<td><p>Execute a pause of <strong>n</strong> hundredths of a second.</p></td>
</tr>
</table>
<p>Clients are REQUIRED to support and process all Edit Codes, except where otherwise indicated.</p>
<section4 topic='Insert Text'>
<p><code><![CDATA[<t p='#'>text</t>]]></code></p>
<p><strong>Behavior:</strong> Inserts the text between &lt;t&gt; and &lt;/t&gt; at character position number specified by the <em>p</em> attribute.<br/>
If <em>p</em> is omitted, the text is appended to the end of the current message.</p>
<p><strong>Detail:</strong> Real time text transmission of typing, even at one key press at a time. In most cases, typing occurs at the end of a message. However, during message editing, corrections and text insertions may be made in the middle of a message, which can be specified by attribute <em>p</em>. A <em>p</em> value of 0 represents the beginning of message.</p>
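<p>The behavior described above can be traced with a short sequence of Insert Text codes. This is only a sketch with made-up message text; each comment shows the state of the message after the code on that line has been applied, starting from an empty message.</p>
<code><![CDATA[
<t>Hello</t>         <!-- "Hello"             p omitted: append to end -->
<t> World</t>        <!-- "Hello World"       p omitted: append to end -->
<t p='5'>,</t>       <!-- "Hello, World"      insert at position 5 -->
<t p='0'>Oh! </t>    <!-- "Oh! Hello, World"  position 0 is the start of the message -->
]]></code>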
</section4>
<section4 topic='Backspace'>
<p><code><![CDATA[<e n='#' p='#'/>]]></code></p>
<p><strong>Behavior:</strong> Backspaces <em>n</em> times from position <em>p</em> in message. Either or both of <em>n</em> and <em>p</em> can be omitted.<br/>
If <em>n</em> is omitted, erase 1 character from position <em>p</em> in message. If <em>p</em> is omitted, erase from the end of message.</p>
<p><strong>Detail:</strong> This specific code behaves like a backspace key, which erases text to the <u>left</u> of position <em>p</em>. Any text at position <em>p</em> and to the right, is dragged towards the left. Also, omitting both attributes and using a short code &lt;e/&gt; should erase 1 character from the end of the message. Excess backspaces (where <em>n</em> exceeds the number of characters to the left of position <em>p</em>) are ignored.</p>
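<p>As a worked sketch of the Backspace code, assume the message currently reads "Hello World" (11 characters). The text is illustrative; each comment shows the message after the code on that line has been applied. The last line assumes that excess backspaces are simply dropped once the beginning of the message is reached.</p>
<code><![CDATA[
<e/>               <!-- "Hello Worl"  both attributes omitted: 1 character from the end -->
<e n='3'/>         <!-- "Hello W"     3 characters from the end -->
<e p='5' n='2'/>   <!-- "Hel W"       the 2 characters left of position 5 ("lo") -->
<e p='1' n='4'/>   <!-- "el W"        only 1 character lies left of position 1; the excess is ignored -->
]]></code>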
</section4>
<section4 topic='Delete'>
<p><code><![CDATA[<d n='#' p='#'/>]]></code></p>
<p><strong>Behavior:</strong> Deletes <em>n</em> characters from position <em>p</em> in message. If <em>n</em> is omitted, delete <em>1</em> character.</p>
<p><strong>Detail:</strong> This code behaves like the delete key rather than the backspace key, so text to the <u>right</u> of position <em>p</em> is erased. Excess deletes (where <em>n</em> exceeds the number of characters to the right of position <em>p</em>) are ignored.</p>
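<p>As a worked sketch of the Delete code, assume the message currently reads "Hello, World" (12 characters). The text is illustrative; each comment shows the message after the code on that line has been applied. The last line assumes that excess deletes are simply dropped once the end of the message is reached.</p>
<code><![CDATA[
<d p='5'/>          <!-- "Hello World"  n omitted: 1 character at position 5 (the comma) -->
<d p='5' n='6'/>    <!-- "Hello"        the 6 characters from position 5 onwards -->
<d p='3' n='10'/>   <!-- "Hel"          only 2 characters lie right of position 3; the excess is ignored -->
]]></code>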
</section4>
<section4 topic='Reset'>
<p><code><![CDATA[<r/>]]></code></p>
<p><strong>Behavior:</strong> Erases the whole message.</p>
<p><strong>Detail:</strong> This code clears the message without needing to know how long the message is.</p>
</section4>
<section4 topic='Cursor Position'>
<p><code><![CDATA[<c p='#'/>]]></code></p>
<p><strong>Behavior:</strong> Moves cursor (caret) to the character position <em>p</em> in message.</p>
<p><strong>Detail:</strong> This RECOMMENDED code allows receiving clients to optionally display a cursor in received real time text. None of the prior codes (Insert, Backspace, Delete, Reset) require knowledge of this cursor position. Recipients are RECOMMENDED to interpret this code. Recipients MUST ignore this code if not supported. Senders SHOULD send this code when the cursor is at a different position from the one implied by the last Edit Code.</p>
</section4>
<section4 topic='Delay'>
<p><code><![CDATA[<w n='#'/>]]></code></p>
<p><strong>Behavior:</strong> Executes a delay of <em>n</em> hundredths of a second. (Natural Typing mode)</p>
<p><strong>Detail:</strong> This RECOMMENDED code, used for <link url="#naturaltyping">Natural Typing Mode</link>, makes it possible to transmit original inter-character key press delays, to allow smooth display of text, instead of a bursty display of text, even at long buffering intervals and/or on a high-latency connection. Recipients are RECOMMENDED to interpret this code. Recipients MUST ignore this code if not supported. Senders SHOULD send these codes to indicate the delay that occurred between keypresses and/or edit operations.</p>
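<p>A minimal sketch of Delay codes placed between Insert Text codes is shown below. The delay values are made up for illustration; a recipient that supports the code would pause for the stated time before applying the next code.</p>
<code><![CDATA[
<t>H</t>
<w n='20'/>    <!-- pause for 20 hundredths of a second (200 ms) -->
<t>i</t>
<w n='150'/>   <!-- pause for 150 hundredths of a second (1.5 seconds) -->
<t>!</t>
]]></code>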
</section4>
</section3>
<section3 topic='Rules for Edit Codes' anchor='rules-editcodes'>
<ol>
<li><p>Nesting of XML elements is not allowed. All Edit Codes must be embedded only 1 level deep in the &lt;rtt&gt; tree.</p></li>
<li><p>The Insert Text element &lt;t&gt; SHOULD NOT contain any nested XML elements.</p></li>
<li><p>Unrecognized XML elements within &lt;rtt&gt; MUST be ignored, including any nested text within these tags.</p></li>
<li><p>Unrecognized attributes within supported XML elements MUST be ignored, for future compatibility.</p></li>
<li><p>Attributes MUST be represented as 32-bit unsigned integers within a range of 0 through 4294967295.</p></li>
<li><p>Newline Control Codes are allowed in the inner text of &lt;t&gt;, and are treated as a single character that can be erased by either &lt;d&gt; or &lt;e&gt; codes. (For more information, see <link url='#rules-unicode-controlcodes'>Rules for Unicode Control Codes</link> below.)</p></li>
</ol>
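<p>The forward-compatibility rules above can be illustrated with a sketch that mixes known and unknown markup. The element &lt;zz&gt; and the attribute 'zz' are hypothetical names invented for this example and are not defined by this specification; the &lt;rtt&gt; attributes are copied from the example in this specification.</p>
<code><![CDATA[
<rtt msg='0' seq='0' type='new'>
  <t>Hi</t>
  <zz q='1'>not shown</zz>    <!-- unrecognized element: ignored, including its inner text -->
  <t p='0' zz='7'>Oh, </t>    <!-- unrecognized attribute: ignored, the text is still inserted -->
</rtt>
<!-- Resulting real time text: "Oh, Hi" -->
]]></code>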
</section3>
<section3 topic='Rules for Positions' anchor='rules-positions'>
<ol>
<li><p>All position values for real time text, MUST correspond to the character index into the string of the real time text in UTF-16 format, with all XML entities already decoded (i.e. &amp;lt; &amp;gt; &amp;amp; decoded into their corresponding characters). UTF-8 must be decoded to UTF-16 before determining cursor position. For more information, see <link url='#i18n'>Internationalization Considerations</link> in this specification.</p></li>
<li><p>Sender clients SHOULD send &lt;c p='#'/&gt; cursor position updates at regular intervals, if the cursor moves, even if the message text does not change. This allows receivers to optionally visually display a cursor (caret).</p></li>
<li><p>Recipient clients MAY display a cursor (caret) within the incoming real time text, but are not obligated to. The position of this cursor may be estimated based on the last <em>p</em> value in any Edit Code, in the absence of &lt;c p='#'/&gt;.</p></li>
<li><p>None of the Edit Codes require knowledge of the previous cursor position. This is because the <em>p</em> attribute in all Edit Codes specifies absolute position for all edits that occur, independently of the previous cursor position.</p></li>
</ol>
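<p>The rule that positions are counted after XML entities are decoded can be seen in the following sketch. The text is illustrative; the fragment starts from an empty message.</p>
<code><![CDATA[
<t>a &lt; b</t>    <!-- decoded text is "a < b": 5 characters, not 8 -->
<e p='3'/>         <!-- erases the "<" at index 2, leaving "a  b" (4 characters) -->
<c p='4'/>         <!-- cursor at the end of the 4-character message -->
]]></code>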
</section3>
<section3 topic='Rules for Unicode Control Codes' anchor='rules-unicode-controlcodes'>
<ul>
<li><p>U+000A: Newlines are allowed in real time text, and each is considered 1 character that can be deleted or inserted in messages. They are allowed in exactly the same way as in the &lt;body&gt; of messages.</p></li>
<li><p>U+0009: Tabs MUST be replaced with spaces by the sender before edit code processing and transmission. It is RECOMMENDED to use 8 spaces per tab. Different XML parsers handle tabs differently, which affects cursor positions for real time edit codes.</p></li>
<li><p>U+000D: Carriage returns MUST NOT be transmitted. Carriage returns MUST be ignored and stripped from text sent in &lt;rtt&gt; before processing edit codes (including cursor position codes), in order to ensure that newlines behave as 1 character.</p></li>
<li><p>Other control codes (U+0000 to U+001F and U+007F to U+009F) are NOT allowed. They MUST NOT be transmitted in real time text, and they MUST be stripped by the sender before real time text processing and transmission.</p></li>
</ul>
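<p>As a small sketch of the newline rule, the fragment below inserts two lines of illustrative text and then erases the newline between them. The newline counts as exactly one character for position purposes.</p>
<code><![CDATA[
<t>one
two</t>        <!-- 7 characters: "one", U+000A, "two" -->
<e p='4'/>     <!-- erases the newline at index 3, leaving "onetwo" -->
]]></code>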
</section3>
<section3 topic='Natural Typing Mode (Delay Code)' anchor='naturaltyping'>
<p>The RECOMMENDED 'Delay' Edit Code &lt;w n='#'/&gt; defines a delay of <em>n</em> hundredths of a second. Transmitted between other Edit Codes, it represents the original delay between key presses, cursor movements, and other text change events (including any block edit operations), independent of the RTT message transmission interval.</p>
<p>When both ends support Delay Code, the sending end records the typing including the inter-key press delay intervals, sends the transmission, and the receiving client plays back the typing. The original inter-key delays are transmitted in all of the individual Delay Codes that were added to the &lt;rtt&gt; transmission in between all the other Edit Codes. As a result, the typing looks completely natural; just like the sender was at a local keyboard. This naturalness is still preserved over satellite and long international connections with heavy packet-bursting tendencies, and even when XMPP packets are sent with large transmission intervals such as 3000ms.</p>
<p>The inclusion of support for the Delay Code is strongly RECOMMENDED. Tests have shown excellent usability improvements to real time text compared to bursting of text. The immediate presentation of every received rtt message leads to an uncomfortable &quot;bursty&quot; look. With Delay Codes, tests have shown that people could hardly notice the difference between any intervals, since the key presses play back with original typing delays regardless of transmission interval. At a longer transmission interval of 3000ms, even when only one XMPP message was transmitted every three seconds, the typing still continued to appear natural, with the exception that there was now a 3 second lag in the conversation. For this reason, it is RECOMMENDED to use no more than a 1000ms interval, which is the default chosen as a compromise between XMPP server loading and the lag in real time text.</p>
<ul>
<li><p>It is strongly RECOMMENDED that sending clients send the 'Delay' Edit Codes between all other Edit Codes, if there was a recorded delay. Senders MUST send the 'Delay' Edit Code if feature negotiation has enabled the &quot;typingdelays&quot; feature extension. (For more information, see section <link url='#protocol-negotiation'>Feature Negotiation</link>.)</p></li>
<li><p>Recipient clients that support the Delay Code MUST process these codes using non-blocking delay techniques such as timers or multi-threaded programming, in order to maintain full responsiveness of the client user interface.</p></li>
<li><p>Recipient clients that receive a fully composed message in a &lt;message&gt; containing &lt;body&gt; or &lt;html&gt; should immediately interrupt processing of any in-progress delays, discontinue processing any remaining Edit Codes, and then immediately display the fully composed message found in &lt;body&gt; or &lt;html&gt;. This is to ensure that Delay Codes do not introduce any lag to fully-finished messages (i.e. when a sender clicks &quot;Send&quot; or hits Enter.) In some cases, this may cause the typing playback to &quot;suddenly catch up&quot;. This is acceptable behavior that is mainly noticeable at large intervals (i.e. &gt;1000ms).</p></li>
<li><p>If for any reason, a client cannot be made to support the Delay Code: Receiving clients MAY ignore received Delay Codes. For incoming messages in clients without Delay Code support, it is RECOMMENDED to spread the display of each character in incoming text in time over the transmission interval, to reduce the &quot;bursting&quot; look.</p></li>
<li><p>In transmitting clients where it is not practical to implement Delay Code support, and when it is necessary to transmit multiple characters in one Insert Text &lt;t&gt; element, text SHOULD be sent at word breaks (spacebar press or punctuation) to improve presentation in recipients that have no Delay Code support or any other form of smoothing of the display of text. Transmitting text on word breaks leads to a potentially variable &lt;rtt&gt; message transmission interval. In this case, the client MUST have an algorithm to limit the average interval between outgoing &lt;rtt&gt; messages, evaluated over the last two transmissions, in order to prevent violating the currently negotiated message interval. At the same time, text SHOULD still be transmitted after a grace period of one negotiated or default transmission interval, when typing has paused without any punctuation or word delimiter.</p></li>
</ul>
<p>Natural Typing Mode is implemented in the experimental Jabber client submitted with this specification. This can be used to study how Natural Typing Mode behaves, including over high-latency connections with large buffering delays, and to convince implementors to implement this feature in other Jabber clients including popular mass-market clients.</p>
</section3>
<section3 topic='Example of XML Edit Codes'>
<p>Edit Codes MAY be stacked in any order, to execute multiple consecutive edits. Receiving clients MUST process these edits in the order in which they appear in &lt;rtt&gt;. An example of a string with stacked Edit Codes:</p>
<code><![CDATA[
<rtt msg='0' seq='0' type='new'><t>Helo</t><e/><t>lo...planet</t><e n='6'/>
<t> World</t><d n='3' p='5'/><t p='5'> there,</t><c p='18'/></rtt>
]]></code>
<p>The above example, reformatted into a more human readable form:</p>
<code><![CDATA[
<rtt msg='0' seq='0' type='new'>
<t>Helo</t>
<e/>
<t>lo...planet</t>
<e n='6'/>
<t> World</t>
<d n='3' p='5'/>
<t p='5'> there,</t>
<c p='18'/>
</rtt>
]]></code>
<p>This results in the final string "Hello there, World" through the following series of steps:</p>
<table caption='Edit Codes Example'>
<tr>
<th>Fragment</th>
<th>Action</th>
<th>State&nbsp;of&nbsp;Real&nbsp;Time&nbsp;Text</th>
<th>Cursor&nbsp;Pos</th>
</tr>
<tr>
<td>&lt;t&gt;Helo&lt;/t&gt;</td>
<td>Output "Helo"</td>
<td>Helo</td>
<td>4</td>
</tr>
<tr>
<td>&lt;e/&gt;</td>
<td>Erase 1 character from end of line.</td>
<td>Hel</td>
<td>3</td>
</tr>
<tr>
<td>&lt;t&gt;lo...planet&lt;/t&gt;</td>
<td>Output "lo...planet"</td>
<td>Hello...planet</td>
<td>14</td>
</tr>
<tr>
<td>&lt;e&nbsp;n='6'/&gt;</td>
<td>Erase 6 characters from end of line</td>
<td>Hello...</td>
<td>8</td>
</tr>
<tr>
<td>&lt;t&gt;&nbsp;World&lt;/t&gt;</td>
<td>Output " World"</td>
<td>Hello...&nbsp;World</td>
<td>14</td>
</tr>
<tr>
<td>&lt;d&nbsp;n='3'&nbsp;p='5'/&gt;</td>
<td>Delete 3 characters at position 5</td>
<td>Hello World</td>
<td>5</td>
</tr>
<tr>
<td>&lt;t&nbsp;p='5'&gt;&nbsp;there,&lt;/t&gt;</td>
<td>Output " there," at position 5</td>
<td>Hello&nbsp;there,&nbsp;World</td>
<td>12</td>
</tr>
<tr>
<td>&lt;c&nbsp;p='18'/&gt;</td>
<td>Move cursor to position 18 (end of message)</td>
<td>Hello&nbsp;there,&nbsp;World</td>
<td>18</td>
</tr>
</table>
<p>Since the Edit Codes are stacked in the same string, implementors MAY execute these steps in a visually instantaneous manner. However, a form of text smoothing is recommended. Implementors MAY choose to execute small delays between displayable characters (calculated based on a running average typing speed) in an attempt to approximate original fluid typing output in order to filter out the 1-second-interval surges of text.</p>
</section3>
<section3 topic='Future Support for New Edit Codes and XHTML-IM'>
<ul>
<li><p>New Edit Codes may be added, if later discussion indicates a good reason for doing so.</p></li>
<li><p>A future version of this specification may include OPTIONAL negotiable support for real time editing of HTML messages compliant with the existing XEP-0071 XHTML-IM standard. The Edit Codes have been designed to not conflict with common one-character XHTML-IM tags such as &lt;b&gt; Bold and &lt;i&gt; Italics.</p></li>
</ul>
</section3>
</section2>
<section2 topic='Methods of Detecting Message Edits In a Client' anchor='protocol-detectingedits'>
<section3 topic='Edit Capture Methods'>
<p>There are several methods of capturing message edits. One of these methods MAY be used.</p>
<ol>
<li><p>RECOMMENDED - Method #1 - Do not monitor key presses. Instead, compare the current message string to the old message string (from the time of the previous &lt;rtt&gt; transmission) to calculate what edits took place, and then generate the appropriate compact &lt;rtt&gt; output including edit codes.</p></li>
<li><p>NOT RECOMMENDED - Method #2 - Monitor key presses and convert to edit codes.</p></li>
<li><p>NOT RECOMMENDED - Method #3 - Retransmit the whole message every time it changes (by always using type='reset' in the &lt;rtt&gt; message).</p></li>
</ol>
<p>Method #1 SHOULD be used in most cases, because it captures all possible edits that occur in a typical text box:</p>
<ul>
<li><p>it captures cut &amp; paste operations, as well as edits made via a computer mouse, track pad and touchscreen.</p></li>
<li><p>it captures automatic text changes made by the operating system or environment, including autotext, autocorrect, spellchecker, text expansion macros, voice recognition, or by external devices such as stenotype machines, etc.</p></li>
<li><p>it properly captures dead-character typing (incompletely-typed accents between &lt;rtt&gt; intervals).</p></li>
<li><p>it makes no assumptions about the message editing behavior of a specific platform.</p></li>
<li><p>it is completely cross-platform portable when chatting between different devices (i.e. chats between PC, Mac, Linux, BlackBerry, iPhone, Android), all of which have different editing behaviors.</p></li>
<li><p>it efficiently skips the unnecessary transmission of backspaces if un-transmitted text is backspaced before being transmitted.</p></li>
</ul>
<p>Whilst Method #3 has all the advantages of Method #1, it wastes bandwidth by re-transmitting the entire updated message, potentially for every single key press. It is acceptable as a starting point for experimentation as it is very simple. It is also possible to use Method #3 (occasionally doing a type='reset' in the &lt;rtt&gt; element) at acceptable regular intervals (e.g. every 30 seconds) as a method of text redundancy to guarantee correctness of the real-time text updates at the remote end, during long periods of real time text. Method #3 has the additional advantage of maximum compatibility with Unicode quirks such as combining characters and surrogate pairs, in the event they are ever used in XMPP conversations. (See section <link url='#i18n'>Internationalization Considerations</link> for more information.)</p>
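<p>The difference between Method #1 and Method #3 can be sketched with a single edit. Assume the text box previously read "I like cats" and now reads "I like dogs". The text is illustrative, and the 'msg' and 'seq' values are placeholders rather than values mandated for this situation.</p>
<code><![CDATA[
<!-- Method #1: compare old and new text, then send only the difference.
     Delete the 4 characters at position 7 ("cats") and insert "dogs" there. -->
<d p='7' n='4'/><t p='7'>dogs</t>

<!-- Method #3: resend the whole message in an <rtt> element with type='reset' -->
<rtt msg='0' seq='1' type='reset'><t>I like dogs</t></rtt>
]]></code>
<p>For a one-word change the two forms are of similar size. The bandwidth cost of Method #3 grows with the length of the message, whereas the cost of Method #1 grows only with the size of the edit.</p>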
<p>If Method #1 is used, it should not execute more than 100 times a second in order to prevent CPU spikes or denial-of-service attacks (intentional or inadvertent, such as third party software outputting copy-and-pastes one key press at a time, causing an overload of an algorithm in Method #1). Otherwise, &lt;rtt&gt; packets may become oversized and exceed XMPP message size limits (typically 8 kilobytes for a complete XML payload) if they consist of several hundred single-character Edit Codes during a copy and paste on systems that implement copy-and-paste operations via one simulated key press at a time.</p>
<p>It is also RECOMMENDED for client software to include logic for replacing oversized &lt;rtt&gt; messages (before they are transmitted) with the 'Reset' Edit Code &lt;r/&gt; followed by the re-transmission of the whole message (Method #3). This is more efficient in situations where Method #1 ends up producing a huge number of small edit codes that exceeds the message length limit of XMPP (typically 8 kilobytes).</p>
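<p>A sketch of this replacement is shown below, using illustrative text. The first fragment stands for a long run of single-character Edit Codes; the second is the compact equivalent that a client could send in its place.</p>
<code><![CDATA[
<!-- Oversized: one Edit Code per simulated key press of a long paste -->
<t>T</t><t>h</t><t>e</t><t> </t><t>q</t><t>u</t><t>i</t><t>c</t><t>k</t>
<!-- (several hundred more single-character codes would follow here) -->

<!-- Compact replacement: clear the message, then insert its complete current text once -->
<r/><t>The quick brown fox jumps over the lazy dog</t>
]]></code>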
</section3>
<section3 topic='Recommended Method for Natural Typing Mode'>
<p>In order to use Method #1 with the Natural Typing Mode, it becomes necessary to execute Method #1 during every text box change event (or cursor movement event or selection change event), or at very short, rapid intervals. If polling the text box to see whether it has changed, it SHOULD be polled at approximately the typical screen refresh frequency (i.e. 50 times per second minimum), and no more often than 100 times per second (since the granularity of delay codes is 1/100<sup>th</sup> second). Every time that a poll or change listener notices a change in the text box that generates Edit Codes, a Delay code is inserted immediately before the Edit Code, with the recorded time in hundredths of seconds between change events. After the interval is complete (i.e. 1000ms since the last &lt;rtt&gt; code), the whole collected series of Edit Codes is transmitted all at once in one &lt;rtt&gt; message transmission. The receiving end then plays back the Edit Codes in order, including the series of Delay Codes, to re-create the original typing look-and-feel, independently of the &lt;rtt&gt; intervals and network packet buffering/surging.</p>
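<p>The recording step can be sketched with a made-up timeline of three change events inside one transmission interval. The timings are illustrative, and the sketch assumes that no Delay code is needed before the first Edit Code because no earlier change event was recorded in this interval.</p>
<code><![CDATA[
<!-- Change events observed by the sender:
       0 ms   text becomes "H"
     180 ms   text becomes "Hi"
     430 ms   text becomes "Hi!"
     Codes buffered for the next <rtt> transmission: -->
<t>H</t>
<w n='18'/>   <!-- 180 ms between the first and second change -->
<t>i</t>
<w n='25'/>   <!-- 250 ms between the second and third change -->
<t>!</t>
]]></code>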
</section3>
<section3 topic='Synchronization of Real-Time Text View'>
<p>To better ensure that the appearance of real-time-text is identical on all ends of a conversation (assuming no &lt;rtt&gt; elements are lost), Method #1 should be used where possible.</p>
<p>If, for any reason, Method #1 cannot be used, Method #2 MAY be used instead. When a &lt;message&gt; contains a &lt;body&gt;, the &lt;body&gt; MUST automatically replace the real-time text. This corrects for platform-specific editing behaviors that may inject side effects into the real time text transmission and cause it to diverge in appearance from the remote end. However, Method #1 captures all possible plain-text edits in typical chat clients. For clients that support Natural Typing Mode (i.e. where feature negotiation sets &quot;typingdelays&quot; to 1), Method #1 can be run on every single key press (or text change event), adding the delay code, then buffering the resulting Edit Codes before sending one concatenated &lt;rtt&gt; transmission at every regular transmission interval.</p>
</section3>
<section3 topic='Cursor Position Updates'>
<p>Regular cursor position updates SHOULD be transmitted if the cursor position changes, even if the text does not change. Cursor position can usually be detected via the "SelectionStart" property (or similar value), found in most "TextBox" controls (or similar) in most languages on most platforms. When text is being marked for copying and pasting, cursor position may become ambiguous. In this case, the end of the marked text in a text selection operation MAY be considered the cursor position. Text marking activity itself is not visually transmitted over real-time text transmission.</p>
</section3>
<section3 topic='Emoticons and Graphics Considerations'>
<p>For chat clients that support automatic conversion of emoticons to graphics symbols, the string length of the emoticon (even when converted to a graphic) should always be maintained, since the individual characters of the emoticon are transmitted. For example, ":-)" is three characters long and is typically automatically converted to a single smiley graphic image in many typical chat programs. Cursor movements that move the cursor one position from the beginning to the end of the emoticon should increment an internal cursor position variable by 3 instead of 1, keeping in-sync with the character index of the original string (i.e. the text-based emoticon character). Chat clients that use "rich text format" edit controls (in order to display emoticons) may require extra programming logic to calculate the correct plain-text character position of the cursor (relative to the text that would be transmitted in a &lt;body&gt; element of a &lt;message&gt;).</p>
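<p>The position arithmetic described above can be sketched as follows, using illustrative text. The fragment starts from an empty message.</p>
<code><![CDATA[
<t>Hi :-)</t>   <!-- 6 characters, even if ":-)" is drawn as a single graphic -->
<c p='3'/>      <!-- cursor just before the emoticon -->
<c p='6'/>      <!-- cursor just after the emoticon: 3 + 3, not 3 + 1 -->
]]></code>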
</section3>
</section2>
<section2 topic='Interval of Transmission of Real-Time Text' anchor='protocol-interval'>
<p>Experiments have shown that the usability of real time text, as experienced by users, starts to decline at an interval of about 300 ms and falls to quite low usability at an interval of three seconds. Therefore, a one-second interval is a highly RECOMMENDED compromise default for unknown network and server conditions. A different interval MAY optionally be specified by server vendors; see <link url='#protocol-negotiation'>Feature Negotiation</link> for more information. In addition, Mark Rejhon developed a solution called &quot;Natural Typing Mode&quot; using the 'Delay' Edit Code. The 'Delay' Edit Code preserves an identical typing look-and-feel regardless of the interval, even on high-latency and variable-latency connections.</p>
<p>The default interval between the transmission of real time text in an &lt;rtt&gt; SHALL be 1000ms.<br/>
The interval is approximately equal to the lag experienced in real time text conversation.<br/>
However, there are cases where different intervals may be negotiated.</p>
<ol>
<li><p>LAN chat, peer-to-peer XMPP, and custom niche-specific chat networks may desire a shorter interval for less lag.</p></li>
<li><p>High performance XMPP servers from some vendors, such as Google Talk's, have been demonstrated to be capable of true character-at-a-time granularity for single pairs of users. Vendors should be permitted to decide to use a different interval based on their own server architecture.</p></li>
<li><p>Real-time captioning services transmitted over XMPP require a much shorter interval, such as 300ms.</p></li>
<li><p>Rapid multiple-line pasting may overload the server and require a lower frequency of real time messages.</p></li>
<li><p>Servers that are overloaded may wish to negotiate a longer interval. An example is a multi-user group chat supporting real time text. In the event that the negotiated interval exceeds 1000ms, it is RECOMMENDED that the client announce the lag via an informational message, or via notification messages broadcast to the chat room.</p></li>
<li><p>Intelligent clients MAY use XMPP Pings (using the XEP-0199 standard) or message receipts (using the XEP-0184 standard on messages containing &lt;body&gt;) to monitor latency and dynamically re-negotiate the transmission interval.</p></li>
<li><p>When interoperability with other real time text standards is desired, it is strongly RECOMMENDED to follow recommended intervals as some real time text standards are much more strict about intervals.</p></li>
</ol>
<p>XMPP messages were originally meant to be exchanged 'once per sentence', resulting in one message every few seconds per participant. Assuming each message is transmitted in real time as an average of 10 &lt;rtt&gt; fragments, this may increase the server load per user by a factor of 10 unless the server is optimized for increased &lt;rtt&gt; traffic. This would not apply to all XMPP traffic, as real-time text would be an &quot;opt-in&quot; service and servers could be gradually upgraded when the need arises. Many XMPP servers already implement other high-bandwidth services such as in-band file transfers (e.g. the In-Band Bytestreams extension, XEP-0047). Such services are likely to load an XMPP server more heavily than XMPP real-time text, which means many servers are already well-equipped to handle the extra loading requirements of real-time text.</p>
<p>If Natural Typing Mode is turned off (i.e. 'delaycodes' are turned off), clients are RECOMMENDED to immediately transmit &lt;rtt&gt; at word boundaries (when the spacebar is pressed) for a better flow of typing. The client MUST throttle transmission of &lt;rtt&gt; so that the running average interval of the last 5 &lt;rtt&gt; transmissions is not less than the negotiated interval (or the 1000ms default). If the average interval is at risk of being violated, the client SHOULD NOT transmit &lt;rtt&gt; until the next word boundary that does not violate the negotiated interval, based on the running average of the last 5 &lt;rtt&gt; transmissions. Following this rule automatically results in the transmission of two or more words at a time if multiple words are typed in less than the &lt;rtt&gt; transmission interval. If there is no change in text (i.e. no key press) for 1000ms, any un-transmitted text changes SHOULD be immediately transmitted, even if the text is not at a word boundary.</p>
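<p>The word-boundary rule above can be pictured with the following sketch, which assumes the default 1000ms interval. The timings in the comments are hypothetical annotations for the reader and are not part of the transmitted stanzas.</p>
<code><![CDATA[
<!-- Illustrative sketch only: hypothetical timings, default 1000ms interval -->
<!-- 0 ms: "Hello " typed; the spacebar marks a word boundary, so it is sent at once -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello </t></rtt>
</message>
<!-- 1100 ms: "Bob, " was finished at 400 ms but held back by the throttle; it goes out together with "this " -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='1'><t>Bob, this </t></rtt>
</message>
<!-- 2900 ms: "is Al" typed by 1900 ms, then no key press for 1000 ms; flushed even though it ends mid-word -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='2'><t>is Al</t></rtt>
</message>
]]></code>
<p>In this sketch the intervals between transmissions are 1100 ms and 1800 ms, so the running average stays above the 1000ms interval.</p>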
<p>Servers may be designed to dynamically adjust the interval in response to unusual server loads. If longer intervals are unavoidable due to an overloaded server, the server may dynamically re-negotiate the interval as a last-resort mechanism, where the alternative would be lost XMPP messages that are impossible to deliver. In this event, it is RECOMMENDED that the user be warned of the lag in some manner, such as via a standard chat room notification message. Intervals longer than 1000ms SHOULD NOT be made a common use case or a common load management practice. They are more acceptable in less mission-critical situations such as IRC-style multi-user group chat rooms, which also happen to be the situations most prone to server overloads. Intervals longer than 1000ms SHOULD NOT be used for one-on-one communications, because the real time text may be going to an emergency 911 center, an emergency responder, a captioned telephony service, or a relay service for the deaf.</p>
<p>XMPP Pings or message receipts can be used to monitor fluctuations in latency. Slower network connections and overloaded servers would generally trend toward higher latency, which would signal the need to lengthen the interval between &lt;rtt&gt; messages. This would allow client software to preemptively lengthen the interval of real time text. It would provide graceful degradation of real-time text under a server overload condition, and also serve as a useful visual canary-in-the-mine indicator of the beginnings of abnormal conditions (e.g. server overload). It may even help server maintainers address server capacity issues before overloads start to interfere with non-RTT chats, because real time text can be presumed to degrade before regular message body transmissions do.</p>
</section2>
<section2 topic='Real Time Text Examples' anchor='examples'>
<section3 topic='EXAMPLE: Three backspaces.'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello bcak</t><e/><e/><e/><t>ack</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello back"</p>
<p>This code sends the misspelled "Hello bcak", then &lt;e/&gt;&lt;e/&gt;&lt;e/&gt; backspaces 3 times, then sends "ack".</p>
</section3>
<section3 topic='EXAMPLE: Three backspaces in one Edit Code.'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello bcak</t><e n='3'/><t>ack</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello back"</p>
<p>This code is the same as the previous example, demonstrating that &lt;e n='3'/&gt; does the same thing as &lt;e/&gt;&lt;e/&gt;&lt;e/&gt;.</p>
</section3>
<section3 topic='EXAMPLE: Segmented into multiple messages at regular intervals'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='1'><t> bcak</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='2'><e n='3'/></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='3'><t>ack</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello back"</p>
<p>This code results in the same final text as the previous two examples, segmented into four separate messages.</p>
</section3>
<section3 topic='EXAMPLE: Deleting text in the middle of a message'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello Bob, this is Alice!</t><d n='4' p='5'/></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello, this is Alice!"</p>
<p>This code outputs "Hello Bob, this is Alice!" then &lt;d n='4' p='5'/&gt; deletes 4 characters from position 5. (This erases the text &quot; Bob&quot; including the preceding space character).</p>
</section3>
<section3 topic='EXAMPLE: Inserting text in the middle of a message'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello, this is Alice!</t><t p='5'> Bob</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello Bob, this is Alice!"</p>
<p>This is because the code outputs "Hello, this is Alice!" then the &lt;t p='5'&gt; inserts the specified text " Bob" at position 5.</p>
</section3>
<section3 topic='EXAMPLE: Deleting and replacing text in the middle of a message'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'>
<t>Hello Bob, tihsd is Alice!</t>
<d p='11' n='5'/>
<t p='11'>this</t>
</rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello Bob, this is Alice!"</p>
<p>This code outputs &quot;Hello Bob, tihsd is Alice!&quot;, then &lt;d p=&quot;11&quot; n=&quot;5&quot;/&gt; deletes 5 characters at position 11 in the string of text (erasing the mistyped word &quot;tihsd&quot;). Finally, &lt;t p=&quot;11&quot;&gt;this&lt;/t&gt; inserts the text &quot;this&quot; in place of the original misspelled word.</p>
</section3>
<section3 topic='EXAMPLE: Same as above example, but in multiple separate intervals'>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello B</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='1'><t>ob, tihsd</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='2'><t> is Alice!</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='3'><d p='11' n='5'/></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='4'><t p='11'>th</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='5'><t p='13'>is</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello Bob, this is Alice!"</p>
<p>&lt;rtt&gt; elements seq='0' through seq='2' construct the original misspelled string "Hello Bob, tihsd is Alice!"<br/>
&lt;rtt&gt; element seq='3' with &lt;d p=&quot;11&quot; n=&quot;5&quot;/&gt; deletes 5 characters from position 11.<br/>
&lt;rtt&gt; element seq='4' with "&lt;t p=&quot;11&quot;&gt;th&lt;/t&gt;" inserts "th" beginning at position 11.<br/>
&lt;rtt&gt; element seq='5' with "&lt;t p=&quot;13&quot;&gt;is&lt;/t&gt;" inserts "is" beginning at position 13 (completing the word "this").</p>
</section3>
<section3 topic='EXAMPLE: 3 Consecutive messages of text in a chat session'>
<p>Representing a short chat Session:</p>
<p>Bob says: "Hello Alice"<br/>
Bob says: "This is Bob"<br/>
Bob says: "How are you?"</p>
<code><![CDATA[
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello</t></rtt>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='0' seq='1'><t> Alice</t></rtt>
<body>Hello Alice</body>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='1' seq='2' type='new'><t>This i</t></rtt>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='1' seq='3'><t>s Bob</t></rtt>
<body>This is Bob</body>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='2' seq='4' type='new'><t>How a</t></rtt>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='2' seq='5'><t>re yo</t></rtt>
</message>
<message from='Bob' to='Alice' id=... type='chat'>
<rtt msg='2' seq='6'><t>u?</t></rtt>
<body>How are you?</body>
</message>
]]></code>
<p>This example illustrates the following:</p>
<ul>
<li><p>The <strong>msg</strong> attribute increments for each message.</p></li>
<li><p>The <strong>type</strong> attribute equals 'new' for the start of every new message.</p></li>
<li><p>The <strong>seq</strong> attribute always increments.</p></li>
</ul>
</section3>
</section2>
<section2 topic='Message length limits for Real Time Text' anchor='protocol-messagelength'>
<p>There is no defined length limit for &lt;rtt&gt;. In practice, the maximum length of an &lt;rtt&gt; element is the same as the maximum length of a &lt;body&gt;. As of December 2010, the Google Talk client within Gmail uses a 2,000-character limit for a message of chat text.</p>
<section3 topic='Rare Case of &lt;rtt&gt; Text Length Exceeding &lt;body&gt; Text Length Limit'>
<p>It is possible to stack multiple Edit Codes so that an &lt;rtt&gt; element is longer than the &lt;body&gt; length limit while the resulting final text string is shorter than that limit. This may happen if the user pastes a large, maximum-length block of text into the middle of a very short message (e.g. pasting a massive amount of text into the middle of a 2-character message). The extra Edit Codes added to the next &lt;rtt&gt; element then make the element longer than the defined &lt;body&gt; length limit. In practice, this is harmless because XMPP servers include a safety margin above and beyond the maximum &lt;body&gt; length for the additional XML information included in a &lt;message&gt; transmission. No special handling is suggested in these situations.</p>
</section3>
<section3 topic='Long Real-Time Text Transmissions'>
<p>There are situations where the final &lt;rtt&gt; element, combined with a &lt;body&gt; element in the same &lt;message&gt;, results in a &lt;message&gt; significantly more than double the maximum &lt;body&gt; length. This may happen when a lot of text is copied and pasted and then immediately sent in a message. Such a &lt;message&gt; may exceed the XMPP server's &lt;message&gt; length limit. To reduce or eliminate the possibility of this happening, it is permitted to send a separate &lt;message&gt; with the final &lt;rtt&gt; element, then a subsequent &lt;message&gt; with a &lt;body&gt; element (and an empty final &lt;rtt&gt; element).</p>
<p>Example of a combined &lt;message&gt; that may end up becoming more than double the size of the maximum &lt;body&gt; length:</p>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello World...In a Super Long Message! [etc]</t></rtt>
<body>Hello World...In a Super Long Message! [etc]</body>
</message>
]]></code>
<p>The message MAY be split into two separate message transmissions:</p>
<code><![CDATA[
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello World...In a Super Long Message! [etc]</t></rtt>
</message>
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='1'/>
<body>Hello World...In a Super Long Message! [etc]</body>
</message>
]]></code>
</section3>
</section2>
</section1>
<section1 topic='Interoperability Considerations' anchor='interop'>
<p>There are many environments where real-time text communication is implemented that may benefit from interoperability with XMPP Real Time Text. There are interoperability considerations relating to the session setup level, the media transport level, and presentation level.</p>
<p>For each environment where interoperability is supported, an interoperability specification should be documented that describes the features mentioned above i.e. addressing, session control, media negotiation and media transcoding.</p>
<section2 topic='RFC 4103 and T.140' anchor='interop-4103'>
<p>One environment for such interoperability considerations is SIP with real-time text (also called Text over IP, or ToIP) as specified in ITU-T T.140 and IETF RFC 4103. One reason for its importance is that this protocol combination is specified by IETF and by regional emergency service organizations to be the protocols supported for IP based real-time emergency calls that support real-time text. Another reason is that SIP is the currently dominating peering protocol between services, and many implementations of real-time text in SIP exist.</p>
<p>Interoperability implies addressing translation, media negotiation and translation, and media transcoding. Transcoding real-time text between this specification and T.140/RFC 4103 is straightforward, except for the editing feature of this specification. Backwards positioning followed by insertion or deletion far back in the message can cause a large number of erase operations in T.140, which take time and bandwidth to convey. Therefore, it is proposed that the scope of editing can be limited by negotiation to only allow backspacing when connecting to a system that only has T.140 real-time message editing capabilities.</p>
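<p>To illustrate the cost of backspace-only editing, the following sketch re-expresses the earlier "Inserting text in the middle of a message" example without a positioned insert. It assumes the cursor is at the end of the 21-character message when the edit is made.</p>
<code><![CDATA[
<!-- Illustrative sketch only: backspace-only equivalent of <t p='5'> Bob</t> -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello, this is Alice!</t><e n='16'/><t> Bob, this is Alice!</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello Bob, this is Alice!"</p>
<p>Inserting 4 characters at position 5 costs 16 erase operations plus 20 retransmitted characters in this form, which is the overhead that limiting the scope of editing is meant to keep predictable.</p>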
<p>It should be noted that T.140 specifies use of ISO 6429 control codes for presentation characteristics such as text color etc, that are not possible to represent in plain text according to this specification. All control codes from both sides that cannot be presented on the other side of the conversion, must be filtered off in order to not disturb the presentation of text.</p>
<p>Note that a future version of this specification may support real time text with XHTML-IM. It is possible to transcode many of these ISO 6429 formatting codes to XEP-0071 XHTML-IM.</p>
</section2>
<section2 topic='Combination With Other Real Time Media' anchor='interop-combination'>
<p>In some cases, it may be beneficial in a real-time conversation situation to have simultaneous availability of multiple real-time media.</p>
<p>In the XMPP session environment, the Jingle protocol (XEP-0166) is available for negotiation and transport of the more time-critical, real-time audio and video media. For clients that already support audio and/or video, it is RECOMMENDED to continue providing real-time text according to this specification, regardless of whether audio and/or video is negotiated.</p>
<p>There is also another real-time text standard (IETF RFC 4103, <span class='ref'>IETF RFC 5194</span> <note>IETF RFC 5194: Framework for Real-Time Text over IP Using the Session Initiation Protocol (SIP)</note>), used for SIP messaging and real time text. When an implementor needs to decide which real time text standard to use, it is generally recommended to use the real time text extension of the instant messaging standard in use for that particular conversation. This varies from implementation to implementation. For example, the Google Talk network uses XMPP messaging for instant messages sent during audio/video conversations. Therefore, in this situation, it is recommended to use this XMPP extension to add Real Time Text functionality. However, there are other situations where it is necessary to support multiple real time text standards and to interoperate between them. For more information, see <link url='#interop-4103'>RFC 4103 and T.140</link>.</p>
<p>Also, according to ITU-T F.703, &quot;Total Conversation&quot; defines the simultaneous use of audio, video, and real-time text. For convenience, some chat applications may be designed to have automatic negotiation of as many as possible of the three media preferred by the users.</p>
</section2>
<section2 topic='Backwards Compatibility With Future Extensions' anchor='interop-backwardcompat'>
<p>It is anticipated that real time text will evolve in the coming years. Examples include real-time editing of HTML, and a negotiation mechanism for a different, possibly more efficient, Real Time Text Editing mechanism. Therefore, there are a number of guidelines to maintain interoperability:</p>
<ol>
<li><p>Unrecognized Edit Codes should be silently ignored</p></li>
<li><p>Properly follow the recommendations in &quot;Error Recovery of Real Time Text&quot;</p></li>
<li><p>Enhancements above and beyond this specification should be negotiated via feature negotiation.<br/>
For example, support for real-time editing of HTML.</p></li>
</ol>
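<p>As a sketch of the first guideline, suppose a future extension adds an Edit Code that a receiving client does not know. The &lt;z/&gt; element and its attribute below are purely hypothetical and are not defined by this specification.</p>
<code><![CDATA[
<!-- Illustrative sketch only: <z q='1'/> is a made-up, unrecognized Edit Code -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>Hello</t><z q='1'/><t> Bob</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text on a client that ignores the unrecognized element: "Hello Bob"</p>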
</section2>
</section1>
<section1 topic='Internationalization Considerations' anchor='i18n'>
<p>Real-time text uses the Unicode format which is internationally accepted. XMPP Real Time Text works with all languages that can be transmitted in a standard &lt;message&gt;&lt;body&gt;...&lt;/body&gt;&lt;/message&gt; payload. XMPP transmission is typically in UTF-8 format, while string variable storage is typically in UTF-16 format in most programming languages (conversions normally being done at transmit/receive time).</p>
<p>There are, however, special considerations for real time Edit Codes (see section &quot;Embedded Edit Codes For Real-Time Text&quot;) involving the cursor positioning codes, backspace, and delete. Code written to encode/decode real-time editing may run into unforeseen ambiguities involving Unicode combining characters, Unicode surrogate pairs, and prefix/suffix codes used to generate accents and modified characters (in Chinese, Arabic and other languages). String manipulation routines on Unicode strings may vary between platforms and return different character indexes or string lengths depending on the algorithm used to compute the length of a Unicode string. This has implications for implementors in their ability to process the Edit Codes used in real-time text editing. Therefore, the following guidelines are required:</p>
<section2 topic='General Guidelines' anchor='i18n-general'>
<ol>
<li><p>Real time text transmissions SHOULD NOT transmit any Unicode control codes.</p></li>
<li><p>Real time text transmissions SHOULD NOT transmit incompletely formed Unicode characters, such as standalone combining marks.</p></li>
<li><p>Cursor positions transmitted and processed in Edit Codes MUST be 0-based character indexes into the message string in UTF-16 format. These indexes do not necessarily match positions in the visually displayed representation (for example, because of Unicode combining marks, Unicode surrogate pairs, or text emoticons replaced by graphic emoticons). Implementations that display processed representations of the real time text MUST internally keep track of the same message in unprocessed UTF-16 format for manipulation by real-time text Edit Codes.</p></li>
<li><p>Upon receipt of a complete message (i.e. a &lt;message&gt; transmission containing a &lt;body&gt; or &lt;html&gt; element), a client MUST always clear the real-time text and then use the full transmitted message. This will automatically fix any flaws that may have happened during real-time text editing.</p></li>
<li><p>Always use method (1) in "Methods of Detecting Message Edits In a Client".</p></li>
</ol>
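<p>The UTF-16 indexing rule in guideline 3 can be illustrated with a character outside the Basic Multilingual Plane. In the sketch below, the character U+1F600 is written as a numeric character reference; in UTF-16 it is stored as a surrogate pair and therefore occupies positions 0 and 1.</p>
<code><![CDATA[
<!-- Illustrative sketch only: U+1F600 takes two UTF-16 code units, so the "!" is at position 2 -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='0' type='new'><t>&#x1F600;!</t><t p='2'> Hi</t></rtt>
</message>
]]></code>
<p>The inserted text " Hi" lands between the U+1F600 character and the "!". A client that counted U+1F600 as a single position would instead insert the text after the "!".</p>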
<p>Upon encountering ambiguity issues potentially found in certain locales, implementors MAY choose to clear the message with the real-time &lt;r/&gt; Edit Code, and retransmit the whole message text inside the &lt;rtt&gt; element (the same text string as if sent in a &lt;body&gt; element) to eliminate ambiguities relating to real-time editing, including backspace, delete, and cursor positioning.</p>
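<p>Such a clear-and-retransmit might be sketched as follows. The seq value and the message text are hypothetical and simply continue an existing message.</p>
<code><![CDATA[
<!-- Illustrative sketch only: <r/> clears the real-time text, then the full text is sent again -->
<message from='Alice' to='Bob' id=... type='chat'>
<rtt msg='0' seq='7'><r/><t>Hello Bob, this is Alice!</t></rtt>
</message>
]]></code>
<p>Resulting Real Time Text: "Hello Bob, this is Alice!" regardless of what the recipient displayed before this transmission.</p>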
<p>For the purpose of real-time text Edit codes, cursor position 0 starts before the first character of the string, even though chat software may display right-to-left text (e.g. when using Arabic) beginning at the right edge of the screen, and some may display text bi-directionally (i.e. English with embedded Arabic quotes).</p>
<p>Further experimentation during the six months from publication of the first Experimental draft specification will lead to very solid real-time text Edit codes being specified.</p>
</section2>
</section1>
<section1 topic='Security Considerations' anchor='security'>
<p>The security considerations are mainly user interface related, which varies in implementation from client to client:</p>
<p>It is important for implementors of real-time text features to educate users about real-time text. Users of real-time text should be aware that their typing in the local input buffer is now visible to everyone in the current chat conversation. This may have security implications if users copy &amp; paste private information into their chat entry buffer (e.g. a shopping invoice) and only then edit out the private parts of the pasted text (e.g. a credit card number) before they hit Enter or click Send. With real-time editing, recipients can watch all text changes that occur in the sender's text.</p>
<p>Care should be taken that the network and server load of XMPP-based real-time text is not excessive to the degree that it causes congestion and a potential denial-of-service situation. Discussion is needed to determine the situations where the interval of 1 second is not desirable and a different interval needs to be negotiated.</p>
<p>If a chat application has a logging feature, it SHOULD NOT log any text transmitted in &lt;rtt&gt; elements, and instead only log text transmitted inside a &lt;body&gt; or &lt;html&gt; element.</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
<p>This document requires no interaction with the Internet Assigned Numbers Authority (IANA).</p>
</section1>
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
<section2 topic='Protocol Namespaces' anchor='registrar-namespaces'>
<p>The XMPP Registrar will include "urn:xmpp:rtt" (suggested) in its registry of protocol namespaces (see &lt;<link url='http://xmpp.org/registrar/namespaces.html'>http://xmpp.org/registrar/namespaces.html</link>&gt;).</p>
</section2>
</section1>
<section1 topic='XML Schema' anchor='schema'>
<code><![CDATA[
<?xml version='1.0' encoding='UTF-8'?>
<xs:schema
xmlns:xs='http://www.w3.org/2001/XMLSchema'
targetNamespace='http://jabber.org/protocol/rtt'
xmlns='http://jabber.org/protocol/rtt'
elementFormDefault='qualified'>
<xs:annotation>
<xs:documentation>
The protocol documented by this schema is not yet defined on XMPP.org until submitted.
XEP-0292: http://www.xmpp.org/extensions/xep-0292.html
</xs:documentation>
</xs:annotation>
<xs:element name='rtt'>
<xs:complexType>
<xs:choice minOccurs='0' maxOccurs='unbounded'>
<xs:element ref='t'/>
<xs:element ref='e'/>
<xs:element ref='d'/>
<xs:element ref='r'/>
<xs:element ref='c'/>
<xs:element ref='w'/>
</xs:choice>
<xs:attribute name='msg' type='xs:nonNegativeInteger' use='required'/>
<xs:attribute name='seq' type='xs:nonNegativeInteger' use='required'/>
<xs:attribute name='type' type='xs:string' use='optional'/>
</xs:complexType>
</xs:element>
<xs:element name='t'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='xs:string'>
<xs:attribute name='p' type='xs:nonNegativeInteger' use='optional'/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name='e'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='empty'>
<xs:attribute name='p' type='xs:nonNegativeInteger' use='optional'/>
<xs:attribute name='n' type='xs:nonNegativeInteger' use='optional'/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name='d'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='empty'>
<xs:attribute name='p' type='xs:nonNegativeInteger' use='required'/>
<xs:attribute name='n' type='xs:nonNegativeInteger' use='optional'/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name='r' type='empty'/>
<xs:element name='c'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='empty'>
<xs:attribute name='p' type='xs:nonNegativeInteger' use='required'/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:element name='w'>
<xs:complexType>
<xs:simpleContent>
<xs:extension base='empty'>
<xs:attribute name='n' type='xs:nonNegativeInteger' use='required'/>
</xs:extension>
</xs:simpleContent>
</xs:complexType>
</xs:element>
<xs:simpleType name='empty'>
<xs:restriction base='xs:string'>
<xs:enumeration value=''/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
]]></code>
</section1>
<section1 topic='Acknowledgements' anchor='acknowledgements'>
<p>The author would like to thank the Real Time Text Taskforce (R3TF) at &lt;<link url='http://www.realtimetext.org/'>www.realtimetext.org</link>&gt; for their contribution to the technology documented in this specification. Members of R3TF who have contributed to this document, including corrections and edits, include Gunnar Helstrom, Barry Dingle, Paul E. Jones, Anoud van Wijk, and Gregg Vanderheiden.</p>
<p>Natural Typing, a first in Internet text communications, is acknowledged as an invention by Mark Rejhon, who is deaf. This technology is provided to XMPP.org as part of this specification in compliance with the XSF's Intellectual Property Rights Policy at &lt;<link url='http://xmpp.org/extensions/ipr-policy.shtml'>http://xmpp.org/extensions/ipr-policy.shtml</link>&gt;. For more information, see <link url='#appendix-legal'>Appendix C: Legal Notices</link>.</p>
</section1>
</xep>