From 32f69a49e3bf7624e181424e2162d2aa20dc068b Mon Sep 17 00:00:00 2001
From: Peter Saint-Andre
Date: Thu, 27 Jun 2013 19:03:41 -0600
Subject: [PATCH] 0.10

---
 xep-0301.xml | 39 +++++++++++++++++++++++++--------------
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/xep-0301.xml b/xep-0301.xml
index 8c478ec9..b4932d87 100644
--- a/xep-0301.xml
+++ b/xep-0301.xml
@@ -38,6 +38,12 @@
gunnar.hellstrom@omnitor.se
http://www.omnitor.se
+
+ 0.10
+ 2013-06-24
+ MDR
+

Changes from Last Call feedback.

+
0.9 2013-05-18
@@ -155,9 +161,10 @@
  1. Allow seamless integration of real-time text into instant messaging clients, with minimal user interface modifications.
  2. -
  3. Be able to function over intermittent and unreliable connections, including mobile phones.
  4. +
  5. Be able to function securely over intermittent and unreliable connections, including mobile phones.
  6. Allow use within gateways to interoperate with other real-time text protocols, including RFC 4103 and ITU-T T.140 (ITU-T T.140: Protocol for multimedia application text conversation <http://www.itu.int/rec/T-REC-T.140>).
  7. +
  8. Be usable in an international setting.
@@ -177,6 +184,7 @@

real-time text – Text transmitted instantly while it is being typed or created, to allow recipient(s) to immediately read the sender's text as it is written, without waiting.

real-time message – Recipient's real-time view of the sender's message still being typed or created.

RTT – Acronym for real-time text.

+

simultaneous login – Multiple simultaneous sessions, on multiple clients, using the same login (Jabber Identifier).

@@ -214,7 +222,7 @@ -

This REQUIRED attribute is a counter to maintain the integrity of real-time text. Senders MUST increment this value by 1 for each subsequent edit to the same real-time message, including when appending new text. Recipients MUST monitor the seq value as an integrity check on received real-time text. The bounds of seq is 31-bits, the range of positive values for a signed 32-bit integer. See Keeping Real-Time Text Synchronized.

+

This REQUIRED attribute is a counter to maintain synchronization of real-time text. Senders MUST increment this value by 1 for each subsequent edit to the same real-time message, including when appending new text. Receiving clients MUST monitor this seq value as a lightweight verification on the synchronization of real-time text messages. The seq value is bounded to 31 bits, the range of positive values for a signed 32-bit integer. See Keeping Real-Time Text Synchronized.

This attribute signals events for real-time text.

@@ -271,7 +279,7 @@
  • Initialize a new real-time message: <rtt event='new'/> and <rtt event='reset'/>
    - Sender clients MUST use either element as the first <rtt/> transmission of a new real-time message. Recipient clients MUST initialize a new blank real-time message for display, and then process all Action Elements (e.g. text insertions and deletions) included within the <rtt/> element. If a real-time message already exists from the same sender in the same chat session, its content MUST be seamlessly replaced (i.e. cleared prior to immediately processing action elements).

  • + Sender clients MUST use an <rtt/> element containing either event='new' or event='reset' in the first transmission of a new real-time message. Recipient clients MUST initialize a new blank real-time message for display, and then process all Action Elements (e.g. text insertions and deletions) included within the <rtt/> element. If a real-time message already exists from the same sender in the same chat session, its content MUST be seamlessly replaced (i.e. cleared prior to immediately processing action elements).

  • Both <rtt event='new'/> and <rtt event='reset'/> are logically identical to recipients, except for presentation:
    For recipients, these differ only for optional presentation purposes (e.g. highlighting newly started incoming messages). Senders SHOULD use event='new' when sending the first text of a new message (e.g. the first key presses), and only use event='reset' when doing Message Refresh or Simple Real-Time Text. See Keeping Real-Time Text Synchronized.

  • Sending modifications of a real-time message: Outgoing <rtt event='edit'/> or <rtt/>
    Sender clients SHOULD transmit this element at a regular Transmission Interval while the message is being modified. The seq attribute MUST increment by 1 for every consecutive modification transmitted. See Sending Real-Time Text.

  • @@ -282,7 +290,7 @@ Clients MAY use this value to signal activation of real-time text without first starting a real-time message, since the sender may not start composing immediately. The seq attribute is ignored by recipient clients. See Activating and Deactivating Real-Time Text.

  • Ending real-time text: <rtt event='cancel'/>
    Clients MAY use this value to signal deactivation of real-time text. Clients receiving this element SHOULD also discontinue sending <rtt/> elements for the remainder of the same one-to-one chat session (until event='init' is used again), and handle any unfinished real-time messages appropriately (e.g. clearing or saving the message). The seq attribute is ignored by recipient clients. See Activating and Deactivating Real-Time Text.

  • -
  • Starting value for seq attribute:
    Sender clients MAY use any new starting value for seq when initializing a real-time message using event='new' or event='reset'. Recipient clients receiving such elements MUST use this seq value as the new starting value. A random value is RECOMMENDED for improved integrity during Usage with Multi-User Chat and Simultaneous Logins.

  • +
  • Starting value for seq attribute:
    Sender clients MAY use any new starting value for seq when initializing a real-time message using event='new' or event='reset'. Recipient clients receiving such elements MUST use this seq value as the new starting value. A random starting value is RECOMMENDED to improve reliability of Keeping Real-Time Text Synchronized during Usage with Multi-User Chat and Simultaneous Logins.
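
The sender-side handling of seq described above can be sketched as follows. This is a minimal illustration, not part of the specification: the class name SeqCounter, its methods, and the choice to draw the random start from the lower half of the 31-bit range (to leave room for increments) are all assumptions made for this example.

```python
import random

SEQ_MAX = 2**31 - 1  # largest positive value of a signed 32-bit integer


class SeqCounter:
    """Hypothetical sender-side helper that tracks the seq attribute."""

    def __init__(self, rng=None):
        self._rng = rng or random.Random()
        self.seq = 0

    def start_message(self):
        """Pick a random starting seq for event='new' or event='reset'."""
        # Assumption: draw from the lower half of the range so that a
        # single message has ample room to increment before SEQ_MAX.
        self.seq = self._rng.randrange(0, 2**30)
        return self.seq

    def next_edit(self):
        """Return the seq for the next edit: the previous value plus 1."""
        if self.seq >= SEQ_MAX:
            raise OverflowError("seq would exceed the 31-bit bound")
        self.seq += 1
        return self.seq
```

A sender would call start_message() when transmitting event='new' or event='reset', and next_edit() for every subsequent transmission of the same real-time message.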

@@ -390,7 +398,7 @@
  • During Multi-User Chat (e.g. participants joining/leaving while other participants are composing).
  • After message stanzas are lost in transit (e.g. Congestion Considerations).
  • -

    Recipient clients MUST keep track of separate real-time messages on a per-sender basis, including tracking independent seq values. For implementation simplicity, recipient clients MAY track incoming <rtt/> elements per &LOCALBARE; to keep only one real-time message per sender. Recipient client handling of conflicting <rtt/> elements (e.g. coming concurrently from separate Simultaneous Logins) is described in the remainder of this section. Alternatively, recipient clients MAY keep track of separate real-time messages per &LOCALFULL; and/or per <thread/> (&xep0201;).

    +

    Recipient clients MUST keep track of separate real-time messages on a per-sender basis, including tracking independent seq values. For implementation simplicity, recipient clients MAY track incoming <rtt/> elements per bare JID &LOCALBARE; to keep only one real-time message per sender. Recipient client handling of conflicting <rtt/> elements (e.g. coming concurrently from separate Simultaneous Logins) is described in the remainder of this section. Alternatively, recipient clients MAY keep track of separate real-time messages per full JID &LOCALFULL; and/or per <thread/> (&xep0201;).

    By following Processing Rules, the recipient client creates a new real-time message when receiving <rtt event='new'/> or <rtt event='reset'/>. Thereafter, when receiving text modifications (i.e. <rtt event='edit'/> or <rtt/> without an event attribute):

      @@ -400,7 +408,7 @@
    -

    Loss of sync occurs during receiving text modifications if the seq attributes do not increment by 1 as expected, or if no real-time message exists. In this case:

    +

    Loss of sync occurs during receiving text modifications if the seq attribute does not increment by 1 as expected, or if no real-time message exists. In this case:

    • Recipients MUST keep the real-time message unchanged (if any exists); and
    • Recipients MUST ignore subsequent text modifications (i.e. <rtt event='edit'/> or <rtt/> without an event attribute); and
    • @@ -420,7 +428,7 @@
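
The recipient-side seq check and the loss-of-sync rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the class RttReceiver, its handle() method, and the apply_actions callback (a function mapping the old message text to the new one) are hypothetical names, and recovery on the next event='new' or event='reset' follows from the rule that such elements replace any existing real-time message.

```python
class RttReceiver:
    """Hypothetical per-sender state for one incoming real-time message."""

    def __init__(self):
        self.text = None       # None means no real-time message exists yet
        self.seq = None
        self.in_sync = False

    def handle(self, event, seq, apply_actions):
        """Process one <rtt/> element.

        event is 'new', 'reset', or 'edit' (use 'edit' when the attribute
        is absent); apply_actions maps the old text to the new text.
        Returns True if the element was applied.
        """
        if event in ("new", "reset"):
            # Start from a blank message and adopt the sender's seq.
            self.text = apply_actions("")
            self.seq = seq
            self.in_sync = True
            return True
        if event == "edit":
            if self.in_sync and self.text is not None and seq == self.seq + 1:
                self.text = apply_actions(self.text)
                self.seq = seq
                return True
            # Loss of sync: keep the message unchanged and ignore edits
            # until the next 'new' or 'reset' arrives.
            self.in_sync = False
            return False
        return False  # other events (e.g. 'init', 'cancel') are not modelled
```

One such object would be kept per sender (for example, keyed by bare JID), matching the per-sender tracking described earlier in this section.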

      The message refresh SHOULD be transmitted regularly at an average interval of 10 seconds during active typing or composing. This interval is frequent enough to minimize user waiting time, while being infrequent enough to not cause a significant bandwidth overhead. This interval MAY vary, or be set to a longer time period, in order to reduce average bandwidth (e.g. long messages, infrequent or minor message changes). To save bandwidth, message refreshes SHOULD NOT occur continuously while the sender is idle. To allow quicker resumption of real-time text, sender clients MAY adjust the timing of the message refresh to occur right after any of the following additional events:

      • When the recipient starts sending messages from a different full JID (e.g. switched clients);
      • -
      • When the recipient becomes available (e.g. presence changes to 'chat');
      • +
      • When the recipient's presence changes to a more available state (e.g. <show/> value of 'chat');
      • When the sender resumes composing after an extended pause (e.g. recipient may have cleared Stale Messages);
      • When the conversation is unlocked (e.g. section 5.1 of XMPP IM);
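
The refresh timing described above can be sketched as a simple decision function. This is an illustration only: the function name refresh_due, its parameters, and the way idleness is detected (no edit since the last refresh) are assumptions for this example, and the force flag stands in for any of the additional events listed above.

```python
def refresh_due(now, last_refresh, last_edit, interval=10.0, force=False):
    """Return True if a message refresh should be sent at time `now`.

    All times are in seconds. `force` models an additional trigger such
    as the recipient switching to a different full JID.
    """
    if force:
        return True
    if now - last_refresh < interval:
        return False  # too soon since the previous refresh
    # Assumption: the sender is idle if nothing changed since the last
    # refresh, in which case no further refresh is sent.
    return last_edit > last_refresh
```

With this rule, one final refresh follows the last edit, after which refreshes stop until the sender resumes composing.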
      @@ -491,7 +499,7 @@

      For high quality presentation of real-time text, the original look-and-feel of typing can be preserved independently of the transmission interval. This is achieved using Element <w/> – Wait Interval between other Action Elements. Sender clients can transmit the length of pauses between key presses, and send multiple key presses in a single <message/> stanza. Recipient clients that process <w/> elements are able to display the sender's typing smoothly without sudden bursts of text. See Examples of Key Press Intervals.

      -

      When key press intervals are preserved at high precision, all subtleties of typing are preserved, including the 'mood' (calm typing versus panicked or emphatic typing, etc.). Much as VoIP allows accurate packet transmission of sound, this spec allows accurate packet transmission of original typing look-and-feel. This enables the real-time feel of typing over virtually any network connection, without requiring frequent transmission intervals. Look and feel of typing is also preserved over variable latency connections including &xep0206;, mobile phone, satellite and long international connections with heavy packet-bursting tendencies.

      +

      When key press intervals are preserved at high precision, all subtleties of typing are preserved, including the 'mood' (calm typing versus panicked or emphatic typing, etc.). Much as Voice over IP (VoIP) allows accurate packet transmission of sound, this spec allows accurate packet transmission of original typing look-and-feel. This enables the real-time feel of typing over virtually any network connection, without requiring frequent transmission intervals. Look and feel of typing is also preserved over variable latency connections including &xep0206;, mobile phone, satellite and long international connections with heavy packet-bursting tendencies.

      There are specialized situations such as live transcriptions and captioning (e.g. transcription service, closed captioning provider, captioned telephone, Communication Access Realtime Translation (CART), relay services) that demand low latency transmission. Such systems typically use voice recognition and/or stenotype machines, which output text in word or phrase bursts rather than a character at a time. It can be acceptable for senders with bursty output to immediately transmit word or phrase bursts of text without buffering, as long as the average stanza rate is not excessive. This eliminates any lag caused by the Transmission Interval. It is not necessary to transmit Element <w/> – Wait Interval for real-time transcription.

      @@ -529,7 +537,7 @@
      -

      Recipient clients can choose to display a separate cursor/caret indicator within incoming real-time messages. This can improve usability of real-time text, since it becomes easier for a recipient to observe the sender's real-time message edits. For clients that do not implement a remote cursor, skip this section.

      +

      Recipient clients can choose to display a remote cursor within incoming real-time messages. A remote cursor is a cursor/caret indicator shown within incoming real-time messages, separate from the user's local cursor for outgoing messages. This can improve usability of real-time text, since it becomes easier for a recipient to observe the sender's real-time message edits. Implementers of clients without a remote cursor can skip this section.

      Action Elements use only absolute positioning (relative positions are not used by this specification), so clients do not need to remember the position value from previous action elements. Recipient software can calculate the remote cursor position as follows:

      • Upon receiving Element <t/> – Insert Text, the cursor position is the p attribute plus the length of the text being inserted. The cursor position is put at the end of the inserted text.
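
The cursor calculation for a text insertion can be sketched as follows. This is a minimal illustration: the function name insert_text is hypothetical, and two behaviours are assumptions of this example rather than statements from the text above, namely that an omitted p means "append at the end" and that out-of-range positions are clamped. Positions are counted as Python str indices.

```python
def insert_text(message, text, p=None):
    """Apply a <t/> insertion and return (new_message, cursor_position)."""
    if p is None:
        p = len(message)  # assumption: omitted p appends at the end
    p = max(0, min(p, len(message)))  # assumption: clamp out-of-range p
    new_message = message[:p] + text + message[p:]
    cursor = p + len(text)  # cursor sits at the end of the inserted text
    return new_message, cursor
```

For example, inserting "big " at p=6 into "Hello world" yields "Hello big world" with the remote cursor at position 10.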
        @@ -592,7 +600,7 @@

      • Upon receiving Action Elements in incoming <rtt/> elements, they are added to a queue in the order they are received. This provides immunity to variable network conditions, since the queueing action will smooth out incoming transmission (e.g. receiving new <rtt/> while still processing action elements from a delayed <rtt/>).

      • The recipient client processes action elements in the queue in sequential order, including pauses from Element <w/> – Wait Interval, if supported. This is equivalent to playing back the sender's original typing.

      • -

        If Element <w/> – Wait Interval] is supported, excess lag in incoming real-time text can occur when delayed <rtt/> elements get delivered (e.g. congestion, intermittent wireless reception). To avoid delayed presentation of real-time text, the recipient client needs to speed up processing of action elements. This can be accomplished through a variety of techniques, such as shortening the pauses (n value) in <w/> elements, ignoring excess <w/> elements, immediately outputting action elements that are still queued, and/or keeping action elements from a limited number of <rtt/> elements queued (immediately outputting any prior action elements). This allows lagged real-time text to catch up more quickly.

        +

        If Element <w/> – Wait Interval is supported, excess lag in incoming real-time text can occur when delayed <rtt/> elements get delivered (e.g. congestion, intermittent wireless reception). To avoid delayed presentation of real-time text, the recipient client needs to speed up processing of action elements. This can be accomplished through a variety of techniques, such as shortening the pauses (n value) in <w/> elements, ignoring excess <w/> elements, immediately outputting action elements that are still queued, and/or keeping action elements from a limited number of <rtt/> elements queued (immediately outputting any prior action elements). This allows lagged real-time text to catch up more quickly.
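
One of the catch-up techniques named above, shortening the pauses in <w/> elements, can be sketched as follows. This is an illustration under assumptions: the function name shorten_waits, the (kind, value) tuple representation of queued action elements, and the proportional scaling policy are all choices made for this example.

```python
def shorten_waits(actions, budget_ms):
    """Scale <w/> pauses so queued playback fits within budget_ms.

    actions is a list of (kind, value) tuples where kind 'w' carries a
    pause in milliseconds; other kinds are passed through untouched.
    """
    total = sum(v for k, v in actions if k == "w")
    if total == 0 or total <= budget_ms:
        return list(actions)  # no lag to recover
    scale = max(budget_ms, 0) / total
    return [(k, int(v * scale)) if k == "w" else (k, v) for k, v in actions]
```

Scaling all pauses by the same factor keeps the relative rhythm of the typing while bounding the total playback delay; a budget of 0 corresponds to immediately outputting everything that is queued.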

        Upon receiving a Body Element indicating a completed message, it is acceptable for the full message text from <body/> to be displayed immediately in place of the real-time message, and for any unprocessed action elements to be discarded. This prevents any delay in displaying the final message; however, it may cause a sudden surge of text in some situations.

        If the <w/> element is not supported, receiving clients can use an alternate text-smoothing method in order to Avoid Bursty Text Presentation (e.g. time-smoothed progressive output of received real-time text).

        @@ -624,7 +632,7 @@

        To minimize on-screen clutter of multiple idle real-time messages, clients can hide idle messages, clear old Stale Messages, and/or prioritize the display of the most useful real-time messages. Prominent visibility of real-time text can be assigned to recent typists and/or moderators (e.g. classroom teacher, convention speaker). For the same participant logged in multiple times in the same room, see Simultaneous Logins. In situations of simultaneous typing by a large number of participants, see Congestion Considerations.

        -

        In simultaneous login situations, transmitting of <rtt/> works in one-to-many situations without any special software support. For many-to-one situations where there is incoming <rtt/> from more than one simultaneous login, Keeping Real-Time Text Synchronized will pause the real-time message upon conflicting <rtt/>, and resume during the next Message Refresh, presumably from the active login. This provides a seamless system-switching experience. A good implementation of Message Refresh will improve user experience, regardless of whether or not the client follows &xep0296;. Clients can choose to distinguish the <rtt/> streams (via full JID and/or via <thread/>) and keep multiple concurrent real-time messages similar in manner to Multi-User Chat, with the Stale Messages being timed-out.

        +

        In situations where there are multiple sessions from the same JID (i.e. simultaneous logins on multiple clients/devices), transmission of <rtt/> works in one-to-many situations without any special software support. For many-to-one situations where there is incoming <rtt/> from multiple sessions under the same JID, Keeping Real-Time Text Synchronized will pause the real-time message upon conflicting <rtt/>, and resume during the next Message Refresh, presumably from the active session. This provides a seamless system-switching experience. A good implementation of Message Refresh will improve user experience, regardless of whether or not the client follows &xep0296;. Clients can choose to distinguish the <rtt/> streams (via full JID and/or via <thread/>) and keep multiple concurrent real-time messages in a manner similar to Multi-User Chat, with Stale Messages being timed out.

        @@ -633,7 +641,7 @@

        Senders that resume composing a message (e.g. continuing a partially composed message hours later) can do a Message Refresh, which allows recipients to redisplay the real-time message.

        -

        With real-time text, frequent screen updates can occur. Screen updates are a potential performance bottleneck, since fast typists type many key presses per second. Optimizing screen updates becomes especially important for slower platforms. The real-time message might be implemented as a separate window or separate display element.

        +

        With real-time text, frequent screen updates can occur. Screen updates are a potential performance bottleneck, since fast typists type many key presses per second. Optimizing screen updates is more important on slower platforms. The real-time message might be implemented as a separate window or separate display element.

        Battery life considerations are closely related to performance, as the addition of real-time text can have an impact on battery life. If Preserving Key Press Intervals is supported, then support for Element <w/> – Wait Interval needs to be implemented in a battery-efficient manner. The Transmission Interval can vary dynamically to optimize for battery life and wireless reception. For devices where screen updates are an unavoidable, inefficient bottleneck, see Low-Bandwidth and Low-Precision Text Smoothing to reduce the number of screen updates per second.

        @@ -976,7 +984,9 @@
        RFC 5194: Framework for Real-Time Text over IP Using the Session Initiation Protocol (SIP) <http://tools.ietf.org/html/rfc5194>. and ITU-T T.140. Clients that run on multiple networks might need to utilize multiple real-time text technologies. To interoperate between incompatible real-time text technologies, gateway servers can transcode between different real-time text technologies, along with other media such as audio and video. This can include TTY and textphones.

        In the SIP environment, real-time text is specified in IETF RFC 4103 and ITU-T T.140. SIP is a popular real-time session control protocol, and there are many implementations of real-time text controlled by SIP. This includes emergency services in some regions.

        -

        Interoperability considerations include addressing translation, media negotiation and translation, and media transcoding. Transcoding is straightforward between this specification and T.140/RFC4103, except for editing in the middle of messages. Text insertions or deletions, occurring far back in the message, can cause a large number of erase operations in T.140 that consume time and bandwidth. T.140 specifies the use of ISO 6429 control codes for presentation characteristics, such as text color, that are not supported by this specification. During transcoding, these control codes needs to be filtered off in order to not disturb the presentation of text.

        +

        Interoperability considerations include addressing translation, media negotiation and translation, and media transcoding. Transcoding is straightforward between this specification and T.140/RFC4103, except for editing in the middle of messages. Text insertions or deletions, occurring far back in the message, can cause a large number of erase operations in T.140 that consume time and bandwidth. T.140 specifies the use of ISO 6429 control codes for presentation characteristics, such as text color, that are not supported by this specification. During transcoding, these control codes need to be filtered out in order to not disturb the presentation of text. Guidance on address translation and conveyance between XMPP and SIP can be found in Interworking between SIP and XMPP: Instant Messaging (draft-saintandre-sip-xmpp-im: Interworking between the Session Initiation Protocol (SIP) and the Extensible Messaging and Presence Protocol (XMPP): Instant Messaging <http://tools.ietf.org/html/draft-saintandre-sip-xmpp-im-03>).

        According to ITU-T Rec. F.703, the “Total Conversation” standard defines the simultaneous use of audio, video, and real-time text. For convenience, real-time communication applications can be designed to have automatic negotiation of as many as possible of the three media preferred by the users.

        @@ -999,11 +1009,12 @@
      • Encryption at the <message/> stanza level (e.g. XEP-0200) can be used for all stanzas containing either <rtt/> or <body/>. It is noted that real-time text can have a higher rate of message stanzas, contributing to additional overhead. See Congestion Considerations.

      • Encryption at the <body/> level (e.g. deprecated XEP-0027) does not encrypt <rtt/>. In this case, <rtt/> needs to be encrypted separately. It is preferable to use a broader level of encryption, where possible.

      +

      It is possible for the timing of individual key presses to be used as a timing attack on encryption. Protection against this is provided by buffering of key presses into a regular Transmission Interval. As an additional measure of security, the risk of timing attacks can be further mitigated by padding <rtt/> elements to lengths not clearly related to the number of characters in the message. Alternatively, general XMPP protection mechanisms hiding length information can be applied on the complete message exchange instead of (or in concert with) <rtt/> specific protection mechanisms.

      The nature of real-time text can result in more frequent transmission of <message/> stanzas than would otherwise happen in a non-real-time text conversation. This can lead to increased network and server loading of XMPP networks.

      Transmission of real-time text can be throttled temporarily during poor network conditions. It is appropriate to use latency monitoring mechanisms (e.g. &xep0184; or &xep0198;) in order to temporarily adjust the Transmission Interval of real-time text beyond the recommended range. This results in lagged text (less real-time) but is better than failure during poor network conditions. The use of Message Refresh can also retransmit real-time text lost by poor network conditions, including stanzas dropped during a network issue or server error. These techniques are useful for mission-critical applications such as next generation emergency services (e.g. text to 9-1-1).

      -

      Excess numbers of real-time messages (e.g. during a DoS scenario in Multi-User Chat) might cause local resource-consumption issues, which can be mitigated by accelerated time-out of Stale Messages.

      +

      Excess numbers of real-time messages (e.g. during a Denial of Service (DoS) scenario in Multi-User Chat) might cause local resource-consumption issues, which can be mitigated by accelerated time-out of Stale Messages. Also see &xep0205;.

      According to multiple university studies worldwide (including a Carnegie Mellon University study, Communication Characteristics of Instant Messaging: Effects and Predictions of Interpersonal Relationships <http://seattle.intel-research.net/~davraham/pubs/Avrahami_CSCW_06.pdf>), the average length of instant messages is under 40 characters. The additional incremental bandwidth overhead of real-time text can be very low for an existing XMPP client, especially one already using many extensions. Bandwidth usage can be further reduced using stream compression, which benefits bandwidth-constrained networks (e.g. GPRS, 3G, satellite).