From bcefb30da6c166c438d40b462162d7325e4ac2ff Mon Sep 17 00:00:00 2001 From: Sam Whited Date: Tue, 25 Aug 2015 18:31:22 -0500 Subject: [PATCH] Add initial draft of Entity Versioning. --- inbox/entityversioning.xml | 404 +++++++++++++++++++++++++++++++++++++ 1 file changed, 404 insertions(+) create mode 100644 inbox/entityversioning.xml diff --git a/inbox/entityversioning.xml b/inbox/entityversioning.xml new file mode 100644 index 00000000..a895be2c --- /dev/null +++ b/inbox/entityversioning.xml @@ -0,0 +1,404 @@ + + +%ents; +]> + + +
+ Entity Versioning + + A method by which rosters and disco items may be versioned so that servers + will not need to send the entire list if it has not been modified, saving + bandwidth and time during session initialization with minimal state being + stored by the server and client. + + &LEGALNOTICE; + xxxx + ProtoXEP + Standards Track + Standards + Council + + RFC 6120 + RFC 6121 + + + + EV + + Sam + Whited + swhited@atlassian.com + sam@samwhited.com + + + Doug + Keen + dkeen@atlassian.com + + + 0.0.1 + 2015-08-25 + ssw +

First draft

+
+
+ + +

+ This problem of "downloading the world" (downloading the entire roster + every time a session is initialized) was partially addressed by &xep0237; + which was later merged into &rfc6121; §2.6. While this solved the problem + for the roster, it didn't account for other entities (eg. MUC + disco#items). Furthermore, roster versioning requires that the server + maintain a great deal of state (multiple versions of the roster) which can + be difficult to implement in a large, distributed system. This XEP defines + a method by which entities other than the roster can be versioned and + cached. +

+
+ + +
    +
  • + An extra round trip MUST NOT be required to initiate entity versioning. +
  • +
  • + Clients that do not implement the protocol (but which use servers that + do) MUST still be able to request and receive entities normally. +
  • +
  • + Servers which implement this protocol MUST NOT be required to store + multiple versions of an entity list or maintain other redundant state. +
  • +
+ Inconsistent state between servers in a cluster should not cause cache + invalidation for the entire entity list. +
  • +
  • + Large changes SHOULD NOT be required for existing servers / clients. +
  • +
+
+ + +
+
Aggregate Token
+
+ A hash which represents the state of a list of entities, and changes if + any of those entities changes. +
+
Versioned Entity
+
Any abstract object which may be versioned (eg. rooms, users).
+
Version Token
+
+ A generally short, case sensitive string which represents an entity and + changes if that entity changes. +
+
+
+ + + +
    +
+ A client on a mobile device where bandwidth and throughput are limited + has a very large roster which causes connections to take an unacceptable + amount of time. With entity versioning, connections after the first + connection do not take as long, and use less bandwidth. +
  • +
+ A client often wants to view the list of multi-user chat rooms + available on a server's MUC service. However, the list is very long and + takes a long time to download. After enabling entity versioning the + client can fetch the list, and then poll for changes at a later date + without re-requesting the entire list. +
  • +
+
+ +
    +
+ A server is running in an environment where storing multiple versions + of each user's roster may put too much pressure on the storage backend. + After enabling entity versioning, they only have to store a small token + per user and can calculate the diffs to send to the client afterwards. +
  • +
  • + A server maintains an out-of-band HTTP API for fetching information + about MUC rooms to display on their web page. They wish to use a + reverse proxy to cache API requests based on etags. Instead of + attempting to check if the backend page has changed and generate etags, + the room's entity version token is used as a weakly-validated ETag. +
  • +
+
+
+ + + +

+ If a server supports entity versioning, it MUST inform the connecting + client when returning stream features during the stream negotiation + process. This is done by including a <ver/> element, qualified by + the 'urn:xmpp:features:entityver:0' namespace. At the latest, this SHOULD + be done when informing a client that resource binding is required. For + example: +

+ + + + + + + ]]> +

+ The entity versioning stream feature is merely informative and therefore + is never mandatory-to-negotiate. +

+
+ + +

+ Version tokens are short case-sensitive strings which are generated by + the server. Their format is not defined in this spec, but a + recommendation may be found in the Implementation Notes. Version tokens + are akin to a weakly-validated etag for the entity in question. +

+

+ Servers that implement this protocol must assign such a version token to + each entity that is controlled by the server. The server MUST then update + this version every time any mutable property of the entity changes (eg. + when the subscription status of a user changes). The server MAY choose to + update this token at any time (to force the clients to invalidate their + cached representation of the object). This version token MUST then be + included with every object representation of that entity sent down in the + stream. This is done by including a sub-node called "version" qualified + by the entity versioning XML namespace defined in this document. + Similarly, clients MAY also add version nodes for each version token they + possess to the request for a list (not specifying a version token will + force the server to send information on that entity to the client). If a + client sends up a list of version tokens, the server MUST then check to + see if those tokens correspond to any entity which it knows about, and + not send down any entities with matching version tokens in the response. +

+

For example, a roster request might look like:

+ + + + + 25P2A7H8 + + + VIZSVF0D + + + + + + + + 9ZFZXVP9 + + + + ]]> +

+ Note that in this case there may be three roster items total (and the + client only knows about two of them), or there may be two total roster + items and the server is informing the client about a change to + "bill@shakespeare.lit". Version tokens MUST also be present in roster + pushes: +

+ + + + XWE4MUUP + + + + ]]> +

A disco request for rooms (as defined in &xep0045;) might look like:

+ + + + 25P2A7H8 + + + 4OLGSVNY + + + + + + + + + VIZSVF0D + + + + ]]> +

+ In this example coven@chat.shakespeare.lit has been modified (eg. the + room name might have been changed), but inverness@chat.shakespeare.lit + has not changed, therefore no update is sent down. +
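
The server-side check described above can be sketched as a simple comparison between the tokens the client sent and the tokens the server currently holds. This is a minimal illustration, not part of the specification: the function name `entities_to_send` and the plain-dictionary representation are assumptions, and the pairing of tokens to rooms in the usage lines is assumed from the example above.

```python
from typing import Dict


def entities_to_send(
    server_versions: Dict[str, str],
    client_versions: Dict[str, str],
) -> Dict[str, str]:
    """Return the entities (JID -> version token) the server should send.

    Hypothetical helper: an entity is omitted only when the client supplied
    a version token that exactly matches the server's current token.
    Comparison is case sensitive, as version tokens are.
    """
    return {
        jid: version
        for jid, version in server_versions.items()
        if client_versions.get(jid) != version
    }


# Assumed pairing of the tokens from the disco example to the two rooms.
client = {
    "coven@chat.shakespeare.lit": "25P2A7H8",
    "inverness@chat.shakespeare.lit": "4OLGSVNY",
}
server = {
    "coven@chat.shakespeare.lit": "VIZSVF0D",
    "inverness@chat.shakespeare.lit": "4OLGSVNY",
}
changed = entities_to_send(server, client)
```

Under these assumptions only coven@chat.shakespeare.lit is returned, since its token differs from the one the client holds. An entity for which the client sent no token is always included.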

+

+ Clients that implement this protocol SHOULD then cache the entity in + question when a version token is received. +

+
+ +

+ While the version token approach to caching does not require a great deal + of state to be stored on the client or the server, it does require a lot + more information to be sent by the client when requesting a list of + entities. For a very large list which is not likely to have changed, it may + be useful to know in advance if the roster has changed or not (so that we can + avoid sending the large request entirely). To do this, we can request an + aggregate version token from the server. This aggregate token is calculated + by constructing a string of comma separated "bare JID:version" pairs sorted + in byte-wise order, and taking the MD5 hash of the constructed string. For + example, if the server is calculating the aggregate version token for a + roster, it might end up with the following string: +

+ +

+ Which results in the aggregate token: +

+ +
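
The aggregate token calculation described above could be sketched as follows. This is an illustration under stated assumptions: the function name `aggregate_token` is hypothetical, the pairs are assumed to be UTF-8 encoded before sorting, and the sort is assumed to apply to each whole "bare JID:version" pair rather than to the JID alone. The JIDs in the usage lines are placeholders.

```python
import hashlib
from typing import Dict


def aggregate_token(versions: Dict[str, str]) -> str:
    """Compute an aggregate token from a mapping of bare JID -> version token.

    Assumption: each "bare JID:version" pair is encoded as UTF-8 and the
    encoded pairs are sorted byte-wise before being joined with commas.
    """
    pairs = sorted(f"{jid}:{ver}".encode("utf-8") for jid, ver in versions.items())
    return hashlib.md5(b",".join(pairs)).hexdigest()


# Placeholder roster; the result does not depend on insertion order.
token = aggregate_token({
    "romeo@montague.lit": "25P2A7H8",
    "bill@shakespeare.lit": "VIZSVF0D",
})
```

Because the pairs are sorted before hashing, two servers in a cluster that hold the same entities and version tokens derive the same aggregate token regardless of the order in which they store the list.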

+ The actual request is an IQ sent to the server, or entity handling the + versioned list which contains a query that specifies the namespace of the + list we want to fetch. Eg. to fetch the aggregate token for the roster one + would query the server: +

+ + + + + + + + + 0514fc90e6c7981b06bbb2173bb8ef03 + + + ]]> +

+ Similarly, to fetch the aggregate token for a list of MUC rooms, one would + query the MUC component directly: +

+ + + + + + + + + 32151d1d01440d5536a7f106afd3f4d8 + + + ]]> +

+ Because aggregate tokens are OPTIONAL to implement, clients MUST fall back + to a normal request if any error is returned in response to an aggregate + token IQ. +

+

+ Clients are also NOT REQUIRED to check aggregate tokens. However, clients + MAY wish to check aggregate tokens before making a roster or MUC request + when the cached roster or MUC list is very large. When to check aggregate + tokens is left up to the clients. +
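
A client's decision logic around aggregate tokens might look like the sketch below. All names here are hypothetical: `AggregateTokenError` stands in for any error returned in response to the aggregate token IQ, and `fetch_token` stands in for whatever the client uses to send that IQ and read the token from the result.

```python
from typing import Callable, Optional


class AggregateTokenError(Exception):
    """Hypothetical: raised when the aggregate token IQ returns an error."""


def should_request_list(
    cached_token: Optional[str],
    fetch_token: Callable[[], str],
) -> bool:
    """Decide whether the client needs to send the (possibly large) list request.

    Returns True when nothing is cached, when the server returns an error
    (the client falls back to a normal request), or when the aggregate
    token differs from the cached one.
    """
    if cached_token is None:
        return True
    try:
        current = fetch_token()
    except AggregateTokenError:
        return True
    return current != cached_token
```

A client with a small cached list can skip this check entirely and send the normal request with its version tokens attached.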

+
+
+ + +

+ Version tokens may not provide enough collision resistance across versioned + entities (hereafter simply called "entities"), and may vary from server to + server, and therefore they MUST NOT be used as an entity identifier. +

+

+ Version tokens SHOULD always be considered opaque to the client (eg. even + if the version token is a derivable and consistent hash on the server side, + clients should not need to know how the server is calculating the token). +

+

+ The author RECOMMENDS using 8 character (32-bit) random alphanumeric ASCII + strings (eg. AABd7z9T) for version tokens. +
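
One way to generate tokens of the recommended shape is sketched below. The function name `new_version_token` and the choice of Python's `secrets` module as the random source are assumptions for illustration; the document does not define the token format.

```python
import secrets
import string

# Alphanumeric ASCII alphabet: A-Z, a-z, 0-9.
ALPHABET = string.ascii_letters + string.digits


def new_version_token(length: int = 8) -> str:
    """Return a random, case sensitive alphanumeric ASCII token."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


token = new_version_token()
```

A server would call this whenever a mutable property of an entity changes and store the result alongside the entity.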

+

+ If a server which supports this XEP provides an HTTP API which can be used + to fetch information about entities (eg. for listing information about MUC + rooms that a server provides on the provider's web page), the entity's + version token MAY be used as a weakly validated ETag for any API requests + for that entity. +
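
As a rough sketch of the ETag idea, a version token can be wrapped in the HTTP weak validator syntax (`W/"..."`). The helper names `weak_etag` and `is_fresh` are hypothetical, the sketch assumes tokens are alphanumeric ASCII as recommended above, and it handles only a simple comma separated If-None-Match header, not the `*` wildcard.

```python
def weak_etag(version_token: str) -> str:
    """Format a version token as a weak HTTP ETag header value."""
    return f'W/"{version_token}"'


def is_fresh(if_none_match: str, version_token: str) -> bool:
    """Return True if the If-None-Match header lists this token.

    Weak comparison: the W/ prefix is ignored on the header's entries.
    """
    candidates = [c.strip() for c in if_none_match.split(",")]
    stripped = [c[2:] if c.startswith("W/") else c for c in candidates]
    return f'"{version_token}"' in stripped
```

When `is_fresh` returns True, the HTTP API (or a reverse proxy in front of it) could answer with 304 Not Modified instead of the full entity representation.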

+
+ + +

+ Client-side caching of entity information across sessions (rather than + holding them in memory only for the life of a session) could pose a privacy + risk, especially on shared systems. Implementations SHOULD protect cached + entity data with strong encryption or other appropriate means. +

+
+ + +

This document requires no interaction with &IANA;.

+
+ + + +

This specification defines the following XML namespace:

+
    +
  • urn:xmpp:features:entityver:0
  • +
+

+ Upon advancement of this specification from a status of Experimental to a + status of Draft, the &REGISTRAR; shall add the foregoing namespace to the + registry located at &STREAMFEATURES;, as described in Section 4 of + &xep0053;. +

+
+
+ + +TODO + + + +

+ The original entity versioning proposal was engineered and written by + HipChat's Doug Keen. +

+
+ +