mirror of https://github.com/moparisthebest/xeps synced 2024-11-21 08:45:04 -05:00

Add profiles / partial sync to entity versioning

Sam Whited 2015-09-07 13:48:57 -05:00 committed by Matthew A. Miller
parent fb6a27747b
commit 61dd466023


@ -8,10 +8,9 @@
<header>
<title>Entity Versioning</title>
<abstract>
A method by which rosters and disco items may be versioned so that servers
will not need to send the entire list if it has not been modified, saving
bandwidth and time during session initialization with minimal state being
stored by the server and client.
A method by which lists of items may be versioned so that servers will not
need to send the entire list if it has not been modified, saving bandwidth
and time with minimal state being stored by the server and client.
</abstract>
&LEGALNOTICE;
<number>xxxx</number>
@ -37,6 +36,12 @@
<surname>Keen</surname>
<email>dkeen@atlassian.com</email>
</author>
<revision>
<version>0.0.2</version>
<date>2015-09-17</date>
<initials>ssw</initials>
<remark><p>Add profiles / partial sync.</p></remark>
</revision>
<revision>
<version>0.0.1</version>
<date>2015-08-25</date>
@ -48,17 +53,17 @@
<section1 topic='Introduction' anchor='intro'>
<p>
This problem of "downloading the world" (downloading the entire roster
every time a session is initialized) was partially addressed by &xep0237;
which was later merged into &rfc6121; §2.6. While this solved the problem
for the roster, it didn't account for other entities (eg. disco items).
every time a session is initialized or receiving an entire disco items
response every time a MUC list is queried, etc.) was partially addressed by
&xep0237; which was later merged into &rfc6121; §2.6. While this solved
the problem for the roster, it didn't account for other entities.
Furthermore, roster versioning requires that the server maintain a great
deal of state (roster items which should be pushed for each entity on
reconnect, or monotonically increasing counters, etc.) which can be
difficult to store or synchronize in a large, distributed system. This XEP
defines a method by which the roster and entities other than the roster can
be versioned and cached, and which is optimized for distributed systems
with large entity lists (but works equally well on small, single server
deployments).
defines a method by which generic entity lists can be versioned and cached,
and which is optimized for distributed systems with large entity lists (but
works equally well on small, single server deployments).
</p>
</section1>
@ -96,8 +101,8 @@
<dd>Any abstract object which may be versioned (eg. rooms, users).</dd>
<dt>Version Token</dt>
<dd>
A generally short, case sensitive string which represents an entity and
changes if that entity changes.
A short, case sensitive string which represents an entity and changes if
that entity changes.
</dd>
</dl>
</section1>
@ -144,55 +149,103 @@
</section2>
</section1>
<section1 topic='Entity Versioning Profiles' anchor='profiles'>
<p>
Because entity versioning is designed to be a generic system for syncing
any sort of list in XMPP, and the format and requirements of various entity
lists may vary greatly, no specific wire format is defined in this
specification. Instead, the specifics for various lists will be left up to
separate XEPs which will define entity versioning "profiles" which must be
registered with the XMPP registrar. These profiles will define exactly how
version tokens are represented in the specific list format for which they
wish to use entity versioning. The rest of this document will provide
details about entity versioning which will be common to all entity
versioning profiles and do not need to be redefined in EV profile XEPs. It
will also define an EV profile for fetching the roster.
</p>
<p>
The roster entity versioning profile which is used as an example throughout
this document will use the namespace 'urn:xmpp:entityver:profile:roster:0'
as described in the <link url="#registrar">XMPP Registrar
Considerations</link> section of this document.
</p>
</section1>
<section1 topic='Discovering Support' anchor='disco'>
<p>
If a server supports entity versioning, it MUST inform the connecting
client when returning stream features during the stream negotiation
process. This is done by including a &lt;ver/&gt; element, qualified by
the 'urn:xmpp:entityver:0' namespace. At the latest, this SHOULD
be done when informing a client that resource binding is required. For
example:
</p>
<example caption="Stream Features"><![CDATA[
<p>
If a server supports entity versioning, it MUST inform the connecting
client when returning stream features during the stream negotiation
process. This is done by including a &lt;ver/&gt; element, qualified by the
'urn:xmpp:entityver:0' namespace with child &lt;profile&gt; nodes for each
supported entity versioning profile. At the latest, this SHOULD be done
when informing a client that resource binding is required. For example, if
the server only supports versioning of rosters, it might return:
</p>
<example caption="Stream Features"><![CDATA[
<stream:features>
<bind xmlns='urn:ietf:params:xml:ns:xmpp-bind'>
<required/>
</bind>
<ver xmlns='urn:xmpp:entityver:0'/>
<ver xmlns='urn:xmpp:entityver:0'>
<profile xmlns='urn:xmpp:entityver:profile:roster:0'/>
</ver>
</stream:features>
]]></example>
]]></example>
<p>
The entity versioning stream feature is merely informative and therefore
is never mandatory-to-negotiate.
</p>
<p>
Clients, servers, and other entities that support &xep0030; and entity
versioning must respond to service discovery requests with a feature of
'urn:xmpp:entityver:0' and with a feature for each EV profile supported by
the responding entity as described in the relevant specifications. Eg. a
response from a server that supports roster versioning for the requesting
entity might look like the following:
<example caption="Service discovery information response"><![CDATA[
<iq from='shakespeare.lit'
id='ku6e51v3'
to='kingclaudius@shakespeare.lit/castle'
type='result'>
<query xmlns='http://jabber.org/protocol/disco#info'>
<feature var='urn:xmpp:entityver:0'/>
<feature var='urn:xmpp:entityver:profile:roster:0'/>
</query>
</iq>
]]></example>
</p>
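<p>
As a rough illustration of the discovery rules above, a client might check
the feature list of a disco#info result as sketched below. The function name
and the idea of passing features as a plain list of strings are assumptions
made for this sketch only; they are not defined by this specification.
</p>

```python
# Minimal sketch: decide which EV profiles a responding entity supports,
# given the 'var' values of the features in its disco#info result.
# supported_profiles() is a hypothetical helper, not part of any XMPP library.

EV_NS = "urn:xmpp:entityver:0"
PROFILE_PREFIX = "urn:xmpp:entityver:profile:"


def supported_profiles(features):
    """Return the sorted EV profile namespaces found in a feature list.

    Profiles are only reported if the base entity versioning feature is
    advertised as well.
    """
    if EV_NS not in features:
        return []
    return sorted(f for f in features if f.startswith(PROFILE_PREFIX))


features = [
    "urn:xmpp:entityver:0",
    "urn:xmpp:entityver:profile:roster:0",
]
profiles = supported_profiles(features)
```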
</section1>
<section1 topic='Version Tokens' anchor='version_tokens'>
<p>
Version tokens are short case-sensitive strings which are generated by the
server. Their format is not defined in this spec, but a recommendation may
be found in the <link url="#impl">Implementation Notes</link>. Version
tokens are akin to a weakly-validated ETag for the entity in question.
</p>
<p>
Servers that implement this protocol must assign such a version token to
each entity that is controlled by the server. The server MUST then update
this version every time any mutable property of the entity changes (eg.
when the subscription status of a user changes). The server MAY choose to
update this token at any time (to force the clients to invalidate their
cached representation of the object). This version token MUST then be
included with every object representation of that entity sent down in the
stream. This is done by including a sub-node called "version" qualified
by the entity versioning XML namespace defined in this document.
Similarly, clients MAY also add version nodes for each version token they
possess to the request for a list (not specifying a version token will
force the server to send information on that entity to the client). If a
client sends up a list of version tokens, the server MUST then check to
see if those tokens correspond to any entity which it knows about, and
not send down any entities with matching version tokens in the response.
</p>
<p>For example, a roster request might look like:</p>
<example caption="Roster Request"><![CDATA[
<section1 topic='Entity Sync' anchor='entity_sync'>
<section2 topic='Version Tokens' anchor='version_tokens'>
<p>
Version tokens are short case-sensitive strings which are generated by
the server. Their format is not defined in this spec, but a
recommendation may be found in the <link url="#impl">Implementation
Notes</link>. Version tokens are akin to a weakly-validated ETag for the
entity in question.
</p>
<p>
Servers that implement this protocol must assign such a version token to
each entity that is controlled by the server. The server SHOULD then
update this version every time any mutable property of the entity changes
(eg. when the subscription status of a user changes). The server MAY
choose to update this token at any time (to force the clients to
invalidate their cached representation of the object). This version token
MUST then be included with every object representation of that entity
sent down in the stream. This is done by including a sub-node called
"version" qualified by the entity versioning XML namespace defined in
this document. Similarly, clients MAY also add version nodes for each
version token they possess to the request for a list (not specifying a
version token will force the server to send information on that entity to
the client). If a client sends up a list of version tokens, the server
MUST then check to see if those tokens correspond to any entity which it
knows about, and not send down any entities with matching version tokens
in the response.
</p>
<p>
For example, a versioned roster request might look like this:
<example caption="Roster Request"><![CDATA[
<!-- Client -->
<iq from='romeo@montague.lit/home' id='56' to='romeo@montague.lit' type='get'>
<query xmlns='jabber:iq:roster'>
@ -213,64 +266,25 @@
</item>
</query>
</iq>
]]></example>
]]></example>
</p>
<p>
Note that in this case there may be three roster items total (and the
client only knows about two of them), or there may be two total roster
items and the server is informing the client about a change to
"bill@shakespeare.lit". Version tokens MUST also be present in roster
pushes:
</p>
<example caption="Roster Push"><![CDATA[
<!-- Server -->
<example caption="Roster Push"><![CDATA[
<iq from='romeo@montague.lit' id='ah382g678jka7' to='romeo@montague.lit/home' type='set'>
<query xmlns='jabber:iq:roster' ver='ver34'>
<item jid='tybalt@shakespeare.lit' subscription='remove'>
<version xmlns='urn:xmpp:entityver:0'>XWE4MUUP</version>
</item>
</query>
</iq>
]]></example>
<p>A disco request for rooms (as defined in &xep0045;) might look like:</p>
<example caption="MUC Disco"><![CDATA[
<!-- Client -->
<iq from='hag66@shakespeare.lit/phone'
id='zb8q41fas6yn4'
to='chat.shakespeare.lit'
type='get'>
<query xmlns='http://jabber.org/protocol/disco#items'>
<item jid='coven@chat.shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>25P2A7H8</version>
</item>
<item jid='inverness@chat.shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>4OLGSVNY</version>
<query xmlns='jabber:iq:roster' ver='ver34'>
<item jid='tybalt@shakespeare.lit' subscription='remove'>
<version xmlns='urn:xmpp:entityver:0'>XWE4MUUP</version>
</item>
</query>
</iq>
<!-- Server -->
<iq from='chat.shakespeare.lit'
id='zb8q41fas6yn4'
to='hag66@shakespeare.lit/phone'
type='result'>
<query xmlns='http://jabber.org/protocol/disco#items'>
<item jid='coven@chat.shakespeare.lit'
name='A Dark Cave'>
<version xmlns='urn:xmpp:entityver:0'>VIZSVF0D</version>
</item>
</query>
</iq>
]]></example>
<p>
In this example coven@chat.shakespeare.lit has been modified (eg. the
room name might have been changed), but inverness@chat.shakespeare.lit
has not changed, therefore no update is sent down.
</p>
<p>
Clients that implement this protocol SHOULD then cache the entity in
question when a version token is received.
</p>
]]></example>
</p>
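<p>
The matching rule described in this section could be sketched on the server
side roughly as follows. This is only an illustration: the function name and
the use of plain dictionaries keyed by JID are assumptions, and removed
entities are ignored here because they are covered by cache invalidation.
</p>

```python
# Minimal sketch of the server-side comparison for a full list request.
# entities_to_send() is a hypothetical helper; the spec defines no API.


def entities_to_send(server_versions, client_versions):
    """Return the entities whose tokens the client does not already hold.

    server_versions: dict mapping entity ID (eg. a JID) to its current token.
    client_versions: dict mapping entity ID to the token the client sent up.
    """
    reply = {}
    for entity_id, token in server_versions.items():
        # Version tokens are case sensitive, so compare them exactly.
        # An entity the client did not mention never matches and is sent.
        if client_versions.get(entity_id) != token:
            reply[entity_id] = token
    return reply


server = {
    "bill@shakespeare.lit": "9ZFZXVP9",
    "anne@shakespeare.lit": "VIZSVF0D",
}
client = {
    "bill@shakespeare.lit": "25P2A7H8",
    "anne@shakespeare.lit": "VIZSVF0D",
}
changed = entities_to_send(server, client)
```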
</section2>
<section2 topic='Cache Invalidation' anchor='deletes'>
<p>
When a client syncs with the server and indicates that it has a version
@ -278,127 +292,154 @@
the server wants to remove an entity from the client's cache for any other
reason), the server MUST reply with an empty &lt;version/&gt; node. When
the client receives such an empty version node it SHOULD purge the entity
from its cache. For example, the following exchange would trigger the
removal of 'inverness@chat.shakespeare.lit' from the cached MUC list:
<example caption="MUC deletion"><![CDATA[
<!-- Client -->
<iq from='hag66@shakespeare.lit/phone'
id='zb8q41fas6yn4'
to='chat.shakespeare.lit'
type='get'>
<query xmlns='http://jabber.org/protocol/disco#items'>
<item jid='coven@chat.shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>25P2A7H8</version>
</item>
<item jid='inverness@chat.shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>4OLGSVNY</version>
</item>
</query>
</iq>
<!-- Server -->
<iq from='chat.shakespeare.lit'
id='zb8q41fas6yn4'
to='hag66@shakespeare.lit/phone'
type='result'>
<query xmlns='http://jabber.org/protocol/disco#items'>
<item jid='inverness@chat.shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'/>
</item>
</query>
</iq>
]]></example>
from its cache. For example, the following would remove the roster item
'bill@shakespeare.lit' from the cache:
<example caption="Cache invalidation"><![CDATA[
<iq from='romeo@montague.lit' id='56' to='romeo@montague.lit/home' type='result'>
<query xmlns='jabber:iq:roster'>
<item subscription='both' jid='bill@shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'/>
</item>
</query>
</iq>
]]></example>
</p>
<p>
If the client receives an indication that it should delete an item from a
list by any other means (eg. via a roster push), it SHOULD remove the
version token associated with that entity from its cache.
When a roster push indicates a deleted item, the client MUST also remove that
item's version token from its cache (the push need not contain an empty
&lt;version/&gt; element).
</p>
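<p>
A client might apply the invalidation rule from this section as sketched
below. The cache is simplified to a dictionary of version tokens keyed by
JID, and an empty string stands in for an empty &lt;version/&gt; node; both
are assumptions of this sketch rather than requirements of the protocol.
</p>

```python
# Minimal sketch of client-side cache handling for a list response.
# apply_response() is a hypothetical helper name.


def apply_response(cache, items):
    """Update the cache in place from (entity_id, token) pairs.

    An empty token represents an empty version node, which tells the
    client to purge that entity. Any other token replaces the cached one.
    """
    for entity_id, token in items:
        if token == "":
            # pop() with a default tolerates entities that were never cached.
            cache.pop(entity_id, None)
        else:
            cache[entity_id] = token
    return cache


cache = {
    "bill@shakespeare.lit": "25P2A7H8",
    "anne@shakespeare.lit": "VIZSVF0D",
}
apply_response(
    cache,
    [
        ("bill@shakespeare.lit", ""),
        ("tybalt@shakespeare.lit", "XWE4MUUP"),
    ],
)
```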
</section2>
</section1>
<section1 topic='Aggregate Tokens' anchor='agg_tokens'>
<p>
While the version token approach to caching does not require a great deal
of state to be stored on the client or the server, it does require a lot
more information to be sent by the client when requesting a list of
entities. For a very large list which is not likely to have changed, it may
be useful to know in advance if the roster has changed or not (so that we can
avoid sending the large request entirely). To do this, we can request an
aggregate version token from the server. This aggregate token is calculated
by constructing a string of comma separated "JID:version" pairs sorted in
byte-wise order (because the JID:version pair is constructed before
sorting, if two items in the list have the same JID they can still be
sorted by the version token), and taking the MD5 hash of the constructed
string. For example, if the server is calculating the aggregate version
token for a roster, it might end up with the following string:
</p>
<example caption="Aggregate token list"><![CDATA[
<section2 topic='Partial sync'>
<p>
For very large groups, fetching an entire list may not be practical or
necessary. For example, one might imagine a large corporation with a
shared roster that is too large for its version tokens to be sent up to
the server on every sync, or even to download fully the first time. To
solve this, servers MAY choose to send down only a part of an entity list
in response to a query (unless the individual EV profile forbids partial
list sync). How servers choose what items to return is an implementation
detail that is out of the scope of this document. Some suggestions may be
found in the <link url="#impl">Implementation Notes</link>. On subsequent
requests for the entity list, the server MAY choose to return more
entities (eg. based on changes in its internal selection criteria),
however it MUST NOT invalidate cached entities unless they have actually
been removed from the list.
</p>
<p>
XEPs defining entity versioning profiles MUST include a section to
indicate if partial sync is supported, and if so, how it will be
indicated to the client (and how the client can request a full list). If
no mechanism is specified, this is done by adding a boolean "full_list"
attribute to the request, eg. a roster request for a partial list looks
like:
<example caption="Roster Request"><![CDATA[
<!-- Client -->
<iq from='romeo@montague.lit/home'
id='56'
to='romeo@montague.lit'
type='get'>
<query xmlns='jabber:iq:roster' full_list='false'>
<item jid='bill@shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>25P2A7H8</version>
</item>
<item jid='anne@shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>VIZSVF0D</version>
</item>
</query>
</iq>
<!-- Server -->
<iq from='romeo@montague.lit' id='56' to='romeo@montague.lit/home' type='result'>
<query xmlns='jabber:iq:roster' full_list='false'>
<item subscription='both' jid='bill@shakespeare.lit'>
<version xmlns='urn:xmpp:entityver:0'>9ZFZXVP9</version>
</item>
</query>
</iq>
]]></example>
</p>
<p>
When making a request for a partial list, clients do not need to send up
every entity in their cache. Instead they MAY send up just those entities
for which they wish to check for updates. The server MUST then respond
with any updates for those entities, and MAY also add other entities to
the list if desired. If the client requests a partial list but does not
indicate that it has anything in its cache, what entities to return (if
any) is left up to the server implementation.
</p>
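<p>
One possible reading of the partial sync rules above is sketched here from
the server's point of view. The function name, the "extras" parameter and
the use of an empty string to signal invalidation are assumptions of this
sketch; how a server actually selects additional entities is left to the
implementation.
</p>

```python
# Minimal sketch of a server answering a partial list request.
# partial_response() and its parameters are hypothetical.


def partial_response(server_versions, requested, extras=()):
    """Build the reply for a partial sync as a dict of ID to token.

    requested: dict of entity ID to the token the client sent up.
    extras: IDs the server chooses to add on its own initiative.
    """
    reply = {}
    for entity_id, token in requested.items():
        current = server_versions.get(entity_id)
        if current is None:
            # Only entities actually removed from the list are invalidated.
            reply[entity_id] = ""
        elif current != token:
            reply[entity_id] = current
    for entity_id in extras:
        if entity_id in server_versions and entity_id not in requested:
            reply[entity_id] = server_versions[entity_id]
    return reply


server = {
    "bill@shakespeare.lit": "9ZFZXVP9",
    "anne@shakespeare.lit": "VIZSVF0D",
    "tybalt@shakespeare.lit": "XWE4MUUP",
}
requested = {
    "bill@shakespeare.lit": "25P2A7H8",
    "anne@shakespeare.lit": "VIZSVF0D",
}
reply = partial_response(server, requested, extras=["tybalt@shakespeare.lit"])
```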
</section2>
<section2 topic='Aggregate Tokens' anchor='agg_tokens'>
<p>
While the version token approach to caching does not require a great deal
of state to be stored on the client or the server, it does require a lot
more information to be sent by the client when requesting a list of
entities. For a very large list which is not likely to have changed, it
may be useful to know in advance if the roster has changed or not (so that
we can avoid sending the large request entirely). To do this, we can
request an aggregate version token from the server. This aggregate token
is calculated by constructing a string of comma separated "ID:version"
pairs sorted in byte-wise order (because the ID:version pair is
constructed before sorting, if two items in the list have the same ID
they can still be sorted by the version token), and taking the MD5 hash
of the constructed string. The ID in the pair is any ID or key that
identifies the entity as defined in its profile (eg. a JID for roster
items and most other entities). For example, if the server is calculating
the aggregate version token for a roster, it might end up with the
following string:
</p>
<example caption="Aggregate token list"><![CDATA[
anne@shakespeare.lit:VIZSVF0D,bill@shakespeare.lit:25P2A7H8
]]></example>
<p>
Which results in the aggregate token:
</p>
<example caption="Aggregate token"><![CDATA[
]]></example>
<p>
Which results in the aggregate token:
</p>
<example caption="Aggregate token"><![CDATA[
0514fc90e6c7981b06bbb2173bb8ef03
]]></example>
<p>
The actual request is an IQ sent to the server, or entity handling the
versioned list which contains a query that specifies the namespace of the
list we want to fetch. Eg. to fetch the aggregate token for the roster one
would query the server with the type set to the `jabber:iq:roster`
namespace:
</p>
<example caption="Roster aggregate token request"><![CDATA[
]]></example>
<p>
The actual request is an IQ sent to the server, or entity handling the
versioned list which contains a query that specifies the namespace of the
list we want to fetch. Eg. to fetch the aggregate token for the roster
one would query the server with the query's XMLNS set to
'urn:xmpp:entityver:profile:roster:0':
</p>
<example caption="Roster aggregate token request"><![CDATA[
<!-- Client -->
<iq to='bill@shakespeare.lit' type='get' id='bill1'>
<query xmlns='urn:xmpp:entityver:0' type='jabber:iq:roster' />
<query xmlns='urn:xmpp:entityver:profile:roster:0'/>
</iq>
<!-- Server -->
<iq to='bill@shakespeare.lit/home' type='result' id='bill1'>
<query xmlns='urn:xmpp:entityver:0' type='jabber:iq:roster'>
<query xmlns='urn:xmpp:entityver:profile:roster:0'>
0514fc90e6c7981b06bbb2173bb8ef03
</query>
</iq>
]]></example>
<p>
Similarly, to fetch the aggregate token for a list of MUC rooms, one would
query the MUC component directly with the type set to the 'disco#items'
namespace:
</p>
<example caption="MUC rooms aggregate token request"><![CDATA[
<!-- Client -->
<iq to='chat.shakespeare.lit' type='get' id='bill2'>
<query xmlns='urn:xmpp:entityver:0' type='http://jabber.org/protocol/disco#items' />
</iq>
<!-- Server -->
<iq to='bill@shakespeare.lit/home' type='result' id='bill2'>
<query xmlns='urn:xmpp:entityver:0' type='http://jabber.org/protocol/disco#items'>
32151d1d01440d5536a7f106afd3f4d8
</query>
</iq>
]]></example>
<p>
Because aggregate tokens are OPTIONAL to implement, clients MUST fall back
to a normal request if any error is returned in response to an aggregate
token IQ.
</p>
<p>
If an aggregate token is requested for a list that may contain more than
one type of entity (eg. MUC rooms and pubsub nodes that live on the same
component), then the server MUST return the aggregate token constructed
with the entire list (rooms and pubsub nodes).
</p>
<p>
Clients are also NOT REQUIRED to check aggregate tokens. However, clients
MAY wish to check aggregate tokens before making a roster or MUC request
when the cached roster or MUC list is very large. When to check aggregate
tokens (if at all) is left up to the implementation.
</p>
]]></example>
<p>
Because aggregate tokens are OPTIONAL to implement, clients MUST fall
back to making a normal list request if any error is returned in response
to an aggregate token IQ.
</p>
<p>
If an aggregate token is requested for a list that may contain more than
one type of entity (eg. MUC rooms and pubsub nodes that live on the same
component), then the server MUST return the aggregate token constructed
with the entire list (rooms and pubsub nodes).
</p>
<p>
Because aggregate tokens are calculated for the entire list as seen by
the client or server, they will never match if partial lists have been
downloaded by the client.
</p>
<p>
Clients are also NOT REQUIRED to check aggregate tokens. However, clients
MAY wish to check aggregate tokens before making a roster or MUC request
when the cached roster or MUC list is very large. When to check aggregate
tokens (if at all) is left up to the implementation.
</p>
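<p>
The aggregate token calculation described in this section can be sketched as
follows. The helper names are hypothetical, and encoding the pairs as UTF-8
before sorting and hashing is an assumption of this sketch, since the text
above does not name an encoding.
</p>

```python
# Minimal sketch of the aggregate token calculation: build "ID:version"
# pairs, sort them byte-wise, join with commas, and take the MD5 hash.
import hashlib


def aggregate_string(items):
    """Return the comma separated, byte-wise sorted "ID:version" pairs.

    items: iterable of (entity_id, token) tuples. A list of tuples is used
    so that two entries with the same ID can still be represented.
    """
    pairs = [f"{entity_id}:{token}".encode("utf-8") for entity_id, token in items]
    pairs.sort()  # bytes objects compare byte by byte
    return b",".join(pairs)


def aggregate_token(items):
    """Return the hex encoded MD5 hash of the aggregate string."""
    return hashlib.md5(aggregate_string(items)).hexdigest()


roster = [
    ("bill@shakespeare.lit", "25P2A7H8"),
    ("anne@shakespeare.lit", "VIZSVF0D"),
]
token = aggregate_token(roster)
```

<p>
Because the pairs are sorted before hashing, the client and the server
arrive at the same token regardless of the order in which they store the
list.
</p>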
</section2>
</section1>
<section1 topic='Implementation Notes' anchor='impl'>
<p>
Version tokens may not provide enough collision resistance across versioned
@ -421,6 +462,22 @@ anne@shakespeare.lit:VIZSVF0D,bill@shakespeare.lit:25P2A7H8
version token MAY be used as a weakly validated ETag for any API requests
for that entity.
</p>
<p>
Servers following this specification may choose to send down partial entity
lists in response to queries. For the case of rosters one or more of the
following may be returned to the requesting entity during the initial
roster sync:
<ul>
<li>
Users that are grouped with the requester in some way. Eg. for a
company with a large shared roster which places the requesting client
in the "Marketing Department" group, the server may wish to return
roster items that also share that group.
</li>
<li>Users whom the requester has contacted recently or frequently.</li>
<li>Users that should always be returned as part of server policy.</li>
</ul>
</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>
@ -449,6 +506,53 @@ anne@shakespeare.lit:VIZSVF0D,bill@shakespeare.lit:25P2A7H8
&xep0053;.
</p>
</section2>
<section2 topic='Namespace Versioning' anchor='registrar-versioning'>
&NSVER;
</section2>
<section2 topic='Entity Versioning Profiles Registry' anchor='registrar-ev'>
<p>
The XMPP Registrar shall maintain a registry of entity versioning
profiles. All EV profile registrations shall be defined in separate
specifications (not in this document). Application types defined within
the XEP series MUST be registered with the XMPP Registrar, resulting in
protocol URNs of the form "urn:xmpp:entityver:profile:name:X" (where
"name" is the registered name of the profile and "X" is a non-negative
integer).
</p>
&REGPROCESS;
<code><![CDATA[
<profile>
<name>The name of the entity versioning profile.</name>
<desc>A natural-language summary of the profile.</desc>
<listdef>
The document in which the original list definition is specified.
</listdef>
<doc>
The document in which the EV profile for the list is specified (may be the
same as &lt;listdef/&gt;).
</doc>
</profile>
]]></code>
</section2>
<section2 topic='Entity Versioning Profiles' anchor='registrar-evprofile'>
<p>This specification defines the following entity versioning profile:</p>
<ul>
<li>urn:xmpp:entityver:profile:roster:0</li>
</ul>
<p>
Upon advancement of this specification from a status of Experimental to a
status of Draft, the &REGISTRAR; shall add the following definition to
the entity versioning profiles registry, as described in this document:
<code><![CDATA[
<profile>
<name>Roster entity versioning</name>
<desc>Allows versioning of entities in an XMPP roster.</desc>
<listdef>RFC 6121</listdef>
<doc>TODO: Insert this document once it is assigned a number</doc>
</profile>
]]></code>
</p>
</section2>
</section1>
<section1 topic='XML Schema' anchor='schema'>