Address LC feedback around groupchat messages in user archives

This is largely Dave Cridland's suggestion from the list, in response to Georg's
feedback, with Georg having checked the material details, along with others
in the xsf MUC. I believe everyone is happy with this.
Kevin Smith 2021-09-08 11:35:25 +01:00 committed by Jonas Schäfer
parent c0ff942bf8
commit 01fe1b35dc
1 changed files with 53 additions and 1 deletions


@ -29,6 +29,14 @@
</schemaloc>
&mwild;
&ksmith;
<revision>
<version>0.8.0</version>
<date>2021-10-12</date>
<initials>ks</initials>
<remark>
<p>Update groupchat-messages-in-user-archive advice, introducing fields and disco features to make behaviour explicit in future implementations, in light of Last Call feedback.</p>
</remark>
</revision>
<revision>
<version>0.7.5</version>
<date>2021-09-24</date>
@ -477,6 +485,42 @@
<p>If any UID requested by the client in any of the 'before-id', 'after-id' or 'ids' form fields is not present in the archive, the server MUST return an item-not-found error in response to the query.</p>
</section3>
<section3 topic='Including groupchat results in a user archive' anchor='query-include-groupchat'>
<p>If the server advertises that it includes groupchat messages in a user's archive (see <link url='#support'>Determining support</link>), a client may query a user archive and request that they be included in the results by setting the 'include-groupchat' field to 'true'.
</p>
<example caption='Querying the archive and including groupchat messages in results'><![CDATA[
<iq type='set' id='juliet1'>
<query xmlns='urn:xmpp:mam:2'>
<x xmlns='jabber:x:data' type='submit'>
<field var='FORM_TYPE' type='hidden'>
<value>urn:xmpp:mam:2</value>
</field>
<field var='include-groupchat'>
<value>true</value>
</field>
...
</x>
</query>
</iq>
]]></example>
<p>Whether the server advertises that it includes groupchat messages in the archive or advertises that it doesn't, a client may request that they not be included by setting the 'include-groupchat' field to 'false'.</p>
<example caption='Querying the archive and excluding groupchat messages from results'><![CDATA[
<iq type='set' id='juliet1'>
<query xmlns='urn:xmpp:mam:2'>
<x xmlns='jabber:x:data' type='submit'>
<field var='FORM_TYPE' type='hidden'>
<value>urn:xmpp:mam:2</value>
</field>
<field var='include-groupchat'>
<value>false</value>
</field>
...
</x>
</query>
</iq>
]]></example>
<p>Note that where the client doesn't specify the 'include-groupchat' field, it is implementation-defined whether groupchat messages are included in the results (see <link url='#business_rules'>Business Rules</link>). Clients MUST NOT include this field where servers don't advertise support, as the server would reject such a form.</p>
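<p>The rule above can be sketched from the client side as follows. This is a minimal illustration, not part of the specification: the function name build_mam_query and its parameters are hypothetical, and it assumes the client has already collected the server's disco#info features into a set. The field is only added when the 'urn:xmpp:mam:2#groupchat-field' feature is advertised.</p>

```python
import xml.etree.ElementTree as ET

MAM_NS = "urn:xmpp:mam:2"
DATA_NS = "jabber:x:data"
GROUPCHAT_FIELD_FEATURE = MAM_NS + "#groupchat-field"


def build_mam_query(query_id, server_features, include_groupchat=None):
    """Build a MAM query <iq/> (hypothetical helper, for illustration only).

    server_features: set of disco#info feature strings seen for the archive.
    include_groupchat: True, False, or None to leave the field out.
    """
    iq = ET.Element("iq", {"type": "set", "id": query_id})
    query = ET.SubElement(iq, "{%s}query" % MAM_NS)
    form = ET.SubElement(query, "{%s}x" % DATA_NS, {"type": "submit"})

    form_type = ET.SubElement(
        form, "{%s}field" % DATA_NS, {"var": "FORM_TYPE", "type": "hidden"}
    )
    ET.SubElement(form_type, "{%s}value" % DATA_NS).text = MAM_NS

    # Only send the field when the server advertises that it understands it;
    # otherwise the server would reject the form.
    if include_groupchat is not None and GROUPCHAT_FIELD_FEATURE in server_features:
        field = ET.SubElement(
            form, "{%s}field" % DATA_NS, {"var": "include-groupchat"}
        )
        value = ET.SubElement(field, "{%s}value" % DATA_NS)
        value.text = "true" if include_groupchat else "false"
    return iq
```

<p>Passing include_groupchat=None leaves the field out entirely, in which case the server's implementation-defined default applies.</p>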
</section3>
<section3 topic='Retrieving form fields' anchor='query-form'>
<p>In order for the client to find out about additional fields the server might support, it can send an iq stanza of type 'get' addressed to the archive like this:</p>
@ -503,6 +547,7 @@
<open/>
</validate>
</field>
<field type='boolean' var='include-groupchat'/>
<field type='text-single' var='{http://example.com/}free-text-search'/>
<field type='text-single' var='{http://example.com/}stanza-content'/>
</x>
@ -749,9 +794,11 @@
<p>No requirements are placed on how a server implements its storage beyond that it has to store data sufficient to be able to comply with this document. When this document describes storage requirements (e.g. MUST NOT store more than one copy...), it refers to what would appear to have been stored in order to satisfy the query.</p>
<p>If an entity (user's server, MUC room, pubsub node, ...) rejects an incoming message (such as from an occupant not allowed to send messages to the room, a user not authorized to publish to a pubsub node, a contact blocked by the user etc.) that message should not appear in the archive for the entity that rejected it - the archive should represent what logical entities (MUC occupants, users, pubsub subscribers...) would have received, and so only contain messages accepted for delivery to such entities.</p>
<section3 topic="User Archives" anchor='business-storeret-user-archives'>
<p>A user archive is anticipated to provide the user with the ability to access their prior conversations. To this end, a server SHOULD include in a user archive all of the messages a user sends or receives of type 'normal' or 'chat' that contain a &lt;body&gt; element. A server SHOULD also include messages of type 'groupchat' that have a &lt;body&gt;, but where such history is accessible through another method (e.g. through an archive on the MUC JID), a server MAY exclude these from the archive. A server MAY include additional non-conversation messages. A server MAY include messages of type 'headline', but this is not generally suggested.</p>
<p>A user archive is anticipated to provide the user with the ability to access their prior conversations. To this end, a server SHOULD include in a user archive all of the messages a user sends or receives of type 'normal' or 'chat' that contain a &lt;body&gt; element. A server MAY include additional non-conversation messages. A server MAY include messages of type 'headline', but this is not generally suggested.</p>
<p>Previous versions of this specification stated that a server SHOULD also include messages of type 'groupchat' that have a &lt;body&gt;; however, many deployments did not follow this (although some did). This advice has now been dropped, and servers MAY include groupchat messages in their archives. Whether a server stores groupchat messages is now left as an implementation (or deployment) decision. A client can signal whether it wants to receive groupchat messages in results with the 'include-groupchat' field (if supported by the server, see <link url='#support'>Determining support</link>). Where the server doesn't support this field, or where a client doesn't specify it in the query, whether groupchat messages are included in the result is implementation-defined; this avoids breaking existing deployments with the introduction of the 'include-groupchat' query field in a later version of this specification. It is RECOMMENDED that all client implementations of the current version of this specification always include the field where the server supports it, and RECOMMENDED that servers support it.</p>
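<p>One way a server that stores groupchat messages might apply this when assembling results is sketched below. This is an assumption-laden illustration: the function filter_results, the dict-based message shape, and the default_include setting are all hypothetical, standing in for whatever implementation-defined default a deployment chooses.</p>

```python
def filter_results(messages, include_groupchat=None, default_include=False):
    """Filter archived messages for a query result (hypothetical sketch).

    messages: list of dicts with a "type" key ('chat', 'normal', 'groupchat', ...).
    include_groupchat: value of the 'include-groupchat' field, or None if the
        client did not specify it.
    default_include: the deployment's implementation-defined default.
    """
    if include_groupchat is None:
        # Field absent: behaviour is implementation-defined.
        include = default_include
    else:
        # Field present: honour the client's explicit request.
        include = include_groupchat
    return [m for m in messages if include or m["type"] != "groupchat"]
```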
<p>At a minimum, the server MUST store the &lt;body&gt; elements of a stanza. It is suggested that other elements that are used in a given deployment to supplement conversations (e.g. XHTML-IM payloads) are also stored. Other elements MAY be stored.</p>
<p>If a server supports mechanisms that multiply copies of a stanza (e.g. Carbons, or forking a stanza to a bare JID), it MUST store such a stanza within a given archive only once, irrespective of multiple connected clients receiving copies.</p>
<p>A server MAY choose not to deliver offline messages to a client that has already queried their MAM archive and received the archived copies of those messages that would otherwise be delivered - while not required of an implementation, this is helpful to avoid duplicate messages for clients, so is suggested.</p>
</section3>
<section3 topic="MUC Archives" anchor='business-storeret-muc-archives'>
<p>A MUC archive allows a user to view the conversation within a room. All messages sent to the room that contain a &lt;body&gt; element SHOULD be stored, as should subject change stanzas, apart from those messages that the room rejects.</p>
@ -878,6 +925,8 @@
</tr>
</table>
<p>Servers that understand the 'include-groupchat' field MUST advertise the 'urn:xmpp:mam:2#groupchat-field' feature (even if they cannot return groupchat messages), and servers that understand the 'include-groupchat' field and store groupchat messages in the user's archive must advertise the 'urn:xmpp:mam:2#groupchat-available' feature.</p>
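<p>A client might interpret these two features along the following lines. This is a sketch only: the function groupchat_support and the keys of the returned dict are hypothetical names, and the input is assumed to be the list of feature strings from a disco#info response.</p>

```python
GROUPCHAT_FIELD = "urn:xmpp:mam:2#groupchat-field"
GROUPCHAT_AVAILABLE = "urn:xmpp:mam:2#groupchat-available"


def groupchat_support(features):
    """Summarise what the advertised features imply (hypothetical helper)."""
    features = set(features)
    understands_field = GROUPCHAT_FIELD in features
    return {
        # The client may put 'include-groupchat' in the query form.
        "can_send_field": understands_field,
        # The server both understands the field and stores groupchat
        # messages in the user's archive.
        "groupchat_stored": understands_field and GROUPCHAT_AVAILABLE in features,
    }
```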
</section1>
<section1 topic='Security Considerations' anchor='security'>
@ -898,6 +947,9 @@
archived. How such messages are identified is beyond the scope of this
specification, but technologies such as &xep0258; may be used, for example.</p>
</section2>
<section2 topic='Sender Impersonation' anchor='security-impersonation'>
<p>A client MUST verify the source of MAM query results against an open query (i.e. checking the stanza 'from' matches the entity that was queried) and MUST either ignore or otherwise disregard (maybe with a warning to the user) unsolicited results - whether because the 'from' doesn't match an open query, or because there is no open query. This is to avoid the situation where a malicious entity sends MAM results while the client is querying a different entity and the client processes the malicious results as if they were part of the legitimate results. Additionally, if the client has multiple queries in flight at once, it MUST also check that the query ID for a result matches that of an open query for that entity.</p>
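<p>The checks above could be tracked with a small bookkeeping structure such as the one below. It is a minimal sketch under stated assumptions: the class OpenQueries and its methods are hypothetical, and the caller is assumed to have already normalised the 'from' address of each result to the same form as the queried entity's JID.</p>

```python
class OpenQueries:
    """Track in-flight MAM queries so unsolicited results can be ignored."""

    def __init__(self):
        # Set of (queried entity JID, query ID) pairs that are still open.
        self._open = set()

    def start(self, entity, query_id):
        """Record that a query with this ID was sent to this entity."""
        self._open.add((entity, query_id))

    def finish(self, entity, query_id):
        """Forget a query once its final iq result has arrived."""
        self._open.discard((entity, query_id))

    def accept_result(self, from_jid, query_id):
        """Return True only if the result matches an open query.

        Both the sender and the query ID must match, so results from a
        different entity, or for an unknown query, are rejected.
        """
        return (from_jid, query_id) in self._open
```

<p>A result for which accept_result returns False would be ignored, possibly with a warning to the user.</p>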
</section2>
<section2 topic='Stanza IDs' anchor='security-stanza-ids'>
<p>Entities that implement this specification must also adhere to the security requirements of XEP-0359.</p>
</section2>