Extended Channel Search

This specification provides a standardised protocol to search for public group chats. In contrast to XEP-0030 (Service Discovery), it works across multiple domains, and in contrast to XEP-0055 (Jabber Search), it handles extensibility more clearly.

&LEGALNOTICE;

xxxx ProtoXEP Standards Track Standards Council XMPP Core XEP-0004 XEP-0030 XEP-0059 XEP-0068 ECS &jonaswielicki;

0.0.1 2020-02-19 jsc

First draft.

The XMPP instant messaging ecosystem is federated. As a result, many different group chat service providers exist, and interesting public group chats are spread out across them. To let users find public group chats (henceforth called channels) of interest to them, there needs to be a way to execute a cross-domain search based on keywords.

The protocol in this document provides a general and extensible search for channels across different domains and service types (e.g. MUC vs. MIX). It provides meta-information right in the result set, which allows searching entities to skip additional &xep0030; queries against the channels themselves.

The protocol is not only useful for cross-domain search, but also as an alternative to using a &xep0030; disco#items request followed by many disco#info requests on a group chat service.

The protocol:

Channel
A public group chat hosted on a &gcs;. This can either be a &xep0045; room, a &xep0369; channel or something else entirely.
&gcs;
An entity or deployment which offers multi-user chat relay, such as by &xep0045; or &xep0369;.
&searchservice;
An entity which offers the service described in this specification.
&searcher;
An entity which requests information from the &searchservice;.

An entity announces that it supports serving search queries by publishing the &searchns; feature via &xep0030;.
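A &searcher; could check for this feature as sketched below. This is a minimal illustration, not part of the protocol: `SEARCH_NS` is a placeholder standing in for the value of &searchns;, and the function name and sample payload are hypothetical.

```python
import xml.etree.ElementTree as ET

DISCO_INFO_NS = "http://jabber.org/protocol/disco#info"  # XEP-0030
# Placeholder only: stands in for the value of the &searchns; entity.
SEARCH_NS = "urn:example:channel-search"


def supports_channel_search(disco_info_xml: str) -> bool:
    """Return True if a disco#info <query/> result lists the search feature."""
    query = ET.fromstring(disco_info_xml)
    features = query.findall(f"{{{DISCO_INFO_NS}}}feature")
    return any(feature.get("var") == SEARCH_NS for feature in features)


# Hypothetical disco#info payload of a search service.
reply = (
    f"<query xmlns='{DISCO_INFO_NS}'>"
    f"<feature var='{DISCO_INFO_NS}'/>"
    f"<feature var='{SEARCH_NS}'/>"
    "</query>"
)
print(supports_channel_search(reply))
```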

To execute a keyword search, the &searcher; MAY first request the search form from the &searchservice;. Alternatively, the &searcher; MAY use the form specified in this document with only the fields which must be implemented by the &searchservice;.

After obtaining the search form, the &searcher; completes the form and sends it back to the &searchservice;. The &searchservice; replies with a &xep0059; paginated list of results.

The search form is a form conforming to &xep0068;.

To request the search form, an entity sends an empty <search/> element qualified by the &searchns; namespace.
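As a rough sketch, the form request could be built as follows. The IQ type `get`, the JID `search.example` and the namespace value are assumptions made for illustration. The text above only requires an empty <search/> element in the &searchns; namespace.

```python
import xml.etree.ElementTree as ET

# Placeholder only: stands in for the value of the &searchns; entity.
SEARCH_NS = "urn:example:channel-search"


def build_form_request(service_jid: str, request_id: str) -> str:
    """Serialise an IQ carrying an empty <search/> element."""
    # Assumption: the request is sent as an IQ of type "get".
    iq = ET.Element("iq", {"type": "get", "to": service_jid, "id": request_id})
    ET.SubElement(iq, f"{{{SEARCH_NS}}}search")  # no children, no attributes
    return ET.tostring(iq, encoding="unicode")


print(build_form_request("search.example", "form-1"))
```

ElementTree emits a generated namespace prefix for <search/>; the serialisation differs from hand-written stanzas but is equivalent XML.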

The &searchservice; replies with the form as in the following example:

Example form values: FORM_TYPE &paramsns;; three boolean fields set to true; one field set to 1; types set to xep-0045; key set to &ordernusers;.

Note: Not all of the fields shown above are mandatory to implement. See Search Form Fields for a list of fields and their implementation status.

To request the result list for a given search query, a &searcher; submits a form with the &paramsns; FORM_TYPE. The &searcher; MAY include a &xep0059; <set/> element inside the <search/> element. In either case, the &searchservice; may reply with an RSM-paginated result, and the &searcher; MUST be able to process that.

If a &searcher; composes a search request using a search form template obtained from the &searchservice;, it MAY omit any field it does not know, as well as any field whose value it leaves unchanged from the one supplied by the &searchservice;.

Example request values: an RSM page size of 5; FORM_TYPE &paramsns;; q set to xmpp.org; a sort key in the &orderns; namespace.
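The submission described above might be assembled like this. It is a sketch under assumptions: `SEARCH_NS` and `PARAMS_NS` are placeholders for the values of &searchns; and &paramsns;, and `build_search` is a hypothetical helper. Only the `jabber:x:data` and RSM namespaces are taken from &xep0004; and &xep0059;.

```python
import xml.etree.ElementTree as ET

SEARCH_NS = "urn:example:channel-search"         # placeholder for &searchns;
PARAMS_NS = "urn:example:channel-search:params"  # placeholder for &paramsns;
DATA_NS = "jabber:x:data"                        # XEP-0004
RSM_NS = "http://jabber.org/protocol/rsm"        # XEP-0059


def build_search(fields: dict, page_size=None) -> ET.Element:
    """Build a <search/> payload with a submitted form and optional RSM <set/>."""
    search = ET.Element(f"{{{SEARCH_NS}}}search")
    if page_size is not None:
        rsm = ET.SubElement(search, f"{{{RSM_NS}}}set")
        ET.SubElement(rsm, f"{{{RSM_NS}}}max").text = str(page_size)
    form = ET.SubElement(search, f"{{{DATA_NS}}}x", {"type": "submit"})
    # Only the fields the searcher actually sets are sent; others are omitted.
    for var, value in {"FORM_TYPE": PARAMS_NS, **fields}.items():
        field = ET.SubElement(form, f"{{{DATA_NS}}}field", {"var": var})
        if var == "FORM_TYPE":
            field.set("type", "hidden")
        ET.SubElement(field, f"{{{DATA_NS}}}value").text = value
    return search


payload = build_search({"q": "xmpp.org"}, page_size=5)
print(ET.tostring(payload, encoding="unicode"))
```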

The &searchservice; calculates the result, paginates it according to its own policy (possibly taking into account the pagination request from the client) and returns a single result page in the response IQ.

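Example result values: one item with the values commteam and 10; one item with the name XMPP Service Operators, the description "Discussion venue for operators of federated XMPP services" and 43 users; an RSM <set/> with the values opaque-string-1, opaque-string-2 and 5.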

The result items are &itemel; elements wrapped in a &resultel; element qualified by the &searchns; namespace. The schema, along with extension rules, is described in Result Item Format.

To obtain further results, the &searcher; re-submits the identical form with an appropriate &xep0059; pagination request, using the information provided by the &searchservice; in the result <set/> element.
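The re-submission loop can be sketched as follows. The `fetch_page` callback and the in-memory `fake_service` are hypothetical stand-ins for a real IQ round trip. The sketch assumes the service reports the RSM <last/> value of each page, which the &searcher; sends back as <after/> as defined in &xep0059;.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# A page is (items, last); "last" is the opaque RSM <last/> value, or None.
Page = Tuple[List[str], Optional[str]]


def iter_results(fetch_page: Callable[[dict, Optional[str]], Page],
                 form: dict) -> Iterator[str]:
    """Yield items from all pages, re-submitting the identical form each time."""
    after = None
    while True:
        items, last = fetch_page(form, after)
        yield from items
        if not items or last is None:
            return
        after = last  # becomes the RSM <after/> value of the next request


# Hypothetical in-memory service, used only to exercise the loop.
CHANNELS = ["a@muc.example", "b@muc.example", "c@muc.example"]


def fake_service(form, after, page_size=2):
    # For simplicity the opaque identifier is the channel address itself.
    start = 0 if after is None else CHANNELS.index(after) + 1
    items = CHANNELS[start:start + page_size]
    return items, (items[-1] if items else None)


print(list(iter_results(fake_service, {"q": "example"})))
```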

If the sort key requested by the &searcher; is not supported by the &searchservice;, the &searchservice; MUST reply with a <feature-not-implemented/> error of type modify and include the <invalid-sort-key/> application-defined condition.

If the q field was supplied by the &searcher; and the contents of the q field did not yield any term suitable for search, the &searchservice; MUST reply with a <bad-request/> error and the <invalid-search-terms/> application-defined condition. The error type MUST be modify.

The server SHOULD include a human-readable description of the constraints for search terms which were not met in the <text/> element of the error.

Example error text: "Search terms must have at least three characters."

If the &searchservice; cannot, or by policy will not, process the request because of an excessive number of requests (whether from the requesting entity, its domain, or according to any other criterion), it MUST reply with a <resource-constraint/> error of type wait.

The application-defined error condition <rate-limit/> MUST be included. This error condition has a RECOMMENDED attribute, retry-after, which provides the number of seconds after which the &searcher; MAY retry the request.

The &searchservice; MAY include a human-readable description of the rate limit and when to retry in the <text/> element.

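A &searcher; might extract the retry delay from such an error as sketched below. The namespace of <rate-limit/> is not stated in this section, so `SEARCH_NS` is a placeholder. The 60-second default for a missing retry-after attribute is likewise an assumption, not a value from this specification.

```python
import xml.etree.ElementTree as ET

STANZAS_NS = "urn:ietf:params:xml:ns:xmpp-stanzas"  # RFC 6120 stanza errors
# Placeholder only: the real namespace of <rate-limit/> is not given here.
SEARCH_NS = "urn:example:channel-search"


def retry_delay(error_xml: str, default: float = 60.0):
    """Return the seconds to wait if the error is a rate limit, else None."""
    error = ET.fromstring(error_xml)
    if error.find(f"{{{STANZAS_NS}}}resource-constraint") is None:
        return None
    limit = error.find(f"{{{SEARCH_NS}}}rate-limit")
    if limit is None:
        return None
    value = limit.get("retry-after")  # RECOMMENDED, so it may be absent
    try:
        return float(value) if value is not None else default
    except ValueError:
        return default


error = (
    "<error type='wait'>"
    f"<resource-constraint xmlns='{STANZAS_NS}'/>"
    f"<rate-limit xmlns='{SEARCH_NS}' retry-after='30'/>"
    "</error>"
)
print(retry_delay(error))
```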

Note: See also the rate-limiting related business rules for &searcher; entities.

If the &searchservice; cannot, or by policy will not, allow a &searcher; to retrieve the entire database of channels, it MUST reject queries which set the all field to true with an error as follows:

  • If the feature is generally disabled: <not-allowed/> with type cancel
  • If the feature is not offered to the &searcher; based on its identity: <forbidden/> with type auth

In all cases, the application-defined condition <full-set-retrieval-rejected/> MUST be included.

The &searchservice; MAY include a human-readable description of the restrictions around full-list retrieval.

For example, if the full set retrieval had been disabled service-wide by configuration, the &searchservice; would reply with the following error:

Example error text: "Retrieval of the full database is not allowed."
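The choice between the two rejection errors can be expressed as a small decision function. This is only a sketch: the function name and its two boolean inputs are hypothetical, and how a service determines them is a matter of local policy.

```python
from typing import Optional, Tuple


def full_set_rejection(enabled: bool,
                       searcher_allowed: bool) -> Optional[Tuple[str, str, str]]:
    """Return (error type, defined condition, application condition) or None."""
    app_condition = "full-set-retrieval-rejected"  # included in all cases
    if not enabled:
        # Feature disabled for everyone.
        return ("cancel", "not-allowed", app_condition)
    if not searcher_allowed:
        # Feature exists, but not for this searcher's identity.
        return ("auth", "forbidden", app_condition)
    return None


print(full_set_rejection(enabled=False, searcher_allowed=True))
```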

If the &searcher; provides form fields which are conflicting, the &searchservice; MUST reply with a <bad-request/> error of type modify. In addition, the <conflicting-fields/> application-specific condition MUST be included.

Conflicting field values are those which fundamentally cannot be used in the same query in such a way that the definition of their function is still adhered to. For example, q restricts the results by keywords, but all specifies that all entries are returned.

The &searchservice; SHOULD include a human-readable description of the conflicting fields, referring to the label values of the fields involved.

The <conflicting-fields/> element MAY have one or more <var/> child elements which refer to var values of the submitted fields. At least one of the referenced fields must be changed in order for a follow-up query to succeed.

For example, if the &searcher; has set all to true and provided a query in q, the &searchservice; would reply with an error similar to the following:

Example error text: "Cannot both return all results and search by keywords." The <conflicting-fields/> element references the fields all and q.

If no field which would define a result set and which is understood by the &searchservice; is present, it MUST reply with a <bad-request/> error of type cancel.

In addition, the <no-search-conditions/> application-defined condition MUST be included.


An example of this situation would be a form where neither q nor all is given.
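The form-level checks described in the preceding sections (invalid search terms, conflicting fields, missing search conditions) could be combined on the service side roughly as follows. The function and its return shape are hypothetical. Treating a query without any whitespace-separated token as yielding no search term is an assumption; real services may apply stricter rules.

```python
from typing import Dict, List, Optional, Tuple

# (error type, defined condition, application-defined condition, referenced vars)
ErrorSpec = Tuple[str, str, str, List[str]]


def check_search_form(fields: Dict[str, str]) -> Optional[ErrorSpec]:
    """Return the error a service would send for this form, or None if acceptable."""
    q = fields.get("q")
    wants_all = fields.get("all", "false") in ("true", "1")  # XEP-0004 booleans
    if q is not None and wants_all:
        return ("modify", "bad-request", "conflicting-fields", ["all", "q"])
    if q is None and not wants_all:
        return ("cancel", "bad-request", "no-search-conditions", [])
    if q is not None and not q.split():
        # Assumption: no whitespace-separated token means no usable search term.
        return ("modify", "bad-request", "invalid-search-terms", [])
    return None


print(check_search_form({"q": "xmpp", "all": "true"}))
```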

The search form is extensible as per &xep0068;. Implementations are free to add fields on both sides of the exchange, as long as they are properly namespaced using Clark Notation.

The following fields are specified by this document:

var | type | Support level | Description
q | text-single | RECOMMENDED | Input for the keyword-based search. Conflicts with all.
all | boolean | OPTIONAL | Return all results, ignoring text search terms. This does not influence the restrictions imposed by the types field. Conflicts with q.
sinaddress | boolean | RECOMMENDED if q is supported | Control whether the keyword search searches in the address of the channel.
sinname | boolean | REQUIRED if q is supported | Control whether the keyword search searches in the name of the channel.
sindescription | boolean | REQUIRED if q is supported | Control whether the keyword search searches in the textual description of the channel.
types | list-multi | RECOMMENDED | Constrain the service types of channels to return. If not supported, the search MUST only cover &xep0045; group chats.
key | list-single | REQUIRED | Select how the results are ordered.

The sort keys specified by this document are the following:

Value | Description
&orderaddress; | Order the results by the address of the channel. This ordering mode guarantees that the &searcher; gets a duplicate-free view without omissions when paginating.
&ordernusers; | Order the results by the number of users, in descending order. This mode does not guarantee that all channels in the database are returned, nor does it guarantee that no duplicates occur across multiple pages.

&searchservice; implementations may offer custom values for the key field, provided Clark Notation is used to namespace the values.
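For illustration, Clark Notation names of the form `{namespace}local` can be built and split with two small helpers. Both function names and the example namespace are hypothetical. Names without a leading brace are treated as un-namespaced, as is the case for the fields defined in this document.

```python
from typing import Optional, Tuple


def clark(namespace: str, local: str) -> str:
    """Build a Clark Notation name: {namespace}local."""
    return f"{{{namespace}}}{local}"


def split_clark(name: str) -> Tuple[Optional[str], str]:
    """Split a name into (namespace, local); namespace is None if absent."""
    if not name.startswith("{"):
        return None, name
    namespace, brace, local = name[1:].partition("}")
    if not brace or not namespace or not local:
        raise ValueError(f"malformed Clark Notation name: {name!r}")
    return namespace, local


custom_key = clark("urn:example:my-service", "activity")  # hypothetical sort key
print(custom_key, split_clark(custom_key), split_clark("q"))
```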

The result items are &itemel; elements qualified by the &searchns; namespace.

Each &itemel; element MUST have an address attribute whose value is a proper JID (as per either &rfc6122; or &rfc7622;). It identifies the channel uniquely.

The following child elements of &itemel; are defined by this specification. They are all qualified by the same namespace as &itemel; itself.

Element name | Content model | Occurrences | Description
name | text character data | 1 | The human-readable name of the channel.
description | text character data | 1 | The human-readable description of the channel.
language | text character data | 1 | A valid xml:lang code which indicates the primary language of the channel.
nusers | non-negative integer character data | 1 | Number of occupants.
service-type | enumeration character data | 1 | The type of the service which hosts the channel. See below for values and semantics.
is-open | boolean character data | 1 | If set to true, it indicates that the channel can be joined without extra credentials.
anonymity-mode | enumeration character data | 1 | Anonymity level of participation. See below for values and semantics.

Notes:

  1. Any child element may be omitted by a &searchservice; if the data is not available for any or all rooms.
  2. The number of occupants may be stale by an undefined amount of time.
  3. A service MAY return future versions of those elements alongside past versions. Entities need to treat elements with the same name, but a different namespace, as entirely different elements.

Values of <anonymity-mode/>:

Value | Description
{&anonns;}none | The bare JID of the account or the full JID of one or more devices of each occupant is visible to every other occupant.
muc_semianonymous | As specified in &xep0045;

Values of <service-type/>:

Value | Description
xep-0045 | A &xep0045; service.
xep-0369 | A &xep0369; service.

If a &searchservice; would return entries with the same address with different service types, it SHOULD prefer &xep0369; over &xep0045;. Note that a &searchservice; MUST NOT return service types the client has not asked for.
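One way a service could apply both rules is sketched below, using plain (address, service type) tuples as a hypothetical stand-in for whatever record type a real service uses. Entries with a type the client did not request are dropped first; among the rest, xep-0369 wins over xep-0045 for the same address.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Lower number = preferred. Unknown types rank last (an assumption).
PREFERENCE = {"xep-0369": 0, "xep-0045": 1}


def dedupe(entries: Iterable[Tuple[str, str]],
           requested: Set[str]) -> List[Tuple[str, str]]:
    """Keep one entry per address, restricted to the requested service types."""
    best: Dict[str, str] = {}
    for address, service_type in entries:
        if service_type not in requested:
            continue  # MUST NOT return types the client has not asked for
        current = best.get(address)
        if current is None or (PREFERENCE.get(service_type, 99)
                               < PREFERENCE.get(current, 99)):
            best[address] = service_type
    return sorted(best.items())


rows = [("a@chat.example", "xep-0045"), ("a@chat.example", "xep-0369"),
        ("b@chat.example", "xep-0045")]
print(dedupe(rows, {"xep-0045", "xep-0369"}))
```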

&searchservice; implementations are free to add custom child elements to &itemel; elements. &searcher; implementations MUST be prepared to handle any unknown elements in &itemel;, for example by ignoring them.

Additional values for the <anonymity-mode/> element may be specified by future extensions. If an implementation encounters an unknown value in this element, it is RECOMMENDED to either treat it as synonymous with {&anonns;}none or request the anonymity mode from the address using a protocol appropriate for the channel's service.
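A tolerant result item parser might look like the following sketch. The namespace values are placeholders for &searchns; and &anonns;, the element name `item` in the sample is an assumption standing in for &itemel;, and the `Channel` class is hypothetical. Unknown children and same-named children in other namespaces are skipped, missing children stay unset, and an unknown anonymity mode is mapped to the "none" value, which is one of the two recommended options.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

SEARCH_NS = "urn:example:channel-search"             # placeholder for &searchns;
ANON_NONE = "{urn:example:channel-search:anon}none"  # placeholder for {&anonns;}none
KNOWN_ANON_MODES = {ANON_NONE, "muc_semianonymous"}


@dataclass
class Channel:
    address: str
    name: Optional[str] = None
    description: Optional[str] = None
    language: Optional[str] = None
    nusers: Optional[int] = None
    service_type: Optional[str] = None
    is_open: Optional[bool] = None
    anonymity_mode: Optional[str] = None


def parse_item(item: ET.Element) -> Channel:
    """Read the known children of a result item; skip everything else."""
    channel = Channel(address=item.attrib["address"])
    prefix = f"{{{SEARCH_NS}}}"
    for child in item:
        if not child.tag.startswith(prefix):
            continue  # other namespace: treated as an entirely different element
        local = child.tag[len(prefix):]
        text = (child.text or "").strip()
        if local in ("name", "description", "language"):
            setattr(channel, local, text)
        elif local == "nusers" and text.isdigit():
            channel.nusers = int(text)
        elif local == "service-type":
            channel.service_type = text
        elif local == "is-open":
            channel.is_open = text in ("true", "1")
        elif local == "anonymity-mode":
            channel.anonymity_mode = text if text in KNOWN_ANON_MODES else ANON_NONE
    return channel


sample = ET.fromstring(
    f"<item xmlns='{SEARCH_NS}' address='operators@muc.example'>"
    "<name>XMPP Service Operators</name><nusers>43</nusers>"
    "<anonymity-mode>some-future-mode</anonymity-mode>"
    "<extra xmlns='urn:example:other'>ignored</extra>"
    "</item>"
)
print(parse_item(sample))
```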

When sending a search form with a q field, the &searcher; transmits potentially sensitive information to a third party.

This specification does not require any interaction with the IANA.

This specification should probably create registries for the various fields it defines, as well as register a form type.

To be done.

Instead of rolling a custom protocol for the result items, &xep0055; could have been used.

While the result format of &xep0055; allows for some generality, it does so in a rather restricted way. It is limited by the data formats and types expressible in &xep0004;. Structured data, beyond lists of text and JIDs, cannot be represented with &xep0004; at all. Machine-readable data would also have to be human-readable at the same time to provide a fallback view for human users. Internationalization of such human-readable data in field values is not possible with &xep0004;.

The advantage of entities being able to process unknown fields in a degraded manner is, in principle, still present in the current proposal (although with a different kind of degradation).

Given the complexity of fully and correctly processing &xep0004; report data, the slim benefits did not, in the eyes of the authors, outweigh the costs.

The basis for this protocol was developed for the search.jabber.network public group chat search service. It has been cleaned up for publication as a Standards Track XEP by the author and modified to support more use-cases.