<!-- TODO: Specify defaults for service type and how it’s handled. -->
<!-- TODO: Public vs. non-public use-cases -->
<!-- TODO: hash out implementation vs. business rules -->
<header>
<title>Extended Channel Search</title>
<abstract>This specification provides a standardised protocol to search for public group chats. In contrast to XEP-0030 (Service Discovery), it works across multiple domains and in contrast to XEP-0055 (Jabber Search) it more clearly handles extensibility.</abstract>
<remark>Accepted by vote of Council on 2020-02-26.</remark>
</revision>
<revision>
<version>0.0.1</version>
<date>2020-02-19</date>
<initials>jsc</initials>
<remark><p>First draft.</p></remark>
</revision>
</header>
<section1 topic='Introduction &amp; Motivation' anchor='intro'>
<p>The XMPP instant messaging ecosystem is a federated one. This leads to many different group chat service providers existing and interesting public group chats being spread out across them. In order to provide users with a way to find public group chats (henceforth called channels) of interest to them, there needs to be a way to execute a cross-domain search based on keywords.</p>
<p>The protocol in this document provides a general and extensible search for channels across different domains and service types (e.g. MUC vs. MIX). It provides meta-information right in the result set, which allows searching entities to skip additional &xep0030; queries against the channels themselves.</p>
<p>The protocol is not only useful for cross-domain search, but also as an alternative to using a &xep0030; disco#items request followed by many disco#info requests on a group chat service.</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<p>The protocol:</p>
<ul>
<li>must work without state on the server side. This is to allow stateless proxies to be used for pseudonymisation or anonymisation.</li>
<li>must allow searching the list using a free-text keyword-based search.</li>
<li>must allow future extensions to the search query and the result.</li>
<li>must allow retrieving the entire data set (although, for clarification, an operator may choose to turn this off).</li>
<li>must use completely machine-readable and machine-writable data.</li>
</ul>
</section1>
<section1 topic='Glossary' anchor='glossary'>
<dl>
<di>
<dt>Channel</dt>
<dd>A public group chat hosted on a &gcs;. This can either be a &xep0045; room, a &xep0369; channel or something else entirely.</dd>
</di>
<di>
<dt>&gcs;</dt>
<dd>An entity or deployment which offers multi-user chat relay, such as by &xep0045; or &xep0369;.</dd>
</di>
<di>
<dt>&searchservice;</dt>
<dd>An entity which offers the service described in this specification.</dd>
</di>
<di>
<dt>&searcher;</dt>
<dd>An entity which requests information from the &searchservice;.</dd>
<section2 topic='Executing a keyword search' anchor='usecases-search'>
<p>To execute a keyword search, the &searcher; MAY first request the search form from the &searchservice;. Alternatively, the &searcher; MAY use the form specified in this document with only the fields which must be implemented by the &searchservice;.</p>
<p>After obtaining the search form, the &searcher; completes the form and sends it back to the &searchservice;. The &searchservice; replies with a &xep0059; paginated list of results.</p>
<p>The search form is a form conforming to &xep0068;.</p>
<section3 topic='Requesting the search form' anchor='usecases-search-form'>
<p>To request the search form, an entity sends an empty <tt>search</tt> element qualified by the <tt>&searchns;</tt> namespace:</p>
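<p>As a minimal sketch, such a request might look as follows. The addresses and the <tt>id</tt> are placeholders, the use of an IQ of type <tt>get</tt> is an assumption, and the namespace URI written out in the stanza is an assumed value standing in for <tt>&searchns;</tt>:</p>
<example caption='Illustrative form request (placeholder addresses, assumed namespace URI)'><![CDATA[
<iq type='get' from='searcher@example.com/device' to='search.example.org' id='form-1'>
  <search xmlns='urn:xmpp:channel-search:0:search'/>
</iq>
]]></example>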
<example caption='&searcher; requests form from the &searchservice;'><![CDATA[
]]></example>
<p><strong>Note:</strong> Not all of the fields shown above are mandatory to implement. See <link url='#impl-searchformfields'>Search Form Fields</link> for a list of fields and their implementation status.</p>
</section3>
<section3 topic='Send a search request' anchor='usecases-search-request'>
<p>To request the result list for a given search query, a &searcher; submits a form with the &paramsns; <tt>FORM_TYPE</tt>. The &searcher; MAY include a &xep0059; <tt><set/></tt> element inside the <tt><search/></tt> element. In either case, the &searchservice; may reply with an RSM-paginated result and the &searcher; MUST be able to process that.</p>
<p>If a &searcher; composes a search request using a search form template obtained from the &searchservice;, it MAY omit all fields it does not know or whose value it leaves unchanged from the one supplied by the &searchservice;.</p>
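<p>A minimal sketch of such a submission, restricted to fields defined in this document, is shown below. The addresses, the <tt>id</tt> and the search term are placeholders. The namespace URIs written out for the search namespace, the <tt>FORM_TYPE</tt> and the sort key are assumed values, since entities are not expanded inside the example:</p>
<example caption='Illustrative search request (placeholder values, assumed namespace URIs)'><![CDATA[
<iq type='get' from='searcher@example.com/device' to='search.example.org' id='search-1'>
  <search xmlns='urn:xmpp:channel-search:0:search'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='FORM_TYPE' type='hidden'>
        <value>urn:xmpp:channel-search:0:search-params</value>
      </field>
      <field var='q'>
        <value>xmpp</value>
      </field>
      <field var='key'>
        <value>{urn:xmpp:channel-search:0:order}nusers</value>
      </field>
    </x>
    <set xmlns='http://jabber.org/protocol/rsm'>
      <max>5</max>
    </set>
  </search>
</iq>
]]></example>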
<example caption='&searcher; submits a form to the &searchservice;'><![CDATA[
]]></example>
<p>The &searchservice; calculates the result, paginates it according to its own policy (possibly taking into account the pagination request from the client) and returns a single result page in the response IQ.</p>
<example caption='&searchservice; replies with a result page'><![CDATA[
<description>Discussion venue for operators of federated XMPP services</description>
<nusers>43</nusers>
<is-open/>
</item>
<set xmlns='http://jabber.org/protocol/rsm'>
<first>opaque-string-1</first>
<last>opaque-string-2</last>
<max>5</max>
</set>
</result>
</iq>
]]></example>
<p>The result items are <tt>&itemel;</tt> elements wrapped in a <tt>&resultel;</tt> element qualified by the <tt>&searchns;</tt> namespace. The schema, along with extension rules, is described in <link url='#impl-resultitemformat'>Result Item Format</link>.</p>
<p>To obtain further results, the &searcher; re-submits the identical form with an appropriate &xep0059; pagination request, using the information provided by the &searchservice; in the result <tt><set/></tt> element.</p>
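<p>For instance, given a result page whose <tt><set/></tt> element carried <tt><last>opaque-string-2</last></tt>, a follow-up request for the next page could be sketched as below. It follows the usual &xep0059; convention of echoing that value in an <tt><after/></tt> element. The addresses, the <tt>id</tt>, the form contents and the written-out namespace URIs are assumptions made for illustration:</p>
<example caption='Illustrative request for the next result page (placeholder values, assumed namespace URIs)'><![CDATA[
<iq type='get' from='searcher@example.com/device' to='search.example.org' id='search-2'>
  <search xmlns='urn:xmpp:channel-search:0:search'>
    <x xmlns='jabber:x:data' type='submit'>
      <field var='FORM_TYPE' type='hidden'>
        <value>urn:xmpp:channel-search:0:search-params</value>
      </field>
      <field var='q'>
        <value>xmpp</value>
      </field>
    </x>
    <set xmlns='http://jabber.org/protocol/rsm'>
      <max>5</max>
      <after>opaque-string-2</after>
    </set>
  </search>
</iq>
]]></example>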
<p>If the sort <tt>key</tt> requested by the &searcher; is not supported by the &searchservice;, the &searchservice; MUST reply with a <tt><feature-not-implemented/></tt> error of type <tt>modify</tt> and include the <tt><invalid-sort-key/></tt> application defined condition:</p>
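<p>A sketch of such an error reply is given below. The addresses and the <tt>id</tt> are placeholders, and the namespace URI of the application defined condition is an assumed value:</p>
<example caption='Illustrative invalid-sort-key error (placeholder addresses, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='modify'>
    <feature-not-implemented xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <invalid-sort-key xmlns='urn:xmpp:channel-search:0:error'/>
  </error>
</iq>
]]></example>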
<example caption='&searchservice; replies with feature-not-implemented'><![CDATA[
]]></example>
<p>If the <tt>q</tt> field was supplied by the &searcher; and the contents of the <tt>q</tt> field did not yield any term suitable for search, the &searchservice; MUST reply with an <tt><bad-request/></tt> error and the <tt><invalid-search-terms/></tt> application defined condition. The error type MUST be <tt>modify</tt>.</p>
<p>The server SHOULD include a human-readable description of the constraints for search terms which were not met in the <tt><text/></tt> element of the error.</p>
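<p>Such a reply might be sketched as follows. The addresses, the <tt>id</tt> and the wording of the <tt><text/></tt> element are placeholders, and the namespace URI of the application defined condition is an assumed value:</p>
<example caption='Illustrative invalid-search-terms error (placeholder values, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='modify'>
    <bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>Search terms must be at least three characters long.</text>
    <invalid-search-terms xmlns='urn:xmpp:channel-search:0:error'/>
  </error>
</iq>
]]></example>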
<example caption='&searchservice; replies with the invalid-search-terms error'><![CDATA[
]]></example>
<p>If the &searchservice; cannot process the request, or by policy does not want to, because of an excessive number of requests (counted per requesting entity, per domain or by any other criterion), it MUST reply with a <tt><resource-constraint/></tt> error with type <tt>wait</tt>.</p>
<p>The application defined error condition <tt><rate-limit/></tt> MUST be included. This error condition has a RECOMMENDED attribute, <tt>retry-after</tt>, which provides the number of seconds after which the &searcher; MAY retry the request.</p>
<p>The &searchservice; MAY include a human-readable description of the rate limit and when to retry in the <tt><text/></tt> element.</p>
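<p>A sketch of a rate limit reply is given below. The addresses, the <tt>id</tt>, the 60 second delay and the text are placeholders, and the namespace URI of the application defined condition is an assumed value:</p>
<example caption='Illustrative rate-limit error (placeholder values, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='wait'>
    <resource-constraint xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>Too many requests. Please retry in 60 seconds.</text>
    <rate-limit xmlns='urn:xmpp:channel-search:0:error' retry-after='60'/>
  </error>
</iq>
]]></example>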
<example caption='&searchservice; replies with a rate limit notification'><![CDATA[
]]></example>
<p><strong>Note:</strong> See also the rate-limiting related business rules for &searcher; entities.</p>
</section4>
<section4 topic='Rejection of Full List Retrieval' anchor='usecases-support-request-nofulllist'>
<p>If the &searchservice; cannot allow a &searcher; to retrieve the entire database of channels, or by policy does not want to, it MUST reject queries which set the <tt>all</tt> field to true with an error as follows:</p>
<ul>
<li>If the feature is generally disabled: <tt><not-allowed/></tt> with type <tt>cancel</tt></li>
<li>If the feature is not offered to the &searcher; based on its identity: <tt><forbidden/></tt> with type <tt>auth</tt></li>
</ul>
<p>In all cases, the application defined condition <tt><full-set-retrieval-rejected/></tt> MUST be included.</p>
<p>The &searchservice; MAY include a human-readable description of the restrictions around full-list retrieval.</p>
<p>For example, if the full set retrieval had been disabled service-wide by configuration, the &searchservice; would reply with the following error:</p>
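<p>A sketch of that reply follows. The addresses, the <tt>id</tt> and the text are placeholders, and the namespace URI of the application defined condition is an assumed value:</p>
<example caption='Illustrative full-set-retrieval-rejected error (placeholder values, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='cancel'>
    <not-allowed xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>Retrieval of the full channel list is disabled on this service.</text>
    <full-set-retrieval-rejected xmlns='urn:xmpp:channel-search:0:error'/>
  </error>
</iq>
]]></example>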
<example caption='&searchservice; replies with a full-set-retrieval-rejected error'><![CDATA[
]]></example>
<section4 topic='Conflicting Field Options' anchor='usecases-support-request-conflictingoptions'>
<p>If the &searcher; provides form fields which are conflicting, the &searchservice; MUST reply with a <tt><bad-request/></tt> error of type <tt>modify</tt>. In addition, the <tt><conflicting-fields/></tt> application defined condition MUST be included.</p>
<p>Conflicting field values are those which fundamentally cannot be used in the same query in such a way that the definition of their function is still adhered to. For example, <tt>q</tt> restricts the results by keywords, but <tt>all</tt> specifies that <em>all</em> entries are returned.</p>
<p>The &searchservice; SHOULD include a human-readable description of the conflicting fields, referencing the <tt>label</tt> values of the involved fields.</p>
<p>The <tt><conflicting-fields/></tt> element MAY have one or more <tt><var/></tt> child elements which refer to <tt>var</tt> values of the submitted fields. At least one of the referenced fields must be changed in order for a follow-up query to succeed.</p>
<p>For example, if the &searcher; has set <tt>all</tt> to true and provided a query in <tt>q</tt>, the &searchservice; would reply with an error similar to the following:</p>
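<p>A sketch of that reply follows. The addresses, the <tt>id</tt> and the text are placeholders. The namespace URI of the application defined condition is an assumed value, and carrying the field name as character data of <tt><var/></tt> is likewise an assumption:</p>
<example caption='Illustrative conflicting-fields error (placeholder values, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='modify'>
    <bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <text xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'>A keyword search cannot be combined with retrieving all results.</text>
    <conflicting-fields xmlns='urn:xmpp:channel-search:0:error'>
      <var>all</var>
      <var>q</var>
    </conflicting-fields>
  </error>
</iq>
]]></example>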
<example caption='&searchservice; replies with a conflicting-fields error'><![CDATA[
]]></example>
<p>If no field which would define a result set and which is understood by the &searchservice; is present, it MUST reply with a <tt><bad-request/></tt> error of type <tt>cancel</tt>.</p>
<p>In addition, the <tt><no-search-conditions/></tt> application defined condition MUST be included.</p>
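<p>A sketch of such a reply is given below. The addresses and the <tt>id</tt> are placeholders, and the namespace URI of the application defined condition is an assumed value:</p>
<example caption='Illustrative no-search-conditions error (placeholder addresses, assumed error namespace)'><![CDATA[
<iq type='error' from='search.example.org' to='searcher@example.com/device' id='search-1'>
  <error type='cancel'>
    <bad-request xmlns='urn:ietf:params:xml:ns:xmpp-stanzas'/>
    <no-search-conditions xmlns='urn:xmpp:channel-search:0:error'/>
  </error>
</iq>
]]></example>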
<example caption='&searchservice; replies with the no-search-conditions error'><![CDATA[
]]></example>
<p>An example of this situation would be a form where neither <tt>q</tt> nor <tt>all</tt> is given.</p>
</section4>
</section3>
</section2>
</section1>
<section1 topic='Business Rules' anchor='rules'>
<ul>
<li>When sending the form template, the &searchservice; MUST include all fields it supports with their respective default values.</li>
<li>When submitting a form to the &searchservice;, a &searcher; MAY omit all fields it either does not understand or it has left unchanged.</li>
<li>When submitting a form to the &searchservice;, a &searcher; MAY omit the <tt><option/></tt> elements.</li>
<li>When receiving a search form, the &searchservice; MUST ignore fields with a <tt>var</tt> value it does not understand.</li>
<li>When executing a keyword search, the service may process the keyword string in implementation-defined ways. This may include, among other things, interpreting quotes and other "special" characters and removing keywords which do not fit internal criteria for suitability.</li>
<li>If the &searcher; receives a <tt><rate-limit/></tt> error, the behaviour of the &searcher; depends on the <tt>retry-after</tt> attribute:
<ul>
<li>If the <tt>retry-after</tt> attribute is present, the &searcher; MUST NOT send another search request before the number of seconds indicated in the <tt>retry-after</tt> attribute has elapsed. There is no guarantee that the request will be accepted at that time.</li>
<li>If the <tt>retry-after</tt> attribute is <em>not</em> present, the &searcher; should wait for an implementation-defined amount of time and SHOULD back off exponentially on each subsequent <tt><rate-limit/></tt> error.</li>
</ul>
</li>
<li>If a search request does not yield any results, the &searchservice; MUST reply with a <tt>&resultel;</tt> without any <tt>&itemel;</tt> children in a <tt>type='result'</tt> IQ. Specifically, it MUST NOT reply with an <tt><item-not-found/></tt> error.</li>
<li>If the <tt>all</tt> field is set to true and the &searchservice; allows this operation, all results MUST be included in the result set (and then paginated using &xep0059;).</li>
<section2 topic='Search Form Fields' anchor='impl-searchformfields'>
<p>The search form is extensible as per &xep0068;. Implementations are free to add fields on both sides of the exchange, as long as they are properly namespaced using Clark Notation.</p>
<p>The following fields are specified by this document:</p>
<table caption='Search Form Field Summary'>
<tr>
<th>var</th>
<th>type</th>
<th>Support level</th>
<th>Description</th>
</tr>
<tr>
<td><tt>q</tt></td>
<td><tt>text-single</tt></td>
<td>RECOMMENDED</td>
<td>Input for the keyword-based search. Conflicts with <tt>all</tt>.</td>
</tr>
<tr>
<td><tt>all</tt></td>
<td><tt>boolean</tt></td>
<td>OPTIONAL</td>
<td>Return all results, ignoring text search terms. This does not influence the restrictions imposed by the <tt>types</tt> field. Conflicts with <tt>q</tt>.</td>
</tr>
<tr>
<td><tt>sinaddress</tt></td>
<td><tt>boolean</tt></td>
<td>RECOMMENDED if <tt>q</tt> is supported</td>
<td>Control whether the keyword search searches in the address of the channel.</td>
</tr>
<tr>
<td><tt>sinname</tt></td>
<td><tt>boolean</tt></td>
<td>REQUIRED if <tt>q</tt> is supported</td>
<td>Control whether the keyword search searches in the name of the channel.</td>
</tr>
<tr>
<td><tt>sindescription</tt></td>
<td><tt>boolean</tt></td>
<td>REQUIRED if <tt>q</tt> is supported</td>
<td>Control whether the keyword search searches in the textual description of the channel.</td>
</tr>
<tr>
<td><tt>types</tt></td>
<td><tt>list-multi</tt></td>
<td>RECOMMENDED</td>
<td>Constrain the service types of channels to return. If not supported, the search MUST only cover &xep0045; group chats.</td>
</tr>
<tr>
<td><tt>key</tt></td>
<td><tt>list-single</tt></td>
<td>REQUIRED</td>
<td>Select how the results are ordered.</td>
</tr>
</table>
<p>The sort keys specified by this document are the following:</p>
<table caption='Sort Key Values'>
<tr>
<th>Value</th>
<th>Description</th>
</tr>
<tr>
<td><tt>&orderaddress;</tt></td>
<td>Order the results by the address of the channel. This ordering mode guarantees that the &searcher; gets a duplicate-free view without omissions when paginating.</td>
</tr>
<tr>
<td><tt>&ordernusers;</tt></td>
<td>Order the results by the number of users, in descending order. This mode does not guarantee that all channels in the database are returned, nor does it guarantee that no duplicates occur across multiple pages.</td>
</tr>
</table>
<section3 topic='Extensibility of the Search Form Fields' anchor='impl-searchformfields-extensibility'>
<p>&searchservice; implementations may offer custom values for the <tt>key</tt> field, provided Clark Notation is used to namespace the values.</p>
<p>The result items are <tt>&itemel;</tt> elements qualified by the <tt>&searchns;</tt> namespace.</p>
<p>Each <tt>&itemel;</tt> element MUST have an <tt>address</tt> attribute whose value is a proper JID (as per either &rfc6122; or &rfc7622;). It identifies the channel uniquely.</p>
<p>The following child elements of <tt>&itemel;</tt> are defined by this specification. They are all qualified by the same namespace as <tt>&itemel;</tt> itself.</p>
<table caption='Child elements of result items'>
<tr>
<th>Element name</th>
<th>Content model</th>
<th>Occurrences</th>
<th>Description</th>
</tr>
<tr>
<td><tt>name</tt></td>
<td>text character data</td>
<td>1</td>
<td>The human-readable name of the channel.</td>
</tr>
<tr>
<td><tt>description</tt></td>
<td>text character data</td>
<td>1</td>
<td>The human-readable description of the channel.</td>
</tr>
<tr>
<td><tt>language</tt></td>
<td>text character data</td>
<td>1</td>
<td>A valid <tt>xml:lang</tt> code which indicates the primary language of the channel.</td>
</tr>
<tr>
<td><tt>nusers</tt></td>
<td>non-negative integer character data</td>
<td>1</td>
<td>Number of occupants</td>
</tr>
<tr>
<td><tt>service-type</tt></td>
<td>enumeration character data</td>
<td>1</td>
<td>The type of the service which hosts the channel. See below for values and semantics.</td>
</tr>
<tr>
<td><tt>is-open</tt></td>
<td>boolean character data</td>
<td>1</td>
<td>If set to true, it indicates that the channel can be joined without extra credentials.</td>
</tr>
<tr>
<td><tt>anonymity-mode</tt></td>
<td>enumeration character data</td>
<td>1</td>
<td>Anonymity level of participation. See below for values and semantics.</td>
</tr>
</table>
<p><strong>Notes:</strong></p>
<ol>
<li>Any child element may be omitted by a &searchservice; if the data is not available for any or all rooms.</li>
<li>The number of occupants may be stale by an undefined amount of time.</li>
<li>A service MAY return future versions of those elements alongside past versions. Entities need to treat elements with the same name, but a different namespace, as entirely different elements.</li>
</ol>
<table caption='Anonymity modes'>
<tr>
<th>Value</th>
<th>Description</th>
</tr>
<tr>
<td><tt>{&anonns;}none</tt></td>
<td>The bare JID of the account or the full JID of one or more devices of each occupant is visible to every other occupant.</td>
</tr>
<tr>
<td><tt>muc_semianonymous</tt></td>
<td>As specified in &xep0045;</td>
</tr>
</table>
<table caption='Service types'>
<tr>
<th>Value</th>
<th>Description</th>
</tr>
<tr>
<td><tt>xep-0045</tt></td>
<td>A &xep0045; service.</td>
</tr>
<tr>
<td><tt>xep-0369</tt></td>
<td>A &xep0369; service.</td>
</tr>
</table>
<p>If a &searchservice; would return entries with the same address with different service types, it SHOULD prefer &xep0369; over &xep0045;. Note that a &searchservice; MUST NOT return service types the client has not asked for.</p>
<p>&searchservice; implementations are free to add custom child elements to <tt>&itemel;</tt> elements. &searcher; implementations MUST be prepared to handle any unknown elements in <tt>&itemel;</tt>, for example by ignoring them.</p>
<p>Additional values for the <tt><anonymity-mode/></tt> element may be specified by future extensions. If an implementation encounters an unknown value in this element, it is RECOMMENDED to either treat it as synonymous with <tt>{&anonns;}none</tt> or request the anonymity mode from the <tt>address</tt> using a protocol appropriate for the channel's service.</p>
<p>Instead of rolling a custom protocol for the result items, &xep0055; could have been used.</p>
<p>While the result format of &xep0055; allows for some generality, it does so in a rather restricted way. It is limited by the data formats and types expressible in &xep0004;. Structured data, beyond lists of text and JIDs, cannot be represented with &xep0004; at all. Machine-readable data would also have to be human-readable at the same time to provide a fallback view for human users. Internationalization of such human-readable data in field values is not possible with &xep0004;.</p>
<p>The advantage of entities being able to process unknown fields in a degraded manner is, in principle, still present in the current proposal (although with a different kind of degradation).</p>
<p>Given the complexity of fully and correctly processing &xep0004; report data, the slim benefits did, in the eyes of the authors, not outweigh the costs.</p>
</section1>
<section1 topic='Acknowledgements' anchor='acks'>
<p>The basis for this protocol was developed for the search.jabber.network public group chat search service. It has been cleaned up for publication as a Standards Track XEP by the author and modified to support more use-cases.</p>