<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
<!ENTITY % ents SYSTEM 'xep.ent'>
%ents;
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>Message Styling</title>
<abstract>
This specification defines a formatted text syntax for use in instant
messages with simple text styling.
</abstract>
&LEGALNOTICE;
<number>0393</number>
<status>Proposed</status>
<lastcall>2020-05-26</lastcall>
<type>Standards Track</type>
<sig>Standards</sig>
<approver>Council</approver>
<dependencies>
<spec>XMPP Core</spec>
<spec>XEP-0001</spec>
</dependencies>
<supersedes><spec>XEP-0071</spec></supersedes>
<supersededby/>
<shortname>styling</shortname>
&sam;
<revision>
<version>0.2.1</version>
<date>2020-01-02</date>
<initials>ssw</initials>
<remark>
<p>
Make rules consistent with examples that show multiple styling
directives being applied to the same text.
</p>
</remark>
</revision>
<revision>
<version>0.2.0</version>
<date>2019-09-02</date>
<initials>ssw</initials>
<remark>
<p>
Clarify block quote termination and white space trimming.
</p>
</remark>
</revision>
<revision>
<version>0.1.4</version>
<date>2018-05-01</date>
<initials>ssw</initials>
<remark>
<p>
Clarify language around strong emphasis.
</p>
</remark>
</revision>
<revision>
<version>0.1.3</version>
<date>2018-02-14</date>
<initials>ssw</initials>
<remark>
<p>
Reorder block and span sections, simplify block parsing, and update the
definition of a span.
</p>
</remark>
</revision>
<revision>
<version>0.1.2</version>
<date>2018-01-13</date>
<initials>ssw</initials>
<remark>
<p>
Clarify block quote and plain text parsing and formatting behavior.
</p>
</remark>
</revision>
<revision>
<version>0.1.1</version>
<date>2018-01-12</date>
<initials>ssw</initials>
<remark>
<p>
Minor clarifications and updates, add security considerations, and
expand the glossary.
</p>
</remark>
</revision>
<revision>
<version>0.1.0</version>
<date>2017-11-22</date>
<initials>XEP Editor (ssw)</initials>
<remark><p>First draft approved by the XMPP Council.</p></remark>
</revision>
<revision>
<version>0.0.1</version>
<date>2017-10-28</date>
<initials>ssw</initials>
<remark><p>First draft.</p></remark>
</revision>
</header>
<section1 topic='Introduction' anchor='intro'>
<p>
Historically, XMPP has had no system for simple text styling.
Instead, specifications like &xep0071; that require full layout engines have
been used, leading to numerous security issues with implementations.
Some entities have also performed their own styling based on identifiers in
the body.
While this has worked well in the past, it is not interoperable and leads to
entities each supporting their own informal styling languages.
</p>
<p>
This specification aims to provide a single, interoperable formatted text
syntax that can be used by entities that do not require full layout engines.
</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<ul>
<li>
Clients that do not support this specification MUST still be able to
receive messages sent by clients using this specification and display them
in a human-readable form.
</li>
<li>
Clients that support this specification MUST NOT be required to use a
layout engine such as HTML or LaTeX.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability on
receiving clients regardless of client background color, contrast, or
window size.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability by
users with color vision deficiency or impaired vision.
</li>
<li>
Messages formatted with this specification MUST render correctly in
locales with right-to-left (RTL) layouts without causing confusion.
</li>
<li>
Clients that support this specification MUST NOT be required to extract
metadata unrelated to formatting or text style from the message.
</li>
<li>
Servers MUST NOT need to implement any new functionality for this
specification to be supported.
</li>
</ul>
</section1>
<section1 topic='Use Cases' anchor='usecases'>
<ul>
<li>
As a user sending an instant message to a friend, I want to be able to
emphasize an important part of my message.
</li>
<li>
As a software developer, I want to be able to send code as preformatted,
monospace, block or inline text to another developer.
</li>
<li>
As a multi-user chat user, I want to add context to my reply by quoting an
earlier message in the chat.
</li>
</ul>
</section1>
<section1 topic='Glossary' anchor='glossary'>
<p>
Many important terms used in this document are defined in &unicode;.
The terms "left-to-right" (LTR) and "right-to-left" (RTL) are defined in
&uax9;.
The term "formatted text" is defined in &rfc7764;.
</p>
<dl>
<di>
<dt>Block</dt>
<dd>
Any chunk of text that can be parsed unambiguously in one pass.
Blocks may contain one or more children which may be other blocks or
spans.
For example:
<ul>
<li>A single line of text comprising one or more spans</li>
<li>A block quotation</li>
<li>A preformatted code block</li>
</ul>
</dd>
</di>
<di>
<dt>Formal markup language</dt>
<dd>
A structured markup language such as LaTeX, SGML, HTML, or XML that is
formally defined and may include metadata unrelated to formatting or
text style.
</dd>
</di>
<di>
<dt>Plain text</dt>
<dd>
Text that does not convey any particular formatting or interpretation of
the text by computer programs.
</dd>
</di>
<di>
<dt>Span</dt>
<dd>
A group of text that may be rendered inline alongside other spans.
Spans may be either plain text with no formatting applied, or may be
formatted text that is enclosed by two styling directives.
Spans are always children of blocks and may not escape from their
containing block.
Some spans may contain child spans.
The following all contain spans marked by parentheses:
<ul>
<li>(plain span)</li>
<li>(<strong>*strong span*</strong>)</li>
<li>(<em>_emphasized span_</em>)</li>
<li>(<em>_emphasized span containing </em>(<em><strong>*strong span*</strong></em>)<em>_</em>)</li>
<li>(span one )(<strong>*span two*</strong>)</li>
</ul>
</dd>
</di>
<di>
<dt>Styling directive</dt>
<dd>
A character or set of characters that indicates the beginning of a span
or block.
For example, in certain contexts the characters '*' (U+002A ASTERISK)
and '_' (U+005F LOW LINE) may be styling directives that indicate the
beginning of a strong or emphasis span, and the string '```' (three
U+0060 GRAVE ACCENT characters) may be a styling directive that indicates
the beginning of a preformatted code block.
</dd>
</di>
<di>
<dt>Whitespace character</dt>
<dd>
Any Unicode scalar value which has the property "White_Space" or is in
category Z in the Unicode Character Database.
</dd>
</di>
</dl>
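<p>
As a non-normative illustration, the whitespace definition above could be
checked as in the following Python sketch.
The name is_whitespace and the hard-coded set are assumptions made for this
example only: the set lists the code points believed to carry the White_Space
property in recent Unicode versions, and implementers should verify it against
the version of the Unicode Character Database they target.
</p>
<example caption='Whitespace check (illustrative sketch)'><![CDATA[
```python
import unicodedata

# Assumption: code points with the White_Space property in recent Unicode
# versions; check this against the UCD version you actually target.
WHITE_SPACE = frozenset([
    *range(0x0009, 0x000E), 0x0020, 0x0085, 0x00A0, 0x1680,
    *range(0x2000, 0x200B), 0x2028, 0x2029, 0x202F, 0x205F, 0x3000,
])


def is_whitespace(ch):
    """Return True if the single character ch is whitespace per the glossary."""
    # Either the White_Space property or general category Z (Zs, Zl, Zp).
    return ord(ch) in WHITE_SPACE or unicodedata.category(ch).startswith("Z")
```
]]></example>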
</section1>
<section1 topic='Business Rules' anchor='rules'>
<section2 topic='Blocks' anchor='block'>
<p>
Parsers implementing message styling will first parse blocks and then
parse child blocks or spans if allowed by the specific block type.
</p>
<section3 topic='Plain' anchor='line-block'>
<p>
Each individual line of text that is not inside a preformatted text
block is considered a "plain" block.
Plain blocks are not bound by styling directives and do not imply
formatting themselves, but they may contain spans which imply
formatting.
Plain blocks may not contain child blocks.
</p>
<example caption='Plain block text'><![CDATA[
<body>
(There are three blocks in this body marked by parens,)
(but there is no *formatting)
(as spans* may not escape blocks.)
</body>
]]></example>
</section3>
<section3 topic='Preformatted Text' anchor='pre-block'>
<p>
A preformatted text block is started by a line beginning with "```"
(U+0060 GRAVE ACCENT), and ended by a line containing only three grave
accents or the end of the parent block (whichever comes first).
Preformatted text blocks cannot contain child blocks or spans.
Text inside a preformatted block SHOULD be displayed in a monospace font.
</p>
<example caption='Preformatted block text'><![CDATA[
<body>
```ignored
(println "Hello, world!")
```

This should show up as monospace, preformatted text ⤴
</body>
]]></example>
<example caption='No closing preformatted text sequence'><![CDATA[
<body>
> ```
> (println "Hello, world!")

The entire blockquote is a preformatted text block, but this line
is plaintext!
</body>
]]></example>
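<p>
The start and end rules for preformatted text blocks can be sketched as
follows.
This is a minimal, non-normative Python illustration that ignores quotations;
the names FENCE and split_pre_blocks and the tuple-based result are
assumptions made for this example and are not defined by this specification.
</p>
<example caption='Splitting preformatted blocks (illustrative sketch)'><![CDATA[
```python
FENCE = "`" * 3  # three U+0060 GRAVE ACCENT characters


def split_pre_blocks(text):
    """Split text into ("pre", content) and ("plain", line) blocks."""
    lines = text.split("\n")
    blocks = []
    i = 0
    while i < len(lines):
        if lines[i].startswith(FENCE):
            # Anything after the opening fence on the same line is skipped.
            i += 1
            content = []
            # Collect lines until one contains only the fence or input ends.
            while i < len(lines) and lines[i] != FENCE:
                content.append(lines[i])
                i += 1
            i += 1  # step over the closing fence (harmless at end of input)
            blocks.append(("pre", "\n".join(content)))
        else:
            blocks.append(("plain", lines[i]))
            i += 1
    return blocks
```
]]></example>
<p>
Because the end of the input also terminates the block, an unclosed
preformatted block simply extends to the last line it was given.
</p>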
</section3>
<section3 topic='Quotations' anchor='quote'>
<p>
A quotation is indicated by one or more consecutive lines, each beginning
with a '>' (U+003E GREATER-THAN SIGN).
A quotation is terminated by the first new line that is not followed by a
greater-than sign, or by the end of the parent block (whichever comes
first).
Block quotes may contain any child block, including other quotations.
Lines inside the block quote MUST have the first leading whitespace
character trimmed before parsing the child block.
It is RECOMMENDED that text inside of a block quote be indented or
distinguished from the surrounding text in some other way.
</p>
<example caption='Quotation (LTR)'><![CDATA[
<body>
> That that is, is.

Said the old hermit of Prague.
</body>
]]></example>
<example caption='Nested Quotation'><![CDATA[
<body>
>> That that is, is.
> Said the old hermit of Prague.

Who?
</body>
]]></example>
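<p>
The trimming rule can be illustrated with a short, non-normative Python
sketch that removes one quotation level at a time.
The function names unquote and take_quote are hypothetical, and str.isspace
is used only as an approximation of the whitespace definition in the glossary.
</p>
<example caption='Removing one quotation level (illustrative sketch)'><![CDATA[
```python
def unquote(line):
    """Remove one quotation level from a line that begins with '>'."""
    rest = line[1:]
    # Trim only the first leading whitespace character, if there is one.
    return rest[1:] if rest[:1].isspace() else rest


def take_quote(lines):
    """Return (inner, remaining) for a list of lines.

    inner holds the leading quotation lines with one level removed;
    remaining holds every line after the quotation ends.
    """
    n = 0
    while n < len(lines) and lines[n].startswith(">"):
        n += 1
    return [unquote(line) for line in lines[:n]], lines[n:]
```
]]></example>
<p>
Applying take_quote again to the returned inner lines handles nested
quotations such as the one in the example above.
</p>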
</section3>
</section2>
<section2 topic='Spans' anchor='span'>
<p>
Matches of spans between two styling directives MUST contain some text
between the two styling directives.
The opening styling directive MUST be located at the beginning of the line,
after a whitespace character, or after a different opening styling
directive.
The opening styling directive MUST NOT be followed by a whitespace
character and the closing styling directive MUST NOT be preceded by a
whitespace character.
Spans are always parsed from the beginning of the byte stream to the end
and are lazily matched.
Characters that would be styling directives but do not follow these rules
are not considered when matching and thus may be present between two other
styling directives.
</p>
<p>
For example, each of the following would be styled as indicated:
</p>
<ul>
<li><strong>*strong*</strong></li>
<li>plain <strong>*strong*</strong> plain</li>
<li><strong>*strong*</strong> plain <strong>*strong*</strong></li>
<li><strong>*strong*</strong>plain*</li>
<li>* plain <strong>*strong*</strong></li>
</ul>
<p>
Nothing would be styled in the following messages (where "\n" represents a
new line):
</p>
<ul>
<li>not strong*</li>
<li>*not strong</li>
<li>*not \n strong*</li>
<li>*not *strong</li>
<li>**</li>
<li>****</li>
</ul>
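<p>
One possible reading of the matching rules above is sketched below in Python
for a single line of text.
This is a non-normative illustration: the names DIRECTIVES, parse_spans and
the helper functions are hypothetical, str.isspace only approximates the
whitespace definition in the glossary, and the "after a different opening
styling directive" case is handled solely by recursing into spans that have
already been matched.
</p>
<example caption='Span matching (illustrative sketch)'><![CDATA[
```python
# Directive character -> span name (the names are illustrative only).
DIRECTIVES = {"*": "strong", "_": "emphasis", "~": "strike", "`": "pre"}


def _can_open(line, i):
    """An opener starts the text or follows whitespace, and is itself
    followed by a non-whitespace character."""
    if i + 1 >= len(line) or line[i + 1].isspace():
        return False
    return i == 0 or line[i - 1].isspace()


def _find_close(line, i):
    """Return the index of the closer for the opener at i, or None."""
    j = line.find(line[i], i + 1)
    while j != -1:
        if not line[j - 1].isspace():
            # Lazy match: the first usable closer decides the outcome,
            # and a span with no text between the directives is rejected.
            return j if j > i + 1 else None
        j = line.find(line[i], j + 1)
    return None


def parse_spans(line):
    """Parse one line into plain strings and (name, children) tuples."""
    nodes, plain_start, i = [], 0, 0
    while i < len(line):
        ch = line[i]
        end = None
        if ch in DIRECTIVES and _can_open(line, i):
            end = _find_close(line, i)
        if end is None:
            i += 1
            continue
        if plain_start < i:
            nodes.append(line[plain_start:i])
        inner = line[i + 1:end]
        # A preformatted span holds one plain span; other spans may nest.
        children = [inner] if ch == "`" else parse_spans(inner)
        nodes.append((DIRECTIVES[ch], children))
        i = plain_start = end + 1
    if plain_start < len(line):
        nodes.append(line[plain_start:])
    return nodes
```
]]></example>
<p>
Under these assumptions, the styled examples listed above produce a strong
span where indicated, and each unstyled example is returned as a single plain
string.
</p>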
<section3 topic='Plain' anchor='plain'>
<p>
Any text inside of a block that is not part of another span is
implicitly considered to be inside of a "plain text" span.
</p>
<example caption='Plain'><![CDATA[
<body>
(Two spans, both )(*alike in dignity*)
</body>
]]></example>
</section3>
<section3 topic='Emphasis' anchor='emph'>
<p>
Text enclosed by '_' (U+005F LOW LINE) is emphasized and SHOULD be
displayed in italics.
</p>
<example caption='Italic'><![CDATA[
<body>
The full title is _Twelfth Night, or What You Will_ but
_most_ people shorten it.
</body>
]]></example>
</section3>
<section3 topic='Strong Emphasis' anchor='strong'>
<p>
Text enclosed by '*' (U+002A ASTERISK) is strongly emphasized and SHOULD
be displayed with a heavier font weight than the surrounding text
(bold).
</p>
<example caption='Strong'><![CDATA[
<body>
The full title is "Twelfth Night, or What You Will" but
*most* people shorten it.
</body>
]]></example>
</section3>
<section3 topic='Strike through' anchor='strike'>
<p>
Text enclosed by '~' (U+007E TILDE) SHOULD be displayed with a horizontal
line through the middle.
</p>
<example caption='Strike through'><![CDATA[
<body>
Everyone ~dis~likes cake.
</body>
]]></example>
</section3>
<section3 topic='Preformatted Span' anchor='mono'>
<p>
Text enclosed by a '`' (U+0060 GRAVE ACCENT) is a preformatted span and
SHOULD be displayed inline in a monospace font.
A preformatted span may only contain a single plain span.
Inline formatting directives inside the preformatted span are not
rendered.
For example, the following all contain valid preformatted spans:
</p>
<ul>
<li>This is <tt>`monospace`</tt></li>
<li>This is <tt>`*monospace*`</tt></li>
<li>This is <strong><tt>*`monospace and bold`*</tt></strong></li>
</ul>
<example caption='Monospace text'><![CDATA[
<body>
Wow, I can write in `monospace`!
</body>
]]></example>
</section3>
</section2>
</section1>
<section1 topic='Implementation Notes' anchor='impl'>
<p>
This document does not define a regular grammar and thus styling cannot be
matched by a regular expression.
Instead, a simple parser can be constructed by first parsing all text into
blocks and then recursively parsing the child blocks inside block
quotations, the spans inside individual lines, and by returning the text
inside preformatted blocks without modification.
</p>
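<p>
The block-level half of such a parser might look like the following Python
sketch.
It is non-normative and makes several assumptions: the names FENCE and
parse_blocks and the tuple-based tree are hypothetical, str.isspace
approximates the whitespace definition in the glossary, and span parsing of
plain lines is left out.
</p>
<example caption='Recursive block parsing (illustrative sketch)'><![CDATA[
```python
FENCE = "`" * 3  # three U+0060 GRAVE ACCENT characters


def parse_blocks(lines):
    """Parse a list of lines into ("pre" | "quote" | "plain", payload) tuples."""
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(FENCE):
            # Preformatted block: keep the enclosed lines without modification.
            i += 1
            start = i
            while i < len(lines) and lines[i] != FENCE:
                i += 1
            blocks.append(("pre", lines[start:i]))
            i += 1  # step over the closing fence, if there was one
        elif line.startswith(">"):
            # Quotation: strip one level, then parse the child blocks.
            inner = []
            while i < len(lines) and lines[i].startswith(">"):
                rest = lines[i][1:]
                inner.append(rest[1:] if rest[:1].isspace() else rest)
                i += 1
            blocks.append(("quote", parse_blocks(inner)))
        else:
            # Plain block: a span parser would process this line next.
            blocks.append(("plain", line))
            i += 1
    return blocks
```
]]></example>
<p>
Because each quotation is parsed as its own list of lines, a preformatted
block that is never closed inside a quotation ends with that quotation, as in
the "No closing preformatted text sequence" example.
</p>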
<p>
It is RECOMMENDED that formatting characters be displayed and formatted in
the same manner as the text they apply to.
For example, the string "*emphasis*" would be rendered as
"<strong>*emphasis*</strong>".
</p>
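<p>
For clients that happen to render to HTML, this recommendation could be
followed as in the Python sketch below.
It is non-normative: the function name render, the TAGS mapping and the input
tree of plain strings and (name, children) tuples are assumptions made for
this example, and the choice of HTML elements mirrors the examples in this
document rather than any requirement.
</p>
<example caption='Rendering spans with their directives (illustrative sketch)'><![CDATA[
```python
import html

# Span name -> (HTML element, directive character); illustrative choices only.
TAGS = {
    "strong": ("strong", "*"),
    "emphasis": ("em", "_"),
    "strike": ("del", "~"),
    "pre": ("tt", "`"),
}


def render(nodes):
    """Render a list of plain strings and (name, children) tuples to HTML."""
    out = []
    for node in nodes:
        if isinstance(node, str):
            # Escape plain text so message content cannot inject markup.
            out.append(html.escape(node))
        else:
            name, children = node
            tag, mark = TAGS[name]
            # Keep the directive characters inside the styled element.
            out.append(f"<{tag}>{mark}{render(children)}{mark}</{tag}>")
    return "".join(out)
```
]]></example>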
</section1>
<section1 topic='Accessibility Considerations' anchor='access'>
<p>
When displaying text with formatting, developers should take care to ensure
sufficient contrast exists between styled and unstyled text so that users
with vision deficiencies are able to distinguish between the two.
</p>
<p>
Formatted text may also be rendered poorly by screen readers.
When applying formatting it may be desirable to include directives to
exclude formatting characters from being read.
</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>
<p>
Authors of message styling parsers should take care that improperly
formatted messages cannot lead to buffer overruns or code execution.
</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
<p>
This document requires no interaction with &IANA;.
</p>
</section1>
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
<p>This specification requires no interaction with the &REGISTRAR;.</p>
</section1>
<section1 topic='XML Schema' anchor='schema'>
<p>This document does not define any new XML structure requiring a schema.</p>
</section1>
<section1 topic='Acknowledgements' anchor='ack'>
<p>The author wishes to thank Kevin Smith for his review and feedback.</p>
</section1>
</xep>