mirror of https://github.com/moparisthebest/xeps
<?xml version='1.0' encoding='UTF-8'?> |
|
<!DOCTYPE xep SYSTEM 'xep.dtd' [ |
|
<!ENTITY % ents SYSTEM 'xep.ent'> |
|
%ents; |
|
]> |
|
<?xml-stylesheet type='text/xsl' href='xep.xsl'?> |
|
<xep> |
|
<header> |
|
<title>Message Styling</title> |
|
<abstract> |
|
This specification defines a formatted text syntax for use in instant |
|
messages with simple text styling. |
|
</abstract> |
|
&LEGALNOTICE; |
|
<number>0393</number> |
|
<status>Experimental</status> |
|
<type>Standards Track</type> |
|
<sig>Standards</sig> |
|
<approver>Council</approver> |
|
<dependencies> |
|
<spec>XMPP Core</spec> |
|
<spec>XEP-0001</spec> |
|
</dependencies> |
|
<supersedes><spec>XEP-0071</spec></supersedes> |
|
<supersededby/> |
|
<shortname>styling</shortname> |
|
&sam; |
|
<revision> |
|
<version>0.1.4</version> |
|
<date>2018-05-01</date> |
|
<initials>ssw</initials> |
|
<remark> |
|
<p> |
|
Clarify language around strong emphasis. |
|
</p> |
|
</remark> |
|
</revision> |
|
<revision> |
|
<version>0.1.3</version> |
|
<date>2018-02-14</date> |
|
<initials>ssw</initials> |
|
<remark> |
|
<p> |
|
Reorder block and span sections, simplify block parsing, and update the |
|
definition of a span. |
|
</p> |
|
</remark> |
|
</revision> |
|
<revision> |
|
<version>0.1.2</version> |
|
<date>2018-01-13</date> |
|
<initials>ssw</initials> |
|
<remark> |
|
<p> |
|
Clarify block quote and plain text parsing and formatting behavior. |
|
</p> |
|
</remark> |
|
</revision> |
|
<revision> |
|
<version>0.1.1</version> |
|
<date>2018-01-12</date> |
|
<initials>ssw</initials> |
|
<remark> |
|
<p> |
|
Minor clarifications and updates, add security considerations, and |
|
expand the glossary. |
|
</p> |
|
</remark> |
|
</revision> |
|
<revision> |
|
<version>0.1.0</version> |
|
<date>2017-11-22</date> |
|
<initials>XEP Editor (ssw)</initials> |
|
<remark><p>First draft approved by the XMPP Council.</p></remark> |
|
</revision> |
|
<revision> |
|
<version>0.0.1</version> |
|
<date>2017-10-28</date> |
|
<initials>ssw</initials> |
|
<remark><p>First draft.</p></remark> |
|
</revision> |
|
</header> |
|
<section1 topic='Introduction' anchor='intro'> |
|
<p> |
|
Historically, XMPP has had no system for simple text styling. |
|
Instead, specifications like &xep0071; that require full layout engines have |
|
been used, leading to numerous security issues with implementations. |
|
Some entities have also performed their own styling based on identifiers in |
|
the body. |
|
While this has worked well in the past, it is not interoperable and leads to |
|
entities each supporting their own informal styling languages. |
|
</p> |
|
<p> |
|
This specification aims to provide a single, interoperable formatted text |
|
syntax that can be used by entities that do not require full layout engines. |
|
</p> |
|
</section1> |
|
<section1 topic='Requirements' anchor='reqs'> |
|
<ul> |
|
<li> |
|
Clients that do not support this specification MUST still be able to |
|
receive messages sent by clients using this specification and display them |
|
in a human-readable form. |
|
</li> |
|
<li> |
|
Clients that support this specification MUST NOT be required to use a |
|
layout engine such as HTML or LaTeX. |
|
</li> |
|
<li> |
|
Messages formatted using this specification MUST NOT hinder readability on |
|
receiving clients regardless of client background color, contrast, or |
|
window size. |
|
</li> |
|
<li> |
|
Messages formatted using this specification MUST NOT hinder readability by |
|
users with color vision deficiency or impaired vision. |
|
</li> |
|
<li> |
|
Messages formatted with this specification MUST render correctly in |
|
locales with right-to-left (RTL) layouts without causing confusion. |
|
</li> |
|
<li> |
|
Clients that support this specification MUST NOT be required to extract |
|
metadata unrelated to formatting or text style from the message. |
|
</li> |
|
<li> |
|
Servers MUST NOT need to implement any new functionality for this |
|
specification to be supported. |
|
</li> |
|
</ul> |
|
</section1> |
|
<section1 topic='Use Cases' anchor='usecases'> |
|
<ul> |
|
<li> |
|
As a user sending an instant message to a friend, I want to be able to |
|
emphasize an important part of my message. |
|
</li> |
|
<li> |
|
As a software developer, I want to be able to send code as pre-formatted, |
|
monospace, block or inline text to another developer. |
|
</li> |
|
<li> |
|
As a multi-user chat user, I want to add context to my reply by quoting an
earlier message in the chat.
|
</li> |
|
</ul> |
|
</section1> |
|
<section1 topic='Glossary' anchor='glossary'> |
|
<p> |
|
Many important terms used in this document are defined in &unicode;. |
|
The terms "left-to-right" (LTR) and "right-to-left" (RTL) are defined in |
|
&uax9;. |
|
The term "formatted text" is defined in &rfc7764;. |
|
</p> |
|
<dl> |
|
<di> |
|
<dt>Block</dt> |
|
<dd> |
|
Any chunk of text that can be parsed unambiguously in one pass. |
|
Blocks may contain one or more children which may be other blocks or |
|
spans. |
|
For example: |
|
<ul> |
|
<li>A single line of text comprising one or more spans</li> |
|
<li>A block quotation</li> |
|
<li>A preformatted code block</li> |
|
</ul> |
|
</dd> |
|
</di> |
|
<di> |
|
<dt>Formal markup language</dt> |
|
<dd> |
|
A structured markup language such as LaTeX, SGML, HTML, or XML that is |
|
formally defined and may include metadata unrelated to formatting or |
|
text style. |
|
</dd> |
|
</di> |
|
<di> |
|
<dt>Plain text</dt> |
|
<dd> |
|
Text that does not convey any particular formatting or interpretation of |
|
the text by computer programs. |
|
</dd> |
|
</di> |
|
<di> |
|
<dt>Span</dt> |
|
<dd> |
|
A group of text that may be rendered inline alongside other spans. |
|
Spans may be either plain text with no formatting applied, or may be |
|
formatted text that is enclosed by two styling directives. |
|
Spans are always children of blocks and may not escape from their |
|
containing block. |
|
Some spans may contain child spans. |
|
The following all contain spans marked by parentheses:
|
<ul> |
|
<li>(plain span)</li> |
|
<li>(<strong>*strong span*</strong>)</li> |
|
<li>(<em>_emphasized span_</em>)</li> |
|
<li>(<em>_emphasized span containing </em>(<em><strong>*strong span*</strong></em>)<em>_</em>)</li> |
|
<li>(span one )(<strong>*span two*</strong>)</li> |
|
</ul> |
|
</dd> |
|
</di> |
|
<di> |
|
<dt>Styling directive</dt> |
|
<dd> |
|
A character or set of characters that indicates the beginning of a span |
|
or block. |
|
For example, in certain contexts the characters '*' (U+002A ASTERISK)
and '_' (U+005F LOW LINE) may be styling directives that indicate the
beginning of a strong or emphasis span, and the string '```' (three
U+0060 GRAVE ACCENT characters) may be a styling directive that
indicates the beginning of a preformatted code block.
|
</dd> |
|
</di> |
|
<di> |
|
<dt>Whitespace character</dt> |
|
<dd> |
|
Any Unicode scalar value which has the property "White_Space" or is in |
|
category Z in the Unicode Character Database. |
|
</dd> |
|
</di> |
|
</dl> |
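<p>
The whitespace definition above can be sketched in code. The following is a
minimal, non-normative Python sketch; the names WHITE_SPACE and is_whitespace
are illustrative and not part of this specification. It assumes the set of
"White_Space" code points is hard-coded, because Python's unicodedata module
does not expose that property directly.
</p>
<example caption='Sketch: whitespace test (non-normative)'><![CDATA[
```python
import unicodedata

# Code points carrying the Unicode "White_Space" property, hard-coded here
# (assumption: the list reflects recent versions of the Unicode Character
# Database, where U+180E is no longer included).
WHITE_SPACE = frozenset(
    [0x0020, 0x0085, 0x00A0, 0x1680, 0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
    + list(range(0x0009, 0x000E))   # TAB, LF, VT, FF, CR
    + list(range(0x2000, 0x200B))   # EN QUAD through HAIR SPACE
)


def is_whitespace(ch: str) -> bool:
    """True if ch has White_Space or is in general category Z (Zs, Zl, Zp)."""
    return ord(ch) in WHITE_SPACE or unicodedata.category(ch).startswith("Z")
```
]]></example>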
|
</section1> |
|
<section1 topic='Business Rules' anchor='rules'> |
|
<section2 topic='Blocks' anchor='block'> |
|
<p> |
|
Parsers implementing message styling will first parse blocks and then |
|
parse child blocks or spans if allowed by the specific block type. |
|
</p> |
|
<section3 topic='Plain' anchor='line-block'> |
|
<p> |
|
Individual lines of text that are not inside of a preformatted text |
|
block are considered a "plain" block. |
|
Plain blocks are not bound by styling directives and do not imply |
|
formatting themselves, but they may contain spans which imply |
|
formatting. |
|
Plain blocks may not contain child blocks. |
|
</p> |
|
<example caption='Plain block text'><![CDATA[ |
|
<body> |
|
(There are three blocks in this body marked by parens,) |
|
(but there is no *formatting) |
|
(as spans* may not escape blocks.) |
|
</body> |
|
]]></example> |
|
</section3> |
|
<section3 topic='Preformatted Text' anchor='pre-block'> |
|
<p> |
|
A preformatted text block is started by a line beginning with "```" |
|
(U+0060 GRAVE ACCENT), and ended by a line containing only three grave |
|
accents or the end of the parent block (whichever comes first). |
|
Preformatted text blocks cannot contain child blocks or spans. |
|
Text inside a preformatted block SHOULD be displayed in a monospace font. |
|
</p> |
|
<example caption='Preformatted block text'><![CDATA[ |
|
<body> |
|
```ignored |
|
(println "Hello, world!") |
|
``` |
|
|
|
This should show up as monospace, preformatted text ⤴ |
|
</body> |
|
]]></example> |
|
<example caption='No closing preformatted text sequence'><![CDATA[ |
|
<body> |
|
> ``` |
|
> (println "Hello, world!") |
|
|
|
The entire blockquote is a preformatted text block, but this line |
|
is plaintext! |
|
</body> |
|
]]></example> |
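<p>
As a non-normative illustration, the start and end rules for preformatted
text blocks might be implemented roughly as follows. This is a Python sketch
under stated assumptions: the function name split_blocks and the tuple layout
are hypothetical, any text following the opening directive on the same line
is discarded, and quotations are not handled here.
</p>
<example caption='Sketch: splitting lines into preformatted and plain blocks (non-normative)'><![CDATA[
```python
FENCE = "`" * 3  # three GRAVE ACCENT characters, built up to avoid a literal fence


def split_blocks(lines):
    """Group lines into ('pre', [lines]) and ('plain', line) blocks."""
    blocks, i = [], 0
    while i < len(lines):
        if lines[i].startswith(FENCE):
            j = i + 1
            # Look for a line holding only the closing directive; if none is
            # found, the block runs to the end of the parent block.
            while j < len(lines) and lines[j] != FENCE:
                j += 1
            blocks.append(("pre", lines[i + 1:j]))
            i = j + 1
        else:
            blocks.append(("plain", lines[i]))
            i += 1
    return blocks
```
]]></example>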
|
</section3> |
|
<section3 topic='Quotations' anchor='quote'> |
|
<p> |
|
A quotation is indicated by one or more lines, each beginning with a
'>' (U+003E GREATER-THAN SIGN).
|
Block quotes may contain any child block, including other quotations. |
|
Lines inside the block quote MUST have leading spaces trimmed before |
|
parsing the child block. |
|
It is RECOMMENDED that text inside of a block quote be indented or |
|
distinguished from the surrounding text in some other way. |
|
</p> |
|
<example caption='Quotation (LTR)'><![CDATA[ |
|
<body> |
|
> That that is, is. |
|
|
|
Said the old hermit of Prague. |
|
</body> |
|
]]></example> |
|
<example caption='Nested Quotation'><![CDATA[ |
|
<body> |
|
>> That that is, is. |
|
> Said the old hermit of Prague. |
|
|
|
Who? |
|
</body> |
|
]]></example> |
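<p>
The quotation rules lend themselves to a small recursive routine. The
following non-normative Python sketch assumes that a run of adjacent quoted
lines forms a single quotation; the name parse_quotes and the shape of the
returned tree are illustrative only. Preformatted text is deliberately left
out to keep the example short.
</p>
<example caption='Sketch: parsing nested quotations (non-normative)'><![CDATA[
```python
def parse_quotes(lines):
    """Return a tree: plain lines as str, quotations as ('quote', children)."""
    out, i = [], 0
    while i < len(lines):
        if lines[i].startswith(">"):
            inner = []
            # Collect the run of adjacent quoted lines, dropping the '>' and
            # trimming leading spaces before the child block is parsed.
            while i < len(lines) and lines[i].startswith(">"):
                inner.append(lines[i][1:].lstrip(" "))
                i += 1
            out.append(("quote", parse_quotes(inner)))
        else:
            out.append(lines[i])
            i += 1
    return out
```
]]></example>
<p>
Applied to the nested quotation example above, the outer quotation holds an
inner quotation followed by one plain line.
</p>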
|
</section3> |
|
</section2> |
|
<section2 topic='Spans' anchor='span'> |
|
<p> |
|
Matches of spans between two styling directives MUST contain some text |
|
between the two styling directives and the opening styling directive MUST |
|
be located at the beginning of the line, or after a whitespace character. |
|
The opening styling directive MUST NOT be followed by a whitespace
character and the closing styling directive MUST NOT be preceded by a
whitespace character.
|
Spans are always parsed from the beginning of the byte stream to the end |
|
and are lazily matched. |
|
Characters that would be styling directives but do not follow these rules |
|
are not considered when matching and thus may be present between two other |
|
styling directives. |
|
</p> |
|
<p> |
|
For example, each of the following would be styled as indicated: |
|
</p> |
|
<ul> |
|
<li><strong>*strong*</strong></li> |
|
<li>plain <strong>*strong*</strong> plain</li> |
|
<li><strong>*strong*</strong> plain <strong>*strong*</strong></li> |
|
<li><strong>*strong*</strong>plain*</li> |
|
<li>* plain <strong>*strong*</strong></li> |
|
</ul> |
|
<p> |
|
Nothing would be styled in the following messages (where "\n" represents a |
|
new line): |
|
</p> |
|
<ul> |
|
<li>not strong*</li> |
|
<li>*not strong</li> |
|
<li>*not \n strong*</li> |
|
<li>*not *strong</li> |
|
<li>**</li> |
|
<li>****</li> |
|
</ul> |
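<p>
The matching rules and the examples above can be checked with a short
routine. The following non-normative Python sketch handles a single
directive on a single line and does not look for child spans. It rests on
two assumptions that are not stated in this document: str.isspace() is used
as an approximation of the whitespace definition in the glossary, and an
opening directive that is immediately followed by the same directive is
abandoned (this reading is chosen because it leaves "**" and "****"
unstyled, as listed above). The name find_spans is illustrative.
</p>
<example caption='Sketch: lazy span matching for one directive (non-normative)'><![CDATA[
```python
def find_spans(line, directive="*"):
    """Return (start, end) index pairs of styled spans for one directive.

    Both indices point at the directive characters themselves.
    """
    spans, i = [], 0
    while i < len(line):
        opens = (
            line[i] == directive
            and (i == 0 or line[i - 1].isspace())   # line start or after whitespace
            and i + 1 < len(line)
            and not line[i + 1].isspace()           # not followed by whitespace
        )
        if opens:
            j = i + 1
            while j < len(line):
                if line[j] == directive:
                    if j == i + 1:
                        break  # no text between the directives: drop this opener
                    if not line[j - 1].isspace():
                        spans.append((i, j))        # lazy: first valid closer wins
                        i = j
                        break
                j += 1
        i += 1
    return spans
```
]]></example>
<p>
Other span types could be matched by passing a different directive, for
example find_spans(line, "_") for emphasis.
</p>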
|
<section3 topic='Plain' anchor='plain'> |
|
<p> |
|
Any text inside of a block that is not part of another span is |
|
implicitly considered to be inside of a "plain text" span. |
|
</p> |
|
<example caption='Plain'><![CDATA[ |
|
<body> |
|
(Two spans, both )(*alike in dignity*) |
|
</body> |
|
]]></example> |
|
</section3> |
|
<section3 topic='Emphasis' anchor='emph'> |
|
<p> |
|
Text enclosed by '_' (U+005F LOW LINE) is emphasized and SHOULD be |
|
displayed in italics. |
|
</p> |
|
<example caption='Italic'><![CDATA[ |
|
<body> |
|
The full title is _Twelfth Night, or What You Will_ but |
|
_most_ people shorten it. |
|
</body> |
|
]]></example> |
|
</section3> |
|
<section3 topic='Strong Emphasis' anchor='strong'> |
|
<p> |
|
Text enclosed by '*' (U+002A ASTERISK) is strongly emphasized and SHOULD |
|
be displayed with a heavier font weight than the surrounding text |
|
(bold). |
|
</p> |
|
<example caption='Strong'><![CDATA[ |
|
<body> |
|
The full title is "Twelfth Night, or What You Will" but |
|
*most* people shorten it. |
|
</body> |
|
]]></example> |
|
</section3> |
|
<section3 topic='Strike through' anchor='strike'> |
|
<p> |
|
Text enclosed by '~' (U+007E TILDE) SHOULD be displayed with a horizontal |
|
line through the middle. |
|
</p> |
|
<example caption='Strike through'><![CDATA[ |
|
<body> |
|
Everyone ~dis~likes cake. |
|
</body> |
|
]]></example> |
|
</section3> |
|
<section3 topic='Preformatted Span' anchor='mono'> |
|
<p> |
|
Text enclosed by a '`' (U+0060 GRAVE ACCENT) is a preformatted span and
SHOULD be displayed inline in a monospace font.
|
A preformatted span may only contain a single plain span. |
|
Inline formatting directives inside the preformatted span are not |
|
rendered. |
|
For example, the following all contain valid preformatted spans: |
|
</p> |
|
<ul> |
|
<li>This is <tt>`monospace`</tt></li> |
|
<li>This is <tt>`*monospace*`</tt></li> |
|
<li>This is <strong><tt>*`monospace and bold`*</tt></strong></li> |
|
</ul> |
|
<example caption='Monospace text'><![CDATA[ |
|
<body> |
|
Wow, I can write in `monospace`! |
|
</body> |
|
]]></example> |
|
</section3> |
|
</section2> |
|
</section1> |
|
<section1 topic='Implementation Notes' anchor='impl'> |
|
<p> |
|
This document does not define a regular grammar and thus styling cannot be |
|
matched by a regular expression. |
|
Instead, a simple parser can be constructed by first parsing all text into |
|
blocks and then recursively parsing the child-blocks inside block |
|
quotations, the spans inside individual lines, and by returning the text |
|
inside preformatted blocks without modification. |
|
</p> |
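<p>
The approach described above might look roughly like the following
non-normative Python sketch, which combines block splitting with recursion
into quotations. The function name parse, the node names, and the max_depth
guard are assumptions made for illustration; the guard simply stops treating
lines as quotations beyond a chosen nesting depth so that hostile input
cannot drive the recursion arbitrarily deep. Span parsing is omitted.
</p>
<example caption='Sketch: recursive block parser (non-normative)'><![CDATA[
```python
FENCE = "`" * 3  # three GRAVE ACCENT characters


def parse(lines, depth=0, max_depth=16):
    """Parse lines into ('pre', text), ('quote', children) and ('line', text)."""
    tree, i = [], 0
    while i < len(lines):
        line = lines[i]
        if line.startswith(FENCE):
            j = i + 1
            while j < len(lines) and lines[j] != FENCE:
                j += 1
            # Preformatted text is returned without modification.
            tree.append(("pre", "\n".join(lines[i + 1:j])))
            i = j + 1
        elif line.startswith(">") and depth < max_depth:
            j = i
            while j < len(lines) and lines[j].startswith(">"):
                j += 1
            inner = [q[1:].lstrip(" ") for q in lines[i:j]]
            # Recurse into the child blocks of the quotation.
            tree.append(("quote", parse(inner, depth + 1, max_depth)))
            i = j
        else:
            # A plain block; its spans would be parsed at this point.
            tree.append(("line", line))
            i += 1
    return tree
```
]]></example>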
|
<p> |
|
It is RECOMMENDED that formatting characters be displayed and formatted in |
|
the same manner as the text they apply to. |
|
For example, the string "*emphasis*" would be rendered as |
|
"<strong>*emphasis*</strong>". |
|
</p> |
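<p>
One way to follow this recommendation is to keep the directive characters
inside the styled region when producing output. The following non-normative
Python sketch assumes HTML as the output format purely for illustration
(this specification does not require HTML) and assumes the spans have
already been located as non-overlapping (start, end, tag) triples whose
indices point at the directive characters; render_line is a hypothetical
name.
</p>
<example caption='Sketch: rendering spans with their directives (non-normative)'><![CDATA[
```python
from html import escape


def render_line(line, spans):
    """Wrap each (start, end, tag) span in an HTML tag, directives included."""
    out, pos = [], 0
    for start, end, tag in sorted(spans):
        out.append(escape(line[pos:start]))
        # The slice keeps both directive characters inside the element.
        out.append(f"<{tag}>{escape(line[start:end + 1])}</{tag}>")
        pos = end + 1
    out.append(escape(line[pos:]))
    return "".join(out)
```
]]></example>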
|
</section1> |
|
<section1 topic='Accessibility Considerations' anchor='access'> |
|
<p> |
|
When displaying text with formatting, developers should take care to ensure |
|
sufficient contrast exists between styled and unstyled text so that users |
|
with vision deficiencies are able to distinguish between the two. |
|
</p> |
|
<p> |
|
Formatted text may also be rendered poorly by screen readers. |
|
When applying formatting it may be desirable to include directives to |
|
exclude formatting characters from being read. |
|
</p> |
|
</section1> |
|
<section1 topic='Security Considerations' anchor='security'> |
|
<p> |
|
Authors of message styling parsers should take care that improperly |
|
formatted messages cannot lead to buffer overruns or code execution. |
|
</p> |
|
</section1> |
|
<section1 topic='IANA Considerations' anchor='iana'> |
|
<p> |
|
This document requires no interaction with &IANA;. |
|
</p> |
|
</section1> |
|
<section1 topic='XMPP Registrar Considerations' anchor='registrar'> |
|
<p>This specification requires no interaction with the &REGISTRAR;.</p>
|
</section1> |
|
<section1 topic='XML Schema' anchor='schema'> |
|
<p>This document does not define any new XML structure requiring a schema.</p> |
|
</section1> |
|
<section1 topic='Acknowledgements' anchor='ack'> |
|
<p>The author wishes to thank Kevin Smith for his review and feedback.</p> |
|
</section1> |
|
</xep>