<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
<!ENTITY % ents SYSTEM 'xep.ent'>
%ents;
<!ENTITY rfc7764 "<span class='ref'><link url='http://tools.ietf.org/html/rfc7764'>RFC 7764</link></span> <note>RFC 7764: Guidance on Markdown: Design Philosophies, Stability Strategies, and Select Registrations &lt;<link url='http://tools.ietf.org/html/rfc7764'>http://tools.ietf.org/html/rfc7764</link>&gt;.</note>" >
<!ENTITY uax9 "<span class='ref'>Unicode Standard Annex #9</span> <note>Unicode Standard Annex #9, &quot;Unicode Bidirectional Algorithm&quot;, edited by Mark Davis, Aharon Lanin, and Andrew Glass. An integral part of The Unicode Standard, &lt;<link url='http://unicode.org/reports/tr9/'>http://unicode.org/reports/tr9/</link>&gt;.</note>" >
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>Message Styling</title>
<abstract>
This specification defines a formatted text syntax for use in instant
messages with simple text styling.
</abstract>
&LEGALNOTICE;
<number>xxxx</number>
<status>ProtoXEP</status>
<type>Standards Track</type>
<sig>Standards</sig>
<approver>Council</approver>
<dependencies>
<spec>XMPP Core</spec>
<spec>XEP-0001</spec>
</dependencies>
<supersedes/>
<supersededby/>
<shortname>styling</shortname>
&sam;
<revision>
<version>0.0.1</version>
<date>2017-10-28</date>
<initials>ssw</initials>
<remark><p>First draft.</p></remark>
</revision>
</header>
<section1 topic='Introduction' anchor='intro'>
<p>
Historically, XMPP has had no system for simple text styling.
Instead, specifications like &xep0071; that require full layout engines have
been used, leading to numerous security issues with implementations.
Some entities have also performed their own styling based on identifiers in
the body.
While this has worked well in the past, it is not interoperable and leads to
entities each supporting their own informal styling languages.
</p>
<p>
This specification aims to provide a single, interoperable formatted text
syntax that can be used by entities that do not require full layout engines.
</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<ul>
<li>
Clients that do not support this specification MUST still be able to
receive messages sent by clients using this specification and display them
in a human-readable form.
</li>
<li>
Clients that support this specification MUST NOT be required to use a
layout engine such as HTML or LaTeX.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability on
receiving clients regardless of client background color, contrast, or
window size.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability by
users with color vision deficiency or impaired vision.
</li>
<li>
Messages formatted with this specification MUST render correctly in
locales with right-to-left (RTL) layouts without causing confusion.
</li>
<li>
Clients that support this specification MUST NOT be required to extract
metadata unrelated to formatting or text style from the message.
</li>
<li>
Servers MUST NOT need to implement any new functionality for this
specification to be supported.
</li>
</ul>
</section1>
<section1 topic='Glossary' anchor='glossary'>
<p>
Many important terms used in this document are defined in &unicode;.
The terms "left-to-right" (LTR) and "right-to-left" (RTL) are defined in
&uax9;.
The term "formatted text" is defined in &rfc7764;.
</p>
<dl>
<di>
<dt>Formal markup language</dt>
<dd>
A structured markup language such as LaTeX, SGML, HTML, or XML that is
formally defined and may include metadata unrelated to formatting or
text style.
</dd>
</di>
<di>
<dt>Plain text</dt>
<dd>
Text that does not convey any particular formatting or interpretation of
the text by computer programs.
</dd>
</di>
<di>
<dt>Whitespace character</dt>
<dd>
Any Unicode scalar value which has the property "White_Space" or is in
category Z in the Unicode Character Database.
</dd>
</di>
</dl>
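<p>
As a rough illustration of the whitespace definition above, the following
Python sketch tests a single character.
The name <tt>is_whitespace</tt> and the hard-coded code point table are
assumptions made for this example: Python's standard library does not expose
the "White_Space" property directly, so the table lists the code points
believed to carry it in recent versions of the Unicode Character Database
and should be checked against the version an implementation targets.
</p>
<example caption='Whitespace test (illustrative sketch)'><![CDATA[
```python
import unicodedata

# Code points assumed to have the White_Space property (see the note above).
_WHITE_SPACE = frozenset(
    list(range(0x0009, 0x000E))       # TAB, LF, VT, FF, CR
    + [0x0020, 0x0085, 0x00A0, 0x1680]
    + list(range(0x2000, 0x200B))     # EN QUAD through HAIR SPACE
    + [0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)


def is_whitespace(ch):
    """Return True if ch is in the White_Space table or in category Z."""
    if ord(ch) in _WHITE_SPACE:
        return True
    # General categories Zs, Zl and Zp all start with "Z".
    return unicodedata.category(ch).startswith("Z")
```
]]></example>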
</section1>
<section1 topic='Use Cases' anchor='usecases'>
<ul>
<li>
As a user sending an instant message to a friend, I want to be able to
emphasize an important part of my message.
</li>
<li>
As a software developer, I want to be able to send pre-formatted,
monospace, block or inline text to another developer.
</li>
<li>
As a multi-user chat user, I want to quote something someone said earlier
in the chat and make it evident that the text is a quotation.
</li>
</ul>
</section1>
<section1 topic='Business Rules' anchor='rules'>
<section2 topic='Blocks' anchor='block'>
<p>
A block is any chunk of text that can be parsed unambiguously in one pass.
The following are all blocks:
</p>
<ul>
<li>A single line of text containing only inline spans</li>
<li>A block quotation comprising one or more lines</li>
<li>A preformatted code block</li>
</ul>
</section2>
<section2 topic='Spans' anchor='span'>
<p>
A span is a group of text that does not result in a line break when rendered
(it is rendered inline) and where the entire group is rendered in the
same manner and in the same block.
Spans may be either plain text with no formatting applied, or may be
formatted text that is enclosed by two styling directives.
The following are all single spans:
</p>
<ul>
<li>plain span</li>
<li><strong>*emphasized span*</strong></li>
</ul>
<p>
A span enclosed by two styling directives MUST contain some text between
them.
The opening styling directive MUST be located at the beginning of the line
or after a whitespace character, and MUST NOT be followed by a whitespace
character.
The closing styling directive MUST NOT be preceded by a whitespace
character.
Spans are always parsed from the beginning of the byte stream to the end
and are lazily matched.
Characters that would be styling directives but do not follow these rules
are not considered when matching and thus may be present between two other
styling directives.
</p>
<p>
For example, each of the following would be emphasized as indicated:
</p>
<ul>
<li><strong>*emphasized*</strong></li>
<li>foo <strong>*emphasized*</strong> bar</li>
<li><strong>*emphasized*</strong> foo <strong>*emphasized*</strong></li>
<li><strong>*emphasized*</strong>foo*</li>
<li>* foo <strong>*emphasized*</strong></li>
<li><strong>*emphasized *foo*</strong></li>
</ul>
<p>
Nothing would be styled in the following messages (where \n represents a
new line):
</p>
<ul>
<li>not emphasized*</li>
<li>*not emphasized</li>
<li>*not \n emphasized*</li>
<li>*foo *bar</li>
<li>**</li>
<li>****</li>
</ul>
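<p>
The matching rules above can be sketched in Python as follows.
This is a non-normative illustration for a single styling directive on a
single line; the names <tt>find_spans</tt>, <tt>_is_opening</tt>, and
<tt>_find_closing</tt> are hypothetical.
Two assumptions are made: Python's <tt>str.isspace()</tt> is used as an
approximation of the whitespace definition in the glossary, and a closing
candidate that immediately follows the opening directive ends the attempt
with no match, which is one reading of lazy matching that agrees with the
examples listed above.
</p>
<example caption='Span matching (illustrative sketch)'><![CDATA[
```python
def _is_opening(line, i, directive):
    """An opening directive starts the line or follows whitespace, and is
    itself followed by a non-whitespace character."""
    if line[i] != directive:
        return False
    at_boundary = i == 0 or line[i - 1].isspace()
    has_text_after = i + 1 < len(line) and not line[i + 1].isspace()
    return at_boundary and has_text_after


def _find_closing(line, start, directive):
    """Return the index of the first closing candidate after start, or None.

    A candidate is a directive not preceded by whitespace. Matching is lazy:
    only the first candidate is considered, and it is rejected if no text
    lies between it and the opening directive (an assumption, see above).
    """
    for j in range(start + 1, len(line)):
        if line[j] == directive and not line[j - 1].isspace():
            return j if j > start + 1 else None
    return None


def find_spans(line, directive="*"):
    """Return (start, end) index pairs of styled spans, directives included."""
    spans = []
    i = 0
    while i < len(line):
        if _is_opening(line, i, directive):
            close = _find_closing(line, i, directive)
            if close is not None:
                spans.append((i, close + 1))
                i = close + 1
                continue
        i += 1
    return spans
```
]]></example>
<p>
Because spans never cross a line break, a caller would split the message on
new lines and pass each line to <tt>find_spans</tt> separately.
</p>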
</section2>
<section2 topic='Bold' anchor='bold'>
<p>
Text enclosed by '*' (U+002A ASTERISK) SHOULD be displayed with a greater
weight than the surrounding text (bold face).
</p>
<example caption='Bold'><![CDATA[
<body>
The full title is *Twelfth Night, or What You Will* but
*most* people shorten it.
</body>
]]></example>
</section2>
<section2 topic='Italic' anchor='italic'>
<p>
Text enclosed by '_' (U+005F LOW LINE) SHOULD be displayed in italics.
</p>
<example caption='Italic'><![CDATA[
<body>
The full title is _Twelfth Night, or What You Will_ but
_most_ people shorten it.
</body>
]]></example>
</section2>
<section2 topic='Strike through' anchor='strike'>
<p>
Text enclosed by '~' (U+007E TILDE) SHOULD be displayed with a horizontal
line through the middle (strike through).
</p>
<example caption='Strike through'><![CDATA[
<body>
Everyone ~dis~likes cake.
</body>
]]></example>
</section2>
<section2 topic='Inline Preformatted Text' anchor='pre-inline'>
<p>
Text enclosed by '`' (U+0060 GRAVE ACCENT) SHOULD be displayed inline in
a monospace font.
Inline formatting directives inside the inline preformatted text are not
rendered.
For example, in the following the word "monospace" is valid pre-formatted
inline text:
</p>
<ul>
<li>This is <tt>`monospace`</tt></li>
<li>This is <tt>`*monospace*`</tt></li>
<li>This is <strong><tt>*`monospace and bold`*</tt></strong></li>
</ul>
<example caption='Monospace text'><![CDATA[
<body>
Wow, I can write in `monospace`!
</body>
]]></example>
</section2>
<section2 topic='Preformatted Block Text' anchor='pre-block'>
<p>
A block of text surrounded by lines consisting of a sequence of three
backticks, "```" (U+0060 GRAVE ACCENT), is preformatted text and should be
displayed exactly as it was entered including whitespace.
If no closing "```" sequence exists, the preformatted block extends to the
end of the input stream or the end of the parent block (whichever comes
first).
No other formatting described in this document should be rendered inside a
preformatted text block.
</p>
<example caption='Preformatted block text'><![CDATA[
<body>
```
(println &quot;Hello, world!&quot;)
```
This should show up as monospace, preformatted text ⤴
</body>
]]></example>
<example caption='No closing preformatted text sequence'><![CDATA[
<body>
&gt; ```
&gt; (println &quot;Hello, world!&quot;)
The entire blockquote is a preformatted text block, but this line is
plaintext!
</body>
]]></example>
</section2>
<section2 topic='Quotations' anchor='quote'>
<p>
A quotation is indicated by one or more lines beginning with a '&gt;'
(U+003E GREATER-THAN SIGN).
Block quotes may contain any child block, including other quotations.
Lines inside the block quote MUST have leading spaces trimmed before
parsing the child block.
</p>
<example caption='Quotation (LTR)'><![CDATA[
<body>
&gt; That that is, is.
Said the old hermit of Prague.
</body>
]]></example>
<example caption='Nested Quotation'><![CDATA[
<body>
&gt;&gt; That that is, is.
&gt; Said the old hermit of Prague.
Who?
</body>
]]></example>
</section2>
</section1>
<section1 topic='Implementation Notes' anchor='impl'>
<p>
This document does not define a regular grammar and thus styling cannot be
matched by a regular expression.
Instead, a predictive recursive descent or LALR parser may be constructed.
For instance, a simple parser can be constructed by first parsing all text
into blocks and then recursively parsing the child-blocks inside block
quotations, the spans inside plain lines, and by returning the text inside
preformatted blocks without modification.
</p>
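<p>
The block-first approach described above might look like the following
Python sketch.
It is illustrative only: the function name <tt>parse_blocks</tt> and the
tuple representation of blocks are hypothetical, the input is assumed to be
the body text after XML decoding, and a preformatted delimiter is assumed to
be a line consisting of exactly three backticks.
Spans inside plain lines are left unparsed here.
</p>
<example caption='Recursive block parsing (illustrative sketch)'><![CDATA[
```python
FENCE = "`" * 3  # the three-backtick delimiter line


def parse_blocks(text):
    """Split text into a list of (kind, content) tuples.

    kind is "quote" (content: list of child blocks), "pre" (content: the
    unmodified text), or "line" (content: one line, spans not yet parsed).
    """
    lines = text.split("\n")
    blocks = []
    i = 0
    while i < len(lines):
        if lines[i].startswith(">"):
            # Collect consecutive quoted lines, dropping the marker and any
            # leading spaces, then parse the child blocks recursively.
            quoted = []
            while i < len(lines) and lines[i].startswith(">"):
                quoted.append(lines[i][1:].lstrip(" "))
                i += 1
            blocks.append(("quote", parse_blocks("\n".join(quoted))))
        elif lines[i] == FENCE:
            # Preformatted text runs to the closing delimiter, or to the end
            # of the input (or parent block) if there is none.
            i += 1
            body = []
            while i < len(lines) and lines[i] != FENCE:
                body.append(lines[i])
                i += 1
            i += 1  # skip the closing delimiter if there was one
            blocks.append(("pre", "\n".join(body)))
        else:
            blocks.append(("line", lines[i]))
            i += 1
    return blocks
```
]]></example>
<p>
Because a quotation's children are parsed from only the quoted lines, an
unclosed preformatted block inside a quotation ends with the quotation
rather than consuming the rest of the message.
</p>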
<p>
It is RECOMMENDED that formatting characters be displayed and formatted in
the same manner as the text they apply to.
For example, the string "*emphasis*" would be rendered as
"<strong>*emphasis*</strong>".
</p>
</section1>
<section1 topic='Accessibility Considerations' anchor='access'>
<p>
When displaying text with formatting, developers should take care to ensure
sufficient contrast exists between styled and unstyled text so that users
with vision deficiencies are able to distinguish between the two.
</p>
<p>
Formatted text may also be rendered poorly by screen readers.
When applying formatting it may be desirable to include directives to
exclude formatting characters from being read.
</p>
</section1>
<section1 topic='Internationalization Considerations' anchor='i18n'>
<p>OPTIONAL.</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>
<p>REQUIRED.</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
<p>
This document requires no interaction with &IANA;.
</p>
</section1>
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
<p>This specification requires no interaction with the &REGISTRAR;.</p>
</section1>
<section1 topic='XML Schema' anchor='schema'>
<p>This document does not define any new XML structure requiring a schema.</p>
</section1>
</xep>