
xep-0393.xml 15KB

<?xml version='1.0' encoding='UTF-8'?>
<!DOCTYPE xep SYSTEM 'xep.dtd' [
<!ENTITY % ents SYSTEM 'xep.ent'>
%ents;
]>
<?xml-stylesheet type='text/xsl' href='xep.xsl'?>
<xep>
<header>
<title>Message Styling</title>
<abstract>
This specification defines a formatted text syntax for use in instant
messages with simple text styling.
</abstract>
&LEGALNOTICE;
<number>0393</number>
<status>Experimental</status>
<type>Standards Track</type>
<sig>Standards</sig>
<approver>Council</approver>
<dependencies>
<spec>XMPP Core</spec>
<spec>XEP-0001</spec>
</dependencies>
<supersedes><spec>XEP-0071</spec></supersedes>
<supersededby/>
<shortname>styling</shortname>
&sam;
<revision>
<version>0.1.4</version>
<date>2018-05-01</date>
<initials>ssw</initials>
<remark>
<p>
Clarify language around strong emphasis.
</p>
</remark>
</revision>
<revision>
<version>0.1.3</version>
<date>2018-02-14</date>
<initials>ssw</initials>
<remark>
<p>
Reorder block and span sections, simplify block parsing, and update the
definition of a span.
</p>
</remark>
</revision>
<revision>
<version>0.1.2</version>
<date>2018-01-13</date>
<initials>ssw</initials>
<remark>
<p>
Clarify block quote and plain text parsing and formatting behavior.
</p>
</remark>
</revision>
<revision>
<version>0.1.1</version>
<date>2018-01-12</date>
<initials>ssw</initials>
<remark>
<p>
Minor clarifications and updates, add security considerations, and
expand the glossary.
</p>
</remark>
</revision>
<revision>
<version>0.1.0</version>
<date>2017-11-22</date>
<initials>XEP Editor (ssw)</initials>
<remark><p>First draft approved by the XMPP Council.</p></remark>
</revision>
<revision>
<version>0.0.1</version>
<date>2017-10-28</date>
<initials>ssw</initials>
<remark><p>First draft.</p></remark>
</revision>
</header>
<section1 topic='Introduction' anchor='intro'>
<p>
Historically, XMPP has had no system for simple text styling.
Instead, specifications like &xep0071; that require full layout engines have
been used, leading to numerous security issues with implementations.
Some entities have also performed their own styling based on identifiers in
the body.
While this has worked well in the past, it is not interoperable and leads to
entities each supporting their own informal styling languages.
</p>
<p>
This specification aims to provide a single, interoperable formatted text
syntax that can be used by entities that do not require full layout engines.
</p>
</section1>
<section1 topic='Requirements' anchor='reqs'>
<ul>
<li>
Clients that do not support this specification MUST still be able to
receive messages sent by clients using this specification and display them
in a human-readable form.
</li>
<li>
Clients that support this specification MUST NOT be required to use a
layout engine such as HTML or LaTeX.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability on
receiving clients regardless of client background color, contrast, or
window size.
</li>
<li>
Messages formatted using this specification MUST NOT hinder readability by
users with color vision deficiency or impaired vision.
</li>
<li>
Messages formatted with this specification MUST render correctly in
locales with right-to-left (RTL) layouts without causing confusion.
</li>
<li>
Clients that support this specification MUST NOT be required to extract
metadata unrelated to formatting or text style from the message.
</li>
<li>
Servers MUST NOT need to implement any new functionality for this
specification to be supported.
</li>
</ul>
</section1>
<section1 topic='Use Cases' anchor='usecases'>
<ul>
<li>
As a user sending an instant message to a friend, I want to be able to
emphasize an important part of my message.
</li>
<li>
As a software developer, I want to be able to send code as pre-formatted,
monospace, block or inline text to another developer.
</li>
<li>
As a multi-user chat user, I want to add context to my reply by quoting an
earlier message in the chat.
</li>
</ul>
</section1>
<section1 topic='Glossary' anchor='glossary'>
<p>
Many important terms used in this document are defined in &unicode;.
The terms "left-to-right" (LTR) and "right-to-left" (RTL) are defined in
&uax9;.
The term "formatted text" is defined in &rfc7764;.
</p>
<dl>
<di>
<dt>Block</dt>
<dd>
Any chunk of text that can be parsed unambiguously in one pass.
Blocks may contain one or more children which may be other blocks or
spans.
For example:
<ul>
<li>A single line of text comprising one or more spans</li>
<li>A block quotation</li>
<li>A preformatted code block</li>
</ul>
</dd>
</di>
<di>
<dt>Formal markup language</dt>
<dd>
A structured markup language such as LaTeX, SGML, HTML, or XML that is
formally defined and may include metadata unrelated to formatting or
text style.
</dd>
</di>
<di>
<dt>Plain text</dt>
<dd>
Text that does not convey any particular formatting or interpretation of
the text by computer programs.
</dd>
</di>
<di>
<dt>Span</dt>
<dd>
A group of text that may be rendered inline alongside other spans.
Spans may be either plain text with no formatting applied, or may be
formatted text that is enclosed by two styling directives.
Spans are always children of blocks and may not escape from their
containing block.
Some spans may contain child spans.
In the following examples, each span is enclosed in parentheses:
<ul>
<li>(plain span)</li>
<li>(<strong>*strong span*</strong>)</li>
<li>(<em>_emphasized span_</em>)</li>
<li>(<em>_emphasized span containing </em>(<em><strong>*strong span*</strong></em>)<em>_</em>)</li>
<li>(span one )(<strong>*span two*</strong>)</li>
</ul>
</dd>
</di>
<di>
<dt>Styling directive</dt>
<dd>
A character or set of characters that indicates the beginning of a span
or block.
For example, in certain contexts the characters '*' (U+002A ASTERISK)
and '_' (U+005F LOW LINE) may be styling directives that indicate the
beginning of a strong or emphasis span, and the string '```' (U+0060
GRAVE ACCENT) may be a styling directive that indicates the beginning of
a preformatted code block.
</dd>
</di>
<di>
<dt>Whitespace character</dt>
<dd>
Any Unicode scalar value which has the property "White_Space" or is in
category Z in the Unicode Character Database.
</dd>
</di>
</dl>
</section1>
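<p>
As an illustration of the "whitespace character" definition above, the
following Python sketch tests a single character against both conditions.
The function name is_whitespace and the hard-coded White_Space table are
assumptions made for this example and are not defined by this specification;
a real implementation would more likely derive the table from the Unicode
Character Database.
</p>
<example caption='Sketch of a whitespace character test (non-normative)'><![CDATA[
```python
import unicodedata

# Code points carrying the Unicode White_Space property.
# Assumption: hard-coded for illustration; real code should generate this
# table from the Unicode Character Database (PropList.txt).
_WHITE_SPACE = frozenset(
    list(range(0x0009, 0x000E))
    + [0x0020, 0x0085, 0x00A0, 0x1680]
    + list(range(0x2000, 0x200B))
    + [0x2028, 0x2029, 0x202F, 0x205F, 0x3000]
)


def is_whitespace(ch):
    """Return True if ch has White_Space or is in general category Z."""
    if len(ch) != 1:
        raise ValueError("expected a single character")
    # Categories Zs, Zl and Zp all start with the letter "Z".
    return ord(ch) in _WHITE_SPACE or unicodedata.category(ch).startswith("Z")
```
]]></example>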
<section1 topic='Business Rules' anchor='rules'>
<section2 topic='Blocks' anchor='block'>
<p>
Parsers implementing message styling will first parse blocks and then
parse child blocks or spans if allowed by the specific block type.
</p>
<section3 topic='Plain' anchor='line-block'>
<p>
Each individual line of text that is not inside a preformatted text
block is considered a "plain" block.
Plain blocks are not bound by styling directives and do not imply
formatting themselves, but they may contain spans which imply
formatting.
Plain blocks may not contain child blocks.
</p>
<example caption='Plain block text'><![CDATA[
<body>
(There are three blocks in this body marked by parens,)
(but there is no *formatting)
(as spans* may not escape blocks.)
</body>
]]></example>
</section3>
<section3 topic='Preformatted Text' anchor='pre-block'>
<p>
A preformatted text block is started by a line beginning with "```"
(U+0060 GRAVE ACCENT), and ended by a line containing only three grave
accents or the end of the parent block (whichever comes first).
Preformatted text blocks cannot contain child blocks or spans.
Text inside a preformatted block SHOULD be displayed in a monospace font.
</p>
<example caption='Preformatted block text'><![CDATA[
<body>
```ignored
(println &quot;Hello, world!&quot;)
```
This should show up as monospace, preformatted text ⤴
</body>
]]></example>
<example caption='No closing preformatted text sequence'><![CDATA[
<body>
&gt; ```
&gt; (println &quot;Hello, world!&quot;)
The entire blockquote is a preformatted text block, but this line
is plaintext!
</body>
]]></example>
</section3>
<section3 topic='Quotations' anchor='quote'>
<p>
A quotation is indicated by one or more consecutive lines that each begin
with '&gt;' (U+003E GREATER-THAN SIGN).
Block quotes may contain any child block, including other quotations.
Lines inside the block quote MUST have leading spaces trimmed before
parsing the child block.
It is RECOMMENDED that text inside of a block quote be indented or
distinguished from the surrounding text in some other way.
</p>
<example caption='Quotation (LTR)'><![CDATA[
<body>
&gt; That that is, is.
Said the old hermit of Prague.
</body>
]]></example>
<example caption='Nested Quotation'><![CDATA[
<body>
&gt;&gt; That that is, is.
&gt; Said the old hermit of Prague.
Who?
</body>
]]></example>
</section3>
</section2>
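<p>
The trimming rule for quotations can be sketched in Python as follows.
This is a minimal illustration, not a normative algorithm: the function name
unquote is hypothetical, the input is assumed to be already unescaped lines
belonging to one quotation, and "leading spaces" is interpreted here as
U+0020 SPACE only.
</p>
<example caption='Sketch of removing one quotation level (non-normative)'><![CDATA[
```python
def unquote(lines):
    """Strip one quotation level from lines that all begin with '>'."""
    inner = []
    for line in lines:
        if not line.startswith(">"):
            raise ValueError("not a quotation line: %r" % (line,))
        # Drop the '>' and then any leading spaces, as required before the
        # child block is parsed. Assumption: only U+0020 is trimmed.
        inner.append(line[1:].lstrip(" "))
    return inner
```
]]></example>
<p>
Applied to the two quoted lines of the nested quotation example, this sketch
yields one line that still begins with '&gt;' (a child quotation) followed by
one plain line.
</p>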
<section2 topic='Spans' anchor='span'>
<p>
A span enclosed by two styling directives MUST contain some text between
the two styling directives.
The opening styling directive MUST be located at the beginning of the line
or after a whitespace character.
The opening styling directive MUST NOT be followed by a whitespace
character and the closing styling directive MUST NOT be preceded by a
whitespace character.
Spans are always parsed from the beginning of the byte stream to the end
and are lazily matched.
Characters that would be styling directives but do not follow these rules
are not considered when matching and thus may be present between two other
styling directives.
</p>
<p>
For example, each of the following would be styled as indicated:
</p>
<ul>
<li><strong>*strong*</strong></li>
<li>plain <strong>*strong*</strong> plain</li>
<li><strong>*strong*</strong> plain <strong>*strong*</strong></li>
<li><strong>*strong*</strong>plain*</li>
<li>* plain <strong>*strong*</strong></li>
</ul>
<p>
Nothing would be styled in the following messages (where "\n" represents a
new line):
</p>
<ul>
<li>not strong*</li>
<li>*not strong</li>
<li>*not \n strong*</li>
<li>*not *strong</li>
<li>**</li>
<li>****</li>
</ul>
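<p>
The matching rules above can be sketched in Python as follows.
This is one possible reading, not a normative algorithm.
The names parse_spans, DIRECTIVES, _can_open, and _find_close and the tuple
representation of spans are assumptions made for this example.
The sketch also assumes that Python's str.isspace() is an adequate
approximation of a whitespace character, that the start of a parent span's
content counts as a valid position for an opening directive, and that an
opening directive whose first candidate closing directive would leave an
empty span is not matched at all.
</p>
<example caption='Sketch of span matching for a single line (non-normative)'><![CDATA[
```python
# Hypothetical style names keyed by styling directive.
DIRECTIVES = {"*": "strong", "_": "emphasis", "~": "strike", "`": "pre"}


def _can_open(line, i):
    """An opener sits at the start or after whitespace, before non-whitespace."""
    at_start = i == 0 or line[i - 1].isspace()
    followed = i + 1 < len(line) and not line[i + 1].isspace()
    return at_start and followed


def _find_close(line, start):
    """Lazily find the closing directive for the opener at index start."""
    ch = line[start]
    for j in range(start + 1, len(line)):
        # Directives preceded by whitespace are not considered when matching.
        if line[j] == ch and not line[j - 1].isspace():
            # The first candidate wins; an empty span is not a match.
            return j if j > start + 1 else None
    return None


def parse_spans(line):
    """Split one line (no newlines) into a list of (style, payload) tuples."""
    spans = []
    plain_start = 0
    i = 0
    while i < len(line):
        ch = line[i]
        if ch in DIRECTIVES and _can_open(line, i):
            end = _find_close(line, i)
            if end is not None:
                if plain_start < i:
                    spans.append(("plain", line[plain_start:i]))
                inner = line[i + 1:end]
                if ch == "`":
                    # Preformatted spans hold exactly one plain span.
                    children = [("plain", inner)]
                else:
                    children = parse_spans(inner)
                spans.append((DIRECTIVES[ch], children))
                i = end + 1
                plain_start = i
                continue
        i += 1
    if plain_start < len(line):
        spans.append(("plain", line[plain_start:]))
    return spans
```
]]></example>
<p>
Under these assumptions, the sketch leaves every message in the second list
above as a single plain span, and it styles only the marked portions of the
messages in the first list.
</p>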
<section3 topic='Plain' anchor='plain'>
<p>
Any text inside of a block that is not part of another span is
implicitly considered to be inside of a "plain text" span.
</p>
<example caption='Plain'><![CDATA[
<body>
(Two spans, both )(*alike in dignity*)
</body>
]]></example>
</section3>
<section3 topic='Emphasis' anchor='emph'>
<p>
Text enclosed by '_' (U+005F LOW LINE) is emphasized and SHOULD be
displayed in italics.
</p>
<example caption='Italic'><![CDATA[
<body>
The full title is _Twelfth Night, or What You Will_ but
_most_ people shorten it.
</body>
]]></example>
</section3>
<section3 topic='Strong Emphasis' anchor='strong'>
<p>
Text enclosed by '*' (U+002A ASTERISK) is strongly emphasized and SHOULD
be displayed with a heavier font weight than the surrounding text
(bold).
</p>
<example caption='Strong'><![CDATA[
<body>
The full title is "Twelfth Night, or What You Will" but
*most* people shorten it.
</body>
]]></example>
</section3>
<section3 topic='Strike through' anchor='strike'>
<p>
Text enclosed by '~' (U+007E TILDE) SHOULD be displayed with a horizontal
line through the middle.
</p>
<example caption='Strike through'><![CDATA[
<body>
Everyone ~dis~likes cake.
</body>
]]></example>
</section3>
<section3 topic='Preformatted Span' anchor='mono'>
<p>
Text enclosed by '`' (U+0060 GRAVE ACCENT) is a preformatted span and
SHOULD be displayed inline in a monospace font.
A preformatted span may only contain a single plain span.
Inline formatting directives inside the preformatted span are not
rendered.
For example, the following all contain valid preformatted spans:
</p>
<ul>
<li>This is <tt>`monospace`</tt></li>
<li>This is <tt>`*monospace*`</tt></li>
<li>This is <strong><tt>*`monospace and bold`*</tt></strong></li>
</ul>
<example caption='Monospace text'><![CDATA[
<body>
Wow, I can write in `monospace`!
</body>
]]></example>
</section3>
</section2>
</section1>
<section1 topic='Implementation Notes' anchor='impl'>
<p>
This document does not define a regular grammar and thus styling cannot be
matched by a regular expression.
Instead, a simple parser can be constructed by first parsing all text into
blocks and then recursively parsing the child-blocks inside block
quotations, the spans inside individual lines, and by returning the text
inside preformatted blocks without modification.
</p>
<p>
It is RECOMMENDED that formatting characters be displayed and formatted in
the same manner as the text they apply to.
For example, the string "*emphasis*" would be rendered as
"<strong>*emphasis*</strong>".
</p>
</section1>
<section1 topic='Accessibility Considerations' anchor='access'>
<p>
When displaying text with formatting, developers should take care to ensure
sufficient contrast exists between styled and unstyled text so that users
with vision deficiencies are able to distinguish between the two.
</p>
<p>
Formatted text may also be rendered poorly by screen readers.
When applying formatting it may be desirable to include directives to
exclude formatting characters from being read.
</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>
<p>
Authors of message styling parsers should take care that improperly
formatted messages cannot lead to buffer overruns or code execution.
</p>
</section1>
<section1 topic='IANA Considerations' anchor='iana'>
<p>
This document requires no interaction with &IANA;.
</p>
</section1>
<section1 topic='XMPP Registrar Considerations' anchor='registrar'>
<p>This specification requires no interaction with the &REGISTRAR;.</p>
</section1>
<section1 topic='XML Schema' anchor='schema'>
<p>This document does not define any new XML structure requiring a schema.</p>
</section1>
<section1 topic='Acknowledgements' anchor='ack'>
<p>The author wishes to thank Kevin Smith for his review and feedback.</p>
</section1>
</xep>
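<p>
A minimal Python sketch of the block-level pass described above might look
like the following.
The names parse_blocks and parse_body and the tuple representation of blocks
are assumptions made for this example, as is the choice to split the body on
U+000A LINE FEED and to trim only U+0020 SPACE inside quotations.
Span parsing of plain blocks is deliberately left out; each plain block
simply carries its line of text.
</p>
<example caption='Sketch of a recursive block parser (non-normative)'><![CDATA[
```python
def parse_blocks(lines):
    """Parse a list of lines into a list of (kind, payload) blocks."""
    blocks = []
    i = 0
    while i < len(lines):
        line = lines[i]
        if line.startswith("```"):
            # Preformatted block: runs until a line containing only three
            # grave accents or the end of the parent block, whichever comes
            # first. Its text is returned without modification.
            j = i + 1
            while j < len(lines) and lines[j] != "```":
                j += 1
            blocks.append(("pre", lines[i + 1:j]))
            i = j + 1
        elif line.startswith(">"):
            # Quotation: collect consecutive quoted lines, strip one level,
            # and recursively parse the child blocks.
            j = i
            while j < len(lines) and lines[j].startswith(">"):
                j += 1
            inner = [quoted[1:].lstrip(" ") for quoted in lines[i:j]]
            blocks.append(("quote", parse_blocks(inner)))
            i = j
        else:
            # Plain block: a single line whose spans are parsed separately.
            blocks.append(("plain", line))
            i += 1
    return blocks


def parse_body(body):
    """Parse the unescaped text of a message body into blocks."""
    return parse_blocks(body.split("\n"))
```
]]></example>
<p>
Because the quotation branch hands its trimmed lines back to the same
function, an unterminated preformatted block inside a quotation ends with
that quotation rather than consuming the rest of the message.
</p>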
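<p>
The recommendation above can be illustrated with a small Python sketch that
turns already-parsed spans into markup while keeping the styling directives
visible.
Everything here is an assumption made for the example: the function name
render, the representation of spans as nested (style, payload) tuples, and
the choice of HTML-like tags as the output format.
Clients are not required to use HTML for display.
</p>
<example caption='Sketch of rendering spans with visible directives (non-normative)'><![CDATA[
```python
import html

# Hypothetical mapping from style name to (output tag, styling directive).
_TAGS = {
    "strong": ("strong", "*"),
    "emphasis": ("em", "_"),
    "strike": ("del", "~"),
    "pre": ("code", "`"),
}


def render(spans):
    """Render nested (style, payload) spans, keeping directives visible."""
    out = []
    for style, payload in spans:
        if style == "plain":
            # Plain payloads are text and must be escaped for the output.
            out.append(html.escape(payload))
        else:
            tag, directive = _TAGS[style]
            # The directives are placed inside the tag so that they are
            # formatted in the same manner as the text they apply to.
            body = directive + render(payload) + directive
            out.append("<%s>%s</%s>" % (tag, body, tag))
    return "".join(out)
```
]]></example>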