<abstract>This specification provides a set of algorithms to consistently generate colors given a string. The string can be a nickname, a JID or any other piece of information. All entities adhering to this specification generate the same color for the same string, which provides a consistent user experience across platforms.</abstract>
<p>Colors provide a valuable visual cue to recognize shapes. Recognition of colors works much faster than recognition of text. Together with the length and overall shape of a piece of text (such as a nickname), a color provides a decent amount of entropy to distinguish a reasonable amount of entities, without having to actually read the text.</p>
<p>Clients have been using randomly or deterministically chosen colors for users in multi-user situations for a long time already. However, since there has been no standard for how this is implemented, the experience differs across platforms. The goal of this XEP is to provide a uniform, platform-independent, stateless and easy-to-implement way to map arbitrary bytestrings to colors, as well as give recommendations how this is applied to color names of participants in conversations, roster entries and other pieces of text.</p>
<p>To allow cross-client use, it is important that the color scheme can be adapted to different environments. This specification provides means to adapt colors to different background colors as well as &cvds;.</p>
<p>In no way is the system presented in this specification a replacement for names. It only serves as an additional visual aid.</p>
<p>The color generation mechanism should provide the following features:</p>
<ul>
<li>Consistent generation of color across all platforms depending solely on the identifier used as input for the algorithm.</li>
<li>The system should be reasonably fast; it must be possible to, for example, apply it to all roster entries even of very large rosters in reasonable amount of time.</li>
<li>It must be able to provide decent contrast on any background.</li>
<li>The implementation should be stateless and not be complex.</li>
<li>A fallback path for users with common types of &cvds; must be provided.</li>
<li>A fallback path for systems which can only use colors from a restricted palette must be provided.</li>
</ul>
</section1>
<section1topic='Use Cases'anchor='usecases'>
<section2topic='Generating a color'anchor='usecase-textcolor'>
<p>To generate a color from a string of text, the follownig algorithms are applied in order:</p>
<li>If the output device only supports a small palette of colors, <linkurl='#algorithm-mappalette'>map the angle to the closest palette color</link>.</li>
<li>If the output device supports RGB output, <linkurl='#algorithm-rgb'>Convert the angle to a RGB</link>.</li>
<p>In such cases, the color SHOULD be generated as described in the <linkurl='#usecase-textcolor'>Generating a color</link> section. The input used SHOULD be, in descending order of preference, (a) the bare JID of the user (not the room), (b) the nickname as chosen by the user in the room.</p>
<p>Implementations may want to show a picture in connection with a contact even if the contact does not have an avatar defined (e.g. via &xep0084;).</p>
<p>In such cases, auto-generating an avatar SHOULD happen as follows:</p>
<li>Obtain a name for the contact, in descending order of preference, (a) the nickname from the conversation, (b) the bare JID of the contact (<em>not</em> the bare JID of the conference in case of a &xep0045; room).</li>
<p>The algorithms in this document use the &hsluv;⎄ color space. It provides consistent brightness (for a given luminosity) across its entire definition space. There is also widespread library support.</p>
<li>Treat the output as little endian and extract the least-significant 16 bits. (These are the first two bytes of the output, with the second byte being the most significant one.)</li>
<p>Note: Some floating-point modulo implementations will return negative outputs for negative inputs. This algorithm assumes that your implementation returns <em>non-negative</em> outputs for all inputs.</p>
<p>Note: the same effect can be achieved by forcing the two most-significant bits of the angle to be equal to the second-most-significant bit before converting to a float in <linkurl="#algorithm-angle">Angle generation</link>. This avoids having to perform a floating-point modulo operation.</p>
<p>Note: the same effect can be achieved by setting the most-significant bit to zero before conversion to floating point in <linkurl="#algorithm-angle">Angle generation</link>. This avoids having to perform a floating-point modulo operation.</p>
<p>Use the HSLuv operation <tt>hsluvToRgb</tt> to convert the Hue angle to a color. For this, saturation SHOULD be set to 100 and lightness SHOULD be set to 50.</p>
<p>Note: when the algorithm finishes, the mapping maps angles (rounded to two decimal places) to the R, G, B triples which come closest to the desired color and lightness.</p>
<li>If the R, G and B values are equal, skip the color and continue with the next. (Grayscale does not work well, since its saturation and hue are undefined.)</li>
<li>Calculate H, S and L from R, G, B using HSLuv.</li>
<li>Round the angle to the next integer value.</li>
<li>If the angle is not in the mapping M yet, or if the L value of the existing entry is farther away from 73.2 than the new L value, put the L, R, G, and B values as value for the angle into the mapping.</li>
<p>Implementations are free to choose a representation for palette colors different from R, G, B triplets. The exact representation does not matter, as long as it can be converted to a Hue angle accordingly.</p>
<p>Note: See <linkurl='#algorithm-genpalette'>Conversion of an RGB color palette to a Hue palette</link> on how to convert an R, G, B triplet to an angle.</p>
<p>Implementations are free to choose a representation for palette colors different from R, G, B triplets. The exact representation does not matter, as long as it can be converted to a Hue angle accordingly.</p>
<section2topic='Background Color Correction'anchor='impl-bgcolor'>
<p>An implementation which shows the generated colors on a colored background SHOULD apply <linkurl='#algorithm-bg'>Adapting the Color for specific Background Colors</link>. If the background is not uniformly colored, it is up to the implementation to determine an appropriate surrogate background color to correct against.</p>
<p>If an implementation shows the generated colors on a grayscale (including white and black) background, it MAY apply the background color correction algorithm. It is RECOMMENDED to always apply the algorithm if the background color is changed dynamically, to avoid discontinuities between grayscale and colored backgrounds.</p>
<p>Implementations SHOULD use the same background color for all generated colors. If this is not feasible, implementations SHOULD use the same background color for all generated colors within the same GUI control (for example, within a conversation and within the roster).</p>
<p>When processing JIDs as text input, implementations MUST prepare the JID as it would for comparing it to another JID with a case-sensitive comparison function.</p>
<p>As outlined above, implementations SHOULD offer the &rgblind; and &bblind; corrections as defined in the <linkurl='#algorithm-cvd'>Corrections for &cvds;</link> section. Users SHOULD be allowed to choose between:</p>
<p>Some sources on the internet indicate that people with &cvds; may profit from having larger areas of color to be able to recognize them. This should be taken into consideration when selecting font weights and line widths for colored parts.</p>
<p>This specification extracts a bit more information from an entity and shows it alongside the existing information to the user. As the algorithm is likely to produce different colors for look-alikes (see &xep0165; for examples) in JIDs, it may add additional protection against attacks based on those.</p>
<p>Due to the limited set of distinguishable colors and only extracting 16 bits of the hash function output, possible &cvds; and/or use of palettes, entities MUST NOT rely on colors being unique in any context.</p>
<p>This section provides an overview of design considerations made while writing this specification. It shows alternatives which have been considered, and eventually rejected.</p>
<section2topic='The YCbCr color space'anchor='design-ycbcr'>
<p>The versions up to 0.5 of this document used a variant of the YCbCr color space (namely &BT.601;) along with a custom algorithm to convert from angles to CbCr and from there to RGB. The HSLuv color space provides extremely consistent apparent brightness of the colors which cannot be achieved with simple application of YCbCr. In addition, HSLuv has widespread library support.</p>
<section2topic='Hue-Saturation-Value/Lightness color space'anchor='design-hsv'>
<p>The HSV and HSL color spaces fail to provide uniform luminosity with fixed value/lightness and saturation parameters. Adapting those parameters for uniform luminosity across the hue range would have complicated the algorithm with litte to no gain.</p>
</section2>
<section2topic='Palette-based and context-aware coloring'anchor='design-context'>
<p>Given a fixed-size and finite palette of colors, it would be possible to ensure that, until the number of entities to color exceeds the number of colors, no color collisions happen.</p>
<p>There are issues with this approach when the set of entities is dynamic. In such cases, it is possible that an entity changes its associated color (for example by re-joining a colored group chat), which defeats the original purpose.</p>
<p>In addition, more state needs to be taken into account, increasing the complexity of choosing a color.</p>
</section2>
<section2topic='Choice of mixing function in angle generation'anchor='design-mixing'>
<p>This specification needs to collapse an arbitrarily long string into just a few bits (the angle in the CbCr plane). To do so, SHA-1 (&rfc3174;) is used.</p>
<p>CRC32 and Adler32 have been considered as faster alternatives. Downsides of these functions:</p>
<ul>
<li>Bad mixing without additional entropy.</li>
<li>Adler32 is rarely available in standard libraries.</li>
<li>CRC32 is ambiguous: there are multiple polynomials in widespread use (e.g. the Ethernet and the zlib polynomials). Often it is not clear which polynomial is used by a library.</li>
</ul>
<p>SHA-1 is widely available. From a security point of view, the exact choice of hash function does not matter here, since it is truncated to 16 bits. At this length, any cryptographic hash function is weak.</p>
<p>The palette-mapping algorithm operates on angles only and disregards the Y value except if the angles match. This has the downside that the brightness is not equal over the range of the palette mapped colors.</p>
<p>The alternative would be to require Y to be close to the target Y. This has several issues:</p>
<ul>
<li>We cannot know if a palette can satisfy the given Y at all.</li>
<li>Many colors from e.g. the "Web Safe" palette (used in 256 color terminals and the test vectors) will not satisfy any given Y, reducing the size of the effective palette drastically.</li>
</ul>
<p>For the sake of having more colors available, the given algorithm was chosen which prefers many colors with hue conformance over fewer colors with hue and lightness conformance.</p>
<section2topic='Input for color generation in a Multi-User Chat (XEP-0045) context'anchor='design-muc-input'>
<p>In &xep0045; conversations (MUCs), there are two viable choices for the hash function input when generating a color for a participant: the nickname as chosen by the participant (or their full JID) and the participants real bare JID. Both options have advantages and disadvantages. The advantages of using the nickname are:</p>
<ul>
<li>Yields the same output even in anonymous MUCs if the same nickname is used.</li>
<li>Is guaranteed to work for transports implementing &xep0045;.</li>
</ul>
<p>The advantages of using the bare JID are:</p>
<ul>
<li>Allows to use the same color in 1:1 communications with that entity as in group chats, helping with recognizability.</li>
<li>Stays constant even across changes of ephemeral information like nicknames.</li>
</ul>
<p>There is no obvious correct choice to make here; both choices break in different use-cases. Specifically, the "nickname" choice breaks when the same entity has different nicknames in different rooms, as well as when two different entities have the same nickname in different rooms. The "bare JID" choice breaks when (semi-)anonymous MUCs are involved.</p>
<p>The choice "bare JID" has the conceptual advantage that it ties as closely as possible to the identity of the entity. It is also forward-compatible with future protocols where nicknames might not be available or work differently.</p>
<p>Thanks to Klaus Herberth, Daniel Gultsch, Georg Lukas, Tobias Markmann, Christian Schudt, and Marcus Waldvogel for their input and feedback on this document.</p>
<p>This section holds test vectors for the different configurations. The test vectors are provided as Comma Separated Values. Strings are enclosed by single quotes ('). The first line contains a header. Each row contains, in that order, the original text, the text encoded as UTF-8 as hexadecimal octets, the angle in degrees, the calculated hue in degrees (differs from angle only for CVD-corrected rows), and the Red, Green, and Blue values.</p>
<p>The used palette can be generated by sampling the RGB cube evenly with six samples on each axis (resulting in 210 colors (grayscales are excluded)). The resulting palette is commonly known as the palette of so-called "Web Safe" colors.</p>