Add more recommended reading to 0286

Update compression section with better numbers
Sam Whited 2015-07-23 17:59:53 -05:00 committed by Matthew A. Miller
parent e183820d43
commit f6b6883aff
1 changed files with 34 additions and 26 deletions


@ -35,6 +35,12 @@
<email>sam@samwhited.com</email>
<jid>sam@samwhited.com</jid>
</author>
<revision>
<version>0.3</version>
<date>2015-07-24</date>
<initials>ssw</initials>
<remark><p>Include real-world compression numbers and additional recommended reading.</p></remark>
</revision>
<revision>
<version>0.2</version>
<date>2015-07-22</date>
@ -82,33 +88,34 @@
<section1 topic='Compression' anchor='compression'>
<p>
XML, and by extension XMPP, is known to be highly compressible. In a simple
test of a small (266089 byte) XMPP stream (connection, stream
initialization, feature discovery, roster loading, several presence stanzas
sent and received, disconnect), the entropy of the stream was found to be
5.616313 bits per byte. Using the `gzip` tool to apply Lempel-Ziv coding
(LZ77) without concern for server-side CPU usage resulted in a compression
ratio of 21% (a 79% reduction in bandwidth). In one test with a much larger
dataset typical of a corporate environment (many hundreds of users in the
roster), the ratio was as low as 13%, an 87% reduction in bandwidth!
</p>
<p>
XML, and by extension XMPP, is known to be highly compressible.
Compression of XMPP data can be achieved with the DEFLATE algorithm
(&rfc1951;) via TLS compression (&rfc3749;) or &xep0138;. While the
security implications of stream compression are beyond the scope of this
document (See the aforementioned RFC or XEP for more info), mitigating them
may affect compression ratios. The author does not recommend using TLS
compression with XMPP (or in general). If compression must be used, stream
level compression should be implemented instead. Compressing at the stream
level gives us the benefit of being able to flush the compression stream on
stanza boundaries to help prevent information from leaking. This, however,
may drastically increase compression ratios.
(&rfc1951;) via TLS compression (&rfc3749;) or &xep0138; (which also
supports other compression algorithms). While the security implications of
stream compression are beyond the scope of this document (see the
aforementioned RFC or XEP for more information), the author does not
recommend using TLS compression with XMPP (or in general). If compression
must be used, stream-level compression should be implemented instead, and
the compressed stream should have a full flush performed on stanza
boundaries to help prevent a class of chosen-plaintext attacks which can
cause data leakage in compressed streams. While this may reduce some of the
benefits of compression by raising compression ratios (that is, producing
larger compressed output), in a large, real-world deployment at HipChat,
network traffic was still observed to decrease by a factor of 0.58 when
enabling &xep0138; with ZLIB compression!
</p>
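<p>
The full flush on stanza boundaries described above can be sketched roughly
as follows. This is a minimal illustration using Python's standard zlib
module, not an XMPP library: it does not perform &xep0138; negotiation, and
the function names and example JIDs are hypothetical.
</p>

```python
import zlib


def compress_stanzas(stanzas, level=6):
    """Compress stanzas on one zlib stream, full-flushing after each one.

    Hypothetical helper for illustration; returns one compressed chunk
    per stanza.
    """
    compressor = zlib.compressobj(level)
    chunks = []
    for stanza in stanzas:
        data = compressor.compress(stanza.encode("utf-8"))
        # Z_FULL_FLUSH emits all pending output and resets the compression
        # state, so later stanzas cannot back-reference earlier ones.
        data += compressor.flush(zlib.Z_FULL_FLUSH)
        chunks.append(data)
    return chunks


def decompress_chunks(chunks):
    """Decompress the chunks in order on a single zlib stream."""
    decompressor = zlib.decompressobj()
    return [decompressor.decompress(c).decode("utf-8") for c in chunks]


if __name__ == "__main__":
    # Assumed sample data: many similar presence stanzas.
    stanzas = [
        "<presence from='user%d@example.com/phone'/>" % i for i in range(50)
    ]
    flushed = sum(len(c) for c in compress_stanzas(stanzas))
    unflushed = len(zlib.compress("".join(stanzas).encode("utf-8")))
    print("with full flush per stanza:", flushed, "bytes")
    print("without flushing:", unflushed, "bytes")
```

<p>
Running the sketch on short, repetitive stanzas shows the trade-off: every
flush discards the shared history, so the flushed stream is larger than a
stream compressed without flushing.
</p>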
<p>
While the CPU cost of compression directly translates to higher power
While the CPU cost of compression may directly translate to higher power
usage, it is vastly outweighed by the benefits of reduced network
utilization, especially on modern LTE networks which use a great deal more
power per bit than 3G networks as will be seen later in this document.
However, CPU usage is also not guaranteed to rise due to compression. In
the aforementioned deployment of stream compression, a <em>decrease</em> in
CPU utilization by a factor of 0.60 was observed due to the fact that there
were fewer packets that needed to be handled by the OS (which also takes
CPU time), and, potentially more importantly, less data that needed to be
TLS-encrypted (which is a much more CPU-expensive operation than
compression). Therefore, CPU time spent on compression (for ZLIB, at least;
other algorithms were not tested) should be considered negligible.
</p>
<p>
Supporting compression and flushing on stanza boundaries is highly
@ -184,6 +191,7 @@
servers.
</p>
<p>&xep0138; provides stream level compression.</p>
<p>&xep0322; allows XMPP streams to use the EXI XML format.</p>
<p>
&xep0115; provides a mechanism for caching, and hence eliding, the
disco#info requests needed to negotiate optional features.
@ -200,19 +208,19 @@
<p>
&xep0357; implements push notifications (third party message delivery),
which are often used on mobile devices and highly optimized to conserve
battery. Push notifications also allow delivery of notifications to
mobile clients that are currently offline (eg. in an XEP-0198 "zombie"
state).
battery. Push notifications also allow delivery of notifications to mobile
clients that are currently offline (e.g. in an XEP-0198 "zombie" state).
</p>
<p>
&xep0313; lets clients fetch messages which they missed (e.g. due to poor
mobile coverage and a flakey network connection).
mobile coverage and a flaky network connection).
</p>
</section1>
<section1 topic='Acknowledgements' anchor='acks'>
<p>
This XEP was originally written by Dave Cridland, and parts of his original
work were used in this rewrite.
work were used in this rewrite. Thanks to Atlassian for allowing me to
release hard numbers from their XMPP compression deployment.
</p>
</section1>
<section1 topic='Security Considerations' anchor='security'>