mirror of https://github.com/moparisthebest/xeps synced 2024-11-28 12:12:22 -05:00

XEP-0286 v0.3: Include real world compression numbers and additional recommended reading.

Matthew A. Miller 2015-07-27 08:16:34 -06:00
commit 751ef3ed25


@@ -35,6 +35,12 @@
<email>sam@samwhited.com</email> <email>sam@samwhited.com</email>
<jid>sam@samwhited.com</jid> <jid>sam@samwhited.com</jid>
</author> </author>
<revision>
<version>0.3</version>
<date>2015-07-24</date>
<initials>ssw</initials>
<remark><p>Include real world compression numbers and additional recommended reading.</p></remark>
</revision>
<revision> <revision>
<version>0.2</version> <version>0.2</version>
<date>2015-07-22</date> <date>2015-07-22</date>
@@ -82,33 +88,34 @@
<section1 topic='Compression' anchor='compression'> <section1 topic='Compression' anchor='compression'>
<p> <p>
XML, and by extension XMPP, is known to be highly compressible. In a simple XML, and by extension XMPP, is known to be highly compressible.
test of a small (266089 byte) XMPP stream (connection, stream
initialization, feature discovery, roster loading, several presence stanzas
sent and received, disconnect), the entropy of the stream was found to be
5.616313 bits per byte. Using the `gzip` tool to apply Lempel-Ziv coding
(LZ77) without concern for server-side CPU usage resulted in a compression
ratio of 21% (a 79% reduction in bandwidth). In one test with a much larger
dataset typical of a corporate environment (many hundreds of users in the
roster), the ratio was as low as 13%, an 87% reduction in bandwidth!
</p>
<p>
Compression of XMPP data can be achieved with the DEFLATE algorithm Compression of XMPP data can be achieved with the DEFLATE algorithm
(&rfc1951;) via TLS compression (&rfc3749;) or &xep0138;. While the (&rfc1951;) via TLS compression (&rfc3749;) or &xep0138; (which also
security implications of stream compression are beyond the scope of this supports other compression algorithms). While the security implications of
document (See the aforementioned RFC or XEP for more info), mitigating them stream compression are beyond the scope of this document (See the
may affect compression ratios. The author does not recommend using TLS aforementioned RFC or XEP for more info), the author does not recommend
compression with XMPP (or in general). If compression must be used, stream using TLS compression with XMPP (or in general). If compression must be
level compression should be implemented instead. Compressing at the stream used, stream level compression should be implemented instead, and the
level gives us the benefit of being able to flush the compression stream on compressed stream should have a full flush performed on stanza boundaries
stanza boundaries to help prevent information from leaking. This, however, to help prevent a class of chosen plaintext attacks which can cause data
may drastically increase compression ratios. leakage in compressed streams. While this may mitigate some of the benefits
of compression by raising compression ratios, in a large, real world
deployment at HipChat, network traffic was still observed to decrease by a
factor of 0.58 when enabling &xep0138; with ZLIB compression!
</p> </p>
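
The full-flush behaviour described in the added paragraph can be sketched with Python's standard `zlib` module. This is only an illustration under stated assumptions, not the implementation used in the HipChat deployment: the helper names `compress_stanzas` and `decompress_chunks` and the sample presence stanzas are hypothetical, and a real XEP-0138 implementation would compress the live stream rather than a list of strings.

```python
import zlib


def compress_stanzas(stanzas, flush_mode=zlib.Z_FULL_FLUSH):
    """Compress stanzas on one shared DEFLATE stream, flushing after each one.

    Hypothetical helper: returns one compressed chunk per stanza.
    """
    compressor = zlib.compressobj()
    chunks = []
    for stanza in stanzas:
        chunk = compressor.compress(stanza.encode("utf-8"))
        # Z_FULL_FLUSH emits all pending output and resets the compression
        # state, so later stanzas cannot back-reference earlier ones.
        # Z_SYNC_FLUSH also emits all pending output but keeps the history.
        chunk += compressor.flush(flush_mode)
        chunks.append(chunk)
    return chunks


def decompress_chunks(chunks):
    """Decompress the chunks in order on one shared inflate stream."""
    decompressor = zlib.decompressobj()
    return [decompressor.decompress(chunk).decode("utf-8") for chunk in chunks]


# Hypothetical sample traffic: two presence stanzas repeated many times.
stanzas = [
    "<presence from='romeo@example.net/orchard'/>",
    "<presence from='juliet@example.com/balcony'/>",
] * 50

full_flush = compress_stanzas(stanzas, zlib.Z_FULL_FLUSH)
sync_flush = compress_stanzas(stanzas, zlib.Z_SYNC_FLUSH)

raw_size = sum(len(s.encode("utf-8")) for s in stanzas)
full_size = sum(len(c) for c in full_flush)
sync_size = sum(len(c) for c in sync_flush)

print("raw bytes:       ", raw_size)
print("full flush bytes:", full_size)
print("sync flush bytes:", sync_size)
```

On this toy input the full-flush stream is larger than the sync-flush one, because each stanza is compressed without reference to the ones before it. That is the cost the paragraph refers to as raising compression ratios. The sample stanzas are very short, so the sizes printed here say nothing about real deployments, where stanzas carry more internal redundancy.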
<p> <p>
While the CPU cost of compression directly translates to higher power While the CPU cost of compression may directly translate to higher power
usage, it is vastly outweighed by the benefits of reduced network usage, it is vastly outweighed by the benefits of reduced network
utilization, especially on modern LTE networks which use a great deal more utilization, especially on modern LTE networks which use a great deal more
power per bit than 3G networks as will be seen later in this document. power per bit than 3G networks as will be seen later in this document.
However, CPU usage is also not guaranteed to rise due to compression. In
the aforementioned deployment of stream compression, a <em>decrease</em> in
CPU utilization by a factor of 0.60 was observed due to the fact that there
were fewer packets that needed to be handled by the OS (which also takes
CPU time), and, potentially more importantly, less data that needed to be
TLS-encrypted (which is a much more CPU-expensive operation than
compression). Therefore CPU time spent on compression (for ZLIB, at least;
other algorithms were not tested) should be considered negligible.
</p> </p>
<p> <p>
Supporting compression and flushing on stanza boundaries is highly Supporting compression and flushing on stanza boundaries is highly
@@ -184,6 +191,7 @@
servers. servers.
</p> </p>
<p>&xep0138; provides stream level compression.</p> <p>&xep0138; provides stream level compression.</p>
<p>&xep0322; allows XMPP streams to use the EXI XML format.</p>
<p> <p>
&xep0115; provides a mechanism for caching, and hence eliding, the &xep0115; provides a mechanism for caching, and hence eliding, the
disco#info requests needed to negotiate optional features. disco#info requests needed to negotiate optional features.
@@ -200,19 +208,19 @@
<p> <p>
&xep0357; implements push notifications (third party message delivery), &xep0357; implements push notifications (third party message delivery),
which are often used on mobile devices and highly optimized to conserve which are often used on mobile devices and highly optimized to conserve
battery. Push notifications also allow delivery of notifications to battery. Push notifications also allow delivery of notifications to mobile
mobile clients that are currently offline (eg. in an XEP-0198 "zombie" clients that are currently offline (eg. in an XEP-0198 "zombie" state).
state).
</p> </p>
<p> <p>
&xep0313; lets clients fetch messages which they missed (eg. due to poor &xep0313; lets clients fetch messages which they missed (eg. due to poor
mobile coverage and a flakey network connection). mobile coverage and a flaky network connection).
</p> </p>
</section1> </section1>
<section1 topic='Acknowledgements' anchor='acks'> <section1 topic='Acknowledgements' anchor='acks'>
<p> <p>
This XEP was originally written by Dave Cridland, and parts of his original This XEP was originally written by Dave Cridland, and parts of his original
work were used in this rewrite. work were used in this rewrite. Thanks to Atlassian for allowing me to
release hard numbers from their XMPP compression deployment.
</p> </p>
</section1> </section1>
<section1 topic='Security Considerations' anchor='security'> <section1 topic='Security Considerations' anchor='security'>