Include real world compression numbers and additional recommended reading.
- XML, and by extension XMPP, is known to be highly compressible. In a simple
- test of a small (266089 byte) XMPP stream (connection, stream
- initialization, feature discovery, roster loading, several presence stanzas
- sent and received, disconnect), the entropy of the stream was found to be
- 5.616313 bits per byte. Using the `gzip` tool to apply Lempel-Ziv coding
- (LZ77) without concern for server-side CPU usage resulted in a compression
- ratio of 21% (a 79% reduction in bandwidth). In one test with a much larger
- dataset typical of a corporate environment (many hundreds of users in the
- roster), the ratio was as low as 13%, an 87% reduction in bandwidth!
-
-
+ XML, and by extension XMPP, is known to be highly compressible.
Compression of XMPP data can be achieved with the DEFLATE algorithm
- (&rfc1951;) via TLS compression (&rfc3749;) or &xep0138;. While the
- security implications of stream compression are beyond the scope of this
- document (See the aforementioned RFC or XEP for more info), mitigating them
- may affect compression ratios. The author does not recommend using TLS
- compression with XMPP (or in general). If compression must be used, stream
- level compression should be implemented instead. Compressing at the stream
- level gives us the benefit of being able to flush the compression stream on
- stanza boundaries to help prevent information from leaking. This, however,
- may drastically increase compression ratios.
+ (&rfc1951;) via TLS compression (&rfc3749;) or &xep0138; (which also
+ supports other compression algorithms). While the security implications of
+ stream compression are beyond the scope of this document (See the
+ aforementioned RFC or XEP for more info), the author does not recommend
+ using TLS compression with XMPP (or in general). If compression must be
+ used, stream level compression should be implemented instead, and the
+ compressed stream should have a full flush performed on stanza boundaries
+ to help prevent a class of chosen plaintext attacks which can cause data
+ leakage in compressed streams. While this may mitigate some of the benefits
+ of compression by raising compression ratios, in a large, real world
+ deployment at HipChat, network traffic was still observed to decrease by a
+ factor of 0.58 when enabling &xep0138; with ZLIB compression!
- While the CPU cost of compression directly translates to higher power
+ While the CPU cost of compression may directly translate to higher power
usage, it is vastly outweighed by the benefits of reduced network
utilization, especially on modern LTE networks which use a great deal more
power per bit than 3G networks as will be seen later in this document.
+ However, CPU usage is also not guaranteed to rise due to compression. In
+ the aforementioned deployment of stream compression, a decrease in
+ CPU utilization by a factor of 0.60 was observed due to the fact that there
+ were fewer packets that needed to be handled by the OS (which also takes
+ CPU time), and, potentially more importantly, less data that needed to be
+ TLS-encrypted (which is a much more CPU-expensive operation than
+ compression). Therefore CPU time spent on compression (for ZLIB, at least;
+ other algorithms were not tested) should be considered negligible.
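The full flush on stanza boundaries described in the added text can be sketched roughly as follows. This is a minimal illustration using Python's standard `zlib` module, not code from &xep0138; or from the deployment mentioned above; the function names (`compress_stanzas`, `decompress_chunks`, `compression_ratio`) and the compression level are assumptions made for the example.

```python
import zlib


def compress_stanzas(stanzas, level=6):
    """Compress serialized stanzas on one shared DEFLATE stream,
    performing a full flush after each stanza (hypothetical helper)."""
    compressor = zlib.compressobj(level)
    chunks = []
    for stanza in stanzas:
        chunk = compressor.compress(stanza)
        # Z_FULL_FLUSH emits all pending output and resets the compression
        # state, so later stanzas cannot back-reference earlier ones.
        chunk += compressor.flush(zlib.Z_FULL_FLUSH)
        chunks.append(chunk)
    return chunks


def decompress_chunks(chunks):
    """Decompress the chunks in order on a single inflate stream."""
    decompressor = zlib.decompressobj()
    return [decompressor.decompress(chunk) for chunk in chunks]


def compression_ratio(stanzas, chunks):
    """Compressed size divided by original size (lower is better)."""
    original = sum(len(s) for s in stanzas)
    compressed = sum(len(c) for c in chunks)
    return compressed / original
```

Because the compressor forgets its history at every flush, repeated stanzas are not deduplicated against each other, which is the cost in compression ratio that the paragraph above refers to. Measuring `compression_ratio` on a captured stream with and without the flush would show how large that cost is for a given workload.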
Supporting compression and flushing on stanza boundaries is highly
@@ -184,6 +191,7 @@ servers.
&xep0138; provides stream level compression.
+&xep0322; allows XMPP streams to use the EXI XML format.
&xep0115; provides a mechanism for caching, and hence eliding, the
disco#info requests needed to negotiate optional features.
@@ -200,19 +208,19 @@
&xep0357; implements push notifications (third party message delivery),
which are often used on mobile devices and highly optimized to conserve
- battery. Push notifications also allow delivery of notifications to
- mobile clients that are currently offline (eg. in an XEP-0198 "zombie"
- state).
+ battery. Push notifications also allow delivery of notifications to mobile
+ clients that are currently offline (eg. in an XEP-0198 "zombie" state).
&xep0313; lets clients fetch messages which they missed (eg. due to poor
- mobile coverage and a flakey network connection).
+ mobile coverage and a flaky network connection).
This XEP was originally written by Dave Cridland, and parts of his original
- work were used in this rewrite.
+ work were used in this rewrite. Thanks to Atlassian for allowing me to
+ release hard numbers from their XMPP compression deployment.