Welcome to End Point’s blog

Ongoing observations by End Point people

XZ compression

XZ is a new free compression file format that is starting to be more widely used. The LZMA2 compression method it uses first became popular in the 7-Zip archive program, with an analogous Unix command-line version called 7z.

We used XZ for the first time in the Interchange project in the Interchange 5.7.3 packages. Compared to gzip and bzip2, the file sizes were as follows:

interchange-5.7.3.tar.gz   2.4M
interchange-5.7.3.tar.bz2  2.1M
interchange-5.7.3.tar.xz   1.7M

Getting that tighter compression comes at the cost of its runtime being about 4 times slower than bzip2, but a bonus is that it decompresses about 3 times faster than bzip2. The combination of significantly smaller file sizes and faster decompression made it a clear win for distributing software packages, leading to it being the format used for packages in Fedora 12.

It's also easy to use on Ubuntu 9.10, via the standard xz-utils package. When you install that with apt-get, aptitude, etc., you'll get a scary warning about it replacing lzma, a core package, but this is safe to do because xz-utils provides compatible replacement binaries /usr/bin/lzma and friends (lzcat, lzless, etc.). There is also built-in support in GNU tar with the new --xz aka -J options.

1 comment:

Jon Jensen said...

Nice to see xz was added to RHEL 5.5 (and thus also CentOS 5.5).