
Postgres WAL files: best compression methods

Turtle turtle by WO1 Larry Olson from US Army

The PostgreSQL database system uses the write-ahead logging method to ensure that a log of changes is saved before being applied to the actual data. The log files that are created are known as WAL (Write Ahead Log) files, and by default are 16 MB in size each. Although this is a small size, a busy system can generate hundreds or thousands of these files per hour, at which point disk space becomes an issue. Luckily, WAL files are extremely compressible. I examined different programs to find one that offered the best compression (as indicated by a smaller size) at the smallest cost (as indicated by wall clock time). All of the methods tested worked better than the venerable gzip program, which is suggested in the Postgres documentation for the archive_command option. The best overall solution was using the pxz program inside the archive_command setting, followed closely by use of the 7za program. Use of the built-in wal_compression option was an excellent solution as well, although not as space-saving as using external programs via archive_command.

A database system is a complex beast, involving many trade-offs. An important issue is speed: waiting for changes to get written to disk before letting the client proceed can be a very expensive solution. Postgres gets around this with the use of the Write Ahead Log, which generates WAL files indicating what changes were made. Creating these files is much faster than performing the actual updates on the underlying files. Thus, Postgres is able to tell the client that the work is "done" when the WAL file has been generated. Should the system crash before the actual changes are made, the WAL files are used to replay the changes. As these WAL files represent a continuous unbroken chain of all changes to the database, they can also be used for Point in Time Recovery - in other words, the WAL files can be used to rewind the database to any single point in time, capturing the state of the database at a specific moment.
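
To make the point-in-time recovery idea concrete: in the Postgres 9.x era, replay after a restore is driven by a recovery.conf file. Here is a minimal sketch - the archive location and target time are made-up illustrations, not values from any system described here:

  # recovery.conf: replay archived WAL files up to a chosen moment.
  restore_command = 'cp /var/lib/pgsql/archive/%f %p'
  recovery_target_time = '2017-03-01 12:00:00'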

Postgres WAL files are exactly 16 MB in size (although this size may be changed at compilation time, it is almost unheard of to do so). These files primarily sit around taking up disk space and are only accessed when a problem occurs, so being able to compress them is a good one-time exchange of CPU effort for a lower file size. In theory, the time to decompress the files should also be considered, but testing revealed that all the programs decompressed so quickly that it should not be a factor.

WAL files can be compressed in one of two ways. As of Postgres 9.5, the wal_compression feature can be enabled, which instructs Postgres to compress parts of the WAL file in-place when possible, leading to the ability to store much more information per 16 MB WAL file, and thus reducing the total number generated. The second way is to compress with an external program via the free-form archive_command parameter. Here is the canonical example from the Postgres docs, showing use of the gzip program for archive_command:

archive_command = 'gzip < %p > /var/lib/pgsql/archive/%f'

It is widely known that gzip is no longer the best compression option for most tasks, so I endeavored to determine which program was the best at WAL file compression - in terms of final file size versus the overhead to create the file. I also wanted to examine how these fared versus the new wal_compression feature.

To compare the various compression methods, I examined all of the compression programs that are commonly available on a Linux system, are known to be stable, and which perform at least as well as the common utility gzip. The contenders were:

  • gzip - the canonical, default compression utility for many years
  • pigz - parallel version of gzip
  • bzip2 - second only to gzip in popularity, it offers better compression
  • lbzip2 - parallel version of bzip2
  • xz - an excellent all-around compression alternative to gzip and bzip2
  • pxz - parallel version of xz
  • 7za - excellent compression, but suffers from complex arguments
  • lrzip - compression program targeted at "large files"

For the tests, 100 random WAL files were copied from a busy production Postgres system. Each of those 100 files was compressed nine times by each of the programs above: from the "least compressed" option (e.g. -1) to the "best compressed" option (e.g. -9). The tests were performed on a 16-core system, with plenty of free RAM and nothing else running on the server. Results were gathered by wrapping each command with /usr/bin/time --verbose, which produces a nice breakdown of results. To gather the data, the "Elapsed (wall clock) time" was used, along with the size of the compressed file. Here is some sample output of the time command:

  Command being timed: "bzip2 -4 ./0000000100008B91000000A5"
  User time (seconds): 1.65
  System time (seconds): 0.01
  Percent of CPU this job got: 99%
  Elapsed (wall clock) time (h:mm:ss or m:ss): 0:01.66
  Average shared text size (kbytes): 0
  Average unshared data size (kbytes): 0
  Average stack size (kbytes): 0
  Average total size (kbytes): 0
  Maximum resident set size (kbytes): 3612
  Average resident set size (kbytes): 0
  Major (requiring I/O) page faults: 0
  Minor (reclaiming a frame) page faults: 938
  Voluntary context switches: 1
  Involuntary context switches: 13
  Swaps: 0
  File system inputs: 0
  File system outputs: 6896
  Socket messages sent: 0
  Socket messages received: 0
  Signals delivered: 0
  Page size (bytes): 4096
  Exit status: 0
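
For illustration, the data-gathering loop looked roughly like this sketch (the file layout, names, and log handling are simplified assumptions, not the exact harness used):

  #!/bin/bash
  # Compress each sample WAL file at every level, recording the
  # "Elapsed (wall clock) time" from GNU time plus the compressed size.
  # Assumes the 100 copied WAL files live in ./wals/ (hypothetical layout).
  for wal in wals/*; do
      for level in 1 2 3 4 5 6 7 8 9; do
          cp "$wal" sample.wal
          /usr/bin/time --verbose gzip "-$level" sample.wal \
              2>> "results-gzip-$level.log"
          ls -l sample.wal.gz >> "results-gzip-$level.log"
          rm -f sample.wal.gz
      done
  done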

The wal_compression feature was tested by creating a new Postgres 9.6 cluster, then running the pgbench program twice to generate WAL files - once with wal_compression enabled, and once with it disabled. Then each of the resulting WAL files was compressed using each of the programs above.
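
A rough sketch of that procedure (the paths and pgbench knobs here are illustrative, not the exact values used):

  # Create a throwaway 9.6 cluster and generate WAL traffic with pgbench.
  initdb --pgdata=/tmp/waltest
  echo "wal_compression = on" >> /tmp/waltest/postgresql.conf
  pg_ctl --pgdata=/tmp/waltest --log=/tmp/waltest/startup.log start
  createdb bench
  pgbench --initialize --scale=50 bench   # populate the pgbench tables
  pgbench --client=4 --time=300 bench     # several minutes of update churn
  # WAL files accumulate in /tmp/waltest/pg_xlog; repeat the whole run
  # with wal_compression left off to produce the comparison set.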

Table 1.
Results of compressing 16 MB WAL files - average for 100 files

Command   Wall clock time (s)   File size (MB)
gzip -1   0.271                 4.927
gzip -2   0.292                 4.777
gzip -3   0.366                 4.667
gzip -4   0.381                 4.486
gzip -5   0.492                 4.318
gzip -6   0.734                 4.250
gzip -7   0.991                 4.235
gzip -8   2.042                 4.228
gzip -9   3.626                 4.227

Command   Wall clock time (s)   File size (MB)
bzip2 -1  1.540                 3.817
bzip2 -2  1.531                 3.688
bzip2 -3  1.570                 3.638
bzip2 -4  1.625                 3.592
bzip2 -5  1.667                 3.587
bzip2 -6  1.707                 3.566
bzip2 -7  1.731                 3.559
bzip2 -8  1.752                 3.557
bzip2 -9  1.784                 3.541

Command   Wall clock time (s)   File size (MB)
xz -1     0.962                 3.174
xz -2     1.186                 3.066
xz -3     5.911                 2.711
xz -4     6.292                 2.682
xz -5     6.694                 2.666
xz -6     8.988                 2.608
xz -7     9.194                 2.592
xz -8     9.117                 2.596
xz -9     9.164                 2.597
Table 2.
Results of compressing 16 MB WAL file - average for 100 files

Command                     Wall clock time (s)   File size (MB)
lrzip -l -L1
lrzip -l -L2
lrzip -l -L3
lrzip -l -L4
lrzip -l -L5
lrzip -l -L6
lrzip -l -L7
lrzip -l -L8
lrzip -l -L9

Command                     Wall clock time (s)   File size (MB)
lrzip -z -L1
lrzip -z -L2
lrzip -z -L3
lrzip -z -L4
lrzip -z -L5
lrzip -z -L6
lrzip -z -L7
lrzip -z -L8
lrzip -z -L9

Command                     Wall clock time (s)   File size (MB)
7za -bd -mx=1 a test.7za
7za -bd -mx=2 a test.7za
7za -bd -mx=3 a test.7za
7za -bd -mx=4 a test.7za
7za -bd -mx=5 a test.7za
7za -bd -mx=6 a test.7za
7za -bd -mx=7 a test.7za
7za -bd -mx=8 a test.7za
7za -bd -mx=9 a test.7za
Table 3.
Results of compressing 16 MB WAL file - average for 100 files

Command    Wall clock time (s)   File size (MB)
pigz -1    0.051                 4.904
pigz -2    0.051                 4.755
pigz -3    0.051                 4.645
pigz -4    0.051                 4.472
pigz -5    0.051                 4.304
pigz -6    0.060                 4.255
pigz -7    0.081                 4.225
pigz -8    0.140                 4.212
pigz -9    0.251                 4.214

Command    Wall clock time (s)   File size (MB)
lbzip2 -1  0.135                 3.801
lbzip2 -2  0.151                 3.664
lbzip2 -3  0.151                 3.615
lbzip2 -4  0.151                 3.586
lbzip2 -5  0.151                 3.562
lbzip2 -6  0.151                 3.545
lbzip2 -7  0.150                 3.538
lbzip2 -8  0.151                 3.524
lbzip2 -9  0.150                 3.528

Command    Wall clock time (s)   File size (MB)
pxz -1     0.135                 3.266
pxz -2     0.175                 3.095
pxz -3     1.244                 2.746
pxz -4     2.528                 2.704
pxz -5     5.115                 2.679
pxz -6     9.116                 2.604
pxz -7     9.255                 2.599
pxz -8     9.267                 2.595
pxz -9     9.355                 2.591
Table 4.
Results of Postgres wal_compression option

Modifications                         Total size of WAL files (MB)
No modifications                      208.1
wal_compression enabled               81.0
xz -2                                 8.6
wal_compression enabled PLUS xz -2    9.4

Table 1 shows some baseline compression values for the three popular programs gzip, bzip2, and xz. Both gzip and bzip2 show little change in the file sizes as the compression strength is raised. However, xz has a relatively large jump when going from -2 to -3, although the time cost increases to an unacceptable 5.9 seconds. As a starting point, something well under one second is desired.

Among those three, xz is the clear winner, shrinking the file to 3.07 MB with a compression argument of -2, and taking 1.2 seconds to do so. Both gzip and bzip2 never even reach this file size, even when using a -9 argument. For that matter, the best compression gzip can ever achieve is 4.23 MB, which the other programs can beat without breaking a sweat.

Table 2 demonstrates two ways of invoking the lrzip program: the -l option (lzo compression - described in the lrzip documentation as "ultra fast") and the -z option (zpaq compression - "extreme compression, extreme slow"). All of those superlatives are supported by the data. The -l option runs extremely fast: even at -L5 the total clock time is still only 0.39 seconds. Unfortunately, the file size hovers around an undesirable 5.5 MB, no matter what compression level is used. The -z option produces the smallest file of all the programs here (2.48 MB) at a whopping cost of over 30 seconds! Even the smallest compression level (-L1) takes 3.5 seconds to produce a 3.4 MB file. Thus, lrzip is clearly out of the competition.

Compression in action (photo by Eric Armstrong)

The most interesting program is without a doubt 7za. Unlike the others, it is organized around creating archives, and thus doesn't do in-place compression as the others do. Nevertheless, the results are quite good. At the lowest level, it takes a mere 0.13 seconds to produce a 3.18 MB file. As it takes xz 1.19 seconds to produce a nearly equivalent 3.10 MB file, 7za is the clear winner ... if we had only a single core available. :)

It is rare to find a modern server with a single processor, and a crop of compression programs have appeared to support this new paradigm. First up is the amazing pigz, which is a parallel version of gzip. As Table 3 shows, pigz is extraordinarily fast on our 16 core box, taking a mere 0.05 seconds to run at compression level -1, and only 0.25 seconds to run at compression level -9. Sadly, it still suffers from the fact that gzip is simply not good at compressing WAL files, as the smallest size was 4.23 MB. This rules out pigz from consideration.

The bzip2 program has been nipping at the heels of gzip for many years, so naturally it has its own parallel version, known as lbzip2. As Table 3 shows, it is also amazingly fast. Not as fast as pigz, but with a speed of under 0.2 seconds - even at the highest compression level! There is very little variation among the compression levels used, so it is fair to simply state that lbzip2 takes 0.15 seconds to shrink the WAL file to 3.5 MB. A decent entry.

Of course, the xz program has a parallel version as well, known as pxz. Table 3 shows that its times still vary quite a bit, reaching into the 9-second range at higher compression levels, but pxz does very well at -2, taking a mere 0.17 seconds to produce a 3.09 MB file. This is comparable to the previous winner, 7za, which took 0.14 seconds to create a 3.12 MB file.

So the clear winners are 7za and pxz. I gave the edge to pxz, as (1) the file size was slightly smaller at comparable time costs, and (2) the odd syntax for 7za for both compressing and decompressing was annoying compared with the simplicity of "xz -2" and "xz -d".
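
Translating that conclusion back into the configuration from the start of this article, one safe way to wire pxz into archive_command is to copy the WAL file aside and compress the copy in place, since the file pointed to by %p must not be touched. This is only a sketch - the archive directory is the same hypothetical path used earlier, and the compression level is tunable:

  archive_command = 'cp %p /var/lib/pgsql/archive/%f && pxz -2 /var/lib/pgsql/archive/%f'
  restore_command = 'xz -dc /var/lib/pgsql/archive/%f.xz > %p'

Note that plain xz handles the decompression side, as pxz produces standard .xz files.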

Now, what about the built-in compression offered by the wal_compression option? As Table 4 shows, the compression for the WAL files I tested went from 208 MB to 81 MB. This is a significant gain, but only equates to compressing a single WAL file to 6.23 MB, which is a poor showing when compared to the compression programs above. It should be pointed out that the wal_compression option is sensitive to your workload, so you might see greater or lesser compression on your own system.

Of interest is that the WAL files generated by turning on wal_compression can be further compressed by the archive_command option, and quite effectively too - going from 81 MB of WAL files down to 9.4 MB. However, using just xz in the archive_command without wal_compression still yielded a smaller overall size, and means less CPU use because the data is only compressed once.

It should be pointed out that wal_compression has other advantages, and that comparing it to archive_command is not a fair comparison, but this article was primarily about the best compression option for storing WAL files long-term.

Thus, the overall winner is "pxz -2", followed closely by 7za and its bulky arguments, with honorable mention given to wal_compression. Your particular requirements might guide you to another conclusion, but hopefully nobody will simply default to using gzip anymore.

Thanks to my colleague Lele for encouraging me to try pxz, after I was already happy with xz. Thanks to the authors of xz, for providing an amazing program that has an incredible number of knobs for tweaking. And a final thanks to the authors of the wal_compression feature, which is a pretty nifty trick!

Wrocloverb part 2 - The Elixir Hype

One of the main reasons I attend Wrocloverb [1] almost every year is that it's a great forum for confronting ideas. It's almost a tradition to have at least two very enjoyable discussion panels during this conference. One of them was devoted to Elixir and why the Ruby community is so hyped about it.

Why is Elixir “sold” to us as a “new, better Ruby” when its underlying principles are totally different? Won’t it result in Elixir programmers who do not understand Elixir (like Rails programmers who do not know Ruby)?

The panelists briefly discussed the history of Elixir:

Jose Valim (who created Elixir) was working on threading in Rails and searching for better approaches to threading in web frameworks. He felt that lots of things were lacking in Erlang, and Elixir is his approach to better exceptions and a better developer experience.

Then they jumped to Elixir's main goals, which are:
  • Compatibility with Erlang (all datatypes)
  • Better tooling
  • Improving developer's experience

After that, they started speculating about problems that Elixir solves and RoR doesn't:

Ruby on Rails addresses many problems in ways that may seem somewhat archaic in the ever-scaling world of 2017. There are many alternative approaches - e.g. the "actor model", which is implemented in Ruby by Celluloid and in Scala by Akka; Elixir and Phoenix (Elixir's Rails-like framework) have their own actor model as well.

Phoenix ("Rails for Elixir") is just an Elixir app - unlike Rails, it is not separate from Elixir. Moreover Elixir is exactly the same language as Erlang so:

Erlang = Elixir = Phoenix


Then a discussion followed during which panelists speculated about the price of the jump from Rails to Elixir:

The Java-to-Rails jump was driven by business and productivity gains; there is no comparable jump for Phoenix/Elixir. Elixir code is also more verbose (fewer instance variables; all arguments are passed explicitly to functions).

My conclusions

One reason this discussion felt somewhat shallow and pointless was that these two worlds have different problems.

Elixir solves a lot of technical problems with scaling, thanks to Erlang's virtual machine. Such problems are definitely only a small part of what typical Ruby problem solvers deal with on a daily basis. Hearing Elixir and Ruby on Rails developers talk past each other was probably the first sign that there is no single hyped technology right now. Each problem can be addressed by many tools and frameworks.

[1] Wrocloverb describes itself as the "best Java conference in the Ruby world" - a deceiving description.

Wrocloverb 2017 part 1

Wrocloverb is a single-track 3-day conference that takes place in Wrocław (Poland) every year in March.

Here's a subjective list of the most interesting talks from the first day.

# Kafka / Karafka (by Maciej Mensfeld)

Karafka is another library that simplifies Apache Kafka usage in Ruby. It lets Ruby on Rails apps benefit from horizontally scalable message buses in a pub-sub (publisher/subscriber) type of network.

Why Kafka is (probably) a better message/task broker for your app:
- broadcasting is a real power feature of Kafka (HTTP lacks that)
- the author claims it's easier to support than ZeroMQ/RabbitMQ
- it's namespaced with topics (similar to Robot Operating System)
- it's a great replacement for ruby-kafka and Poseidon

# Machine Learning to the Rescue (Mariusz Gil)

This talk was devoted to the Machine Learning success (and failure) story of the author.

The author underlined that Machine Learning is a process, and proposed the following workflow:

  1. define a problem
  2. gather your data
  3. understand your data
  4. prepare and condition the data
  5. select & run your algorithms
  6. tune algorithms parameters
  7. select final model
  8. validate final model (test using production data)

Mariusz described a few ML problems that he has dealt with in the past. One of them was a project designed to estimate the cost of a code review. He outlined the process of tuning the input data. Here's a list of the features that comprised the input for estimating code review cost:
  • number of lines changed
  • number of files changed
  • efferent coupling
  • afferent coupling
  • number of classes
  • number of interfaces
  • inheritance level
  • number of method calls
  • LLOC metric (logical lines of code)
  • LCOM metric (lack of cohesion of methods - whether the single responsibility principle is followed or not)

# Spree lightning talk

One of the lightning talks was devoted to Spree. Here's some interesting recent data from the Spree world:

  • number of Spree contributors: 700
  • it's very modular
  • it's API-driven
  • it's one of the biggest repos on GitHub
  • very large number of extensions
  • it drives thousands of stores worldwide
  • Spark Solutions is a maintainer
  • popular companies that use Spree: Go Daddy, Goop, Casper, Bonobos, Littlebits, Greetabl
  • it supports Rails 5, Rails 4.2, and Rails 3.x

The author also released the newest stable version, 3.2.0, during the talk.

Shell efficiency: mkdir and mv

Little tools can be a nice improvement. Not everything needs to be thought-leaderish.

For example, once upon a time in my Unix infancy I didn't know that mkdir has the -p option to make intervening directories automatically. So back then, in order to create the path a/b/c/ I would've run: mkdir a; mkdir a/b; mkdir a/b/c when I could instead have simply run: mkdir -p a/b/c.

In working at the shell, particularly on my own local machine, I often find myself wanting to move one or several files into a different location, to file them away. For example:

mv -i ~/Downloads/Some\ Long\ File\ Name.pdf ~/some-other-long-file-name.tar.xz ~/archive/new...

at which point I realize that the subdirectory of ~/archive that I want to move those files into does not yet exist.

I can't simply move to the beginning of the line and change mv to mkdir -p without removing my partially-typed ~/archive/new....

I can go ahead and remove that, and then after I run the command I have to change the mkdir back to mv and add back the ~/archive/new....

In one single day I found I was doing that so often that it became tedious, so I re-read the GNU coreutils manpage for mv to see if there was a relevant option I had missed or a new one that would help. And I searched the web to see if a prebuilt tool is out there, or if anyone had any nice solutions.

To my surprise I found nothing suitable, but I did find some discussion forums full of various suggestions and many brushoffs and ill-conceived suggestions that either didn't work for me or seemed much overengineered.

The solution I came up with was very simple. I've been using it for a few months and am happy enough with it to share it and see if it helps anyone else.

In zsh (my main local shell) add to ~/.zshrc:

mkmv() {
    mkdir -p -- "$argv[-1]"   # create the destination directory (last argument)
    mv "$@"                   # then run the move exactly as typed
}

And in bash (which I use on most of the many servers I access remotely) add to ~/.bashrc:

mkmv() {
    mkdir -p -- "${!#}"   # ${!#} expands to the last positional parameter
    mv "$@"               # then run the move exactly as typed
}

To use: Once you realize you're about to try to move files or directories into a nonexistent directory, simply go to the beginning of the line (^A in standard emacs keybindings) and type mk in front of the mv that was already there:

mkmv -i ~/Downloads/Some\ Long\ File\ Name.pdf ~/some-other-long-file-name.tar.xz ~/archive/new...

It makes the directory (or directories) and then completes the move.

There are a few important considerations that I didn't foresee in my initial naive implementation:

  • Having the name be somethingmv meant less typing than something requiring me to remove the mv.
  • For me, it needs to support not just moving one thing to one destination, but rather a whole list of things. That meant accessing the last argument (the destination) for the mkdir; see the demonstration after this list.
  • I also needed to allow through arguments to mv such as -i, -v, and -n, which I often use.
  • The -- argument to mkdir ensures that we don't accidentally end up with any other options and that we can handle a destination with a leading - (however unlikely that is).
  • The mv command needs to have a double-quoted "$@" so that the original parameters are each expanded into double-quoted arguments, allowing for spaces and other shell metacharacters in the paths. (See the zsh and bash manpages for details on the important difference in behavior of "$@" compared to "$*" and either of them unquoted.)
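
To illustrate the last-argument and "$@" points above, here is a tiny bash demonstration (the helper functions are hypothetical, purely for illustration):

  # ${!#} is indirect expansion: it expands to the value of the parameter
  # whose number is $# - that is, the last positional parameter.
  lastarg() { echo "last argument: ${!#}"; }
  lastarg -i "Some File.pdf" ~/archive/new   # prints the ~/archive/new path

  # "$@" preserves each original argument as a separate word:
  countargs() { echo "got $# arguments"; }
  countargs "a b" c                          # prints: got 2 arguments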

This doesn't support GNU extensions to mv such as a --target-directory that precedes the source paths. I don't use that interactively, so I don't mind.

Because this is such a small thing, I avoided for years bothering to set it up. But now that I use it all the time I'm glad I have it!

360° Panoramic Video on Liquid Galaxy

End Point’s Liquid Galaxy recently had an exciting breakthrough! We can now show 360° panoramic video effectively and seamlessly on our Liquid Galaxy systems.

Since our integration of seamless 360° panoramic video, our marketing and content teams have been loading up our system with the best 360° panoramic videos. The Liquid Galaxy system at our headquarters office has 360° panoramic videos from National Geographic, The New York Times, Google Spotlight, gaming videos, and more.

The technical challenge with panoramic video is to get multiple instances of the video player app to work in sync. On the Liquid Galaxy, you're actually seeing 7 different video player apps all running at the same time and playing the same video file, but with each app showing a slightly adjusted slice of the 360° panoramic video. This synchronization and angle-adjustment is at the heart of the Liquid Galaxy platform, and allows a high-resolution experience and surrounding immersion that doesn't happen on any other video wall system.

We anticipate that our flawless 360° panoramic video will resonate with many industries, one of which is the gaming industry. Below we’ve included a video of 360° gaming videos and how they appear on Liquid Galaxy. Mixed with all the other exciting capabilities of Liquid Galaxy, we anticipate that the ability to view angle-adjusted and 3D-immersive 360° video on the system will be a huge crowd-pleaser.

If you are interested in learning more about 360° panoramic video on Liquid Galaxy, don’t hesitate to contact us or visit our Liquid Galaxy website for more information.