
Postgres "unsupported frontend protocol" mystery

The wonderful tail_n_mail program continues to provide me with new mysteries from our Postgres clients. One of its main functions is to send an immediate email to us when an unexpected FATAL (or ERROR or PANIC) message appears in the Postgres logs. While these are often simple application errors, or deeper problems such as running out of disk space, once in a blue moon you see something completely unexpected. Some time ago, a bunch of these messages appeared in a tail_n_mail email:


[1] From files A to B Count: 2
First: [A] 2015-12-01T06:30:00 server1 postgres[1948]
Last:  [B] 2015-12-01T06:30:00 server2 postgres[29107]
FATAL: unsupported frontend protocol 65363.19778: server supports 1.0 to 3.0

I knew what caused this error in general, but decided to get to the bottom of the problem. Before we go into the specific error, let's review what causes this particular message to appear. When a Postgres client (such as psql or DBD::Pg) connects to Postgres, the first thing it does is issue a startup message. One of the things included in this message is the version of the Postgres protocol the client wishes to use. Since 2003, Postgres servers have been using version 3.0. It is very rare to see a client or server that uses anything else. Because this protocol version appears at the very start of the connection request, non-Postgres programs often trigger this error, since the server tries to interpret the first bytes it receives as a protocol number.

We can verify this by use of a small Perl script that connects to the server, and sends an invalid protocol request:

#!/usr/bin/env perl

use strict;
use warnings;
use IO::Socket;

## Connect to the local Postgres Unix socket
my $server = IO::Socket::UNIX->new('/tmp/.s.PGSQL.5432')
  or die "Could not connect!: $@";

## Two 16-bit integers form the "protocol version", followed by
## a single "user" parameter and the terminating null byte
my $packet = pack('nn', 1234, 56789) . "user\0pg\0\0";
## The startup packet begins with its own total length
$packet = pack('N', length($packet) + 4) . $packet;
$server->send($packet, 0);

After running the above program, a new error pops up in the Postgres logs as expected:

$ tail -1 /var/lib/pgsql/data/pg_log/postgres-2015-05-20.log
2015-05-21 12:00:00 EDT [unknown]@[unknown] [10281] FATAL:  unsupported frontend protocol 1234.56789: server supports 1.0 to 3.0

There is our error. The "unknown"s appear because my log_line_prefix looks like this: %t %u@%d [%p]. While the timestamp (%t) and the process ID (%p) are easily filled in, the login failed, so both the user (%u) and the database (%d) remain unknown.

Now on to our specific error, which you will recall is "unsupported frontend protocol 65363.19778". The program above shows that the protocol number is sent in a specific format. Let's use Perl to display the bytes behind 65363.19778 and see if there are any clues buried within them:

$ perl -e 'print pack "nn", 65363,19778'
�SMB

Some sort of unprintable character in there; let's take a deeper look just for completeness:

$ perl -e 'print pack "nn", 65363,19778' | hd
00000000  ff 53 4d 42                                       |.SMB|
00000004

Aha! SMB is not just a random arrangement of three letters; it is a big clue as to what is causing this message. SMB stands for Server Message Block, a network protocol used for file and printer sharing, most notably by Microsoft Windows. We can guess that this is either some program randomly hitting the Postgres port without realizing what it is, or some sort of purposeful port scanner. Why would something want to connect to the port but not log in? For one, you can determine the version of Postgres without logging in.
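
Indeed, the header of a classic SMB packet begins with the byte \xff followed by the ASCII letters "SMB". Read as two big-endian 16-bit integers, the same way the server parses the protocol version, those four bytes are exactly our mystery numbers:

$ perl -e 'printf "%d.%d\n", unpack "nn", "\xffSMB"'
65363.19778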

To cut to the chase, the culprit is the nmap program. In addition to simply scanning ports, it has the ability to do a deeper inspection to determine not only what is running on each port, but what version it is as well (with the "-sV" argument). Let's see nmap in action. We will use a non-standard Postgres port so as not to give it any additional hints about what is on that port:

$ nmap -p 5930 localhost -sV
Starting Nmap 6.40 ( http://nmap.org ) at 2015-05-20 12:00 EDT
Nmap scan report for localhost (127.0.0.1)
Host is up (0.000088s latency).
PORT     STATE SERVICE    VERSION
5930/tcp open  postgresql PostgreSQL DB
1 service unrecognized despite returning data. If you know the service/version, please submit the following fingerprint at http://www.insecure.org/cgi-bin/servicefp-submit.cgi :
SF-Port5930-TCP:V=6.40%I=7%D=5/21%Time=504C5445%P=i686-pc-linux-emu%r(Kerb
SF:eros,85,"E\0\0\0\x84SFATAL\0C0A000\0Munsupported\x20frontend\x20protoco
SF:l\x2027265\.28208:\x20server\x20supports\x201\.0\x20to\x203\.0\0Fpostma
SF:ster\.c\0L1834\0RProcessStartupPacket\0\0")%r(SMBProgNeg,85,"E\0\0\0\x8
SF:4SFATAL\0C0A000\0Munsupported\x20frontend\x20protocol\x2065363\.19778:\
SF:x20server\x20supports\x201\.0\x20to\x203\.0\0Fpostmaster\.c\0L1834\0RPr
SF:ocessStartupPacket\0\0");

Service detection performed. Please report any incorrect results at http://nmap.org/submit/ .
Nmap done: 1 IP address (1 host up) scanned in 6.73 seconds

It looks like it triggered the "unsupported protocol" message, based on what was returned. Taking a peek at the Postgres 9.3 logs shows our mystery message:

$ tail -1 /var/lib/pgsql/pg9.3/pg_log/postgres-2015-05-20.log
2015-05-21 12:00:00 EDT [unknown]@[unknown] [2318] FATAL:  unsupported frontend protocol 65363.19778: server supports 1.0 to 3.0

As a final check, let's confirm that nmap is using SMB when it runs the version check:

$ nmap localhost -p 5930 -sV --version-trace 2>/dev/null | grep SMB
Service scan sending probe SMBProgNeg to 127.0.0.1:5930 (tcp)
Service scan match (Probe SMBProgNeg matched with SMBProgNeg line 10662): 127.0.0.1:5930 is postgresql.  Version: |PostgreSQL DB|||
SF:ster\.c\0L1834\0RProcessStartupPacket\0\0")%r(SMBProgNeg,85,"E\0\0\0\x8

Bingo. Mystery solved. If you see that error in your logs, it is most likely caused by someone running nmap in version detection mode.

Postfix Address Verification

We recently upgraded some mail servers, moving from Exim to Postfix in the process. These servers work as a front-line spam/RBL filter, rejecting invalid messages and relaying valid ones to a different SMTP server based on the destination domain.

While looking for the best configuration layout to achieve this, we found that Postfix has a very useful and interesting feature: Address Verification. This technique allows the Postfix server to check that a sender or a recipient address is valid before accepting a message, preventing junk messages from entering the queue.

How does Address Verification work?

Upon receiving a message, Postfix will probe the preferred MTA for the address. If that address is valid, the message is accepted and processed; otherwise it is rejected.

Probe messages do not actually go through the whole delivery process; Postfix just connects to the MTA, sends a HELO + MAIL FROM + RCPT TO sequence, and checks the response. Probe results are cached on disk, minimizing network and resource impact. During this check the client is put "on hold"; if the probe takes too long, a temporary rejection is returned. A legitimate mail server will have no problem retrying the delivery later, when the cached result will likely be available.
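
Roughly, a probe session looks like this on the wire (an illustrative sketch; by default Postfix sends probes from its double-bounce address, controlled by the address_verify_sender parameter):

> HELO filter.example.com
< 250 mx.example.net
> MAIL FROM:<double-bounce@filter.example.com>
< 250 2.1.0 Ok
> RCPT TO:<someone@example.net>
< 250 2.1.5 Ok
> QUIT

A 2xx reply to the RCPT TO command caches the address as valid; a 5xx reply caches it as invalid.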

Everything is highly configurable: response codes, timeouts, cache storage type and location, and so on.
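
For example, here is a sketch of some of the cache-related knobs (parameter names are from the Postfix documentation; the values shown are typical defaults, so verify them against your version):

# Where probe results are cached on disk
address_verify_map = btree:$data_directory/verify_cache
# How long positive and negative probe results are trusted
address_verify_positive_expire_time = 31d
address_verify_negative_expire_time = 3d
# The temporary reject code used while a probe is still in progress
unverified_recipient_reject_code = 450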

Configure Recipient Address Verification

In our case, we wanted to only accept messages with a valid recipient address. Recipient Address Verification took care of this in a very smooth and elegant way.

Adding Recipient Address Verification is easy. Just add these lines to /etc/postfix/main.cf:

# Your relaying configuration will already be in place. For example:
# relayhost = [next.hop.ip.address]

smtpd_recipient_restrictions = 
    permit_mynetworks
    reject_unauth_destination
    reject_unknown_recipient_domain
    reject_unverified_recipient

# Custom reply message when probe fails (Postfix 2.6 and later)
unverified_recipient_reject_reason = Address lookup failure

The order of these settings is important, as they are evaluated one after another; as soon as one of them triggers a decision (PERMIT or REJECT), evaluation stops.

Let's look at them in detail:

  • permit_mynetworks: permit messages from local or trusted addresses listed in $mynetworks;
  • reject_unauth_destination: reject messages unless the recipient domain is a local one (typically in $mydestination, $virtual_alias_domains, or $virtual_mailbox_domains) or is accepted for forwarding (listed in $relay_domains);
  • reject_unknown_recipient_domain: reject messages if the recipient domain has no DNS MX and no A record, or has a malformed MX record;
  • reject_unverified_recipient: reject messages if Recipient Address Verification fails.

If you want to learn more, the best place to find more information is the Postfix Address Verification Howto.

Connected to PgBouncer or Postgres?

Determining whether your current database connection is using PgBouncer, or going directly to Postgres itself, can be challenging, as PgBouncer is a very low-level, transparent interface. It is possible, however, and here are some detection methods you can use.

This was inspired by someone asking on the Perl DBD IRC channel if it was possible to easily tell if your current database handle (usually "$dbh") is connected to PgBouncer or not. Since I've seen this question asked in other venues, I decided to take a crack at it.

There are actually two questions to be answered: (1) are we connected to PgBouncer, and if so, (2) what pool_mode is being used? The quickest and easiest way I found to answer the first question is to try to connect to a non-existent database. Normally, this produces a FATAL message, as seen here:

$ psql testdb -p 5432
testdb=# \c ghostdb
FATAL:  database "ghostdb" does not exist
Previous connection kept
testdb=# 

However, a slightly different ERROR message is returned if the same thing is attempted while connected to PgBouncer:

$ psql testdb -p 6432
testdb=# \c ghostdb
ERROR:  No such database: ghostdb
Previous connection kept
testdb=# 

Thus, an ERROR will always indicate that you are connected to PgBouncer and not directly to Postgres, which will always issue a FATAL.
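
For scripted checks, the same trick works from DBD::Pg. Here is a minimal sketch; the database name "ghostdb" and the message patterns are assumptions based on the psql sessions above:

#!/usr/bin/env perl

use strict;
use warnings;
use DBI;

my $port = shift || 6432;

## Try to connect to a database we assume does not exist
my $dsn = "dbi:Pg:dbname=ghostdb;host=127.0.0.1;port=$port";
my $dbh = DBI->connect($dsn, '', '', { RaiseError => 0, PrintError => 0 });

if (!defined $dbh) {
    ## PgBouncer words the failure differently than Postgres does
    if ($DBI::errstr =~ /No such database/) {
        print "Connected to PgBouncer\n";
    }
    elsif ($DBI::errstr =~ /does not exist/) {
        print "Connected directly to Postgres\n";
    }
    else {
        print "Could not tell: $DBI::errstr\n";
    }
}
else {
    print "Oops: a database named 'ghostdb' actually exists!\n";
    $dbh->disconnect;
}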

In the future, there will be an even simpler method. As of this writing, PgBouncer 1.6 has not been released, but it will have the ability to customize the application_name. This is a configurable session-level variable that is fairly new in Postgres. Andrew Dunstan wrote a patch which enables adding this to your pgbouncer.ini file:

application_name_add_host = 1

This will make PgBouncer modify the application_name to append some information to it such as the remote host, the remote port, and the local port. This is a feature many PgBouncer users will appreciate, as it offers an escape from the black hole of connection information that PgBouncer suffers from. Here is what it looks like on both a normal Postgres connection, and a PgBouncer connection. As you can see, this is an easier check than the "invalid database connection" check above:

## Postgres:
$ psql testdb -p 5432 -c 'show application_name'
 application_name 
------------------
 psql

## PgBouncer:
$ psql testdb -p 6432 -c 'show application_name'
        application_name        
--------------------------------
 psql - unix(7882@gtsm.com):6432

## DBD::Pg connections to PgBouncer get a very similar change:
$ perl testme.tmp.pl --port 6432
app - unix(6772@gtsm.com):6432

Now we have answered the question "are we connected to PgBouncer or not?". The next question is which pool mode we are in. There are three pool modes you can set for PgBouncer, controlling when your particular connection is returned to "the pool". In "session" mode, you keep the same Postgres backend the entire time you are connected. In "transaction" mode, you keep the same Postgres backend until the end of a transaction. In "statement" mode, you may get a new Postgres backend after each statement.
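
For reference, the pool mode is selected in pgbouncer.ini. A minimal sketch (parameter names per the PgBouncer documentation):

[pgbouncer]
listen_port = 6432
pool_mode = transaction        ; session, transaction, or statement
server_reset_query = DISCARD ALL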

First, we can check whether we are connected to PgBouncer in statement-level pool mode by taking advantage of the fact that multi-statement transactions are prohibited in that mode. PgBouncer enforces this by intercepting any attempt to enter a transaction (e.g. by issuing a BEGIN command). A very PgBouncer-specific error about "Long transactions not allowed" is issued back to the client, like so:

$ psql testdb -p 6432
testdb=# begin;
ERROR:  Long transactions not allowed

So, that takes care of detecting a pool_mode set to 'statement': seeing that error indicates you are using a statement-level PgBouncer connection. The other two modes, transaction and session, will *not* give the same error.

The next pool mode is "transaction", which means that the server connection is released back to the pool at the end of each transaction. To figure out if we are in this mode, we take advantage of the fact that PgBouncer can be set to clean up the connection at the end of each transaction by issuing a specific command. By default, the command set by server_reset_query is DISCARD ALL, which invalidates any prepared statements, temporary tables, and other transaction-spanning, session-level items. Thus, our test will see whether these session-level artifacts get discarded or not:

## Direct Postgres:
$ psql testdb -p 5432
testdb=# prepare abc(int) as select $1::text;
PREPARE
testdb=# execute abc(1);
 text
------
 1

## PgBouncer:
$ psql testdb -p 6432
testdb=# prepare abc(int) as select $1::text;
PREPARE
testdb=# execute abc(1);
ERROR:  prepared statement "abc" does not exist

Keep in mind that there are no truly "transactionless" commands in Postgres. Even though we did not issue a BEGIN at the psql prompt above, each command is treated as its own mini-transaction. In the case of the PgBouncer connection, the PREPARE is immediately followed by a DISCARD ALL, which means our prepared statement no longer exists. Hence, we have determined that we are using a transaction-level PgBouncer connection.
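
You can reproduce what PgBouncer is doing here by issuing the DISCARD ALL yourself on a direct Postgres connection:

$ psql testdb -p 5432
testdb=# prepare abc(int) as select $1::text;
PREPARE
testdb=# discard all;
DISCARD ALL
testdb=# execute abc(1);
ERROR:  prepared statement "abc" does not exist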

Unfortunately, not getting an error does not necessarily mean your PgBouncer is NOT in transaction mode! It could be that server_reset_query is empty, meaning that temporary artifacts are not discarded at the end of the transaction. In such a case, we can take advantage of the fact that PgBouncer will allow other clients to share in our current connection, and thus be able to see the temporary items. If we create a temporary table in one pgbouncer connection, then connect again as a new client, the temporary table will only show up if we are sharing sessions but not transactions. Easier shown than explained, I suspect:

## Regular Postgres gets a fresh session:
$ psql test1 -p 5432
test1=# create temp table abc(a int);
CREATE TABLE
test1=# select * from abc;
(No rows)
test1=# ^Z ## (we suspend with CTRL-Z)
[2]+  Stopped                 psql test1 -p 5432

$ psql test1 -p 5432
test1=# select * from abc;
ERROR:  relation "abc" does not exist

## PgBouncer will re-use the same session:
$ psql test1 -p 6432
test1=# create temp table abc(a int);
CREATE TABLE
test1=# select * from abc;
(No rows)
test1=# ^Z
[2]+  Stopped                 psql test1 -p 6432

$ psql test1 -p 6432
test1=# select * from abc;
(No rows)

The final PgBouncer pool mode is "session", in which a connection is returned to the pool only upon disconnection. Its only real advantage over a normal Postgres connection is avoiding the overhead of starting up and connecting to a new Postgres backend. The only way to tell if you are in this mode is by determining that you are *not* in the other two modes. :)

So, although PgBouncer is extremely transparent, there are some tricks to determine if you are connected to it, and at what pool_mode. If you can think of other (SQL-level!) ways to check, please let me know in the comments section.

Job opening: Web developer

We are looking for another talented software developer to consult with our clients and develop their web applications in AngularJS, Node.js, Ruby on Rails, and other technologies. If you like to focus on solving business problems and can take responsibility for getting a job done well without intensive oversight, please read on!

What is in it for you?

  • Flexible full-time work hours
  • Health insurance benefit
  • Paid holidays and vacation
  • 401(k) retirement savings plan (U.S. employees)
  • Annual bonus opportunity
  • Ability to move without being tied to your job location

What you will be doing:

  • Work from your home office, or from our offices in New York City and the Tennessee Tri-Cities area
  • Consult with clients to determine their web application needs
  • Build, test, release, and maintain web applications for our clients
  • Work with open source tools and contribute back as opportunity arises
  • Use your desktop platform of choice: Linux, Mac OS X, Windows
  • Learn and put to use new technologies
  • Direct much of your own work

What you will need:

  • Professional experience building reliable server-side apps in Ruby on Rails, Node.js and Express, Django, CakePHP, etc.
  • Good front-end web skills with responsive design using HTML, CSS, and JavaScript, including jQuery, AngularJS, Backbone.js, Ember.js, etc.
  • Experience with databases such as PostgreSQL, MySQL, MongoDB, CouchDB, Redis, Elasticsearch, etc.
  • A focus on needs of our clients and their users
  • Strong verbal and written communication skills

About us

End Point is a 19-year-old web consulting company based in New York City, with 39 full-time employees working mostly remotely from home offices. We are experts in web development, databases, and DevOps, collaborating using ssh, Screen/tmux, IRC, Google+ Hangouts, Skype, and good old phones.

We serve over 200 clients ranging from small family businesses to large corporations. We use free software frameworks in a variety of languages including JavaScript, Ruby, Python, Perl, PHP, and Java, tracked by Git, running on Linux.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of gender, race, religion, color, national origin, sexual orientation, age, marital status, veteran status, or disability status.

How to apply

Please email us an introduction to jobs@endpoint.com to apply. Include a resume, your GitHub or LinkedIn URLs, or whatever else would help us get to know you. We look forward to hearing from you! Full-time employment seekers only, please. No agencies or subcontractors.

Non-English Google Earth Layers on the Liquid Galaxy

The ability to activate layers within Google Earth is one of the things that makes Earth so powerful. In fact, there are many standard layers built into Earth, including weather, roads, place names, etc. There are also some additional layers with really interesting information, including one I noticed relatively recently called "Appalachian Mountaintop Removal", which is interesting to me now that I live in Tennessee.


However, while some of these available layers are interesting on a desktop, they're not necessarily very visually appealing on a Liquid Galaxy. We have identified a standard set of layers to enable and disable within Earth so that things don't appear too cluttered while running. Some things we've disabled by default are the weather and the roads, as well as many levels of place names and boundaries. For example, we have boundaries of countries and water bodies enabled, but don't want lines drawn for states, provinces, counties, or other areas such as those.

To disable these layers, we modify the GECommonSettings.conf file on the machines running Earth. This file spells everything out pretty well in a readable manner, and it's fairly easy to determine which layers you're enabling or disabling.

Here's an example of some of the entries contained within the GECommonSettings.conf file:
1st%20Level%20Admin%20Names%20%28States_Provinces%29=false
2nd%20Level%20Admin%20Regions%20%28Counties%29=false
Islands=false
http%3A__mw1.google.com_mw-earth-vectordb_geographic_features_en.kml\Geographic%20Features=false
Water%20Bodies=true
Places%20=false
Panoramio=false
360%20Cities=false
Photorealistic%20=true
Trees=true
Populated%20Places=true
Roads=false
Gray%20=false
http%3A__mw1.google.com_mw-weather_base_files_kml_weather_en.kmz\Clouds=true
http%3A__mw1.google.com_mw-weather_base_files_kml_weather_en.kmz\Radar=true
http%3A__mw1.google.com_mw-weather_base_files_kml_weather_en.kmz\Conditions%20and%20Forecasts=true
http%3A__mw1.google.com_mw-weather_base_files_kml_weather_en.kmz\Information=true
http%3A__mw1.google.com_mw-earth-vectordb_gallery_layers_gallery_root_en.kmz\360Cities=false
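
The entry names are just layer names with URL-style percent-encoding. A quick Perl one-liner (handling only the plain %XX escapes) makes any of them readable:

$ echo '1st%20Level%20Admin%20Names%20%28States_Provinces%29=false' | perl -pe 's/%([0-9A-Fa-f]{2})/chr hex $1/ge'
1st Level Admin Names (States_Provinces)=false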

See, that's pretty self-explanatory, isn't it?

Well, it is until you start introducing other languages to the mix. For example, we needed to run Earth in Korean for one of our clients. Once we fired up Earth and specified Korean as the language, Earth came up with all of the default layers turned on. All of the layers we'd disabled in English Earth stayed on, cluttering the displays with icons for each and every layer.

It took some trial and error, but I was eventually able to figure out how to resolve this. I loaded Google Earth in Korean mode on my Ubuntu VM, made my changes to the selected layers via the Layers selection area within Earth, and then quit. When I looked at the GECommonSettings.conf file afterward, I found a bunch of new entries added to it, one per option, though I couldn't exactly decipher which lines controlled which options.

Here's an example of some of the new entries that are now contained within my GECommonSettings.conf file:
http%3A__kh.google.com%3A80_\%UAD6D%UACBD=false
http%3A__kh.google.com%3A80_\%UAD6D%UAC00%UBA85=false
http%3A__kh.google.com%3A80_\%UD574%UC548%UC120=false
http%3A__kh.google.com%3A80_\1%UCC28%20%UD589%UC815%20%UACBD%UACC4%UC120%28%UC8FC_%UB3C4%29=false
http%3A__kh.google.com%3A80_\2%UCC28%20%UD589%UC815%UB2E8%UC704%20%UC9C0%UC5ED%28%UAD70%29=false
http%3A__kh.google.com%3A80_\%UC778%UAD6C%UBC00%UC9D1%UC9C0%UC5ED=true
http%3A__kh.google.com%3A80_\%UC12C=false

Now, I'll be honest and say that I don't have a clue exactly what %UAD6D%UACBD and 1%UCC28%20%UD589%UC815%20%UACBD%UACC4%UC120%28%UC8FC_%UB3C4%29 mean, but I really don't have to know. All I know is that they got disabled when I disabled the undesired layers within Earth. I then copied these lines to the configuration on my Liquid Galaxy, and the next time I fired it up in Korean, the layers were no longer cluttering up the displays.

I was able to use this exact same method to determine which layers to enable or disable for one of our Spanish-language clients, as well.