The PGCon "Hall Track"

One of my favorite parts of PGCon is always the "hall track", a general term for the sideline discussions and brainstorming sessions that happen over dinner, between sessions (or sometimes during sessions), and pretty much everywhere else during the conference. This year's hall track topics seemed to be set by the developers' meeting; everywhere I went, someone was talking about hooks for external security modules, MERGE, predicate locking, extension packaging and distribution, or exposing transaction order for replication. Other developers' pet projects that didn't appear in the meeting showed up occasionally, including unlogged tables and range types. Even more than, for instance, the wiki pages describing the things people plan to work on, these interstitial discussions demonstrate the vibrancy of the community and give a good idea just how active our development really is.

This year I shared rooms with Robert Haas, so I got a good overview of his plans for global temporary and unlogged tables. I spent a while with Jeff Davis looking through the code for exclusion constraints and deciding whether it was realistically possible to cause a starvation problem with many concurrent insertions into a table with an exclusion constraint. I didn't spend the time I should have talking with Dimitri Fontaine about his PostgreSQL extensions project, but if time permits I'd like to see if I could help out with it. Nor did I find the time I'd have liked to work on PL/Parrot, but I was glad to meet Jonathan Leto, who has done most of the coding work thus far on that project.

In contrast to other conferences, I didn't have a particular itch of my own to scratch between sessions. During past conferences I've been eager to discuss ideas for multi-column statistics; though that work continues, slowly, time hasn't permitted enough recent development even for the topic to be fresh in my mind, much less worthy of in-depth discussion. This lack of one overriding subject turned out to be a refreshing change, however, as it left the other hall track subjects less filtered.

Finally, it was nice to spend time with co-workers, and in fact to meet in person, at last, one of the "Greg"s I'd talked to on the phone many times but had never actually met face to face. Various engagements in my family or his have gotten in the way in the past. One of the quirks of working for a distributed organization...

Update: Fixed link to developers' meeting wiki page, thanks to comment from roppert

Learn more about End Point's Postgres Support, Development, and Consulting.

Spree and Multi-site Architecture for Ecommerce

Running multiple stores from a single ecommerce platform instance seems to be quite popular these days. End Point has worked with several clients to develop a multi-store architecture running from one Interchange installation. In the case of our work with Backcountry.com, the data structure requires a site_id column in the product and navigation tables to specify which stores products and categories belong to. Frontend views are generated from "partial views" or "view components" organized into a per-site directory structure, and database calls request products against the current site_id. Below is a simplified view of the data structure used for Backcountry.com's multi-site architecture.


Basic multi-store data architecture

A similar data model was implemented and enhanced for another client, College District. A site_id column exists in multiple tables, including the products and navigation tables, and each site can only access data in its own site schema. College District goes one step further for appearance management by storing CSS values in the database and generating CSS stylesheets on the fly from the stored settings. This architecture works well for College District because it allows the owners to quickly publish new sites. Below is a simplified view of the data structure used for College District, where the site_variables table contains CSS values (text, background colors, etc.).


Extended multi-store data architecture where site_variables table contains CSS settings. The store frontend can access data in the current store schema only.

In the past year, running a multi-site setup has been a popular topic in the user group for Spree, an open source Ruby on Rails ecommerce platform. I've been happy to be involved in a couple of client multi-site projects. Here I'll share some notes on my Spree multi-site implementation work.

Basic Extension

First, the Spree multi-domain extension developed by the Spree core team serves as a strong starting point. The extension migrations produce the data structure shown below by creating new stores and products_stores tables and adding a store_id column to the tracker (Google Analytics data) and orders tables.

Code changes in addition to data model changes are:

  • A before filter sets the current store by examining the current environment variable "SERVER_NAME".
  • Frontend product retrieval is modified to retrieve products in the current store only.
  • Frontend navigation is generated based on products assigned to the current store.
  • The Tracker (Google Analytics) retrieval method is modified to only retrieve the current store settings.
  • Order processing assigns the store_id value to each order.

With this extension, a user can check out across multiple stores running from a single Spree application with the same account, products can be assigned to multiple stores, and a single default store is rendered if no domain matches the current SERVER_NAME. The extension does not introduce advanced schema behavior like College District's data architecture; however, it could be customized to do so. The extension suits the basic requirements of a multi-store architecture.
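
To make the bulleted changes above more concrete, here is a rough conceptual sketch of store-scoped product retrieval. This is not the extension's actual code; the named_scope name and the exact association setup are assumptions for illustration only (Rails 2.3-era syntax):

# Conceptual sketch only -- the multi-domain extension's real implementation differs.
# Products join through the products_stores table, and frontend queries are
# filtered by the store resolved from SERVER_NAME.
class Product < ActiveRecord::Base
  has_and_belongs_to_many :stores   # backed by the products_stores join table

  # Hypothetical scope restricting products to a given store
  named_scope :by_store, lambda { |store|
    { :joins => :stores, :conditions => ['stores.id = ?', store.id] }
  }
end

# In a frontend controller, after a before filter has set @current_store:
#   @products = Product.by_store(@current_store)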

Additional Customizations Required

In my project, there were additional changes required. I used the Spree static pages extension, which introduces functionality to present and manage static content pages. I modified this extension to create an additional stores_pages table that introduces a has and belongs to many relationship between stores and pages.
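
A minimal sketch of that change, assuming standard Rails 2.3-era migration and model conventions (the exact class and column names in my project may differ):

# Hypothetical migration creating the stores_pages join table
class CreateStoresPages < ActiveRecord::Migration
  def self.up
    create_table :stores_pages, :id => false do |t|
      t.integer :store_id
      t.integer :page_id
    end
    add_index :stores_pages, [:store_id, :page_id]
  end

  def self.down
    drop_table :stores_pages
  end
end

# The models then declare the has and belongs to many relationship:
#   class Store < ActiveRecord::Base; has_and_belongs_to_many :pages; end
#   class Page  < ActiveRecord::Base; has_and_belongs_to_many :stores; end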

Basic Spree static pages data model.
Expanded Spree static pages data model with multi-store design.

Other custom requirements may include modifying the spree-faq extension to build a has and belongs to many relationship between questions and stores, or similar changes that create relationships between the stores table and other data.

File Structure Organization

The next interesting design choice I faced was how to handle appearance management across multiple sites. As I mentioned before, a method was developed to retrieve and build views based on the current store in Backcountry.com's multi-store solution. With College District, the database stored CSS values and generated new stylesheets on the fly for each site. I chose to implement a simplified version of the College District implementation, where a single stylesheet contains the rules for each site.

In the Spree implementation, the Spree::BaseController class is modified with a before filter to set the store and asset (images, javascript, stylesheet) location:

module Spree::Fantegrate::BaseController
  def self.included(controller)
    controller.class_eval do
      controller.append_before_filter :set_store_and_asset_location
    end
  end
  def set_store_and_asset_location
    @current_store ||= Store.by_domain(request.env['SERVER_NAME']).first
    @current_store ||= Store.default.first
    @asset_location = @current_store.domains.gsub('.mydomain.com', '')
  end
end

The default layout includes the main and site specific stylesheets:

  <%= stylesheet_link_tag "style" %>
  <%= stylesheet_link_tag "#{@asset_location}/site" %>

The site specific stylesheet contains style rules and includes site specific image settings:

body.site_a { background: #101a35 url(/images/site_a/bg.jpg) repeat-x 0 0; }
.site_a h1#logo { float: left; display: inline; width: 376px; height: 131px; position:relative; left: -15px; padding-bottom: 10px; }
.site_a h1#logo a { display: block; height: 131px; background: url(/images/site_a/logo.png); }
.site_a #top-right-info,
.site_a #top-right-info a,
.site_a #top-right-info b { color: #fff; }
...

This implementation assumes that design differences across stores will be minimal. It's a simple and effective way to get an initial multi-store architecture in place that lets you manage multiple sites' appearance from a single Spree application.
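
One detail implied but not shown above: the per-site selectors only apply if the layout tags the page with the current site. A minimal sketch, assuming @asset_location matches the class prefix used in the stylesheet (e.g. "site_a"):

  <body class="<%= @asset_location %>">
    ...
  </body>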

Advanced Topics

Some advanced topics I've considered with this work are:

  • Can we dynamically generate the migrations based on which extensions are installed? For example, the main extension could include a list of tables; each table that exists would get a dynamically generated migration building a has many / belongs to, has one / belongs to, or has and belongs to many relationship between stores and that table (a rough sketch follows this list).
  • SSL setup for our Spree multi-store implementation was accomplished with a wildcard SSL certificate, where each store can be accessed from a different subdomain. The Backcountry.com and College District implementations used standard SSL certificates because their stores do not share a domain. The Spree implementation methods described in this article do not differ between subdomain and separate-domain designs, but this is certainly something to consider at the start of a multi-site project.
  • The multi-domain extension could be customized and enhanced to include a table that contains CSS settings similar to our College District implementation and allows you to generate new stylesheets dynamically to change the look of a site. The obvious advantage of this is that a user with site administrative permissions can change the appearance of the site via backend Spree management without involving development resources.
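
Here's a rough sketch of the dynamic migration idea from the first bullet above. It is purely illustrative and untested; the candidate table list, the join table naming, and the decision to use has and belongs to many throughout are all assumptions:

# Hypothetical generator: for each known extension table that exists,
# build a join table relating it to stores so a has_and_belongs_to_many
# relationship can be declared against Store.
CANDIDATE_TABLES = %w(pages questions taxonomies)   # assumed list

class CreateStoreRelationships < ActiveRecord::Migration
  def self.up
    existing = ActiveRecord::Base.connection.tables
    CANDIDATE_TABLES.each do |table|
      next unless existing.include?(table)
      join_table = ['stores', table].sort.join('_')
      next if existing.include?(join_table)
      create_table join_table, :id => false do |t|
        t.integer :store_id
        t.integer "#{table.singularize}_id"
      end
    end
  end

  def self.down
    existing = ActiveRecord::Base.connection.tables
    CANDIDATE_TABLES.each do |table|
      join_table = ['stores', table].sort.join('_')
      drop_table(join_table) if existing.include?(join_table)
    end
  end
end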

Learn more about End Point's Ruby on Rails Development or Ruby on Rails Ecommerce Services.

Postgres Conference - PGCon2010 - Day Two

Day two of the PostgreSQL Conference started a little later than the previous day in obvious recognition of the fact that many people were up very, very late the night before. (Technically, this is day four, as the first two days consisted of tutorials; this was the second day of "talks").

The first talk I went to was PgMQ: Embedding messaging in PostgreSQL by Chris Bohn. It was well attended, although there were definitely a lot of late-comers and bleary eyes. A tough slot to fill! Chris is from Etsy.com and I've worked with him there, although I had no interaction with the PgMQ project, which looks pretty cool. From the talk description:

PgMQ (PostgreSQL Message Queueing) is an add-on that embeds a messaging client inside PostgreSQL. It supports the AMQP, STOMP and OpenWire messaging protocols, meaning that it can work with all of the major messaging systems such as ActiveMQ and RabbitMQ. PgMQ enables two replication capabilities: "Eventually Consistent" Replication and sharding.

As near as I can tell, "eventually consistent" is the same as "asynchronous replication": the slave won't be the same as the master right away, but will be eventually. As with Bucardo and Slony, the actual lag is very small in practice: a handful of seconds at the most. I like the fact that it supports all those common messaging protocols. Chris mentioned in the talk that it should be possible for other systems like Bucardo to support something similar. I'll have to play around with PgMQ a bit and see about doing just that. :)

The typical post-talk gatherings

The next "talk" was the enigmatically labeled Replication Panel. Enigmatic in this case as it had no description whatsoever. It's a good thing I had decided to check it out anyway (I'm a sucker for any talk related to replication, in case it wasn't obvious yet). I was apparently nominated to be on the panel, representing Bucardo! So much for getting all my speaking done and over with on the first day. The panel covered a pretty wide swath of Postgres replication technologies, each represented by someone deep in the development of that system. From left to right on a cluster of stools at the front of the room were:

After a quick one-minute intro each, describing who we were and what our replication system was, we took questions from the audience. Rather, Dan Langille played the part of moderator, gathering written questions from the audience, reading them to us, and having us take turns answering. We managed to get through 16 questions. All were interesting, even if some did not apply to all the solutions. Some of the more relevant ones I remember:

    "If your replication solution was not available, which of the other replication solutions would you recommend?" This was my favorite question. My answer was: if using Bucardo in multi-master mode, switch to pgpool. If using in master-slave mode, use Slony.

    "How will PG 9.0 affect your solution? Will your solution still remain relevant?" This most heavily affects Bucardo, Slony, and Londiste, and we all agreed that we're happy to lose users who simply need a read-only copy of their database. There remain plenty of use cases that 9.0 will not solve, however.

    "For multi-master solutions: How are database collisions resolved? Do you recommend your solution for geographically remote locations?" This one is pretty much for me alone. :) I gave a quick overview of Bucardo's built-in conflict resolution systems, and how custom ones built on business logic work. Since Bucardo was originally built to support servers over a non-optimal network, the second part was an easy yes.

    "Is there a way to standardize and reduce the number of replication systems and focus on making the subset more robust, efficient, and versatile?" The general answer was no, as the use cases for all of them are so wildly different. I thought the only possible reduction was to combine Slony and Londiste, as they are very close technically and have pretty much identical use cases.

    "How easy is it to switch masters? Are you planning on improving the tools to do so?" With Bucardo, switching is as easy as pointing to a different database if using master-master. However, Bucardo master-slave has no built-in support at all for failover (as Slony has). So the answer is "not easy at all" and yes, we want to provide tools to do so.

    "What is your biggest bug, problem, or limitation you are fixing now?" All three of the async trigger solutions (Bucardo, Slony, and Londiste) answered "DDL triggers". Which is hopefully coming for 9.1 (stop reading this blog and get to work on that, Jan).

All in all, I really liked the panel, and I think the audience did as well. Hopefully we'll see more things like it at future conferences. Since we did not know the questions beforehand, and took everything from the audience, it was the polar opposite of someone giving a talk with prepared slides.

I had some people come up to me afterwards to ask for more details about Bucardo, because (as they pointed out) it's the only multi-master replication system for Postgres (not technically true, as pg-pool and rubyrep provide multi-master use cases as well, but the former is synchronous and fairly complex, while the latter is very new and lacking some features). Maybe next year I should give a whole talk on Bucardo rather than just blabbing about it here on the blog. :)

After that, I popped into the Check Please! What Your Postgres Databases Wishes You Would Monitor talk by Robert Treat (who I also used to work with). It was a good talk, but pretty much review for me, as watching over and monitoring databases is what I spend a lot of my time doing. :) Here's the description:

Compared to many proprietary systems, Postgres tends to be pretty straight forward to run. However, if you want to get the most from your database, you shouldn't just set it and forget it, you need to monitor a few key pieces of information to keep performance going. This talk will review several key metrics you should be aware of, and explain under which scenarios you may need additional monitoring.

The final talk I went to was Deploying and testing triggers and functions in multiple databases by Norman Yamada. This was an interesting talk for me because he was using a lot of the code from the same_schema action in the check_postgres program to do the actual comparison. Indeed, I made some patches while at the conference to allow for better index comparisons at Norman's request. I also managed to get some work done on tail_n_mail and Bucardo while there - something about being surrounded by all that Postgres energy made me productive despite having very little free time.

I had to catch an early flight, and was not able to catch the final talk slot of the day, nor the closing session or the BOFs that night. Hopefully someone who did catch those will blog about it and let me know how it went. I hear the t-shirt we signed at the developer's meeting went for a sweet ransom.

If you went to PgCon, I have two requests for you. First, please fill out the feedback for each talk you went to. It takes less than a minute per talk, and is invaluable for both the speakers and the conference organizers. Second, please blog about PgCon. It's helpful for people who did not get to go to see the conference through other people's eyes. And do it now, while things are still fresh.

If you did not go to PgCon, I have one request for you: go next year! Perhaps at PgCon 2011 we'll break the 200 person mark. Thanks to Dan Langille as always for creating PgCon and keeping it running smoothly year after year.

Learn more about End Point's Postgres Support, Development, and Consulting.

PostgreSQL Conference - PGCon 2010 - Day One

The first day of talks for PGCon 2010 is now over; here's a recap of the parts that I attended.

On Wednesday, the developer's meeting took place. It was basically 20 of us gathered around a long conference table, with Dave Page keeping us to a strict schedule. While there were a few side conversations and contentious issues, overall we covered an amazing number of things in a short period of time, and actually made action items out of almost all of them. My favorite *decision* we made was to finally move to git, something I and others have been championing for years. The other most interesting parts for me were the discussion of what features we will try to focus on for 9.1 (it's an ambitious list, no doubt), and DDL triggers! It sounds like Jan Wieck has already given this a lot of thought, so I'm looking forward to working with him on implementing these triggers (or at least nagging him about it if he slows down). These triggers will be immensely useful to replication systems like Bucardo and Slony, which currently implement DDL replication in a very manual and unsatisfactory way. These triggers will not be like current triggers, in that they will not be directly attached to system tables. Instead, they will be associated with certain DDL events, such that you could have a trigger on any CREATE event (or perhaps something finer grained, such as a trigger on a CREATE TABLE event). Whenever it comes in, I'll make sure that Bucardo supports it, of course!

The first day of talks kicked off with the plenary by Gavin Roy called "Perspectives on NoSQL" (description and slides are available). Gavin actually took the time to *gasp* research the topic, and gave a quick rundown of some of the more popular "NoSQL" solutions, including CouchDB, MongoDB, Cassandra, Project Voldemort, Redis, and Tokyo Tyrant. He then benchmarked all of them against Postgres for various tasks - and did it against both "regular safe" Postgres and "running with scissors" fsync-off Postgres. The results? Postgres scales, very well, and more than holds its own against the NoSQL newcomers. MongoDB did surprisingly well: see the slides for the details. His slides also had the unfortunate portmanteau of "YeSQL", which only helps to emphasize how silly our "PostgreSQL" name is. :)

The next talk was Postgres (for non-Postgres people) by Greg Sabino Mullane (me!). Unlike previous years, my slides are already online. Yes, at first blush it seems a strange talk to give at a conference like this, but we always have a good number of people from other database systems who are considering Postgres, are in the process of migrating to Postgres, or are just new to Postgres. The talk was in three parts. The first was about the mechanics of migrating your application to Postgres: the data types Postgres uses, how we implement indexes, the best way to migrate your data, and many other things, with an eye towards common migration problems (especially when coming from MySQL). The second part discussed some of the quirks of Postgres that people coming from DB2, Oracle, etc. should be aware of. Some things discussed: how Postgres does MVCC and the need for vacuum, our really smart planner and lack of hints, the automatic (and against the spec) lowercasing, and our concept of schemas. I also touched on what I see as some of our drawbacks: tuned for a toaster, no true in-place upgrade, the unpronounceable name, the lack of marketing; and on some of our perceived-but-not-real drawbacks: lack of replication, poor speed. What would a list of drawbacks be without a list of strengths? Transactional DDL, a very friendly and helpful community, PostGIS, authentication options, an awesome query planner, the ability to create your own custom database objects, and our distributed nature that ensures the project cannot be bought out or destroyed. The last part of the talk went over the Postgres project itself: the community, the developers, the philosophy, and how it all fits together. I ran out of time so did not get to tell my "longest patch process ever" story for \dfS (six years!), but I don't think I missed anything important, and there was still time for some questions.

The next talk was Hypothetical Indexes towards self-tuning in PostgreSQL by Sergio Lifschitz. In the words of Sergio:

Hypothetical indexes are simulated index structures created solely in the database catalog. This type of index has no physical extension and, therefore, cannot be used to answer actual queries. The main benefit is to provide a means for simulating how query execution plans would change if the hypothetical indexes were actually created in the database. This feature is quite useful for database tuners and DBAs.

It was a very interesting talk. Robert Haas asked him to release it under the PostgreSQL license so we can easily pull it into the project as needed. Sergio promised to make the change immediately after the talk!

After lunch, the next talk was pg_statsinfo - More useful statistics information for DBAs by Tatsuhito Kasahara. This talk was a little hard to follow, but it had some interesting ideas about monitoring Postgres, a lot of which overlapped with some of my projects such as tail_n_mail and check_postgres.

The next talk was Forensic Analysis of Corrupted Databases by Greg Stark. This was a neat little talk; many of the error messages he displayed were all too familiar to me. It was a nice overview of how to track down the exact location of a problem in a corrupted database, and some strategies for fixing it, including the old "using dd to write things from /dev/zero directly into your Postgres files" trick. There was even a discussion about the possibility of zeroing out specific parts of a page header, with the consensus being that it would not work as one would hope.

After a quick hacky sack break with Robert Treat and some Canadian locals, I went to the final real talk of the day: The PostgreSQL Query Planner by Robert Haas. I had seen this talk recently, but wanted to see it again as I missed some of the beginning of the talk when I saw it at Pg East 2010 in Philly. Robert gave a good talk, and was very good at repeating the audience's questions. I didn't learn all that much, but it was a very good overview of the planner, including some of the new planner tricks (such as join removal) in 9.0 and 9.1.

After that, the lightning talks started. I really like lightning talks, and thankfully they weren't held on the last day of the conference this time (a common mistake). The MC was Selena Deckelmann, who did a great job of making sure all the slides were gathered up beforehand, and strictly enforced the five minute time limit. The list of slides is on the Postgres wiki. I talked on my latest favorite project, tail_n_mail - the slides are available on the wiki. I didn't make it through all my slides, so if you were at the talks, check out the PDF for the final two that were not shown. There seemed to be good interest in the project, and I had several people tell me afterwards they would try it out.

The night ended with the EnterpriseDB sponsored party. I spoke to a lot of people there, about replication, PITR scripts, log monitoring, the problem with a large number of inherited objects, and many other topics. Note to EDB: I don't think that venue is going to scale, as the conference gets bigger each year! The total number of people at the conference this year was 184, a new record.

A very good first day: I learned a lot, met new people, saw old friends, and hopefully sold Postgres to some non-Postgres people :). I also managed to git push some changes to tail_n_mail, check_postgres, and Bucardo. It's hard to say no to feature requests when someone asks you in person. :)

Learn more about End Point's Postgres Support, Development, and Consulting.

PostgreSQL switches to Git

Looks like the Postgres project is finally going to bite the bullet and switch to git as the canonical VCS. Some details are yet to be hashed out, but the decision has been made and a new repo will be built soon. Now to lobby to get that commit-with-inline-patches list created...

PostgreSQL 8.4 on RHEL 4: Teaching an old dog new tricks

So a client has been running a really old version of PostgreSQL in production for a while. We finally got the approval to upgrade them from 7.3 to the latest 8.4. Considering the age of the installation, it should come as little surprise that they had been running a similarly ancient OS: RHEL 4.

Like the installed PostgreSQL version, RHEL 4 is ancient -- 5 years old. I anticipated that in order to get us to a current version of PostgreSQL, we'd need to resort to a source build or rolling our own PostgreSQL RPMs. Neither approach was particularly appealing.

While the age/decrepitude of the current machine's OS came as little surprise, what did come as a surprise was that there were supported RPMs available for RHEL 4 in the community yum rpm repository, located at http://yum.pgrpms.org/8.4/redhat/rhel-4-i386/repoview/ (modulo your architecture of choice).

In order to get things installed, I followed the instructions for installing the specific yum repo. There were a few seconds where I was confused because the installation command was giving a "permission denied" error when attempting to install the 8.4 PGDG rpm as root. A little brainstorming and an lsattr later revealed that a previous administrator, apparently in the quest for über-security, had performed a chattr +i on the /etc/yum.repos.d directory.

Evil having been thwarted, in the interest of über-usability I did a quick chattr -i /etc/yum.repos.d and installed the PGDG rpm. Away we went. From that point, the install was completely straightforward; I had a PostgreSQL 8.4.4 system running in no time, and could finally get off that 7.3 behemoth. Now to talk my way into an OS upgrade...

Learn more about End Point's Postgres Support, Development, and Consulting.

Finding the PostgreSQL version - without logging in!

Metasploit used the error messages given by a PostgreSQL server to find out the version without actually having to log in and issue a "SELECT version()" command. The original article is at http://blog.metasploit.com/2010/02/postgres-fingerprinting.html and is worth a read. I'll wait.

The basic idea is that because version 3 of the Postgres protocol gives you the file and the line number in which the error is generated, you can use the information to figure out what version of Postgres is running, as the line numbers change from version to version. In effect, each version of Postgres reveals enough in its error message to fingerprint it. This was a neat little trick, and I wanted to explore it more myself. The first step was to write a quick Perl script to connect and get the error string out. The original Metasploit script focuses on failed login attempts, but after some experimenting I found an easier way was to send an invalid protocol number (Postgres expects "2.0" or "3.0"). Sending a startup packet with an invalid protocol of "3.1" gave me back the following string:

E|SFATALC0A000Munsupported frontend protocol 3.1: 
server supports 1.0 to 3.0Fpostmaster.cL1507RProcessStartupPacket

The important part of the string was the parts indicating the file and line number:

Fpostmaster.cL1507

In this case, we can clearly see that line 1507 of postmaster.c was throwing the error. After firing up a few more versions of Postgres and recording the line numbers, I found that all versions since 7.3 were hitting the same chunk of code from postmaster.c:

/* Check we can handle the protocol the frontend is using. */

if (PG_PROTOCOL_MAJOR(proto) < PG_PROTOCOL_MAJOR(PG_PROTOCOL_EARLIEST) ||
  PG_PROTOCOL_MAJOR(proto) > PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) ||
  (PG_PROTOCOL_MAJOR(proto) == PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) &&
   PG_PROTOCOL_MINOR(proto) > PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST)))
  ereport(FATAL,
  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
    errmsg("unsupported frontend protocol %u.%u: server supports %u.0 to %u.%u",
      PG_PROTOCOL_MAJOR(proto), PG_PROTOCOL_MINOR(proto),
      PG_PROTOCOL_MAJOR(PG_PROTOCOL_EARLIEST),
      PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST),
      PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST))));

Line numbers were definitely different across major versions of Postgres (e.g. 8.2 vs. 8.3), and were even different sometimes across revisions. Rather than fire up every possible revision of Postgres and run my program against it, I simply took advantage of the cvs tags (aka symbolic names) and did this:

cvs update -rREL8_3_0 -p postmaster.c | grep -Fn 'LATEST))))'

This showed me that the string occurred on line 1497 of postmaster.c. I created a Postgres instance and verified that the line number was the same. At that point, it was a simple matter of making a bash script to grab all releases since 7.3 and build up a comprehensive list of when that line changed from version to version.

Once that was done, I rolled the whole thing up into a new Perl script called "detect_postgres_version.pl". Here's the script, broken into pieces for explanation. A link to the entire script is at the bottom of the post.

First, we do some standard Perl script things and read in the __DATA__ section at the bottom of the script, which lists at which version the message has changed:

#!/usr/bin/env perl

## Quickly and roughly determine what version of Postgres is running
## greg@endpoint.com

use strict;
use warnings;
use IO::Socket;
use Data::Dumper;
use Getopt::Long;

## __DATA__ looks like this: filename / line / version when it changed
## postmaster.c 1287 7.4.0
## postmaster.c 1293 7.4.2
## postmaster.c 1293 7.4.29
##
## postmaster.c 1408 8.0.0
## postmaster.c 1431 8.0.2

## Build our hash of file-and-line to version matches
my %map;
my ($last,$lastmin,$lastline) = ('',0,0);
while (<DATA>) {
   next if $_ !~ /(\w\S+)\s+(\d+)\s+(.+)/;
   my ($file,$line,$version) = ($1,$2,$3);
   die if $version !~ /(\d+)\.(\d+)\.(\d+)/;
   my ($vmaj,$vmin,$vrev) = ($1,$2,$3);
   my $current = "$file|$vmaj|$vmin";
   if ($current eq $last) {
       my ($lfile,$lmaj,$lmin) = split /\|/ => $last;
       for (my $x = $lastmin+1 ; $x<$vrev; $x++) {
           push @{$map{$file}{$lastline}}
             => ["$lmaj.$lmin","$lmaj.$lmin.$x"];
       }
   }
   push @{$map{$file}{$line}} => ["$vmaj.$vmin",$version];
   $last = $current;
   $lastmin = $vrev;
   $lastline = $line;
}

Next, we allow a few options to the script: port and host. We'll default to a Unix socket if the host is not set, and default to port 5432 if none is given:

## Read in user options and set defaults
my %opt;
GetOptions(\%opt,
          'port=i',
          'host=s',
);

my $port = $opt{port} || 5432;
my $host = $opt{host} || '';

We're ready to connect, using the very standard IO::Socket module. If the host starts with a slash, we assume this is the unix_socket_directory and replace the default '/tmp' location:

## Start the connection, either unix or tcp
my $server;
if (!$host or !index $host, '/') {
   my $path = $host || '/tmp';
   $server = IO::Socket::UNIX->new(
       Type => IO::Socket::SOCK_STREAM,
       Peer => "$path/.s.PGSQL.$port",
   ) or die "Could not connect!: $@";
}
else {
   $server = IO::Socket::INET->new(
       PeerAddr => $host,
       PeerPort => $port,
       Proto    => 'tcp',
       Timeout  => 3,
   ) or warn "Could not connect!: $@";
}

Now we're ready to actually send something over our new socket. Postgres expects the startup packet to be in a certain format. We'll follow that format, but send it an invalid protocol number, 3.1. The rest of the information does not really matter, but we'll also tell it we're connecting as user "pg". Finally, we read back in the message, extract the file and line number, and spit them back out to the user:

## Build and send the packet
my $packet = pack('nn', 3,1) . "user\0pg\0\0";
$packet = pack('N', length($packet) + 4). $packet;
$server->send($packet, 0);

## Get the message back and extract the filename and line number
my $msg;
recv $server, $msg, 1000, 0;
if ($msg !~ /F([\w\.]+)\0L(\d+)/) {
   die "Could not find a file and line from error message: $msg\n";
}

my ($file,$line) = ($1,$2);

print "File: $file Line: $line\n";

Finally, we try to map the file name and line number we received back to the version of PostgreSQL it came from. If the file is not recognized, or the line number is not known, we bail out early:

$map{$file}
   or die qq{Sorry, I do not know anything about the file "$file"\n};

$map{$file}{$line}
   or die qq{Sorry, I do not know anything about line $line of file "$file"\n};

If there is only one result for this line and file number, we can state what it is and exit.

my $result = $map{$file}{$line};

if (1 == @$result) {
   print "Most likely Postgres version $result->[0][1]\n";
   exit;
}

In most cases, though, we don't know the exact version down to the revision after the second dot, so we'll state what the major version is, and all the possible revisions:

## Walk through and figure out which versions it may be.
## For now, we know that the major version does not overlap
print "Most likely Postgres version $result->[0][0]\n";
print "Specifically, one of these:\n";

for my $row (@$result) {
   print "  Postgres version $row->[1]\n";
}

exit;

The only thing left is the DATA section, which I'll show here to be complete:

__DATA__

## Format: filename line version

postmaster.c 1167 7.3.0
postmaster.c 1167 7.3.21

postmaster.c 1287 7.4.0
postmaster.c 1293 7.4.2
postmaster.c 1293 7.4.29

postmaster.c 1408 8.0.0
postmaster.c 1431 8.0.2
postmaster.c 1441 8.0.5
postmaster.c 1445 8.0.6
postmaster.c 1439 8.0.7
postmaster.c 1443 8.0.9
postmaster.c 1445 8.0.14
postmaster.c 1445 8.0.25

postmaster.c 1449 8.1.0
postmaster.c 1450 8.1.1
postmaster.c 1454 8.1.2
postmaster.c 1448 8.1.3
postmaster.c 1452 8.1.4
postmaster.c 1448 8.1.9
postmaster.c 1454 8.1.10
postmaster.c 1454 8.1.21

postmaster.c 1432 8.2.0
postmaster.c 1437 8.2.1
postmaster.c 1440 8.2.5
postmaster.c 1432 8.2.17

postmaster.c 1497 8.3.0
postmaster.c 1507 8.3.8
postmaster.c 1507 8.3.11

postmaster.c 1570 8.4.0
postmaster.c 1621 8.4.1
postmaster.c 1621 8.4.4

postmaster.c 1664 9.0.0

(Because version 9.0 is not released yet, its line number may still change.)

I found this particular protocol error to be a good one because there is no overlap of line numbers across major versions. Of the approximately 125 different versions released since 7.3.0, only 6 are unique enough to identify to the exact revision. That's okay for this iteration of the script. If you wanted to know the exact revision, you could try other errors, such as an invalid login, as the metasploit code does.

The complete code can be read here: detect_postgres_version.pl

I'll be giving a talk later on this week at PgCon 2010, so say hi if you see me there. I'll probably be giving a lightning talk as well.

Learn more about End Point's Postgres Support, Development, and Consulting.

Thrillist Buys JackThreads

We were excited to learn yesterday that "flash sale" site JackThreads was acquired by Thrillist. Congratulations!

End Point has provided technical assistance for the members-only JackThreads store in numerous ways: provisioning hosting, setting up automated development environments, improving performance for scalability, integrating with third-party systems, and various development projects.

Though the businesses and technology are completely unrelated, End Point has also helped develop and support the growth of Backcountry.com's several trailblazing "one deal at a time" (ODAT) sites, starting with SteepandCheap.com.

The "flash sale" business model is a lot of fun and opens up new opportunities for retail and technical innovation alike, and we look forward to more work in this area.

Continuing an interrupted git-svn clone

I've run into this issue before when using git-svn to clone a large svn repo: something interrupts the transfer, and you end up having to restart the git-svn clone process. Attempting to git-svn clone from a partially transferred svn clone directory results in error messages from git-svn, and it's not immediately clear what you need to do to pick the process back up where you left off.

In the past I've just blown away the partially-transferred repo and started the clone over, but that's a waste of time and server resources, not to mention extremely frustrating, particularly if you're substantially into the clone process.

Fortunately, this is not necessary; just go into your partially retrieved git-svn repo and execute git svn fetch. This continues fetching the svn revisions from where you left off. When the process completes, you will have an empty directory with just the .git directory present. Looking at git status shows all of the project files as deleted (oh noes!), but this is just misdirection. At this point, you just need to issue a git reset --hard to check out the files in the HEAD commit.

More illustratively:

$ git svn clone http://svn.example.com/project/trunk project
# download, download, download, break!
$ cd project; ls -a
.git
$ git svn fetch
# download, download, download, success!
$ ls -a
.git
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       deleted:    foo.c
#       deleted:    foo.h
#
$ git reset --hard; ls -a1
.git
foo.c
foo.h

Using PostgreSQL Hooks

PostgreSQL is well known for its extensibility; users can build new functions, operators, data types, and procedural languages, among others, without having to modify the core PostgreSQL code. Less well known is PostgreSQL's extensive set of "hooks", available to the more persistent coder. These hooks allow users to interrupt and modify behavior in all kinds of places without having to rebuild PostgreSQL.

Few if any of these hooks appear in the documentation, mostly because the code documents them quite well, and anyone wanting to use them is assumed already to be sufficiently familiar with the code to find the information they'd need to use one. For those interested in getting started using hooks, though, an example can be useful. Fortunately, the contrib source provides one, in the form of passwordcheck, a simple contrib module that checks users' passwords for sufficient strength. These checks include having a length greater than 8 characters, being distinct from the username, and containing both alphabetic and non-alphabetic characters. It can also use CrackLib for more intense password testing, if built against the CrackLib code.

In general, these hooks consist of global function pointers of a specific type, which are initially set to NULL. Whenever PostgreSQL wants actually to use a hook, it checks the function pointer, and if it's not NULL, calls the function it points to. When someone implements a hook, they write a function of the proper type and an initialization function to set the function pointer variable. They then package the functions in a library, and tell PostgreSQL to load the result, often using shared_preload_libraries.

For our example, the important pieces of the PostgreSQL code are in src/backend/commands/user.c and src/include/commands/user.h. First, we need a function pointer type, which in this case is called check_password_hook_type:

typedef void (*check_password_hook_type)
   (const char *username, const char *password,
   int password_type, Datum validuntil_time,
   bool validuntil_null);

extern PGDLLIMPORT check_password_hook_type check_password_hook;

This says the check_password_hook will take arguments for user name, password, password type, and validity information (for passwords valid until certain dates). It also provides an extern declaration of the actual function pointer, called "check_password_hook".

The next important pieces of code are in src/backend/commands/user.c, as follows:

/* Hook to check passwords in CreateRole() and AlterRole() */
check_password_hook_type check_password_hook = NULL;

...which defines the function hook variable, and this:

if (check_password_hook && password)
  (*check_password_hook) (stmt->role, password,
      isMD5(password) ? PASSWORD_TYPE_MD5 : PASSWORD_TYPE_PLAINTEXT,
    validUntil_datum,
    validUntil_null);

...which actually uses the hook. Actually the hook is used twice, with identical code, once in CreateRole() and once in AlterRole(), so as to provide password checking in both places. (Insert D.R.Y. rant here).

In order to take advantage of this hook, the passwordcheck module needs to implement the hook function and set the check_password_hook variable to point to that function. First, passwordcheck.c needs to include a few things, including "commands/user.h" to get the definitions of check_password_hook and check_password_hook_type, and to call the PG_MODULE_MAGIC macro that every PostgreSQL shared library needs. Then it implements the password checking logic in a function called check_password():

static void
check_password(const char *username,
      const char *password,
      int password_type,
      Datum validuntil_time,
      bool validuntil_null)
{
/* Actual password checking logic goes here */
}

Note that this declaration matches the arguments described in the check_password_hook_type, above.

Now we need to ensure the check_password_hook variable points to this new check_password() function. When loading a shared library, PostgreSQL looks for a function defined in that library called _PG_init(), and runs it if it exists. In passwordcheck, the _PG_init() function is as simple as this:

void
_PG_init(void)
{
 /* activate password checks when the module is loaded */
 check_password_hook = check_password;
}

Other modules using hooks often check the hook variable for NULL before setting it, in case something else is already using the hook. For instance, the auto_explain contrib module does this in _PG_init() (note that auto_explain uses three different hooks):

prev_ExecutorStart = ExecutorStart_hook;
ExecutorStart_hook = explain_ExecutorStart;
prev_ExecutorRun = ExecutorRun_hook;
ExecutorRun_hook = explain_ExecutorRun;
prev_ExecutorEnd = ExecutorEnd_hook;
ExecutorEnd_hook = explain_ExecutorEnd;

auto_explain also resets the hook variables in its _PG_fini() function. Since unloading modules isn't yet supported, and thus _PG_fini() never gets called, this is perhaps unimportant, but it's good for the sake of thoroughness.

Back to passwordcheck. Having set the hook variable, all that remains is to get PostgreSQL to load this library. The easiest way to do that is to set shared_preload_libraries in postgresql.conf:

josh@eddie:~/devel/pgsrc/pg-eggyknap/contrib/passwordcheck$ psql
psql (9.0devel)
Type "help" for help.

5432 josh@josh# show shared_preload_libraries ;
 shared_preload_libraries 
--------------------------
 passwordcheck
(1 row)

Restarting PostgreSQL loads the library, proven as follows:

5432 josh@josh# create user badpass with password 'bad';
ERROR:  password is too short

There are hooks like this all over the PostgreSQL code base. Simply search for "_hook_type" to find such possibilities as these:

  • shmem_startup_hook: called when PostgreSQL initializes its shared memory segment
  • explain_get_index_name_hook: called when EXPLAIN looks up index names
  • planner_hook: runs when the planner begins, so plugins can monitor or even modify the planner's behavior
  • get_relation_info_hook: allows modifying or expanding the information PostgreSQL gets from the catalogs for a particular relation, including adding fake indexes

Learn more about End Point's Postgres Support, Development, and Consulting.

PostgreSQL template databases to restore to a known state

Someone asked on the mailing lists recently about restoring a PostgreSQL database to a known state for testing purposes. How to do this depends a little bit on what one means by "known state", so let's explore a few scenarios and their solutions.

First, let's assume you have a Postgres cluster with one or more databases that you create for developers or QA people to mess around with. At some point, you want to "reset" the database to the pristine state it was in before people started making changes to it.

The first situation is that people have made both DDL changes (such as ALTER TABLE ... ADD COLUMN) and DML changes (such as INSERT/UPDATE/DELETE). In this case, what you want is a complete snapshot of the database at a point in time, which you can then restore from. The easiest way to do this is to use the TEMPLATE feature of the CREATE DATABASE command.

Every time you run CREATE DATABASE, it uses an already existing database as the "template". Basically, it creates a copy of the template database you specify. If no template is specified, it uses "template1" by default, so that these two commands are equivalent:

CREATE DATABASE foobar;
CREATE DATABASE foobar TEMPLATE template1;

Thus, if we want to create a complete copy of an existing database, we simply use it as a template for our copy:

CREATE DATABASE mydb_template TEMPLATE mydb;

Then, when we want to restore the mydb database to the exact state it was in when we ran the above command, we simply do:

DROP DATABASE mydb;
CREATE DATABASE mydb TEMPLATE mydb_template;

You may want to make sure that nobody changes your new template database. One way to do this is to prevent non-superusers from connecting to it by setting the connection limit to zero. This can be done either at creation time or afterwards, like so:

CREATE DATABASE mydb_template TEMPLATE mydb CONNECTION LIMIT 0;

ALTER DATABASE mydb_template CONNECTION LIMIT 0;

You may want to go further by granting the database official "template" status by adjusting the datistemplate column in the pg_database table:

UPDATE pg_database SET datistemplate = TRUE WHERE datname = 'mydb_template';

This will allow anyone to use the database as a template, as long as they have the CREATEDB privilege. You can also restrict *all* connections to the database, even superusers, by adjusting the datallowconn column:

UPDATE pg_database SET datallowconn = FALSE WHERE datname = 'mydb_template';

Another way to restore the database to a known state is to use the pg_dump utility to create a file, then use psql to restore that database. In this case, the command to save a copy would be:

pg_dump mydb --create > mydb.template.pg

The --create option tells pg_dump to create the database itself as the first command in the file. If you look at the generated file, you'll see that it uses template0 as the template database in this case. Why does Postgres have both template0 and template1? The template1 database is meant as a user-configurable template that you can make changes to which will be picked up by all future CREATE DATABASE commands (a common example is a CREATE LANGUAGE command). The template0 database, on the other hand, is meant as a "hands off, don't ever change it" stable database that can always safely be used as a template, with no changes from when the cluster was first created. To that end, you are not even allowed to connect to the template0 database (thanks to the datallowconn column mentioned earlier).

Now that we have a file (mydb.template.pg), the procedure to recreate the database becomes:

psql -X -c 'DROP DATABASE mydb'

psql -X --set ON_ERROR_STOP=on --quiet --file mydb.template.pg

We use the -X argument to ensure we don't have any surprises lurking inside of psqlrc files. The --set ON_ERROR_STOP=on option tells psql to stop processing the moment it encounters an error, and the --quiet tells psql to not be verbose and only let us know about very important things. (While I normally advocate using the --single-transaction option as well, we cannot in this case as our file contains a CREATE DATABASE line).


What if (as someone posited in the thread) the original poster really wanted only the *data* to be cleaned out, and not the schema (i.e. the DDL)? In this case, what we want to do is remove all rows from all tables. The easiest way to do this is with the TRUNCATE command, of course. Because we don't want to worry about which tables need to be deleted before others because of foreign key constraints, we'll also use the CASCADE option to TRUNCATE. We'll query the system catalogs for a list of all user tables, generate truncate commands for them, and then play back the commands we just created. First, we create a simple text file containing commands to truncate all the tables:

SELECT 'TRUNCATE TABLE '
 || quote_ident(nspname)
 || '.'
 || quote_ident(relname)
 || ' CASCADE;'
FROM pg_class
JOIN pg_namespace n ON (n.oid = relnamespace)
WHERE nspname !~ '^pg'
AND nspname <> 'information_schema'
AND relkind = 'r';

Once that's saved as truncate_all_tables.pg, resetting the database by removing all rows from all tables becomes as simple as:

psql mydb -X -t -f truncate_all_tables.pg | psql mydb --quiet

We again use the --quiet option to limit the output, as we don't need to see a string of "TRUNCATE TABLE" strings scroll by. The -t option (also written as --tuples-only) prevents the headers and footers from being output, as we don't want to pipe those back in.

It's most likely you'd also want the sequences to be reset to their starting point as well. While sequences generally start at "1", we'll take out the guesswork by using the "ALTER SEQUENCE seqname RESTART" syntax. We'll append the following SQL to the text file we created earlier:

SELECT 'ALTER SEQUENCE '
 || quote_ident(nspname)
 || '.'
 || quote_ident(relname)
 || ' RESTART;'
FROM pg_class
JOIN pg_namespace n ON (n.oid = relnamespace)
WHERE nspname !~ '^pg'
AND nspname <> 'information_schema'
AND relkind = 'S';

The command is run the same as before, but now in addition to table truncation, the sequences are all reset to their starting values.


A final way to restore the database to a known state is a variation on the previous pg_dump command. Rather than save the schema *and* data, we simply want to restore the database without any data:

## Create the template file:
pg_dump mydb --schema-only --create > mydb.template.schemaonly.pg

## Restore it:
psql -X -c 'DROP DATABASE mydb'
psql -X --set ON_ERROR_STOP=on --file mydb.template.schemaonly.pg

Those are a few basic ideas on how to reset your database. There are a few limitations that got glossed over, such as that nobody can be connected to the database that is being used as a template for another one when the CREATE DATABASE command is being run, but this should be enough to get you started.

Learn more about End Point's Postgres Support, Development, and Consulting.

Tail_n_Mail does Windows (log file monitoring)

I've just released version 1.10.1 of tail_n_mail.pl, the handy script for watching over your Postgres logs and sending email when interesting things happen.

Much of the recent work on tail_n_mail has been in improving the parsing of statements in order to normalize them and give reports like this:

[1] From files A to Q Count: 839
First: [A] 2010-05-08T05:10:46-05:00 alpha postgres[13567]
Last:  [Q] 2010-05-09T05:02:27-05:00 bravo postgres[19334]
ERROR: duplicate key violates unique constraint "unique_email_address"
STATEMENT: INSERT INTO email_table (id, email, request, token) VALUES (?)

[2] From files C to E (between lines 12523 of A and 268431 of B, occurs 6159 times)                                          
First: [C] 2010-05-04 16:32:23 UTC [22504]                                                                                    
Last:  [E] 2010-05-05 05:04:53 UTC [23907]                                                                                    
ERROR: invalid byte sequence for encoding "UTF8": 0x????
HINT: This error can also happen if the byte sequence does not 
match the encoding expected by the server, which is controlled 
by "client_encoding".

## The above examples are from two separate instances, the first 
## of which has the "find_line_number" option turned off

However, I've only ever used tail_n_mail on Linux-like systems, and it did not work on Windows systems...until now. Thanks to an error report and patch from Paulo Saudin, the program now works on Windows. There is a new option, mailmode, which defaults to 'sendmail' for the same behavior as previous versions of tail_n_mail. This assumes you have access to a sendmail binary (which may or may not be from the actual Sendmail program: many mail programs provide a compatible binary of the same name). If you don't have sendmail, you can now pass 'smtp' as the mailmode argument (or simply use --smtp). This switches to using the Net::SMTP::SSL module to send the mail instead of sendmail.

Switching the mailmode is not enough, of course, so there are some additional flags to help the mail go out:

  • --mailserver : the name of the outgoing SMTP server
  • --mailuser : the user to authenticate with
  • --mailpass : the password of the user
  • --mailport : the port to use: defaults to 465

Needless to say, using the --mailpass option from the command line or even in a script is not the best practice, so it is highly recommended that you put the new variables inside a tailnmailrc file. When the script starts, it looks for a file named .tailnmailrc in the current directory. If that is not found, it looks for the same file in your home directory (or technically, whatever the HOME environment variable is set to). If that does not exist, it checks for the file /etc/tailnmailrc. You can override those checks by specifying the file directly with the --tailnmailrc= option, or disable all rc files with the --no-tailnmailrc option.

The tailnmailrc file is very straightforward: each line is a name and value pair, separated by a colon or an equal sign. Lines starting with a '#' indicate a comment and are skipped. So someone using the new Net::SMTP::SSL method might have a .tailnmailrc in their home directory that looks like this:

mailmode=smtp
mailserver=mail.example.com
mailuser=greg@example.com
mailpass=mysupersekretpassword

The tail_n_mail program is open source and BSD licensed. Contributions are always welcome: send a patch, or fork a version through the Github mirror. There is also a Bugzilla system to accept bug reports and feature requests.

Learn more about End Point's Postgres Support, Development, and Consulting.

Cassandra, Thrift, and Fibers in EventMachine

I've been working with Cassandra and EventMachine lately, in an attempt to maximize write throughput for bulk loading situations (and I would prefer to not abandon the pretty Ruby classes I have fronting Cassandra, hence EventMachine rather than hopping over to Java or Scala).

The Thrift client transport for EventMachine requires the use of fibers. The documentation available for how fibers and EventMachine interact is not all that clear just yet, so perhaps documenting my adventures will be of use to somebody else.

A single fiber is traditionally imperative

EventMachine puts the I/O on background threads, but your code interacts with the I/O interface as if it were a traditional blocking operation.

#!/usr/bin/env ruby

require 'eventmachine'
require 'thrift_client'
require 'thrift_client/event_machine'
require 'cassandra'

def get_client 
  Cassandra.new('Keyspace1',
                '127.0.0.1:9160',
                :transport_wrapper => nil,
                :transport         => Thrift::EventMachineTransport)
end

def write(client, key, hash)
  puts "Writing #{key}."
  client.insert('Standard1', key, hash)
  puts "Wrote #{key}."
end

EM.run do
  Fiber.new do
    client = get_client
    write(client, 'foo', {'aard' => 'vark'})
    write(client, 'bar', {'platy' => 'pus'})
    EM.stop
  end.resume
end

The Thrift::EventMachine transport performs the actual Thrift network operations (connecting, sending data, receiving data) on a fiber in one of EventMachine's background threads. But it manages the callbacks and errbacks internally, so the client behaves in the usual blocking manner and does not expose the asynchronous delights going on behind the scenes.

Therefore, in the code snippet above, the "foo" row will be inserted first, and then the "bar" row. Every time. The output always is:

Writing foo.
Wrote foo.
Writing bar.
Wrote bar.

The above snippet is contrived, but it makes an important point: given two or more Thrift operations (like Cassandra inserts) that are logically independent of each other, such that their order does not matter, you're not necessarily gaining much by running those operations in the same fiber.

For concurrency, use multiple fibers

Now let's replace the above code sample's EM.run block with this:

EM.run do
  @done = 0 
  Fiber.new do
    write(get_client, 'foo', {'aard' => 'vark'})
    @done += 1
  end.resume
  Fiber.new do
    write(get_client, 'bar', {'platy' => 'pus'})
    @done += 1                 
  end.resume                   
  EM.add_periodic_timer(1) { EM.stop if @done == 2 } 
end

You don't know exactly how this will play out, but the typical output demonstrates the concurrent operation of the two fibers involved:

Writing foo.
Writing bar.
Wrote foo.
Wrote bar.

If we were writing a larger number of rows out to Cassandra, we could expect to see a greater variety of interleaving between the respective fibers.

Note a critical difference between the two examples. In the single-fiber example, we issue the EM.stop as the final step of the fiber. Because the single fiber proceeds serially, this makes sense. In the multi-fiber example, things run asynchronously, so we have no way of knowing for sure which fiber will complete first. Consequently, it's necessary to have some means of signaling that the work is done and EventMachine can stop; in this admittedly simple example, the @done instance variable acts as that flag. In a more rigorous version, you might use a queue and its size to organize such things, as in the rough sketch below.
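
Here is one way that might look, as a rough sketch rather than production code: many independent rows, one fiber each, with an EM::Queue standing in for the @done counter.

# Assumes the same requires, get_client, and write helpers as the first example.
EM.run do
  finished = EM::Queue.new
  rows = (1..100).map { |i| ["key#{i}", {'col' => "val#{i}"}] }
  rows.each do |key, hash|
    # One client and one fiber per row, mirroring the earlier example
    Fiber.new do
      write(get_client, key, hash)
      finished.push(key)   # record that this fiber finished its write
    end.resume
  end
  # Stop once every row has been reported as written
  EM.add_periodic_timer(0.5) { EM.stop if finished.size == rows.size }
end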

PostgreSQL startup Debian logging failure

I ran into trouble debugging why a fresh PostgreSQL replica wasn't starting on Debian. The cluster used a highly customized postgresql.conf, with a custom logging location, data_directory, and so on.

The system log files showed nothing from the failed pg_ctlcluster invocation, nor was there any information in /var/log/postgresql/ or the defined log_directory.

I was able to successfully create a new cluster with pg_createcluster and see logs for the new cluster in /var/log/postgresql/. The pg_lsclusters utility showed both clusters in its listing, but the initial cluster was still down, listed with its custom log location. After reviewing the Debian wrapper scripts (fortunately written in Perl), I disabled log_filename, log_directory, and logging_collector, leaving log_destination = stderr. I was then finally able to get log information spit out to the terminal.
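
For reference, the temporary change amounted to something like this in postgresql.conf (the commented-out paths are placeholders, not the actual settings from this system):

log_destination = 'stderr'
#logging_collector = on
#log_directory = '/var/log/postgresql/custom'
#log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'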

In this case, it was due to a fresh Amazon EC2 instance lacking appropriate sysctl.conf settings for kernel.shmmax and kernel.shmall. This particular error occurred before the logging was fully set up, which is why we did not get logging information in the postgresql.conf-designated location.
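
The fix was a pair of settings along these lines in /etc/sysctl.conf, loaded with sysctl -p. The values here are only illustrative and should be sized for your instance and shared_buffers:

# Largest single shared memory segment, in bytes (4 GB in this sketch)
kernel.shmmax = 4294967296
# Total shared memory allowed, in 4 kB pages (also 4 GB here)
kernel.shmall = 1048576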

Once I had the log information, it was a short matter to correct the issue. It just goes to show that often finding the problem is 90% of the work. Hopefully this comes in handy to someone else.

Tickle me Postgres: Tcl inside PostgreSQL with pl/tcl and pl/tclu

Although I really love pl/perl and find it the most useful language to write PostgreSQL functions in, Postgres has long had another pair of procedural languages: pl/tcl and pl/tclu. The Tcl language is pronounced "tickle", so those two are pronounced "pee-el-tickle" and "pee-el-tickle-you". The pl/tcl languages have been around longer than any of the others, even pl/perl; for a long time in the early days of Postgres, using pl/tclu was the only way to do things "outside of the database", such as making system calls, writing files, or sending email.

Sometimes people are surprised when they hear I still use Tcl. Although it's not as widely mentioned as other procedural languages, it's a very clean, easy to read, powerful language that shouldn't be overlooked. Of course, with Postgres you have a wide variety of languages to write your functions in, from the in-core pl/pgsql, pl/perl, pl/tcl, and pl/python to external projects such as PL/R and PL/Java.

The nice thing about Tcl is that not only is it an easy language to write in, it's fully supported by Postgres. Besides the default pl/pgsql, only three languages are maintained inside the Postgres tree itself: Perl, Tcl, and Python. Only two of those have both a trusted and an untrusted version: Perl and Tcl. An untrusted language can do things like make system calls; to be a trusted language, there must be some way to limit what the code can do. With Perl, this is accomplished through the "Safe" Perl module. For Tcl, it is accomplished by having two versions of the Tcl interpreter: a normal one for pltclu, and a separate one using the Safe Tcl mechanism for pltcl.
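
As a quick illustration, here is a hypothetical function (not part of the walk-through below) showing the difference: the untrusted pltclu can reach outside the database, while the same body declared with LANGUAGE pltcl fails when called, because the safe interpreter hides commands such as exec. Note that only a superuser can create functions in an untrusted language.

CREATE FUNCTION backend_os_user()
  RETURNS TEXT
  LANGUAGE pltclu
AS $$
  ## Ask the operating system which user the backend process runs as
  return [exec whoami]
$$;

SELECT backend_os_user();   -- typically 'postgres'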

Let's take a quick look at what a pltcl function looks like. We'll use pl/tcl to tackle the common complaint that "SELECT COUNT(*) is very slow" by keeping a running row count with triggers. We'll start with a sample table whose exact row count we want to be able to look up at any time, without suffering the delay of COUNT(*). Here's the table definition, and a quick command to populate it with some dummy data:

CREATE SEQUENCE customer_id_seq;

CREATE TABLE customer (
  id      INTEGER     NOT NULL DEFAULT nextval('customer_id_seq') PRIMARY KEY,
  email   TEXT            NULL,
  address TEXT            NULL,
  cdate   TIMESTAMPTZ NOT NULL DEFAULT now()
);

INSERT INTO customer (email, address)
  SELECT 'jsixpack@example.com', '123 Main Street'
  FROM generate_series(1,10000);

A quick review: we create a sequence for the table to use in populating its primary key, the 'id' column. Each customer also has an optional email and address, and we automatically record when each row was created by using the "DEFAULT now()" trick on the 'cdate' column. Finally, we use the super handy generate_series function to populate the new table with ten thousand rows of data.

Next, we'll create a helper table that will keep track of the rows for us. We'll make it generic so that it can track any number of tables:

CREATE TABLE table_count (
  schemaname TEXT   NOT NULL,
  tablename  TEXT   NOT NULL,
  rows       BIGINT NOT NULL DEFAULT 0
);

INSERT INTO table_count(schemaname,tablename,rows)
  SELECT 'public', 'customer', count(*) FROM customer;

We also populated it with the current number of rows in customer. Of course, this will be out of date as soon as someone updates the table, so let's add our triggers. We don't want to update the table_count table on every single row change, but only at the end of each statement. To do that, we'll make a row-level trigger that stores up the changes inside a global variable, and then a statement-level trigger that uses the global variable to update the table_count table.

CREATE FUNCTION update_table_count_row()
  RETURNS TRIGGER
  SECURITY DEFINER
  VOLATILE
  LANGUAGE pltcl
AS $BC$

  ## Declare tablecount as a global variable so other functions
  ## can access our changes
  variable tablecount

  ## Set the local count of rows changed to 0
  set rows 0

  ## $TG_op indicates what type of command was just run
  ## Modify the local variable rows depending on what we just did
  switch $TG_op {
    INSERT {
      incr rows 1
    }
    UPDATE {
      ## No change in number of rows
      ## We could also leave out the ON UPDATE from the trigger below
    }
    DELETE {
      incr rows -1
    }
  }

  ## The tablecount variable will be an associative array
  ## The index will be this table's name, the value is the rows changed
  ## We should probably be using $TG_table_schema as well, but we'll ignore that

  ## If there is no variable for this table yet, create it, otherwise just change it
  if {![ info exists tablecount($TG_table_name) ] } {
    set tablecount($TG_table_name) $rows
  } else {
    incr tablecount($TG_table_name) $rows
  }

  return OK
$BC$;

CREATE FUNCTION update_table_count_statement()
  RETURNS TRIGGER
  SECURITY DEFINER
  LANGUAGE pltcl
AS $BC$

  ## Make sure we access the global version of the tablecount variable
  variable tablecount

  ## If it doesn't exist yet (for example, when an update changes no 
  ## rows), we simply exit early without making changes
  if { ! [ info exists tablecount ] } {
    return OK
  }
  ## Same logic if our specific entry in the array does not exist
  if { ! [ info exists tablecount($TG_table_name) ] } {
    return OK
  }
  ## If no rows were changed, we simply exit
  if { $tablecount($TG_table_name) == 0 } {
    return OK
  }

  ## Update the table_count table: may be a positive or negative shift
  spi_exec "
    UPDATE table_count
    SET rows=rows+$tablecount($TG_table_name)
    WHERE tablename = '$TG_table_name'
  "

  ## Reset the global variable for the next round
  set tablecount($TG_table_name) 0

  return OK
$BC$;

CREATE TRIGGER update_table_count_row
  AFTER INSERT OR UPDATE OR DELETE
  ON public.customer
  FOR EACH ROW
  EXECUTE PROCEDURE update_table_count_row();

CREATE TRIGGER update_table_count_statement
  AFTER INSERT OR UPDATE OR DELETE
  ON public.customer
  FOR EACH STATEMENT
  EXECUTE PROCEDURE update_table_count_statement();

(Caveat: because all pl/tcl functions in a session share a single Tcl interpreter, and the tablecount variable is not under transaction control, these functions are not 100% safe: another pl/tcl function using the same global variable name, or an error at just the wrong moment, could leave the count slightly out of sync. In practice, this is unlikely.)

If everything is working correctly, we should see the entries in the table_count table match up with the output of SELECT COUNT(*). Let's take a look via a psql session:

psql=# \t
Showing only tuples.
psql=# \a
Output format is unaligned.

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|10000
10000

psql=# UPDATE customer SET email=email WHERE id <= 10;
UPDATE 10

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|10000
10000

psql=# INSERT INTO customer (email, address)
psql-# SELECT email, address FROM customer LIMIT 4;
INSERT 0 4

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|10004
10004

psql=# DELETE FROM customer WHERE id <= 10;
DELETE 10

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|9994
9994

psql=# TRUNCATE TABLE customer;
TRUNCATE TABLE

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|9994
0

Whoops! Everything matched up until that TRUNCATE. On earlier versions of Postgres, there was no way around that problem, but if we have Postgres version 8.4 or better, we can use truncate triggers!

CREATE FUNCTION update_table_count_truncate()
  RETURNS TRIGGER
  SECURITY DEFINER
  LANGUAGE pltcl
AS $BC$

  ## Zero out the stored count for the truncated table
  spi_exec "
    UPDATE table_count
    SET rows=0
    WHERE tablename = '$TG_table_name'
  "

  ## Also reset the global variable used by the row and statement triggers;
  ## without the 'variable' declaration, set would only create a local copy
  variable tablecount
  set tablecount($TG_table_name) 0

  return OK
$BC$;

CREATE TRIGGER update_table_count_truncate
  AFTER TRUNCATE
  ON public.customer
  FOR EACH STATEMENT
  EXECUTE PROCEDURE update_table_count_truncate();

Pretty straightforward. Let's make sure it works:

psql=# TRUNCATE TABLE customer;
TRUNCATE TABLE

psql=# SELECT * FROM table_count; SELECT COUNT(*) FROM customer;
public|customer|0
0

Success! This was a fairly contrived example, but Tcl (and especially pl/tclu) offers a lot more functionality. If you want to try pl/tcl and pl/tclu for yourself, you'll need to make sure they are compiled into the Postgres you are using. With a packaging system, it's as simple as doing this (or something like it, depending on which packaging system you use):

yum install postgresql-pltcl

If compiling from source, just pass the --with-tcl option to configure. You'll probably also need to install the Tcl development package first, e.g. with yum install tcl-devel.
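
From the top of the Postgres source tree, that boils down to something like:

./configure --with-tcl
make
make install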

Once the language support is installed, adding it to a specific database is as simple as:

psql=# CREATE LANGUAGE pltcl;
CREATE LANGUAGE
psql=# CREATE LANGUAGE pltclu;
CREATE LANGUAGE
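
To double-check what ended up installed, and which of the two is trusted, a quick catalog query does the trick (output shown in the same unaligned format as the session above):

psql=# SELECT lanname, lanpltrusted FROM pg_language WHERE lanname LIKE 'pltcl%';
pltcl|t
pltclu|f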

For more about Tcl, check out the Tcl Wiki, the Tcl tutorial, or this Tcl reference. For more about pl/tcl and pl/tclu, visit the Postgres pltcl documentation.