
Creating a PL/Perl RPM linked against a custom Perl build

My and everyone else's Planet PostgreSQL blogs have been down for some time now. I recently needed to refer to a post I made on March 7, 2007, showing how to build a PL/Perl RPM linked against a custom Perl build. Of course it wasn't there! The Wayback Machine came to my rescue.

A few things have changed since that time, so I've reworked it here, updated for local Perl 5.10.0 built into RPMs:

We sometimes have to install a custom Perl build without thread support, with specific newer and/or older versions of CPAN modules, and we don't want to affect the standard distribution Perl that lives in /usr/bin/perl and /usr/lib/perl5. We use standard PGDG RPMs to install PostgreSQL. We also use PL/Perl, and want PL/Perl to link against our custom Perl build in /usr/local/bin and /usr/local/lib/perl5.

It's easy to achieve this with a small patch to the source RPM spec file:

--- postgresql-8.3.spec 2008-10-31 17:34:34.000000000 +0000
+++ postgresql-8.3.custom.spec  2008-11-30 02:10:09.000000000 +0000
@@ -315,6 +315,7 @@
 CFLAGS=`echo $CFLAGS|xargs -n 1|grep -v ffast-math|xargs -n 100`
 export LIBNAME=%{_lib}
+export PATH=/usr/local/bin:$PATH
 %configure --disable-rpath \
 %if %beta
    --enable-debug \
@@ -322,6 +323,7 @@
 %if %plperl
    --with-perl \
+   --with-libraries=/usr/local/lib64/perl5/5.10.0/x86_64-linux/CORE/ \
 %if %plpython
    --with-python \

Since we build RPMs of our local Perl, we want this PL/Perl RPM to depend on them, so we make this additional patch before building:

--- postgresql-8.3.spec 2008-10-31 17:34:34.000000000 +0000
+++ postgresql-8.3.custom.spec  2008-11-30 02:10:09.000000000 +0000
@@ -100,7 +100,7 @@
 Patch6:        postgresql-perl-rpath.patch
 Patch8:        postgresql-prefer-ncurses.patch
-Buildrequires: perl glibc-devel bison flex 
+Buildrequires: local-perl = 4:5.10.0, glibc-devel, bison, flex
 Requires:  /sbin/ldconfig initscripts
 %if %plpython
@@ -227,7 +227,7 @@
 %package plperl
 Summary:   The Perl procedural language for PostgreSQL
 Group:     Applications/Databases
-Requires:  postgresql-server = %{version}-%{release}
+Requires:  postgresql-server = %{version}-%{release}, local-perl = 4:5.10.0
 %ifarch ppc ppc64
 BuildRequires:  perl-devel

After applying the patch(es) (adjusted for your own custom Perl build, of course), rebuild the RPM, and install the postgresql-plperl (8.2 or newer) or postgresql-pl (8.1 and earlier) RPM. With a "service postgresql restart", you're ready to go.

For pre-built PostgreSQL 8.3 RPMs that link against unthreaded local Perl 5.10.0, for Red Hat Enterprise Linux 5 x86_64, see the Yum repositories.

Multiple reverse DNS pointers per IP address

I recently ran across an IP address that had two PTR (reverse DNS) records in DNS. I've always thought that each IP address is limited to only a single PTR record, and I've seen this rule enforced by many ISPs, but I don't remember ever seeing it conclusively stated.

I was going to note the problem to the responsible person but thought it'd be good to test my assumption first. Lo and behold, it's not true. The Wikipedia "Reverse DNS lookup" page and a source it cites, an IETF draft on reverse DNS, note that multiple PTR records per IP address have always been allowed.

There is apparently plenty of software out there that can't properly deal with more than one PTR record per IP address, and with too many PTR records, a DNS query response will no longer fit inside a single UDP packet, forcing a TCP response instead, which can cause trouble of its own. And as I noted, many ISPs won't allow more than one PTR record, so in those cases it's an academic question.
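A rough back-of-envelope shows why the UDP limit bites. The sketch below is my own estimate, not output from a real resolver: it counts classic wire-format bytes (no EDNS0, and no name compression within the answer data, both of which would change the numbers) to see how many PTR answers fit in a traditional 512-byte UDP response:

```python
# Rough estimate of how many PTR records fit in a classic 512-byte
# UDP DNS response. All sizes are wire-format approximations.

def encoded_name_len(name):
    # DNS wire format: a length byte per label plus a trailing zero
    # byte works out to len(dotted_name) + 2
    return len(name) + 2

def max_ptr_records(target, query="4.3.2.1.in-addr.arpa", limit=512):
    header = 12                              # fixed DNS header
    question = encoded_name_len(query) + 4   # QNAME + QTYPE + QCLASS
    # per answer: compressed owner-name pointer (2) + TYPE (2) +
    # CLASS (2) + TTL (4) + RDLENGTH (2) + the PTR target hostname
    per_record = 2 + 2 + 2 + 4 + 2 + encoded_name_len(target)
    return (limit - header - question) // per_record

# A ~16-character hostname allows roughly 15 PTR answers before the
# response spills over and the client must retry over TCP.
print(max_ptr_records("mail.example.com"))   # -> 15
```

So with hostnames of typical length, somewhere in the low dozens of PTR records is enough to force the TCP fallback.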

But it's not invalid, and I saved myself and someone else a bit of wasted time by doing a quick bit of research. It was a good reminder of the value of checking assumptions.

OpenSQL Camp 2008

I attended OpenSQL Camp last weekend, which ran Friday night to Sunday, November 14-16. This was the first "unconference" I had been to, and Baron Schwartz did a great job pulling it all together. I drove down with Bruce Momjian, who said that this was the first cross-database conference of any kind since at least the year 2000.

The conference was slated to start at 6 pm, and Bruce and I arrived at our hotel a few minutes before then. Our hotel was at one end of the Charlottesville Downtown Mall, and the conference was at the other end, so we got a quick walking tour of the mall. Seems like a great place - lots of shops, people walking, temporary booths set out, outdoor seating for the restaurants. It reminded me a lot of Las Ramblas, but without the "human statue" performance artists. Having a hotel within walking distance of a conference is a big plus in my book, and I'll go out of my way to find one.

The first night was simply mingling with other people and designing the next day's sessions. There was a grid of talk slots on a wall, with large sticky notes stuck to some of them to indicate already-scheduled sessions. Next to the grid were two sections where people added sticky notes for potential lightning talks and potential regular talks. There were probably about 20 of each type of talk by the end of the night. The idea was to put a check next to any talk you were interested in, although I don't think everyone really got the message about that, judging by the number of checks vs. the number of people.

At one point, we gathered in a circle and each gave a quick five-word introduction. Mine was "Just Another Perl Postgres Hacker." There were probably around 50-60 people there, and the vast majority were from Sun/MySQL. A smaller group were non-Sun MySQL people, such as Baron and Sheeri. Coming in at a minority of two were Bruce and I, representing Postgres (although Saturday saw our numbers swell to three, with the addition of Kelly McDonald). The smallest minority, however, was the SQLite contingent, consisting solely of Dr. Richard Hipp (whom it was great to meet in person). Needless to say, I met a lot of MySQL people at this conference! All were very friendly and receptive to Bruce and me, and it did feel mostly like an open source database conference rather than a MySQL one. Seven of the twenty-one talks were by non-MySQL people, which means we were technically overrepresented. Or had more interesting talks! ;)

After heading back to the room and reviewing my notes before bed, I got up the next day and caught the keynote, given by Brian Aker, about the future of open-source databases. Thanks for the Skype/Postgres shout out, Brian! :) A comment by Jim Starkey at the end of the talk led to an interesting discussion on bot nets, the current kings of cloud computing.

My talk on MVCC was the first of the day, which of course meant lots of technical difficulties. As usual, my laptop refused to cooperate with the overhead projector. In anticipation of this, I had copied the presentation in PDF format to a USB disk, and ended up using someone else's Mac laptop to give the presentation. (I don't remember whose it was, but thank you!) I've given the talk before, but this was a major rewrite to suit the audience: much less Postgres-specific material, plus some details about how other systems implement MVCC and the advantages and disadvantages of each approach. Both Oracle and InnoDB update the actual value on disk and save the old data elsewhere, optimistically assuming that a rollback won't happen. This makes a rollback expensive, as the old diffs must be looked up and applied back to the main table. Postgres is pessimistic: an update simply adds an entire new row, so a rollback is not expensive, merely marking the new row as no longer valid. Both ways involve some sort of cleanup of old rows, and handle the tradeoffs differently. There were some interesting discussions during and after the talk, as Jim Starkey and Ann Harrison weighed in on how other systems (Falcon and Firebird) perform MVCC, and the costs and tradeoffs involved. Afterward, I had some interesting conversations with Ann about garbage collection and vacuuming in general.
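To make the tradeoff concrete, here is a toy sketch of the two styles. This is my own illustration, not code from any of these engines, and it ignores visibility of in-progress transactions, concurrency, and cleanup entirely:

```python
# Toy contrast of two MVCC styles: update-in-place with an undo log
# (the optimistic, Oracle/InnoDB-like approach) versus appending a new
# row version (the pessimistic, Postgres-like approach).

class UndoLogTable:
    """Update in place; remember old values so a rollback can restore them."""

    def __init__(self):
        self.rows = {}
        self.undo = []          # (key, old_value) pairs for the open transaction

    def update(self, key, value):
        self.undo.append((key, self.rows.get(key)))
        self.rows[key] = value  # the live copy is overwritten immediately

    def commit(self):
        self.undo.clear()       # cheap: just forget the saved values

    def rollback(self):
        # the expensive path: every saved diff must be applied back
        while self.undo:
            key, old = self.undo.pop()
            if old is None:
                del self.rows[key]
            else:
                self.rows[key] = old


class AppendOnlyTable:
    """Append a whole new row version; visibility rules pick the winner."""

    def __init__(self):
        self.versions = []      # (key, value, txid) tuples, newest last
        self.committed = set()

    def update(self, key, value, txid):
        self.versions.append((key, value, txid))  # the old version stays put

    def commit(self, txid):
        self.committed.add(txid)

    def rollback(self, txid):
        pass                    # cheap: uncommitted versions are never visible

    def read(self, key):
        for k, v, txid in reversed(self.versions):
            if k == key and txid in self.committed:
                return v
        return None


t = AppendOnlyTable()
t.update("x", "old", txid=1); t.commit(1)
t.update("x", "new", txid=2)    # transaction 2 aborts...
t.rollback(2)                   # ...and there is nothing to undo
print(t.read("x"))              # -> old
```

The flip side, which the sketch glosses over, is that the append-only style accumulates dead row versions that something (vacuuming, garbage collection) must eventually reclaim.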

The next talk was by Dr. Hipp, entitled "How SQL Database Engines Work", which was fascinating as it gave a glance into the inner workings and philosophy of SQLite, whose underlying assumptions about power usage, memory, transactions, portability, and resource usage are radically different from most other database systems. Again, there were some interesting discussions with the audience about certain slides during the talk.

The competing talk for that time slot was "Libdrizzle" by Eric Day. While I missed this talk, I did get to talk to him the night before about libdrizzle, among other things. Patrick Galbraith and I tried to explain the monstrosity that is XS to Eric (as he and I maintain DBD::mysql and DBD::Pg respectively), and Eric showed us how PHP does something similar.

My DBIx::Cache talk was sabotaged by Bruce having a better session at the same time, so I attended that instead of giving mine. I'll post the slides for the DBIx::Cache talk on the OpenSQL Camp wiki soon, however. I liked Bruce's talk ("Moving Application Logic Into the Database"), mostly because he was preaching to the choir when talking about putting business logic into the database. There was an interesting discussion about the borrowing of LIMIT and OFFSET from MySQL and putting it into Postgres, and we even helped Richard figure out that he was unknowingly supporting the broken and deprecated Postgres "comma-comma" syntax. Bruce's talk was very polished and interesting. I suspect he may have given talks before. :)

Lunch was catered in, and I talked with many people over lunch, as indeed throughout the conference. Apparently MySQL 5.1 is finally going to be released, this time for sure, according to first Giuseppe and then Dups. Post-lunch were the lightning talks, which I normally would not miss, but their overall MySQL-centricness and my interest in another session, entitled "MySQL Unconference" by Sheeri K. Cabral, drew me away. Bruce, Sheeri, Giuseppe Maxia, and I talked about the details of such a conference. It was a very interesting perspective: MySQL has the problem of a "one company, and no community" perception, while Postgres suffers from an "all community, and no company" perception. Neither perception is accurate, of course, but there are seeds of truth in both.

Bruce's second presentation, "Postgres Talks", turned into mostly a wide-ranging discussion between those present (myself, Bruce, Ann, Kelly, Richard, others?) about materialized views, vacuum, building query trees, and other topics.

I bailed out on my fellow Postgres talk "Postgres Extensions" by Kelly McDonald (sorry Kelly). I had already picked his brain about it earlier, so I felt not too much guilt in attending "Atomic Commit In SQLite" by Dr. Hipp. Again, it's fascinating to see things from the SQLite perspective. Not only technically, but how their development is structured is different as well.

I was not feeling well, so I ran back to the hotel to drop off my backpack with super-heavy laptop inside, and thus missed my next planned talk, "Unix Command Line Productivity Tips". If anyone went and can pass on some tips in the comments below, please do so! :)

The final talk I went to was "Join-Fu" by Jay Pipes. I honestly had no idea what this talk would be about, but I actually found it very interesting (and entertaining). Jay is a great speaker, and is not shy about pointing out some of MySQL's weaknesses. The talk was basically a collection of best practices for MySQL, and I actually learned not only things about MySQL I can put to use, but things to apply to Postgres as well. He spent some time on the MySQL query cache as well, which is particularly interesting to me as I'd love to see Postgres get something similar (and until then, people can use DBIx::Cache of course!).

After the final set of presentations was more mingling, eating of some pizza with funky toppings, and planning for the next day's hackathon. All the proposed ideas were MySQL-specific, as was to be expected, but Bruce and I actually got some work done that night by looking over the pg_memcached code, prompted by Brian. I had looked it over a little a few months ago, but Bruce and I managed to fix a bug and, more importantly, found other people to continue working on it. Don't forget to take the credit when they finish their work, Bruce! :)

All in all, a great time. I would have liked to see the presentations stretched out over two days, and to have seen a greater Postgres turnout, but there's always next year. Thanks to Baron for creating a unique event!

Google Sponsored AFS Hack-A-Thon

Day One:

Woke up an hour early, due to a bit of confusion about the start time (the initial email was a bit optimistic as to what time AFS developers wanted to wake up for the conference).

Met up with Mike Meffie (an AFS developer at Sine Nomine) and got a shuttle from the hotel to the 'Visitors Lobby', only to find out that each building has a visitors lobby. One neat thing: Google provides free bikes (beach cruisers) to anyone who needs them. According to the receptionist, any bike that isn't locked down is considered public property at Google. However, it's hard to pedal a bike and hold a briefcase, so off we went, hiking several blocks to the correct building. Mike was smart enough to use a backpack, but hiked with me regardless.

The food was quite good, a reasonably healthy breakfast including fresh fruit (very ripe kiwi, and a good assortment). The coffee was decent as well! After much discussion, it was decided that Mike & I would work towards migrating the community CVS repository over to git. Because git sees the world as 'patch sets' instead of just individual file changes, migrating it from a view of the 'Deltas' makes the most sense. The new git repo. (when complete) should match 1:1 to the Delta history. There was a good amount of teasing as to whether Mike and I could make any measurable progress in 2 days. Derrick was able to provide pre-processed delta patches and the bare CVS repo. (though we spent a good amount of the day just transferring things around and determining what machine should be used for development).
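The delta-replay idea can be sketched in miniature. The toy model below is mine, not the actual conversion tooling we wrote, and the delta format is an assumption for illustration: each delta, an ordered set of per-file changes, folds into exactly one commit of a linear history, which is what gives the 1:1 mapping to the delta history:

```python
# Toy model of replaying CVS-style deltas as a linear patch-set
# history: one commit per delta, applied in order.

def apply_deltas(deltas):
    """Fold an ordered list of deltas into a (tree, commit log) pair."""
    tree, log = {}, []
    for delta in deltas:
        for path, content in delta["changes"].items():
            if content is None:
                tree.pop(path, None)   # a delta may also remove a file
            else:
                tree[path] = content
        log.append(delta["name"])      # exactly one commit per delta
    return tree, log


deltas = [
    {"name": "initial-import", "changes": {"README": "OpenAFS\n"}},
    {"name": "fix-readme",     "changes": {"README": "OpenAFS 1.0\n"}},
]
tree, log = apply_deltas(deltas)
print(log)    # -> ['initial-import', 'fix-readme']
```

The hard part in practice, as we found, is not the replay loop but agreeing on what the base import contains so that every delta applies cleanly on top of it.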

Lunch (rather tasty sandwiches) and after-lunch snacks were provided; Google definitely doesn't skimp on the catering. We made good progress for one day of combined work: we now have a clear strategy for processing the deltas, and initial code that is showing strong promise. Much teasing ensued that Mike & I should not be allowed to eat if we did not have the git repo. ready for use. Dinner was a big group affair of food, beer, and Kerberos.

Day Two:

After arriving with Mike Meffie via the shuttle, we found out that Tom Keiser (also of Sine Nomine) had been left behind! The shuttle driver was kind enough to go pick up Tom (who ended up at a related, but different hotel than the conference recommended) and bring him for questioning (or development, as the case may be). Determined that the major issue in applying the deltas was simply due to inconsistencies in what the 'base' import should consist of... After several rounds of cleanup, all but a few of the deltas (and those were fixed by hand) applied cleanly!

On the food side, Google outdid itself with these cornbread 'pizzas' that were extremely good. Once we started having a few branches to play with, things came together quickly... generating much buzz and excitement (at least, for us). We all split off for dinner, with a few of us escorting Tom to his train then getting some Indian food (on a rather busy day, as it was the 'Festival of Lights').

In Conclusion:

We were able to get a clean specification with consensus for how we want to produce the public git repository. The specifications are even available on the OpenAFS wiki. The tools (found at '/afs/') to produce this repo. are all in a rough working form, with only the 'merge' tool still needing some development effort. All of these efforts were definitely facilitated by Google providing a comfortable work environment, a solid internet connection and good food to keep us fueled through it all.

Things to do now:

  • Clean up and document the existing tools
  • Improve the merge process to simplify folding the branches
  • Actually produce the Git repository
  • Validate the consistency of the Git repository against the CVS repository
  • Determine how tags are to be ported over and apply them
  • Publish repo. publicly

10,000 databases on a PostgreSQL cluster

One of our clients wanted confirmation that PostgreSQL will have no problem handling 1000 or 2000 databases on a single database cluster. I remember testing some years ago, probably on Postgres 7.2 or 7.3, creating 1000 or so databases and finding that it worked fine. But that was a long time ago, the software has changed, and I thought I should make sure my old experiment results still stand.

There's a PostgreSQL FAQ question, "What is the maximum size for a row, a table, and a database?" but no mention of the maximum number (or more importantly, maximum practical number) of databases per cluster. So I threw together a test script to create 10,000 databases, each with randomly 1-5 tables of 2 columns each (INTEGER and TEXT), each table getting randomly 1-10 inserts of random data up to 100 or so characters in the TEXT field.
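A generator along these lines is easy to sketch. This is not the script I actually ran; the statement shapes below are a reconstruction from the description above:

```python
# Sketch of a stress-test generator: one CREATE DATABASE per database,
# then 1-5 two-column tables each seeded with 1-10 random rows.
import random
import string

def test_statements(n_databases, seed=42):
    rng = random.Random(seed)   # fixed seed so a run can be repeated
    for i in range(n_databases):
        yield f"CREATE DATABASE testdb_{i:05d};"
        for t in range(rng.randint(1, 5)):
            yield f"CREATE TABLE t{t} (id INTEGER, val TEXT);"
            for row in range(rng.randint(1, 10)):
                val = "".join(rng.choice(string.ascii_lowercase)
                              for _ in range(rng.randint(1, 100)))
                yield f"INSERT INTO t{t} VALUES ({row}, '{val}');"

# Each CREATE DATABASE would be fed to psql connected to the cluster,
# and the table statements to psql connected to that new database.
print(next(iter(test_statements(10000))))   # -> CREATE DATABASE testdb_00000;
```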

I ran the test on PostgreSQL 8.1, the default that ships with Red Hat Enterprise Linux 5 (x86_64). The hardware was a desktop-class HP with an Intel Core 2 @ 1.86 GHz that wasn't always idle.

The short answer: Postgres 8.1 handles 10,000 databases just fine. \l in psql generates a long list of databases, of course, but returns quickly enough. Ad-hoc concurrency testing turned up no problems: running queries, inserts, etc. on a handpicked group of the various play databases worked well, including while new databases were being created. During the creation process, the last database creations seemed about as fast as the first. The whole run took 2.75 hours.

This all is hardly a big surprise, but maybe by documenting it I'll save someone the bother of running their own test in the future.

Addendum: The actual limit on this platform is probably 31995 databases, because each database occupies a subdirectory in data/base/, and the ext3 filesystem has a limit of 31998 subdirectories per directory, stemming from its limit of 32000 links per inode. The other 5 would be ., .., template0, template1, and postgres. (Thanks, Wikipedia.)
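The arithmetic, spelled out:

```python
# ext3's link-count budget for data/base/, per the addendum above.
links_per_inode = 32000            # ext3 limit on links to one inode
reserved = 5                       # ".", "..", template0, template1, postgres
print(links_per_inode - reserved)  # -> 31995
```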