News

Welcome to End Point’s blog

Ongoing observations by End Point people

Comparing databases with check_postgres

One of the more recent additions to check_postgres, the all-singing, all-dancing Postgres monitoring tool, is the "same_schema" action. This was necessitated by clients who wanted to make sure that their schemas were identical across different servers. The two use cases I've seen are servers that are being replicated by Bucardo or Slony, and servers that are doing horizontal sharding (e.g. same schema and database on different servers: which server you go to depends on (for example) your customer id). Oft times a new index fails to make it to one of the slaves, or some function is tweaked on one server by a developer, who then forgets to change it back or propagate it. This program allows a quick and automatable check for such problems.

The idea behind the same_schema check is simple: we walk the schema and check for any differences, then throw a warning if any are found. In this case, we're using the term "schema" in the classic sense of a description of your database objects. Thus, one of the things we check is that all the schemas (in the classic RDBMS sense of a container of other database objects) are the same, when running the "same_schema" check. Only slightly confusing. :)

Not only is this program nice for monitoring (e.g. as a Nagios check), but if you pass in a --verbose argument, you get a simple not-all-on-one-line breakdown of all the differences between the two databases. Let's do a quick example.

First, we download and install check_postgres. We'll pull straight from a git repository for check_postgres. While we have our own repo at bucardo.org, we also are keeping it in sync with a tree at github.org, so we'll use that one:

git clone git://github.com/bucardo/check_postgres.git
cd check_postgres
perl Makefile.PL
make
make test
sudo make install

Let's create a Postgres cluster with the initdb command, start it up, then create two new databases to compare to each other.

initdb -D cptest
echo port=5555 >> cptest/postgresql.conf
pg_ctl -D cptest -l cp.log start
psql -p 5555 -c 'CREATE DATABASE yin'
psql -p 5555 -c 'CREATE DATABASE yang'

We're ready to run the script. By default, it outputs things in a Nagios-friendly manner. We should see an 'OK' because the two databases are identical:

./check_postgres.pl --action=same_schema --dbport=5555 --dbname=yin --dbport2=5555 --dbname2=yang

POSTGRES_SAME_SCHEMA OK: DB "yin" (port=5555 => 5555) Both databases have identical items | time=0.01

The message could be clearer and show both database names, but the check worked and showed that things are exactly the same. Let's throw some differences in and run it again:

psql -p 5555 -d yin -c 'create table foobar(a int primary key, b text, c text)'
psql -p 5555 -d yang -c 'create table foobar(a int, b text, c varchar(99))'
psql -p 5555 -d yin -c 'create schema yinonly'
psql -p 5555 -d yang -c 'create table pineapple(id int)'

./check_postgres.pl --action=same_schema --dbport=5555 --dbname=yin --dbport2=5555 --dbname2=yang

POSTGRES_SAME_SCHEMA CRITICAL: DB "yin" (port=5555 => 5555) Databases were different. Items not matched: 5 | time=0.01
Schema in 1 but not 2: yinonly  Table in 2 but not 1: public.pineapple  Column "a" of "public.foobar": nullable is NO on 1, but YES on 2.  Column "c" of "public.foobar": type is text on 1, but character varying on 2.  Table "public.foobar" on 1 has constraint "public.foobar_pkey", but 2 does not. 

It works, but a little messy for human consumption. Nagios requires everything to be in a single line, but we'll add a --verbose argument to ask the script for prettier formatting:

./check_postgres.pl --action=same_schema --dbport=5555 --dbname=yin --dbport2=5555 --dbname2=yang

POSTGRES_SAME_SCHEMA CRITICAL: DB "yin" (port=5555 => 5555) Databases were different. Items not matched: 5 | time=0.01
Schema in 1 but not 2: yinonly
Table in 2 but not 1: public.pineapple
Column "a" of "public.foobar": nullable is NO on 1, but YES on 2.
Column "c" of "public.foobar": type is text on 1, but character varying on 2.
Table "public.foobar" on 1 has constraint "public.foobar_pkey", but 2 does not.

There are also ways to filter the output, for times when you have known differences. For example, to exclude any tables with the word 'bucardo' in them, you could add this argument:

--warning="notable=bucardo"

The online documentation has more details about all the filtering options.

So what kind of things do we check for? Right now, we are checking:

  • users (existence and powers, i.e. createdb, superuser)
  • schemas
  • tables
  • sequences
  • views
  • triggers
  • constraints
  • columns
  • functions (including volatility, strictness, etc.)

Got something else we aren't covering? Send in a patch, or a quick request, to the mailing list.

4 comments:

Jorge Godoy said...

How's it possible to compare databases running on two different hosts?

I've tried --host=host1,host2 as well as --host=host1 --host2=host2, but none of these worked.

Jorge Godoy said...

Actually, it was my fault: one should always remove the "--help" after using it. :-)

Now I have a different query: how to connect to hosts using SSL.

Jorge Godoy said...

Solved. Hint: look for pg_service.conf documentation.

Erik said...

This reminds me of something I created once -- dbdiff.pl -- but that was targeting Oracle.

Instead of explaining the differences in English, it instead spit out DDL statements to bring database A into sync with database B. For example, if A had a column that wasn't in B, it would emit something like:

ALTER TABLE tab1 ADD COLUMN col4 VARCHAR(30);

If column types were different it might add a comment with the old type and code to change it to the new type:

-- was VARCHAR(99)
ALTER TABLE foobar MODIFY c TEXT;

I didn't use it for monitoring, but instead I ran it between development databases and QA or production databases to generate simple migration scripts (that would usually be hand-edited along the way) to get the latest changes into QA and production.

I imagine an output mode like that in your script could be useful for some.