Welcome to End Point’s blog

Ongoing observations by End Point people

Bucardo multi-master for PostgreSQL

The next version of Bucardo, a replication system for Postgres, is almost complete. The scope of the changes required a major version bump, so this Bucardo will start at version 5.0.0. Much of the innards was rewritten, with the following goals:

Multi-master support

Where "multi" means "as many as you want"! There are no more pushdelta (master to slaves) or swap (master to master) syncs: there is simply one sync where you tell it which databases to use, and what role they play. See examples below.

Ease of use

The bucardo program (previously known as 'bucardo_ctl') has been greatly improved, making all the administrative tasks such as adding tables, creating syncs, etc. much easier.


Performance

Much of the underlying architecture was improved, and sometimes rewritten, to make things go much faster. Most striking is the difference between the old multi-master "swap syncs" and the new method, which has been described as "orders of magnitudes" faster by early testers. We use async database calls whenever possible, and no longer have the bottleneck of a single large bucardo_delta table.

Improved logging

Not only are more details provided, there is now the ability to control how verbose the logs are. Just set the log_level parameter to terse, normal, verbose, or debug. Those with busy systems, where the old logs were the equivalent of a 'debug' firehose, will really appreciate this.
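As a rough sketch of how the setting might be changed from the shell, the wrapper below checks a level against the four values listed above before handing it to bucardo. The set_log_level name is made up for this illustration, and the "bucardo set name=value" form is an assumption based on the released Bucardo 5 command line, so check it against your version first.

```shell
# Hypothetical helper: validate a log level, then pass it to bucardo.
# Assumes the "bucardo set name=value" form exists in your version.
set_log_level() {
  case "$1" in
    terse|normal|verbose|debug)
      bucardo set "log_level=$1"
      ;;
    *)
      echo "unknown log_level '$1' (expected terse, normal, verbose, or debug)" >&2
      return 1
      ;;
  esac
}
```

Calling set_log_level terse would then run bucardo set log_level=terse. A running Bucardo may need to reload its configuration or be restarted before the new level takes effect.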

Different targets

Who says your slave (target) databases need to be Postgres? In addition to the ability to write text SQL files (for say, shipping to a different system), you can have Bucardo push to other systems as well. Stay tuned for more details on this. (Update: there is a blog post about using MongoDB as a target)

This new version is not quite at beta yet, but you can try out a demo of multi-master on Postgres quite easily. Let's see if we can do it in ten steps.

I. Download all prerequisites

To run Bucardo, you will need a Postgres database (obviously), the DBIx::Safe module, the DBI and DBD::Pg modules, and (for the purposes of this demo) the pgbench utility. Systems vary, but on aptitude-based systems, one can grab all of the above like this:

aptitude install postgresql-server \
perl-DBIx-Safe \
perl-DBD-Pg

II. Grab the latest Bucardo

git clone git://

III. Install the program

cd bucardo
perl Makefile.PL
sudo make install

You can ignore any errors that come up about ExtUtils::MakeMaker not being recent.

IV. Set up an instance of Bucardo

This step assumes there is a running Postgres available to connect to.

sudo mkdir /var/run/bucardo
sudo chown $USER /var/run/bucardo
bucardo install

V. Use the pgbench program to create some test tables

psql -c 'CREATE DATABASE btest1'
pgbench -i btest1
psql -c 'CREATE DATABASE btest2 TEMPLATE btest1'
psql -c 'CREATE DATABASE btest3 TEMPLATE btest1'
psql -c 'CREATE DATABASE btest4 TEMPLATE btest1'
psql -c 'CREATE DATABASE btest5 TEMPLATE btest1'

VI. Tell Bucardo about the databases and tables you are going to use

bucardo add db t1 dbname=btest1
bucardo add db t2 dbname=btest2
bucardo add db t3 dbname=btest3
bucardo add db t4 dbname=btest4
bucardo add db t5 dbname=btest5
bucardo list dbs

bucardo add table pgbench_accounts pgbench_branches pgbench_tellers herd=therd
bucardo list tables

A herd is simply a logical grouping of tables. We did not add the other pgbench table, pgbench_history, because it has no primary key or unique index.
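The five add db commands follow one pattern, so they can be generated in a loop. The sketch below is only an illustration: register_test_dbs is a made-up name, and the optional host argument assumes the host= parameter of bucardo add db that is mentioned in the comments below for databases on another server.

```shell
# Hypothetical helper: register btest1..btestN with Bucardo as t1..tN.
# An optional second argument is passed through as host=... for a remote server.
register_test_dbs() {
  local count="$1" host="${2:-}" i=1
  while [ "$i" -le "$count" ]; do
    if [ -n "$host" ]; then
      bucardo add db "t$i" "dbname=btest$i" "host=$host" || return 1
    else
      bucardo add db "t$i" "dbname=btest$i" || return 1
    fi
    i=$((i + 1))
  done
}
```

Running register_test_dbs 5 issues the same five add db commands shown in this step.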

VII. Group the databases together and set their roles

bucardo add dbgroup tgroup t1:source t2:source t3:source t4:source t5:target

We've grouped all five databases together, and made four of them masters (aka source), and one of them a slave (aka target). You can have any combination of masters and slaves you want, as long as there is at least one master.
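To make the "at least one master" rule concrete, here is a small sketch of a wrapper that checks a list of name:role pairs before calling bucardo add dbgroup. The add_dbgroup function is hypothetical and only knows the two roles used in this post; Bucardo may well do its own checking, and may accept forms this wrapper rejects.

```shell
# Hypothetical helper: refuse a group with no source, then hand it to bucardo.
# Only the two roles used in this post (source, target) are recognized.
add_dbgroup() {
  local name="$1" spec has_source=0
  shift
  for spec in "$@"; do
    case "$spec" in
      *:source) has_source=1 ;;
      *:target) ;;
      *)
        echo "bad role in '$spec' (expected name:source or name:target)" >&2
        return 2
        ;;
    esac
  done
  if [ "$has_source" -ne 1 ]; then
    echo "a dbgroup needs at least one source" >&2
    return 1
  fi
  bucardo add dbgroup "$name" "$@"
}
```

For example, add_dbgroup pair t1:source t2:source would describe a plain two-master group, while add_dbgroup fanout t1:source t2:target t3:target would describe one master feeding two slaves. The group names pair and fanout are placeholders.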

VII. Create the Bucardo sync

bucardo add sync foobar herd=therd dbs=tgroup ping=false

Here we simply create a new sync, which is a controllable replication event, telling it which tables we want to replicate, and which databases we are going to use. We also set ping to false, which means that we will not create triggers to automatically fire off replication on any changes, but will do it manually. In a real-world scenario, you generally do want those triggers, or want to set Bucardo to check periodically.

VIII. Start up Bucardo

bucardo start

If all went well, you should see some information in the log.bucardo file in the current directory.

IX. Make a bunch of changes on all the source databases

pgbench -t 10000 btest1
pgbench -t 10000 btest2
pgbench -t 10000 btest3
pgbench -t 10000 btest4

Here, we've told pgbench to run ten thousand transactions against each of the first four databases. Triggers on these tables have captured the changes.
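The four pgbench runs above execute one after another. Since the sync is only kicked manually in the next step, they could also run side by side to save time. The following is a minimal sketch under that assumption; load_sources is a made-up name, and running the jobs in the background is a variation of ours rather than part of the original steps.

```shell
# Hypothetical helper: run pgbench against several databases at the same time.
# Usage: load_sources <transactions> <db> [<db> ...]
load_sources() {
  local txns="$1" db pid pids="" failed=0
  shift
  for db in "$@"; do
    pgbench -t "$txns" "$db" &   # start each run in the background
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failed=1      # remember whether any run failed
  done
  return "$failed"
}
```

With this in place, load_sources 10000 btest1 btest2 btest3 btest4 does the same work as the four commands above. Expect the output of the four runs to be interleaved on the terminal.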

X. Kick off the sync and watch the fun

bucardo kick foobar

You can now tail the log.bucardo file to see the fun, or simply run:

bucardo status

to see what it is doing, along with the final counts once the sync is done. Don't forget to stop Bucardo when you are done testing:

bucardo stop

The output of bucardo status, after the sync has completed, should look like this:

bucardo status
Name     State    Last good    Time    Last I/D/C           Last bad    Time
foobar | Good   | 17:58:37   | 3m2s  | 131836/131836/4785 | none      |

Here we see that this sync has never failed ("Last bad"), the time of day of the last good run, how long ago that was from right now (3 minutes and 2 seconds), as well as details of the last successful run. Last I/D/C stands for the number of inserts, deletes, and collisions across all databases for this sync. This is just an overview of all syncs at a high level, but we can also give status an argument of a sync name to see more details like so:

bucardo status foobar
Last good                       : Jun 02, 2011 17:57:47 (time to run: 42s)
Rows deleted/inserted/conflicts : 131,836 / 131,836 / 4,785
Sync name                       : foobar
Current state                   : Good
Source herd/database            : therd / t1
Tables in sync                  : 3
Status                          : active
Check time                      : none
Overdue time                    : 00:00:00
Expired time                    : 00:00:00
Stayalive/Kidsalive             : yes / yes
Rebuild index                   : 0
Ping                            : no
Onetimecopy                     : 0
Post-copy analyze               : Yes
Last error:                     :

This gives us a little more information about the sync itself, as well as another important metric, how long the sync itself took to run, in this case, 42 seconds. That particular metric might make its way back to the overall "status" view above. Try things out and help us find bugs and improve Bucardo!
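One way to convince yourself that the sync really did its job is to compare the tables across all five databases once it has finished. The sketch below is only a cheap heuristic, not proof that the data is identical: it compares a row count and a column sum per database using psql. The fingerprint and check_in_sync names are made up for this example, and it assumes psql can connect to each database without extra options.

```shell
# Fingerprint one table: row count plus the sum of one numeric column.
fingerprint() {
  local db="$1" table="$2" column="$3"
  psql -At -d "$db" -c "SELECT count(*), coalesce(sum($column), 0) FROM $table"
}

# Print every database's fingerprint. Returns 1 if any fingerprint
# differs from the first one, and 2 if psql itself fails.
check_in_sync() {
  local table="$1" column="$2" db fp first="" seen=0 rc=0
  shift 2
  for db in "$@"; do
    fp=$(fingerprint "$db" "$table" "$column") || return 2
    echo "$db: $fp"
    if [ "$seen" -eq 0 ]; then
      first="$fp"
      seen=1
    elif [ "$fp" != "$first" ]; then
      rc=1
    fi
  done
  return "$rc"
}
```

For the demo above, check_in_sync pgbench_accounts abalance btest1 btest2 btest3 btest4 btest5 should print the same count and sum for all five databases after the sync has completed.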


Anonymous said...

Obvious trivial typo in VI:
bucardo add db t2 dbname=btest1

Should be
bucardo add db t2 dbname=btest2

Same for the following lines.

Greg Sabino Mullane said...

Thanks, fixed!

Denis said...

Let's assume we have a master/master replication: db1<=>db2.
While DB1 wants to send data to DB2, the network connection is down.
Will DB1 try to send the data again after the connection is re-established?
What happens exactly?
Thank you.

Greg Sabino Mullane said...

Yes, the data will be sent once the connection is back up. Bucardo will constantly reconnect until it gets a working connection.

Denis said...

When starting bucardo using 'bucardo_ctl start', it displays an error => version mismatch ....
SOLUTION : edit the bucardo_ctl file
find "4.4.6" and change it to "4.4.7".

Is there a way to force sync at time t in order to make all my db's balanced (with the same data).
I have 3 db's to sync using BUCARDO

host B has BUCARDO installed.
When DML's are done on B, A and C are sync'd.
When DML's are done on A, B is sync'd but not C.
When DML's are done on C, B is sync'd but not A.

How do I sync my 3 db's?


Greg Sabino Mullane said...

The 4.4.6 mismatch error is resolved in the latest version, 4.4.8.

Greg Sabino Mullane said...

Not sure exactly what you mean by the other question: the bucardo-general mailing list is the best place to ask such questions:

Denis said...

bucardo version :4.4.8
postgresql version : 9.1.2

I use bucardo with multiple databases. (eg: A, B, C)
I have these syncs A->B , A->C.
When C is down, bucardo doesn't start properly
bucardo_ctl ping : CRITICAL: Timed out (3 s), no ping response from MCP

So the sync A->B (B is OK), doesn't work.
Is that normal or is there an option I forgot.

bucardo_ctl deactivate syncname doesn't work.



Kevin Behr said...

Is 'bucardo add dbgroup tgroup t1:source t2:source t3:source t4:source t5:target' an example of a fullcopy, pushdelta, or a swap? I'm assuming that it is swap...If so, what conflict resolution does your example default to? Also, does a target have to exist in order to do conflict resolution?

I'm looking to do multi-master replication with just two databases. I was planning on 'bucardo add dbgroup tgroup t1:source t2:source' but I'm not sure how to specify the conflict resolution. Thanks!

Kevin Behr said...

Sorry, I just read the top paragraph. It looks like things changed in the latest version..."There are no more pushdelta (master to slaves) or swap (master to master) syncs: there is simply one sync where you tell it which databases to use, and what role they play."

I guess my only question then is I still need to specify any conflict resolution and, if so, how?


Dave Jenkins said...

Kevin, this is shameless of me but: if you want some help on this, we can offer some consulting time (say, a bucket of 10 or 20 hours). Please ping me if you're interested.

djenkins at endpoint dot com

Anonymous said...

bucardo add dbgroup tgroup t1:source t2:source t3:source t4:source t5:target

what happens if I want to do 2 masters? Assume t1 and t2 are set
so do I have to do twice
bucardo add dbgroup t1:source t2:target
bucardo add dbgroup t2:source t1:target

or I am missing something? Thanks for any replies

Joshua Tolley said...

Here's a good example of how to set up multiple masters.

Juned Khan said...

This is a very nice post, but I wonder how to do this with a remote master database. I mean, how do I sync databases that are on different servers?

Also, I am getting a "Please specify a database with db=" message while executing the "bucardo_ctl add table pgbench_accounts pgbench_branches pgbench_tellers herd=therd" command. Which database do I have to specify here?

Please suggest.

Joshua Tolley said...

To tell Bucardo about a database on another server, use the "host" parameter to the "bucardo add db" command. For instance, "bucardo add db something dbname=foo host=bar".

When you add a table, you have to tell Bucardo the name of a database it can find the table in. "bucardo add table whatever db=some_database"

Juned Khan said...

Thanks for the answer. What confuses me, going by your blog post, is which database I should specify here. Do I need to add each database?

Joshua Tolley said...

Yes, if you're replicating from a database called foo to another called bar, you need to add both foo and bar to Bucardo.

The bucardo add db command can be confusing. If I say this:

bucardo add db alpha dbname=beta

... that means the actual database name is "beta", but within Bucardo I'll refer to it as "alpha". This can seem like a source of lots of confusion, but often you'll have two different servers you're replicating, and the database name within PostgreSQL will be the same on both, so you will want to use different names within Bucardo:

bucardo add db master_production dbname=production host=master-server
bucardo add db slave_production dbname=production host=slave-server

Juned Khan said...

I have successfully configured above, the problem was i was using different version.

But now I can't work out how to test replication across these five databases. I tried to update and delete some records in the pgbench_tellers table, but the data is not updated in the other databases.

Am i doing something wrong here ?

Please suggest.

ratherbesurfing7 said...

I'm using 5.1.1. I set up a sync using the command:
bucardo add sync bard dbs=ex_master:source,ex_slave:source tables=all
I then start up bucardo and kick off the sync. If I add a line to a table, it gets added on the other database too. If I update a line in DB0 that wasn't in DB1, the whole line gets inserted into DB1.
Here's my question: If the DBs had different rows in their tables before the sync got set up, how do I get the sync to copy all of those over? It only propagates changes, and I want it to make both DBs match. I can write a script to update all rows in all tables with their same values, and the rows will get propagated to the other DB, but that seems like a huge hack.