<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7997313029981170997</id><updated>2010-07-30T18:41:07.317-04:00</updated><title type='text'>End Point Blog</title><subtitle type='html'>Ongoing observations by End Point people.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://blog.endpoint.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/-/postgres'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/search/label/postgres'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/-/postgres/-/postgres?start-index=26&amp;max-results=25'/><author><name>Jon Jensen</name><uri>http://www.blogger.com/profile/18273388885281263476</uri><email>noreply@blogger.com</email></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>98</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-5796856138732355755</id><published>2010-07-29T14:44:00.001-04:00</published><updated>2010-07-29T16:46:15.924-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ruby'/><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Distributed Transactions and Two-Phase Commit</title><content type='html'>&lt;p&gt;The typical example of a transaction involves Alice and Bob, and their bank. Alice pays Bob $100, and the bank needs to debit Alice and credit Bob. Easy enough, provided the server doesn't crash. But what happens if the bank debits Alice, and then before crediting Bob, the server goes down? Or what if they credit Bob first, and then try to debit Alice only to find she doesn't have enough funds? A transaction allows the debit and credit operations to happen as a package ("atomically" is the word commonly used), so either both operations happen or neither happens, even if the server crashes halfway through the transaction. That way the bank never credits Bob without debiting Alice, or vice versa.&lt;/p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Q7kwFwwUJ0o/TFHnqrwIiQI/AAAAAAAAAU8/pAJGBOmvUHE/s1600/one_db.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 220px; height: 124px;" src="http://2.bp.blogspot.com/_Q7kwFwwUJ0o/TFHnqrwIiQI/AAAAAAAAAU8/pAJGBOmvUHE/s320/one_db.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5499431340481218818" /&gt;&lt;/a&gt;
&lt;p&gt;That's simple enough, but the situation can become more complex. What if, for instance, for buzzword-compliance purposes, the bank has "sharded" its accounts database by splitting it in pieces and putting each piece on a different server (whether this is would be smart or not is outside the scope of this post). The typical transaction handles statements issued only for one database, so we can't wrap the debit and credit operations within a single BEGIN/COMMIT if Alice's account information lives on one server and Bob's lives on another.&lt;/p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_Q7kwFwwUJ0o/TFHm0FUCvvI/AAAAAAAAAU0/vj0nnTLwfkc/s1600/shard1.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 255px; height: 184px;" src="http://2.bp.blogspot.com/_Q7kwFwwUJ0o/TFHm0FUCvvI/AAAAAAAAAU0/vj0nnTLwfkc/s320/shard1.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5499430402449915634" /&gt;&lt;/a&gt;
&lt;p&gt; Enter "distributed transactions". A distributed transaction allows applications to group multiple transaction-aware systems into a single transaction. These systems might be different databases, or they might include other systems such as message queues, in which case the transaction concept means a message would get delivered if and only if the rest of the transaction completed. So with a distributed transaction, the bank could debit Alice's account in one database and credit Bob's in another, atomically.&lt;/p&gt;
&lt;p&gt;All this comes at some cost. Distributed transactions require a "transaction manager", an application which handles the special semantics required to commit a distributed transaction. Second, the systems involved must support "two-phase commit" (which was added to PostgreSQL in version 8.1). Distributed transactions are committed using PREPARE TRANSACTION 'foo' (phase 1), and COMMIT PREPARED 'foo' or ROLLBACK PREPARED 'foo' (phase 2), rather than the usual COMMIT or ROLLBACK.&lt;/p&gt;
&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Q7kwFwwUJ0o/TFHmiW6OA9I/AAAAAAAAAUs/s9SUT5uxmcI/s1600/xact_mgr.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 231px; height: 237px;" src="http://3.bp.blogspot.com/_Q7kwFwwUJ0o/TFHmiW6OA9I/AAAAAAAAAUs/s9SUT5uxmcI/s320/xact_mgr.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5499430097935795154" /&gt;&lt;/a&gt;
&lt;p&gt;The beginning of a distributed transaction looks just like any other transaction: the application issues a BEGIN statement (optional in PostgreSQL), followed by normal SQL statements. When the transaction manager is instructed to commit, it runs the first commit phase by saying "PREPARE TRANSACTION 'foo'" (where "foo" is some arbitrary identifier for this transaction) on each system involved in the distributed transaction. Each system does whatever it needs to do to determine whether or not this transaction can be committed and to make sure it can be committed even if the server crashes, and reports success or failure. If all systems succeed, the transaction manager follows up with "COMMIT PREPARED 'foo'", and if a system reports failure, the transaction manager can roll back all the other systems using either ROLLBACK (for those transactions it hasn't yet prepared), or "ROLLBACK PREPARED 'foo'". Using two-phase commit is obviously slower than committing transactions on only one database, but sometimes the data integrity it provides justifies the extra cost.&lt;/p&gt;
&lt;p&gt;In PostgreSQL, two-phase commit is supported provided max_prepared_transactions is nonzero. A PREPARE TRANSACTION statement persists the current transaction to disk, and  dissociates it from the current session. That way it can survive even if the database goes down. The current session no longer has an active transaction. However, the prepared transaction acts like any other open transaction in that all locks held by the prepared transaction remain held, and VACUUM cannot reclaim storage from that transaction. So it's not a good idea to leave prepared transactions open for a long time.&lt;/p&gt;
&lt;p&gt;Distributed transactions are most common, it seems, in Java applications. Full J2EE application servers typically come with a transaction manager component. For my examples I'll use an open source, standalone transaction manager, called &lt;a href="http://docs.codehaus.org/display/BTM/Home"&gt;Bitronix&lt;/a&gt;. I'm not particularly fond of using Java for simple scripts, though, so I've used JRuby for this &lt;a href="http://josh.endpoint.com/bitronix.rb"&gt;demonstration code&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This script uses two databases, which I've called "athos" and "porthos". Each has &lt;a href="http://josh.endpoint.com/athos.sql"&gt;same schema&lt;/a&gt;, which provides a simple framework for the sharded bank example described above. This schema provides a table for account names, another for ledger information, and a simple trigger to raise an exception when a transaction would bring a person's balance below $0. I'll first populate athos with Alice's account information. She gets $200 to start. Bob will go in the porthos database, with no initial balance.&lt;/p&gt;

&lt;pre&gt;
5432 josh@athos# insert into accounts values ('Alice');
INSERT 0 1
5432 josh@athos*# insert into ledger values ('Alice', 200);
INSERT 0 1
5432 josh@athos*# commit;
COMMIT5432 josh@athos# \c porthos
You are now connected to database "porthos".
5432 josh@porthos# insert into accounts values ('Bob');
INSERT 0 1
5432 josh@porthos*# commit;
COMMIT
&lt;/pre&gt;

&lt;p&gt;Use of Bitronix is pretty straightforward. After setting up a few constants for easier typing, I create a Bitronix data source for each PostgreSQL database. Here I have to use the PostgreSQL JDBC driver's org.postgresql.xa.PGXADataSource class; "XA" is Java's protocol for two-phase commit, and requires JDBC driver support. Here's the code for setting up one data source; the other is just the same.&lt;/p&gt;
&lt;pre&gt;
ds1 = PDS.new
ds1.set_class_name 'org.postgresql.xa.PGXADataSource'
ds1.set_unique_name 'pgsql1'
ds1.set_max_pool_size 3
ds1.get_driver_properties.set_property 'databaseName', 'athos'
ds1.get_driver_properties.set_property 'user', 'josh'
ds1.init
&lt;/pre&gt;
&lt;p&gt;Then I simply get a connection from each data source, instantiate a Bitronix TransactionManager object, and begin a transaction.&lt;/p&gt;
&lt;pre&gt;
c1 = ds1.get_connection
c2 = ds2.get_connection
btm = TxnSvc.get_transaction_manager
btm.begin
&lt;/pre&gt;
&lt;p&gt;Within my transaction, I just use normal JDBC commands to debit Alice and credit Bob, after which I commit the transaction through the TransactionManager object. If this transaction fails, it raises an exception, which I can capture using Ruby's begin/rescue exception handling, and roll back the transaction.&lt;/p&gt;
&lt;pre&gt;
begin
  s2 = c2.prepare_statement "INSERT INTO ledger VALUES ('Bob', 100)"
  s2.execute_update
  s2.close

  s1 = c1.prepare_statement "INSERT INTO ledger VALUES ('Alice', -100)"
  s1.execute_update
  s1.close

  btm.commit
  puts "Successfully committed"
rescue
  puts "Something bad happened: " + $!
  btm.rollback
end
&lt;/pre&gt;
&lt;p&gt;When I run this, Bitronix gives me a bunch of output, which I haven't bothered to suppress, but among it all is the "Successfully committed" string I told it to print on success. Since Alice is debited $100 each time we run this, and she started with $200, we can run it twice before hitting errors. On the third time, we get this:&lt;/p&gt;
&lt;pre&gt;
Something bad happened: org.postgresql.util.PSQLException: ERROR: Rejecting operation; account owner Alice's balance would drop below 0
&lt;/pre&gt;
&lt;p&gt;This is our trigger firing, to tell us that we can't debit Alice any more. If I look in the two databases, I can see that everything worked as planned:&lt;/p&gt;
&lt;pre&gt;
5432 josh@athos*# select get_balance('Alice');
 get_balance 
-------------
           0
(1 row)

5432 josh@athos*# \c porthos 
You are now connected to database "porthos".
5432 josh@porthos# select get_balance('Bob');
 get_balance 
-------------
         200
(1 row)
&lt;/pre&gt;
&lt;p&gt;Remember I've run my script three times, but Bob has only been credited $200, because that's all Alice had to start with.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-5796856138732355755?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/5796856138732355755/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=5796856138732355755' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5796856138732355755'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5796856138732355755'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/07/distributed-transactions-and-two-phase.html' title='Distributed Transactions and Two-Phase Commit'/><author><name>Joshua Tolley</name><uri>http://www.blogger.com/profile/08481531515300677240</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='13545358980860046445'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_Q7kwFwwUJ0o/TFHnqrwIiQI/AAAAAAAAAU8/pAJGBOmvUHE/s72-c/one_db.png' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-3371495103932890220</id><published>2010-07-28T19:21:00.004-04:00</published><updated>2010-07-28T20:31:29.317-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>PostgreSQL: per-version .psqlrc</title><content type='html'>&lt;p&gt;File this under "you learn something new every day."  I came across this little tidbit while browsing the source code for &lt;tt&gt;psql&lt;/tt&gt;: you can have a per-version &lt;tt&gt;.psqlrc&lt;/tt&gt; file which will be executed only by the psql associated with that major version.  Just name the file &lt;tt&gt;.psqlrc-$version&lt;/tt&gt;, substituting the major version for the &lt;tt&gt;$version&lt;/tt&gt; token.  So for PostgreSQL 8.4.4, it would look for a file named &lt;tt&gt;.psqlrc-8.4.4&lt;/tt&gt; in your &lt;tt&gt;$HOME&lt;/tt&gt; directory.&lt;/p&gt;

&lt;p&gt;It's worth noting that the version-specific &lt;tt&gt;.psqlrc&lt;/tt&gt; file requires the full minor version, so you cannot currently define (say) an 8.4-only version which applies to all 8.4 &lt;tt&gt;psql&lt;/tt&gt;s.  I don't know if this feature gets enough mileage to make said modification worth it, but it would be easy enough to just use a symlink from the &lt;tt&gt;.psqlrc-$majorversion&lt;/tt&gt; to the specific &lt;tt&gt;.psqlrc&lt;/tt&gt; file with minor version.&lt;/tt&gt;&lt;/p&gt;

&lt;p&gt;This seems of most interest to developers, who may simultaneously run many versions of &lt;tt&gt;psql&lt;/tt&gt; which may have incompatible settings, but also could come in handy to regular users as well.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-3371495103932890220?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/3371495103932890220/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=3371495103932890220' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3371495103932890220'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3371495103932890220'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/07/postgresql-per-version-psqlrc.html' title='PostgreSQL: per-version .psqlrc'/><author><name>David Christensen</name><uri>http://www.blogger.com/profile/14237702125131095813</uri><email>david@endpoint.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='01514418822641247743'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-1997831032922698613</id><published>2010-07-27T23:01:00.004-04:00</published><updated>2010-07-27T23:23:35.339-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>PostgreSQL: Dynamic SQL Function</title><content type='html'>&lt;p&gt;Sometimes when you're doing something in SQL, you find yourself doing something repetitive, which naturally lends itself to the desire to abstract out the boring parts.  This pattern is often prevalent when doing maintenance-related tasks such as creating or otherwise modifying DDL in a systematic kind of way.  If you've ever thought, "Hey, I could write a query to handle this," then you're probably looking for dynamic SQL.&lt;/p&gt;

&lt;p&gt;The standard approach to using dynamic SQL in PostgreSQL is plpgsql's EXECUTE function, which takes a text argument as the SQL statement to execute.  One technique fairly well-known on the &lt;tt&gt;#postgresql&lt;/tt&gt; IRC channel is to create a function which essentially wraps the EXECUTE statement, commonly known as &lt;tt&gt;exec()&lt;/tt&gt;.  Here is the definition of &lt;tt&gt;exec()&lt;/tt&gt;:&lt;/p&gt;

&lt;pre&gt;
CREATE FUNCTION exec(text) RETURNS text AS $$ BEGIN EXECUTE $1; RETURN $1; END $$ LANGUAGE plpgsql;
&lt;/pre&gt;

&lt;p&gt;Using &lt;tt&gt;exec()&lt;/tt&gt; then takes the form of a SELECT query with the appropriately generated query to be executed passed as the sole argument.  We return the generated query text as an ease in auditing the actually executed results.  Some examples:&lt;/p&gt;

&lt;pre&gt;
SELECT exec('CREATE TABLE partition_' || generate_series(1,100) || ' (LIKE original_table)');
SELECT exec('ALTER TABLE ' || quote_identifier(attrelid::regclass) || ' DROP COLUMN foo') FROM pg_attribute WHERE attname = 'foo';
&lt;/pre&gt;

&lt;p&gt;Some notes about the &lt;tt&gt;exec()&lt;/tt&gt; function: since the generated SQL statement is being run inside a function, it is not run in a top-level transaction, so some commands will not work, including CREATE/DROP DATABASE, ALTER TABLESPACE, VACUUM, etc.&lt;/p&gt;

&lt;p&gt;Starting in PostgreSQL 9.0, the plpgsql language will be pre-installed in all new databases, which will make this recipe even easier to use.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-1997831032922698613?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/1997831032922698613/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=1997831032922698613' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1997831032922698613'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1997831032922698613'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/07/postgresql-dynamic-sql-function.html' title='PostgreSQL: Dynamic SQL Function'/><author><name>David Christensen</name><uri>http://www.blogger.com/profile/14237702125131095813</uri><email>david@endpoint.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='01514418822641247743'/></author><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-8879828808839785237</id><published>2010-07-22T19:37:00.008-04:00</published><updated>2010-07-22T21:02:03.398-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='scalability'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL: Migration Support Checklist</title><content type='html'>&lt;p&gt;A database migration (be it from some other database to PostgreSQL, or even from an older version of PostgreSQL to a nice shiny new one) can be a complicated procedure with many details and many moving parts.  I've found it helpful to construct a list of questions in order to make sure that you're considering all aspects of the migrations and gauge the scope of what will be involved.   This list includes questions we ask our clients; feel free to contribute your own additional considerations or suggestions.&lt;/p&gt;

&lt;p&gt;Technical questions:&lt;/p&gt;

&lt;ol&gt;

&lt;li&gt;&lt;strong&gt;Database servers:&lt;/strong&gt; How many database servers do you have?  For each, what are the basic system specifications (OS, CPU architecture, 32- vs 64-bit, RAM, disk, etc)?  What kind of storage are you using for the existing database, and what do you plan to use for the new database? Direct-attached storage (SAS, SATA, etc.), SAN (what vendor?), or other?  Do you use any configuration management system such as Puppet, Chef, etc.?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Application servers and other remote access:&lt;/strong&gt; How many application servers do you have?  For each, what are the basic system specifications (OS, CPU architecture, 32- vs 64-bit, RAM, disk, etc)?   Do you use any configuration management system such as Puppet, Chef, etc.?  What other network considerations are there?  Is ODBC used, or SSL transport, any VPNs?  Are multiple datacenters involved?  How about egress/ingress firewalls?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Middleware:&lt;/strong&gt; Do you currently use any sort of connection pooling, load balancing, or other middleware between your application and database servers?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Data needs:&lt;/strong&gt; Can you describe your data access patterns?  i.e., is the majority of your data historical and rarely accessed?  Are there any existing reporting needs that will need to be duplicated on the PostgreSQL system?  Do you already have reports of database usage, including traffic levels, frequent or intensive queries, etc?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Size:&lt;/strong&gt; What kind of transaction volume do you see?  How large are your databases?  How many tables do you have and what is the size of the larger ones?  How many users or database connections will you need to support?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Backups:&lt;/strong&gt; What are your current backup policies/procedures?  How will these need to change with the move to PostgreSQL?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Replication/load balancing:&lt;/strong&gt; What kind of system redundancy do you currently have/need?  Do you have any kind of database load-balancing or master-slave replication?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Monitoring:&lt;/strong&gt; What is the current monitoring/in-house support infrastructure?  What needs to be duplicated, and can any portion of this facility be reused?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Interfaces:&lt;/strong&gt; What language are your applications written in, and what drivers exist to connect to your current database?  Will there be a compatible driver available in your language of choice in order?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Extensions:&lt;/strong&gt; Are you currently using any in-database procedures or functionality (i.e., in PL/SQL or another embedded language of choice)?  If so, how many?  What will the difficulty be in porting these functions to PostgreSQL?&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;And a couple of business-related questions:&lt;/p&gt;

&lt;ol start="11"&gt;

&lt;li&gt;&lt;strong&gt;Scheduling:&lt;/strong&gt; What is the timeframe for transition?  When can appropriate downtime be scheduled?  How much database downtime can you afford?&lt;/li&gt;

&lt;li&gt;&lt;strong&gt;Staffing:&lt;/strong&gt; Do you currently have in-house DBAs to manage the servers, etc on a day-to-day basis?  Is there anyone with PostgreSQL experience or familiarity on staff?&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;Being able to answer all of these questions is critical to formulating a migration plan and carrying out a migration successfully.&lt;/p&gt;

&lt;p&gt;Particularly with the impending (July 2010) end of life for previous PostgreSQL releases 7.4, 8.0 and (in November 2010) 8.1, a database migration may be on your radar.  End Point is one of many professional PostgreSQL support companies who would be happy to assist you in your transition.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-8879828808839785237?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/8879828808839785237/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=8879828808839785237' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/8879828808839785237'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/8879828808839785237'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/07/postgresql-migration-support-checklist.html' title='PostgreSQL: Migration Support Checklist'/><author><name>David Christensen</name><uri>http://www.blogger.com/profile/14237702125131095813</uri><email>david@endpoint.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='01514418822641247743'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-1468915235240718015</id><published>2010-07-13T15:26:00.007-04:00</published><updated>2010-07-14T17:09:38.695-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Views across many similar tables</title><content type='html'>&lt;p&gt;An application I'm working on has a host of (a dozen or so) status tables, each containing various rows that reflect the state of associated rows in other tables. For instance:&lt;/p&gt;&lt;pre&gt;Table "public.inventory"
...
status_code      | character varying(50)       | not null

Table "public.inventory_statuses"
code          | character varying(50)       | not null
display_label | character varying(70)       | not null

SELECT * FROM inventory_statuses;

  code    | display_label
-----------+---------------
ordered   | Ordered
shipped   | Shipped
returned  | Returned
repaired  | Repaired&lt;/pre&gt;etc.&lt;/p&gt;&lt;p&gt;Several of the codes are common to several tables. For instance, "void" is a status that occurs in seven tables. The application cares about this; there are code-level triggers that will respond to a change of status to "void" in one table, and pass that information along to another table higher up the chain.&lt;/p&gt;&lt;p&gt;Since I wasn't present at the birth of the system (nor do I have unlimited memory to keep 180+ codes in my head), I needed a way to answer the question, "In which table(s) does status 'foo' occur?" This was made rather easier by attention to detail early on: each of the status tables was named "*_statuses"; each primary key was named "code"; and each human-readable description field was named "display_label". I wrote a Pl/PgSQL function to create a view spanning all the tables. (I could have just created the SQL by hand, but I wanted a way to reproduce this effort later, if tables are added, dropped, or modified.)&lt;/p&gt;&lt;pre&gt;CREATE FUNCTION create_all_statuses()
RETURNS VOID
LANGUAGE 'plpgsql'
AS $$
DECLARE
   stmt TEXT;
   tbl RECORD;
BEGIN
   stmt := '';
   FOR tbl IN EXECUTE $SQL$
SELECT DISTINCT table_name
FROM information_schema.columns a
JOIN information_schema.columns b
USING (table_name)
JOIN information_schema.tables t
USING (table_name)
WHERE a.column_name = 'code'
AND   b.column_name = 'display_label'
AND   table_name ~ '_statuses$'
AND   t.table_type  = 'BASE TABLE'
$SQL$
   LOOP
       IF (LENGTH(stmt) &gt; 0)
       THEN
           stmt := stmt || ' UNION ';
       END IF;
       stmt := stmt || 'SELECT code, display_label, ' ||
           quote_literal(tbl.table_name) ||
           ' AS table_name FROM ' ||
           quote_ident(tbl.table_name);
   END LOOP;

   EXECUTE 'CREATE VIEW all_statuses AS ' || stmt;
   RETURN;
END;
$$;&lt;/pre&gt;Now it's easy to answer the question:&lt;pre&gt;select * from all_statuses where code = 'void';

code | display_label |              table_name
------+---------------+--------------------------------------
void | Void          | inventory_statuses
void | Void          | parcel_statuses
void | Void          | pick_list_statuses&lt;/pre&gt;etc.&lt;/p&gt;&lt;p&gt;If your database uses boilerplate columns such as "last_modified" or "date_created" to record timestamps on rows, you could use similar logic to create a view that would tell you which tables were the most recently modified.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-1468915235240718015?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/1468915235240718015/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=1468915235240718015' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1468915235240718015'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1468915235240718015'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/07/views-across-many-similar-tables.html' title='Views across many similar tables'/><author><name>Jeff Boes</name><uri>http://www.blogger.com/profile/15252208460270474613</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08893228874683100863'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-2308958588086190329</id><published>2010-06-10T15:30:00.004-04:00</published><updated>2010-06-17T23:31:22.658-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>pgcrypto pg_cipher_exists errors on upgrade from PostgreSQL 8.1</title><content type='html'>&lt;style&gt;span{font-family:sans-serif;}span.c{color:red;}span.t{font-family:sans-serif;color:green;}span.o{font-family:sans-serif;color:blue;}&lt;/style&gt;&lt;p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_BSsdd9WIV2k/TBE7vDbm4DI/AAAAAAAAACQ/-CeYVMPfjXM/s1600/enigma.jpg"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 240px; height: 320px;" src="http://2.bp.blogspot.com/_BSsdd9WIV2k/TBE7vDbm4DI/AAAAAAAAACQ/-CeYVMPfjXM/s320/enigma.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5481227901047595058" /&gt;&lt;/a&gt; While migrating a client from a 8.1 Postgres database to a 8.4 Postgres database, I came across a very annoying pgcrypto problem. (&lt;a href="http://www.postgresql.org/docs/current/static/pgcrypto.html"&gt;pgcrypto&lt;/a&gt; is a very powerful and useful contrib module that contains many functions for encryption and hashing.) Specifically, the following functions were removed from pgcrypto as of version 8.2 of Postgres:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;pg_cipher_exists&lt;/li&gt;&lt;li&gt;pg_digest_exists&lt;/li&gt;&lt;li&gt;pg_hmac_exists&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;While the functions listed above &lt;a href="http://www.mail-archive.com/pgsql-hackers@postgresql.org/msg81136.html"&gt;were deprecated&lt;/a&gt;, and marked as such for a while, their complete removal from 8.2 presents problems when upgrading via a simple pg_dump. Specifically, even though the client was not using those functions, they were still there as part of the dump. Here's what the error message looked like:&lt;/p&gt;&lt;pre&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="o"&gt;pg_dump mydb --create | psql -X -p 5433 -f - &gt;pg.stdout 2&gt;pg.stderr&lt;/span&gt;&lt;span class="t"&gt;
...
psql:&lt;stdin&gt;:2654: ERROR:  could not find function "pg_cipher_exists"
  in file "/var/lib/postgresql/8.4/lib/pgcrypto.so"
psql:&lt;stdin&gt;:2657: ERROR:  function public.cipher_exists(text) does not exist
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;While it doesn't stop the rest of the dump from importing, I like to remove any errors I can. In this case, it really &lt;em&gt;was&lt;/em&gt; a &lt;a href="http://en.wikipedia.org/wiki/Small_matter_of_programming"&gt;SMOP&lt;/a&gt;. Inside the Postgres 8.4 source tree, in the &lt;tt&gt;&lt;b&gt;contrib/pgcrypto&lt;/b&gt;&lt;/tt&gt; directory, I added the following declarations to &lt;tt&gt;&lt;b&gt;pgcrypto.h&lt;/b&gt;&lt;/tt&gt;:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
Datum       pg_cipher_exists(PG_FUNCTION_ARGS);
Datum       pg_digest_exists(PG_FUNCTION_ARGS);
Datum       pg_hmac_exists(PG_FUNCTION_ARGS);
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Then I added three simple functions to the bottom of the &lt;tt&gt;&lt;b&gt;pgcrypto.c&lt;/b&gt;&lt;/tt&gt; file that simply throw an error if they are invoked, letting the user know that the functions are deprecated. This is a much friendlier way than simply removing the functions, IMHO.&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
/* SQL function: pg_cipher_exists(text) returns boolean */
PG_FUNCTION_INFO_V1(pg_cipher_exists);

Datum
pg_cipher_exists(PG_FUNCTION_ARGS)
{
    ereport(ERROR,
            (errcode(ERRCODE_EXTERNAL_ROUTINE_INVOCATION_EXCEPTION),
             errmsg("pg_cipher_exists is a deprecated function")));
    PG_RETURN_TEXT_P("0");
}

/* SQL function: pg_cipher_exists(text) returns boolean */
PG_FUNCTION_INFO_V1(pg_digest_exists);

Datum
pg_digest_exists(PG_FUNCTION_ARGS)
{

    ereport(ERROR,
            (errcode(ERRCODE_EXTERNAL_ROUTINE_INVOCATION_EXCEPTION),
             errmsg("pg_digest_exists is a deprecated function")));
    PG_RETURN_TEXT_P("0");
}
/* SQL function: pg_hmac_exists(text) returns boolean */
PG_FUNCTION_INFO_V1(pg_hmac_exists);

Datum
pg_hmac_exists(PG_FUNCTION_ARGS)
{

    ereport(ERROR,
            (errcode(ERRCODE_EXTERNAL_ROUTINE_INVOCATION_EXCEPTION),
             errmsg("pg_hmac_exists is a deprecated function")));
    PG_RETURN_TEXT_P("0");
}

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;After running &lt;tt&gt;&lt;b&gt;make install&lt;/b&gt;&lt;/tt&gt; from the pgcrypto directory, the dump proceeded without any further pgcrypto errors. From this point forward, if the anyone attempts to use one of the functions, it will be quite obvious that the function is deprecated, rather than leaving the user wondering if they typed the function name incorrectly or wondering if pgcrypto is perhaps not installed.&lt;/p&gt;&lt;p&gt;Why not just add some dummy SQL functions to the &lt;tt&gt;&lt;b&gt;pgcrypto.sql&lt;/b&gt;&lt;/tt&gt; file instead of hacking the C code? Because pg_dump by default will create the database as a copy of template0. While there are other ways around the problem (such as putting the SQL functions into template1 and forcing the load to use that instead of template0, or by creating the database, adding the SQL functions, and then loading the data), this was the simplest approach.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;a href="http://www.flickr.com/photos/mwichary/2297915254/"&gt;Photo&lt;/a&gt; of &lt;a href="http://en.wikipedia.org/wiki/Enigma_machine"&gt;Enigma machine&lt;/a&gt; by &lt;a href="http://www.flickr.com/photos/mwichary/"&gt;Marcin Wichary&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-2308958588086190329?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/2308958588086190329/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=2308958588086190329' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2308958588086190329'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2308958588086190329'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/06/pgcrypto-pgcipherexists-errors-on.html' title='pgcrypto pg_cipher_exists errors on upgrade from PostgreSQL 8.1'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_BSsdd9WIV2k/TBE7vDbm4DI/AAAAAAAAACQ/-CeYVMPfjXM/s72-c/enigma.jpg' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-7302737131645785558</id><published>2010-06-01T09:00:00.004-04:00</published><updated>2010-06-17T23:33:30.830-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>Tracking Down Database Corruption With psql</title><content type='html'>&lt;p&gt;I love broken Postgres.  Really.  Well, not nearly as much as I love the usual working Postgres, but it's still a fantastic learning opportunity.  A crash can expose a slice of the inner workings you wouldn't normally see in any typical case.  And, assuming you have the resources to poke at it, that can provide some valuable insight without lots and lots of studying internals (still on my TODO list.)&lt;/p&gt;

&lt;p&gt;As a member of the PostgreSQL support team at End Point a number of diverse situations tend to cross my desk.  So imagine my excitement when I get an email containing a bit of log output that would normally make a DBA tremble in fear:&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;LOG:  server process (PID 10023) was terminated by signal 11
LOG:  terminating any other active server processes
FATAL:  the database system is in recovery mode
LOG:  all server processes terminated; reinitializing
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;Oops, signal 11 is SIGSEGV, Segmentation Fault.  Really not supposed to happen, especially in day to day activities.  That'll cause Postgres to drop all of its current sessions and restart itself, as the log lines indicate.  That crash was in response to a specific query their application was running, which essentially runs a process on a column across an entire table.  Upon running pg_dump they received a different error:&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;ERROR:  invalid memory alloc request size 2667865904
STATEMENT:  COPY public.different_table (etc, etc) TO stdout
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;Different, but still very annoying and in the way of their data.  So we have (at least) two areas of corruption.  But therein lies the bigger problem: Neither of these messages give us any clues about where in these potentially very large tables it's encountering a problem.&lt;/p&gt;

&lt;p&gt;Yes, my hope is that the corruption is not widespread.  I know this database tends to not see a whole lot of churn, relatively speaking, and that they look at most if not all the data rather frequently.  So the expectation is that it was caught not long after the disk controller or some memory or something went bad, and that whatever's wrong is isolated to a handful of pages.&lt;/p&gt;

&lt;p&gt;Our good and trusty psql command line client to the rescue!  One of the options available in psql is FETCH_COUNT, which if set will wrap a SELECT query in a cursor then automatically and repeatedly fetch the specified number of rows from it.  This option is there primarily to allow psql to show the results of large queries without having to dedicate so much memory up front.  But in this case it lets us see the output of a table scan as it happens:&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;testdb=# \set FETCH_COUNT 1
testdb=# \pset pager off
Pager usage is off.
testdb=# SELECT ctid, * FROM gs;
 ctid  | generate_series 
-------+-----------------
 (0,1) |               0
 (0,2) |               1
(scroll, scroll, scroll...)
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;(You did start that in a screen session, right?  No need to have it send all the data over to your terminal, especially if you're working remotely.  Set screen to watch for the output to go idle, Ctrl-A, _ keys by default, and switch to a different window.  Oh, and this of course isn't the client's database, but one where I've intentionally introduced some corruption.)&lt;/p&gt;

&lt;p&gt;We select the system column ctid to tell us the page where the problem occurs.  Or more specifically, the page and positions leading up to the problem:&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt; (439,226) |           99878
 (439,227) |           99879
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
:|!&gt;?
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;Yup, there it is.  Some point after item pointer 227 on page 439, which probably actually means page 440.  At this point we can reconnect, and possibly through a bit of trial and error narrow down the affected area a little more.  But for now let's run with page 440 being suspect; let's take a closer look.  And it here it should be noted that if you're going to try anything, shut down Postgres and take a file-level backup of the data directory.  Anyway, first we need to find the underlying file for our table...&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;testdb=# select oid from pg_database where datname = 'testdb';
  oid  
-------
 16393
(1 row)

testdb=#* select relfilenode from pg_class where relname = 'gs';
 relfilenode 
-------------
       16394
(1 row)

testdb=#* \q
demo:~/p82$ dd if=data/base/16393/16394 bs=8192 skip=440 count=1 | hexdump -C | less
...
000001f0  00 91 40 00 e0 90 40 00  00 00 00 00 00 00 00 00  |..@...@.........|
00000200  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000  1f 8b 08 08 00 00 00 00  02 03 70 6f 73 74 67 72  |..........postgr|
00001010  65 73 71 6c 2d 39 2e 30  62 65 74 61 31 2e 74 61  |esql-9.0beta1.ta|
00001020  72 00 ec 7d 69 63 1b b7  d1 f0 f3 55 fb 2b 50 8a  |r..}ic.....U.+P.|
00001030  2d 25 96 87 24 5f 89 14  a6 a5 25 5a 56 4b 1d 8f  |-%..$_....%ZVK..|
00001040  28 27 4e 2d 87 5a 91 2b  6a 6b 72 97 d9 25 75 c4  |('N-.Z.+jkr..%u.|
00001050  f6 fb db df 39 00 2c b0  bb a4 28 5b 71 d2 3e 76  |....9.,...([q.&gt;v|
00001060  1b 11 8b 63 30 b8 06 83  c1 60 66 1c c6 93 41 e4  |...c0....`f...A.|
...
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;Huh, so through perhaps either a kernel bug, a disk controller problem, or bizarre action on the part of a sysadmin, the last bit of our table has been overwritten by the 9.0beta1 tarball distribution.  Incidentally this is not one of the recommended ways of upgrading your database.&lt;/p&gt;

&lt;p&gt;With a corrupt page identified, if it's fairly clear the invalid data covers most or all of the page it's probably not too likely we'll be able to recover any rows from it.  Our best bet is to "zero out" the page so that Postgres will skip over it and let us pull the rest of the data from the table.  We can use `dd` to seek to the corrupt block in the table and write out an 8k block of zero-bytes in its place.  Shut down Postgres (just to make sure it doesn't re-overwrite your work later) and note the conv=notrunc that'll keep dd from truncating the rest of the table.&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;demo:~/p82$ dd if=/dev/zero of=data/base/16393/16394 bs=8192 seek=440 count=1 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000141498 s, 57.9 MB/s
demo:~/p82$ dd if=data/base/16393/16394 bs=8192 skip=440 count=1 | hexdump -C
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000147993 s, 55.4 MB/s
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00002000
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;Cool, it's now an empty, uninitialized page that Postgres should be fine skipping right over.  Let's test it, start Postgres back up and run psql again...&lt;/p&gt;
&lt;span style="color: rgb(0, 0, 0);"&gt;&lt;pre&gt;testdb=# select count(*) from gs;
 count 
-------
 99880
(1 row)
&lt;/pre&gt;&lt;/span&gt;
&lt;p&gt;No crash, hurray!  We've clearly lost some rows from the table, but that should now allow us to rescue any of the surrounding data.  As always it's worth dumping out all the data you can, running initdb, and loading it back in.  You never know what else might have been affected in the original database.  This is of course no substitute for a real backup, but if you're in a pinch at least there is some hope.  For now, PostgreSQL is happy again!&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-7302737131645785558?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/7302737131645785558/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=7302737131645785558' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7302737131645785558'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7302737131645785558'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/06/tracking-down-database-corruption-with.html' title='Tracking Down Database Corruption With psql'/><author><name>Josh Williams</name><uri>http://www.blogger.com/profile/10667170385197485182</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='15260210630504920885'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-3903772217282648795</id><published>2010-05-25T11:17:00.007-04:00</published><updated>2010-06-17T23:33:37.433-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='community'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>The PGCon "Hall Track"</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://1.bp.blogspot.com/_Q7kwFwwUJ0o/S_vvGhQRvuI/AAAAAAAAAUM/HfFvlyBfTww/s1600/pgcon2010.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 128px; height: 128px;" src="http://1.bp.blogspot.com/_Q7kwFwwUJ0o/S_vvGhQRvuI/AAAAAAAAAUM/HfFvlyBfTww/s320/pgcon2010.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5475232667283865314" /&gt;&lt;/a&gt;
&lt;p&gt;One of my favorite parts of PGCon is always the "hall track", a general term for the sideline discussions and brainstorming sessions that happen over dinner, between sessions (or sometimes during sessions), and pretty much everywhere else during the conference. This year's hall track topics seemed to be set by the developers' meeting; everywhere I went, someone was talking about hooks for external security modules, MERGE, predicate locking, extension packaging and distribution, or exposing transaction order for replication. Other developers' pet projects that didn't appear in the meeting showed up occasionally, including unlogged tables and range types. Even more than, for instance, the &lt;a href="http://wiki.postgresql.org/wiki/PgCon_2010_Developer_Meeting"&gt;wiki pages describing the things people plan to work on&lt;/a&gt;, these interstitial discussions demonstrate the vibrancy of the community and give a good idea just how active our development really is.&lt;/p&gt;
&lt;p&gt;This year I shared rooms with Robert Haas, so I got a good overview of his plans for &lt;a href="http://rhaas.blogspot.com/2010/05/global-temporary-and-unlogged-tables.html"&gt;global temporary and unlogged tables&lt;/a&gt;. I spent a while with &lt;a href="http://thoughts.j-davis.com/"&gt;Jeff Davis&lt;/a&gt; looking through the code for exclusion constraints and deciding whether it was realistically possible to cause a starvation problem with many concurrent insertions into a table with an exclusion constraint. I didn't spend the time I should have talking with &lt;a href="http://blog.tapoueh.org/blog.dim.html"&gt;Dimitri Fontaine&lt;/a&gt; about his PostgreSQL extensions project, but if time permits I'd like to see if I could help out with it. Nor did I find the time I'd have liked to work on &lt;a href="http://pl.parrot.org/"&gt;PL/Parrot&lt;/a&gt;, but I was glad to meet Jonathan Leto, who has done most of the coding work thus far on that project.&lt;/p&gt;
&lt;p&gt;In contrast to other conferences, I didn't have a particular itch of my own to scratch between sessions. During past conferences I've been eager to discuss ideas for multi-column statistics; though that work continues, slowly, time hasn't permitted enough recent development even for the topic to be fresh in my mind, much less worthy of in-depth discussion. This lack of one overriding subject turned out to be a refreshing change, however, as it left the other hall track subjects less filtered.&lt;/p&gt;
&lt;p&gt;Finally, it was nice to spend time with co-workers, and in fact to meet (finally) in person the &lt;a href="http://www.endpoint.com/team/greg_sabino_mullane"&gt;one of the "Greg"s&lt;/a&gt; I'd talked to on the phone many times, but never actually met in person. Various engagements in my family or his have gotten in the way in the past. One of the quirks of working for a distributed organization...&lt;/p&gt;
&lt;p&gt;Update: Fixed link to developers' meeting wiki page, thanks to comment from roppert&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-3903772217282648795?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/3903772217282648795/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=3903772217282648795' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3903772217282648795'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3903772217282648795'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/pgcon-hall-track.html' title='The PGCon &quot;Hall Track&quot;'/><author><name>Joshua Tolley</name><uri>http://www.blogger.com/profile/08481531515300677240</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='13545358980860046445'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://1.bp.blogspot.com/_Q7kwFwwUJ0o/S_vvGhQRvuI/AAAAAAAAAUM/HfFvlyBfTww/s72-c/pgcon2010.png' height='72' width='72'/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-1752348666110206483</id><published>2010-05-24T16:15:00.010-04:00</published><updated>2010-06-17T23:33:44.427-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='community'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Postgres Conference - PGCon2010 - Day Two</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s1600/pgcon2010.png"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 128px; height: 128px;" src="http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s320/pgcon2010.png" alt="" id="BLOGGER_PHOTO_ID_5473789666231795170" border="0" /&gt;
&lt;/a&gt;&lt;p&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s1600/pgcon2010.png"&gt;Day two of&lt;/a&gt; &lt;a href="http://www.pgcon.org/2010/"&gt;the PostgreSQL Conference&lt;/a&gt; started a little later than the previous day in obvious recognition of the fact that many people were up very, very late the night before. (Technically, this is day four, as the first two days consisted of tutorials; this was the second day of "talks").&lt;/p&gt;

&lt;p&gt;The first talk I went to was &lt;a href="http://www.pgcon.org/2010/schedule/events/251.en.html"&gt;PgMQ: Embedding messaging in PostgreSQL&lt;/a&gt; by Chris Bohn. It was well attended, although there were definitely a lot of late-comers and bleary eyes. A tough slot to fill! Chris is from Etsy.com and I've worked with him there, although I had no interaction with the PgMQ project, which looks pretty cool. From the talk description:&lt;/p&gt;
&lt;blockquote&gt;PgMQ (PostgreSQL Message Queueing) is an add-on that embeds a messaging client inside PostgreSQL. It supports the AMQP, STOMP and OpenWire messaging protocols, meaning that it can work with all of the major messaging systems such as ActiveMQ and RabbitMQ. PgMQ enables two replication capabilities: "Eventually Consistent" Replication and sharding.&lt;/blockquote&gt;

&lt;p&gt;As near as I can tell, "eventually consistent" is the same as "asynchronous replication": the slave won't be the same as the master right away, but will be eventually. As with Bucardo and Slony, the actual lag is very small in practice: a handful of seconds at the most. I like the fact that it supports all those common messaging protocols. Chris mentioned in the talk that it should be possible for other systems like Bucardo to support something similar. I'll have to play around with PgMQ a bit and see about doing just that. :)&lt;/p&gt;

&lt;div style="float: right; margin: 0pt 0pt 10px 10px"&gt;&lt;img style="cursor: pointer; width: 320px; height: 240px;" src="http://4.bp.blogspot.com/_BSsdd9WIV2k/S_rhxi1VT-I/AAAAAAAAACI/KL4jL9p30kw/s320/post_talk.jpg" alt="The typical post-talk gatherings" id="BLOGGER_PHOTO_ID_5474936538302599138" border="0" /&gt;&lt;br /&gt;&lt;em&gt;&lt;tt&gt;The typical post-talk gatherings&lt;/tt&gt;&lt;/em&gt;&lt;/div&gt;

&lt;p&gt;The next "talk" was the enigmatically labeled &lt;a href="http://www.pgcon.org/2010/schedule/events/268.en.html"&gt;Replication Panel&lt;/a&gt;. Enigmatic in this case as it had no description whatsoever. It's a good thing I had decided to check it out anyway (I'm a sucker for any talk related to replication, in case it wasn't obvious yet). I was apparently nominated to be on the panel, representing Bucardo! So much for getting all my speaking done and over with the first day. The panel represented a pretty wide swatch of Postgres replication technologies, and by the people who are very deep in the development of each one. From left to right on a cluster of stools at the front of the room was:&lt;/p&gt;

&lt;ul&gt;

&lt;li&gt;&lt;a href="http://skytools.projects.postgresql.org/doc/londiste.cmdline.html"&gt;Londiste&lt;/a&gt; (Marko Kreen)&lt;/li&gt; &lt;li&gt;&lt;a href="http://slony.info/"&gt;Slony&lt;/a&gt; (Jan Wieck)&lt;/li&gt; &lt;li&gt;&lt;a href="http://pgpool.projects.postgresql.org/"&gt;pgpool-II&lt;/a&gt; (&lt;a href="http://www.pgcon.org/2008/schedule/speakers/95.en.html"&gt;Tatsuo Ishii&lt;/a&gt;)&lt;/li&gt; &lt;li&gt;&lt;a href="http://wiki.postgresql.org/wiki/Hot_Standby"&gt;Hot standby&lt;/a&gt; and &lt;a href="http://wiki.postgresql.org/wiki/Streaming_Replication"&gt;Streaming replication&lt;/a&gt; (Heikki Linnakangas)&lt;/li&gt; &lt;li&gt;&lt;a href="http://bucardo.org/"&gt;Bucardo&lt;/a&gt; (Greg Sabino Mullane)&lt;/li&gt; &lt;li&gt;&lt;a href="http://code.google.com/p/golconde/"&gt;Golconde&lt;/a&gt; (Gavin M. Roy)&lt;/li&gt;&lt;/ul&gt;

&lt;p&gt;After a quick one-minute each intro describing who we were and what our replication system was, we took questions from the audience. Rather, Dan Langille played the part of the moderator and gathered written questions from the audience which he read to us, and we each took turns answering. We managed to get through 16 questions. All were interesting, even if some did not apply to all the solutions. Some of the more relevant ones I remember:&lt;/p&gt;&lt;ul&gt;&lt;p&gt;&lt;em&gt;"If your replication solution was not available, which of the other replication solutions would you recommend?"&lt;/em&gt; This was my favorite question. My answer was: if using Bucardo in multi-master mode, switch to pgpool. If using in master-slave mode, use Slony.&lt;/p&gt;&lt;p&gt;&lt;em&gt;"How will PG 9.0 affect your solution? Will your solution still remain relevant?"&lt;/em&gt; This most heavily affects Bucardo, Slony, and Londiste, and we all agreed that we're happy to lose users who simply need a read-only copy of their database. Their remains plenty of use cases that 9.0 will not solve however.&lt;/p&gt;&lt;p&gt;&lt;em&gt;"For multi-master solutions: How are database collisions resolved? Do you recommend your solution for geographically remote locations?"&lt;/em&gt; This one is pretty much for me alone. :) I gave a quick overview of Bucardo's built-in conflict resolution systems, and how custom ones built on business logic works. Since Bucardo was originally built to support servers over a non-optimal network, the second part was an easy Yes.&lt;/p&gt;&lt;p&gt;&lt;em&gt;"Is there a way to standardize and reduce the number of replication systems and focus on making the subset more robust, efficient, and versatile?"&lt;/em&gt; The general answer was no, as the use cases for all of them are so wildly different. I thought the only possible reduction was to combine Slony and Londiste, as they are very close technically and have pretty much identical use cases.&lt;/p&gt;&lt;p&gt;&lt;em&gt;"How easy is it to switch masters? Are you planning on improving the tools to do so?"&lt;/em&gt; With Bucardo, switching is as easy as pointing to a different database if using master-master. However, Bucardo master-slave has no built in support at all for failover (like Slony does). So the answer is "not easy at all" and yes, we want to provide tools to do so.&lt;/p&gt;&lt;p&gt;&lt;em&gt;"What is your biggest bug, problem, or limitation you are fixing now?"&lt;/em&gt; All three of the async trigger solutions (Bucardo, Slony, and Londiste) answered "DDL triggers". Which is hopefully coming for 9.1 (stop reading this blog and get to work on that, Jan).&lt;/p&gt;&lt;p&gt;All in all, I really liked the panel, and I think the audience did as well. Hopefully we'll see more things like at future conferences. Since we did not know the questions before hand, and took everything from the audience, it was the polar opposite of someone giving a talk with prepared slides.&lt;/p&gt;&lt;p&gt;I had some people come up to me afterwards to ask for more details about Bucardo, because (as they pointed out), it's the only multi-master replication system for Postgres (not technically true, as pg-pool and rubyrep provide multi-master use cases as well, but the former is synchronous and fairly complex, while the latter is very new and lacking some features). Maybe next year I should give a whole talk on Bucardo rather than just blabbing about it here on the blog. :)&lt;/p&gt;&lt;p&gt;After that, I popped into the &lt;a href="http://www.pgcon.org/2010/schedule/events/257.en.html"&gt;Check Please! What Your Postgres Databases Wishes You Would Monitor&lt;/a&gt; talk by Robert Treat (who I also used to work with). It was a good talk, but pretty much review for me, as watching over and monitoring databases is what I spend a lot of my time doing. :) Here's the description:&lt;/p&gt;&lt;blockquote&gt;Compared to many proprietary systems, Postgres tends to be pretty straight forward to run. However, if you want to get the most from your database, you shouldn't just set it and forget it, you need to monitor a few key pieces of information to keep performance going. This talk will review several key metrics you should be aware of, and explain under which scenarios you may need additional monitoring.&lt;/blockquote&gt;&lt;p&gt;The final talk I went to was &lt;a href="http://www.pgcon.org/2010/schedule/events/244.en.html"&gt;Deploying and testing triggers and functions in multiple databases&lt;/a&gt; by Norman Yamada. This was an interesting talk for me because he was using a lot of the code from the same_schema action in the check_postgres program to do the actual comparison. Indeed, I made some patches while at the conference to allow for better index comparison's at Norman's request. I also managed to get some work done on tail_n_mail and Bucardo while there - something about being surrounded by all that Postgres energy made me productive despite having very little free time.&lt;/p&gt;&lt;p&gt;I had to catch an early flight, and was not able to catch the final talk slot of the day, nor the closing session or the BOFs that night. Hopefully someone who did catch those will blog about it and let me know how it went. I hear the t-shirt we signed at the developer's meeting went for a sweet ransom.&lt;/p&gt;&lt;p&gt;If you went to PgCon, I have two requests for you. &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_BSsdd9WIV2k/S_rg0RrAu-I/AAAAAAAAACA/rjQ2SUDZ0nI/s1600/elephants_sunset.jpg"&gt;&lt;img style="margin: 0pt 10px 10px 0pt; float: left; cursor: pointer; width: 228px; height: 266px;" src="http://2.bp.blogspot.com/_BSsdd9WIV2k/S_rg0RrAu-I/AAAAAAAAACA/rjQ2SUDZ0nI/s400/elephants_sunset.jpg" alt="" id="BLOGGER_PHOTO_ID_5474935485723884514" border="0" /&gt;&lt;/a&gt; First, please fill out the feedback for each talk you went to. It takes less than a minute per talk, and is invaluable for both the speakers and the conference organizers. Second, please blog about PgCon. It's helpful for people who did not get to go to see the conference through other people's eyes. And do it now, while things are still fresh.&lt;/p&gt;&lt;p&gt;If you did not go to PgCon, I have one request for you: go next year! Perhaps next year at PgCon 2011 we'll break the 200 person mark. Thanks to Dan Langille as always for creating PgCon and keeping it running smooth year after year.&lt;/p&gt;
&lt;/ul&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-1752348666110206483?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/1752348666110206483/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=1752348666110206483' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1752348666110206483'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1752348666110206483'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgres-conference-pgcon2010-day-two.html' title='Postgres Conference - PGCon2010 - Day Two'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s72-c/pgcon2010.png' height='72' width='72'/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-4481509869781397911</id><published>2010-05-21T13:47:00.014-04:00</published><updated>2010-06-17T23:33:50.864-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='community'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL Conference - PGCon 2010 - Day One</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s1600/pgcon2010.png"&gt;&lt;img style="float:left; margin:0 10px 10px 0;cursor:pointer; cursor:hand;width: 128px; height: 128px;" src="http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s320/pgcon2010.png" border="0" alt=""id="BLOGGER_PHOTO_ID_5473789666231795170" /&gt;&lt;/a&gt;
&lt;p&gt;The first day of talks for &lt;a href="http://www.pgcon.org/2010/"&gt;PGCon 2010&lt;/a&gt; is now over, here's a recap of the parts that I attended.&lt;/p&gt;&lt;p&gt;On Wednesday, the &lt;a href="http://wiki.postgresql.org/wiki/PgCon_2010_Developer_Meeting"&gt;developer's meeting&lt;/a&gt; took place. It was basically 20 of us gathered around a long conference table, with Dave Page keeping us to a strict schedule. While there were a few side conversations and contentious issues, overall we covered an amazing amount of things in a short period of time, and actually made action items out of almost all of them. My favorite *decision* we made was to finally move to git, something myself and others have been championing for years. The other most interesting parts for me were the discussion of what features we will try to focus on for 9.1 (it's an ambitious list, no doubt), and DDL triggers! It sounds like Jan Wieck has already given this a lot of thought, so I'm looking forward to working with him in implementing these triggers (or at least &lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_BSsdd9WIV2k/S_bJh8xK-JI/AAAAAAAAABo/sr7DFIHeGAk/s1600/devmeeting.jpg"&gt;&lt;img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 320px; height: 240px;" src="http://2.bp.blogspot.com/_BSsdd9WIV2k/S_bJh8xK-JI/AAAAAAAAABo/sr7DFIHeGAk/s320/devmeeting.jpg" alt="" id="BLOGGER_PHOTO_ID_5473783982200649874" border="0" /&gt;&lt;/a&gt; nagging him about it if he slows down). These triggers will be immensely useful to replication systems like Bucardo and Slony, which implement DDL replication in a very manual and unsatisfactory way. These triggers will not be like the current triggers, in that they will not be directly attached to system tables. Instead, they will be associated with certain DDL events, such that you could have a trigger on any CREATE events (or perhaps also allowing something finer grained such as a trigger on a CREATE TABLE event). Whenever it comes in, I'll make sure that Bucardo supports it, of course!&lt;/p&gt;
&lt;p&gt;The first day of talks kicked off the the plenary by Gavin Roy called "Perspectives on NoSQL" (&lt;a href="http://www.pgcon.org/2010/schedule/events/219.en.html"&gt;description and slides are available&lt;/a&gt;). Gavin actually took the time to *gasp* research the topic, and gave a quick rundown of some of the more popular "NoSQL" solutions, including CouchDB, MongoDB, Cassandra, Project Voldemort, Redis, and Tokyo Tyrant. He then benchmarked all of them against Postgres for various tasks - and did it against both "regular safe" Postgres and "running with scissors" fsync-off Postgres. The results? Postgres scales, very well, and more than holds it own against the NoSQL newcomers. MongoDB did surprisingly well: see the slides for the details. His slides also had the unfortunate portmanteau of "YeSQL", which only helps to empahsize how silly our "PostgreSQL" name is. :)&lt;/p&gt;
&lt;p&gt;The next talk was &lt;a href="http://www.pgcon.org/2010/schedule/events/238.en.html"&gt;Postgres (for non-Postgres people)&lt;/a&gt; by Greg Sabino Mullane (me!). Unlike previous years, my slides are already online. Yes, at first blush, it seems a strange talk to give at a conference like this, but we always have a good number of people from other database systems that are considering Postgres, are in the process of migrating to Postgres, or are just new to Postgres. The talk was in three parts: the first was about the mechanics of migrating your application to Postgres: the data types that Postgres uses, how we implement indexes, the best way to migrate your data, and many other things, with an eye towards common migration problems (especially when coming from MySQL). The second part of the talk discussed some of the quirks of Postgres people coming from DB2, Oracle, etc. should be aware of. Some things discussed: how Postgres does MVCC and need for vacuum, our really smart planner and lack of hints, the automatic (and against the spec) lowercasing, and our concept of schemas. I also touched on what I see as some of our drawbacks: tuned for a toaster, no true in place upgrade, the unpronounceable name, the lack of marketing. and what some of our perceived-but-not-real drawbacks are: lack of replication, poor speed. What would a list of drawbacks be without a list of strengths?: transactional DDL, very friendly and helpful community, PostGIS, authentication options, awesome query planner, the ability to create your own custom database objects, and our distributed nature that ensures the project cannot be bought out or destroyed. The last part of the talk went over the Postgres project itself: the community, the developers, the philosophy, and how it all fits together. I ran out of time so did not get to tell my "longest patch process ever" story for \dfS (six years!) but I don't think I missed anything important and gave time for some questions.&lt;/p&gt;&lt;p&gt;The next talk was &lt;a href="http://www.pgcon.org/2010/schedule/events/233.en.html"&gt;Hypothetical Indexes towards self-tuning in PostgreSQL&lt;/a&gt; by &lt;a href="http://www.pgcon.org/2010/schedule/speakers/165.en.html"&gt;Sergio Lifschitz&lt;/a&gt;. In the words of Sergio:&lt;/p&gt;&lt;blockquote&gt;Hypothetical indexes are simulated index structures created solely in the database catalog. This type of index has no physical extension and, therefore, cannot be used to answer actual queries. The main benefit is to provide a means for simulating how query execution plans would change if the hypothetical indexes were actually created in the database. This feature is quite useful for database tuners and DBAs.&lt;/blockquote&gt;&lt;p&gt;It was a very interesting talk. Robert Haas asked him to put it in the PostgreSQL license so we can easily put it into the project as needed. Sergio promised to make the change immediately after the talk!&lt;/p&gt;&lt;p&gt;After lunch, the next talk was &lt;a href="http://www.pgcon.org/2010/schedule/events/216.en.html"&gt;pg_statsinfo - More useful statistics information for DBAs&lt;/a&gt; by &lt;a href="http://www.pgcon.org/2010/schedule/speakers/160.en.html"&gt;Tatsuhito Kasahara&lt;/a&gt;. This talk was a little hard to follow along, but had some interesting ideas about monitoring Postgres, a lot of which overlapped with some of my projects such as &lt;a href="http://bucardo.org/wiki/Tail_n_mail"&gt;tail_n_mail&lt;/a&gt; and &lt;a href="http://bucardo.org/wiki/Check_postgres"&gt;check_postgres&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;The next talk was &lt;a href="http://www.pgcon.org/2010/schedule/events/234.en.html"&gt;Forensic Analysis of Corrupted Databases&lt;/a&gt; by Greg Stark. This was a neat little talk; many of the error messages he displayed were all too familiar to me. It was nice overview of how to track down the exact location of a problem in a corrupted database, and some strategies for fixing it, including the old "using dd to write things from /dev/zero directly into your Postgres files" trick. There was even a discussion about the possibility of zeroing out specific parts of a page header, with the consensus that it would not work as one would hope.&lt;/p&gt;&lt;p&gt;After a quick hacky sack break with Robert Treat and some Canadian locals, I went to the final real talk of the day: &lt;a href="http://www.pgcon.org/2010/schedule/events/208.en.html"&gt;The PostgreSQL Query Planner&lt;/a&gt; by &lt;a href="http://www.pgcon.org/2010/schedule/speakers/140.en.html"&gt;Robert Haas&lt;/a&gt;. I had seen this talk recently, but wanted to see it again as I missed some of the beginning of the talk when I saw it at &lt;a href="http://www.postgresqlconference.org/2010/east"&gt;Pg East 2010&lt;/a&gt; in Philly. Robert gave a good talk, and was very good at repeating the audience's questions. I didn't learn all that much, but it was a very good overview of the planner, including some of the new planner tricks (such as join removal) in 9.0 and 9.1.&lt;/p&gt;&lt;p&gt;After that, the &lt;a href="http://www.pgcon.org/2010/schedule/events/267.en.html"&gt;lightning talks&lt;/a&gt; started. I really like lightning talks, and thankfully they weren't held on the last day of the conference this time (a common mistake). The MC was &lt;a href="http://www.chesnok.com/daily/"&gt;Selena Deckelmann&lt;/a&gt;, who did a great job of making sure all the slides were gathered up beforehand, and strictly enforced the five minute time limit. The list of slides is &lt;a href="http://wiki.postgresql.org/wiki/PgCon_2010_Lightning_talks"&gt;on the Postgres wiki&lt;/a&gt;. I talked on my latest favorite project, tail_n_mail - the slides are &lt;a href="http://wiki.postgresql.org/wiki/Image:Tnm.pdf"&gt;available on the wiki&lt;/a&gt;. I didn't make it through all my slides, so if you were at the talks, check out the PDF for the final two that were not shown. There seemed to be good interest in the project, and I had several people tell me afterwards they would try it out.&lt;/p&gt;&lt;p&gt;The night ended with the EnterpriseDB sponsored party. I spoke to a lot of people there, about replication, PITR scripts, log monitoring, the problem with a large number of inherited objects, and many other topics. Note to EDB: I don't think that venue is going to scale, as the conference gets bigger each year! The total number of people at the conference this year was 184, a new record.&lt;/p&gt;&lt;p&gt;A very good first day: I learned a lot, met new people, saw old friends, and hopefully sold Postgres to some non-Postgres people :). I also managed to git push some changes to tail_n_mail, check_postgres, and Bucardo. It's hard to say no to feature requests when someone asks you in person. :)&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-4481509869781397911?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/4481509869781397911/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=4481509869781397911' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4481509869781397911'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4481509869781397911'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgresql-conference-pgcon-2010-day.html' title='PostgreSQL Conference - PGCon 2010 - Day One'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://3.bp.blogspot.com/_BSsdd9WIV2k/S_bOszbzPeI/AAAAAAAAABw/yYTq8Q2srFE/s72-c/pgcon2010.png' height='72' width='72'/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-5110024839449798491</id><published>2010-05-19T17:10:00.002-04:00</published><updated>2010-05-20T11:55:58.823-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='git'/><category scheme='http://www.blogger.com/atom/ns#' term='configuration-management'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL switches to Git</title><content type='html'>&lt;p&gt;Looks like the Postgres project is finally going to be bite the bullet and switch to git as the canonical VCS. Some details are yet to be hashed out, but the decision has been made and a new repo will be built soon. Now to lobby to get that commit-with-inline-patches list to be created...&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-5110024839449798491?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/5110024839449798491/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=5110024839449798491' title='7 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5110024839449798491'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5110024839449798491'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgresql-switches-to-git.html' title='PostgreSQL switches to Git'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>7</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-1072623665447990902</id><published>2010-05-19T00:44:00.005-04:00</published><updated>2010-06-17T23:34:01.959-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='redhat'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL 8.4 on RHEL 4: Teaching an old dog new tricks</title><content type='html'>&lt;p&gt;So a client has been running a &lt;em&gt;really&lt;/em&gt; old version of PostgreSQL
in production for a while.  We finally got the approval to upgrade
them from 7.3 to the latest 8.4.  Considering the age of the
installation, it should come as little surprise that they had been
running a similarly ancient OS: RHEL 4.&lt;/p&gt;

&lt;p&gt;Like the installed PostgreSQL version, RHEL 4 is ancient -- 5 years
old.  I anticipated that in order to get us to a current version of
PostgreSQL, we'd need to resort to a source build or rolling our own
PostgreSQL RPMs.  Neither approach was particularly appealing.&lt;/p&gt;

&lt;p&gt;While the age/decrepitude of the current machine's OS came as
little surprise, what did come as a surprise was that there were
supported RPMs available for RHEL 4 in the community yum rpm
repository, located at &lt;a href="http://yum.pgrpms.org/8.4/redhat/rhel-4-i386/repoview/"&gt;
http://yum.pgrpms.org/8.4/redhat/rhel-4-i386/repoview/&lt;/a&gt; (modulo your architecture of choice).&lt;/p&gt;

&lt;p&gt;In order to get things installed, I followed the instructions for
installing the specific yum repo.  There were a few seconds where I
was confused because the installation command was giving a "permission
denied" error when attempting to install the 8.4 PGDG rpm as root.  A
little brainstorming and a &lt;tt&gt;lsattr&lt;/tt&gt; later revealed that a
previous administrator, apparently in the quest for über-security, had
performed a &lt;tt&gt;chattr +i&lt;/tt&gt; on the &lt;tt&gt;/etc/yum.repo.d&lt;/tt&gt;
directory.&lt;/p&gt;

&lt;p&gt;Evil having been thwarted, in the interest of über-usability I did
a quick &lt;tt&gt;chattr -i /etc/yum.repo.d&lt;/tt&gt; and installed the PGDG rpm.
Away we went.  From that point, the install was completely
straightforward; I had a PostgreSQL 8.4.4 system running in no time,
and could &lt;em&gt;finally&lt;/em&gt; get off that 7.3 behemoth.  Now to talk my way
into an OS upgrade...&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-1072623665447990902?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/1072623665447990902/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=1072623665447990902' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1072623665447990902'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1072623665447990902'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgresql-84-on-rhel-4-teaching-old.html' title='PostgreSQL 8.4 on RHEL 4: Teaching an old dog new tricks'/><author><name>David Christensen</name><uri>http://www.blogger.com/profile/14237702125131095813</uri><email>david@endpoint.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='01514418822641247743'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-4884951622126208684</id><published>2010-05-17T18:37:00.013-04:00</published><updated>2010-06-17T23:34:09.984-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='security'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Finding the PostgreSQL version - without logging in!</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_BSsdd9WIV2k/S_HJexUNozI/AAAAAAAAAA8/90nRl6rrFMo/s1600/fingerprint.jpg"&gt;&lt;img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 180px; height: 239px;" src="http://4.bp.blogspot.com/_BSsdd9WIV2k/S_HJexUNozI/AAAAAAAAAA8/90nRl6rrFMo/s320/fingerprint.jpg" alt="" id="BLOGGER_PHOTO_ID_5472376552703238962" border="0" /&gt;&lt;/a&gt;
&lt;style&gt;span.t{font-family:monospace,sans-serif;color:darkgreen;font-weight:bold;}&lt;/style&gt;
&lt;p&gt;Metasploit used the error messages given by a PostgreSQL server to find out the version without actually having to log in and issue a "SELECT version()" command. The original article is at &lt;a href="http://blog.metasploit.com/2010/02/postgres-fingerprinting.html"&gt;http://blog.metasploit.com/2010/02/postgres-fingerprinting.html&lt;/a&gt; and is worth a read. I'll wait.&lt;/p&gt;&lt;p&gt;The basic idea is that because version 3 of the Postgres protocol gives you the file and the line number in which the error is generated, you can use the information to figure out what version of Postgres is running, as the line numbers change from version to version. In effect, each version of Postgres reveals enough in its error message to fingerprint it. This was a neat little trick, and I wanted to explore it more myself. The first step was to write a quick Perl script to connect and get the error string out. The original Metasploit script focuses on failed login attempts, but after some experimenting I found an easier way was to send an invalid protocol number (Postgres expects "2.0" or "3.0"). Sending a startup packet with an invalid protocol of "3.1" gave me back the following string:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
E|SFATALC0A000Munsupported frontend protocol 3.1: 
server supports 1.0 to 3.0Fpostmaster.cL1507RProcessStartupPacket
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The important part of the string was the parts indicating the file and line number:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
Fpostmaster.cL1507
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;In this case, we can clearly see that line 1507 of postmaster.c was throwing the error. After firing up a few more versions of Postgres and recording the line numbers, I found that all versions since 7.3 were hitting the same chunk of code from postmaster.c:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
/* Check we can handle the protocol the frontend is using. */

if (PG_PROTOCOL_MAJOR(proto) &lt;&gt; PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) ||
  (PG_PROTOCOL_MAJOR(proto) == PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST) &amp;amp;&amp;amp;
   PG_PROTOCOL_MINOR(proto) &gt; PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST)))
  ereport(FATAL,
  (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
    errmsg("unsupported frontend protocol %u.%u: server supports %u.0 to %u.%u",
      PG_PROTOCOL_MAJOR(proto), PG_PROTOCOL_MINOR(proto),
      PG_PROTOCOL_MAJOR(PG_PROTOCOL_EARLIEST),
      PG_PROTOCOL_MAJOR(PG_PROTOCOL_LATEST),
      PG_PROTOCOL_MINOR(PG_PROTOCOL_LATEST))));
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Line numbers were definitely different across major versions of Postgres (e.g. 8.2 vs. 8.3), and were even different sometimes across revisions. Rather than fire up every possible revision of Postgres and run my program against it, I simply took advantage of the cvs tags (aka symbolic names) and did this:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
cvs update -rREL8_3_0 -p postmaster.c | grep -Fn 'LATEST))))'
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;This showed me that the string occurred on line 1497 of postmaster.c. I created a Postgres instance and verified that the line number was the same. At that point, it was a simple matter of making a bash script to grab all releases since 7.3 and build up a comprehensive list of when that line changed from version to version.&lt;/p&gt;&lt;p&gt;Once that was done, I rolled the whole thing up into a new Perl script called "detect_postgres_version.pl". Here's the script, broken into pieces for explanation. A link to the entire script is at the bottom of the post.&lt;/p&gt;&lt;p&gt;First, we do some standard Perl script things and read in the __DATA__ section at the bottom of the script, which lists at which version the message has changed:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
#!/usr/bin/env perl

## Quickly and roughly determine what version of Postgres is running
## greg@endpoint.com

use strict;
use warnings;
use IO::Socket;
use Data::Dumper;
use Getopt::Long;

## __DATA__ looks like this: filname / line / version when it changed
## postmaster.c 1287 7.4.0
## postmaster.c 1293 7.4.2
## postmaster.c 1293 7.4.29
##
## postmaster.c 1408 8.0.0
## postmaster.c 1431 8.0.2

## Build our hash of file-and-line to version matches
my %map;
my ($last,$lastmin,$lastline) = ('',0,0);
while (&lt;data&gt;) {
   next if $_ !~ /(\w\S+)\s+(\d+)\s+(.+)/;
   my ($file,$line,$version) = ($1,$2,$3);
   die if $version !~ /(\d+)\.(\d+)\.(\d+)/;
   my ($vmaj,$vmin,$vrev) = ($1,$2,$3);
   my $current = "$file|$vmaj|$vmin";
   if ($current eq $last) {
       my ($lfile,$lmaj,$lmin) = split /\|/ =&gt; $last;
       for (my $x = $lastmin+1 ; $x&lt;$vrev; $x++) {
           push @{$map{$file}{$lastline}}
             =&gt; ["$lmaj.$lmin","$lmaj.$lmin.$x"];
       }
   }
   push @{$map{$file}{$line}} =&gt; ["$vmaj.$vmin",$version];
   $last = $current;
   $lastmin = $vrev;
   $lastline = $line;
}
&lt;/data&gt;&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Next, we allow a few options to the script: port and host. We'll default to a Unix socket if the host is not set, and default to port 5432 if none is given:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
## Read in user options and set defaults
my %opt;
GetOptions(\%opt,
          'port=i',
          'host=s',
);

my $port = $opt{port} || 5432;
my $host = $opt{host} || '';
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;We're ready to connect, using the very standard IO::Socket module. If the host starts with a slash, we assume this is the unix_socket_directory and replace the default '/tmp' location:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
## Start the connection, either unix or tcp
my $server;
if (!$host or !index $host, '/') {
   my $path = $host || '/tmp';
   $server = IO::Socket::UNIX-&gt;new(
       Type =&gt; IO::Socket::SOCK_STREAM,
       Peer =&gt; "$path/.s.PGSQL.$port",
   ) or die "Could not connect!: $@";
}
else {
   $server = IO::Socket::INET-&gt;new(
       PeerAddr =&gt; $host,
       PeerPort =&gt; $port,
       Proto    =&gt; 'tcp',
       Timeout  =&gt; 3,
   ) or warn "Could not connect!: $@";
}
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Now we're ready to actually send something over our new socket. Postgres expects the startup packet to be in a certain format. We'll follow that format, but send it an invalid protocol number, 3.1. The rest of the information does not really matter, but we'll also tell it we're connecting as user "pg". Finally, we read back in the message, extract the file and line number, and spit them back out to the user:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
## Build and sent the packet
my $packet = pack('nn', 3,1) . "user\0pg\0\0";
$packet = pack('N', length($packet) + 4). $packet;
$server-&gt;send($packet, 0);

## Get the message back and extract the filename and line number
my $msg;
recv $server, $msg, 1000, 0;
if ($msg !~ /F([\w\.]+)\0L(\d+)/) {
   die "Could not find a file and line from error message: $msg\n";
}

my ($file,$line) = ($1,$2);

print "File: $file Line: $line\n";
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Finally, we try to map the file name and line number we received back to the version of PostgreSQL it came from. If the file is not recognized, or the line number is not known, we bail out early:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
$map{$file}
   or die qq{Sorry, I do not know anything about the file "$file"\n};

$map{$file}{$line}
   or die qq{Sorry, I do not know anything about line $line of file "$file"\n};

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;If there is only one result for this line and file number, we can state what it is and exit.&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
my $result = $map{$file}{$line};

if (1 == @$result) {
   print "Most likely Postgres version $result-&gt;[0][1]\n";
   exit;
}
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;In most cases, though, we don't know the exact version down to the revision after the second dot, so we'll state what the major version is, and all the possible revisions:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
## Walk through and figure out which versions it may be.
## For now, we know that the major version does not overlap
print "Most likely Postgres version $result-&gt;[0][0]\n";
print "Specifically, one of these:\n";

for my $row (@$result) {
   print "  Postgres version $row-&gt;[1]\n";
}

exit;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The only thing left is the DATA section, which I'll show here to be complete:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
__DATA__

## Format: filename line version

postmaster.c 1167 7.3.0
postmaster.c 1167 7.3.21

postmaster.c 1287 7.4.0
postmaster.c 1293 7.4.2
postmaster.c 1293 7.4.29

postmaster.c 1408 8.0.0
postmaster.c 1431 8.0.2
postmaster.c 1441 8.0.5
postmaster.c 1445 8.0.6
postmaster.c 1439 8.0.7
postmaster.c 1443 8.0.9
postmaster.c 1445 8.0.14
postmaster.c 1445 8.0.25

postmaster.c 1449 8.1.0
postmaster.c 1450 8.1.1
postmaster.c 1454 8.1.2
postmaster.c 1448 8.1.3
postmaster.c 1452 8.1.4
postmaster.c 1448 8.1.9
postmaster.c 1454 8.1.10
postmaster.c 1454 8.1.21

postmaster.c 1432 8.2.0
postmaster.c 1437 8.2.1
postmaster.c 1440 8.2.5
postmaster.c 1432 8.2.17

postmaster.c 1497 8.3.0
postmaster.c 1507 8.3.8
postmaster.c 1507 8.3.11

postmaster.c 1570 8.4.0
postmaster.c 1621 8.4.1
postmaster.c 1621 8.4.4

postmaster.c 1664 9.0.0
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;(Because version 9.0 is not released yet, its line number may still change.)&lt;/p&gt;&lt;p&gt;I found this particular protocol error to be a good one because there is no overlap of line numbers across major versions. Of the approximately 125 different versions released since 7.3.0, only 6 are unique enough to identify to the exact revision. That's okay for this iteration of the script. If you wanted to know the exact revision, you could try other errors, such as an invalid login, as the metasploit code does.&lt;/p&gt;&lt;p&gt;The complete code can be read here: &lt;a href="http://gtsm.com/detect_postgres_version.pl"&gt;detect_postgres_version.pl&lt;/a&gt;&lt;/p&gt;&lt;p&gt;I'll be giving &lt;a href="http://www.pgcon.org/2010/schedule/events/238.en.html"&gt;a talk&lt;/a&gt; later on this week at &lt;a href="http://www.pgcon.org/2010/"&gt;PgCon 2010&lt;/a&gt;, so say hi if you see me there. I'll probably be giving a &lt;a href="http://www.pgcon.org/2010/schedule/events/267.en.html"&gt;lightning talk&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-4884951622126208684?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/4884951622126208684/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=4884951622126208684' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4884951622126208684'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4884951622126208684'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/finding-postgresql-version-without.html' title='Finding the PostgreSQL version - without logging in!'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://4.bp.blogspot.com/_BSsdd9WIV2k/S_HJexUNozI/AAAAAAAAAA8/90nRl6rrFMo/s72-c/fingerprint.jpg' height='72' width='72'/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-7814297437826915726</id><published>2010-05-12T17:00:00.003-04:00</published><updated>2010-06-17T23:34:16.311-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>Using PostgreSQL Hooks</title><content type='html'>&lt;p&gt;PostgreSQL is well known for its extensibility; users can build new functions, operators, data types, and procedural languages, among others, without having to modify the core PostgreSQL code. Less well known is PostgreSQL's extensive set of "hooks", available to the more persistent coder. These hooks allow users to interrupt and modify behavior in all kinds of places without having to rebuild PostgreSQL.&lt;/p&gt;
&lt;p&gt;Few if any of these hooks appear in the documentation, mostly because the code documents them quite well, and anyone wanting to use them is assumed already to be sufficiently familiar with the code to find the information they'd need to use one. For those interested in getting started using hooks, though, an example can be useful. Fortunately, the contrib source provides one, in the form of &lt;a href="http://www.postgresql.org/docs/9.0/static/passwordcheck.html"&gt;passwordcheck&lt;/a&gt;, a simple contrib module that checks users' passwords for sufficient strength. These checks include having a length greater than 8 characters, being distinct from the username, and containing both alphabetic and non-alphabetic characters. It can also use &lt;a href="http://sourceforge.net/projects/cracklib/"&gt;CrackLib&lt;/a&gt; for more intense password testing, if built against the CrackLib code.&lt;/p&gt;
&lt;p&gt;In general, these hooks consist of global function pointers of a specific type, which are initially set to NULL. Whenever PostgreSQL wants actually to use a hook, it checks the function pointer, and if it's not NULL, calls the function it points to. When someone implements a hook, they write a function of the proper type and an initialization function to set the function pointer variable. They then package the functions in a library, and tell PostgreSQL to load the result, often using &lt;a href="http://www.postgresql.org/docs/9.0/static/runtime-config-resource.html#GUC-SHARED-PRELOAD-LIBRARIES"&gt;shared_preload_libraries&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For our example, the important pieces of the PostgreSQL code are in src/backend/commands/user.c and src/include/commands/user.h. First, we need a function pointer type, which in this case is called check_password_hook_type:&lt;/p&gt;
&lt;pre&gt;
typedef void (*check_password_hook_type)
   (const char *username, const char *password,
   int password_type, Datum validuntil_time,
   bool validuntil_null);

extern PGDLLIMPORT check_password_hook_type check_password_hook;
&lt;/pre&gt;
&lt;p&gt;This says the check_password_hook will take arguments for user name, password, password type, and validity information (for passwords valid until certain dates). It also provides an extern declaration of the actual function pointer, called "check_password_hook".&lt;/p&gt;
&lt;p&gt;The next important pieces of code are in src/backend/commands/user.c, as follows:&lt;/p&gt;
&lt;pre&gt;
/* Hook to check passwords in CreateRole() and AlterRole() */
check_password_hook_type check_password_hook = NULL;
&lt;/pre&gt;
&lt;p&gt;...which defines the function hook variable, and this:&lt;/p&gt;
&lt;pre&gt;
 if (check_password_hook &amp;&amp; password)
  (*check_password_hook) (stmt-&gt;role, password,
      isMD5(password) ? PASSWORD_TYPE_MD5 : PASSWORD_TYPE_PLAINTEXT,
    validUntil_datum,
    validUntil_null);
&lt;/pre&gt;
&lt;p&gt;...which actually uses the hook. Actually the hook is used twice, with identical code, once in CreateRole() and once in AlterRole(), so as to provide password checking in both places. (Insert D.R.Y. rant here).&lt;/p&gt;
&lt;p&gt;In order to take advantage of this hook, the passwordcheck module needs to implement the hook function, and set the check_password_hook variable to point to that function. First, passwordcheck.c needs to include a few things, including "commands/user.h" to ge the definitions of check_password_hook and check_password_hook_type, and call the PG_MODULE_MAGIC macro every PostgreSQL shared library needs. Then, it implements the password checking logic in a function called check_password():&lt;/p&gt;
&lt;pre&gt;

static void
check_password(const char *username,
      const char *password,
      int password_type,
      Datum validuntil_time,
      bool validuntil_null)
{
/* Actual password checking logic goes here */
}
&lt;/pre&gt;
&lt;p&gt;Note that this declaration matches the arguments described in the check_password_hook_type, above.&lt;/p&gt;
&lt;p&gt;Now to ensure the check_password_hook variable points to this new check_password() function. When loading a shared library, PostgreSQL looks for a function defined in that library called _PG_init(), and runs it if it exists. In passwordcheck, the _PG_init() function is as simple as this:&lt;/p&gt;
&lt;pre&gt;
void
_PG_init(void)
{
 /* activate password checks when the module is loaded */
 check_password_hook = check_password;
}
&lt;/pre&gt;
&lt;p&gt;Other modules using hooks often check the hook variable for NULL &lt;b&gt;before&lt;/b&gt; setting it, in case something else is already using the hook. For instance, the &lt;a href="http://www.postgresql.org/docs/9.0/static/auto-explain.html"&gt;auto_explain&lt;/a&gt; contrib module does this in _PG_init() (note that auto_explain uses three different hooks):&lt;/p&gt;
&lt;pre&gt;
 prev_ExecutorStart = ExecutorStart_hook;
 ExecutorStart_hook = explain_ExecutorStart;
 prev_ExecutorRun = ExecutorRun_hook;
 ExecutorRun_hook = explain_ExecutorRun;
 prev_ExecutorEnd = ExecutorEnd_hook;
 ExecutorEnd_hook = explain_ExecutorEnd;
&lt;/pre&gt;
&lt;p&gt;auto_explain also resets the hook variables in its _PG_fini() function. Since unloading modules isn't yet supported and thus, _PG_fini() never gets called, this is perhaps unimportant, but is good for the sake of being thorough.&lt;/p&gt;
&lt;p&gt;Back to passwordcheck. Having set the hook variable, all that remains is to get PostgreSQL to load this library. The easiest way to do that is to set shared_preload_libraries in postgresql.conf:&lt;p&gt;
&lt;pre&gt;
josh@eddie:~/devel/pgsrc/pg-eggyknap/contrib/passwordcheck$ psql
psql (9.0devel)
Type "help" for help.

5432 josh@josh# show shared_preload_libraries ;
 shared_preload_libraries 
--------------------------
 passwordcheck
(1 row)
&lt;/pre&gt;
&lt;p&gt;Restarting PostgreSQL loads the library, proven as follows:&lt;/p&gt;
&lt;pre&gt;

5432 josh@josh# create user badpass with password 'bad';
ERROR:  password is too short
&lt;/pre&gt;
&lt;p&gt;There are hooks like this all over the PostgreSQL code base. Simply search for "_hook_type", to find such possibilities as these:&lt;/p&gt;
&lt;table&gt;
&lt;tr&gt;&lt;td&gt;&lt;b&gt;Name&lt;/b&gt;&lt;/td&gt;&lt;td&gt;&lt;b&gt;Description&lt;/b&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;shmem_startup_hook&lt;/td&gt;&lt;td&gt;Called when PostgreSQL initializes its shared memory segment&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;explain_get_index_name_hook&lt;/td&gt;&lt;td&gt;Called when explain finds indexes' names.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;planner_hook&lt;/td&gt;&lt;td&gt;Runs when the planner begins, so plugins can monitor or even modify the planner's behavior&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;get_relation_info_hook&lt;/td&gt;&lt;td&gt;Allows modification of expansion of the information PostgreSQL gets from the catalogs for a particular relation, including adding fake indexes&lt;/td&gt;&lt;/tr&gt;&lt;/table&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-7814297437826915726?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/7814297437826915726/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=7814297437826915726' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7814297437826915726'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7814297437826915726'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/using-postgresql-hooks.html' title='Using PostgreSQL Hooks'/><author><name>Joshua Tolley</name><uri>http://www.blogger.com/profile/08481531515300677240</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='13545358980860046445'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-5507253406254744795</id><published>2010-05-12T15:04:00.013-04:00</published><updated>2010-06-17T23:34:25.768-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='testing'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL template databases to restore to a known state</title><content type='html'>&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://2.bp.blogspot.com/_BSsdd9WIV2k/S-sCnahDJHI/AAAAAAAAAA0/fujkN32aPRo/s1600/elephant_upgrade.jpg"&gt;&lt;img style="float:right; margin:0 0 10px 10px;cursor:pointer; cursor:hand;width: 320px; height: 200px;" src="http://2.bp.blogspot.com/_BSsdd9WIV2k/S-sCnahDJHI/AAAAAAAAAA0/fujkN32aPRo/s320/elephant_upgrade.jpg" border="0" alt=""id="BLOGGER_PHOTO_ID_5470469048527692914" /&gt;&lt;/a&gt;
&lt;style&gt;span{font-family:sans-serif;}span.t{font-family:sans-serif;color:green;}&lt;/style&gt;&lt;p&gt;Someone asked on the mailing lists recently about restoring a PostgreSQL database to a known state for testing purposes. How to do this depends a little bit on what one means by "known state", so let's explore a few scenarios and their solutions.&lt;/p&gt;&lt;/p&gt;&lt;p&gt;First, let's assume you have a Postgres &lt;a href="http://www.postgresql.org/docs/current/static/creating-cluster.html"&gt;cluster&lt;/a&gt; with one or more databases that you create for developers or QA people to mess around with. At some point, you want to "reset" the database to the pristine state it was in before people starting making changes to it.&lt;/p&gt;&lt;p&gt;The first situation is that people have made both DDL changes (such as ALTER TABLE ... ADD COLUMN) and DML changes (such as INSERT/UPDATE/DELETE). In this case, what you want is a complete snapshot of the database at a point in time, which you can then restore from. The easiest way to do this is to use the TEMPLATE feature of the CREATE DATABASE command.&lt;/p&gt;&lt;p&gt;Every time you run CREATE DATABASE, it uses an already existing database as the &lt;a href="http://www.postgresql.org/docs/current/static/manage-ag-templatedbs.html"&gt;"template"&lt;/a&gt;. Basically, it creates a copy of the template database you specify. If no template is specified, it uses "template1" by default, so that these two commands are equivalent:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE DATABASE foobar;
CREATE DATABASE foobar TEMPLATE template1;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Thus, if we want to create a complete copy of an existing database, we simply use it as a template for our copy:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE DATABASE mydb_template TEMPLATE mydb;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Thus, when we want to restore the &lt;strong&gt;mydb&lt;/strong&gt; database to the exact same state as it was when we ran the above command, we simply do:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
DROP DATABASE mydb;
CREATE DATABASE mydb TEMPLATE mydb_template;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;You may want to make sure that nobody changes your new template database. One way to do this is to not allow any non-superusers to connect to the database by setting the user limit to zero. This can be done either at creation time, or afterwards, like so:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE DATABASE mydb_template TEMPLATE mydb CONNECTION LIMIT 0;

ALTER DATABASE mydb_template CONNECTION LIMIT 0;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;You may want to go further by granting the database official "template" status by adjusting the &lt;strong&gt;datistemplate&lt;/strong&gt; column in the pg_database table:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
UPDATE pg_database SET datistemplate = TRUE WHERE datname = 'mydb_template';
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;This will allow anyone to use the database as a template, as long as they have the CREATEDB privilege. You can also restrict &lt;b&gt;*all*&lt;/b&gt; connections to the database, even superusers, by adjusting the &lt;strong&gt;datallowconn&lt;/strong&gt; column:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
UPDATE pg_database SET datallowconn = FALSE WHERE datname = 'mydb_template';
&lt;/span&gt;&lt;/pre&gt;
&lt;hr /&gt;&lt;p&gt;Another way to restore the database to a known state is to use the &lt;a href="http://www.postgresql.org/docs/current/static/app-pgdump.html"&gt;pg_dump utility&lt;/a&gt; to create a file, then use &lt;a href="http://www.postgresql.org/docs/current/static/app-psql.html"&gt;psql&lt;/a&gt; to restore that database. In this case, the command to save a copy would be:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
pg_dump mydb --create &gt; mydb.template.pg
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The &lt;strong&gt;--create&lt;/strong&gt; option tells pg_dump to create the database itself as the first command in the file. If you look at the generated file, you'll see that it is using &lt;strong&gt;template0&lt;/strong&gt; as the template database in this case. Why does Postgres have template0 &lt;em&gt;and&lt;/em&gt; template1? The template1 database is meant as a user configurable template that you can make changes to that will be picked up by all future CREATE DATABASE commands (a common example is a &lt;a href="http://www.postgresql.org/docs/current/static/sql-createlanguage.html"&gt;CREATE LANGUAGE&lt;/a&gt; command). The template0 database on the other hand is meant as a "hands off, don't ever change it" stable database that can always safely be used as a template, with no changes from when the cluster was first created. To that end, you are not even allowed to connect to the template0 database (thanks to the &lt;tt&gt;datallowconn&lt;/tt&gt; column metioned earlier).&lt;/p&gt;&lt;p&gt;Now that we have a file (mydb.template.pg), the procedure to recreate the database becomes:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
psql -X -c 'DROP DATABASE mydb'

psql -X --set ON_ERROR_STOP=on --quiet --file mydb.template.pg
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;We use the &lt;strong&gt;-X&lt;/strong&gt; argument to ensure we don't have any surprises lurking inside of psqlrc files. The &lt;strong&gt;--set ON_ERROR_STOP=on&lt;/strong&gt; option tells psql to stop processing the moment it encounters an error, and the &lt;strong&gt;--quiet&lt;/strong&gt; tells psql to not be verbose and only let us know about very important things. (While I normally advocate using the &lt;strong&gt;--single-transaction&lt;/strong&gt; option as well, we cannot in this case as our file contains a CREATE DATABASE line).&lt;/p&gt;
&lt;hr /&gt;&lt;p&gt;What if (as someone posited in the thread) the original poster really wanted only the *data* to be cleaned out, and not the schema (e.g. DDL)?. In this case, what we want to do is remove all rows from all tables. The easiest way to do this is with the TRUNCATE command of course. Because we don't want to worry about which tables need to be deleted before other ones because of foreign key constraints, we'll also use the CASCADE option to TRUNCATE. We'll query the system catalogs for a list of all user tables, generate truncate commands for them, and then play back the commands we just created. First, we create a simple text file containing commands to truncate all the tables:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
SELECT 'TRUNCATE TABLE '
 || quote_ident(nspname)
 || '.'
 || quote_ident(relname)
 || ' CASCADE;'
FROM pg_class
JOIN pg_namespace n ON (n.oid = relnamespace)
WHERE nspname !~ '^pg'
AND nspname &lt;&gt; 'information_schema'
AND relkind = 'r';
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Once that's saved as &lt;strong&gt;truncate_all_tables.pg&lt;/strong&gt;, resetting the database by removing all rows from all tables becomes as simple as:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
psql mydb -X -t -f truncate_all_tables.pg | psql mydb --quiet
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;We again use the &lt;strong&gt;--quiet&lt;/strong&gt; option to limit the output, as we don't need to see a string of "TRUNCATE TABLE" strings scroll by. The &lt;strong&gt;-t&lt;/strong&gt; option (also written as &lt;strong&gt;--tuples-only&lt;/strong&gt;) prevents the headers and footers from being output, as we don't want to pipe those back in.&lt;/p&gt;&lt;p&gt;It's most likely you'd also want the sequences to be reset to their starting point as well. While sequences generally start at "1", we'll take out the guesswork by using the "ALTER SEQUENCE seqname RESTART" syntax. We'll append the following SQL to the text file we created earlier:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
SELECT 'ALTER SEQUENCE '
 || quote_ident(nspname)
 || '.'
 || quote_ident(relname)
 || ' RESTART;'
FROM pg_class
JOIN pg_namespace n ON (n.oid = relnamespace)
WHERE nspname !~ '^pg'
AND nspname &lt;&gt; 'information_schema'
AND relkind = 'S';
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;The command is run the same as before, but now in addition to table truncation, the sequences are all reset to their starting values.&lt;/p&gt;
&lt;hr /&gt;&lt;p&gt;A final way to restore the database to a known state is a variation on the previous pg_dump command. Rather than save the schema &lt;em&gt;*and*&lt;/em&gt; data, we simply want to restore the database without any data:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
## Create the template file:
pg_dump mydb --schema-only --create &gt; mydb.template.schemaonly.pg

## Restore it:
psql -X -c 'DROP DATABASE mydb'
psql -X --set ON_ERROR_STOP=on --file mydb.template.schemaonly.pg
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Those are a few basic ideas on how to reset your database. There are a few limitations that got glossed over, such as that nobody can be connected to the database that is being used as a template for another one when the CREATE DATABASE command is being run, but this should be enough to get you started.&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-5507253406254744795?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/5507253406254744795/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=5507253406254744795' title='5 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5507253406254744795'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/5507253406254744795'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgresql-template-databases-to.html' title='PostgreSQL template databases to restore to a known state'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><media:thumbnail xmlns:media='http://search.yahoo.com/mrss/' url='http://2.bp.blogspot.com/_BSsdd9WIV2k/S-sCnahDJHI/AAAAAAAAAA0/fujkN32aPRo/s72-c/elephant_upgrade.jpg' height='72' width='72'/><thr:total>5</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-2548769437400577144</id><published>2010-05-09T12:04:00.005-04:00</published><updated>2010-06-17T23:34:35.912-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='monitoring'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Tail_n_Mail does Windows (log file monitoring)</title><content type='html'>&lt;p&gt;I've just released version 1.10.1 of &lt;a href="http://bucardo.org/wiki/Tail_n_mail"&gt;tail_n_mail.pl&lt;/a&gt;, the handy script for watching over your Postgres logs and sending email when interesting things happen.&lt;/p&gt;&lt;p&gt;Much of the recent work on tail_n_mail has been in improving the parsing of statements in order to normalize them and give reports like this:&lt;/p&gt;&lt;b&gt;&lt;pre&gt;

[1] From files A to Q Count: 839
First: [A] 2010-05-08T05:10:46-05:00 alpha postgres[13567]
Last:  [Q] 2010-05-09T05:02:27-05:00 bravo postgres[19334]
ERROR: duplicate key violates unique constraint "unique_email_address"
STATEMENT: INSERT INTO email_table (id, email, request, token) VALUES (?)

[2] From files C to E (between lines 12523 of A and 268431 of B, occurs 6159 times)                                          
First: [C] 2010-05-04 16:32:23 UTC [22504]                                                                                    
Last:  [E] 2010-05-05 05:04:53 UTC [23907]                                                                                    
ERROR: invalid byte sequence for encoding "UTF8": 0x????
HINT: This error can also happen if the byte sequence does not 
match the encoding expected by the server, which is controlled 
by "client_encoding".

## The above examples are from two separate instances, the first 
## of which has the "find_line_number" option turned off
&lt;/b&gt;&lt;/pre&gt;&lt;p&gt;However, I've only ever used tail_n_mail on Linux-like systems, so it will not work on Windows systems...until now. Thanks to an error report and patch from Paulo Saudin, this program will now work on Windows. There is an new option, &lt;b&gt;&lt;tt&gt;mailmode&lt;/tt&gt;&lt;/b&gt;, which defaults to 'sendmail', for the same behavior as previous versions of tail_n_mail. This assumes you have access to a &lt;b&gt;&lt;tt&gt;sendmail&lt;/tt&gt;&lt;/b&gt; binary (which may or may not be from the actual Sendmail program: many mail programs provide a compatible binary of the same name). If you don't have sendmail, you can now specify an argument of 'smtp' to the mailmode argument (you can also simply use &lt;b&gt;&lt;tt&gt;--smtp&lt;/tt&gt;&lt;/b&gt;). This switches to using the &lt;a href="http://search.cpan.org/~cwest/Net-SMTP-SSL-1.01/lib/Net/SMTP/SSL.pm"&gt;Net::SMTP::SSL module&lt;/a&gt; to send the mail instead of sendmail.&lt;/p&gt;&lt;p&gt;Switching the mailmode is not enough, of course, so there are some additional flags to help the mail go out:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;b&gt;&lt;tt&gt;--mailserver&lt;/tt&gt;&lt;/b&gt; : the name of the outgoing SMTP server&lt;/li&gt;&lt;li&gt;&lt;b&gt;&lt;tt&gt;--mailuser&lt;/tt&gt;&lt;/b&gt; : the user to authenticate with&lt;/li&gt;&lt;li&gt;&lt;b&gt;&lt;tt&gt;--mailpass&lt;/tt&gt;&lt;/b&gt; : the password of the user&lt;/li&gt;&lt;li&gt;&lt;b&gt;&lt;tt&gt;--mailport&lt;/tt&gt;&lt;/b&gt; : the port to use: defaults to 465&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Needless to say, using the --mailpass option from the command line or even in a script is not the best practice, so it is highly recommended that you put the new variables inside a &lt;b&gt;tailnmailrc&lt;/b&gt; file. When the script starts, it looks for a file named &lt;b&gt;&lt;tt&gt;.tailnmailrc&lt;/tt&gt;&lt;/b&gt; in the current directory. If that is not found, it looks for the same file in your home directory (or technically, whatever the &lt;b&gt;HOME&lt;/b&gt; environment variable is set to). If that does not exist, it checks for the file &lt;strong&gt;&lt;tt&gt;/etc/tailnmailrc&lt;/tt&gt;&lt;/strong&gt;. You can override those checks by specifying the file directly with the &lt;b&gt;&lt;tt&gt;--tailnmailrc=&lt;/tt&gt;&lt;/b&gt; option, or disable all rc files with the &lt;b&gt;&lt;tt&gt;--no-tailnmailrc&lt;/tt&gt;&lt;/b&gt; option.&lt;/p&gt;&lt;p&gt;The tailnmailrc file is very straightforward: each line is a name and value pair, separated by a colon or an equal sign. Lines starting with a '#' indicate a comment and are skipped. So someone using the new Net::SMTP::SSL method might have a .tailnmailrc in their home directory that looks like this:&lt;/p&gt;&lt;b&gt;&lt;pre&gt;
mailmode=smtp
mailserver=mail.example.com
mailuser=greg@example.com
mailpass=mysupersekretpassword
&lt;/pre&gt;&lt;/b&gt;&lt;p&gt;The tail_n_mail program is open source and BSD licensed. Contributions are always welcome: send a patch, or fork a version through &lt;a href="http://github.com/bucardo/tail_n_mail"&gt;the Github mirror&lt;/a&gt;. There is also a &lt;a href="http://bucardo.org/bugzilla/"&gt;Bugzilla system&lt;/a&gt; to accept &lt;a href="http://bucardo.org/bugzilla/buglist.cgi?product=tail_n_mail"&gt;bug reports&lt;/a&gt; and feature requests.&lt;/p&gt;

&lt;p&gt;Learn more about End Point's &lt;a href="http://www.endpoint.com/services/postgresql"&gt;Postgres Support, Development, and Consulting&lt;/a&gt;.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-2548769437400577144?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/2548769437400577144/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=2548769437400577144' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2548769437400577144'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2548769437400577144'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/tailnmail-does-windows-log-file.html' title='Tail_n_Mail does Windows (log file monitoring)'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-7196718131209156092</id><published>2010-05-05T02:32:00.005-04:00</published><updated>2010-05-05T14:58:38.650-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>PostgreSQL startup Debian logging failure</title><content type='html'>&lt;p&gt;I ran into issues with debugging why a fresh PostgreSQL replica wasn't starting on Debian.  This was with a highly-customized &lt;tt&gt;postgresql.conf&lt;/tt&gt; file with custom logging location, data_directory, etc. set.&lt;/p&gt;

&lt;p&gt;The system log files were not showing any information about the failed &lt;tt&gt;pg_ctlcluster&lt;/tt&gt; output, nor was there any information in &lt;tt&gt;/var/log/postgresql/&lt;/tt&gt; or the defined &lt;tt&gt;log_directory&lt;/tt&gt;.&lt;/p&gt;

&lt;p&gt;I was able to successfully create a new cluster with &lt;tt&gt;pg_createcluster&lt;/tt&gt; and see logs for the new cluster in &lt;tt&gt;/var/log/postgresql/&lt;/tt&gt;.  The utility &lt;tt&gt;pg_lsclusters&lt;/tt&gt; showed both clusters in the listing, but the initial cluster was still down, showing up with a custom log location.  After reviewing the Debian wrapper scripts (fortunately written in Perl) I disabled &lt;tt&gt;log_filename&lt;/tt&gt;, &lt;tt&gt;log_directory&lt;/tt&gt;, and &lt;tt&gt;logging_collector&lt;/tt&gt;, leaving &lt;tt&gt;log_destination = stderr&lt;/tt&gt;.  I was then finally able to get log information spit out to the terminal.&lt;/p&gt;

&lt;p&gt;In this case, it was due to a fresh Amazon EC2 instance lacking appropriate &lt;tt&gt;sysctl.conf&lt;/tt&gt; settings for &lt;tt&gt;kernel.shmmax&lt;/tt&gt; and &lt;tt&gt;kernel.shmall&lt;/tt&gt;.  This particular error occurred before the logging was fully set up, which is why we did not get logging information in the &lt;tt&gt;postgresql.conf&lt;/tt&gt;-designated location.&lt;/p&gt;

&lt;p&gt;Once I had the log information, it was a short matter to correct the issue.  It just goes to show that often finding the problem is 90% of the work.  Hopefully this comes in handy to someone else.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-7196718131209156092?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/7196718131209156092/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=7196718131209156092' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7196718131209156092'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7196718131209156092'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/postgresql-startup-debian-logging.html' title='PostgreSQL startup Debian logging failure'/><author><name>David Christensen</name><uri>http://www.blogger.com/profile/14237702125131095813</uri><email>david@endpoint.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='01514418822641247743'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-3898500191192053835</id><published>2010-05-04T20:06:00.004-04:00</published><updated>2010-05-05T21:25:48.014-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Tickle me Postgres: Tcl inside PostgreSQL with pl/tcl and pl/tclu</title><content type='html'>&lt;style&gt;span{font-family:sans-serif;}span.c{color:red;}span.t{font-family:sans-serif;color:green;}span.o{font-family:sans-serif;color:blue;}&lt;/style&gt;&lt;p&gt;Although I really love Pl/Perl and find it the most useful language to write PostgreSQL functions in, Postgres has had (for a long time) another set of procedural languages: &lt;b&gt;Pl/Tcl&lt;/b&gt; and &lt;b&gt;Pl/TclU&lt;/b&gt;. The Tcl language is pronounced "tickle", so those two languages are pronounced as "pee-el-tickle" and "pee-el-tickle-you". The pl/tcl languages have been around since before any others, even pl/perl; for a long time in the early days of Postgres using pl/tclu was the only way to do things "outside of the database", such as making system calls, writing files, sending email, etc.&lt;/p&gt;&lt;p&gt;Sometimes people are surprised when they hear I still use Tcl. Although it's not as widely mentioned as other procedural languages, it's a very clean, easy to read, powerful language that shouldn't be overlooked. Of course, with Postgres, you have a wide variety of languages to write your functions in, including:&lt;/p&gt;&lt;ul&gt; &lt;li&gt;&lt;a href="http://www.postgresql.org/docs/current/static/plperl.html"&gt;Perl&lt;/a&gt; (my favorite)&lt;/li&gt; &lt;li&gt;&lt;a href="http://www.postgresql.org/docs/current/interactive/pltcl-overview.html"&gt;Tcl&lt;/a&gt; (second favorite)&lt;/li&gt; &lt;li&gt;&lt;a href="http://www.postgresql.org/docs/current/interactive/plpython.html"&gt;Python&lt;/a&gt; (distant third)&lt;/li&gt; &lt;li&gt;&lt;a href="http://moulon.inra.fr/ruby/plruby.html"&gt;Ruby&lt;/a&gt;&lt;/li&gt; &lt;li&gt;&lt;a href="http://plscheme.projects.postgresql.org/"&gt;Scheme&lt;/a&gt;&lt;/li&gt; &lt;li&gt;&lt;a href="http://pgfoundry.org/projects/pllolcode"&gt;LOLCODE&lt;/a&gt;&lt;/li&gt; &lt;li&gt;&lt;a href="https://www.commandprompt.com/community/plphp/"&gt;PHP&lt;/a&gt;&lt;/li&gt; &lt;li&gt;&lt;a href="http://plsh.projects.postgresql.org/"&gt;sh&lt;/a&gt;&lt;/li&gt; &lt;li&gt;&lt;a href="http://pgfoundry.org/projects/pljava/"&gt;Java&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;The nice thing about &lt;a href="http://wiki.tcl.tk/299"&gt;Tcl&lt;/a&gt; is that not only is it an easy language to write in, it's fully supported by Postgres. Only three languages are maintained inside the Postgres tree itself: Perl, Tcl, and Python. Only two of those have a trusted and untrusted version: Perl and Tcl. All procedural languages in Postgres are untrusted by default, which means they can do things like make system calls. To be a trusted language, there must be some capacity to limit what can be done by the language. With Perl, this is accomplished through the "Safe" Perl module. For Tcl, this is accomplished by having two versions of the Tcl interpreter: a normal one for pltclu and a separate one that uses the &lt;a href="http://www.tcl.tk/software/plugin/safetcl.html"&gt;"Safe-Tcl mechanism"&lt;/a&gt; for pltcl.&lt;/p&gt;&lt;p&gt;Let's take a quick look at what a pltcl function looks like. We'll use pl/tcl to implement the common problem of "SELECT COUNT(*) is very slow" by tracking the row count using &lt;a href="http://www.postgresql.org/docs/current/interactive/trigger-definition.html"&gt;triggers&lt;/a&gt; as we go along. For this, we'll start with a sample table that we want to be able to find out exactly how many rows are inside of any time, without suffering the delay of COUNT(*). Here's the table definition, and a quick command to populate it with some dummy data:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE SEQUENCE customer_id_seq;

CREATE TABLE customer (
  id      INTEGER     NOT NULL DEFAULT nextval('customer_id_seq') PRIMARY KEY,
  email   TEXT            NULL,
  address TEXT            NULL,
  cdate   TIMESTAMPTZ NOT NULL DEFAULT now()
);

INSERT INTO customer (email, address)
  SELECT 'jsixpack@example.com', '123 Main Street'
  FROM generate_series(1,10000);

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;A quick review: we create a &lt;a href="http://www.postgresql.org/docs/current/interactive/functions-sequence.html"&gt;sequence&lt;/a&gt; for use by the table to populate its primary key, the 'id' column. Each customer also has an optional email and address, plus we automatically track when we create the row by using the "DEFAULT now()" trick on the 'cdate' column. Finally, we use the super handy &lt;a href="http://www.postgresql.org/docs/current/interactive/functions-srf.html"&gt;generate_series&lt;/a&gt; function to populate the new table with ten thousand rows of data.&lt;/p&gt;&lt;p&gt;Next, we'll create a helper table that will keep track of the rows for us. We'll make it generic so that it can track any number of tables:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE TABLE table_count (
  schemaname TEXT   NOT NULL,
  tablename  TEXT   NOT NULL,
  rows       BIGINT NOT NULL DEFAULT 0
);

INSERT INTO table_count(schemaname,tablename,rows)
  SELECT 'public', 'customer', count(*) FROM customer;

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;We also populated it with the current number of rows in customer. Of course, this will be out of date as soon as someone updates the table, so let's add our triggers. We don't want to update the table_count table on every single row change, but only at the end of each statement. To do that, we'll make a row-level trigger that stores up the changes inside a global variable, and then a statement-level trigger that uses the global variable to update the table_count table.&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE FUNCTION update_table_count_row()
  RETURNS TRIGGER
  SECURITY DEFINER
  VOLATILE
  LANGUAGE pltcl
AS $BC$

  ## Declare tablecount as a global variable so other functions
  ## can access our changes
  variable tablecount

  ## Set the local count of rows changed to 0
  set rows 0

  ## $TG_op indicates what type of command was just run
  ## Modify the local variable rows depending on what we just did
  switch $TG_op {
    INSERT {
      incr rows 1
    }
    UPDATE {
      ## No change in number of rows
      ## We could also leave out the ON UPDATE from the trigger below
    }
    DELETE {
      incr rows -1
    }
  }

  ## The tablecount variable will be an associative array
  ## The index will be this table's name, the value is the rows changed
  ## We should probably be using $TG_schema_name as well, but we'll ignore that

  ## If there is no variable for this table yet, create it, otherwise just change it
  if {![ info exists tablecount($TG_table_name) ] } {
    set tablecount($TG_table_name) $rows
  } else {
    incr tablecount($TG_table_name) $rows
  }

  return OK
$BC$;

CREATE FUNCTION update_table_count_statement()
  RETURNS TRIGGER
  SECURITY DEFINER
  LANGUAGE pltcl
AS $BC$

  ## Make sure we access the global version of the tablecount variable
  variable tablecount

  ## If it doesn't exist yet (for example, when an update changes no 
  ## rows), we simply exit early without making changes
  if { ! [ info exists tablecount ] } {
    return OK
  }
  ## Same logic if our specific entry in the array does not exist
  if { ! [ info exists tablecount($TG_table_name) ] } {
    return OK
  }
  ## If no rows were changed, we simply exit
  if { $tablecount($TG_table_name) == 0 } {
    return OK
  }

  ## Update the table_count table: may be a positive ior negative shift
  spi_exec "
    UPDATE table_count
    SET rows=rows+$tablecount($TG_table_name)
    WHERE tablename = '$TG_table_name'
  "

  ## Reset the global variable for the next round
  set tablecount($TG_table_name) 0

  return OK
$BC$;

CREATE TRIGGER update_table_count_row
  AFTER INSERT OR UPDATE OR DELETE
  ON public.customer
  FOR EACH ROW
  EXECUTE PROCEDURE update_table_count_row();

CREATE TRIGGER update_table_count_statement
  AFTER INSERT OR UPDATE OR DELETE
  ON public.customer
  FOR EACH STATEMENT
  EXECUTE PROCEDURE update_table_count_statement();

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;(Caveat: because there is a single Tcl interpreter for all pl/tcl functions, these functions are not 100% safe, as there is a theoretical chance that changes made by processes running at the exact same time may step on each other's global variables. In practice, this is unlikely.)&lt;/p&gt;&lt;p&gt;If everything is working correctly, we should see the entries in the table_count table match up with the output of SELECT COUNT(*). Let's take a look via a psql session:&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; \t&lt;/span&gt;
&lt;span class="o"&gt;Showing only tuples.&lt;/span&gt;
&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; \a&lt;/span&gt;
&lt;span class="o"&gt;Output format is unaligned.&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|10000
10000&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; UPDATE customer SET email=email WHERE id &lt;= 10;&lt;/span&gt;
&lt;span class="o"&gt;UPDATE 10&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|10000
10000&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; INSERT INTO customer (email, address)&lt;/span&gt;
&lt;span class="c"&gt;psql-#&lt;/span&gt;&lt;span class="t"&gt;   SELECT email, address FROM customer LIMIT 4;&lt;/span&gt;
&lt;span class="o"&gt;INSERT 0 4&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|10004
10004&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; DELETE FROM customer WHERE id &lt;= 10;&lt;/span&gt;
&lt;span class="o"&gt;DELETE 10&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|9994
9994&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; TRUNCATE TABLE customer;&lt;/span&gt;
&lt;span class="o"&gt;TRUNCATE TABLE&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|9994
0&lt;/span&gt;

&lt;/pre&gt;&lt;p&gt;Whoops! Everything matched up until that TRUNCATE. On earlier versions of Postgres, there was no way around that problem, but if we have Postgres version 8.4 or better, we can use truncate triggers!&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
CREATE FUNCTION update_table_count_truncate()
  RETURNS TRIGGER
  SECURITY DEFINER
  LANGUAGE pltcl
AS $BC$

  spi_exec "
    UPDATE table_count
    SET rows=0
    WHERE tablename = '$TG_table_name'
  "

  set tablecount($TG_table_name) 0

 return OK
$BC$;

CREATE TRIGGER update_table_count_truncate
  AFTER TRUNCATE
  ON public.customer
  FOR EACH STATEMENT
  EXECUTE PROCEDURE update_table_count_truncate();

&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Pretty straightforward, let's make sure it works:&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; TRUNCATE TABLE customer;&lt;/span&gt;
&lt;span class="o"&gt;TRUNCATE TABLE&lt;/span&gt;

&lt;span class="c"&gt;psql=#&lt;/span&gt;&lt;span class="t"&gt; SELECT * FROM table_count; SELECT COUNT(*) FROM customer;&lt;/span&gt;
&lt;span class="o"&gt;public|customer|0
0&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;Success! This was a fairly contrived example, but Tcl (and especially pl/tclU) offers a lot more functionality. If you want to examine pl/tcl and pl/tclu for yourself, you'll need to make sure it's compiled into the Postgres you are using. If using a packaging system, it's as simple as doing this (or something like it, depending on what packaging system you use):&lt;/p&gt;&lt;pre&gt;
&lt;span class="t"&gt;yum install postgresql-pltcl&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;If compiling from source, just pass the &lt;em&gt;--with-tcl&lt;/em&gt; option to &lt;b&gt;configure&lt;/b&gt;. You'll probably also need to install the Tcl development package, e.g. with &lt;span class="t"&gt;yum install tcl-devel&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Once installed, installing it into a specific database is as simple as:&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; CREATE LANGUAGE pltcl;&lt;/span&gt;
&lt;span class="o"&gt;CREATE LANGUAGE&lt;/span&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; CREATE LANGUAGE pltclu;&lt;/span&gt;
&lt;span class="o"&gt;CREATE LANGUAGE&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;For more about Tcl, check out the &lt;a href="http://wiki.tcl.tk/_/welcome"&gt;The Tcl Wiki&lt;/a&gt;, the &lt;a href="http://www.tcl.tk/man/tcl8.5/tutorial/tcltutorial.html"&gt;Tcl tutorial&lt;/a&gt;, or this &lt;a href="http://www.tcl.tk/man/tcl8.5/"&gt;Tcl reference&lt;/a&gt;. For more about pl/tcl and pl/tclu. visit the &lt;a href="http://www.postgresql.org/docs/current/interactive/pltcl-overview.html"&gt;Postgres pltcl documentation&lt;/a&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-3898500191192053835?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/3898500191192053835/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=3898500191192053835' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3898500191192053835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3898500191192053835'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/05/tcl-inside-postgresql-pltcl-and-pltclu.html' title='Tickle me Postgres: Tcl inside PostgreSQL with pl/tcl and pl/tclu'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-1350753647711413921</id><published>2010-04-30T16:00:00.000-04:00</published><updated>2010-04-30T19:46:22.603-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>LinuxFest Northwest: PostgreSQL 9.0 upcoming features</title><content type='html'>&lt;p&gt;Once again, LinuxFest Northwest provided a full track of PostgreSQL talks during their two-day conference in Bellingham, WA.&lt;/p&gt;

&lt;p&gt;Gabrielle Roth and I presented &lt;a href="http://github.com/gorthx/pg9_preso"&gt;our favorite features in 9.0&lt;/a&gt;, including a live demo of Hot Standby with streaming replication!  We also demonstrated features like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the newly improved '&lt;a href="http://www.postgresql.org/docs/8.4/static/sql-altertable.html"&gt;set storage MAIN&lt;/a&gt;' behavior (&lt;a href="http://www.postgresql.org/docs/8.4/static/storage-toast.html"&gt;TOAST related&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;'samehost' and 'samenet' designations to &lt;a href="http://developer.postgresql.org/pgdocs/postgres/auth-pg-hba-conf.html"&gt;pg_hba.conf&lt;/a&gt; (see CIDR-address section)&lt;/li&gt;
&lt;li&gt;Log changed parameter values when postgresql.conf is reloaded&lt;/li&gt;
&lt;li&gt;Allow &lt;a href="http://developer.postgresql.org/pgdocs/postgres/sql-explain.html"&gt;EXPLAIN output&lt;/a&gt; in XML, JSON, and YAML formats (which our own Greg Sabino Mullane worked on!&lt;/li&gt;
&lt;li&gt;Allow &lt;a href="http://developer.postgresql.org/pgdocs/postgres/sql-notify.html"&gt;NOTIFY&lt;/a&gt; to pass an optional string to listeners&lt;/li&gt;
&lt;li&gt;And of course - &lt;a href="http://developer.postgresql.org/pgdocs/postgres/hot-standby.html"&gt;Hot Standby&lt;/a&gt; and &lt;a href="http://developer.postgresql.org/pgdocs/postgres/warm-standby.html#STREAMING-REPLICATION"&gt;Streaming Replication&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
The &lt;a href="http://developer.postgresql.org/pgdocs/postgres/release-9-0.html"&gt;full feature list&lt;/a&gt; is available at on the developer site right now!&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-1350753647711413921?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/1350753647711413921/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=1350753647711413921' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1350753647711413921'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/1350753647711413921'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/04/linuxfest-northwest-postgresql-90.html' title='LinuxFest Northwest: PostgreSQL 9.0 upcoming features'/><author><name>Selena Deckelmann</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-6051822758381419496</id><published>2010-04-28T12:04:00.007-04:00</published><updated>2010-04-28T16:34:58.154-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>Viewing Postgres function progress from the outside</title><content type='html'>&lt;style&gt;span { font-family: sans-serif; } span.c { color: red; } span.t { font-family: sans-serif; color: green; } span.o { font-family: sans-serif; color: blue; } &lt;/style&gt;&lt;p&gt;Getting visibility into what your PostgreSQL function is doing can be a difficult task. While you can sprinkle notices inside your code, for example with the &lt;a href="http://www.postgresql.org/docs/current/static/plpgsql-errors-and-messages.html"&gt;RAISE feature&lt;/a&gt; of plpgsql, that only shows the notices to the session that is currently running the function. Let's look at a solution to peek inside a long-running function from any session.&lt;/p&gt;&lt;p&gt;While there are a few ways to do this, one of the most elegant is to use &lt;a href="http://www.postgresql.org/docs/current/static/functions-sequence.html"&gt;Postgres sequences&lt;/a&gt;, which have the unique property of living "outside" the normal &lt;a href="http://wiki.postgresql.org/wiki/MVCC"&gt;MVCC&lt;/a&gt; visibility rules. We'll &lt;strike&gt;ab&lt;/strike&gt;use this feature to allow the function to update its status as it goes along.&lt;/p&gt;&lt;p&gt;First, let's create a simple example function that simulates doing a lot of work, and taking a long time to do so. The function doesn't really do anything, of course, so we'll throw some random sleeps in to emulate the effects of running on a busy production machine. Here's what the first version looks like:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
DROP FUNCTION IF EXISTS slowfunc();

CREATE FUNCTION slowfunc()
RETURNS TEXT
VOLATILE
SECURITY DEFINER
LANGUAGE plpgsql
AS $BC$
DECLARE
  x INT = 1;
  mynumber INT;
BEGIN
  RAISE NOTICE 'Start of function';

  WHILE x &lt;= 5 LOOP
    -- Random number from 1 to 10
    SELECT 1+(random()*9)::int INTO mynumber;
    RAISE NOTICE 'Start expensive step %: time to run=%', x, mynumber;
 PERFORM pg_sleep(mynumber);
    x = x + 1;
  END LOOP;

  RETURN 'End of function';
END
$BC$;&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Pretty straightforward function: we simply emulate doing five expensive steps, and output a small notice as we go along. Running it gives this output (with pauses from 1-10 seconds of course):&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; psql -f slowfunc.sql&lt;/span&gt;
&lt;span class="o"&gt;DROP FUNCTION
CREATE FUNCTION
psql:slowfunc.sql:30: NOTICE:  Start of function
psql:slowfunc.sql:30: NOTICE:  Start expensive step 1: time to run=2
psql:slowfunc.sql:30: NOTICE:  Start expensive step 2: time to run=7
psql:slowfunc.sql:30: NOTICE:  Start expensive step 3: time to run=3
psql:slowfunc.sql:30: NOTICE:  Start expensive step 4: time to run=8
psql:slowfunc.sql:30: NOTICE:  Start expensive step 5: time to run=5
    slowfunc     
-----------------
 End of function&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;To grant some visibility to other processes about where we are, we're going to change a sequence from within the function itself. First we need to decide on what sequence to use. While we could pick a common name, this won't allow us to run the function in more than one process at a time. Therefore, we'll create unique sequences based on the PID of the process running the function. Doing so is fairly trivial for an application: just create that sequence before the expensive function is called. For this example, we'll use some psql tricks to achieve the same effect like so:&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
\t
\o tmp.drop.sql
SELECT 'DROP SEQUENCE IF EXISTS slowfuncseq_' || pg_backend_pid() || ';';
\o tmp.create.sql
SELECT 'CREATE SEQUENCE slowfuncseq_' || pg_backend_pid() || ';';
\o
\t
\i tmp.drop.sql
\i tmp.create.sql
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;From the top, this script turns off everything but tuples (so we have a clean output), then arranges for all output to go to the file named "tmp.drop.sql". Then we build a sequence name by concatenating the string 'slowfuncseq_' with the current PID. We put that into a DROP SEQUENCE statement. Then we redirect the output to a new file named "tmp.create.sql" (this closes the old one as well). We do the same thing for CREATE SEQUENCE. Finally, we stop sending things to the file, turn off "tuples only" mode, and import the two files we just created, first to drop the sequence if it exists, and then to create it. The files will look something like this:&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; more tmp.*.sql&lt;/span&gt;
&lt;span class="o"&gt;::::::::::::::
tmp.drop.sql
::::::::::::::
 DROP SEQUENCE IF EXISTS slowfuncseq_8762;

::::::::::::::
tmp.create.sql
::::::::::::::
 CREATE SEQUENCE slowfuncseq_8762;&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;The only thing left is to add the calls to the sequence from within the function itself. Remember that the sequence called must exist, or the function will throw an exception, so make sure you create the sequence before the function is called! (Alternatively, you could use the same named sequence every time, but as explained before, you lose the ability to track more than one iteration of the function at a time.)&lt;/p&gt;&lt;pre&gt;&lt;span class="t"&gt;
DROP FUNCTION IF EXISTS slowfunc();

CREATE FUNCTION slowfunc()
RETURNS TEXT
VOLATILE
SECURITY DEFINER
LANGUAGE plpgsql
AS $BC$
DECLARE
  x INT = 1;
  mynumber INT;
  seqname TEXT;
BEGIN
  SELECT INTO seqname 'slowfuncseq_' || pg_backend_pid();
  PERFORM nextval(seqname);

  RAISE NOTICE 'Start of function';

  WHILE x &lt;= 5 LOOP
    -- Random number from 1 to 10
    SELECT 1+(random()*9)::int INTO mynumber;
    RAISE NOTICE 'Start expensive step %: time to run=%', x, mynumber;
 PERFORM pg_sleep(mynumber);
    PERFORM nextval(seqname);
    x = x + 1;
  END LOOP;

  RETURN 'End of function';
END
$BC$;
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Again, it's important that the steps become to create the sequence, run the function, and then drop the sequence. While access to sequences lives outside MVCC, creation of the sequence itself is not. Here's what the whole thing will look like in psql:&lt;/p&gt;&lt;pre&gt;
&lt;span class="t"&gt;
\t
\o tmp.drop.sql
SELECT 'DROP SEQUENCE IF EXISTS slowfuncseq_' || pg_backend_pid() || ';';
\o tmp.create.sql
SELECT 'CREATE SEQUENCE slowfuncseq_' || pg_backend_pid() || ';';
\o
\t
\i tmp.drop.sql
\i tmp.create.sql
SELECT slowfunc();
\i tmp.drop.sql
&lt;/span&gt;&lt;/pre&gt;&lt;p&gt;Now you can see how far along the function is from any other process. For example, if we kick off the script above, then go into psql from another window, we can use the process id from the pg_stat_activity view to see how far along our function is:&lt;/p&gt;&lt;pre&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; select procpid, current_query from pg_stat_activity;&lt;/span&gt;
&lt;span class="o"&gt; procpid |                    current_query                     
---------+------------------------------------------------------
   10206 | SELECT slowfunc();
   10313 | select procpid, current_query from pg_stat_activity;
&lt;/span&gt;
&lt;span class="c"&gt;$&lt;/span&gt;&lt;span class="t"&gt; select last_value from slowfuncseq_10206;&lt;/span&gt;
&lt;span class="o"&gt; last_value 
------------
          3&lt;/span&gt;
&lt;/pre&gt;&lt;p&gt;You can assign your own values and meanings to the numbers, of course: this one simply tells us that the script is on the third iteration of our sleep loop. You could use multiple sequences to convey even more information.&lt;/p&gt;&lt;p&gt;There are other ways besides sequences to achieve this trick: one that I've used before is to have a plperlu function open a new connection to the existing database and update a text column in a simple tracking table. Another idea is to update a small semaphore table within the function, and check the modification time of the underlying file underneath your data directory.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-6051822758381419496?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/6051822758381419496/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=6051822758381419496' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/6051822758381419496'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/6051822758381419496'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/04/viewing-postgres-function-progress-from.html' title='Viewing Postgres function progress from the outside'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-7055231418105648994</id><published>2010-04-23T16:01:00.005-04:00</published><updated>2010-04-23T17:07:07.357-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>PostgreSQL at LinuxFest Northwest</title><content type='html'>&lt;p&gt;
This is my third year driving up to Bellingham for &lt;a href="http://linuxfestnorthwest.org/"&gt;LinuxFest Northwest&lt;/a&gt;, and I'm excited to be presenting two talks about PostgreSQL there. Adrian Klaver is one of the organizers of the conference, and has always been a huge supporter of PostgreSQL. He has gone out of his way to have a track of content about our favorite database.
&lt;/p&gt;
&lt;p&gt;
I'll be presenting an &lt;a href="http://slideshare.net/selenamarie/bucardo"&gt;introduction to Bucardo&lt;/a&gt; and co-hosting a talk about new features in version 9.0 of PostgreSQL with Gabrielle Roth. 
&lt;/p&gt;
&lt;p&gt;
Talking about &lt;a href="http://bucardo.org/"&gt;Bucardo&lt;/a&gt; and replication is always a blast.  The last time I gave this talk to a packed house in Seattle, so I'm hoping for another lively discussion about the state of replication in PostgreSQL.
&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-7055231418105648994?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/7055231418105648994/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=7055231418105648994' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7055231418105648994'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/7055231418105648994'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/04/postgresql-at-linuxfest-northwest.html' title='PostgreSQL at LinuxFest Northwest'/><author><name>Selena Deckelmann</name><email>noreply@blogger.com</email></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-6961378745123470580</id><published>2010-04-20T14:36:00.001-04:00</published><updated>2010-04-20T15:57:01.857-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>Restoring individual table data from a Postgres dump</title><content type='html'>&lt;style&gt;span { font-family: sans-serif; } span.c { color: red; } span.t { font-family: sans-serif; color: green; } span.o { font-family: sans-serif; color: blue; } &lt;/style&gt;&lt;p&gt;Recently, one of our clients needed to restore the data in a specific table from the previous night's PostgreSQL dump file. Basically, there was a UPDATE query that did not do what it was supposed to, and some of the columns in the table were irreversibly changed. So, the challenge was to quickly restore the contents of that table.&lt;/p&gt;&lt;p&gt;The SQL dump file was generated by the &lt;a href="http://www.postgresql.org/docs/current/static/app-pg-dumpall.html"&gt;pg_dumpall&lt;/a&gt; command, and thus there was no easy way to extract individual tables. If you are using the &lt;a href="http://www.postgresql.org/docs/current/static/app-pgdump.html"&gt;pg_dump&lt;/a&gt; command, you can specify a "custom" dump format by adding the &lt;strong&gt;-Fc&lt;/strong&gt; option. Then, pulling out the data from a single table becomes as simple as adding a few flags to the &lt;a href="http://www.postgresql.org/docs/current/static/app-pgrestore.html"&gt;pg_restore&lt;/a&gt; command like so:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;pg_restore --data-only --table=alpha large.custom.dumpfile.pg &amp;gt; alpha.data.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;One of the drawbacks of using the custom format is that it is only available on a per-database basis; you cannot use it with pg_dumpall. That was the case here, so we needed to extract the data of that one table from within the large dump file. If you know me well, you might suspect at this point that I've written yet another handy perl script to tackle the problem. As tempting as that may have been, time was of the essence, and the wonderful array of Unix command line tools already provided me with everything I needed.&lt;/p&gt;&lt;p&gt;Our goal at this point was to pull the data from a single table ("alpha") from a very large dump file ("large.dumpfile.pg") into a separate and smaller file that we could use to import directly into the database.&lt;/p&gt;&lt;p&gt;The first step was to find exactly where in the file the data was. We knew the name of the table, and we also know that a dump file inserts data by using the &lt;a href="http://www.postgresql.org/docs/current/static/sql-copy.html"&gt;COPY&lt;/a&gt; command, so there should be a line like this in the dump file:&lt;strong&gt;&lt;pre&gt;COPY alpha (a,b,c,d) FROM stdin;&lt;/pre&gt;&lt;/strong&gt;&lt;/p&gt;&lt;p&gt;Because all the COPYs are done together, we can be pretty sure that the command after "COPY alpha" is another copy. So the first thing to try is:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -n COPY large.dumpfile.pg | grep -A1 'COPY alpha '&lt;/span&gt;&lt;/p&gt;&lt;p&gt;This uses grep's handy &lt;strong&gt;-n&lt;/strong&gt; option (aka &lt;strong&gt;--line-number&lt;/strong&gt;) to output the line number that each match appears on. Then we pipe that back to grep, search for our table name, and print the line after it with the &lt;strong&gt;-A&lt;/strong&gt; option (aka &lt;strong&gt;--after-context&lt;/strong&gt;). The output looked like this:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -n COPY large.dumpfile.pg | grep -A1 'COPY alpha '&lt;/span&gt;&lt;br /&gt;&lt;span class="o"&gt;1233889:COPY alpha (cdate, who, state, add, remove) FROM stdin;&lt;br /&gt;12182851:COPY alpha_sequence (sname, value) FROM stdin;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Note that many of the options here are GNU specific. If you are using an operating system that doesn't support the common GNU tools, you are going to have a much harder time doing this (and many other shell tasks)!&lt;/p&gt;&lt;p&gt;We now have a pretty good guess at the starting and ending lines for our data: 1233889 to lines 12182850 (we subtract 1 as we don't want the next COPY). We can now use &lt;strong&gt;head&lt;/strong&gt; and &lt;strong&gt;tail&lt;/strong&gt; to extract the lines we want, once we figure out how many lines our data spans:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;echo 12182851 - 1233889 | bc&lt;/span&gt;&lt;br /&gt;&lt;span class="o"&gt;10948962&lt;/span&gt;&lt;br /&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;head -1233889 large.dumpfile.pg | tail -10948962 &gt; alpha.data.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;However, what if the next command was not a COPY? We'll have to scan forward for the end of the COPY section, which is always a backslash and a single dot at the start of a new line. The new command becomes (all one line, but broken down for readability):&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -n COPY large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | grep -m1 'COPY alpha' \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | cut -d: -f1 \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | xargs -Ix tail --lines=+x large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | grep -n -m1 '^\\\.'&lt;/span&gt;&lt;/p&gt;&lt;p&gt;That's a lot, but in the spirit of Unix tools doing one thing and one thing well, it's easy to break down. First, we grab the line numbers where COPY occurs in our file, then we find the first occurrence of our table (using the &lt;strong&gt;-m&lt;/strong&gt; aka &lt;strong&gt;--max-count&lt;/strong&gt; option). We cut out the first field from that output, using a colon as the delimiter. This gives is the line number where the COPY begins. We pass this to xargs, and tail the file with a &lt;strong&gt;--lines=+x&lt;/strong&gt; argument, which outputs all lines from that file *starting* at the given line number. Finally, we pipe that output to grep and look for the end of copy indicator, stopping at the first one, and also outputting the line number. Here's what we get:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -n COPY large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | grep -m1 'COPY alpha' \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | cut -d: -f1 \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | xargs -Ix tail --lines=+x large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | grep -n -m1 '^\\\.'&lt;/span&gt;&lt;br /&gt;&lt;span class="o"&gt;148956:\.&lt;br /&gt;xargs: tail: terminated by signal 13&lt;/span&gt;&lt;/p&gt;&lt;p&gt;This tells us that 148956 lines after the COPY, we encountered the string "\.". (The complaint from xargs can be ignored). Now we can create our data file:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -n COPY large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | grep -m1 'COPY alpha' \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | cut -d: -f1 \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | xargs -Ix tail --lines=+x large.dumpfile.pg \&lt;br /&gt; &amp;nbsp; &amp;nbsp; | head -148956 &gt; alpha.data.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Now that the file is there, we should do a quick sanity check on it. If the file is small enough, we could simply call it up in your favorite editor or run it through &lt;strong&gt;less&lt;/strong&gt; or &lt;strong&gt;more&lt;/strong&gt;. You can also check things out by knowing that a Postgres dump file separates the data in columns by a tab character when using the COPY command. So we can view all lines that don't have a tab, and make sure there is nothing except comments and the &lt;strong&gt;COPY&lt;/strong&gt; and &lt;strong&gt;\.&lt;/strong&gt; lines:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;grep -v -P '\t' alpha.data.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;The grep option &lt;strong&gt;-P&lt;/strong&gt; (aka &lt;strong&gt;--perl-regexp&lt;/strong&gt;) instructs grep to interpret the argument ("backslash t" in this case) as a Perl regular expression. You could also simply input a literal tab there: on most systems this can be done with the &lt;strong&gt;&amp;lt;ctrl-v&amp;gt;&amp;lt;TAB&amp;gt;&lt;/strong&gt; key combination.&lt;/p&gt;&lt;p&gt;It's time to replace that bad data. We'll need to truncate the existing table, then COPY our data back in. To do this, we'll create a file that we'll feed to &lt;strong&gt;psql -X -f&lt;/strong&gt;. Here's the top of the file:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;cat &amp;gt; alpha.restore.pg&lt;br /&gt;&lt;br /&gt;\set ON_ERROR_STOP on&lt;br /&gt;\timing&lt;br /&gt;&lt;br /&gt;\c mydatabase someuser&lt;br /&gt;&lt;br /&gt;BEGIN;&lt;br /&gt;&lt;br /&gt;CREATE SCHEMA backup;&lt;br /&gt;&lt;br /&gt;CREATE TABLE backup.alpha AS SELECT * FROM public.alpha;&lt;br /&gt;&lt;br /&gt;TRUNCATE TABLE alpha;&lt;br /&gt;&lt;/span&gt;&lt;/p&gt;&lt;p&gt;From the top: we tell psql to stop right away if it encounters any problems, and then turn on the timing of all queries. We explicitly connect to the correct database as the correct user. Putting it here in the script is a safety feature. Then we start a new transaction, create a backup schema, and make a copy of the existing data into a backup table before truncating the original table. The next step is to add in the data, then wrap things up:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;cat alpha.data.pg &amp;gt;&amp;gt; alpha.restore.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;Now we run it and check for any errors. We use the &lt;strong&gt;-X&lt;/strong&gt; argument to ensure control of exactly which psql options are in effect, bypassing any psqlrc files that may be in use.&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;psql -X -f alpha.restore.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;If everything looks good, the final step is to add a COMMIT and run the file again:&lt;/p&gt;&lt;p&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;echo "COMMIT;" &amp;gt;&amp;gt; alpha.restore.pg&lt;/span&gt;&lt;br /&gt;&lt;span class="c"&gt;$&lt;/span&gt; &lt;span class="t"&gt;psql -X -f alpha.restore.pg&lt;/span&gt;&lt;/p&gt;&lt;p&gt;And we are done! All of this is a little simplified, as in real life there was actually more than one table to be restored, and each had some foreign key dependencies that had to be worked around, but the basic idea remains the same. (and yes, I know you can do the extraction in a Perl one-liner)&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-6961378745123470580?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/6961378745123470580/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=6961378745123470580' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/6961378745123470580'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/6961378745123470580'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/04/restoring-individual-table-data-from.html' title='Restoring individual table data from a Postgres dump'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-2366118113661591952</id><published>2010-03-29T16:45:00.001-04:00</published><updated>2010-03-29T17:56:57.916-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='community'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><category scheme='http://www.blogger.com/atom/ns#' term='database'/><title type='text'>PostgreSQL Conference East 2010 review</title><content type='html'>&lt;p&gt;I just returned from the &lt;a href="http://www.postgresqlconference.org/"&gt;PostgreSQL Conference East 2010&lt;/a&gt;. This is one of the US "regional" Postgres conferences, which  usually occur once a year on both the East and West coast. This is the second year the East conference has taken place in my home town of Philadelphia.&lt;/p&gt;&lt;p&gt;Overall, it was a great conference. In addition to the talks, of course, there are many other important benefits to such a conference, such as the "hallway tracks", seeing old friends and clients, meeting new ones, and getting to argue about default postgresql.conf settings over lunch. I gave a 90 minute talk on "Postgres for non-Postgres people" and a lightning talk on the indispensable &lt;a href="http://bucardo.org/wiki/Tail_n_mail"&gt;tail_n_mail.pl program&lt;/a&gt;.&lt;/p&gt;&lt;p&gt;This year saw the conference take place at a hotel for the first time, and this was a big improvement over the previous school campus-based conferences. Everything was in one building, there was plenty of space to hang out and chat between the talks, and everything just felt a little bit easier. The one drawback was that the rooms were not really designed to lecture to large numbers of people (e.g. no stadium seating), but this was not too much of an issue for most of the talks.&lt;/p&gt;&lt;p&gt;A few of the talks I attended included:&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Mine! Luckily, my talk was in the very first slot, so I was able to give it and then be done talking for the rest of the conference (with the exception of the lightning talk). My talk was "PostgreSQL for MySQL (and other database people)". A quick show of hands showed that in addition to a good number of MySQL people, we had people coming from Oracle, Microsoft SQL Server, and even Informix. I walked through the steps to take when upgrading your application from using some other database to using Postgres, pointing out some of the pain points and particular Postgres gotchas, focusing on the SQL syntax. The second half of the talk focused on the Postgres project itself, explaining how it all worked, what the "community" and "core" consists of, how companies are involved, how development is done, and the philosophy of the project.&lt;/li&gt;
&lt;li&gt;"PostgreSQL at myYearbook.com" by Gavin M. Roy. I've heard earlier versions  of this talk before, but it was neat to see how much myyearbook.com had grown in just one year and some of the new challenges they faced. Of course, Gavin is still upset about the primary key situation and they are still doing unique indexes instead of PKs so they can do in-place reindexing for bloat removal.&lt;/li&gt;
&lt;li&gt;Baron Schwartz spoke about "Query Analysis with mk-query-digest". The "mk" is short for &lt;a href="http://www.maatkit.org/"&gt;maatkit&lt;/a&gt;, a nice suite of tools for doing all sorts of database-related things. Granted, it's very MySQL focused at the moment, but Baron has started to port things over to Postgres, and the demo he gave was pretty impressive. I'll definitely be downloading that code and taking a look.&lt;/li&gt;
&lt;li&gt;Magnus Hagander gave a talk on "Secure PostgreSQL Deployment" which was a lot more interesting than I thought it would be (I knew it had Windows slides). My take-home lessons: never use the ssl mode of "prefer", and always check your Debian systems as they like to switch SSL on everything for no good reason. It's also quite fascinating to see the number of ways you can authenticate to a Postgres database.&lt;/li&gt;
&lt;li&gt;I attended a talk on "Inside the PostgreSQL Infrastructure" by Dave Page. A lot of it I already knew, as I'm a little involved in said infrastructure, but it was good to hear some of the future plans, including standardizing on Debian instead of FreeBSD in the future.&lt;/li&gt;
&lt;li&gt;Spencer Christensen's talk on "PostgreSQL Administration for System Administrators" was very well done but mostly review for me :). It was nice to see a shout out in his talk (and some others) for &lt;a href="http://bucardo.org/check_postgres/check_postgres.pl.html"&gt;check_postgres.pl&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Robert Haas gave a good talk on "The PostgreSQL Query Planner" that seemed  to be very well received. The bit about the join removal tech was particularly interesting: the Postgres planner does some really, really clever things when trying to build the best possible plan for your query.&lt;/li&gt;&lt;/ul&gt;
&lt;p&gt;At the lunch on Saturday, Josh Drake asked if anyone else wanted to do a lightning  talk, so I made a quick outline on the back of a nearby piece of paper and gave a no-slides, no-notes five minute talk on tail_n_mail.pl. It went pretty well, and I even had 30 seconds left over at the end for questions. To clarify my answer to one of those further now: tail_n_mail.pl can parse CSV logs (indeed, any text file), but it cannot consolidate similar entries yet or any of the other neat things it does until we can teach TNM about how to parse the CSV logs properly.&lt;/p&gt;
&lt;p&gt;An excellent conference overall, but I'd be amiss if I didn't offer a little 
constructive criticism for the next time (and other conferences):&lt;/p&gt;
&lt;ul&gt;&lt;li&gt;Scheduling. The rooms were sometimes hard to find, and the schedule did not list the room next to the talk. That color-coded thing just does not work. In addition, it seemed like similar talks were sometimes stacked up against each other rather than staggered. Thus, you could learn about londiste OR rubyrep, but not both. Similarly, there were two Python talks up against each other.&lt;/li&gt;
&lt;li&gt;Lightning talks. Always, always put the lightning talks at the *start* of the conference, not the end. Lightning talks are a great way to learn about what other people are doing. By having it at the start of the conference, you have the entire rest of the time to followup with people about their talks and foster more real-life discussions.&lt;/li&gt;
&lt;li&gt;Lightning talks. Okay, not done talking about these yet. Lightning talks are somewhat notorious for spending lots of time getting the video to work right, as people switch computers, fiddle with plugs, etc. If you can't get it setup in 30 seconds, start the clock! You should be able to give your lightning talk without slides, if need be.&lt;/li&gt;&lt;/ul&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-2366118113661591952?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/2366118113661591952/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=2366118113661591952' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2366118113661591952'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/2366118113661591952'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/03/postgresql-conference-east-2010-review.html' title='PostgreSQL Conference East 2010 review'/><author><name>Greg Sabino Mullane</name><uri>http://www.blogger.com/profile/01284672712225991979</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='08820056573005263883'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-4982923677561110337</id><published>2010-03-23T15:30:00.007-04:00</published><updated>2010-03-23T18:44:04.336-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='open-source'/><category scheme='http://www.blogger.com/atom/ns#' term='Conference'/><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>LibrePlanet 2010: Eben Moglen and the future of Oracle in free software</title><content type='html'>&lt;p&gt;
I just got back from &lt;a href="http://libreplanet.org"&gt;Libre Planet 2010&lt;/a&gt;, a conference for free software activists put on by the Free Software Foundation. I imagine most readers of this blog are familiar with the language debate over free software vs. open source. Much of the business and software community has settled into using open source as the term of choice, but &lt;a href="http://libreplanet.org"&gt;Libre Planet&lt;/a&gt; is certainly a place where saying "free software" is the norm.
&lt;/p&gt;
&lt;p&gt;
I presented two talks - one on &lt;a href="http://www.chesnok.com/daily/2010/03/20/giving-good-talks-presentation/"&gt;how to give good talks&lt;/a&gt; by connecting with your audience, and a second about non-coding roles in free software communities.  The first talk is built on &lt;a href="http://www.slideshare.net/selenamarie/leading-without-being-in-charge-1009616"&gt;my work with user groups&lt;/a&gt; and giving presentations at primarily free software conferences over the last five years. The second was built off of the &lt;a href="http://www.pgexperts.com/document.html?id=5"&gt;great work of Josh Berkus&lt;/a&gt;, for a talk that he first gave at a mini-conference I arranged the day before OSCON 2007 for Postgres. 
&lt;/p&gt;
&lt;p&gt;

One talk I attended surprised me with an important discussion of the future of the open source database market. 
&lt;/p&gt;
&lt;p&gt;

Eben Moglen spoke about the future of the &lt;a href="http://fsf.org"&gt;Free Software Foundation&lt;/a&gt; and the new challenges that software freedom faces in a world increasingly dominated by network services - social networking, collaboration tools and other software where ownership of data is largely shared, and no single person or entity can be legitimately claimed to be sole owner of the data or structure that emerges. 
&lt;/p&gt;
&lt;p&gt;

Eben Moglen said, "We are at a point of inflection in our long campaign." He talked at length about the work the &lt;a href="http://www.softwarefreedom.org/"&gt;Software Freedom Law Center&lt;/a&gt; has done, collaborating with organizations whose goals were not necessarily software freedom, nor directly aligned with the FSF. He specifically brought up patent pools, and work that the SFLC has done to bring non-free companies in the fight against abusive patents.
&lt;/p&gt;
&lt;p&gt;

Eben then turned his attention to the issue of the Oracle/Sun acquisition. He commented we haven't really looked to Oracle for pro-software freedom activity in the past. And then that "&lt;a href="http://twitter.com/selenamarie/status/10783785419"&gt;every technically competent 15-year old in the world uses MySQL.&lt;/a&gt;" While this isn't music to the ears of Postgres users and developers, with applications like Wordpress, I'd say that Eben isn't too far off.  
&lt;/p&gt;
&lt;p&gt;

What was interesting to me was Eben's conjecture that &lt;a href="http://identi.ca/notice/25447908"&gt;MySQL is now essentially a tool that's now being sharpened&lt;/a&gt; to stab deeply into the heart of Microsoft's SQL Server market. He pointed out that Oracle has about 375,000 customers, and claimed that there's no where you can learn Oracle for free (to which several people have pointed out -- you can download crippled versions of Oracle for free to learn basics.. but I claim that's not the same thing as being able to download and install full server versions of something like MySQL or PostgreSQL). 
&lt;/p&gt;
&lt;p&gt;

Regardless of the details, this play by Oracle would be an interesting use of open source software to disrupt a market.
&lt;/p&gt;
&lt;p&gt;

I suggest to the Postgres community that SQL Server to Postgres migrations are a real business opportunity our consultants, and an area in which we as a community should pursue documenting and assisting with transitions as much as possible.

&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-4982923677561110337?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/4982923677561110337/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=4982923677561110337' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4982923677561110337'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/4982923677561110337'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/03/libreplanet-2010-eben-moglan-and-future.html' title='LibrePlanet 2010: Eben Moglen and the future of Oracle in free software'/><author><name>Selena Deckelmann</name><email>noreply@blogger.com</email></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7997313029981170997.post-3745564070521461210</id><published>2010-03-22T16:14:00.007-04:00</published><updated>2010-03-24T22:46:21.210-04:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='postgres'/><title type='text'>Using psql \o to append to a file</title><content type='html'>&lt;p&gt;I had a slow query I was working on recently, and wanted to capture the output of EXPLAIN ANALYZE to a file. This is easy, with psql's \o command:&lt;/p&gt;
&lt;pre&gt;
5432 josh@josh# \o explain-results
&lt;/pre&gt;
&lt;p&gt;Once EXPLAIN ANALYZE had finished running, I wanted the psql output back in my psql console window. This, too, is easy, using the \o command without a filename:&lt;/p&gt;
&lt;pre&gt;
5432 josh@josh# \o
&lt;/pre&gt;
&lt;p&gt;But later, after adding an index or two and changing some settings, I wanted to run a new EXPLAIN ANALYZE, and I wanted its output appended to the explain-analyze file I built earlier. At least on my system, \o will normally overwrite the target file, which would mean I'd lose my original results. I realize it's simple to, say, pipe output to a new file ("explain-analyze-2"), but I wasn't interested. Instead, because \o can also accept a pipe character and a shell command to pipe its output to, I did this:&lt;/p&gt;
&lt;pre&gt;
5432 josh@josh# \o | cat - &gt;&gt; explain-results
&lt;/pre&gt;
&lt;p&gt;Life is good.&lt;/p&gt;
&lt;p&gt;Update: A helpful commenter pointed out I hadn't actually used the same files in the original post. Oops. Fixed.&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/7997313029981170997-3745564070521461210?l=blog.endpoint.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://blog.endpoint.com/feeds/3745564070521461210/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='https://www.blogger.com/comment.g?blogID=7997313029981170997&amp;postID=3745564070521461210' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3745564070521461210'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7997313029981170997/posts/default/3745564070521461210'/><link rel='alternate' type='text/html' href='http://blog.endpoint.com/2010/03/using-o-to-append-to-file.html' title='Using psql \o to append to a file'/><author><name>Joshua Tolley</name><uri>http://www.blogger.com/profile/08481531515300677240</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd='http://schemas.google.com/g/2005' name='OpenSocialUserId' value='13545358980860046445'/></author><thr:total>4</thr:total></entry></feed>