Welcome to End Point’s blog

Ongoing observations by End Point people

Using PostgreSQL Hooks

PostgreSQL is well known for its extensibility; users can build new functions, operators, data types, and procedural languages, among others, without having to modify the core PostgreSQL code. Less well known is PostgreSQL's extensive set of "hooks", available to the more persistent coder. These hooks allow users to interrupt and modify behavior in all kinds of places without having to rebuild PostgreSQL.

Few if any of these hooks appear in the documentation, mostly because the code documents them quite well, and anyone wanting to use them is assumed already to be sufficiently familiar with the code to find the information they'd need to use one. For those interested in getting started using hooks, though, an example can be useful. Fortunately, the contrib source provides one, in the form of passwordcheck, a simple contrib module that checks users' passwords for sufficient strength. These checks include having a length greater than 8 characters, being distinct from the username, and containing both alphabetic and non-alphabetic characters. It can also use CrackLib for more intense password testing, if built against the CrackLib code.

In general, these hooks consist of global function pointers of a specific type, which are initially set to NULL. Whenever PostgreSQL wants actually to use a hook, it checks the function pointer, and if it's not NULL, calls the function it points to. When someone implements a hook, they write a function of the proper type and an initialization function to set the function pointer variable. They then package the functions in a library, and tell PostgreSQL to load the result, often using shared_preload_libraries.

For our example, the important pieces of the PostgreSQL code are in src/backend/commands/user.c and src/include/commands/user.h. First, we need a function pointer type, which in this case is called check_password_hook_type:

typedef void (*check_password_hook_type)
   (const char *username, const char *password,
   int password_type, Datum validuntil_time,
   bool validuntil_null);

extern PGDLLIMPORT check_password_hook_type check_password_hook;

This says the check_password_hook will take arguments for user name, password, password type, and validity information (for passwords valid until certain dates). It also provides an extern declaration of the actual function pointer, called "check_password_hook".

The next important pieces of code are in src/backend/commands/user.c, as follows:

/* Hook to check passwords in CreateRole() and AlterRole() */
check_password_hook_type check_password_hook = NULL;

...which defines the function hook variable, and this:

if (check_password_hook && password)
  (*check_password_hook) (stmt->role, password,

...which actually uses the hook. Actually the hook is used twice, with identical code, once in CreateRole() and once in AlterRole(), so as to provide password checking in both places. (Insert D.R.Y. rant here).

In order to take advantage of this hook, the passwordcheck module needs to implement the hook function, and set the check_password_hook variable to point to that function. First, passwordcheck.c needs to include a few things, including "commands/user.h" to ge the definitions of check_password_hook and check_password_hook_type, and call the PG_MODULE_MAGIC macro every PostgreSQL shared library needs. Then, it implements the password checking logic in a function called check_password():

static void
check_password(const char *username,
      const char *password,
      int password_type,
      Datum validuntil_time,
      bool validuntil_null)
/* Actual password checking logic goes here */

Note that this declaration matches the arguments described in the check_password_hook_type, above.

Now to ensure the check_password_hook variable points to this new check_password() function. When loading a shared library, PostgreSQL looks for a function defined in that library called _PG_init(), and runs it if it exists. In passwordcheck, the _PG_init() function is as simple as this:

 /* activate password checks when the module is loaded */
 check_password_hook = check_password;

Other modules using hooks often check the hook variable for NULL before setting it, in case something else is already using the hook. For instance, the auto_explain contrib module does this in _PG_init() (note that auto_explain uses three different hooks):

prev_ExecutorStart = ExecutorStart_hook;
 ExecutorStart_hook = explain_ExecutorStart;
 prev_ExecutorRun = ExecutorRun_hook;
 ExecutorRun_hook = explain_ExecutorRun;
 prev_ExecutorEnd = ExecutorEnd_hook;
 ExecutorEnd_hook = explain_ExecutorEnd;

auto_explain also resets the hook variables in its _PG_fini() function. Since unloading modules isn't yet supported and thus, _PG_fini() never gets called, this is perhaps unimportant, but is good for the sake of being thorough.

Back to passwordcheck. Having set the hook variable, all that remains is to get PostgreSQL to load this library. The easiest way to do that is to set shared_preload_libraries in postgresql.conf:

josh@eddie:~/devel/pgsrc/pg-eggyknap/contrib/passwordcheck$ psql
psql (9.0devel)
Type "help" for help.

5432 josh@josh# show shared_preload_libraries ;
(1 row)

Restarting PostgreSQL loads the library, proven as follows:

5432 josh@josh# create user badpass with password 'bad';
ERROR:  password is too short

There are hooks like this all over the PostgreSQL code base. Simply search for "_hook_type", to find such possibilities as these:

shmem_startup_hookCalled when PostgreSQL initializes its shared memory segment
explain_get_index_name_hookCalled when explain finds indexes' names.
planner_hookRuns when the planner begins, so plugins can monitor or even modify the planner's behavior
get_relation_info_hookAllows modification of expansion of the information PostgreSQL gets from the catalogs for a particular relation, including adding fake indexes

Learn more about End Point's Postgres Support, Development, and Consulting.


David Christensen said...

Hi Josh,

I assume these hooks are only available when writing C functions, or are there exposures in any of the other PLs?


Joshua Tolley said...

So far, the only way you can use them is through C functions. It might be possible to write wrappers so other PLs could use them; that could be an interesting project, though I'm not immediately sure how useful it would be. One hard part would be making the arguments available to the PL. Converting strings or ints to something a PL understands is pretty easy, but converting a PlannerInfo structure, for instance, without knowing lots about how the PL function plans to use the values involved could be tough.

rohit said...

Is it possible to use hooks to read changes made to tuples after an UPDATE, INSERT or DELETE? If so could you give me some information about how to do it. I would really appreciate that a lot.