On the performance mailing list, a recent thread drew my attention. It devired to be about using a connection pool software and prepared statements in order to increase scalability of PostgreSQL when confronted to a lot of concurrent clients all doing simple select queries. The advantage of the pooler is to reduce the number of backends needed to serve the queries, thus reducing PostgreSQL internal bookkeeping. Of course, my choice of software here is clear: PgBouncer is an excellent top grade solution, performs real well (it won’t parse queries), reliable, flexible.

The problem is that while conbining pgbouncer and prepared statements is possible, it requires the application to check at connection time if the statements it’s interrested in are already prepared. This can be done by a simple catalog query of this kind:

SELECT name
    FROM pg_prepared_statements 
   WHERE name IN ('my', 'prepared', 'statements');

Well, this is simple but requires to add some application logic. What would be great would be to only have to EXECUTE my_statement(x, y, z) and never bother if the backend connection is a fresh new one or an existing one, as to avoid having to check if the application should prepare.

The preprepare pgfoundry project is all about this: it comes with a prepare_all() function which will take all statements present in a given table ( SET preprepare.relation TO 'schema.the_table';) and prepare them for you. If you now tell pgbouncer to please call the function at backend creation time, you’re done (see connect_query).

There’s even a detailed README file, but no release yet (check out the code in the CVS, pgfoundry project page has clear instruction about how to do so.