On the performance mailing list, a recent thread drew my attention. It
devired to be about using a connection pool software and prepared statements
in order to increase scalability of PostgreSQL when confronted to a lot of
concurrent clients all doing simple select queries. The advantage of the
pooler is to reduce the number of backends needed to serve the queries, thus
reducing PostgreSQL internal bookkeeping. Of course, my choice of software
here is clear: PgBouncer is an excellent top grade solution, performs real
well (it won't parse queries), reliable, flexible.
The problem is that while conbining pgbouncer and prepared statements is
possible, it requires the application to check at connection time if the
statements it's interrested in are already prepared. This can be done by a
simple catalog query of this kind:
SELECT name
FROM pg_prepared_statements
WHERE name IN ('my', 'prepared', 'statements');
Well, this is simple but requires to add some application logic. What would
be great would be to only have to EXECUTE my_statement(x, y, z) and never
bother if the backend connection is a fresh new one or an existing one, as
to avoid having to check if the application should prepare.
The preprepare pgfoundry project is all about this: it comes with a
prepare_all() function which will take all statements present in a given
table (SET preprepare.relation TO 'schema.the_table';) and prepare them for
you. If you now tell pgbouncer to please call the function at backend
creation time, you're done (see connect_query).
There's even a detailed README file, but no release yet (check out the code
in the CVS, pgfoundry project page has clear instruction about how to do so.
