Friday, August 12 2011
PostgreSQL, pgloader

pgloader constant cols

The previous articles in the pgloader series detailed How To Use PgLoader then How to Setup pgloader, then what to expect from a parallel pgloader setup, and then pgloader reformating. Another need you might encounter when you get to use pgloader is adding constant values into a table's column.

The basic situation where you need to do so is adding an origin field to your table. The value of that is not to be found in the data file itself, typically, but known in the pgloader setup. That could even be the filename you are importing data from.

In pgloader that's called a user defined column. Here's what the relevant examples/pgloader.conf setup looks like:

[udc]
table           = udc
format          = text
filename        = udc/udc.data
input_encoding  = 'latin1'
field_sep       = %
columns         = b:2, d:1, x:3, y:4
udc_c           = constant value
copy_columns    = b, c, d

And the data file is:

1%5%foo%bar
2%10%bar%toto
3%4%toto%titi
4%18%titi%baz
5%2%baz%foo

And here's what the loaded table looks like:

pgloader/examples$ pgloader -Tsc pgloader.conf udc
Table name        |    duration |    size |  copy rows |     errors 
====================================================================
udc               |      0.201s |       - |          5 |          0

pgloader/examples$ psql --cluster 8.4/main pgloader -c "table udc"
 b  |       c        | d 
----+----------------+---
  5 | constant value | 1
 10 | constant value | 2
  4 | constant value | 3
 18 | constant value | 4
  2 | constant value | 5
(5 rows)

Of course the configuration is not so straightforward as to process fields in the data file in the order that they appear, after all the examples/pgloader.conf are also a test suite.

Long story short: if you need to add some constant values into the target table you're loading data to, pgloader will help you there!