I’ve been asked about my opinion on backup strategy and best practices, and it so happens that I have some kind of an opinion on the matter.
I tend to think best practice here begins with defining properly the backup plan you want to implement. It’s quite a complex matter, so be sure to ask yourself about your needs: what do you want to be protected from?
The two main things to want to protect from are hardware loss (crash disaster, plane in the data center, fire, water flood, etc) and human error ( UPDATE without a where clause).
On the PostgreSQL Hackers mailing lists, Andrew Dunstan just proposed some new options for pg_dump and pg_restore to ease our lives. One of the answers was talking about some scripts available to exploit the pg_restore listing that you play with using options -l and -L, or the long name versions --list and --use-list. The pg_staging tool allows you to easily exploit those lists too.
The pg_restore list is just a listing of one object per line of all objects contained into a custom dump, that is one made with pg_dump -Fc.
Dans la série de nos articles sur pgloader, l’article du jour décrit comment insérer des valeurs constantes (absentes du fichier de données) pendant le chargement. Cela permet par exemple d’ajouter un champ « origine », qui dépend typiquement de la chaîne de production des données et se retrouve souvent dans le nom du fichier de données lui-même.