pgcopydb v0.18
Hot off the press: pgcopydb v0.18 is out!
It’s the biggest release the project has had — 88 commits since v0.17, which shipped in August 2024. I took a break from my Open Source responsibilities for a while, because I was lacking employer support to make it happen.
What is pgcopydb
pgcopydb copies a PostgreSQL database to another PostgreSQL server, as fast as possible when physical file copy isn’t available. It parallelises the COPY across all tables simultaneously, builds indexes in parallel after data is loaded, and supports Change Data Capture via logical replication for minimal-downtime migrations. It is designed to be restartable: state is tracked in a local SQLite catalog so an interrupted run can resume where it left off.
Headline Features of pgcopydb v0.18
v0.18 brings compatibility with PostgreSQL 16, 17, and 18; a pgoutput-default CDC engine with significant reliability and performance improvements; regular-expression-based filtering; Citus-to-Citus migration support; and 24 bug fixes.
PostgreSQL 17 and 18
pgcopydb now fully builds and is tested against PostgreSQL 16, 17, and 18. This
includes CI coverage for PostgreSQL 18’s breaking changes (build system, large
object API, pg_dump metacommand syntax). The pg_dump \restrict / \unrestrict
metacommand fix (#947) also covers CVE-2025-8714.
pgoutput is now the default decoding plugin
pgcopydb previously defaulted to test_decoding.
Now, pgoutput is the default logical decoding plugin. This matters for two
reasons: pgoutput is the plugin used by standard PostgreSQL logical
replication, which means its output is more structured and more widely
tested; and it is the plugin required for filtered publications.
Filter-aware publications: when a filter INI file is active, pgcopydb now
creates a PostgreSQL publication that matches the filter set, rather than
publishing all tables. This limits WAL decode overhead on the source to exactly
the tables being migrated. The same filter is passed to wal2json when that
plugin is used instead.
CDC engine rewrite: SQLite instead of JSON
The CDC streaming engine has been rewritten to use SQLite as its storage format.
Previously, WAL changes captured from logical decoding were written as JSON files;
they are now stored in two structured SQLite databases — output.db (decoded
changes) and replay.db (applied changes). This brings several concrete
improvements:
- Restart safety: because SQLite is transactional, a process crash during streaming no longer risks partially-written change records. Processing resumes from a clean state.
- Size-based file rotation: when an output database grows past a configurable threshold, a new database file is opened automatically. This bounds unbounded growth in long-running CDC sessions.
- Stream cleanup: a new
pgcopydb stream cleanupcommand reclaims disk space by removing processed CDC database files, and the cleanup happens automatically every five minutes in the background when the follow (sub-) command is running. - Performance: catalog inserts into the SQLite internal catalog are now batched, and index creation is deferred. Attribute lookups in the transform hot path are cached.
Clone an entire Postgres instance with –all-databases
A new --all-databases flag clones every database in a PostgreSQL instance in a
single command:
pgcopydb clone --all-databases \
--source "postgres://user@source/" \
--target "postgres://user@target/"
This feature is not just about convenience, it is also about performance
characteristics and run-time resource sharing. When using the
--all-databases option pgcopydb fetches the catalog metadata for all the
databases found in the source database, each in their own SQLite catalog,
and then merges them to find the optimal ordering across all databases,
multiplexing COPY and CREATE INDEX operations to ensure lower migration time
overall.
Citus-to-Citus migration
pgcopydb now supports cloning a Citus cluster to another Citus cluster. This covers the Citus-specific metadata: distribution columns, colocation groups, reference tables, and the shard placement map. Documentation for TimescaleDB extension-aware migration was also added in this release.
Richer filtering
v0.18 extends the filter INI file in two directions:
Regex patterns — table names in filter sections now accept ~/pattern/
syntax for regular expression matching:
[exclude-table-data]
~/^log_/
~/^audit_/
Extension filtering — two new sections control which extensions are copied:
[include-only-extension]
timescaledb
postgis
[exclude-extension]
pg_stat_statements
This new filtering with regular expressions was also the opportunity to implement a new approach: rather than creating TEMP tables on the source database to use SQL anti-JOIN operators, we now use text, regexp, and OID arrays as SQL query parameters.
This means pgcopydb is fully compatible with read-only source servers now, including standby servers, and even when using filtering.
Bug fixes
24 bug fixes shipped in v0.18. A selection of the more significant ones:
- SQL_ASCII source encoding:
client_encodingis now forced toSQL_ASCIIwhen the source database uses that encoding, preventing garbled data in databases that deliberately use non-UTF-8 byte sequences. - LOCK TABLE failure: a
SAVEPOINTnow wraps theLOCK TABLEcheck in worker transactions, so a lock failure no longer aborts the whole COPY transaction. - Binary-incompatible columns: pgcopydb detects when a column type is not binary-compatible between source and target and falls back to text COPY automatically, rather than failing.
- GENERATED ALWAYS AS IDENTITY in CDC: identity columns were incorrectly
included in CDC
UPDATE SETclauses, causing replay errors. They are now skipped. - Materialized views in CDC:
REFRESH MATERIALIZED VIEWoperations are now handled correctly during CDC apply. - Double precision in CDC: floating-point values are now preserved at full precision through the CDC pipeline rather than being rounded in string serialization.
The full list is in the CHANGELOG.
By the numbers
| pgcopydb v0.18 | |
|---|---|
| Commits since v0.17 | 88 |
| Development period | Aug 2024 → Jun 2026 (23 months) |
| New features | 13 |
| Changed / improved | 8 |
| Bug fixes | 24+ |
| CVEs fixed | CVE-2025-8714 |
| PostgreSQL versions tested in CI | 16, 17, 18 |
Getting pgcopydb v0.18
docker pull dimitri/pgcopydb:v0.18
Packages for Debian, Ubuntu, and RPM-based distributions are available at the pgcopydb releases page.
Supporting the project
pgcopydb is maintained as an open-source project. Development is funded through the professional support program at oss.theartofpostgresql.com.
If your organization runs pgcopydb for production migrations, a subscription helps ensure the project continues to receive prompt attention when PostgreSQL releases or your infrastructure changes introduce new edge cases. Community support remains free, as always.
If pgcopydb has earned its place in your infrastructure, making it sustainable is a conversation worth having with your team.
