Export data from PostgreSQL, recreate tables in ClickHouse, load files, and validate counts for a fast, column-oriented analytics workload.
ClickHouse delivers sub-second analytics over billions of rows thanks to columnar storage and vectorized execution. Moving large, read-heavy datasets out of Postgres removes VACUUM overhead and frees your OLTP database for transactional traffic.
For one-off moves, use COPY ... TO STDOUT in Postgres and clickhouse-client --query "INSERT ... FORMAT CSV". For ongoing syncs, rely on the MaterializedPostgreSQL engine or Debezium CDC into Kafka.
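As a minimal sketch of the one-off path, assuming a Customers table like the one created below and default connection settings, the export and load can be a single pipe with no intermediate file:

psql -c "COPY Customers TO STDOUT WITH (FORMAT csv)" | clickhouse-client --query "INSERT INTO Customers FORMAT CSV"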
Run COPY in parallel per table, writing to a local CSV or to STDOUT piped into gzip. Text and CSV formats are fastest; include HEADER for clarity.
COPY Customers TO '/tmp/customers.csv' WITH (FORMAT csv, HEADER true);
COPY Orders TO '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);
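Where the files have to cross the network, a sketch of the parallel STDOUT-plus-gzip variant (paths and table names follow the examples above):

psql -c "COPY Customers TO STDOUT WITH (FORMAT csv, HEADER true)" | gzip > /tmp/customers.csv.gz &
psql -c "COPY Orders TO STDOUT WITH (FORMAT csv, HEADER true)" | gzip > /tmp/orders.csv.gz &
wait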
Pick a MergeTree engine keyed by frequently-filtered columns. Data types must match the Postgres originals; replace serial with UInt32/UInt64 and timestamp with DateTime.
CREATE TABLE Customers (
    id UInt32,
    name String,
    email String,
    created_at DateTime
) ENGINE = MergeTree
ORDER BY id;
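The Orders table follows the same pattern; these columns are assumptions for illustration, but a composite ORDER BY key matching the most common filters (customer, then time) is the typical choice:

CREATE TABLE Orders (
    id UInt64,
    customer_id UInt32,
    amount Decimal(12, 2),
    created_at DateTime
) ENGINE = MergeTree
ORDER BY (customer_id, created_at);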
Use clickhouse-client with --query "INSERT INTO … FORMAT CSVWithNames" (plain FORMAT CSV would try to ingest the header row written by COPY). Add --max_insert_block_size to tune batch size for large loads.
clickhouse-client --query "INSERT INTO Customers FORMAT CSVWithNames" < /tmp/customers.csv
clickhouse-client --query "INSERT INTO Orders FORMAT CSVWithNames" < /tmp/orders.csv
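If the exports were gzipped for transfer, a sketch of a streaming load; the block size shown is only a starting point, not a tuned value:

zcat /tmp/orders.csv.gz | clickhouse-client --max_insert_block_size=1048576 --query "INSERT INTO Orders FORMAT CSVWithNames"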
Compare counts in both systems before unlocking production. Use SELECT count(*) on ClickHouse and Postgres. Mismatches usually mean NULL text fields or delimiter issues.
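A minimal check for the two example tables (the psql flags strip headers and alignment so the numbers can be diffed directly):

# Postgres
psql -tAc "SELECT count(*) FROM Customers"
# ClickHouse
clickhouse-client --query "SELECT count(*) FROM Customers"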
Create a MaterializedPostgreSQL database to stream changes into ClickHouse automatically. Alternatively, capture WAL changes with Debezium → Kafka → the ClickHouse Kafka engine.
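A sketch of the first option; host, database, and credentials are placeholders, and because the engine is still experimental the enabling setting may differ between ClickHouse versions:

SET allow_experimental_database_materialized_postgresql = 1;

CREATE DATABASE pg_mirror
ENGINE = MaterializedPostgreSQL('postgres-host:5432', 'shop_db', 'repl_user', 'secret')
SETTINGS materialized_postgresql_tables_list = 'customers,orders';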
Disable foreign keys in Postgres exports, cast all timestamps to UTC, compress files with gzip for network transfers, and load the smallest dimension tables first to support early query tests.
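For the UTC point, one way is to normalize timestamps inside the export query itself (the column list is illustrative and assumes a timestamptz source column):

COPY (
    SELECT id, customer_id, amount, created_at AT TIME ZONE 'UTC' AS created_at
    FROM Orders
) TO '/tmp/orders.csv' WITH (FORMAT csv, HEADER true);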
Dashboard queries that scanned millions of rows in Postgres typically return in milliseconds in ClickHouse. Storage footprint shrinks 3-8× with default compression.
No. pg_dump produces a Postgres-specific SQL dump (COPY blocks or INSERT statements) that ClickHouse cannot ingest directly. Use COPY exports or logical replication engines instead.
Traditional B-tree indexes are unnecessary. ClickHouse’s sparse primary index on the ORDER BY key, plus optional data-skipping indexes over granules, replaces most secondary indexes.
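When a hot filter column is not part of the ORDER BY key, a data-skipping index is the usual fallback; a sketch with an assumed column and granularity:

ALTER TABLE Customers ADD INDEX email_bloom email TYPE bloom_filter GRANULARITY 4;
ALTER TABLE Customers MATERIALIZE INDEX email_bloom;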
Set up MaterializedPostgreSQL or stream WAL changes with Debezium into Kafka and let ClickHouse consume the topic.
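A sketch of the Kafka consumption side; broker, topic, and format are placeholders, and Debezium’s change envelope usually needs flattening before it matches the target table:

CREATE TABLE orders_queue (
    id UInt64,
    customer_id UInt32,
    amount Decimal(12, 2),
    created_at DateTime
) ENGINE = Kafka
SETTINGS kafka_broker_list = 'kafka:9092',
         kafka_topic_list = 'pg.public.orders',
         kafka_group_name = 'clickhouse_orders',
         kafka_format = 'JSONEachRow';

CREATE MATERIALIZED VIEW orders_mv TO Orders AS
SELECT id, customer_id, amount, created_at
FROM orders_queue;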