Move tables, schema, and data from Google BigQuery to ClickHouse with minimal downtime.
Teams switch to ClickHouse to cut query latency and cost for high-volume analytics. Migrating keeps historical data accessible in the new engine.
Use the EXPORT DATA statement to write each source table into compressed Parquet files stored in Google Cloud Storage (GCS). Parquet preserves column types and is natively understood by ClickHouse.
EXPORT DATA OPTIONS(
  uri='gs://bq_exports/customers_*.parquet',
  format='PARQUET',
  compression='SNAPPY'
) AS
SELECT * FROM `project.dataset.Customers`;
Connect to ClickHouse and run CREATE TABLE with column types that best match the BigQuery schema. Use a MergeTree engine for OLAP workloads.
CREATE TABLE Customers (
    id UInt64,
    name String,
    email String,
    created_at DateTime
)
ENGINE = MergeTree
ORDER BY id;
Run INSERT INTO ... SELECT * with the s3 table function to read the Parquet files directly from GCS, or pipe the files through clickhouse-client --query using the input() table function.
clickhouse-client --query "
  INSERT INTO Customers
  SELECT * FROM s3('https://storage.googleapis.com/bq_exports/customers_*.parquet',
                   'GCS_ACCESS', 'GCS_SECRET', 'Parquet')"
Yes. Wrap the SELECT in your INSERT to cast types, split JSON, or aggregate rows while streaming the Parquet input.
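For example, a load that normalizes types while streaming might look like the sketch below; the JSON column attributes and the string-typed created_at are assumptions for illustration, not part of the schema shown earlier.

INSERT INTO Customers
SELECT
    toUInt64(id),                                       -- cast BigQuery INT64 to UInt64
    name,
    JSONExtractString(attributes, 'email') AS email,    -- pull one field out of a JSON column
    parseDateTimeBestEffort(created_at) AS created_at   -- parse a string timestamp into DateTime
FROM s3('https://storage.googleapis.com/bq_exports/customers_*.parquet',
        'GCS_ACCESS', 'GCS_SECRET', 'Parquet');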
Schedule incremental BigQuery exports based on created_at or updated_at and append them to ClickHouse until your application flips to the new database.
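A delta export might look like the following sketch; the customers_delta_ file prefix and the one-day window are assumptions you would adjust to your own schedule.

EXPORT DATA OPTIONS(
  uri='gs://bq_exports/customers_delta_*.parquet',
  format='PARQUET',
  compression='SNAPPY'
) AS
SELECT * FROM `project.dataset.Customers`
WHERE updated_at >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY);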
Split exports by date range, load them in parallel, and batch inserts into large blocks so ClickHouse creates fewer small parts; raise max_partitions_per_insert_block if a single insert touches more partitions than the default limit allows.
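One worker per date slice could look like the sketch below; the date-stamped object names are an assumption about how you split the exports.

# Load a single date slice; run one such command per slice in parallel
clickhouse-client --query "
  INSERT INTO Customers
  SELECT * FROM s3('https://storage.googleapis.com/bq_exports/customers_2024-01-15_*.parquet',
                   'GCS_ACCESS', 'GCS_SECRET', 'Parquet')
  SETTINGS max_partitions_per_insert_block = 1000"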
Mismatch in numeric precision? Cast to Decimal(18,2). UTF-8 issues? Force String columns and clean later.
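Both fixes can be applied inside the loading query itself; the columns amount and comment and the orders_ file pattern below are hypothetical, used only to show the casts.

SELECT
    toDecimal64(amount, 2) AS amount,   -- Decimal(18,2): 18 digits of precision, scale 2
    toValidUTF8(comment)   AS comment   -- replace invalid UTF-8 sequences instead of failing
FROM s3('https://storage.googleapis.com/bq_exports/orders_*.parquet',
        'GCS_ACCESS', 'GCS_SECRET', 'Parquet');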
No. The fastest path is exporting to storage (GCS or S3-compatible) and loading via ClickHouse's s3 or url table functions.
Roughly 150-200 GB/hour per ClickHouse node when loading compressed Parquet. Parallelize loads to maximize throughput.
Yes. Use BigQuery scheduled queries that append only the last hour or day of data to GCS, plus a cron job on the ClickHouse side to load them.
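A minimal hourly loader could be as simple as the cron entry below; in practice you would also track or move already-loaded files (the customers_delta_ prefix is an assumption) so the same rows are not inserted twice.

# Hypothetical hourly load of the latest delta export
0 * * * * clickhouse-client --query "INSERT INTO Customers SELECT * FROM s3('https://storage.googleapis.com/bq_exports/customers_delta_*.parquet', 'GCS_ACCESS', 'GCS_SECRET', 'Parquet')"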