How to Migrate On-Premise PostgreSQL to BigQuery in PostgreSQL

Galaxy Glossary

How do I migrate an on-premise PostgreSQL database to Google BigQuery?

Move data from your local PostgreSQL server to Google BigQuery for scalable analytics.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.

Description

Why migrate PostgreSQL to BigQuery?

BigQuery eliminates hardware maintenance, scales automatically, and lets analysts query terabytes in seconds. Migrating frees on-prem resources and unifies analytics.

What are the prerequisites?

Install pg_dump, gsutil, the bq CLI, and create a Google Cloud project with a BigQuery dataset. Open outbound network ports on the PostgreSQL host.

How do I export on-premise data?

Using pg_dump for table-level exports

Run pg_dump -Fc -t Customers -t Orders -t OrderItems -t Products ecommerce > ecommerce.dump. The custom format is compressed and resumable.

Handling large tables with parallel dumps

For multi-GB tables, use pg_dump -Fd -j 4 ecommerce -f ./dump_dir to split output into chunks and leverage multiple CPU cores.

How do I transfer dumps to Cloud Storage?

Create a bucket: gsutil mb gs://ecommerce-dumps. Upload: gsutil cp ecommerce.dump gs://ecommerce-dumps/. Use the -m flag for parallel uploads on large files.

How do I load data into BigQuery?

Loading a single table

Extract CSV: pg_restore -t Customers -f Customers.csv ecommerce.dump. Then load: bq load --autodetect ecommerce.Customers gs://ecommerce-dumps/Customers.csv.

Automating multi-table loads

Write a shell loop to restore each table to CSV and call bq load. For >1 TB, schedule a Transfer Service job pointing at the bucket.

How do I validate the migration?

Compare row counts: SELECT COUNT(*) FROM Customers; in both systems. For data quality, checksum numeric columns or sample 1% of rows.

Best practices for incremental syncs

Add updated_at columns and export only rows changed since the last run using WHERE updated_at > :last_sync. Push new files to the same bucket and append in BigQuery.

Common pitfalls and fixes

Encoding mismatches break loads—always export in UTF-8 with --encoding=UTF8. Timestamp precision loss occurs if you let autodetect pick STRING; explicitly supply a schema or use JSONLines.

When should I use federated queries instead?

If data changes every minute and you only need read access, create a BigQuery external table via Cloud SQL federation instead of constant full loads.

Summary checklist

Export → upload → load → validate. Automate each step with CI jobs or Cloud Functions and monitor failures in Stackdriver.

Why How to Migrate On-Premise PostgreSQL to BigQuery in PostgreSQL is important

How to Migrate On-Premise PostgreSQL to BigQuery in PostgreSQL Example Usage


-- Verify migrated data counts match
-- On PostgreSQL
SELECT COUNT(*) AS pg_count FROM Customers;

-- On BigQuery
SELECT COUNT(*) AS bq_count FROM `ecommerce.Customers`;

How to Migrate On-Premise PostgreSQL to BigQuery in PostgreSQL Syntax


# 1. Dump on-prem tables to custom format
pg_dump -Fc -t Customers -t Orders -t Products -t OrderItems \
       -h 10.0.0.5 -U admin ecommerce > ecommerce.dump

# 2. Copy to Cloud Storage
gsutil cp ecommerce.dump gs://ecommerce-dumps/

# 3. Restore a table to CSV for BigQuery load
pg_restore -t Customers -F c -f Customers.csv ecommerce.dump

# 4. Load into BigQuery
bq load --source_format=CSV --autodetect \
        ecommerce.Customers \
        gs://ecommerce-dumps/Customers.csv

Common Mistakes

Frequently Asked Questions (FAQs)

Is there downtime during migration?

No, you can run pg_dump with --snapshot on PostgreSQL 13+ to avoid locks, then schedule a brief cut-over.

Can I automate daily syncs?

Yes. Use cron, Cloud Functions, or Airflow to export changed rows and call bq load --replace=false.

How are indexes handled?

BigQuery is column-oriented and does not import PostgreSQL indexes. Rely on clustering and partitioning instead.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie
BauHealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.