Move tables, views, and pipelines from Google BigQuery into Snowflake using export-to-GCS + Snowflake COPY.
List every dataset, table, view, and scheduled query in BigQuery. Tag owners, data size, and refresh cadence. Map each object to a Snowflake database, schema, role, and warehouse. This inventory drives later automation scripts.
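As a starting point for that inventory, here is a minimal sketch against BigQuery metadata (my_project and ecom are placeholder project/dataset names): the __TABLES__ metadata view returns per-table row counts and storage size, and INFORMATION_SCHEMA.VIEWS returns the view definitions you will re-create later.
-- Placeholder project/dataset names; run once per dataset.
SELECT table_id, row_count, size_bytes
FROM `my_project.ecom.__TABLES__`
ORDER BY size_bytes DESC;

-- View definitions to port to Snowflake after the base tables land.
SELECT table_name, view_definition
FROM `my_project.ecom.INFORMATION_SCHEMA.VIEWS`;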
1) Export BigQuery tables to Google Cloud Storage (GCS).
2) Set up an external stage in Snowflake that points to the GCS bucket.
3) Create target tables in Snowflake.
4) COPY data in.
5) Re-create views, jobs, and permissions.
6) Validate row counts and checksums.
Use the EXPORT DATA statement or the bq extract CLI. Use a wildcard URI so large tables are split into multiple files (a single export file is capped at 1 GB), and enable compression for faster transfer.
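For very large partitioned tables, one hedged approach is to export one partition per job so exports run in parallel; the filter below assumes order_date is the partition column of ecom.Orders (an assumption, adjust to your schema) and writes each partition to its own GCS prefix.
-- Sketch: export a single day's partition to its own prefix.
EXPORT DATA
  OPTIONS (
    uri = 'gs://bq_to_sf/orders/dt=2024-01-01/*.parquet',
    format = 'PARQUET',
    compression = 'SNAPPY')
AS
SELECT * FROM `ecom.Orders`
WHERE order_date = DATE '2024-01-01';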
Snowflake loads GCS files through CREATE STAGE and COPY INTO. Key options: FILE_FORMAT, ON_ERROR, MATCH_BY_COLUMN_NAME, FORCE, and PURGE. Set MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE to tolerate column order changes.
The example below exports the Orders table from BigQuery, defines a Snowflake stage, and copies the data into the Snowflake ORDERS table.
EXPORT DATA
OPTIONS (
uri='gs://bq_to_sf/orders/*.parquet',
format='PARQUET',
compression='SNAPPY')
AS
SELECT * FROM `ecom.Orders`;
CREATE STAGE gcs_orders_stage
  url = 'gcs://bq_to_sf/orders/'
  storage_integration = gcs_int  -- GCS stages authenticate through a storage integration created beforehand (see the storage integration note at the end), not inline keys
  file_format = (type = parquet);
CREATE OR REPLACE TABLE Orders (
id BIGINT,
customer_id BIGINT,
order_date DATE,
total_amount NUMBER(10,2)
);
COPY INTO Orders
FROM @gcs_orders_stage
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE
ON_ERROR = ABORT_STATEMENT;
Compare row counts (SELECT COUNT(*)) and aggregate checks such as SUM(total_amount) between BigQuery and Snowflake. Automate these checks in CI pipelines to catch truncation or type mismatches early.
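A minimal validation sketch against the Orders example above: run the same aggregates on both sides and diff the results (the comparison itself can be scripted in whatever CI tool you use).
-- BigQuery
SELECT COUNT(*) AS row_count, SUM(total_amount) AS amount_sum
FROM `ecom.Orders`;

-- Snowflake (results should match exactly)
SELECT COUNT(*) AS row_count, SUM(total_amount) AS amount_sum
FROM Orders;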
• Parallelize exports with partition decorators.
• Compress files (Parquet or GZIP CSV) to cut egress costs.
• Use Snowflake Snowpipe or Streams + Tasks for continuous loads (sketched below).
• Migrate views after base tables so dependencies resolve.
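For the Streams + Tasks route, one hedged pattern is: continuous loads land in a raw table first, a stream tracks the new rows, and a scheduled task merges them into the final table. The orders_raw landing table, load_wh warehouse, and merge key below are assumptions, not part of the original example.
-- Hypothetical landing table that continuous loads write into.
CREATE TABLE IF NOT EXISTS orders_raw LIKE Orders;

-- Stream tracks rows arriving in the landing table.
CREATE OR REPLACE STREAM orders_stream ON TABLE orders_raw;

-- Task merges new rows into the final table every 5 minutes when data exists.
CREATE OR REPLACE TASK merge_orders_task
  WAREHOUSE = load_wh            -- placeholder warehouse name
  SCHEDULE = '5 MINUTE'
  WHEN SYSTEM$STREAM_HAS_DATA('orders_stream')
AS
MERGE INTO Orders t
USING orders_stream s ON t.id = s.id
WHEN MATCHED THEN UPDATE SET
  t.customer_id = s.customer_id,
  t.order_date = s.order_date,
  t.total_amount = s.total_amount
WHEN NOT MATCHED THEN INSERT (id, customer_id, order_date, total_amount)
  VALUES (s.id, s.customer_id, s.order_date, s.total_amount);

-- Tasks are created suspended; resume to start the schedule.
ALTER TASK merge_orders_task RESUME;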
Yes. Schedule a recurring EXPORT DATA job (BigQuery scheduled queries, which run on the Data Transfer Service) to keep writing fresh files to GCS, or leverage Snowpipe with GCS event notifications for near-real-time ingestion.
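For the Snowpipe route, a sketch assuming a Pub/Sub subscription already receives the bucket's object-created notifications; the subscription path and the integration/pipe names are placeholders.
-- Notification integration over the Pub/Sub subscription.
CREATE NOTIFICATION INTEGRATION gcs_notify_int
  TYPE = QUEUE
  NOTIFICATION_PROVIDER = GCP_PUBSUB
  ENABLED = TRUE
  GCP_PUBSUB_SUBSCRIPTION_NAME = 'projects/my_project/subscriptions/bq_to_sf_sub';

-- Pipe that auto-ingests new files landing under the stage path.
CREATE PIPE orders_pipe
  AUTO_INGEST = TRUE
  INTEGRATION = 'GCS_NOTIFY_INT'
AS
COPY INTO Orders
FROM @gcs_orders_stage
MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;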
Most ANSI SQL carries over, but views that use BigQuery ARRAYs and STRUCTs need rework: UNNEST calls typically become LATERAL FLATTEN, and nested records map to Snowflake VARIANT/OBJECT columns. Test each view and adjust.
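As an illustration of that rewrite, assuming a hypothetical repeated items field holding SKUs (not part of the Orders example above):
-- BigQuery: explode a repeated `items` field.
SELECT o.id, item.sku
FROM `ecom.Orders` AS o, UNNEST(o.items) AS item;

-- Snowflake: same logic against a VARIANT/ARRAY column.
SELECT o.id, item.value:sku::STRING AS sku
FROM Orders o,
  LATERAL FLATTEN(input => o.items) item;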
Create a scoped storage integration and reference it in CREATE STAGE so that secrets are managed by Snowflake, not in plain SQL.
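A sketch of that setup (integration, stage, and bucket names are placeholders); DESC STORAGE INTEGRATION reveals the Snowflake-managed service account that must be granted read access on the GCS bucket.
CREATE STORAGE INTEGRATION gcs_int
  TYPE = EXTERNAL_STAGE
  STORAGE_PROVIDER = 'GCS'
  ENABLED = TRUE
  STORAGE_ALLOWED_LOCATIONS = ('gcs://bq_to_sf/orders/');

-- Shows the Snowflake service account to grant on the bucket.
DESC STORAGE INTEGRATION gcs_int;

-- Stage referencing the integration instead of embedding keys.
CREATE OR REPLACE STAGE gcs_orders_stage
  URL = 'gcs://bq_to_sf/orders/'
  STORAGE_INTEGRATION = gcs_int
  FILE_FORMAT = (TYPE = PARQUET);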