Move tables, schema, and data from a MySQL database into Google BigQuery using export-and-load or direct transfer commands.
Teams move OLTP data into BigQuery to run large analytical queries, join with other sources, and leverage Google’s serverless compute. BigQuery scales reads without the operational overhead of sharding MySQL.
1) Export MySQL tables to Cloud Storage.
2) Create matching BigQuery datasets and tables.
3) Load the files with bq load or schedule a BigQuery Data Transfer Service (DTS) transfer.
4) Validate row counts and column types.
Use mysqldump --tab for delimited data, or SELECT ... INTO OUTFILE. Both write the data files on the MySQL server host, so copy them to Cloud Storage afterwards. Save each table as a CSV or newline-delimited JSON file in Cloud Storage.
mysqldump -u root -p ecommerce \
  --tables Customers Orders Products OrderItems \
  --tab=/tmp/ecommerce_export \
  --fields-terminated-by="," --fields-enclosed-by="\""
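With --tab, mysqldump writes one .txt data file per table (plus a .sql schema file) into the export directory. The following is a minimal sketch of compressing those files and copying them to Cloud Storage with gsutil. The bucket name my-bucket, the DRY_RUN switch, and the gcs_target helper are illustrative assumptions, not part of the original workflow.

```shell
#!/usr/bin/env bash
# Sketch: compress mysqldump --tab output and copy it to Cloud Storage.
# EXPORT_DIR matches the --tab path above; BUCKET and DRY_RUN are assumptions.
EXPORT_DIR="${EXPORT_DIR:-/tmp/ecommerce_export}"
BUCKET="${BUCKET:-my-bucket}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the commands

# Hypothetical helper: map a local export file to its object name in the bucket.
gcs_target() {
  local table
  table="$(basename "$1" .txt)"
  printf 'gs://%s/%s.csv.gz\n' "$BUCKET" "$table"
}

upload_table() {
  local src="$1" dest
  dest="$(gcs_target "$src")"
  if [ "$DRY_RUN" = "1" ]; then
    echo "gzip -c $src | gsutil cp - $dest"
  else
    # Stream the compressed file straight into the bucket.
    gzip -c "$src" | gsutil cp - "$dest"
  fi
}

for f in "$EXPORT_DIR"/*.txt; do
  [ -e "$f" ] || continue   # skip when the glob matches nothing
  upload_table "$f"
done
```

Run it once with the default DRY_RUN=1 to review the commands, then with DRY_RUN=0 to perform the upload.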
Define a dataset, then create tables matching the MySQL schema. Use SQL DDL or auto-detect during load.
bq mk --location=US ecommerce_raw
CREATE TABLE `ecommerce_raw.Customers` (
  id INT64 NOT NULL,
  name STRING,
  email STRING,
  created_at TIMESTAMP
);
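When many tables need matching DDL, a small type-mapping helper can reduce manual work. The sketch below covers common MySQL types only; the function name mysql_to_bq_type is hypothetical, the DATETIME mapping follows this guide's TIMESTAMP convention, and unknown types fall back to STRING for manual review.

```shell
# Sketch: map a MySQL column type to a BigQuery type (common cases only).
mysql_to_bq_type() {
  # Lower-case the input and drop any suffix such as "(255)" or " unsigned".
  local t
  t="$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]' | sed 's/[( ].*//')"
  case "$t" in
    tinyint|smallint|mediumint|int|integer|bigint)  echo "INT64" ;;
    float|double)                                   echo "FLOAT64" ;;
    decimal|numeric)                                echo "NUMERIC" ;;
    char|varchar|tinytext|text|mediumtext|longtext) echo "STRING" ;;
    blob|binary|varbinary)                          echo "BYTES" ;;
    date)                                           echo "DATE" ;;
    time)                                           echo "TIME" ;;
    datetime|timestamp)                             echo "TIMESTAMP" ;;
    json)                                           echo "JSON" ;;
    *)                                              echo "STRING" ;;  # fallback; review manually
  esac
}

mysql_to_bq_type "VARCHAR(255)"
mysql_to_bq_type "int unsigned"
mysql_to_bq_type "datetime"
```

Note that BIGINT UNSIGNED values above the signed 64-bit range do not fit in INT64, so such columns may need NUMERIC instead.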
Run bq load for each table, or script the process. Specify the source format and field delimiter, and skip leading rows only when the file has a header line (mysqldump --tab output has none).
bq load --source_format=CSV --field_delimiter="," \
  --skip_leading_rows=1 \
  ecommerce_raw.Customers \
  gs://my-bucket/Customers.csv \
  id:INT64,name:STRING,email:STRING,created_at:TIMESTAMP
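Scripting the loads could look like the sketch below, which builds one bq load command per exported table and prints it for review. The build_load_cmd helper is hypothetical, and because only the Customers schema is given here, tables without an explicit schema fall back to --autodetect (an assumption; explicit schemas are safer).

```shell
# Sketch: build one bq load command per table and print it for review.
# DATASET and BUCKET reuse the names from the examples above.
DATASET="${DATASET:-ecommerce_raw}"
BUCKET="${BUCKET:-my-bucket}"

# Hypothetical helper: $1 = table name, $2 = optional inline schema.
build_load_cmd() {
  local table="$1" schema="${2:-}"
  local cmd="bq load --source_format=CSV --field_delimiter=,"
  if [ -z "$schema" ]; then
    cmd="$cmd --autodetect"   # no schema supplied, let BigQuery infer it
  fi
  cmd="$cmd $DATASET.$table gs://$BUCKET/$table.csv"
  if [ -n "$schema" ]; then
    cmd="$cmd $schema"
  fi
  printf '%s\n' "$cmd"
}

build_load_cmd Customers "id:INT64,name:STRING,email:STRING,created_at:TIMESTAMP"
for table in Orders Products OrderItems; do
  build_load_cmd "$table"
done
```

Once the printed commands look right, they can be piped to a shell or pasted into a job runner.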
Use BigQuery DTS for Cloud SQL or Cloud Storage. DTS monitors new files and ingests them on a schedule, keeping BigQuery in sync without custom code.
Compare counts and sample rows. Run a SELECT COUNT(*) query against each table in both systems. Use checksums or hashed concatenations of key columns for extra assurance.
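The count comparison can be scripted along these lines. This is a minimal sketch: compare_counts is a hypothetical helper, and the commented wiring assumes the mysql and bq clients are installed and authenticated.

```shell
# Sketch: compare a MySQL row count with a BigQuery row count for one table.
compare_counts() {
  local table="$1" mysql_count="$2" bq_count="$3"
  if [ "$mysql_count" -eq "$bq_count" ]; then
    echo "OK $table: $mysql_count rows"
  else
    echo "MISMATCH $table: mysql=$mysql_count bigquery=$bq_count"
    return 1
  fi
}

# Example wiring (not executed here):
#   m=$(mysql -N -B -u root -p ecommerce -e 'SELECT COUNT(*) FROM Customers')
#   b=$(bq query --nouse_legacy_sql --format=csv \
#         'SELECT COUNT(*) FROM ecommerce_raw.Customers' | tail -n 1)
#   compare_counts Customers "$m" "$b"
```

The non-zero return code on a mismatch lets a wrapper script or CI job stop before downstream queries rely on incomplete data.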
• Export to GZip-compressed CSV to cut transfer costs.
• Use partitioned tables in BigQuery for time-based data.
• Convert MySQL DATETIME to BigQuery TIMESTAMP.
• Grant least-privilege IAM roles.
Incorrect encoding, mismatched null handling, and default string lengths cause load failures. Always set --encoding=UTF-8 and use explicit schemas.
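A pre-flight check can catch encoding problems before a load job fails. The sketch below uses iconv to confirm that an export file is valid UTF-8; the is_utf8 and check_export helpers are hypothetical. The commented load command also passes --null_marker, on the assumption that the file came from mysqldump --tab, which by default writes NULL as \N.

```shell
# Sketch: verify that an export file is valid UTF-8 before loading it.
is_utf8() {
  # iconv exits non-zero on the first byte sequence that is not valid UTF-8.
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

check_export() {
  if is_utf8 "$1"; then
    echo "ok: $1"
  else
    echo "not UTF-8: $1"
    return 1
  fi
}

# After the check passes, a load might look like this (not executed here):
#   bq load --source_format=CSV --encoding=UTF-8 --null_marker='\N' \
#     ecommerce_raw.Customers gs://my-bucket/Customers.csv \
#     id:INT64,name:STRING,email:STRING,created_at:TIMESTAMP
```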
Ongoing syncs can be automated: use the BigQuery Data Transfer Service for Cloud SQL or a Dataflow pipeline to stream data.
Primary keys and other constraints do not carry over as enforced rules: BigQuery is columnar and doesn’t enforce primary keys. Re-create constraints in downstream tools or document them.
Typical loads finish in minutes because BigQuery parallelizes ingest. Network speed to Cloud Storage is often the bottleneck.