Plan tables, columns, and options in BigQuery to store data efficiently and query it fast.
Well-planned schemas lower storage costs, speed up queries, and simplify maintenance. Bad schemas force expensive scans, complicate joins, and create data silos.
Use CREATE TABLE
with column names, data types, and optional partitioning and clustering clauses. Always match data types to the source system to avoid costly casts.
CREATE TABLE myshop.Customers (
id INT64,
name STRING,
email STRING,
created_at TIMESTAMP
) PARTITION BY DATE(created_at);
Partition on a column heavily used in WHERE
filters, typically DATE
or TIMESTAMP
. For event or order data, partition by order date to restrict scans to relevant days.
Cluster on columns frequently used in equality filters or joins, like customer_id
. Clustering further prunes data blocks after partition pruning completes.
CREATE TABLE myshop.Orders (
id INT64,
customer_id INT64,
order_date DATE,
total_amount NUMERIC
) PARTITION BY order_date
CLUSTER BY customer_id;
Keep separate tables and join when needed. For Orders → OrderItems use INT64 foreign keys. Repeated nested records are possible but hinder cross-table analysis.
Prefer INT64
for IDs, NUMERIC
for currency, STRING
for free text, and TIMESTAMP/DATE
for temporal data. Avoid FLOAT64
for money to prevent rounding errors.
Add nullable columns with ALTER TABLE ... ADD COLUMN
. Backfill values in a separate job. To drop columns, create a new table and copy only required fields.
Use lowercase snake_case for tables and columns. Prefix surrogate keys with id
, e.g., customer_id
. Keep dataset names short, like myshop
.
Load a sample data set, run representative queries, and review bytes_processed
in the query plan. Iterate on partition and clustering choices until scans fall within acceptable limits.
No. BigQuery doesn’t support in-place type changes. Create a new column, backfill, then drop the old one.
Up to four. Order matters; use the most selective column first.
No extra storage cost, but each partition has 10 MB minimum billing. Small tables may not benefit.