How to Unit Test in BigQuery

How do I write and automate unit tests in BigQuery?

Unit testing in BigQuery isolates and validates SQL logic by comparing actual query results to expected outcomes directly inside the warehouse.

Description

What is unit testing in BigQuery?

Unit testing in BigQuery runs small, deterministic SQL snippets that verify one transformation or calculation at a time. By checking actual results against expected rows inside the warehouse, you catch logic errors before dashboards break.
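
As a minimal sketch of what "one calculation at a time" can look like, the statement below checks a single hypothetical discount rule with ASSERT. The rule, the numbers, and the message are assumptions made up for illustration; they do not come from any real schema.

```sql
-- Hypothetical rule: a 10% discount on 200.00 yields 180.00.
-- ASSERT raises an error when the expression is FALSE or NULL,
-- and returns nothing when it is TRUE.
ASSERT (SELECT ROUND(200.00 * (1 - 0.10), 2)) = 180.00
  AS 'discount calculation should yield 180.00';
```

Because the inputs are literals, the check is deterministic and touches no tables.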

Why should I write tests?

Tests prevent regressions, document intent, and speed debugging when schemas or business rules change. They are essential for teams sharing queries or powering revenue-critical reports.

Which methods can I use?

Use native BigQuery scripting (DECLARE, CREATE TEMP TABLE, ASSERT) or wrappers like Dataform, dbt, and bqunit. Native scripts give full control; frameworks add orchestration and reporting.

How do I structure a test case?

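1) Create a TEMP TABLE expected with hard-coded rows. 2) Build a TEMP TABLE actual using your production SQL. 3) Compare the two with EXCEPT DISTINCT, in both directions so that missing and extra rows are both caught. 4) Use ASSERT to raise an error when any mismatched rows remain; a passing test produces zero mismatches.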
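
The four steps above can be sketched as a self-contained script that reads only fixture data. Everything here is illustrative: order_items_fixture, its columns, and the sample values are assumptions, and the SUM(quantity * price) rule stands in for whatever production SQL you are testing.

```sql
-- Input fixture (hypothetical): three order lines across two orders
CREATE TEMP TABLE order_items_fixture AS
SELECT 1 AS order_id, 2 AS quantity, NUMERIC '10.00' AS price
UNION ALL SELECT 1, 1, NUMERIC '5.00'
UNION ALL SELECT 2, 3, NUMERIC '4.00';

-- 1. Expected rows, hard-coded
CREATE TEMP TABLE expected AS
SELECT 1 AS order_id, NUMERIC '25.00' AS total_amount
UNION ALL SELECT 2, NUMERIC '12.00';

-- 2. Actual rows, produced by the logic under test
CREATE TEMP TABLE actual AS
SELECT order_id, SUM(quantity * price) AS total_amount
FROM order_items_fixture
GROUP BY order_id;

-- 3 and 4. Compare in both directions and fail on any difference
ASSERT NOT EXISTS (
  (SELECT * FROM expected EXCEPT DISTINCT SELECT * FROM actual)
  UNION ALL
  (SELECT * FROM actual EXCEPT DISTINCT SELECT * FROM expected)
) AS 'order totals do not match expected fixture rows';
```

Using NUMERIC for money keeps the comparison exact, and the parentheses around each EXCEPT DISTINCT are needed because BigQuery does not allow different set operators to be mixed without them.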

Can I see a concrete example?

The example below ensures each Orders.total_amount equals the sum of its OrderItems. A non-zero mismatch count triggers an ASSERT failure.

How do I automate tests?

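Store each test in its own .sql file, then run it in CI (Cloud Build, GitHub Actions) by piping the file to the bq CLI, for example bq query --use_legacy_sql=false < order_total_test.sql (the file name is illustrative). A failed ASSERT fails the query job, so fail the pipeline whenever bq exits non-zero.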

Best practices for BigQuery unit tests?

Keep fixtures tiny, cover edge cases, name tests after rules, clean up temp objects, and run tests locally before committing.

How to Unit Test in BigQuery Example Usage


-- Validate that Orders.total_amount equals SUM(quantity * price) over its OrderItems
DECLARE test_name STRING DEFAULT 'order_total_accuracy';

-- Expected: the totals stored on the sample orders
CREATE TEMP TABLE expected AS
SELECT o.id AS order_id, o.total_amount
FROM `project.dataset.Orders` o
WHERE o.id IN (101,102);

-- Actual: the same totals recomputed from line items
CREATE TEMP TABLE actual AS
SELECT o.id AS order_id,
       SUM(oi.quantity*p.price) AS total_amount
FROM `project.dataset.Orders` o
JOIN `project.dataset.OrderItems` oi ON oi.order_id = o.id
JOIN `project.dataset.Products` p   ON p.id = oi.product_id
WHERE o.id IN (101,102)
GROUP BY o.id;

-- Fail if any stored total has no identical recomputed total.
-- ASSERT takes its message with AS, not DESCRIPTION.
ASSERT (SELECT COUNT(1)
        FROM (SELECT * FROM expected EXCEPT DISTINCT SELECT * FROM actual)
       ) = 0
  AS 'order_total_accuracy: totals mismatch for sample orders';

How to Unit Test in BigQuery Syntax


-- BigQuery script template for a unit test
DECLARE test_name STRING DEFAULT 'order_total_matches_items';

-- 1. Expected rows
CREATE TEMP TABLE expected AS
SELECT 1 AS order_id, 220.00 AS total_amount
UNION ALL
SELECT 2, 150.00;

-- 2. Actual rows from production logic
CREATE TEMP TABLE actual AS
SELECT o.id AS order_id,
       SUM(oi.quantity * p.price) AS total_amount
FROM `project.dataset.Orders` o
JOIN `project.dataset.OrderItems` oi ON oi.order_id = o.id
JOIN `project.dataset.Products` p   ON p.id = oi.product_id
WHERE o.id IN (1,2)
GROUP BY o.id;

-- 3. Detect mismatches in both directions
-- (parentheses are required when mixing EXCEPT DISTINCT with UNION ALL)
CREATE TEMP TABLE mismatches AS
(SELECT * FROM expected
 EXCEPT DISTINCT
 SELECT * FROM actual)
UNION ALL
(SELECT * FROM actual
 EXCEPT DISTINCT
 SELECT * FROM expected);

-- 4. Assert zero mismatches
-- (the message follows AS and is written as a string literal)
ASSERT (SELECT COUNT(1) FROM mismatches) = 0
  AS 'Test failed: order_total_matches_items';

-- 5. Optional success output
SELECT 'PASS' AS status, test_name AS test;

Common Mistakes

Relying on full tables instead of small fixtures. Large datasets slow tests and hide edge cases. Fix by inserting only the minimal rows needed to prove the rule.
Comparing floats with direct equality. Currency or percentage columns can differ by tiny decimals. Use ROUND or tolerance checks (e.g., ABS(a-b)<0.01).
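
A tolerance check like the one described above might look like the following sketch. The totals table, its columns, and the one-cent threshold are assumptions chosen for illustration; pick a tolerance that matches your own rounding rules.

```sql
-- Hypothetical fixture: totals held as FLOAT64 next to their expected values
CREATE TEMP TABLE totals AS
SELECT 1 AS order_id, 0.1 + 0.2 AS actual_total, 0.3 AS expected_total
UNION ALL SELECT 2, 150.0, 150.0;

-- Direct equality is fragile here: in FLOAT64 arithmetic 0.1 + 0.2 is not exactly 0.3.
-- Count rows whose difference reaches the assumed one-cent tolerance instead.
ASSERT (
  SELECT COUNT(*)
  FROM totals
  WHERE ABS(actual_total - expected_total) >= 0.01
) = 0 AS 'totals differ from expected by one cent or more';
```

Keeping the fixture to two rows also addresses the first mistake: the test runs quickly and the edge case it covers is visible at a glance.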