A dbt snapshot is a special model type that automatically captures and versions row-level changes in a source table over time, giving you an auditable, slowly-changing-dimension history without hand-written change-data-capture logic.
A dbt snapshot is a declarative configuration that tells dbt to detect changes in a source table, copy the modified rows into a managed table, and record when each version became valid. Unlike an incremental model, a snapshot keeps every historical version of a row, enabling point-in-time analysis and slowly changing dimension (SCD) use cases.
Analysts and data engineers frequently need to answer questions such as “What did the customer record look like last month?” or “How many active subscriptions did we have on January 1st?” Traditional ETL pipelines overwrite rows, making historical states hard to reconstruct. dbt snapshots solve this by:
dbt_valid_from
and dbt_valid_to
timestamps.A snapshot is defined in a snapshots/
directory using Jinja + YAML. During a dbt snapshot
run, dbt performs these steps:
id
).dbt_valid_to = current_timestamp
) and insert the new version (dbt_valid_from = current_timestamp
, dbt_valid_to = null
).Because dbt executes SQL directly in your warehouse, snapshots scale with your database's native performance and partitioning features.
target_schema
& target_database
Tell dbt where to create the physical snapshot table. Best practice is to store snapshots in their own schema (e.g., snapshots
) to simplify permissioning.
unique_key
The business key that identifies a logical row. Without it, dbt cannot tell which historical chain a record belongs to.
strategy
timestamp
: Detect change when a source updated_at column advances.check
: Detect change by comparing a list of columns or a hash of the entire row.Use the updated_at
or check_cols
parameters to restrict change detection to only the fields you care about, reducing churn and storage cost.
Suppose we need to capture historical versions of a customers
table that gets nightly backfills.
{% snapshot customers_snapshot %}
{{
config(
target_database='analytics',
target_schema='snapshots',
unique_key='id',
strategy='check',
check_cols=['name', 'email', 'plan', 'company_size']
)
}}
select * from {{ source('stripe', 'customers') }}
{% endsnapshot %}
Running dbt snapshot
creates snapshots.customers_snapshot
with four extra fields:
dbt_scd_id
— surrogate key for the versiondbt_updated_at
— load timestampdbt_valid_from
dbt_valid_to
To find the state of a customer on a specific date:
select *
from snapshots.customers_snapshot
where id = 42
and '2024-03-01' between dbt_valid_from and coalesce(dbt_valid_to, '9999-12-31');
If you use the Galaxy SQL editor, you can save the query in a Collection named “Customer History” and let teammates endorse it for downstream analytics.
unique_key
— composite keys are acceptable as long as they never change.dbt_valid_to
is null for only one current row per unique_key
.dbt_valid_from
or unique_key
for query speed.After a dbt job completes, your warehouse contains snapshot tables like snapshots.customers_snapshot
. Open Galaxy, connect to the warehouse, and:
dbt snapshots offer a simple, declarative route to historical data tracking. By combining snapshots with modern tooling like Galaxy, teams gain both trusted history and a collaborative environment to explore it.
Modern analytics must answer questions about how data looked at any point in the past. Overwriting tables destroys that history, while database-level CDC is often complex and platform-specific. dbt snapshots provide a vendor-agnostic, SQL-only approach to retain every version of a record. They power regulatory audits, churn analysis, funnel reconstruction, A/B test backfills, and more—without extra infrastructure. Because snapshots run directly in the warehouse, every analytics engineer can implement SCD logic using familiar dbt workflows.
Match your snapshot cadence to the rate of change in the source table. Daily is common, but rapidly changing tables may need hourly snapshots.
Type 1 (overwrite) doesn’t require snapshots because you don’t preserve history. Snapshots implement SCD Type 2 by design.
Yes. Point Galaxy at the same warehouse; the snapshot schema appears in the sidebar. Galaxy’s AI Copilot can even draft time-travel queries for you.
dbt treats the new key as unrelated, producing a fresh history and potentially duplicating data. Instead, create a new snapshot file and deprecate the old one.