A dbt DAG is the dependency graph that orders dbt models so they run in the correct, cycle-free sequence.
A dbt DAG is a directed acyclic graph that maps every model, seed, snapshot, and test in your project. Each model node represents a dataset produced by a SELECT statement, and each edge represents a dependency created with ref() or source(). The graph is acyclic, so no model can depend on itself, directly or indirectly.
During compilation, dbt scans project files for ref(), source(), and config(materialized=…) calls. The ref() and source() calls define parent–child relationships, which dbt records in a manifest file (manifest.json). dbt uses this manifest to order execution, so upstream models finish before downstream models start.
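As a rough illustration of how dependencies can be recovered from model SQL, the sketch below pulls ref() and source() calls out of a few model bodies with regular expressions. This is a simplification, not dbt's parser: dbt renders the Jinja in each file. The model bodies, the source name shop, and the table raw_customers are hypothetical, and the two-argument form of ref() is not handled.

```python
import re

# Simplified patterns for single-argument ref() and two-argument source().
# Assumption: string literals only; dbt itself renders Jinja instead.
REF_PATTERN = re.compile(r"ref\(\s*['\"]([^'\"]+)['\"]\s*\)")
SOURCE_PATTERN = re.compile(
    r"source\(\s*['\"]([^'\"]+)['\"]\s*,\s*['\"]([^'\"]+)['\"]\s*\)"
)


def find_parents(sql: str) -> list[str]:
    """Return the sorted, de-duplicated parents referenced by one model."""
    refs = REF_PATTERN.findall(sql)
    sources = [f"{src}.{table}" for src, table in SOURCE_PATTERN.findall(sql)]
    return sorted(set(refs + sources))


# Hypothetical model files, keyed by model name.
models = {
    "stg_orders": "select * from {{ source('shop', 'raw_orders') }}",
    "stg_customers": "select * from {{ source('shop', 'raw_customers') }}",
    "dim_customers": (
        "select * from {{ ref('stg_customers') }} c "
        "join {{ ref('stg_orders') }} o on o.customer_id = c.id"
    ),
}

# Child -> parents mapping, the same shape of information a manifest records.
parent_map = {name: find_parents(sql) for name, sql in models.items()}
```

Here parent_map["dim_customers"] lists both staging models, which is the parent–child information an execution order can be derived from.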
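The DAG speeds up builds by letting you run only the models that changed, plus their descendants, for example with state-based selection (dbt run --select state:modified+ together with a --state path to earlier artifacts). It prevents circular dependencies, surfaces lineage for debugging, and powers documentation sites and CI checks. Teams gain reproducible, ordered pipelines without manual orchestration.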
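The idea of rebuilding only what is affected by a change can be sketched as a graph walk: start from the changed nodes and follow edges downstream. The function name modified_and_descendants and the sample graph (including raw_customers) are hypothetical, and the code does not reproduce how dbt decides that a model changed.

```python
from collections import deque


def modified_and_descendants(
    parents: dict[str, set[str]], modified: set[str]
) -> set[str]:
    """Return the modified nodes plus everything downstream of them."""
    # Invert the child -> parents map into a parent -> children map.
    children: dict[str, set[str]] = {}
    for node, node_parents in parents.items():
        children.setdefault(node, set())
        for parent in node_parents:
            children.setdefault(parent, set()).add(node)

    # Breadth-first walk from the modified nodes along child edges.
    selected = set(modified)
    queue = deque(modified)
    while queue:
        current = queue.popleft()
        for child in children.get(current, ()):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical project graph: child -> set of parents.
parents = {
    "stg_orders": {"raw_orders"},
    "stg_customers": {"raw_customers"},
    "dim_customers": {"stg_orders", "stg_customers"},
}

to_rebuild = modified_and_descendants(parents, {"stg_orders"})
```

With stg_orders marked as changed, the selection contains stg_orders and dim_customers, while stg_customers is left alone.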
Suppose stg_orders depends on raw_orders, while dim_customers depends on both stg_orders and stg_customers. dbt’s DAG ensures that raw_orders → stg_orders → dim_customers run in that order and that dim_customers also waits for stg_customers, so analysts can trust that each layer is built from up-to-date parents.
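The ordering in this example can be reproduced with a topological sort. The sketch below uses Python's standard graphlib module and groups nodes into batches whose parents are all finished. It treats raw_orders as an ordinary node for illustration, and it is not dbt's own scheduling code.

```python
from graphlib import TopologicalSorter

# Child -> set of parents, taken from the example above.
graph = {
    "stg_orders": {"raw_orders"},
    "dim_customers": {"stg_orders", "stg_customers"},
}

sorter = TopologicalSorter(graph)
sorter.prepare()  # raises graphlib.CycleError if the graph has a cycle

# Collect nodes in waves: every node in a wave has all of its parents done.
batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())
    batches.append(ready)
    sorter.done(*ready)
```

The first batch holds the nodes without parents (raw_orders and stg_customers), the second holds stg_orders, and dim_customers comes last. Nodes within one batch do not depend on each other, so they could run in parallel.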
Keep models small and single-purpose to minimize fan-out. Use consistent layer prefixes (raw_, stg_, fct_, dim_) so lineage is obvious. Leverage exposures and tests to extend the DAG to BI tools and quality checks.
Analytics engineering, reverse-ETL, feature stores, and data quality monitoring all rely on the dbt DAG. Teams schedule dbt run nightly, then trigger downstream dashboards once the DAG completes.
Galaxy’s desktop SQL editor parses compiled dbt SQL to visualize your DAG and suggest optimizations. Its AI Copilot rewrites queries when dependencies change, keeping nodes aligned without manual refactoring.
A dbt DAG replaces manual scheduling with automatic, dependency-aware execution. Teams avoid race conditions and guarantee data freshness by trusting dbt to order models correctly. Effective DAG design simplifies onboarding: newcomers can trace lineage visually, understand model purposes, and spot performance bottlenecks quickly.
Run dbt docs generate && dbt docs serve. The auto-generated site includes an interactive lineage graph. Galaxy also renders the DAG directly in its SQL editor.
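The same lineage can also be inspected programmatically. The sketch below walks the parent_map section of a manifest to collect every ancestor of a node. The manifest excerpt is a hypothetical sample (the project name shop and source name erp are invented), and the helper upstream_of is illustrative rather than part of dbt.

```python
import json


def upstream_of(parent_map: dict[str, list[str]], node_id: str) -> set[str]:
    """Return every direct and indirect parent of node_id."""
    seen: set[str] = set()
    stack = list(parent_map.get(node_id, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parent_map.get(current, []))
    return seen


# Hypothetical excerpt shaped like the parent_map of a manifest.json.
manifest = json.loads("""
{
  "parent_map": {
    "model.shop.dim_customers": ["model.shop.stg_orders", "model.shop.stg_customers"],
    "model.shop.stg_orders": ["source.shop.erp.raw_orders"],
    "model.shop.stg_customers": ["source.shop.erp.raw_customers"],
    "source.shop.erp.raw_orders": [],
    "source.shop.erp.raw_customers": []
  }
}
""")

ancestors = upstream_of(manifest["parent_map"], "model.shop.dim_customers")
```

To try this against a real project, the sample could be replaced with the contents of target/manifest.json, loaded with json.load.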
The dependency graph is identical, but incremental models use is_incremental() logic to process only new records once their parents finish.
dbt detects circular references at compile time and raises an error, preventing deployment until the cycle is removed.
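Cycle detection of this kind can be sketched with the standard graphlib module, which raises CycleError when a graph cannot be ordered. The function find_cycle and the model names are hypothetical, and dbt's own check and error message will differ.

```python
from graphlib import CycleError, TopologicalSorter


def find_cycle(parents: dict[str, set[str]]):
    """Return one cycle as a list of nodes, or None if the graph is acyclic."""
    try:
        TopologicalSorter(parents).prepare()
    except CycleError as exc:
        # graphlib stores the detected cycle as the second element of args;
        # the first and last node in that list are the same.
        return exc.args[1]
    return None


# Hypothetical models that reference each other.
cyclic = {
    "orders_enriched": {"customer_totals"},
    "customer_totals": {"orders_enriched"},
}

# Acyclic graph: no error is raised.
acyclic = {
    "stg_orders": {"raw_orders"},
    "dim_customers": {"stg_orders"},
}

cycle = find_cycle(cyclic)
no_cycle = find_cycle(acyclic)
```

Failing before anything runs is the useful property here: the cycle is reported from the graph alone, without executing any model.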
Galaxy imports your manifest.json, highlights each model’s SQL, and lets the AI Copilot adjust queries when their upstream schema changes.