dbt Core is an open-source command-line framework that lets data teams transform, test, and document data in their warehouse using version-controlled SQL and Jinja templates.
dbt Core enables analytics engineers to build modular, version-controlled SQL transformations with testing and documentation baked in.
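dbt Core is an open-source framework that turns raw warehouse tables into cleaned, documented, and tested models using SQL and Jinja. It runs from the command line, integrates with Git, and builds a directed acyclic graph (DAG) of dependencies so models run in the correct order. With state-based selection (for example, --select state:modified+ compared against the artifacts of a previous run), only changed models and their downstream dependents need to re-run.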
dbt interprets .sql files as models, compiles Jinja into warehouse-specific SQL, and executes them in dependency order. It stores run metadata, generates documentation sites, and provides built-in assertions called tests to validate data quality.
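As a rough illustration of the compile step, the sketch below shows a model as written and, in comments, approximately what dbt would send to the warehouse. The model name large_orders, the amount filter, and the analytics database and dbt_dev schema are all assumptions made for the example; exact quoting and naming depend on the adapter and your profile.

```sql
-- models/large_orders.sql (hypothetical model name)
-- What you write: SQL plus Jinja. ref() both resolves the table name
-- and tells dbt that this model depends on raw_orders.
select
    order_id,
    amount
from {{ ref('raw_orders') }}
where amount > 100

-- Roughly what dbt compiles it to, assuming a target whose database is
-- "analytics" and whose schema is "dbt_dev" (both illustrative):
--
--   select
--       order_id,
--       amount
--   from analytics.dbt_dev.raw_orders
--   where amount > 100
```

Because the dependency is declared through ref(), dbt can place raw_orders ahead of large_orders in the DAG without any manual ordering.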
dbt Core brings software-engineering discipline—version control, modularity, automated tests—to analytics projects. Teams gain reliable, repeatable pipelines, peer-reviewed code, and easy rollbacks, improving trust in downstream dashboards and machine-learning features.
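Models are SELECT statements that dbt materializes as views or tables. Tests are assertions: generic tests such as unique and not_null are declared in YAML, and custom (singular) tests are SQL queries that return the rows that fail. Docs are generated as a searchable website with lineage graphs, column descriptions, and the tests attached to each model.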
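A custom SQL test can be sketched as follows. The file name and the rule that totals must not be negative are assumptions for illustration; the order_totals model is the example model shown elsewhere on this page.

```sql
-- tests/assert_order_totals_non_negative.sql (hypothetical file name)
-- dbt treats every row this query returns as a failure,
-- so the test passes only when the result set is empty.
select
    order_id,
    total_amount
from {{ ref('order_totals') }}
where total_amount < 0
```

Running dbt test (or dbt build) would execute this query alongside any YAML-declared tests.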
Install with pip install dbt-core plus the warehouse adapter, e.g., dbt-bigquery. Run dbt init my_project, supply connection details in profiles.yml, then create models under models/ and run dbt run.
dbt builds transformations inside the warehouse, avoiding data movement. Incremental models process only new or changed rows, macros and hooks let you templatize repeated logic, and snapshots capture type-2 slowly changing dimensions (SCD2).
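A minimal sketch of an incremental model, assuming raw_orders carries an updated_at timestamp column (that column, the file name, and the last_updated_at output column are not part of the original example):

```sql
-- models/order_totals_incremental.sql (hypothetical file name)
-- unique_key lets dbt update existing rows instead of appending duplicates.
{{ config(materialized='incremental', unique_key='order_id') }}

select
    order_id,
    sum(amount) as total_amount,
    max(updated_at) as last_updated_at  -- assumed column on raw_orders
from {{ ref('raw_orders') }}

{% if is_incremental() %}
-- On incremental runs, re-aggregate only the orders that changed
-- since the most recent row already stored in this model's table.
where order_id in (
    select order_id
    from {{ ref('raw_orders') }}
    where updated_at > (select max(last_updated_at) from {{ this }})
)
{% endif %}

group by order_id
```

On the first run, or with --full-refresh, is_incremental() is false and the whole table is built; later runs touch only the changed orders. Recomputing the full total for each changed order, rather than summing only the new rows, keeps the aggregate correct when an older row is edited.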
Use crontab, Airflow, Prefect, Dagster, or dbt Cloud to trigger dbt run, dbt test, or dbt build. Pass the --select/--exclude flags to target subsets and leverage the DAG for parallel execution.
Keep models atomic and layered (staging, intermediate, marts). Enforce code review via GitHub. Use snapshots for slowly changing dimensions, document every field, and run continuous integration tests on pull requests.
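For the snapshot recommendation, one possible shape is the SQL-block snapshot below. It assumes raw_orders has an updated_at column and that a snapshots schema is acceptable in your warehouse; the snapshot name is hypothetical. Newer dbt releases also allow snapshots to be configured in YAML, so treat this as one option, not the only form.

```sql
-- snapshots/orders_snapshot.sql (hypothetical file name)
{% snapshot orders_snapshot %}

{{
    config(
        target_schema='snapshots',
        unique_key='order_id',
        strategy='timestamp',
        updated_at='updated_at'
    )
}}

-- Each run of `dbt snapshot` compares current rows with the stored ones
-- and records changed rows as new versions with validity timestamps.
select * from {{ ref('raw_orders') }}

{% endsnapshot %}
```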
-- models/order_totals.sql
-- Sums line amounts per order; ref() declares the dependency on raw_orders.
SELECT
    order_id,
    SUM(amount) AS total_amount
FROM {{ ref('raw_orders') }}
GROUP BY order_id
You can write and test dbt model SQL in the Galaxy desktop editor, which offers autocomplete, context-aware AI suggestions, and team Collections that keep approved models centrally shared.
dbt Core bridges the gap between data engineering and analytics by introducing version control, testing, and documentation directly in SQL workflows. This alignment reduces errors, accelerates development, and increases stakeholder trust because every transformation is transparent, reviewed, and reproducible.
dbt Core is free to use: it is open-source under the Apache 2.0 license, so the main cost is the warehouse compute your models consume.
Official adapters exist for Snowflake, BigQuery, Redshift, Databricks, Postgres, and more. Community adapters extend coverage further.
Galaxy is not a dbt orchestrator, but its modern SQL editor lets you develop, test, and share dbt model SQL faster than traditional IDEs.
Place each SELECT into a model file, use ref() for dependencies, add YAML tests, then run dbt build to validate.