Use GitHub Actions to automatically execute dbt test suites every time a pull request is opened or updated, preventing broken data models from reaching production.
Running dbt tests on every pull request (PR) means wiring your Git provider’s continuous-integration (CI) system so that each change to your analytics codebase automatically triggers `dbt test`. If any schema or data-quality test fails, the PR is marked red, blocking the merge until the problem is fixed. The result is faster feedback for developers and a dramatically lower risk of shipping bad data downstream.
The earlier you catch issues, the cheaper they are to fix. Embedding tests into PRs applies the well-proven software principle of “shift left” to analytics engineering—invalid SQL or broken model assumptions are surfaced minutes after code is pushed instead of days after an Airflow run fails.
Modern organizations rely on dbt models for BI dashboards, experimentation, and ML features. A single unchecked change can cascade into broken revenue reports or faulty predictions. PR test gates protect the entire data value chain.
Large model refactors (renames, new sources, deleted columns) feel risky. When tests fire automatically, developers receive precise feedback on which contracts they broke, enabling quick incremental fixes without “big-bang” deploy anxiety.
At a minimum you need a CI job that runs `dbt deps && dbt seed && dbt run && dbt test` against a dedicated schema. Optionally, you can run `dbt build --select state:modified+` instead for speed.

Provision a low-privilege user in Snowflake, BigQuery, Redshift, Databricks, or your warehouse of choice. The account should be able to create and drop ephemeral CI schemas (e.g. `dbt_ci_<pull_request_number>`).
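As an illustration, here is a Snowflake-flavored provisioning sketch; the role, user, warehouse, and database names are hypothetical, and the grants will differ on other warehouses:

```sql
-- Hypothetical low-privilege CI setup for Snowflake
create role if not exists dbt_ci_role;
grant usage on warehouse ci_wh to role dbt_ci_role;
grant usage on database analytics_ci to role dbt_ci_role;
-- Lets CI create (and later drop) its ephemeral dbt_ci_<pr> schemas
grant create schema on database analytics_ci to role dbt_ci_role;

create user if not exists dbt_ci default_role = dbt_ci_role;
grant role dbt_ci_role to user dbt_ci;
```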
Navigate to Settings → Secrets & variables → Actions and add the credentials the workflow will read: `DBT_USER`, `DBT_PASSWORD`, `DBT_ACCOUNT`, and `DBT_PROFILE_TARGET` (e.g. `ci`).
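If you prefer the command line, the GitHub CLI can set the same secrets (values below are placeholders):

```bash
gh secret set DBT_USER --body "dbt_ci"
gh secret set DBT_PASSWORD --body "<password>"
gh secret set DBT_ACCOUNT --body "<account_identifier>"
gh secret set DBT_PROFILE_TARGET --body "ci"
```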
Create `.github/workflows/dbt_pr.yml` with a `pull_request` trigger. A minimal example is shown later in this article. The key steps are installing dbt, running `dbt deps`, and then `dbt build --select state:modified+ --target ${{ secrets.DBT_PROFILE_TARGET }}`.
dbt’s state-based selector lets CI build only the models affected by the PR:

```bash
dbt build --select state:modified+ --defer --state ./.state_artifacts
```

This reduces runtime from ~15 minutes to under 2 minutes for most feature branches.
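For `state:modified+` to have anything to compare against, CI needs the `manifest.json` from your last production run. A minimal sketch of a workflow step, assuming you publish that manifest to object storage after each production deploy (the bucket name is hypothetical, and the AWS CLI is preinstalled on GitHub-hosted Ubuntu runners):

```yaml
      - name: Fetch production manifest
        run: |
          mkdir -p .state_artifacts
          # Published by your production job after each successful run
          aws s3 cp s3://my-dbt-artifacts/manifest.json ./.state_artifacts/manifest.json
```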
Under Settings → Branches → Branch protection rules, require that the `dbt-pr` check (or whatever you named the workflow) succeed before changes can be merged into the `main` branch.
Using `actions/github-script` or a dedicated Action like `mighty/diff-cover-commenter`, you can write failed tests back to the PR conversation so reviewers don’t need to open the logs.
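Here is a sketch using `actions/github-script`, assuming the build step left `target/run_results.json` behind (dbt writes it for most failures, though not for early compilation errors); the parsing logic is illustrative, not a fixed recipe:

```yaml
      - name: Comment failed tests on the PR
        if: failure()
        uses: actions/github-script@v7
        with:
          script: |
            const fs = require('fs');
            const run = JSON.parse(fs.readFileSync('target/run_results.json', 'utf8'));
            // Collect failed or errored nodes from dbt's run results
            const failures = run.results
              .filter(r => r.status === 'fail' || r.status === 'error')
              .map(r => `- \`${r.unique_id}\`: ${r.message}`);
            if (failures.length) {
              await github.rest.issues.createComment({
                owner: context.repo.owner,
                repo: context.repo.repo,
                issue_number: context.issue.number,
                body: `**dbt failures**\n${failures.join('\n')}`,
              });
            }
```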
```yaml
name: dbt-pr

on:
  pull_request:
    paths-ignore:
      - 'README.md'

jobs:
  run-dbt-tests:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout repo
        uses: actions/checkout@v4

      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: '3.11'

      - name: Install dbt
        run: |
          pip install --upgrade pip
          pip install dbt-core==1.7.* dbt-snowflake==1.7.*

      - name: Cache .dbt modules
        uses: actions/cache@v4
        with:
          path: ~/.dbt
          key: dbt-${{ runner.os }}-${{ hashFiles('packages.yml') }}

      - name: Run dbt build
        env:
          DBT_USER: ${{ secrets.DBT_USER }}
          DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}
          DBT_ACCOUNT: ${{ secrets.DBT_ACCOUNT }}
          DBT_PROFILE_TARGET: ${{ secrets.DBT_PROFILE_TARGET }}
          # Used by profiles.yml to build a per-PR schema (see below)
          PR_NUMBER: ${{ github.event.pull_request.number }}
        run: |
          dbt deps
          # state:modified+ compares against the manifest from the last
          # production run, restored to ./.state_artifacts as shown earlier
          dbt build --select state:modified+ --state ./.state_artifacts --target "$DBT_PROFILE_TARGET" --fail-fast

      # Preserve this run's artifacts so they can serve as comparison state later
      - name: Upload state artifacts
        uses: actions/upload-artifact@v4
        with:
          name: state-artifacts
          path: target
          retention-days: 1
```
Interpolating the PR number into your target schema (e.g. `dbt_ci_{{ env_var('PR_NUMBER') }}` in `profiles.yml`) prevents race conditions when multiple branches build simultaneously.
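A minimal `profiles.yml` sketch for Snowflake follows; the profile, database, and warehouse names are assumptions, and the workflow must export `PR_NUMBER` (e.g. `PR_NUMBER: ${{ github.event.pull_request.number }}`, as in the example above):

```yaml
my_project:
  target: ci
  outputs:
    ci:
      type: snowflake
      account: "{{ env_var('DBT_ACCOUNT') }}"
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"
      database: analytics_ci
      warehouse: ci_wh
      # One schema per PR keeps concurrent branch builds isolated
      schema: "dbt_ci_{{ env_var('PR_NUMBER') }}"
      threads: 4
```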
`--fail-fast` stops execution on the first error, saving compute and surfacing root causes quickly.
Forward job links to Slack or add an artifact viewer like `spectacles-ci` so non-CI experts can access context without digging through raw logs.
Invoke `dbt build --select state:modified+ --target ci --vars '{delete_schema: true}'` or run a `dbt run-operation drop_schema` in a separate cleanup job to avoid warehouse clutter, as sketched below.
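For example, a cleanup workflow might drop the per-PR schema when the pull request closes. This sketch assumes a `drop_schema` macro exists in your dbt project and reuses the same secrets as the main workflow:

```yaml
name: dbt-pr-cleanup

on:
  pull_request:
    types: [closed]

jobs:
  drop-ci-schema:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.11'
      - run: pip install dbt-core==1.7.* dbt-snowflake==1.7.*
      - name: Drop the PR schema
        env:
          DBT_USER: ${{ secrets.DBT_USER }}
          DBT_PASSWORD: ${{ secrets.DBT_PASSWORD }}
          DBT_ACCOUNT: ${{ secrets.DBT_ACCOUNT }}
        run: |
          dbt deps
          # Assumes a drop_schema macro defined in your project
          dbt run-operation drop_schema --args "{schema_name: dbt_ci_${{ github.event.pull_request.number }}}"
```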
A few more tips:

- Tune parallelism (e.g. `dbt build --threads 8`) and choose smaller warehouse sizes for PR workloads.
- Point CI at its own profile via `DBT_PROFILES_DIR` and use environment variables inside `profiles.yml`.
- Skip `dbt source freshness`, or use `--exclude tag:freshness`, to avoid time-based flakiness.

Although Galaxy focuses on interactive SQL editing rather than CI, the two workflows complement each other. Engineers can prototype model SQL in Galaxy’s desktop editor—leveraging its AI copilot for autocompletion and best-practice suggestions—then commit the changes to GitHub. The dbt-on-PR pipeline described here immediately validates that the Galaxy-authored query meets data quality gates before merge.
Automating dbt tests on pull requests marries software engineering discipline with analytics development. With GitHub Actions, the implementation is straightforward, cost-effective, and highly customizable. Start small with a single test job, iterate on performance, and you’ll soon wonder how you ever shipped analytics code without a green check mark.
Data teams adopting dbt often treat their analytics code like software, but many still rely on ad-hoc local runs. Automating tests in CI embeds data quality checks directly in the developer feedback loop, dramatically reducing incidents caused by broken models and enabling confident collaboration at scale.
Do dbt CI runs need a completely separate warehouse environment?
No, but you should create a separate database or schema that is truncated or dropped after each run to avoid interfering with production objects.
How can I make slow CI runs faster?
Use state selection (`state:modified+`), increase parallelism with `--threads <n>`, or subset only critical tests in CI while running the full suite nightly.
Can CI post failing tests back to the pull request?
Yes. Combine `actions/github-script` with dbt’s `--write-json` flag to parse `run_results.json` and create a comment summarizing failures.
Does Galaxy run these CI pipelines for me?
Galaxy is primarily an IDE; it accelerates writing dbt model SQL but does not execute CI pipelines. However, the SQL you author in Galaxy can be committed to GitHub, where the workflow described here validates it automatically.