Connect Amazon Redshift to dbt to build, test, and document SQL models in a version-controlled workflow.
dbt turns raw Redshift tables into tested, documented models using version-controlled SQL. The integration lets engineers modularize queries, run incremental pipelines, and manage schema changes through code.
Add a Redshift target in `~/.dbt/profiles.yml`. Provide `host`, `port`, `user`, `password`, `dbname`, and `schema`. Use `method: iam` with `cluster_id` for IAM authentication. Verify the profile with `dbt debug`.
Required: `host`, `port` (typically 5439), `user`, `password` (or AWS auth), `dbname`, `schema`. Optional: `threads`, `keepalives_idle`, `connect_timeout`, `sslmode`, `search_path`, `role`, `ra3_node`.
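Putting those fields together, a minimal profile might look like the following sketch (the profile name, host, and credential names are placeholders):

```yaml
# ~/.dbt/profiles.yml — example Redshift target; host and names are placeholders
ecommerce_warehouse:
  target: dev
  outputs:
    dev:
      type: redshift
      host: example-cluster.abc123xyz.us-east-1.redshift.amazonaws.com
      port: 5439
      user: dbt_user
      password: "{{ env_var('REDSHIFT_PASSWORD') }}"
      dbname: analytics_db
      schema: analytics
      threads: 4
      sslmode: require
```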
Run `dbt init ecommerce_warehouse`, choose Redshift, and point to the profile name. Commit the generated files to Git. Execute `dbt run` to build the initial models.
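Assuming dbt and the dbt-redshift adapter are installed, the setup sequence looks like:

```shell
# Scaffold a new project (directory name is a placeholder)
dbt init ecommerce_warehouse
cd ecommerce_warehouse

# Verify the connection defined in ~/.dbt/profiles.yml
dbt debug

# Build the starter models
dbt run
```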
Create `models/orders_summary.sql` that joins Orders, Customers, and OrderItems. Mark it incremental so only new rows are rebuilt. Add tests and documentation in YAML, then run `dbt build`.
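A sketch of such a model, assuming hypothetical source names and columns:

```sql
-- models/orders_summary.sql — incremental order summary; table and column
-- names here are illustrative, not from the original project
{{ config(materialized='incremental', unique_key='order_id') }}

select
    o.order_id,
    c.customer_name,
    o.order_date,
    sum(oi.quantity * oi.unit_price) as order_total
from {{ source('ecommerce', 'orders') }} o
join {{ source('ecommerce', 'customers') }} c
    on o.customer_id = c.customer_id
join {{ source('ecommerce', 'order_items') }} oi
    on o.order_id = oi.order_id
{% if is_incremental() %}
  -- on incremental runs, only process rows newer than the target's max date
  where o.order_date > (select max(order_date) from {{ this }})
{% endif %}
group by o.order_id, c.customer_name, o.order_date
```

Tests such as `unique` and `not_null` on `order_id` would then be declared in a companion `schema.yml`.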
Use dbt Cloud, Airflow, or GitHub Actions to call `dbt run` and `dbt test` on a schedule. Ship logs to S3 or CloudWatch and alert on failures via Slack.
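One possible GitHub Actions workflow for the scheduled run; the cron expression and secret name are assumptions:

```yaml
# .github/workflows/dbt.yml — nightly dbt run; secret and schedule are placeholders
name: dbt-nightly
on:
  schedule:
    - cron: "0 5 * * *"
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install dbt-redshift
      - run: dbt run --profiles-dir .
        env:
          REDSHIFT_PASSWORD: ${{ secrets.REDSHIFT_PASSWORD }}
      - run: dbt test --profiles-dir .
        env:
          REDSHIFT_PASSWORD: ${{ secrets.REDSHIFT_PASSWORD }}
```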
Use incremental models, define appropriate `DISTKEY` and `SORTKEY` settings, prefer `COPY` over `INSERT` for bulk loads, cap concurrency with `threads`, and schedule `VACUUM` after large loads.
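In dbt-redshift, distribution and sort keys are set per model with the `dist` and `sort` configs; the column names below are illustrative:

```sql
-- Model-level Redshift tuning; column names are placeholders
{{ config(
    materialized='table',
    dist='customer_id',   -- DISTKEY: co-locate rows joined on customer_id
    sort='order_date'     -- SORTKEY: speed up date-range scans
) }}

select * from {{ ref('orders_summary') }}
```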
Avoid building models in the `public` schema; use a dedicated `analytics` schema. Never commit secrets to Git; load them via environment variables or AWS Secrets Manager.
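dbt's built-in `env_var` function keeps credentials out of the repository. For example, in `profiles.yml` (variable names are placeholders):

```yaml
# Credentials resolved from the environment at runtime
user: "{{ env_var('REDSHIFT_USER') }}"
password: "{{ env_var('REDSHIFT_PASSWORD') }}"
```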
Yes. Set `method: iam` plus `cluster_id` (or `iam_profile`) in the profile, and dbt will obtain temporary credentials automatically.
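A profile output using IAM auth might look like this sketch (cluster, host, and user names are placeholders):

```yaml
# IAM-based target: dbt fetches temporary credentials for the cluster
prod:
  type: redshift
  method: iam
  cluster_id: example-cluster
  host: example-cluster.abc123xyz.us-east-1.redshift.amazonaws.com
  port: 5439
  user: dbt_user
  dbname: analytics_db
  schema: analytics
  iam_profile: dbt-redshift   # optional: named AWS profile to use
```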
Use `incremental_strategy='delete+insert'` in an incremental model, or switch to dbt snapshots for slowly changing dimensions.
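For the snapshot route, a minimal sketch (source name, key, and timestamp column are assumptions):

```sql
-- snapshots/customers_snapshot.sql — track slowly changing dimensions
{% snapshot customers_snapshot %}
{{ config(
    target_schema='snapshots',
    unique_key='customer_id',
    strategy='timestamp',
    updated_at='updated_at'
) }}
select * from {{ source('ecommerce', 'customers') }}
{% endsnapshot %}
```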
Yes. dbt creates the target schema automatically if it does not exist. To route models into custom schemas, set `+schema:` in `dbt_project.yml` or use `{{ config(schema='...') }}` per model.
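For example, to route marts models into their own schema (project and folder names are placeholders):

```yaml
# dbt_project.yml — models under models/marts get a custom schema suffix
models:
  ecommerce_warehouse:
    marts:
      +schema: marts
```

By default dbt appends this to the target schema (e.g. `analytics_marts`); override the `generate_schema_name` macro to change that behavior.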
Absolutely. Declare Spectrum external tables as dbt sources, for example with the `dbt-external-tables` package, which takes an `external` config including the S3 location. dbt will then treat them like any other source.
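With the `dbt-external-tables` package, a Spectrum table can be declared as a source; the bucket, serde, and columns below are assumptions:

```yaml
# models/sources.yml — external table registered via dbt-external-tables
sources:
  - name: spectrum
    schema: spectrum_schema
    tables:
      - name: click_events
        external:
          location: "s3://example-bucket/click_events/"   # placeholder bucket
          row_format: "serde 'org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe'"
        columns:
          - name: event_id
            data_type: varchar(64)
          - name: event_ts
            data_type: timestamp
```

Running `dbt run-operation stage_external_sources` then creates or refreshes the external tables.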