How to Integrate Amazon Redshift with Airflow

Galaxy Glossary

How do I connect Amazon Redshift to Apache Airflow?

Automate Amazon Redshift loads, exports, and transformations inside Apache Airflow DAGs using built-in Redshift operators.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.

Description

Table of Contents

What is the fastest way to connect Airflow to Redshift?

Create an Airflow Connection of type Amazon Redshift (or Postgres) that stores host, port, database, user, and IAM-role-based credentials. Reference the connection with redshift_conn_id in your tasks.

How do I run SQL against Redshift from a DAG?

Use RedshiftSQLOperator (or PostgresOperator) to embed SQL. Airflow will open a Redshift session, execute the statement, and commit or roll back automatically.

Can Airflow copy data into Redshift?

Yes.Combine S3ToRedshiftTransfer for COPY jobs and PythonOperator for pre-processing files. Schedule incremental loads or full refreshes nightly.

How do dependencies work?

Define tasks (e.g., extract_orders → load_orders → transform_orders). Use >> to chain tasks so Redshift steps fire only after upstream jobs succeed.

Best practices for production?

Store secrets in AWS Secrets Manager, enable Auto-Vacuum and ANALYZE, set autocommit=True, and log SQL to S3 for audit.Parametrize dates with Jinja ({{ ds }}) to keep DAGs reusable.

How to monitor Redshift tasks?

Leverage Airflow SLA alarms, task-level retries, and Redshift system tables (stl_query, svl_qlog). Push query IDs to XCom for downstream diagnostics.

.

Why How to Integrate Amazon Redshift with Airflow is important

How to Integrate Amazon Redshift with Airflow Example Usage


# dags/redshift_orders_etl.py
from airflow import DAG
from airflow.providers.amazon.aws.operators.redshift_sql import RedshiftSQLOperator
from airflow.providers.amazon.aws.transfers.s3_to_redshift import S3ToRedshiftOperator
from datetime import datetime

default_args = {"retries": 2}

dag = DAG(
    dag_id="orders_daily_etl",
    start_date=datetime(2024, 6, 1),
    schedule_interval="0 2 * * *",
    default_args=default_args,
    catchup=False,
)

stage_orders = S3ToRedshiftOperator(
    task_id="stage_orders",
    schema="public",
    table="staging_orders",
    s3_bucket="ecommerce-raw",
    s3_key="orders/.csv",
    copy_options=["CSV", "IGNOREHEADER 1"],
    redshift_conn_id="redshift_prod",
    dag=dag,
)

load_orders = RedshiftSQLOperator(
    task_id="load_orders",
    sql="sql/load_orders.sql",  # keeps SQL version-controlled
    redshift_conn_id="redshift_prod",
    autocommit=True,
    dag=dag,
)

stage_orders >> load_orders

How to Integrate Amazon Redshift with Airflow Syntax


from airflow.providers.amazon.aws.operators.redshift_sql import RedshiftSQLOperator

RedshiftSQLOperator(
    task_id="load_daily_orders",
    sql="""
        INSERT INTO Orders (id, customer_id, order_date, total_amount)
        SELECT id, customer_id, order_date, total_amount
        FROM staging_orders
        WHERE order_date = '';
    """,
    redshift_conn_id="redshift_prod",   # Connection in Airflow UI
    autocommit=True,                     # Commit after run
    parameters=None,                     # Optional SQL params
    database="analytics",               # Target DB (optional)
    aws_conn_id=None                     # Use if IAM-based auth differs
)

Common Mistakes

Frequently Asked Questions (FAQs)

Do I need a separate Airflow worker for Redshift?

No. Redshift tasks run via a lightweight JDBC/psycopg2 session, so any standard worker can execute them.

Can I use IAM roles instead of passwords?

Yes. Attach an IAM role to the Airflow instance or Kubernetes service account and reference it in the Redshift connection’s role_arn.

How do I speed up large COPY jobs?

Split files in S3 into equal-sized parts, enable COMPUPDATE, and set COPY option MAXERROR 0 to fail fast on bad rows.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.