Oracle integration with Airflow lets you schedule, monitor, and retry Oracle SQL jobs through Apache Airflow's `OracleHook` and `OracleOperator`. It connects Airflow DAGs to an Oracle database so you can run and log PL/SQL or plain SQL statements on a schedule.
Use the `airflow connections add` CLI or the UI. Supply host, port, SID or service_name, user, and password. Name it `oracle_default` or any string referenced by `oracle_conn_id`.
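For example, via the CLI (the host, credentials, and service name below are placeholders):

```bash
airflow connections add 'oracle_default' \
    --conn-type 'oracle' \
    --conn-host 'db.example.com' \
    --conn-port '1521' \
    --conn-login 'etl_user' \
    --conn-password 'change_me' \
    --conn-extra '{"service_name": "ORCLPDB1"}'
```

The Oracle provider reads `service_name` (or `sid`) from the connection extras, which is how it distinguishes the two.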
Create a DAG, import `OracleOperator`, and pass the SQL plus optional `parameters`. Airflow handles retries, logging, XCom pushes, and templating.
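A minimal sketch of such a DAG; the DAG id, schedule, and SQL statement are illustrative:

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.oracle.operators.oracle import OracleOperator

with DAG(
    dag_id="oracle_daily_maintenance",  # hypothetical name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    purge_staging = OracleOperator(
        task_id="purge_staging",
        oracle_conn_id="oracle_default",
        # Plain SQL here; a PL/SQL BEGIN ... END; block works the same way.
        sql="DELETE FROM staging_orders WHERE load_date < TRUNC(SYSDATE) - 7",
    )
```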
Use the `parameters` argument: `parameters={"min_total": 1000}`. Bind parameters avoid SQL injection, and the `sql` field is still rendered with Jinja templates for dynamic values.
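For instance (the table, column, and threshold are assumptions for illustration):

```python
flag_big_orders = OracleOperator(
    task_id="flag_big_orders",
    oracle_conn_id="oracle_default",
    sql="""
        UPDATE orders
           SET is_big = 1
         WHERE total >= :min_total
           AND order_date = TO_DATE('{{ ds }}', 'YYYY-MM-DD')
    """,
    # :min_total is bound server-side, never string-interpolated;
    # {{ ds }} is rendered by Jinja before the query runs.
    parameters={"min_total": 1000},
)
```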
Chain an `OracleOperator` that SELECTs data with a `PythonOperator` or `PostgresOperator` to push rows into PostgreSQL, S3, or BigQuery.
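One way to wire up the hand-off, sketched with the TaskFlow API (connection ids, tables, and columns are placeholder assumptions):

```python
from airflow.decorators import task
from airflow.providers.oracle.hooks.oracle import OracleHook
from airflow.providers.postgres.hooks.postgres import PostgresHook

@task
def oracle_to_postgres():
    # Pull today's rows from Oracle...
    rows = OracleHook(oracle_conn_id="oracle_default").get_records(
        "SELECT id, total FROM orders WHERE order_date = TRUNC(SYSDATE)"
    )
    # ...and push them into PostgreSQL. Call oracle_to_postgres() inside a DAG.
    PostgresHook(postgres_conn_id="postgres_default").insert_rows(
        table="orders_copy", rows=rows, target_fields=["id", "total"]
    )
```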
• Store credentials in a secrets backend.
• Use small, idempotent SQL chunks.
• Add `retries` and `retry_delay` in `default_args` (see the sketch after this list).
• Set `catchup=False` for one-off loads.
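A sketch of those retry settings; the values are illustrative:

```python
from datetime import timedelta

default_args = {
    "retries": 3,                         # re-run a failed task up to three times
    "retry_delay": timedelta(minutes=5),  # pause between attempts
}
# Pass default_args=default_args to the DAG constructor so every task inherits them.
```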
1. Forgetting the Oracle client libraries: Airflow won't start.
2. Misnaming `oracle_conn_id`: tasks fail with "connection id not found."
Yes. Each worker executing Oracle tasks must have `cx_Oracle` and the Oracle client libraries available.
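That typically means installing the provider and driver on each worker (package names as published on PyPI), plus the Oracle Instant Client at the OS level:

```bash
pip install apache-airflow-providers-oracle cx_Oracle
```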
Absolutely. Point `sql` to a `.sql` file in the DAG folder and use Jinja templates like `{{ ds }}` for dynamic dates.
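For example, with a file such as `sql/daily_report.sql` in the DAG folder (the path is hypothetical):

```python
daily_report = OracleOperator(
    task_id="daily_report",
    oracle_conn_id="oracle_default",
    sql="sql/daily_report.sql",  # resolved relative to the DAG folder, rendered with Jinja
)
```

Inside the file, `{{ ds }}` renders to the task's logical date at run time.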
Limit results with `fetch_size` or push to XCom in chunks; otherwise, large pulls can exceed worker memory.
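If you need to stream a large result set yourself, one hedged approach is the hook's raw cursor (the chunk size and query below are assumptions):

```python
from airflow.providers.oracle.hooks.oracle import OracleHook

def iter_chunks(sql, chunk_size=5_000):
    """Yield rows in chunks so no single fetch exhausts worker memory."""
    hook = OracleHook(oracle_conn_id="oracle_default")
    conn = hook.get_conn()
    try:
        cursor = conn.cursor()
        cursor.arraysize = chunk_size  # rows fetched per network round trip
        cursor.execute(sql)
        while True:
            rows = cursor.fetchmany(chunk_size)
            if not rows:
                break
            yield rows
    finally:
        conn.close()
```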