Scheduling jobs in ClickHouse means automating recurring tasks—such as OPTIMIZE, DELETE, or reporting queries—using cron or an external orchestrator.
Create a shell script or inline cron entry that calls clickhouse-client with the SQL you want to run. Cron handles the timing; ClickHouse executes the query.
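In a system crontab, the line * * * * * user command has five time fields that represent minute, hour, day-of-month, month, and day-of-week, followed by the account to run as and the command itself. Per-user crontabs edited with crontab -e omit the user field. Replace command with a clickhouse-client call.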
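As an illustration of the field layout, the sketch below writes a small cron.d-style file. The file path, the clickhouse account name, the log directory, and the example queries are assumptions made for the example, not values from this article; only the Orders table name is taken from it.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical target; on a real host this might live under /etc/cron.d/.
CRON_FILE="${CRON_FILE:-./clickhouse-jobs.cron}"

# Quoted heredoc delimiter: the shell expands nothing inside the body.
cat > "$CRON_FILE" <<'EOF'
# minute hour day-of-month month day-of-week user command
# Every day at 02:30 (the "clickhouse" account and log path are assumptions)
30 2 * * * clickhouse clickhouse-client --query "SELECT count() FROM Orders" >> /var/log/clickhouse-jobs/orders_count.log 2>&1
# Every 15 minutes
*/15 * * * * clickhouse clickhouse-client --query "SELECT 1" > /dev/null 2>&1
EOF

cat "$CRON_FILE"
```

Keep in mind that cron treats an unescaped % in the command as a newline, so queries containing % are easier to keep in a separate script or SQL file.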
Wrapper scripts let you version-control SQL, add logging, and reuse connection flags. They also keep long queries readable.
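A minimal sketch of such a wrapper follows, assuming the SQL lives in version-controlled .sql files. The script name, the environment variables CH_CLIENT, CH_HOST, and LOG_DIR, and the log format are hypothetical choices for the example.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: run_sql.sh <path/to/job.sql>
set -euo pipefail

CH_CLIENT="${CH_CLIENT:-clickhouse-client}"   # client binary (overridable)
CH_HOST="${CH_HOST:-localhost}"               # shared connection flag
LOG_DIR="${LOG_DIR:-./logs}"                  # one log file per job

# Prefix a message with a UTC timestamp.
log() { echo "[$(date -u +%Y-%m-%dT%H:%M:%SZ)] $*"; }

run_sql_file() {
  local sql_file="$1"
  local job_name log_file status
  job_name="$(basename "$sql_file" .sql)"
  log_file="$LOG_DIR/$job_name.log"
  mkdir -p "$LOG_DIR"

  log "start $job_name" >> "$log_file"
  # Feed the file on stdin; STDOUT and STDERR both go to the job log.
  if "$CH_CLIENT" --host "$CH_HOST" --multiquery < "$sql_file" >> "$log_file" 2>&1; then
    log "ok $job_name" >> "$log_file"
  else
    status=$?
    log "failed $job_name (exit $status)" >> "$log_file"
    return "$status"
  fi
}

# Run only when a SQL file is passed on the command line.
if [ "$#" -gt 0 ]; then
  run_sql_file "$1"
fi
```

A cron entry would then call the wrapper with the path to one SQL file, so the schedule stays in cron and the query text stays in the repository.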
For OPTIMIZE TABLE jobs, schedule OPTIMIZE TABLE Orders FINAL during off-peak hours, typically early morning, so that the forced merges do not compete with user queries.
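One way to package the off-peak OPTIMIZE job is sketched below. The script path in the comment, the 03:00 schedule, and the CH_CLIENT variable are assumptions for illustration; the identifier check is an extra safeguard added here, not something the article prescribes.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Per-user crontab line that would run this script daily at 03:00
# (the path is hypothetical):
#   0 3 * * * /opt/ch-jobs/optimize_table.sh Orders

CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

optimize_table() {
  local table="$1"
  # Accept only plain identifiers (optionally db.table) before
  # interpolating the name into SQL.
  local re='^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)?$'
  if [[ ! "$table" =~ $re ]]; then
    echo "invalid table name: $table" >&2
    return 2
  fi
  "$CH_CLIENT" --query "OPTIMIZE TABLE $table FINAL"
}

# Run only when a table name is passed on the command line.
if [ "$#" -gt 0 ]; then
  optimize_table "$1"
fi
```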
Add a TTL order_date + INTERVAL 90 DAY DELETE clause when you create a table. ClickHouse then removes expired rows automatically during background merges, eliminating the need for manual DELETE jobs.
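The sketch below shows where the TTL clause sits in a table definition. It writes the DDL to a file so it can be reviewed and version-controlled before being applied. Apart from the Orders table, the order_date column, and the TTL expression, everything here (the other columns, the sorting key, the file name) is assumed for the example.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical output path for the DDL.
SQL_FILE="${SQL_FILE:-./create_orders.sql}"

cat > "$SQL_FILE" <<'EOF'
CREATE TABLE IF NOT EXISTS Orders
(
    order_id   UInt64,          -- assumed column
    order_date Date,
    amount     Decimal(18, 2)   -- assumed column
)
ENGINE = MergeTree
ORDER BY (order_date, order_id)
TTL order_date + INTERVAL 90 DAY DELETE;
EOF

# To apply it on a real server:
#   clickhouse-client --multiquery < "$SQL_FILE"
echo "wrote $SQL_FILE"
```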
Airflow, Dagster, or Prefect trigger ClickHouse jobs via Python operators. Use connection pools and retries to handle transient failures.
Test queries on staging data, use --readonly=2 in dry-runs so the session cannot modify data, write idempotent SQL, and log both STDOUT and STDERR for auditability.
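A dry-run helper along these lines could look as follows. The function name, the CH_CLIENT variable, and the choice to keep STDOUT and STDERR in separate files are assumptions for the sketch.

```shell
#!/usr/bin/env bash
set -euo pipefail

CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# dry_run <sql> <stdout-log> <stderr-log>
# Runs the query in a read-only session and appends each stream to its own log.
dry_run() {
  local sql="$1" out_log="$2" err_log="$3"
  "$CH_CLIENT" --readonly=2 --query "$sql" >> "$out_log" 2>> "$err_log"
}

# Run only when all three arguments are passed on the command line.
if [ "$#" -eq 3 ]; then
  dry_run "$1" "$2" "$3"
fi
```

Separate files make it easy to alert on a non-empty error log while still keeping query output for audits.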
ClickHouse has no built-in general-purpose job scheduler; use cron or an orchestrator.
Pass --multiquery and separate the statements with semicolons inside the quoted --query string.
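As a small sketch of this, the helper below joins its arguments into one semicolon-separated string and passes it to a single client call. The function name and the CH_CLIENT variable are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

CH_CLIENT="${CH_CLIENT:-clickhouse-client}"

# run_statements <stmt> [<stmt> ...]
run_statements() {
  local joined="" stmt
  for stmt in "$@"; do
    # Insert "; " before every statement except the first.
    joined+="${joined:+; }$stmt"
  done
  "$CH_CLIENT" --multiquery --query "$joined"
}

# Run only when statements are passed on the command line.
if [ "$#" -gt 0 ]; then
  run_statements "$@"
fi
```

For example, run_statements "OPTIMIZE TABLE Orders FINAL" "SELECT count() FROM Orders" sends both statements in one invocation.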
Redirect STDERR, enable cron email alerts, and set retries in Airflow to catch and rerun failed jobs.
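For jobs driven by cron rather than an orchestrator, a rerun policy can be approximated in the wrapper itself. The retry function below is a generic sketch with a fixed delay, not a ClickHouse or Airflow feature, and it is only safe when the SQL is idempotent.

```shell
#!/usr/bin/env bash
set -euo pipefail

# retry <max_attempts> <delay_seconds> <command> [args...]
# Re-runs the command until it succeeds or max_attempts is reached.
retry() {
  local max_attempts="$1" delay="$2"
  shift 2
  local attempt=1
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after $attempt attempts: $*" >&2
      return 1
    fi
    echo "attempt $attempt failed; retrying in ${delay}s" >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Example call (not executed here):
#   retry 3 30 clickhouse-client --query "OPTIMIZE TABLE Orders FINAL"
```

Because the messages go to STDERR, they end up in whatever log or cron email already captures the job's errors.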