Use schema extraction, data export, and ClickHouse’s INSERT or clickhouse-client to move Oracle tables and data into ClickHouse efficiently.
ClickHouse’s columnar engine delivers sub-second analytics at lower cost. Moving historical Oracle data into ClickHouse speeds dashboards without replacing mission-critical OLTP workloads.
1) Extract Oracle schema
2) Map data types to ClickHouse
3) Export data (CSV/Parquet)
4) Create ClickHouse tables
5) Load files or stream through clickhouse-client
6) Validate counts & spot-check queries
Run DBMS_METADATA.GET_DDL or expdp with CONTENT=METADATA_ONLY. Save table, index, and constraint definitions to SQL files.
NUMBER → Decimal/Int64; VARCHAR2/CLOB → String; DATE/TIMESTAMP → DateTime64; BLOB → String + Codec; RAW → FixedString(N). Adjust precision and nullability manually.
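The mapping above could be sketched as a small Python helper. This is a minimal illustration, not a complete converter: the function name `oracle_to_clickhouse`, the fallback `Decimal(38, 10)` for an unconstrained NUMBER, and the choice of `DateTime64(0)` for DATE are assumptions made for this example. Time zone variants of TIMESTAMP and negative NUMBER scales are not handled, and a codec for BLOB columns would be added in the column definition rather than in the type.

```python
import re


def oracle_to_clickhouse(oracle_type: str, nullable: bool = False) -> str:
    """Map an Oracle column type string to a ClickHouse type (illustrative only)."""
    t = oracle_type.strip().upper()

    number = re.fullmatch(r"NUMBER\s*\(\s*(\d+)\s*(?:,\s*(\d+)\s*)?\)", t)
    timestamp = re.fullmatch(r"TIMESTAMP\s*(?:\(\s*(\d)\s*\))?", t)
    raw = re.fullmatch(r"RAW\s*\(\s*(\d+)\s*\)", t)

    if number:
        precision = int(number.group(1))
        scale = int(number.group(2) or 0)
        # Integers of up to 18 digits always fit in Int64.
        if scale == 0 and precision <= 18:
            ch = "Int64"
        else:
            ch = f"Decimal({precision}, {scale})"
    elif t == "NUMBER":
        # Assumption: unconstrained NUMBER gets a wide default; adjust manually.
        ch = "Decimal(38, 10)"
    elif t.startswith(("VARCHAR2", "CLOB")):
        ch = "String"
    elif t == "DATE":
        # Oracle DATE has second resolution, hence zero fractional digits.
        ch = "DateTime64(0)"
    elif timestamp:
        # Oracle defaults to 6 fractional digits when none are given.
        ch = f"DateTime64({timestamp.group(1) or 6})"
    elif t == "BLOB":
        ch = "String"
    elif raw:
        ch = f"FixedString({raw.group(1)})"
    else:
        raise ValueError(f"No mapping defined for Oracle type: {oracle_type}")

    return f"Nullable({ch})" if nullable else ch
```

Types that fall outside this table raise a `ValueError`, so unmapped columns surface during conversion rather than silently becoming strings.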
Edit the generated DDL: replace types, remove Oracle storage clauses, and add an ENGINE (usually MergeTree or ReplicatedMergeTree) and a primary key (ORDER BY).
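As a rough sketch of the DDL rewrite, the following Python function assembles a ClickHouse CREATE TABLE statement from already-mapped column types. The function name `build_create_table` and the table and column names in the usage note are hypothetical; a replicated engine would usually also need its own parameters, which are left to the caller here.

```python
def build_create_table(table, columns, order_by, engine="MergeTree"):
    """Build a ClickHouse CREATE TABLE statement.

    columns  -- list of (column_name, clickhouse_type) pairs
    order_by -- list of column names forming the sorting/primary key
    """
    known = {name for name, _ in columns}
    missing = [key for key in order_by if key not in known]
    if missing:
        raise ValueError(f"ORDER BY columns not in table: {missing}")

    column_lines = ",\n".join(f"    `{name}` {ch_type}" for name, ch_type in columns)
    key_list = ", ".join(f"`{key}`" for key in order_by)
    return (
        f"CREATE TABLE {table}\n"
        f"(\n{column_lines}\n)\n"
        f"ENGINE = {engine}\n"
        f"ORDER BY ({key_list})"
    )
```

For example, `build_create_table("sales.orders", [("order_id", "Int64"), ("created_at", "DateTime64(0)")], ["created_at", "order_id"])` yields a statement ending in `ENGINE = MergeTree` and `ORDER BY (created_at, order_id)` with backtick-quoted names.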
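Use sqlplus with SET MARKUP CSV ON, or SQLcl with SET SQLFORMAT csv, to spool each table to its own CSV file, then compress the files. Note that expdp (including access_method=direct_path) writes Oracle's proprietary dump format rather than CSV, so its output cannot be loaded into ClickHouse directly.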
Option 1: clickhouse-client --query="INSERT INTO … FORMAT CSV" < file.csv. Option 2: stage files in S3 and use INSERT INTO … SELECT * FROM s3(…). Enable input_format_defaults_for_omitted_fields=1 to handle missing columns.
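Option 1 could be scripted along these lines. This is a sketch that assumes clickhouse-client is on the PATH and reachable without extra credentials; the helper names `build_load_command` and `load_file`, the default host, and the table name in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def build_load_command(table, fmt="CSV", host="localhost"):
    """Return the clickhouse-client argument list for loading one file via stdin."""
    return [
        "clickhouse-client",
        "--host", host,
        # Fill omitted fields with column defaults.
        "--input_format_defaults_for_omitted_fields=1",
        "--query", f"INSERT INTO {table} FORMAT {fmt}",
    ]


def load_file(path, table, fmt="CSV", host="localhost"):
    """Stream a file into ClickHouse; raises CalledProcessError if the load fails."""
    with Path(path).open("rb") as handle:
        subprocess.run(build_load_command(table, fmt, host), stdin=handle, check=True)
```

Calling `load_file("orders.csv", "sales.orders")` would pipe the file to the client in the same way as the shell redirect in Option 1. Files exported with a header row would need `fmt="CSVWithNames"` instead.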
Run SELECT count(*), min/max timestamps, and checksum hashes on numeric columns in both databases. Randomly sample rows for deep comparison.
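The checks above can be sketched as one query that runs unchanged on both systems plus a comparison of the results. In this illustration a plain `sum` stands in for the checksum, and the names `validation_query` and `compare_metrics` are hypothetical; running the query and normalising the result keys (Oracle upper-cases unquoted aliases) is left to whichever database drivers are in use.

```python
def validation_query(table, ts_column, numeric_column):
    """Build an aggregate query that is valid in both Oracle and ClickHouse."""
    return (
        f"SELECT count(*) AS row_count, "
        f"min({ts_column}) AS min_ts, "
        f"max({ts_column}) AS max_ts, "
        f"sum({numeric_column}) AS num_sum "
        f"FROM {table}"
    )


def compare_metrics(source, target):
    """Return {metric: (source_value, target_value)} for every metric that differs."""
    mismatches = {}
    for key in sorted(set(source) | set(target)):
        source_value = source.get(key)
        target_value = target.get(key)
        if source_value != target_value:
            mismatches[key] = (source_value, target_value)
    return mismatches
```

An empty result from `compare_metrics` means the aggregates agree. It does not prove the rows are identical, which is why sampled row comparisons remain worthwhile.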
Use Oracle GoldenGate → Kafka → ClickHouse, or deploy the Debezium Oracle connector to stream CDC events (typically through Kafka) into ClickHouse Materialized Views.
Partition large tables by day, compress exports, parallel-load with GNU parallel, disable ClickHouse quorum inserts during bulk load, and run loads in UTC to avoid DST issues.
Yes. Perform an initial bulk load, then apply CDC streams (GoldenGate, Debezium) until lag is zero. Cut over reads to ClickHouse afterward.
ClickHouse handles multi-GB files, but splitting into 1–2 GB chunks enables parallel loading and easier retry on failure.
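One way to produce such chunks is to split an export on line boundaries, as in the sketch below. It assumes the CSV has no header row and no quoted fields containing embedded newlines; the function name `split_csv` and the `.partNNNN.csv` naming scheme are made up for this example.

```python
from pathlib import Path


def split_csv(src, out_dir, max_bytes=1_000_000_000):
    """Split a CSV file into chunks of at most max_bytes, never cutting a line.

    A single line longer than max_bytes still gets a chunk of its own.
    Returns the list of chunk paths in order.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    chunks = []
    out = None
    written = 0
    try:
        with src.open("rb") as handle:
            for line in handle:
                # Start a new chunk when this line would overflow the current one.
                if out is None or (written > 0 and written + len(line) > max_bytes):
                    if out is not None:
                        out.close()
                    path = out_dir / f"{src.stem}.part{len(chunks):04d}.csv"
                    out = path.open("wb")
                    chunks.append(path)
                    written = 0
                out.write(line)
                written += len(line)
    finally:
        if out is not None:
            out.close()
    return chunks
```

Each returned chunk can then be loaded independently, so a failed load only requires retrying that one file.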
Procedures and triggers don’t translate. Re-implement business logic in the application layer or use ClickHouse Materialized Views for simple aggregations.