Imports rows from a local or remote CSV file into a ClickHouse table using INSERT … FORMAT CSV or clickhouse-client/clickhouse-local.
Pipe the file into clickhouse-client and run INSERT INTO table FORMAT CSV. This streams rows directly, bypassing intermediate parsing tools.
Use cat file.csv | clickhouse-client --query "INSERT INTO db.table FORMAT CSV". Replace db.table with your actual database and table name.
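A fuller sketch of the same command, with placeholder connection flags and a hypothetical analytics.events table:

```bash
# Stream a local CSV into a remote table; host, credentials, and table name are placeholders.
cat events.csv | clickhouse-client \
  --host ch.example.com --user default --password '' \
  --query "INSERT INTO analytics.events FORMAT CSV"
```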
Yes. clickhouse-local lets you create a temporary table and import CSV without connecting to a cluster, ideal for ad-hoc analysis.
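A minimal sketch of that ad-hoc use: the file name and the customer_id column are assumptions for illustration.

```bash
# Query a CSV in place with clickhouse-local; no server or cluster required.
clickhouse-local --query "
  SELECT count(*) AS rows, uniqExact(customer_id) AS customers
  FROM file('orders.csv', 'CSVWithNames')
"
```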
Order matters. CSV columns must appear in the same sequence as the table definition, or as explicitly listed after INSERT INTO table (col1, col2).
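For example, if the file carries only two columns of a wider table (names here are hypothetical), list them explicitly so the mapping is unambiguous:

```bash
# CSV holds id and name, in that order; the remaining columns receive their defaults.
cat users.csv | clickhouse-client \
  --query "INSERT INTO db.users (id, name) FORMAT CSV"
```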
If the file has a header row, load it with FORMAT CSVWithNames. ClickHouse reads the first line as column names, maps the CSV columns to table columns by name, and does not insert the header as data.
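A hedged example, assuming orders.csv begins with a header line and db.orders is the target table:

```bash
# The header line supplies column names and is skipped as data.
cat orders.csv | clickhouse-client \
  --query "INSERT INTO db.orders FORMAT CSVWithNames"
```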
ClickHouse will refuse to insert rows that can’t be cast to the column type. Cast or clean the data beforehand, or load the raw strings into a staging table and transform afterwards, as sketched below.
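One way to do the staging approach, with hypothetical table and column names; the casts are only examples:

```bash
# 1. Land everything as raw strings in a staging table.
cat raw.csv | clickhouse-client \
  --query "INSERT INTO db.events_staging FORMAT CSV"

# 2. Cast and clean while copying into the typed target table.
clickhouse-client --query "
  INSERT INTO db.events
  SELECT toUInt64(id), parseDateTimeBestEffort(ts), toFloat64(amount)
  FROM db.events_staging
"
```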
Yes. Set --max_insert_block_size when invoking clickhouse-client. Larger blocks speed up imports at the cost of memory.
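A sketch; the value is illustrative and should be tuned to the memory you have available:

```bash
# Larger insert blocks mean fewer parts per insert but more memory per block.
cat big.csv | clickhouse-client \
  --max_insert_block_size=1000000 \
  --query "INSERT INTO db.big_table FORMAT CSV"
```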
Pipe a gzipped file through gunzip -c before the client. Example: gunzip -c orders.csv.gz | clickhouse-client …
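The full pipeline might look like this (db.orders is a placeholder):

```bash
# Decompress on the fly and stream straight into the table.
gunzip -c orders.csv.gz | clickhouse-client \
  --query "INSERT INTO db.orders FORMAT CSV"
```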
Run SELECT count(*) FROM table before and after the import, or compare against the expected row count from wc -l file.csv (minus the header line, if there is one).
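For example, against a hypothetical db.orders table:

```bash
# Row count actually stored in the table.
clickhouse-client --query "SELECT count(*) FROM db.orders"

# Lines in the source file; subtract 1 if the first line is a header.
wc -l orders.csv
```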
Use FORMAT CSV with streaming, disable client-server compression on low-powered machines, and temporarily increase max_insert_block_size and max_insert_threads if needed.
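Put together, a constrained-machine import might look like the sketch below; the flag values are illustrative, and --compression is assumed here to toggle network compression between client and server.

```bash
# Stream with a larger insert block and without network compression to save CPU.
cat data.csv | clickhouse-client \
  --compression=0 \
  --max_insert_block_size=1000000 \
  --query "INSERT INTO db.table FORMAT CSV"
```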
No. The table schema dictates types. ClickHouse casts incoming strings; mismatches cause errors.
Yes. Use the s3() table function with the file’s URL in an INSERT … SELECT, or download the file and pipe it to clickhouse-client.
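A sketch, assuming a publicly readable object; the URL, format, and target table are placeholders, and credentials can be passed as extra s3() arguments:

```bash
# Read the remote CSV with the s3 table function and insert it into the target table.
clickhouse-client --query "
  INSERT INTO db.orders
  SELECT * FROM s3('https://my-bucket.s3.amazonaws.com/path/orders.csv', 'CSVWithNames')
"
```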
ClickHouse is not transactional. Once data is inserted, undoing an import means dropping the affected partitions (ALTER TABLE … DROP PARTITION) or deleting the rows yourself.
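For instance, if a hypothetical db.orders table is partitioned by toYYYYMM(date), a bad March 2024 load could be removed like this:

```bash
# Drop the partition the faulty import landed in; this removes every row stored in it.
clickhouse-client --query "ALTER TABLE db.orders DROP PARTITION 202403"
```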