COPY imports millions of rows into ParadeDB tables quickly, with far less per-row and WAL overhead than individual INSERTs.
ParadeDB relies on PostgreSQL storage. COPY moves data from files or STDIN directly into a table, bypassing row-by-row overhead. Use it to ingest large embedding sets, historical orders, or product catalogs.
Pick COPY when loading >10k rows, restoring exports, or backfilling new vector columns. INSERT is fine for small, transactional writes.
Save UTF-8 CSVs without headers by default. Quote text fields, escape embedded delimiters, and store embedding vectors as bracketed literals like "[0.12,0.44,-0.88]".
customers.csv
1,Jane Doe,jane@shop.com,2024-01-14
2,John Roe,john@shop.com,2024-02-02
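For the embeddings load shown later, each row carries its vector as a quoted, bracketed literal. The sample below is illustrative and trims the vectors to three dimensions for readability; a real file must supply all 384 values per row.

products_embeddings.csv
1,Anvil,19.99,"[0.12,0.44,-0.88]"
2,Rope,4.50,"[0.05,-0.31,0.77]"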
-- from server-side file
COPY target_table [ (column_list) ]
FROM '/absolute/path/data.csv'
WITH (
    FORMAT csv,
    DELIMITER ',',
    NULL '',
    QUOTE '"',
    HEADER false,
    ENCODING 'UTF8'
);
-- from the client: psql's \copy reads a local file and streams it over STDIN
\copy target_table FROM 'data.csv' WITH (FORMAT csv);
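Server-side COPY can also read from a command with FROM PROGRAM, which is handy for compressed exports. A sketch, assuming the path shown and a role with superuser or pg_execute_server_program rights:

-- server-side: decompress on the fly (path is illustrative)
COPY target_table
FROM PROGRAM 'gunzip -c /var/lib/postgresql/import/data.csv.gz'
WITH (FORMAT csv);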
-- 1. ParadeDB table with vector column
CREATE TABLE products_embeddings (
    id BIGINT PRIMARY KEY,
    name TEXT,
    price NUMERIC(10,2),
    embedding vector(384) -- ParadeDB vector type
);
-- 2. Bulk load
COPY products_embeddings (id, name, price, embedding)
FROM '/var/lib/postgresql/import/products_embeddings.csv'
WITH (FORMAT csv, DELIMITER ',', NULL '', ENCODING 'UTF8');
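After the load, an approximate-nearest-neighbor index makes similarity search fast. A sketch, assuming the bundled pgvector extension provides the hnsw access method and that cosine distance fits your model; the index name is illustrative:

-- 3. Build the ANN index after loading, then refresh planner statistics
CREATE INDEX products_embeddings_embedding_idx
ON products_embeddings USING hnsw (embedding vector_cosine_ops);
ANALYZE products_embeddings;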
1) Drop or disable indexes and constraints, load the data, then rebuild them.
2) Increase maintenance_work_mem so the index rebuilds run faster.
3) Use COPY ... FREEZE on a table created or truncated in the same transaction to avoid post-load freezing work (see the sketch after this list).
4) Split huge files into 1-2 GB chunks and load them in parallel sessions.
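A minimal sketch of tips 1-3, reusing the products_embeddings table from above; the index name and memory setting are illustrative:

-- drop the index, reload with FREEZE, then rebuild
DROP INDEX IF EXISTS products_embeddings_embedding_idx;

SET maintenance_work_mem = '2GB';  -- helps the later index rebuild

BEGIN;
TRUNCATE products_embeddings;      -- FREEZE requires a table created or truncated in this transaction
COPY products_embeddings (id, name, price, embedding)
FROM '/var/lib/postgresql/import/products_embeddings.csv'
WITH (FORMAT csv, FREEZE true);
COMMIT;

-- recreate the index and run ANALYZE as in step 3 above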
Mismatched column order: always specify a column list so CSV fields line up with the intended columns.
Unexpected "invalid input syntax for type vector": Ensure embeddings are written as PostgreSQL array literals, not JSON.
No native command yet. Download the file to local storage first, or expose the bucket through a foreign data wrapper extension, then run COPY.
Yes, but loading is fastest with indexes dropped first; rebuilding them once after the load is usually quicker overall than maintaining them row by row.
Check pg_stat_progress_copy
(Postgres 14+) to see bytes processed and estimate remaining time.
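For example, from another session while the load runs:

-- one row per COPY in progress; bytes_total is 0 when the size is unknown (e.g. STDIN)
SELECT relid::regclass AS table_name,
       command,
       bytes_processed,
       bytes_total,
       tuples_processed
FROM pg_stat_progress_copy;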
Server-side COPY FROM a file is restricted to superusers and members of the pg_read_server_files role for security. Use \copy from psql, or grant pg_read_server_files to the loading role for limited rights.
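For example, assuming a dedicated loading role named etl_loader:

-- lets etl_loader run server-side COPY ... FROM '/path/file.csv' without superuser
GRANT pg_read_server_files TO etl_loader;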
Not natively. Split files and run in smaller chunks to simulate checkpointing.
Yes. If any row fails, the entire COPY rolls back unless you use ON_ERROR ignore (Postgres 17+) to skip rows that fail input conversion, as sketched below.
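A sketch of the lenient form; the ON_ERROR option is assumed to be available (PostgreSQL 17 or later) and only skips rows that fail data-type conversion, not constraint violations:

COPY products_embeddings (id, name, price, embedding)
FROM '/var/lib/postgresql/import/products_embeddings.csv'
WITH (FORMAT csv, ON_ERROR ignore);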