Bulk loading quickly ingests large files into PostgreSQL tables using the high-performance COPY command.
Use the COPY command (or psql’s \copy) to stream files directly into tables, bypassing row-by-row INSERTs and cutting load times by orders of magnitude.
COPY sends a single, optimized stream to the server, eliminating per-row overhead, trigger firing, and network latency. This makes it 10–100× faster than loops of INSERT.
See the full reference below. Key options: FORMAT (csv, text, binary), DELIMITER, HEADER, NULL, QUOTE, ENCODING, and FREEZE for bulk loads into static tables.
\copy sales FROM 'sales_2023.csv' WITH (FORMAT csv, HEADER, DELIMITER ',', NULL '', ENCODING 'UTF8');
\copy runs on the client, reading the file and funneling data to the server over the existing connection—perfect for cloud-hosted databases where the server can’t reach your laptop’s filesystem.
COPY sales (id, product_id, qty, sold_at)
FROM '/var/lib/postgresql/imports/sales_2023.csv'
WITH (FORMAT csv, HEADER, FREEZE);
RUN AS superuser or grant the table owner the pg_read_server_files role. Server-side COPY avoids client upload time.
Yes. Pipe gunzip output into psql:
gunzip -c bigfile.csv.gz | psql -c "COPY mytable FROM STDIN WITH (FORMAT csv);"
Disable indexes and constraints, set synchronous_commit = off, and increase maintenance_work_mem. Re-enable and ANALYZE afterward for accurate planner stats.
Run SELECT relname, pg_size_pretty(pg_total_relation_size(relid)) FROM pg_stat_progress_copy;
in another session (PostgreSQL 14+).
Wrap COPY in a transaction: BEGIN; COPY ...; COMMIT;
. On error, PostgreSQL aborts the COPY and you can simply ROLLBACK;
.
Server-side COPY FROM needs superuser or membership in pg_read_server_files. Client-side \copy works under regular roles because the file is read locally.
Yes. In PostgreSQL 15+, use COPY ... LOG ERRORS. For earlier versions, load into a staging table or use tools like pg_bulkload.
Binary avoids parsing, so it’s ~20-30 % faster, but ties you to PostgreSQL versions and endianness. Use it for internal pipelines; stick with CSV for exchange.