COPY INTO bulk-loads structured files from stages into Snowflake tables quickly and reliably.
Use the COPY INTO <table> command to bulk load data from an internal or external stage (S3, Azure, GCS). Snowflake loads files in parallel, automatically handles compression, and validates data types.
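A minimal load might look like the following sketch; the database, table, stage, and file format names here are placeholders:

    -- Load all staged files from an example stage into an example table
    COPY INTO my_db.public.orders
      FROM @my_db.public.orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format');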
Before loading, create the target table, define a file format object, and stage the files. Grant USAGE on the stage and file format to the loading role.
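For example, the setup could look like this sketch; the object and role names are illustrative, and the grants assume a role named loader_role:

    -- Target table, file format, and internal stage (names are placeholders)
    CREATE TABLE orders (order_id NUMBER, order_date DATE, amount NUMBER(10,2));

    CREATE FILE FORMAT csv_format
      TYPE = 'CSV'
      FIELD_DELIMITER = ','
      SKIP_HEADER = 1;

    CREATE STAGE orders_stage FILE_FORMAT = csv_format;

    -- Stage the files (PUT runs from SnowSQL or a driver, not the web UI)
    PUT file:///data/orders_*.csv @orders_stage;

    GRANT USAGE ON FILE FORMAT csv_format TO ROLE loader_role;
    -- USAGE applies to external stages; internal stages take READ (and WRITE) instead
    GRANT READ ON STAGE orders_stage TO ROLE loader_role;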
Specify the target table, stage location, file format, an optional column list, and load options (e.g., ON_ERROR, PURGE, VALIDATION_MODE). Enclose identifiers in double quotes when needed.
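A sketch showing those pieces together; the quoted identifiers, option values, and object names are only one reasonable combination:

    COPY INTO "My Schema"."Daily Orders" (order_id, order_date, amount)
      FROM @orders_stage/2024/
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      ON_ERROR = 'SKIP_FILE'
      PURGE = FALSE;

    -- Dry run: check the staged files for errors without loading any rows
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      VALIDATION_MODE = RETURN_ERRORS;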
To map file fields to table columns, provide a column list that matches the file field order, or set the file format options (e.g., FILE_FORMAT = (FIELD_DELIMITER = '|')) so fields are parsed correctly. Table columns omitted from the list default to NULL.
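For instance, with placeholder names, a pipe-delimited file whose fields line up with three of the table's columns:

    -- File fields map positionally to the listed columns;
    -- table columns not listed (e.g., a load_ts column) are left NULL
    COPY INTO orders (order_id, order_date, amount)
      FROM @orders_stage
      FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = '|' SKIP_HEADER = 1);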
To verify a load, query the INFORMATION_SCHEMA.LOAD_HISTORY view or inspect the RESULT_SCAN of the COPY INTO statement for row counts and rejected rows.
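For example, assuming the placeholder table above and that the COPY INTO statement was the most recent query in the session:

    -- Row counts and errors for recent loads into the table
    SELECT file_name, status, row_count, row_parsed, first_error_message
    FROM my_db.information_schema.load_history
    WHERE table_name = 'ORDERS'
    ORDER BY last_load_time DESC;

    -- Or inspect the result of the COPY INTO statement just executed
    SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));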
1) Compress files (gzip or bzip2) to cut network time.
2) Keep file sizes 100–250 MB for optimal parallelism.
3) Use ON_ERROR='CONTINUE' during first loads, then fix data (see the sketch after this list).
4) Purge files after success to avoid reloading.
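A first-load sketch combining these tips, using the same placeholder stage and file format as above:

    -- Staged files are gzip-compressed and split into roughly 100-250 MB chunks
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')  -- gzip compression is detected automatically
      ON_ERROR = 'CONTINUE'  -- load the rows that parse, report the rest
      PURGE = TRUE;          -- remove successfully loaded files from the stage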
To automate loading, create a Snowpipe with the same COPY INTO syntax, or wrap COPY INTO in a task scheduled on a dedicated warehouse (e.g., WAREHOUSE = LOADER_WH).
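A sketch of both approaches; the pipe, task, and warehouse names are illustrative:

    -- Continuous loading with Snowpipe
    -- (AUTO_INGEST requires an external stage with event notifications)
    CREATE PIPE orders_pipe AUTO_INGEST = TRUE AS
      COPY INTO orders
        FROM @orders_stage
        FILE_FORMAT = (FORMAT_NAME = 'csv_format');

    -- Scheduled batch loading with a task (resume it with ALTER TASK ... RESUME)
    CREATE TASK load_orders_task
      WAREHOUSE = LOADER_WH
      SCHEDULE = '60 MINUTE'
    AS
      COPY INTO orders
        FROM @orders_stage
        FILE_FORMAT = (FORMAT_NAME = 'csv_format');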
COPY INTO does not create the target table; it must already exist. Use CREATE TABLE or CREATE TABLE LIKE before loading.
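For example, cloning the structure of an existing table before loading into it (names are placeholders):

    -- Copy only the column definitions, then load the new table
    CREATE TABLE orders_2024 LIKE orders;

    COPY INTO orders_2024
      FROM @orders_stage/2024/
      FILE_FORMAT = (FORMAT_NAME = 'csv_format');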
COPY INTO also handles semi-structured data: define a JSON file format and use COPY INTO … FILE_FORMAT=(TYPE='JSON'). Use SELECT $1:field::type to parse semi-structured columns.
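A sketch for JSON, first landing whole documents in a VARIANT column and then casting fields during the load; the table, stage, and field names are placeholders:

    -- Land each JSON document in a VARIANT column
    CREATE TABLE raw_events (payload VARIANT);

    COPY INTO raw_events
      FROM @events_stage
      FILE_FORMAT = (TYPE = 'JSON');

    -- Or cast fields during the load itself
    -- (assumes an events table with matching columns already exists)
    COPY INTO events (event_id, event_time)
      FROM (
        SELECT $1:id::NUMBER, $1:ts::TIMESTAMP_NTZ
        FROM @events_stage
      )
      FILE_FORMAT = (TYPE = 'JSON');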
To recover from a failed or partial load, fix the source data or adjust the file format, then re-run COPY INTO with FORCE=TRUE to reload files already marked as loaded.
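For example, using the same placeholder objects as above:

    -- Ignore load metadata and reload files already marked as loaded;
    -- note this can create duplicate rows for files that did load successfully
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      FORCE = TRUE;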