COPY INTO bulk-loads structured files from stages into Snowflake tables quickly and reliably.
Use the COPY INTO <table> command to bulk load data from an internal or external stage (S3, Azure, GCS). Snowflake loads files in parallel, automatically handles compression, and validates data types.
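A minimal load might look like the following sketch; the database, table, stage, and file format names here are placeholders:

    -- Load all staged files from an example stage into an example table
    COPY INTO my_db.public.orders
      FROM @my_db.public.orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format');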
Before loading, create the target table, define a file format object, and stage the files. Grant USAGE on the stage and file format to the loading role.
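For example, the setup could look like this sketch; the object and role names are illustrative, and the grants assume a role named loader_role:

    -- Target table, file format, and internal stage (names are placeholders)
    CREATE TABLE orders (order_id NUMBER, order_date DATE, amount NUMBER(10,2));

    CREATE FILE FORMAT csv_format
      TYPE = 'CSV'
      FIELD_DELIMITER = ','
      SKIP_HEADER = 1;

    CREATE STAGE orders_stage FILE_FORMAT = csv_format;

    -- Stage the files (PUT runs from SnowSQL or a driver, not the web UI)
    PUT file:///data/orders_*.csv @orders_stage;

    GRANT USAGE ON FILE FORMAT csv_format TO ROLE loader_role;
    -- USAGE applies to external stages; internal stages take READ (and WRITE) instead
    GRANT READ ON STAGE orders_stage TO ROLE loader_role;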
Specify the target table, stage location, file format, an optional column list, and load options (e.g., ON_ERROR, PURGE, VALIDATION_MODE). Enclose identifiers in double quotes when needed.
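A sketch showing those pieces together; the quoted identifiers, option values, and object names are only one reasonable combination:

    COPY INTO "My Schema"."Daily Orders" (order_id, order_date, amount)
      FROM @orders_stage/2024/
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      ON_ERROR = 'SKIP_FILE'
      PURGE = FALSE;

    -- Dry run: check the staged files for errors without loading any rows
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      VALIDATION_MODE = RETURN_ERRORS;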
To map file fields to table columns, provide a column list that matches the file field order, or set the file format options (e.g., FILE_FORMAT = (FIELD_DELIMITER = '|')) so fields are parsed correctly. Table columns omitted from the list default to NULL.
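For instance, with placeholder names, a pipe-delimited file whose fields line up with three of the table's columns:

    -- File fields map positionally to the listed columns;
    -- table columns not listed (e.g., a load_ts column) are left NULL
    COPY INTO orders (order_id, order_date, amount)
      FROM @orders_stage
      FILE_FORMAT = (TYPE = 'CSV' FIELD_DELIMITER = '|' SKIP_HEADER = 1);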
To verify a load, query the INFORMATION_SCHEMA.LOAD_HISTORY view or inspect the RESULT_SCAN of the COPY INTO statement for row counts and rejected rows.
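For example, assuming the placeholder table above and that the COPY INTO statement was the most recent query in the session:

    -- Row counts and errors for recent loads into the table
    SELECT file_name, status, row_count, row_parsed, first_error_message
    FROM my_db.information_schema.load_history
    WHERE table_name = 'ORDERS'
    ORDER BY last_load_time DESC;

    -- Or inspect the result of the COPY INTO statement just executed
    SELECT * FROM TABLE(RESULT_SCAN(LAST_QUERY_ID()));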
1) Compress files (gzip or bzip2) to cut network time.
2) Keep file sizes 100–250 MB for optimal parallelism.
3) Use ON_ERROR='CONTINUE' during first loads, then fix data (see the sketch after this list).
4) Purge files after success to avoid reloading.
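A first-load sketch combining these tips, using the same placeholder stage and file format as above:

    -- Staged files are gzip-compressed and split into roughly 100-250 MB chunks
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')  -- gzip compression is detected automatically
      ON_ERROR = 'CONTINUE'  -- load the rows that parse, report the rest
      PURGE = TRUE;          -- remove successfully loaded files from the stage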
To automate loading, create a Snowpipe with the same COPY INTO syntax, or wrap COPY INTO in a task scheduled on a dedicated warehouse (e.g., WAREHOUSE = LOADER_WH).
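A sketch of both approaches; the pipe, task, and warehouse names are illustrative:

    -- Continuous loading with Snowpipe
    -- (AUTO_INGEST requires an external stage with event notifications)
    CREATE PIPE orders_pipe AUTO_INGEST = TRUE AS
      COPY INTO orders
        FROM @orders_stage
        FILE_FORMAT = (FORMAT_NAME = 'csv_format');

    -- Scheduled batch loading with a task (resume it with ALTER TASK ... RESUME)
    CREATE TASK load_orders_task
      WAREHOUSE = LOADER_WH
      SCHEDULE = '60 MINUTE'
    AS
      COPY INTO orders
        FROM @orders_stage
        FILE_FORMAT = (FORMAT_NAME = 'csv_format');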
COPY INTO does not create the target table; it must already exist. Use CREATE TABLE or CREATE TABLE LIKE before loading.
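For example, cloning the structure of an existing table before loading into it (names are placeholders):

    -- Copy only the column definitions, then load the new table
    CREATE TABLE orders_2024 LIKE orders;

    COPY INTO orders_2024
      FROM @orders_stage/2024/
      FILE_FORMAT = (FORMAT_NAME = 'csv_format');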
COPY INTO also handles semi-structured data: define a JSON file format and use COPY INTO … FILE_FORMAT=(TYPE='JSON'). Use SELECT $1:field::type to parse semi-structured columns.
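A sketch for JSON, first landing whole documents in a VARIANT column and then casting fields during the load; the table, stage, and field names are placeholders:

    -- Land each JSON document in a VARIANT column
    CREATE TABLE raw_events (payload VARIANT);

    COPY INTO raw_events
      FROM @events_stage
      FILE_FORMAT = (TYPE = 'JSON');

    -- Or cast fields during the load itself
    -- (assumes an events table with matching columns already exists)
    COPY INTO events (event_id, event_time)
      FROM (
        SELECT $1:id::NUMBER, $1:ts::TIMESTAMP_NTZ
        FROM @events_stage
      )
      FILE_FORMAT = (TYPE = 'JSON');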
To recover from a failed or partial load, fix the source data or adjust the file format, then re-run COPY INTO with FORCE=TRUE to reload files already marked as loaded.
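For example, using the same placeholder objects as above:

    -- Ignore load metadata and reload files already marked as loaded;
    -- note this can create duplicate rows for files that did load successfully
    COPY INTO orders
      FROM @orders_stage
      FILE_FORMAT = (FORMAT_NAME = 'csv_format')
      FORCE = TRUE;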