The COPY command bulk-loads CSV, JSON, Parquet, or Avro files stored in Azure Blob Storage directly into an Amazon Redshift table. Because it parallelizes reads across the Redshift compute nodes, ingestion is much faster than row-by-row INSERT loops.
Create an Azure AD application or shared access signature (SAS) that grants Blob Reader rights on the container. Supply its AZURE_ACCOUNT_KEY or SAS_TOKEN in the CREDENTIALS clause of COPY, and make sure Redshift's public subnet can reach <storage-acct>.blob.core.windows.net.
COPY target_table
FROM 'azure://<storage-acct>.blob.core.windows.net/<container>/path/'
CREDENTIALS (
  AZURE_ACCOUNT_NAME '<storage-acct>',
  AZURE_ACCOUNT_KEY '<account-key>'
)
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1
TIMEFORMAT 'auto'
DATEFORMAT 'auto'
REGION 'us-east-1'
COMPUPDATE ON
STATUPDATE ON;
FORMAT selects CSV, JSON, PARQUET, or AVRO. DELIMITER sets the column separator for CSV files, and IGNOREHEADER skips the given number of header rows. COMPUPDATE and STATUPDATE refresh the table's compression encodings and optimizer statistics during the load.
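For Parquet or Avro sources, or when you prefer a scoped SAS token over the account key, the clause shapes stay the same. A minimal sketch, assuming SAS_TOKEN accepts the token string exactly as issued by Azure:

-- Sketch: load Parquet files using a SAS token instead of the account key.
COPY target_table
FROM 'azure://<storage-acct>.blob.core.windows.net/<container>/path/'
CREDENTIALS (
  AZURE_ACCOUNT_NAME '<storage-acct>',
  SAS_TOKEN '<sas-token>'
)
FORMAT AS PARQUET;
-- Parquet and Avro are self-describing, so DELIMITER and IGNOREHEADER do not apply.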
CREATE TABLE Customers (
  id INT,
  name VARCHAR(50),
  email VARCHAR(70),
  created_at TIMESTAMP
);

COPY Customers
FROM 'azure://galaxydata.blob.core.windows.net/ecomm/customers/'
CREDENTIALS (
  AZURE_ACCOUNT_NAME 'galaxydata',
  AZURE_ACCOUNT_KEY 'AZURE_KEY'
)
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1;
Split files into 100-250 MB parts so Redshift can parallelize the load across slices, and compress them with gzip to cut transfer time. For massive ingests, disable STATUPDATE during the load, then run ANALYZE afterward.
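A sketch of that tuning, reusing the Customers example above and assuming the source blobs are gzip-compressed parts (GZIP and STATUPDATE OFF follow standard Redshift COPY syntax):

-- Load pre-split, gzip-compressed parts without refreshing statistics inline.
COPY Customers
FROM 'azure://galaxydata.blob.core.windows.net/ecomm/customers/'
CREDENTIALS (
  AZURE_ACCOUNT_NAME 'galaxydata',
  AZURE_ACCOUNT_KEY 'AZURE_KEY'
)
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1
GZIP
STATUPDATE OFF;

-- Rebuild optimizer statistics once the bulk load has finished.
ANALYZE Customers;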
A wrong region or endpoint stalls COPY; ensure <storage-acct>.blob.core.windows.net is reachable from the cluster. A mismatched column count triggers "Load error: Invalid field count"; validate the delimiter and header rows against the table definition.
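When a load fails, Redshift records per-row detail in the system table STL_LOAD_ERRORS, which works the same regardless of the source store. A quick diagnostic query:

-- Inspect the most recent load failures: offending file, line, column, and reason.
SELECT starttime,
       filename,
       line_number,
       colname,
       err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 10;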
Can one COPY command load files of different formats from the same path? No. All files in the path must share the format declared in the COPY statement.
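If a container holds mixed formats, run one COPY per format, each scoped to a format-homogeneous prefix. A sketch, assuming hypothetical csv/ and parquet/ prefixes under the same container:

-- One COPY per format, each pointed at a prefix holding only that format.
COPY Customers
FROM 'azure://galaxydata.blob.core.windows.net/ecomm/customers/csv/'
CREDENTIALS (AZURE_ACCOUNT_NAME 'galaxydata', AZURE_ACCOUNT_KEY 'AZURE_KEY')
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1;

COPY Customers
FROM 'azure://galaxydata.blob.core.windows.net/ecomm/customers/parquet/'
CREDENTIALS (AZURE_ACCOUNT_NAME 'galaxydata', AZURE_ACCOUNT_KEY 'AZURE_KEY')
FORMAT AS PARQUET;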
Does COPY delete or move the source blobs after loading? No. COPY only reads data; you must remove or archive blobs separately if required.
How should credentials be kept out of scripts? Store keys or SAS tokens in AWS Secrets Manager and inject them into COPY via a stored procedure or parameterized script.
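A minimal sketch of the parameterized-script pattern, assuming a psql-style client where the :azure_key variable is substituted at run time by a wrapper that has already fetched the secret from AWS Secrets Manager (the variable name and wrapper are hypothetical):

-- :azure_key is supplied by the calling script; the key never lives in the SQL file.
COPY Customers
FROM 'azure://galaxydata.blob.core.windows.net/ecomm/customers/'
CREDENTIALS (
  AZURE_ACCOUNT_NAME 'galaxydata',
  AZURE_ACCOUNT_KEY :'azure_key'
)
FORMAT AS CSV
DELIMITER ','
IGNOREHEADER 1;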