Unload data from Redshift to S3, create Snowflake stages, then COPY INTO matching tables to complete a fast, low-risk migration.
Unload each Redshift table to compressed Parquet files in S3, create external or internal stages in Snowflake that point to those files, then run parallel COPY INTO commands to populate Snowflake tables.
Follow four discrete phases: assess & plan, extract (UNLOAD), load (COPY INTO), and validate & cut over.
Inventory tables, views, and dependencies with the Redshift system views. Size data volumes, note sort/dist keys, and map data types to Snowflake equivalents.
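For example, a quick sizing query against Redshift's SVV_TABLE_INFO system view might look like the sketch below (the schema filter is illustrative):

SELECT "schema",
       "table",
       tbl_rows,   -- estimated row count
       size,       -- size in 1 MB blocks
       diststyle,
       sortkey1
FROM   svv_table_info
WHERE  "schema" NOT IN ('pg_catalog', 'information_schema')
ORDER  BY size DESC;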
Run UNLOAD in parallel for each table, targeting Parquet files in S3. Authorize access with an IAM role, and use MAXFILESIZE (or PARALLEL OFF) to control output file sizes if needed.
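A representative UNLOAD statement, with a placeholder bucket, schema, and IAM role ARN:

UNLOAD ('SELECT * FROM analytics.orders')
TO 's3://my-migration-bucket/orders/'          -- placeholder bucket and prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-role'
FORMAT AS PARQUET
MAXFILESIZE 256 MB;                            -- keep individual files modest for parallel loading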
Create a named stage that points to the S3 bucket. Use COPY INTO with PURGE=TRUE and FILE_FORMAT=(TYPE=PARQUET) to ingest each table concurrently.
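A minimal sketch, assuming a storage integration named s3_int and a pre-created target table whose column names match the Parquet schema (bucket, stage, and table names are placeholders); PURGE = TRUE removes staged files after a successful load:

CREATE STAGE redshift_migration
  URL = 's3://my-migration-bucket/'
  STORAGE_INTEGRATION = s3_int;

COPY INTO analytics.orders
  FROM @redshift_migration/orders/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE   -- map Parquet columns onto the pre-created table
  PURGE = TRUE;                             -- delete staged files once the load succeeds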
Row-count and checksum each table, replay recent Redshift changes via change-data-capture, then redirect clients to Snowflake. Decommission Redshift only after success metrics pass.
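A lightweight validation pattern is to run the same aggregates on both systems and diff the output (the table and columns here are illustrative):

-- run in Redshift and in Snowflake, then compare the results
SELECT COUNT(*)         AS row_count,
       SUM(order_total) AS total_amount,
       MAX(updated_at)  AS high_watermark
FROM analytics.orders;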
Export DDL from Redshift using pg_dump --schema-only or system catalog queries. Replace Redshift-specific syntax (DISTKEY, SORTKEY) with Snowflake clustering keys or drop them.
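For example, a Redshift table definition and one possible Snowflake translation (table and columns are illustrative):

-- Redshift DDL
CREATE TABLE analytics.orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_total NUMERIC(12,2),
    created_at  TIMESTAMP
)
DISTKEY (customer_id)
SORTKEY (created_at);

-- One possible Snowflake translation: drop DISTKEY, map SORTKEY to an optional clustering key
CREATE TABLE analytics.orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_total NUMBER(12,2),
    created_at  TIMESTAMP_NTZ
)
CLUSTER BY (created_at);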
Use external tables for a phased approach: query S3-resident Parquet directly in Snowflake, test workloads, then materialize to native tables later with CREATE TABLE AS SELECT.
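A sketch of that phased pattern, reusing the stage from the load step (table names and column expressions are illustrative):

CREATE EXTERNAL TABLE ext_orders (
    order_id    BIGINT        AS (value:order_id::BIGINT),
    order_total NUMBER(12,2)  AS (value:order_total::NUMBER(12,2)),
    created_at  TIMESTAMP_NTZ AS (value:created_at::TIMESTAMP_NTZ)
)
LOCATION = @redshift_migration/orders/
FILE_FORMAT = (TYPE = PARQUET)
AUTO_REFRESH = FALSE;

ALTER EXTERNAL TABLE ext_orders REFRESH;   -- register the files already in S3

-- later, materialize to a native table once workloads are validated
CREATE TABLE analytics.orders AS
SELECT order_id, order_total, created_at
FROM ext_orders;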
Compress files (Parquet/GZIP), aim for roughly 100-250 MB compressed per file and keep objects under 1 GB, and leverage Snowflake multi-cluster warehouses to scale COPY throughput. Pre-create target tables with appropriate data types.
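As a sketch, a dedicated load warehouse with multi-cluster scaling might be defined as follows (the warehouse name is illustrative, and multi-cluster settings require a Snowflake edition that supports them):

CREATE WAREHOUSE load_wh WITH
  WAREHOUSE_SIZE    = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4      -- scale out when many COPY statements queue
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE;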
Galaxy’s AI copilot writes UNLOAD/COPY scripts, refactors queries for Snowflake syntax, and lets teams collaborate on validated migration SQL without pasting code in Slack.
Use Redshift Spectrum or AWS DMS to stream ongoing changes to S3, then load them into Snowflake on a schedule until final cutover.
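A minimal sketch of scheduled incremental loading, assuming the change files land under a cdc/ prefix in the same stage and a staging table already exists (all names are illustrative); COPY skips files it has already loaded, so reruns are safe:

CREATE TASK load_orders_changes
  WAREHOUSE = load_wh
  SCHEDULE  = '15 MINUTE'
AS
  COPY INTO analytics.orders_changes
  FROM @redshift_migration/cdc/orders/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

ALTER TASK load_orders_changes RESUME;   -- tasks are created suspended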
Most standard SQL works, but PostgreSQL-style constructs and dialect differences (for example, DISTINCT ON or LISTAGG delimiter syntax) need to be rewritten with Snowflake equivalents such as QUALIFY and ROW_NUMBER(). Galaxy AI can automate many of these rewrites.
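For instance, a PostgreSQL-style DISTINCT ON query can be rewritten with Snowflake's QUALIFY clause (table and columns are illustrative):

-- PostgreSQL-style: latest order per customer
SELECT DISTINCT ON (customer_id) customer_id, order_id, created_at
FROM orders
ORDER BY customer_id, created_at DESC;

-- Snowflake rewrite
SELECT customer_id, order_id, created_at
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) = 1;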
With compressed Parquet and a medium Snowflake warehouse, bulk loads commonly finish in 30-60 minutes, though load time scales with data volume. End-to-end duration also depends on network bandwidth and validation steps.