Unload data from Redshift to S3, create Snowflake stages, then COPY INTO matching tables to complete a fast, low-risk migration.
Unload each Redshift table to compressed Parquet files in S3, create external or internal stages in Snowflake that point to those files, then run parallel COPY INTO commands to populate Snowflake tables.
Follow four discrete phases: assess & plan, extract (UNLOAD), load (COPY INTO), and validate & cut over.
Inventory tables, views, and dependencies with the Redshift system views. Size data volumes, note sort/dist keys, and map data types to Snowflake equivalents.
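For example, a quick sizing query against Redshift's SVV_TABLE_INFO system view might look like the sketch below (the schema filter is illustrative):

SELECT "schema",
       "table",
       tbl_rows,   -- estimated row count
       size,       -- size in 1 MB blocks
       diststyle,
       sortkey1
FROM   svv_table_info
WHERE  "schema" NOT IN ('pg_catalog', 'information_schema')
ORDER  BY size DESC;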
Run UNLOAD in parallel for each table, targeting Parquet files in S3. Authorize access with an IAM role, and use MAXFILESIZE (or PARALLEL OFF) to control output file sizes if needed.
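A representative UNLOAD statement, with a placeholder bucket, schema, and IAM role ARN:

UNLOAD ('SELECT * FROM analytics.orders')
TO 's3://my-migration-bucket/orders/'          -- placeholder bucket and prefix
IAM_ROLE 'arn:aws:iam::123456789012:role/redshift-unload-role'
FORMAT AS PARQUET
MAXFILESIZE 256 MB;                            -- keep individual files modest for parallel loading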
Create a named stage that points to the S3 bucket. Use COPY INTO with PURGE=TRUE and FILE_FORMAT=(TYPE=PARQUET) to ingest each table concurrently.
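A minimal sketch, assuming a storage integration named s3_int and a pre-created target table whose column names match the Parquet schema (bucket, stage, and table names are placeholders); PURGE = TRUE removes staged files after a successful load:

CREATE STAGE redshift_migration
  URL = 's3://my-migration-bucket/'
  STORAGE_INTEGRATION = s3_int;

COPY INTO analytics.orders
  FROM @redshift_migration/orders/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE   -- map Parquet columns onto the pre-created table
  PURGE = TRUE;                             -- delete staged files once the load succeeds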
Row-count and checksum each table, replay recent Redshift changes via change-data-capture, then redirect clients to Snowflake. Decommission Redshift only after success metrics pass.
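A lightweight validation pattern is to run the same aggregates on both systems and diff the output (the table and columns here are illustrative):

-- run in Redshift and in Snowflake, then compare the results
SELECT COUNT(*)         AS row_count,
       SUM(order_total) AS total_amount,
       MAX(updated_at)  AS high_watermark
FROM analytics.orders;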
Export DDL from Redshift using pg_dump --schema-only or system catalog queries. Replace Redshift-specific syntax (DISTKEY, SORTKEY) with Snowflake clustering keys or drop them.
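For example, a Redshift table definition and one possible Snowflake translation (table and columns are illustrative):

-- Redshift DDL
CREATE TABLE analytics.orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_total NUMERIC(12,2),
    created_at  TIMESTAMP
)
DISTKEY (customer_id)
SORTKEY (created_at);

-- One possible Snowflake translation: drop DISTKEY, map SORTKEY to an optional clustering key
CREATE TABLE analytics.orders (
    order_id    BIGINT,
    customer_id BIGINT,
    order_total NUMBER(12,2),
    created_at  TIMESTAMP_NTZ
)
CLUSTER BY (created_at);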
Use external tables for a phased approach: query S3-resident Parquet directly in Snowflake, test workloads, then materialize to native tables later with CREATE TABLE AS SELECT.
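A sketch of that phased pattern, reusing the stage from the load step (table names and column expressions are illustrative):

CREATE EXTERNAL TABLE ext_orders (
    order_id    BIGINT        AS (value:order_id::BIGINT),
    order_total NUMBER(12,2)  AS (value:order_total::NUMBER(12,2)),
    created_at  TIMESTAMP_NTZ AS (value:created_at::TIMESTAMP_NTZ)
)
LOCATION = @redshift_migration/orders/
FILE_FORMAT = (TYPE = PARQUET)
AUTO_REFRESH = FALSE;

ALTER EXTERNAL TABLE ext_orders REFRESH;   -- register the files already in S3

-- later, materialize to a native table once workloads are validated
CREATE TABLE analytics.orders AS
SELECT order_id, order_total, created_at
FROM ext_orders;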
Compress files (Parquet/GZIP), aim for roughly 100-250 MB compressed per file and keep objects under 1 GB, and leverage Snowflake multi-cluster warehouses to scale COPY throughput. Pre-create target tables with appropriate data types.
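As a sketch, a dedicated load warehouse with multi-cluster scaling might be defined as follows (the warehouse name is illustrative, and multi-cluster settings require a Snowflake edition that supports them):

CREATE WAREHOUSE load_wh WITH
  WAREHOUSE_SIZE    = 'MEDIUM'
  MIN_CLUSTER_COUNT = 1
  MAX_CLUSTER_COUNT = 4      -- scale out when many COPY statements queue
  AUTO_SUSPEND      = 60
  AUTO_RESUME       = TRUE;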
Galaxy’s AI copilot writes UNLOAD/COPY scripts, refactors queries for Snowflake syntax, and lets teams collaborate on validated migration SQL without pasting code in Slack.
Use Redshift Spectrum or AWS DMS to stream ongoing changes to S3, then load them into Snowflake on a schedule until final cutover.
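A minimal sketch of scheduled incremental loading, assuming the change files land under a cdc/ prefix in the same stage and a staging table already exists (all names are illustrative); COPY skips files it has already loaded, so reruns are safe:

CREATE TASK load_orders_changes
  WAREHOUSE = load_wh
  SCHEDULE  = '15 MINUTE'
AS
  COPY INTO analytics.orders_changes
  FROM @redshift_migration/cdc/orders/
  FILE_FORMAT = (TYPE = PARQUET)
  MATCH_BY_COLUMN_NAME = CASE_INSENSITIVE;

ALTER TASK load_orders_changes RESUME;   -- tasks are created suspended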
Most standard SQL works, but PostgreSQL-style constructs and dialect differences (for example, DISTINCT ON or LISTAGG delimiter syntax) need to be rewritten with Snowflake equivalents such as QUALIFY and ROW_NUMBER(). Galaxy AI can automate many of these rewrites.
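For instance, a PostgreSQL-style DISTINCT ON query can be rewritten with Snowflake's QUALIFY clause (table and columns are illustrative):

-- PostgreSQL-style: latest order per customer
SELECT DISTINCT ON (customer_id) customer_id, order_id, created_at
FROM orders
ORDER BY customer_id, created_at DESC;

-- Snowflake rewrite
SELECT customer_id, order_id, created_at
FROM orders
QUALIFY ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) = 1;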
With compressed Parquet and a medium Snowflake warehouse, bulk loads commonly finish in 30-60 minutes, though load time scales with data volume. End-to-end duration also depends on network bandwidth and validation steps.