Move data, schema, and workloads from a local PostgreSQL server into a fully-managed Amazon Redshift cluster with minimal downtime.
Redshift separates storage and compute on RA3 nodes, scales out on demand through Concurrency Scaling, and offloads maintenance. Teams get columnar storage, massively parallel processing (MPP), and painless backups without buying new hardware.
1) Assess schema and code
2) Spin up a Redshift cluster
3) Create S3 staging buckets
4) Extract data from PostgreSQL
5) Load into Redshift
6) Validate and switch workloads
Run AWS Schema Conversion Tool (SCT) to scan objects, flag incompatible types, and generate a migration assessment report. Fix issues in PostgreSQL before export to reduce surprises.
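Independently of SCT, a quick catalog query can surface columns whose types usually need conversion before a Redshift load. A hedged sketch; adjust the schema filter to your database:

```sql
-- Flag column types that often have no direct Redshift equivalent
-- (arrays, enums/composites, json/jsonb, xml, uuid)
SELECT table_schema, table_name, column_name, data_type
FROM information_schema.columns
WHERE table_schema NOT IN ('pg_catalog', 'information_schema')
  AND data_type IN ('ARRAY', 'USER-DEFINED', 'json', 'jsonb', 'xml', 'uuid');
```

Anything this returns deserves a look in the SCT report before you export.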
Export to S3 in multiple part files so Redshift's COPY can read them in parallel; a single giant file is loaded by one slice only. Split huge tables into several similarly sized part files. (Note that PARALLEL OFF is a Redshift UNLOAD option that forces a single output file, the opposite of what you want here.)
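One way to produce multiple part files from PostgreSQL is to export disjoint slices of the table, for example by modulo on a key. A sketch assuming a hypothetical integer customer_id column:

```sql
-- Part 0 of 4; repeat with customer_id % 4 = 1, 2, 3 for the other files
COPY (SELECT * FROM public.customers WHERE customer_id % 4 = 0)
TO '/var/lib/postgresql/unload/customers_part0.csv'
WITH (FORMAT csv, HEADER true);
```

The four exports can run in separate sessions concurrently, which also speeds up the extract itself.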
Let SCT generate CREATE TABLE scripts, or hand-write DDL with DISTKEY and SORTKEY tuned for your query patterns. Leave compression to ENCODE AUTO, the default on current Redshift, unless profiling shows a better per-column encoding.
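A hand-written DDL sketch for a hypothetical customers table; the column names and key choices are assumptions, so tune DISTKEY and SORTKEY to your own joins and filters:

```sql
CREATE TABLE public.customers (
    customer_id BIGINT NOT NULL,
    email       VARCHAR(256),
    country     VARCHAR(2),
    created_at  TIMESTAMP
)
DISTKEY (customer_id)   -- co-locates rows joined on customer_id
SORTKEY (created_at);   -- speeds range-filtered scans on signup date
-- No per-column ENCODE is given, so Redshift applies ENCODE AUTO
```

DISTKEY on the join column avoids network redistribution during joins; SORTKEY on the timestamp lets range predicates skip blocks.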
# Custom-format dump of one table, useful as a rollback backup.
# (--column-inserts makes restores very slow and is unnecessary
# for a CSV-based Redshift load, so it is omitted here.)
pg_dump \
  --format=custom \
  --file=customers.dump \
  --table=public.customers
Or export CSV directly. UNLOAD is a Redshift command; the PostgreSQL equivalent is COPY ... TO:
COPY (SELECT * FROM public.customers)
TO '/var/lib/postgresql/unload/customers.csv'
WITH (FORMAT csv, HEADER true);
aws s3 cp /var/lib/postgresql/unload/ s3://ecom-staging/customers/ --recursive
Use aws s3 sync for delta uploads during cut-over.
COPY customers
FROM 's3://ecom-staging/customers/'
IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
FORMAT AS CSV
IGNOREHEADER 1
TIMEFORMAT 'auto'
REGION 'us-east-1';
-- the CREDENTIALS 'aws_iam_role=...' form is deprecated; use IAM_ROLE
-- add the GZIP option if the part files are compressed
Load each table in parallel to maximize cluster throughput.
AWS Database Migration Service (DMS) can stream CDC from PostgreSQL to Redshift. Enable logical replication in postgresql.conf, create a replication slot, and let DMS apply ongoing changes until cut-over.
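The PostgreSQL side of that setup looks roughly like this; the slot name is hypothetical, and DMS can also create its own slot:

```sql
-- postgresql.conf (restart required):
--   wal_level = logical
--   max_replication_slots = 10
--   max_wal_senders = 10

-- Verify the setting took effect after restart
SHOW wal_level;

-- Optionally pre-create a slot for DMS (test_decoding plugin shown)
SELECT pg_create_logical_replication_slot('dms_slot', 'test_decoding');
```

Unused slots retain WAL indefinitely, so drop the slot if you abandon the CDC task.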
Run row counts, checksums, and spot-check business queries. Unless you have mapped Redshift tables into PostgreSQL through a foreign data wrapper, run the same query on each system and compare the results:
-- On PostgreSQL
SELECT COUNT(*) AS row_count FROM public.customers;
-- On Redshift
SELECT COUNT(*) AS row_count FROM public.customers;
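Beyond raw counts, an aggregate "fingerprint" run verbatim on both systems catches silent truncation or type drift. The columns here are assumptions for illustration:

```sql
-- Run unchanged on PostgreSQL and on Redshift; results should match exactly
-- (SUM over a huge BIGINT column can overflow on Redshift; cast if needed)
SELECT COUNT(*)         AS row_count,
       SUM(customer_id) AS id_sum,
       MIN(created_at)  AS first_row,
       MAX(created_at)  AS last_row
FROM public.customers;
```

Any mismatch narrows the problem to a table before you start diffing individual rows.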
After CDC lag is near zero, point BI tools and ETL pipelines to Redshift endpoints. Keep PostgreSQL read-only for a day as a rollback plan.
• Compress export files with gzip to cut network transfer time.
• Size Redshift WLM queues before heavy loads.
• Keep S3 buckets in the same region as the cluster.
• Tag all AWS resources for cost tracking.
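The compression tip can be applied at export time: PostgreSQL's COPY ... TO PROGRAM pipes the output straight through gzip. A sketch, assuming the same staging path as above:

```sql
-- Requires superuser (or the pg_execute_server_program role);
-- writes a gzipped CSV directly on the database host
COPY (SELECT * FROM public.customers)
TO PROGRAM 'gzip > /var/lib/postgresql/unload/customers.csv.gz'
WITH (FORMAT csv, HEADER true);
```

Redshift's COPY reads these files directly when you add its GZIP option.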
1) Copying without compression balloons S3 costs. Always compress exports.
2) Ignoring DISTKEY/SORTKEY yields slow joins. Tune keys before loading big tables.
Query stl_load_errors for row-level error messages. Fix data types and escaping issues at the source, and retry with ACCEPTINVCHARS only as a last resort.
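A typical triage query against that system table; the ordering and LIMIT are just conveniences:

```sql
-- Most recent load failures, newest first
SELECT starttime, filename, line_number, colname, err_code, err_reason
FROM stl_load_errors
ORDER BY starttime DESC
LIMIT 20;
```

The filename and line_number columns point you at the exact offending record in S3.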
With gzip + multi-part UNLOAD and Redshift RA3, expect 1-2 hours for extract and 20-40 minutes for COPY, assuming 10 Gbps network.
Yes. Keep PostgreSQL online in read-only mode until Redshift is fully validated. Switch DNS only after confirming query results.
Not necessarily. Use AWS DMS with CDC to stream ongoing changes while users stay on PostgreSQL, then cut over once lag is minimal.
Redshift supports PL/pgSQL but with limitations. Use SCT to convert code and manually rewrite unsupported features like triggers or LISTEN/NOTIFY.
Create an IAM role with least privilege for COPY/UNLOAD, enable SSL on the Redshift cluster, and encrypt S3 buckets with SSE-KMS.