How to Migrate from Postgres to Redshift

Galaxy Glossary

How do I migrate data from PostgreSQL to Amazon Redshift?

Move data and schema from a PostgreSQL source into Amazon Redshift with minimal downtime and full fidelity.


Description

Why migrate from Postgres to Redshift?

Teams outgrow single-instance Postgres when analytics queries slow OLTP workloads. Redshift’s columnar storage and MPP architecture scale reads without impacting the transactional database.

What are the prerequisites?

Create an Amazon S3 bucket, provision a Redshift cluster, and ensure network connectivity from Postgres to S3 or to an AWS DMS replication instance.

How do I check schema compatibility?

Run pg_dump --schema-only and feed the output into the AWS Schema Conversion Tool (SCT). SCT flags unsupported data types (e.g., uuid) and suggests Redshift-friendly equivalents such as VARCHAR(36).
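For a quick pre-check before running SCT, a small script can scan the schema dump for types Redshift handles differently. The type map below is an illustrative subset, not SCT's full rule set:

```python
import re

# Partial, assumed map of Postgres types to Redshift-friendly equivalents.
# SCT covers many more cases; this is only a rough pre-flight check.
TYPE_MAP = {
    "uuid": "VARCHAR(36)",
    "jsonb": "SUPER",
    "text": "VARCHAR(65535)",
    "serial": "INTEGER IDENTITY(1,1)",
}

def flag_unsupported(schema_sql: str) -> list:
    """Return (postgres_type, suggested_redshift_type) pairs found in a dump."""
    hits = []
    for pg_type, rs_type in TYPE_MAP.items():
        if re.search(rf"\b{pg_type}\b", schema_sql, re.IGNORECASE):
            hits.append((pg_type, rs_type))
    return hits

ddl = "CREATE TABLE orders (id uuid PRIMARY KEY, payload jsonb);"
print(flag_unsupported(ddl))
# -> [('uuid', 'VARCHAR(36)'), ('jsonb', 'SUPER')]
```

Feed it the output of pg_dump --schema-only, then let SCT make the final call.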

Which migration paths exist?

1) Offline: export Postgres tables to CSV with COPY, upload to S3, then COPY into Redshift. 2) Minimal-downtime: use AWS DMS to replicate ongoing changes until cut-over.

How to export Postgres data to S3?

Use the aws_s3 extension or pipe psql output through gzip:
\copy (SELECT * FROM orders) TO PROGRAM 'gzip > orders.csv.gz' CSV
aws s3 cp orders.csv.gz s3://my-bucket/

How to load data into Redshift?

Grant the cluster an IAM role with S3 read access and run:
COPY orders FROM 's3://my-bucket/orders.csv.gz' IAM_ROLE 'arn:aws:iam::123:role/redshiftCopy' GZIP CSV;

How to keep Postgres and Redshift in sync?

Configure AWS DMS with Postgres as the source and Redshift as the target. Enable “ongoing replication” to stream changes while users stay online.

How to validate the migration?

Run row counts and checksums on critical tables:
SELECT COUNT(*) FROM orders; should return the same value in both systems. Also verify aggregates, such as SUM(total_amount) for total revenue.
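The count comparison can be scripted once you have results from both sides. This sketch stubs the counts as dictionaries; in practice you would fill them from queries via a Postgres driver and a Redshift driver (connection details assumed):

```python
def compare_counts(pg_counts: dict, rs_counts: dict) -> list:
    """Return names of tables whose row counts differ or are missing in Redshift."""
    mismatches = []
    for table, pg_n in pg_counts.items():
        if rs_counts.get(table) != pg_n:
            mismatches.append(table)
    return mismatches

pg = {"orders": 10_000, "customers": 2_500}   # from SELECT COUNT(*) in Postgres
rs = {"orders": 10_000, "customers": 2_499}   # from SELECT COUNT(*) in Redshift
print(compare_counts(pg, rs))  # -> ['customers']
```

Any table it returns needs investigation before cut-over.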

When do I cut over?

After data parity checks pass, stop writes to Postgres, let DMS apply the last changes, then update downstream BI tools to read from Redshift.

Best practices for performance in Redshift?

Define DISTKEY on high-cardinality join columns (e.g., customer_id) and SORTKEY on frequently filtered dates (e.g., order_date). Schedule VACUUM and ANALYZE.

What are common mistakes?

1) Forgetting to convert sequences; use Redshift IDENTITY. 2) Loading data without compressing; always GZIP large exports.
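The sequence conversion in mistake 1 can be mechanized for simple cases. A rough regex sketch (real schemas with defaults or quoted identifiers still need SCT or manual review):

```python
import re

def convert_serials(ddl: str) -> str:
    """Rewrite Postgres SERIAL/BIGSERIAL columns as Redshift IDENTITY columns.
    Order matters: replace BIGSERIAL before SERIAL."""
    ddl = re.sub(r"\bBIGSERIAL\b", "BIGINT IDENTITY(1,1)", ddl, flags=re.IGNORECASE)
    ddl = re.sub(r"\bSERIAL\b", "INT IDENTITY(1,1)", ddl, flags=re.IGNORECASE)
    return ddl

print(convert_serials("id SERIAL PRIMARY KEY"))
# -> id INT IDENTITY(1,1) PRIMARY KEY
```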


How to Migrate from Postgres to Redshift Example Usage


-- Migrate Customers table offline
-- 1. Postgres: export
psql -c "\copy (SELECT id,name,email,created_at FROM customers) TO '/tmp/customers.csv' CSV"
aws s3 cp /tmp/customers.csv s3://mig-bucket/

-- 2. Redshift: create table
CREATE TABLE customers (
  id INT IDENTITY(1,1),
  name VARCHAR(255),
  email VARCHAR(320),
  created_at TIMESTAMP
)
DISTKEY(id) SORTKEY(created_at);

-- 3. Load
COPY customers FROM 's3://mig-bucket/customers.csv'
IAM_ROLE 'arn:aws:iam::123:role/redshiftCopy'
EXPLICIT_IDS  -- keep exported id values instead of generating new IDENTITY values
CSV;

-- 4. Check
SELECT COUNT(*) FROM customers;

How to Migrate from Postgres to Redshift Syntax


-- 1. Export schema from Postgres
pg_dump -U admin -d shop -s > schema.sql

-- 2. Transform schema for Redshift (SCT or manual)
CREATE TABLE orders (
    id INT IDENTITY(1,1) PRIMARY KEY,
    customer_id INT,
    order_date DATE,
    total_amount NUMERIC(10,2)
)
DISTKEY(customer_id)
SORTKEY(order_date);

-- 3. Export data from Postgres
psql -U admin -d shop -c "\copy orders TO 'orders.csv' CSV"
aws s3 cp orders.csv s3://migration-bucket/

-- 4. Load into Redshift
COPY orders FROM 's3://migration-bucket/orders.csv'
IAM_ROLE 'arn:aws:iam::123:role/redshiftCopy'
EXPLICIT_IDS  -- required to load the exported id values into the IDENTITY column
CSV;

-- 5. Validate counts
SELECT COUNT(*) FROM public.orders;


Frequently Asked Questions (FAQs)

Is downtime required?

No. Use AWS DMS for continuous replication and switch once lag is zero.

Can I migrate only a subset of tables?

Yes. Specify table mappings in DMS or export only selected tables to S3.

How large can the data be?

Redshift handles petabytes. Split exports into multiple files of ≤6.2 GB each (ideally a multiple of the cluster's slice count) so COPY can load them in parallel.
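Splitting an export into evenly sized chunks can be sketched as below. For simplicity the chunk size here is a row count; in practice you would target a byte size per compressed file:

```python
def split_rows(rows, chunk_size):
    """Yield successive chunks of rows; each chunk would become one S3 object
    so COPY can load the objects in parallel across slices."""
    for i in range(0, len(rows), chunk_size):
        yield rows[i:i + chunk_size]

rows = [[n, f"user{n}"] for n in range(10)]
chunks = list(split_rows(rows, 4))
print([len(c) for c in chunks])  # -> [4, 4, 2]
```

Write each chunk to its own CSV, GZIP it, and upload all of them under one S3 prefix; COPY with that prefix picks them all up.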
