Amazon Redshift is a PostgreSQL-compatible, cloud data-warehouse that you connect to and query with standard SQL.
Use a PostgreSQL connection string that points to the Redshift endpoint, port 5439, and your database name.
psql "host=example-cluster.abc123.us-east-1.redshift.amazonaws.com \
port=5439 dbname=dev user=reporting password=••••"
Keep the connection user in the public
schema read-only unless you need DDL rights. Store credentials in a secrets manager and pass them as environment variables for CI jobs.
Redshift’s COPY
command ingests files in parallel, making it the fastest way to populate fact tables.
COPY Orders
FROM 's3://ecom-data/2024/06/orders.csv'
CREDENTIALS 'aws_iam_role=arn:aws:iam::123456789012:role/redshiftCopyRole'
FORMAT AS CSV IGNOREHEADER 1 REGION 'us-east-1';
Grant the cluster IAM role s3:GetObject
on the bucket. Match file column order to the table to avoid load errors.
Because Redshift speaks PostgreSQL, you write the same joins you already know.
SELECT c.name,
COUNT(o.id) AS orders,
SUM(o.total_amount) AS lifetime_value
FROM Customers c
JOIN Orders o ON o.customer_id = c.id
GROUP BY c.name
ORDER BY lifetime_value DESC
LIMIT 20;
Use DISTKEY
and SORTKEY
on customer_id
to minimize data movement for this pattern.
DISTSTYLE KEY
to large fact tables on the most common join column.ENCODE AUTO
or specific encodings found via ANALYZE COMPRESSION
.VACUUM
and ANALYZE
after heavy loads to reclaim space and update statistics.Redshift omits some features (e.g., ALTER TABLE … ADD CONSTRAINT
foreign keys) and adds warehouse-specific ones like COPY
, UNLOAD
, and spectrum external tables. Always test queries for compatibility before porting code.
Yes. Redshift is wire-compatible with PostgreSQL 8.0.2, so any standard PostgreSQL client works as long as you point to the cluster endpoint on port 5439.
Use the UNLOAD
command:
UNLOAD ('SELECT * FROM Orders')
TO 's3://ecom-data/unloads/orders_'
IAM_ROLE 'arn:aws:iam::123456789012:role/redshiftUnloadRole'
FORMAT AS PARQUET;
Serverless removes cluster sizing; you pay for RPU-seconds. SQL syntax stays the same, but connection strings use <workgroup>.<region>.redshift-serverless.amazonaws.com
.