Redshift augments PostgreSQL with columnar, massively parallel processing for faster, petabyte-scale analytics.
Redshift removes performance ceilings you hit in vanilla PostgreSQL when tables grow into billions of rows. Columnar storage, zone maps, and parallel execution slash scan time for aggregations and joins, letting you keep historical ecommerce data online instead of archiving it.
Choose Redshift for read-heavy BI dashboards, funnel analysis, customer segmentation, ad-hoc analytics, ML feature generation, and ELT pipelines that crunch large fact tables. Keep OLTP workloads (high-velocity inserts/updates) on PostgreSQL.
Stage files in S3, then run the COPY command. Define DISTKEY and SORTKEY on high-cardinality, frequently filtered columns (e.g., Orders.order_date) to cut query time. Schedule ANALYZE and VACUUM to keep statistics fresh and blocks sorted.
COPY Orders FROM 's3://shop-data/orders/' IAM_ROLE 'arn:aws:iam::123:role/rs' CSV GZIP TIMEFORMAT 'auto'; -- the S3 path is a key prefix; COPY does not expand wildcards
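For reference, a table definition that puts this advice into practice might look like the following sketch; the Orders columns are illustrative assumptions, not a schema given in this article:

-- Hypothetical Orders fact table: distribute on the common join key,
-- sort on the date column used in most range filters.
CREATE TABLE Orders (
    order_id    BIGINT NOT NULL,
    customer_id BIGINT NOT NULL,
    order_date  DATE   NOT NULL,
    total_cents BIGINT NOT NULL
)
DISTKEY (customer_id)
SORTKEY (order_date);

-- Refresh statistics and re-sort blocks after large loads.
ANALYZE Orders;
VACUUM SORT ONLY Orders;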
Use column pruning (SELECT only the columns you need), push filters as early as possible, and pre-aggregate with materialized views or CREATE TABLE AS SELECT (CTAS); see the sketch below. Partition by date in combination with the SORTKEY so zone maps can skip blocks during scans.
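As one illustration of pre-aggregation (the rollup name and columns are hypothetical), a materialized view turns a repeated full-table aggregation into a cheap lookup:

-- Daily revenue rollup; dashboards query this instead of the fact table.
CREATE MATERIALIZED VIEW daily_revenue AS
SELECT order_date,
       COUNT(*)         AS orders,
       SUM(total_cents) AS revenue_cents
FROM Orders
GROUP BY order_date;

REFRESH MATERIALIZED VIEW daily_revenue;
SELECT * FROM daily_revenue WHERE order_date >= '2024-01-01';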
1) Keep COPY files 100 MB–1 GB for even slice distribution.
2) Use UNLOAD to share query results with downstream tools (see the sketch after this list).
3) Separate hot and cold data via Redshift Spectrum or multiple clusters if concurrency spikes.
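A sketch of the UNLOAD pattern from item 2, with placeholder bucket, role, and query:

-- Export results as Parquet so Athena, Spark, or pandas can read them directly.
UNLOAD ('SELECT order_date, SUM(total_cents) FROM Orders GROUP BY order_date')
TO 's3://shop-data/exports/daily_revenue_'
IAM_ROLE 'arn:aws:iam::123:role/rs'
FORMAT AS PARQUET;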
Avoid treating Redshift like an OLTP system: bulk batch writes beat single-row inserts. Never ignore SORTKEYs; a poor choice forces full-table scans.
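A common way to batch writes is the staging-table merge sketched below; the staging name and merge key are assumptions:

-- Load the batch into a staging table, then merge in one set-based pass
-- instead of thousands of single-row INSERT statements.
CREATE TEMP TABLE orders_stage (LIKE Orders);

COPY orders_stage FROM 's3://shop-data/orders/incremental/'
IAM_ROLE 'arn:aws:iam::123:role/rs' CSV GZIP;

BEGIN;
DELETE FROM Orders USING orders_stage
WHERE Orders.order_id = orders_stage.order_id;
INSERT INTO Orders SELECT * FROM orders_stage;
COMMIT;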
Redshift started from PostgreSQL 8.0, so most SQL works, but it lacks features such as value-offset RANGE window frames, table inheritance, and PostgreSQL extensions.
Teams typically migrate when fact tables exceed hundreds of millions of rows or queries take minutes even after index tuning.
Yes: with Amazon Kinesis or MSK feeding staged micro-batches, you can achieve sub-minute latency. For sub-second needs, stay on PostgreSQL or use Aurora Serverless.
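For the Kinesis route, Redshift's streaming ingestion can expose a stream as an auto-refreshing materialized view; the schema, role, and stream name here are assumptions:

-- Map a Kinesis stream into Redshift, then materialize it with auto refresh.
CREATE EXTERNAL SCHEMA kds FROM KINESIS
IAM_ROLE 'arn:aws:iam::123:role/rs-streaming';

CREATE MATERIALIZED VIEW live_orders AUTO REFRESH YES AS
SELECT approximate_arrival_timestamp,
       JSON_PARSE(kinesis_data) AS payload
FROM kds."orders-stream";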