Denormalizing data in Redshift flattens relational tables into wider, query-ready tables or materialized views to speed up analytics.
Denormalization reduces joins, cutting cross-node data movement and CPU usage. Fewer joins mean faster dashboards and cheaper clusters.
CREATE TABLE AS
(CTAS) writes the result of a SELECT into a new table, letting you set distribution style and sort keys in one step.
1. Draft a SELECT that joins normalized tables.
2. Choose a DISTKEY
matching frequent filters.
3.Add SORTKEY
s that support range scans.
4. Run CTAS to create the flattened table.
5. Schedule refresh via orchestration.
Materialized views keep data fresh automatically. Use them when your workload tolerates their refresh lag and immutable key/sort design.
Refresh after every ETL batch or on a cron aligned with SLA.Large tables benefit from dropping and re-running CTAS with ALTER TABLE APPEND
to avoid vacuum overhead.
• Align DISTKEY
with join columns.
• Keep SORTKEY
s narrow.
• Vacuum after big inserts.
• Use ENCODE AUTO
for compression.
• Tag tables for automated housekeeping.
Skipping distribution keys causes random distribution and full scans—always set a DISTKEY
. Relying on row-by-row INSERT
bloats storage—prefer CTAS or COPY
+ ALTER TABLE APPEND
.
.
Yes. CTAS writes data in bulk, applies compression automatically, and skips logging, giving significant speedups over INSERT SELECT.
No. You must drop and recreate the view with new keys.
Redshift supports up to 16 PB compressed, but aim to keep flattened tables narrow and column-encoded to control storage costs.