Uses system tables, EXPLAIN, and table design checks to locate and fix slow-running Amazon Redshift queries.
Slow queries usually read too much data, require large network shuffles, or spill to disk. Check scan rows, join distribution, and temporary disk usage first.
Filter early and project only needed columns. Verify with SVL_QUERY_REPORT: rows and bytes should be small relative to the table size.
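To see where a query actually reads data, the per-step statistics in SVL_QUERY_REPORT can be aggregated like this (a sketch; replace the placeholder query ID 12345 with the ID from STL_QUERY):

```sql
-- Per-step rows and bytes for one query, largest steps first.
SELECT segment, step, label,
       SUM(rows)  AS rows,
       SUM(bytes) AS bytes
FROM svl_query_report
WHERE query = 12345          -- placeholder query ID
GROUP BY segment, step, label
ORDER BY bytes DESC;
```

Scan steps whose rows are close to the full table size indicate that filters are not being applied early enough.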
High memory (workmem) pressure pushes intermediate results to disk, slowing execution. Check the EXPLAIN plan for DS_DIST_OUTER or DS_BCAST_INNER join steps, then query SVL_QUERY_SUMMARY for steps where is_diskbased = 't'.
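A minimal check for disk-based steps, using the documented is_diskbased flag in SVL_QUERY_SUMMARY (the query ID 12345 is a placeholder):

```sql
-- Steps of one query that spilled to disk.
SELECT query, step, label, rows, workmem, is_diskbased
FROM svl_query_summary
WHERE query = 12345          -- placeholder query ID
  AND is_diskbased = 't';
```

Any rows returned mean the step's working set exceeded its memory allocation; raising the queue's memory share or reducing the data volume per step removes the spill.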
Query STL_QUERY or STL_WLM_QUERY for statements exceeding a duration threshold, and order by elapsed time to list the worst offenders (STL_WLM_QUERY exposes this as total_exec_time).
SELECT userid, query, label,
       DATEDIFF(millisecond, starttime, endtime) / 1000.0 AS sec
FROM stl_query
WHERE starttime > current_date - 1
  AND DATEDIFF(millisecond, starttime, endtime) > 60000
ORDER BY sec DESC;
Run EXPLAIN to see the planned steps. Redshift does not support EXPLAIN ANALYZE; for real execution statistics, query SVL_QUERY_SUMMARY or SVL_QUERY_REPORT after the query finishes. Focus on DS_DIST steps (data shuffles), scanned rows, and is_diskbased flags.
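For example, running EXPLAIN on a join (table names here are hypothetical) shows the distribution strategy in each join step:

```sql
EXPLAIN
SELECT s.order_id, c.name
FROM sales s                      -- hypothetical fact table
JOIN customers c                  -- hypothetical dimension table
  ON s.customer_id = c.customer_id
WHERE s.sale_date >= '2024-01-01';
-- In the plan output, DS_DIST_NONE means no redistribution is needed;
-- DS_BCAST_INNER or DS_DIST_BOTH signal expensive network shuffles.
```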
Look for expensive DS_DIST steps in the plan. If they appear, align the DISTKEY columns on both joined tables: when both sides are distributed on the join column, the join runs as DS_DIST_NONE with no redistribution.
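Redshift can change a table's distribution key in place with ALTER TABLE ... ALTER DISTKEY. A sketch, assuming hypothetical tables sales and customers joined on customer_id:

```sql
-- Distribute both sides of the join on the join column so the
-- join can run as DS_DIST_NONE (no shuffle). Table and column
-- names are hypothetical.
ALTER TABLE sales     ALTER DISTKEY customer_id;
ALTER TABLE customers ALTER DISTKEY customer_id;
```

The redistribution itself is a background data movement, so schedule it outside peak hours on large tables.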
Aggregate in subqueries, avoid SELECT *, and push predicates into CTEs. Order joins so the most selective filters hit the largest tables first.
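A before/after sketch of these rewrites, using the same hypothetical sales/customers schema:

```sql
-- Before: reads every column and filters after the join.
-- SELECT * FROM sales s
-- JOIN customers c ON s.customer_id = c.customer_id
-- WHERE c.region = 'EU';

-- After: project only needed columns and filter inside the subquery,
-- so the join input is already reduced.
SELECT s.order_id, s.amount, c.name
FROM sales s
JOIN (SELECT customer_id, name
      FROM customers
      WHERE region = 'EU') c
  ON s.customer_id = c.customer_id;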
If queries still shuffle or scan large ranges, set a SORTKEY matching the main filter column and a DISTKEY matching the most frequent join key.
Divide concurrency slots by workload type and give long-running analytics queues more memory. Monitor STL_WLM_QUERY to tune query_concurrency.
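Queue pressure per service class can be summarized from STL_WLM_QUERY (times are stored in microseconds):

```sql
-- Average queue wait vs. execution time per service class, last day.
SELECT service_class,
       COUNT(*)                           AS queries,
       AVG(total_queue_time) / 1000000.0  AS avg_queue_sec,
       AVG(total_exec_time)  / 1000000.0  AS avg_exec_sec
FROM stl_wlm_query
WHERE queue_start_time > current_date - 1
GROUP BY service_class
ORDER BY avg_queue_sec DESC;
```

A high avg_queue_sec relative to avg_exec_sec suggests the queue needs more slots, while spills point to too little memory per slot.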
Yes. Rewrite queries to filter early, remove unused columns, and break complex CTEs into temporary tables.
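Materializing a heavy CTE as a temporary table lets Redshift collect statistics on the intermediate result before later joins. A sketch with the hypothetical sales/customers schema:

```sql
-- Materialize the expensive aggregation once.
CREATE TEMP TABLE recent_sales AS
SELECT customer_id, SUM(amount) AS total
FROM sales
WHERE sale_date >= current_date - 30
GROUP BY customer_id;

-- Refresh statistics so the planner sizes later joins correctly.
ANALYZE recent_sales;

SELECT c.name, r.total
FROM recent_sales r
JOIN customers c ON c.customer_id = r.customer_id;
```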
High concurrency can queue queries and limit memory. Tune WLM slots and queue timeouts to balance throughput and latency.
Spectrum scans S3 data, so cold reads add latency. Partition external tables and push predicates to minimize scanned files.
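A partitioned external table makes predicate pushdown effective, because filters on the partition column prune whole S3 prefixes. A sketch, assuming an external schema named spectrum already exists and the bucket path is a placeholder:

```sql
-- Hypothetical external table partitioned by date.
CREATE EXTERNAL TABLE spectrum.events (
    event_id BIGINT,
    payload  VARCHAR(1000)
)
PARTITIONED BY (event_date DATE)
STORED AS PARQUET
LOCATION 's3://my-bucket/events/';   -- placeholder bucket

-- Filtering on the partition column scans only matching prefixes,
-- not the whole dataset.
SELECT COUNT(*)
FROM spectrum.events
WHERE event_date = '2024-06-01';
```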