Uses system tables, EXPLAIN, and table design checks to locate and fix slow-running Amazon Redshift queries.
Slow queries usually read too much data, require large network shuffles, or spill to disk. Check scan rows, join distribution, and temporary disk usage first.
Filter early and project only needed columns. Verify with SVL_QUERY_REPORT: rows and bytes should be small relative to the table size.
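To see where a query actually reads data, the per-step statistics in SVL_QUERY_REPORT can be aggregated like this (a sketch; replace the placeholder query ID 12345 with the ID from STL_QUERY):

```sql
-- Per-step rows and bytes for one query, largest steps first.
SELECT segment, step, label,
       SUM(rows)  AS rows,
       SUM(bytes) AS bytes
FROM svl_query_report
WHERE query = 12345          -- placeholder query ID
GROUP BY segment, step, label
ORDER BY bytes DESC;
```

Scan steps whose rows are close to the full table size indicate that filters are not being applied early enough.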
High memory (workmem) pressure pushes intermediate results to disk, slowing execution. Check the EXPLAIN plan for DS_DIST_OUTER or DS_BCAST_INNER join steps, then query SVL_QUERY_SUMMARY for steps where is_diskbased = 't'.
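A minimal check for disk-based steps, using the documented is_diskbased flag in SVL_QUERY_SUMMARY (the query ID 12345 is a placeholder):

```sql
-- Steps of one query that spilled to disk.
SELECT query, step, label, rows, workmem, is_diskbased
FROM svl_query_summary
WHERE query = 12345          -- placeholder query ID
  AND is_diskbased = 't';
```

Any rows returned mean the step's working set exceeded its memory allocation; raising the queue's memory share or reducing the data volume per step removes the spill.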
Query STL_QUERY or STL_WLM_QUERY for statements exceeding a duration threshold, and order by elapsed time to list the worst offenders (STL_WLM_QUERY exposes this as total_exec_time).
SELECT userid, query, label,
       DATEDIFF(millisecond, starttime, endtime) / 1000.0 AS sec
FROM stl_query
WHERE starttime > current_date - 1
  AND DATEDIFF(millisecond, starttime, endtime) > 60000
ORDER BY sec DESC;
Run EXPLAIN to see the planned steps. Redshift does not support EXPLAIN ANALYZE; for real execution statistics, query SVL_QUERY_SUMMARY or SVL_QUERY_REPORT after the query finishes. Focus on DS_DIST steps (data shuffles), scanned rows, and is_diskbased flags.
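For example, running EXPLAIN on a join (table names here are hypothetical) shows the distribution strategy in each join step:

```sql
EXPLAIN
SELECT s.order_id, c.name
FROM sales s                      -- hypothetical fact table
JOIN customers c                  -- hypothetical dimension table
  ON s.customer_id = c.customer_id
WHERE s.sale_date >= '2024-01-01';
-- In the plan output, DS_DIST_NONE means no redistribution is needed;
-- DS_BCAST_INNER or DS_DIST_BOTH signal expensive network shuffles.
```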
Look for expensive DS_DIST steps in the plan. If they appear, align the DISTKEY columns on both joined tables: when both sides are distributed on the join column, the join runs as DS_DIST_NONE with no redistribution.
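Redshift can change a table's distribution key in place with ALTER TABLE ... ALTER DISTKEY. A sketch, assuming hypothetical tables sales and customers joined on customer_id:

```sql
-- Distribute both sides of the join on the join column so the
-- join can run as DS_DIST_NONE (no shuffle). Table and column
-- names are hypothetical.
ALTER TABLE sales     ALTER DISTKEY customer_id;
ALTER TABLE customers ALTER DISTKEY customer_id;
```

The redistribution itself is a background data movement, so schedule it outside peak hours on large tables.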
Aggregate in subqueries, avoid SELECT *, and push predicates into CTEs. Order joins so the most selective filters hit the largest tables first.
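A before/after sketch of these rewrites, using the same hypothetical sales/customers schema:

```sql
-- Before: reads every column and filters after the join.
-- SELECT * FROM sales s
-- JOIN customers c ON s.customer_id = c.customer_id
-- WHERE c.region = 'EU';

-- After: project only needed columns and filter inside the subquery,
-- so the join input is already reduced.
SELECT s.order_id, s.amount, c.name
FROM sales s
JOIN (SELECT customer_id, name
      FROM customers
      WHERE region = 'EU') c
  ON s.customer_id = c.customer_id;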
If queries still shuffle or scan large ranges, set a SORTKEY matching the main filter column and a DISTKEY matching the most frequent join key.
Divide concurrency slots by workload type and give long-running analytics queues more memory. Monitor STL_WLM_QUERY to tune query_concurrency.
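Queue pressure per service class can be summarized from STL_WLM_QUERY (times are stored in microseconds):

```sql
-- Average queue wait vs. execution time per service class, last day.
SELECT service_class,
       COUNT(*)                           AS queries,
       AVG(total_queue_time) / 1000000.0  AS avg_queue_sec,
       AVG(total_exec_time)  / 1000000.0  AS avg_exec_sec
FROM stl_wlm_query
WHERE queue_start_time > current_date - 1
GROUP BY service_class
ORDER BY avg_queue_sec DESC;
```

A high avg_queue_sec relative to avg_exec_sec suggests the queue needs more slots, while spills point to too little memory per slot.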
Yes. Rewrite queries to filter early, remove unused columns, and break complex CTEs into temporary tables.
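Materializing a heavy CTE as a temporary table lets Redshift collect statistics on the intermediate result before later joins. A sketch with the hypothetical sales/customers schema:

```sql
-- Materialize the expensive aggregation once.
CREATE TEMP TABLE recent_sales AS
SELECT customer_id, SUM(amount) AS total
FROM sales
WHERE sale_date >= current_date - 30
GROUP BY customer_id;

-- Refresh statistics so the planner sizes later joins correctly.
ANALYZE recent_sales;

SELECT c.name, r.total
FROM recent_sales r
JOIN customers c ON c.customer_id = r.customer_id;
```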
High concurrency can queue queries and limit memory. Tune WLM slots and queue timeouts to balance throughput and latency.
Spectrum scans S3 data, so cold reads add latency. Partition external tables and push predicates to minimize scanned files.
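A partitioned external table makes predicate pushdown effective, because filters on the partition column prune whole S3 prefixes. A sketch, assuming an external schema named spectrum already exists and the bucket path is a placeholder:

```sql
-- Hypothetical external table partitioned by date.
CREATE EXTERNAL TABLE spectrum.events (
    event_id BIGINT,
    payload  VARCHAR(1000)
)
PARTITIONED BY (event_date DATE)
STORED AS PARQUET
LOCATION 's3://my-bucket/events/';   -- placeholder bucket

-- Filtering on the partition column scans only matching prefixes,
-- not the whole dataset.
SELECT COUNT(*)
FROM spectrum.events
WHERE event_date = '2024-06-01';
```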