How to Diagnose Slow Queries in Redshift

Galaxy Glossary

How do I troubleshoot slow queries in Amazon Redshift?

Uses system tables, EXPLAIN, and table design checks to locate and fix slow-running Amazon Redshift queries.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.

Description

Table of Contents

What makes a Redshift query slow?

Slow queries usually read too much data, require large network shuffles, or spill to disk. Check scan rows, join distribution, and temporary disk usage first.

Does the query scan more data than needed?

Filter early and project only needed columns. Verify with SVL_QUERY_REPORT: rows and bytes should be small relative to the table size.

Is the query spilling to disk?

High workmem usage pushes data to disk, slowing execution.In EXPLAIN ANALYZE look for "DS_DIST_OUTER" or "DS_BCAST_INNER" steps with Spill=true.

How do I find slow queries quickly?

Query STL_QUERY or STL_WLM_QUERY for statements exceeding a duration threshold. Order by total_exec_time to list the worst offenders.

SELECT userid, query, label, total_exec_time/1000 AS sec
FROM stl_query
WHERE starttime > current_date - 1
AND total_exec_time > 60000
ORDER BY total_exec_time DESC;

How do I read the EXPLAIN plan?

Run EXPLAIN to see logical steps and EXPLAIN ANALYZE for real execution stats.Focus on DS_DIST (data shuffles), SCAN rows, and Spill fields.

Identify distribution and sort issues

Look for large DS_DIST steps. If they appear, align DISTKEY columns on both joined tables. Verify join columns use the same DISTKEY.

How can I rewrite queries for speed?

Aggregate in subqueries, avoid SELECT *, and push predicates into CTEs.Use JOIN order that filters largest tables first.

When should I change table design?

If queries still shuffle or scan large ranges, set a proper SORTKEY matching the main filter column and a DISTKEY matching the most frequent join key.

What are safe WLM settings?

Divide concurrency slots by workload type and give long-running analytics queues more memory. Monitor STL_WLM_QUERY to tune query_concurrency.

.

Why How to Diagnose Slow Queries in Redshift is important

How to Diagnose Slow Queries in Redshift Example Usage


-- List slowest ecommerce queries today
SELECT q.userid,
       q.query,
       q.total_exec_time/1000 AS sec,
       t.text AS sql_text
FROM   stl_query     q
JOIN   stl_querytext t USING (userid, query)
WHERE  q.starttime::date = current_date
ORDER  BY sec DESC
LIMIT  5;

How to Diagnose Slow Queries in Redshift Syntax


-- Check slow queries in the last 24h
SELECT userid, query, label, total_exec_time/1000 AS seconds
FROM stl_query
WHERE starttime > current_date - 1
  AND total_exec_time > 60000
ORDER BY total_exec_time DESC;

-- View execution plan
EXPLAIN ANALYZE
SELECT c.name, SUM(oi.quantity*p.price) AS lifetime_value
FROM Customers c
JOIN Orders o  ON o.customer_id = c.id
JOIN OrderItems oi ON oi.order_id   = o.id
JOIN Products p   ON p.id           = oi.product_id
WHERE o.order_date >= date_trunc('year', current_date)
GROUP BY c.name
ORDER BY lifetime_value DESC;

-- Change table to minimize shuffle
ALTER TABLE Orders ALTER DISTKEY customer_id;
ALTER TABLE Orders ALTER SORTKEY(order_date);

Common Mistakes

Frequently Asked Questions (FAQs)

Can I speed up queries without changing schema?

Yes. Rewrite queries to filter early, remove unused columns, and break complex CTEs into temporary tables.

Does concurrency affect performance?

High concurrency can queue queries and limit memory. Tune WLM slots and queue timeouts to balance throughput and latency.

Is Redshift Spectrum slower?

Spectrum scans S3 data, so cold reads add latency. Partition external tables and push predicates to minimize scanned files.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.