How to Debug Queries in Redshift

Galaxy Glossary

How do you debug slow queries in Amazon Redshift?

Use EXPLAIN, ANALYZE, system tables, and WLM logs to find slow steps and fix Amazon Redshift queries.

Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)

Oops! Something went wrong while submitting the form.

Description

Example H2

Example H3

Why is my Redshift query slow?

High scan volume, skewed data distribution, missing sort keys, or queue waits usually cause latency. Measuring each factor pinpoints the fix.

Which tools expose query performance?

Combine EXPLAIN, EXPLAIN ANALYZE, system views (SVL_*, STL_*), and the Workload Management (WLM) queue log. Together they reveal scan size, join type, CPU use, and slot contention.

How do I read EXPLAIN output?

Look for sequential scans, DS_DIST_*

steps, and high estimated row counts. These signals show unsorted tables, distribution key fights, or missing predicates.

How do I profile a live query?

Run EXPLAIN ANALYZE. Redshift executes the statement, time-stamps every step, then returns actual row and byte counts. Compare with the estimated plan to locate misestimates.

Which system tables are most useful?

SVL_QUERY_REPORT (overall cost & bytes), SVL_QUERY_SUMMARY (per-step time), STL_WLM_QUERY (queue waits), and SVL_ALERT_EVENT_LOG (disk spills) surface bottlenecks without rerunning the query.

Quick checklist before optimizing

1) Verify predicates use SORTKEY columns. 2) Ensure joins use matching DISTKEYs. 3) Vacuum & analyze tables. 4) Move heavy ad-hoc work to a lower-priority WLM queue.

How do I debug with an ecommerce example?

Suppose Orders has a DISTKEY on customer_id but OrderItems is EVEN. Joining them forces data redistribution. EXPLAIN will display DS_DIST_BOTH. Make OrderItems.customer_id a DISTKEY to eliminate the shuffle.

Best practices for faster turnaround

Capture the query and pid from stl_query, monitor with stv_recents, and abort long-running tests quickly using CANCEL pid.

Why How to Debug Queries in Redshift is important

How to Debug Queries in Redshift Example Usage


-- Find steps causing disk spills for a slow monthly revenue report
EXPLAIN ANALYZE
SELECT c.name, SUM(o.total_amount) AS revenue
FROM Customers c
JOIN Orders o ON o.customer_id = c.id
WHERE o.order_date >= date_trunc('month', current_date)
GROUP BY c.name
ORDER BY revenue DESC;

How to Debug Queries in Redshift Syntax


EXPLAIN [ANALYZE] [VERBOSE] sql_statement;
-- Example
EXPLAIN ANALYZE
SELECT p.name, SUM(oi.quantity) AS units
FROM Orders o
JOIN OrderItems oi ON oi.order_id = o.id
JOIN Products p ON p.id = oi.product_id
WHERE o.order_date >= date_trunc('month', CURRENT_DATE)
GROUP BY p.name
ORDER BY units DESC;

SELECT * FROM SVL_QUERY_REPORT WHERE query = 2456; -- 2456 = stl_query.query id

SELECT service_class, total_queue_time
FROM STL_WLM_QUERY
WHERE query = 2456;

Common Mistakes

Ignoring data skew: Users set the same DISTKEY for huge tables but leave small dimension tables EVEN. This forces massive data movement. Fix by co-locating join keys or choosing ALL distribution for tiny lookups.
Relying only on EXPLAIN: EXPLAIN shows estimates, not reality. Without ANALYZE or system tables you miss spills and queue waits. Always pair EXPLAIN with SVL/STL views for accurate profiling.