How to Use Subqueries in Redshift

How do I write a subquery in Amazon Redshift?

A Redshift subquery is a SELECT statement nested inside another query to filter, compute, or supply data.

Description

What is a subquery in Redshift?

A subquery is a SELECT placed inside another query’s SELECT, FROM, or WHERE clause to provide a value set, single value, or virtual table the outer query can use.

When should I use a subquery instead of a JOIN?

Use a subquery when the nested logic must be evaluated row by row, when a value has to be aggregated before the outer query can filter on it, or when replacing a complex chain of JOINs makes the query easier to read.
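As a minimal sketch of the "aggregation before filtering" case, the query below assumes the same Orders table (with id, customer_id, and total_amount columns) used in the syntax examples later in this guide. The average has to be computed over the whole table before any single row can be compared against it, which a plain JOIN does not express as directly.

```sql
-- Orders whose amount is above the average order amount.
-- The inner SELECT runs once and returns a single value (a scalar subquery).
SELECT o.id,
       o.customer_id,
       o.total_amount
FROM Orders o
WHERE o.total_amount > (SELECT AVG(total_amount)
                        FROM Orders);
```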

What is the basic syntax?

Wrap the inner SELECT in parentheses. The outer query can treat the result as a table, list, or scalar value depending on context.

Scalar subquery example

Return each customer with total orders computed inline.

IN/EXISTS subquery example

Use EXISTS to keep only the customers who bought a product that is now out of stock.

FROM-clause subquery example

Create an inline view that aggregates order revenue, then join to Customers for reporting.

How do I avoid performance issues?

Rewrite correlated subqueries as JOINs when possible, ensure inner queries have selective predicates, and avoid returning unnecessary columns.
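One possible rewrite, sketched against the Customers and Orders tables used in the syntax examples of this guide: the correlated scalar subquery that counts orders per customer can be replaced by a LEFT JOIN to a pre-aggregated inline view. The alias oc is only illustrative.

```sql
-- JOIN-based equivalent of the correlated order_count subquery.
-- Orders is aggregated once, then joined, instead of being probed per customer.
SELECT c.id,
       c.name,
       COALESCE(oc.order_count, 0) AS order_count  -- customers without orders get 0
FROM Customers c
LEFT JOIN (
  SELECT customer_id, COUNT(*) AS order_count
  FROM Orders
  GROUP BY customer_id) oc
ON oc.customer_id = c.id;
```

The query planner may already handle some correlated subqueries well, so it is worth comparing both forms with EXPLAIN before committing to a rewrite.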

Best practices for subqueries

1. Keep subqueries small, and filter or join on distribution and sort key columns (Redshift has no conventional indexes).
2. Add LIMIT while testing.
3. Materialize heavy subqueries into temporary tables for reuse.
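A rough sketch of materializing a heavy subquery, assuming the same Orders and Customers tables as the other examples. The table name customer_revenue is hypothetical, and the threshold of 500 simply mirrors the value used in the EXISTS syntax example.

```sql
-- Compute the aggregation once and store it for the rest of the session.
CREATE TEMP TABLE customer_revenue AS
SELECT customer_id, SUM(total_amount) AS total_amount
FROM Orders
GROUP BY customer_id;

-- Reuse 1: revenue report per customer.
SELECT c.name, r.total_amount
FROM Customers c
JOIN customer_revenue r ON r.customer_id = c.id;

-- Reuse 2: count of customers above a revenue threshold.
SELECT COUNT(*) AS high_value_customers
FROM customer_revenue
WHERE total_amount > 500;
```

The temporary table is dropped automatically when the session ends, so it suits ad hoc analysis more than shared reporting.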

Example Usage


-- List customers with at least one order containing an out-of-stock product
SELECT DISTINCT c.id, c.name, c.email
FROM Customers c
WHERE EXISTS (
  SELECT 1
  FROM Orders o
  JOIN OrderItems oi ON oi.order_id = o.id
  JOIN Products p  ON p.id = oi.product_id
  WHERE o.customer_id = c.id
    AND p.stock = 0);

Syntax


-- Scalar subquery in SELECT
SELECT c.id,
       c.name,
       (SELECT COUNT(*)
        FROM Orders o
        WHERE o.customer_id = c.id) AS order_count
FROM Customers c;

-- Subquery in WHERE using IN
SELECT *
FROM Products p
WHERE p.id IN (SELECT product_id
               FROM OrderItems
               WHERE quantity > 3);

-- EXISTS correlated subquery
SELECT c.*
FROM Customers c
WHERE EXISTS (
  SELECT 1
  FROM Orders o
  WHERE o.customer_id = c.id
    AND o.total_amount > 500);

-- Subquery in FROM (inline view)
SELECT c.name, rev.total_amount
FROM Customers c
JOIN (
  SELECT customer_id, SUM(total_amount) AS total_amount
  FROM Orders
  GROUP BY customer_id) rev
ON rev.customer_id = c.id;

Common Mistakes

Forgetting parentheses around the subquery. Redshift requires the inner SELECT to be fully enclosed; omitting them results in a syntax error. Always wrap the subquery.
Returning multiple columns or rows in a scalar context. A scalar subquery must yield exactly one column and at most one row; when it returns no rows, the result is NULL. Fix it by selecting a single aggregated value, or switch to IN or EXISTS when several rows can match.
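The second mistake can be illustrated with a short sketch that reuses the Customers and Orders tables from the syntax section. The failing form is left commented out so that the block runs as written, and the alias largest_order is only illustrative.

```sql
-- Problem: two columns (and possibly many rows) in a scalar context.
-- SELECT c.id,
--        (SELECT o.id, o.total_amount
--         FROM Orders o
--         WHERE o.customer_id = c.id) AS some_order
-- FROM Customers c;

-- Fix: the aggregate guarantees one column and one row per customer.
SELECT c.id,
       (SELECT MAX(o.total_amount)
        FROM Orders o
        WHERE o.customer_id = c.id) AS largest_order  -- NULL when the customer has no orders
FROM Customers c;
```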