How to Use ROW_NUMBER() in Redshift

How do I assign a row number in Amazon Redshift?

ROW_NUMBER() assigns a unique, sequential integer to each row within a partition ordered by a specified column list.

Welcome to the Galaxy, Guardian!

Oops! Something went wrong while submitting the form.

Description

What does ROW_NUMBER() do in Redshift?

ROW_NUMBER() is a window function that returns a monotonically increasing integer starting at 1 for the first row in each partition. It is useful for de-duplicating records, implementing pagination, and selecting the first or last row per group without complex subqueries.

What is the correct syntax?

Use ROW_NUMBER() OVER (PARTITION BY partition_cols ORDER BY sort_cols) AS alias. PARTITION BY is optional; ORDER BY is required to guarantee deterministic numbering.

How do I add row numbers to ordered results?

The example below numbers all orders by descending total_amount so the highest value order receives row number 1.

SELECT id, customer_id, total_amount, ROW_NUMBER() OVER (ORDER BY total_amount DESC) AS rn FROM Orders;

How can I keep only the latest order per customer?

Partition by customer_id and order by order_date DESC, then filter for rn = 1.

WITH ranked AS ( SELECT *, ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn FROM Orders) SELECT * FROM ranked WHERE rn = 1;

How do I paginate a large result set?

Wrap the query with ROW_NUMBER(), then filter by a range of row numbers.

WITH ordered AS ( SELECT *, ROW_NUMBER() OVER (ORDER BY created_at DESC) AS rn FROM Customers) SELECT * FROM ordered WHERE rn BETWEEN 101 AND 150;

Best practices for ROW_NUMBER()

Always include ORDER BY for deterministic results; use PARTITION BY to restart numbering per group; avoid using ROW_NUMBER() without a WHERE filter in very large tables to minimize unnecessary scans.

Common mistakes to avoid

See below for two frequent errors and their fixes.

Why How to Use ROW_NUMBER() in Redshift is important

How to Use ROW_NUMBER() in Redshift Example Usage


-- Select the first product purchased by every customer
WITH first_purchase AS (
    SELECT
        oi.order_id,
        oi.product_id,
        ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.order_date) AS rn
    FROM OrderItems oi
    JOIN Orders o ON o.id = oi.order_id)
SELECT *
FROM first_purchase
WHERE rn = 1;

How to Use ROW_NUMBER() in Redshift Syntax


ROW_NUMBER() OVER (
    [PARTITION BY partition_column [, ...]]
    ORDER BY sort_expression [ASC | DESC] [, ...]
) AS row_number_alias

Example with ecommerce tables:
SELECT
    o.id,
    o.customer_id,
    ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.order_date DESC) AS rn
FROM Orders o;

Common Mistakes

Forgetting ORDER BY inside OVER(): Without ORDER BY, Redshift may assign row numbers nondeterministically. Always specify a clear sorting column list to guarantee consistent results.
Filtering on row_number alias in the same SELECT: You can’t reference the alias in the WHERE clause of the same query block. Wrap the query in a CTE or subquery, then apply the filter.