How to Use ROW_NUMBER() in Redshift

Galaxy Glossary

How do I assign a row number in Amazon Redshift?

ROW_NUMBER() assigns a unique, sequential integer to each row within a partition ordered by a specified column list.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.

Description

What does ROW_NUMBER() do in Redshift?

ROW_NUMBER() is a window function that returns a monotonically increasing integer starting at 1 for the first row in each partition. It is useful for de-duplicating records, implementing pagination, and selecting the first or last row per group without complex subqueries.

What is the correct syntax?

Use ROW_NUMBER() OVER (PARTITION BY partition_cols ORDER BY sort_cols) AS alias. PARTITION BY is optional; ORDER BY is required to guarantee deterministic numbering.

How do I add row numbers to ordered results?

The example below numbers all orders by descending total_amount so the highest value order receives row number 1.

SELECT
id,
customer_id,
total_amount,
ROW_NUMBER() OVER (ORDER BY total_amount DESC) AS rn
FROM Orders;

How can I keep only the latest order per customer?

Partition by customer_id and order by order_date DESC, then filter for rn = 1.

WITH ranked AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS rn
FROM Orders)
SELECT *
FROM ranked
WHERE rn = 1;

How do I paginate a large result set?

Wrap the query with ROW_NUMBER(), then filter by a range of row numbers.

WITH ordered AS (
SELECT *,
ROW_NUMBER() OVER (ORDER BY created_at DESC) AS rn
FROM Customers)
SELECT *
FROM ordered
WHERE rn BETWEEN 101 AND 150;

Best practices for ROW_NUMBER()

Always include ORDER BY for deterministic results; use PARTITION BY to restart numbering per group; avoid using ROW_NUMBER() without a WHERE filter in very large tables to minimize unnecessary scans.

Common mistakes to avoid

See below for two frequent errors and their fixes.

Why How to Use ROW_NUMBER() in Redshift is important

How to Use ROW_NUMBER() in Redshift Example Usage


-- Select the first product purchased by every customer
WITH first_purchase AS (
    SELECT
        oi.order_id,
        oi.product_id,
        ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.order_date) AS rn
    FROM OrderItems oi
    JOIN Orders o ON o.id = oi.order_id)
SELECT *
FROM first_purchase
WHERE rn = 1;

How to Use ROW_NUMBER() in Redshift Syntax


ROW_NUMBER() OVER (
    [PARTITION BY partition_column [, ...]]
    ORDER BY sort_expression [ASC | DESC] [, ...]
) AS row_number_alias

Example with ecommerce tables:
SELECT
    o.id,
    o.customer_id,
    ROW_NUMBER() OVER (PARTITION BY o.customer_id ORDER BY o.order_date DESC) AS rn
FROM Orders o;

Common Mistakes

Frequently Asked Questions (FAQs)

Can I sort by multiple columns in ROW_NUMBER()?

Yes. Provide a comma-separated list in the ORDER BY clause, e.g., ORDER BY order_date DESC, id.

Does ROW_NUMBER() impact query performance?

Minimal overhead when the ORDER BY column is in sort key or distribution key. Large unsorted tables may require extra disk spill during sorting.

What is the difference between ROW_NUMBER() and RANK()?

ROW_NUMBER() returns unique integers without gaps; RANK() assigns the same rank to ties and can skip numbers after ties.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie
BauHealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.