How to Use ROW_NUMBER() in BigQuery

How do I assign sequential row numbers in BigQuery?

ROW_NUMBER() assigns a consecutive integer to each row within a specified partition and order, enabling easy ranking and de-duplication.

Welcome to the Galaxy, Guardian!

Oops! Something went wrong while submitting the form.

Description

What does ROW_NUMBER() do in BigQuery?

ROW_NUMBER() returns a sequential number starting at 1 for every row in each partition. The numbering order is controlled by an ORDER BY clause inside the analytic window. Because it is deterministic, you can quickly rank, paginate, or filter duplicate records.

What is the syntax for ROW_NUMBER()?

The analytic function must be paired with an OVER() clause. You may optionally partition the data and must specify an order for numbering.

ROW_NUMBER() OVER ( [PARTITION BY partition_expression] ORDER BY sort_expression [ASC|DESC] )

How do I add a row number per customer?

Use PARTITION BY customer_id to restart the count for each customer, then ORDER BY the desired column, such as order_date, to define the sequence.

Example — Rank orders per customer

SELECT customer_id, order_id, order_date, total_amount, ROW_NUMBER() OVER ( PARTITION BY customer_id ORDER BY order_date DESC ) AS order_rank FROM `project.dataset.Orders`;

When should I use ROW_NUMBER()?

Apply ROW_NUMBER() to keep the latest record in slowly changing tables, implement result pagination, remove duplicates while preserving a specific record, or assign serial numbers in reports.

Best practices for ROW_NUMBER()

Always include ORDER BY to guarantee deterministic results. Use PARTITION BY only when you need independent sequences. To keep one record per group, wrap the query in a CTE and filter for row_number = 1.

Common mistakes to avoid

Omitting ORDER BY

Without ORDER BY, BigQuery raises an error because the function cannot know how to sequence rows. Always supply at least one sort column, ideally with ties resolved by a unique field like id.

Filtering with > 1 instead of = 1

When de-duplicating, developers sometimes filter WHERE row_num > 1 to delete duplicates but forget to invert the condition when selecting the survivor rows. Double-check the comparison operator.

Frequently asked questions

Can I use ROW_NUMBER() without PARTITION BY?

Yes. If you omit PARTITION BY, BigQuery numbers the entire result set as one partition.

How do I keep only the latest order per customer?

Wrap the ranking query in a subquery and filter WHERE order_rank = 1. This keeps the most recent order for every customer.

Does ROW_NUMBER() slow large queries?

ROW_NUMBER() itself is lightweight, but the ORDER BY inside the window requires sorting.Ensure the sort keys have low cardinality or are pre-clustered for better performance.

Why How to Use ROW_NUMBER() in BigQuery is important

How to Use ROW_NUMBER() in BigQuery Example Usage


-- Keep only the most expensive product per order
WITH ranked AS (
  SELECT
    oi.order_id,
    oi.product_id,
    p.name,
    p.price,
    ROW_NUMBER() OVER (
      PARTITION BY oi.order_id
      ORDER BY p.price DESC
    ) AS price_rank
  FROM `project.dataset.OrderItems` oi
  JOIN `project.dataset.Products` p
    ON p.id = oi.product_id
)
SELECT *
FROM ranked
WHERE price_rank = 1;

How to Use ROW_NUMBER() in BigQuery Syntax


ROW_NUMBER() OVER ( [PARTITION BY partition_expression] ORDER BY sort_expression [ASC|DESC] )

Example with ecommerce tables:

SELECT
  id,
  name,
  ROW_NUMBER() OVER (ORDER BY created_at) AS signup_position
FROM `project.dataset.Customers`;

SELECT
  order_id,
  customer_id,
  ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY order_date DESC) AS order_rank
FROM `project.dataset.Orders`;

Common Mistakes

Forgetting ORDER BY – ROW_NUMBER() requires ORDER BY inside OVER(); omitting it causes an error. Fix by adding a deterministic sort column.
Using non-unique ORDER BY columns, leading to unpredictable ties. Fix by appending a unique field like id to the sort list.