ROW_NUMBER() assigns a consecutive integer to each row within a specified partition and order, enabling easy ranking and de-duplication.
ROW_NUMBER() returns a sequential number starting at 1 for every row in each partition. The numbering order is controlled by an ORDER BY clause inside the analytic window. Because it is deterministic, you can quickly rank, paginate, or filter duplicate records.
The analytic function must be paired with an OVER() clause. You may optionally partition the data and must specify an order for numbering.
ROW_NUMBER() OVER ( [PARTITION BY partition_expression] ORDER BY sort_expression [ASC|DESC] )
Use PARTITION BY customer_id to restart the count for each customer, then ORDER BY the desired column, such as order_date, to define the sequence.
SELECT
customer_id,
order_id,
order_date,
total_amount,
ROW_NUMBER() OVER (
PARTITION BY customer_id
ORDER BY order_date DESC
) AS order_rank
FROM `project.dataset.Orders`;
Apply ROW_NUMBER() to keep the latest record in slowly changing tables, implement result pagination, remove duplicates while preserving a specific record, or assign serial numbers in reports.
Always include ORDER BY to guarantee deterministic results. Use PARTITION BY only when you need independent sequences. To keep one record per group, wrap the query in a CTE and filter for row_number = 1.
Without ORDER BY, BigQuery raises an error because the function cannot know how to sequence rows. Always supply at least one sort column, ideally with ties resolved by a unique field like id.
When de-duplicating, developers sometimes filter WHERE row_num > 1 to delete duplicates but forget to invert the condition when selecting the survivor rows. Double-check the comparison operator.
Yes. If you omit PARTITION BY, BigQuery numbers the entire result set as one partition.
Wrap the ranking query in a subquery and filter WHERE order_rank = 1. This keeps the most recent order for every customer.
ROW_NUMBER() itself is lightweight, but the ORDER BY inside the window requires sorting.Ensure the sort keys have low cardinality or are pre-clustered for better performance.
.
No. Counting starts at 1 for the first row in each partition.
ROW_NUMBER() increments by 1 with no gaps, while RANK() skips numbers when ties occur.
You cannot use it directly in the same SELECT level; wrap the query in a subquery or CTE and then filter.