How to GROUP BY in BigQuery

How do I use GROUP BY in BigQuery to aggregate data?

GROUP BY in BigQuery aggregates rows that share column values so you can apply aggregate functions like COUNT, SUM, or AVG.

Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)

Oops! Something went wrong while submitting the form.

Description

Example H2

Example H3

Why use GROUP BY in BigQuery?

Group related rows to calculate totals, averages, or counts in one scan, reducing data processed and query cost.

How do I write a basic GROUP BY query?

Place non-aggregated columns in the GROUP BY clause and use aggregate functions on the remaining columns.

Can I GROUP BY multiple columns?

Yes. List every dimension column after GROUP BY or use positional references like 1,2,3 that map to the SELECT list.

How do I filter aggregated results?

Use HAVING after GROUP BY to keep or discard groups based on aggregate conditions, e.g., HAVING SUM(total_amount) > 1000.

What are best practices?

Project only needed columns, use table aliases, avoid SELECT *, and filter early with WHERE to minimize scanned data.

When should I use GROUP EACH BY?

Legacy SQL only. In Standard SQL, BigQuery handles large cardinality automatically; simply use GROUP BY.

Example: daily revenue per product

This query sums order totals by product and date, demonstrating multiple dimensions and HAVING.

SELECT p.name, DATE(o.order_date) AS order_day, SUM(oi.quantity * p.price) AS revenue FROM `shop.Orders` o JOIN `shop.OrderItems` oi ON oi.order_id = o.id JOIN `shop.Products` p ON p.id = oi.product_id GROUP BY p.name, order_day HAVING revenue > 500 ORDER BY revenue DESC;

FAQ: Is SELECT column mandatory in GROUP BY?

Yes. Every non-aggregated column in SELECT must appear in GROUP BY or the query errors.

FAQ: Can I alias aggregate columns?

Absolutely. Use AS to create readable names—e.g., SUM(total_amount) AS total_revenue.

Why How to GROUP BY in BigQuery is important

How to GROUP BY in BigQuery Example Usage


-- Average order value per customer, excluding low spenders
SELECT c.id,
       c.name,
       AVG(o.total_amount) AS avg_order_value
FROM   Customers c
JOIN   Orders o ON o.customer_id = c.id
GROUP  BY c.id, c.name
HAVING AVG(o.total_amount) > 50
ORDER  BY avg_order_value DESC;

How to GROUP BY in BigQuery Syntax


SELECT [dimension_columns],
       aggregate_function(measure_column) AS alias
FROM   dataset.table
[WHERE  condition]
GROUP  BY dimension_columns
[HAVING aggregate_condition]
[ORDER  BY 1,2 ...]

-- Ecommerce example aggregating orders by customer
SELECT  c.id,
        c.name,
        COUNT(o.id)            AS order_count,
        SUM(o.total_amount)    AS lifetime_value
FROM    Customers c
LEFT JOIN Orders o ON o.customer_id = c.id
GROUP  BY c.id, c.name
ORDER  BY lifetime_value DESC;

Common Mistakes

Omitting SELECT columns from GROUP BY: BigQuery throws "column is not aggregated" error. Fix by adding every non-aggregated column to GROUP BY.
Using HAVING for row-level filters: HAVING runs after aggregation, causing wrong results. Move row filters to WHERE to trim data earlier.