How to GROUP BY in BigQuery

Galaxy Glossary

How do I use GROUP BY in BigQuery to aggregate data?

GROUP BY in BigQuery aggregates rows that share column values so you can apply aggregate functions like COUNT, SUM, or AVG.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.

Description

Why use GROUP BY in BigQuery?

Group related rows to calculate totals, averages, or counts in one scan, reducing data processed and query cost.

How do I write a basic GROUP BY query?

Place non-aggregated columns in the GROUP BY clause and use aggregate functions on the remaining columns.

Can I GROUP BY multiple columns?

Yes. List every dimension column after GROUP BY or use positional references like 1,2,3 that map to the SELECT list.

How do I filter aggregated results?

Use HAVING after GROUP BY to keep or discard groups based on aggregate conditions, e.g., HAVING SUM(total_amount) > 1000.

What are best practices?

Project only needed columns, use table aliases, avoid SELECT *, and filter early with WHERE to minimize scanned data.

When should I use GROUP EACH BY?

Legacy SQL only. In Standard SQL, BigQuery handles large cardinality automatically; simply use GROUP BY.

Example: daily revenue per product

This query sums order totals by product and date, demonstrating multiple dimensions and HAVING.

SELECT p.name,
DATE(o.order_date) AS order_day,
SUM(oi.quantity * p.price) AS revenue
FROM `shop.Orders` o
JOIN `shop.OrderItems` oi ON oi.order_id = o.id
JOIN `shop.Products` p ON p.id = oi.product_id
GROUP BY p.name, order_day
HAVING revenue > 500
ORDER BY revenue DESC;

FAQ: Is SELECT column mandatory in GROUP BY?

Yes. Every non-aggregated column in SELECT must appear in GROUP BY or the query errors.

FAQ: Can I alias aggregate columns?

Absolutely. Use AS to create readable names—e.g., SUM(total_amount) AS total_revenue.

Why How to GROUP BY in BigQuery is important

How to GROUP BY in BigQuery Example Usage


-- Average order value per customer, excluding low spenders
SELECT c.id,
       c.name,
       AVG(o.total_amount) AS avg_order_value
FROM   Customers c
JOIN   Orders o ON o.customer_id = c.id
GROUP  BY c.id, c.name
HAVING AVG(o.total_amount) > 50
ORDER  BY avg_order_value DESC;

How to GROUP BY in BigQuery Syntax


SELECT [dimension_columns],
       aggregate_function(measure_column) AS alias
FROM   dataset.table
[WHERE  condition]
GROUP  BY dimension_columns
[HAVING aggregate_condition]
[ORDER  BY 1,2 ...]

-- Ecommerce example aggregating orders by customer
SELECT  c.id,
        c.name,
        COUNT(o.id)            AS order_count,
        SUM(o.total_amount)    AS lifetime_value
FROM    Customers c
LEFT JOIN Orders o ON o.customer_id = c.id
GROUP  BY c.id, c.name
ORDER  BY lifetime_value DESC;

Common Mistakes

Frequently Asked Questions (FAQs)

Can I reference SELECT positions in GROUP BY?

Yes. GROUP BY 1,2 refers to the first and second columns in the SELECT list. This shortens queries but reduces readability.

Does GROUP BY affect query cost?

It can lower cost because aggregation reduces result size, but scanning cost depends on selected columns and filters applied in WHERE.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie
BauHealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
Oops! Something went wrong while submitting the form.