How to Use GROUP BY in SQL Server

How do you use GROUP BY in SQL Server to summarize data?

GROUP BY aggregates rows that share the same values in specified columns so you can run summary calculations like COUNT, SUM, AVG, MAX, and MIN.

Welcome to the Galaxy, Guardian!

Oops! Something went wrong while submitting the form.

Description

What does GROUP BY do in SQL Server?

GROUP BY clusters rows with identical values in one or more columns, allowing aggregate functions to return a single result per group instead of per row.

When should I apply GROUP BY?

Use GROUP BY whenever you need totals, averages, or other aggregates per customer, date, product, or any dimension rather than over the entire table.

How is GROUP BY syntax structured?

Start with SELECT, include aggregate and non-aggregate columns, add FROM and optional JOINs, then finish with GROUP BY listing every non-aggregated column.

Can I filter groups?

Yes—use HAVING after GROUP BY to keep or discard groups based on aggregate results, such as only showing customers with more than five orders.

Why must non-aggregated columns appear in GROUP BY?

SQL Server enforces determinism; every selected value must be derived unambiguously.GROUP BY columns tell the engine how to pick one value per group.

Example: total spend per customer

This query joins Customers and Orders, then groups by customer to calculate lifetime spend.

SELECT c.id, c.name, SUM(o.total_amount) AS lifetime_value FROM Customers c JOIN Orders o ON o.customer_id = c.id GROUP BY c.id, c.name;

Example: best-selling products in 2024

SELECT p.id, p.name, SUM(oi.quantity) AS units_sold FROM OrderItems oi JOIN Orders o ON o.id = oi.order_id JOIN Products p ON p.id = oi.product_id WHERE o.order_date >= '2024-01-01' GROUP BY p.id, p.name ORDER BY units_sold DESC;

Best practices for GROUP BY

Index frequently grouped columns, avoid GROUP BY *expressions* that prevent index use, and push non-aggregate filters into WHERE instead of HAVING.

Common performance tips

Aggregate on indexed integers rather than strings, group on surrogate keys, and keep SELECT list minimal to reduce shuffle cost.

Why How to Use GROUP BY in SQL Server is important

How to Use GROUP BY in SQL Server Example Usage


-- Average order value per customer in 2024
SELECT c.id, c.name, AVG(o.total_amount) AS avg_order_value
FROM   Customers c
JOIN   Orders o ON o.customer_id = c.id
WHERE  o.order_date BETWEEN '2024-01-01' AND '2024-12-31'
GROUP  BY c.id, c.name
HAVING COUNT(*) >= 5;

How to Use GROUP BY in SQL Server Syntax


SELECT <non-aggregated columns>, <aggregate_function>(<column>)
FROM   <table>
[JOIN  <other_table> ON <condition>]
[WHERE <row filter>]
GROUP  BY <non-aggregated column list>
[HAVING <aggregate filter>]
[ORDER BY <column | aggregate>];

Common Mistakes

Selecting columns not in GROUP BY or an aggregate. SQL Server throws "is invalid in the select list" error. Fix by adding the column to GROUP BY or wrapping it in an aggregate like MAX().
Filtering aggregated results in WHERE instead of HAVING. WHERE runs before grouping, so conditions like COUNT(*) > 5 return "invalid" errors. Move the condition to HAVING.