GROUP BY clusters rows that share the same values and lets you run aggregate functions on each cluster.
GROUP BY collapses many rows into buckets that share column values so you can run COUNT, SUM, AVG, MIN, or MAX on each bucket.
GROUP BY partitions rows by the listed columns or expressions and produces one output row per distinct combination. Aggregate functions then compute statistics for every partition.
SELECT aggregate_function(column) [, ...], grouping_columns
FROM table_name
WHERE filter_predicate
GROUP BY grouping_columns;
Use GROUP BY when you need summary data: totals, averages, counts, or any metric that compares multiple rows at once. It powers dashboards, KPIs, and audit queries.
SELECT department_id,
AVG(salary) AS avg_salary
FROM employees
GROUP BY department_id
ORDER BY avg_salary DESC;
Use HAVING to filter after aggregation. WHERE filters before grouping; HAVING can reference aggregates.
SELECT department_id,
COUNT(*) AS headcount
FROM employees
GROUP BY department_id
HAVING COUNT(*) > 10;
List every non-aggregated column in both SELECT and GROUP BY. PostgreSQL treats each unique combination as its own group.
SELECT country, city, SUM(sales) AS total
FROM orders
GROUP BY country, city;
Use GROUP BY ROLLUP(a,b) for hierarchical subtotals, CUBE(a,b) for all combinations, and GROUPING SETS to specify custom subtotal lists.
• Index frequently grouped columns.
• Alias aggregates for readability.
• Pre-compute complex expressions in CTEs.
• Keep grouping columns narrow (INT over TEXT) to reduce sort/hash cost.
1 – Missing column in GROUP BY: PostgreSQL error “column must appear in the GROUP BY clause.” Add it to GROUP BY or wrap it in an aggregate.
2 – Using WHERE with aggregates: WHERE cannot reference aggregates. Move the condition to HAVING.
-- Distinct users per day
SELECT date_trunc('day', created_at) AS day,
COUNT(DISTINCT user_id) AS users
FROM logins
GROUP BY day
ORDER BY day;
-- Grand total with ROLLUP
SELECT department_id,
SUM(sales) AS dept_sales
FROM sales
GROUP BY ROLLUP(department_id);
Yes. GROUP BY with no aggregates simply removes duplicates, acting like DISTINCT. However DISTINCT is usually clearer and faster.
No. Output order is undefined unless you add ORDER BY.
You can group by any expression, but you must repeat the expression itself or use its positional index. Standard SQL forbids grouping by the SELECT alias.