How to GROUP BY in PostgreSQL

Galaxy Glossary

How do I use GROUP BY in PostgreSQL to aggregate data?

GROUP BY clusters rows that share the same values and lets you run aggregate functions on each cluster.

Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)

Oops! Something went wrong while submitting the form.

Description

Example H2

Example H3

How to use GROUP BY in PostgreSQL

GROUP BY collapses many rows into buckets that share column values so you can run COUNT, SUM, AVG, MIN, or MAX on each bucket.

What does GROUP BY do in PostgreSQL?

GROUP BY partitions rows by the listed columns or expressions and produces one output row per distinct combination. Aggregate functions then compute statistics for every partition.

What is the basic syntax?

SELECT aggregate_function(column) [, ...], grouping_columns FROM table_name WHERE filter_predicate GROUP BY grouping_columns;

When should I use GROUP BY?

Use GROUP BY when you need summary data: totals, averages, counts, or any metric that compares multiple rows at once. It powers dashboards, KPIs, and audit queries.

Example – average salary per department

SELECT department_id, AVG(salary) AS avg_salary FROM employees GROUP BY department_id ORDER BY avg_salary DESC;

How do I filter grouped results?

Use HAVING to filter after aggregation. WHERE filters before grouping; HAVING can reference aggregates.

SELECT department_id, COUNT(*) AS headcount FROM employees GROUP BY department_id HAVING COUNT(*) > 10;

How do I group by multiple columns?

List every non-aggregated column in both SELECT and GROUP BY. PostgreSQL treats each unique combination as its own group.

SELECT country, city, SUM(sales) AS total FROM orders GROUP BY country, city;

What about ROLLUP, CUBE, and GROUPING SETS?

Use GROUP BY ROLLUP(a,b) for hierarchical subtotals, CUBE(a,b) for all combinations, and GROUPING SETS to specify custom subtotal lists.

Best practices for fast GROUP BY queries

• Index frequently grouped columns.
• Alias aggregates for readability.
• Pre-compute complex expressions in CTEs.
• Keep grouping columns narrow (INT over TEXT) to reduce sort/hash cost.

Common mistakes and fixes

1 – Missing column in GROUP BY: PostgreSQL error “column must appear in the GROUP BY clause.” Add it to GROUP BY or wrap it in an aggregate.

2 – Using WHERE with aggregates: WHERE cannot reference aggregates. Move the condition to HAVING.

More examples

-- Distinct users per day SELECT date_trunc('day', created_at) AS day, COUNT(DISTINCT user_id) AS users FROM logins GROUP BY day ORDER BY day; -- Grand total with ROLLUP SELECT department_id, SUM(sales) AS dept_sales FROM sales GROUP BY ROLLUP(department_id);

Frequently asked questions

Why How to GROUP BY in PostgreSQL is important

How to GROUP BY in PostgreSQL Example Usage


SELECT department_id, AVG(salary) AS avg_salary FROM employees GROUP BY department_id ORDER BY avg_salary DESC;

How to GROUP BY in PostgreSQL Syntax


SELECT aggregate_function(column | expression) [, ...]
FROM table_name
[WHERE filter_predicate]
GROUP BY grouping_expression [, ...]
[HAVING condition]
[ORDER BY sort_expression];

Common Mistakes

Selecting a column that is neither aggregated nor listed in GROUP BY. PostgreSQL raises an error. Fix by adding the column to GROUP BY or applying an aggregate like MAX().
Placing aggregate filters in WHERE instead of HAVING. WHERE runs before grouping and can’t reference aggregates. Move conditions like COUNT(*) > 5 into HAVING.

Frequently Asked Questions (FAQs)

Can I use GROUP BY without an aggregate function?

Yes. GROUP BY with no aggregates simply removes duplicates, acting like DISTINCT. However DISTINCT is usually clearer and faster.

Does GROUP BY guarantee order?

No. Output order is undefined unless you add ORDER BY.

Can I GROUP BY an expression or alias?

You can group by any expression, but you must repeat the expression itself or use its positional index. Standard SQL forbids grouping by the SELECT alias.