SQL GROUP BY

What is SQL GROUP BY used for?

SQL GROUP BY groups rows sharing the same values so aggregate functions return a single result per group.

Compatible dialects for SQL GROUP BY: Supported by PostgreSQL, MySQL, MariaDB, SQL Server, Oracle, SQLite, DB2, Snowflake, BigQuery, Redshift, and other ANSI SQL-compliant engines.

SQL GROUP BY Full Explanation

GROUP BY is a clause of the SELECT statement that partitions result-set rows into logical groups based on one or more expressions. After grouping, aggregate functions like COUNT, SUM, AVG, MIN, and MAX compute a summary value for each group instead of for the entire set. Nonaggregated columns in the SELECT list must appear in the GROUP BY list (some engines also accept columns that are functionally dependent on the grouped columns), which keeps results deterministic. GROUP BY is evaluated after WHERE filtering and before HAVING filtering and ORDER BY sorting. Many dialects extend it with ROLLUP, CUBE, and GROUPING SETS for multidimensional analysis.

Caveats:
- For grouping purposes NULLs are treated as equal to each other, so all NULL values in a grouping column fall into one group.
- Ordering of groups is not guaranteed unless ORDER BY is specified.
- Selecting non-grouped, nonaggregated columns raises an error in PostgreSQL, SQL Server, Oracle, and MySQL with sql_mode=ONLY_FULL_GROUP_BY.
- Performance depends on available indexes and on the sort or hash aggregation algorithm the engine chooses; large groupings may need substantial memory or spill to disk.
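The NULL caveat is easy to check by hand. The sketch below is one way to do it, assuming Python's built-in sqlite3 module as the engine; the orders table and its rows are hypothetical sample data, and other engines may differ in details such as output order.

```python
import sqlite3

# Hypothetical sample table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "paid"), (2, "paid"), (3, None), (4, None), (5, "open")],
)

# The two NULL-status rows collapse into a single group.
counts = dict(
    conn.execute("SELECT status, COUNT(*) FROM orders GROUP BY status").fetchall()
)
print(counts[None])  # 2
```

The result is collected into a dict rather than compared as a list because the query has no ORDER BY, so the order of the groups is not guaranteed.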

SQL GROUP BY Syntax

SELECT grouping_column_1 [, grouping_column_2 ...],
       aggregate_function(expression) AS alias
FROM   table_name
[WHERE  conditions]
GROUP BY grouping_column_1 [, grouping_column_2 ...]
[HAVING aggregate_predicate]
[ORDER BY column_or_alias];

SQL GROUP BY Parameters

grouping_column_n (expression) - Column name or expression used to form each group
aggregate_function (function) - COUNT, SUM, AVG, MIN, MAX, or custom aggregate applied per group

Example Queries Using SQL GROUP BY

-- Basic count by status
SELECT status, COUNT(*) AS order_count
FROM   orders
GROUP BY status;

-- Revenue per year and month
SELECT EXTRACT(YEAR FROM created_at) AS year,
       EXTRACT(MONTH FROM created_at) AS month,
       SUM(total_amount) AS revenue
FROM   sales
GROUP BY EXTRACT(YEAR FROM created_at),
         EXTRACT(MONTH FROM created_at);

-- Filter groups with HAVING
SELECT customer_id, SUM(total_amount) AS lifetime_value
FROM   sales
GROUP BY customer_id
HAVING SUM(total_amount) > 1000;

-- Extended syntax: ROLLUP adds subtotal and grand-total rows (PostgreSQL example)
SELECT region, product, SUM(quantity) AS qty
FROM   shipments
GROUP BY ROLLUP(region, product);

Expected Output Using SQL GROUP BY

Each query returns one result row per distinct grouping combination, along with the requested aggregate values.
Groups that do not satisfy the HAVING predicate are excluded.
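To make the expected output concrete, here is a minimal sketch that runs the HAVING example against a few made-up rows. It assumes Python's built-in sqlite3 module; the sample amounts are hypothetical, and an ORDER BY is added only so the printed order is stable.

```python
import sqlite3

# Hypothetical sample data for the sales table used in the examples.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, total_amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 600.0), (1, 500.0), (2, 300.0), (3, 1000.0), (3, 0.5)],
)

# One row per customer; customer 2 (total 300.0) fails the HAVING predicate.
lifetime = conn.execute("""
    SELECT customer_id, SUM(total_amount) AS lifetime_value
    FROM   sales
    GROUP BY customer_id
    HAVING SUM(total_amount) > 1000
    ORDER BY customer_id
""").fetchall()
print(lifetime)  # [(1, 1100.0), (3, 1000.5)]
```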

Use Cases with SQL GROUP BY

Summarizing metrics like sales per customer or visits per day
Creating KPI dashboards that rely on aggregated data
Driving pivot tables and drill-down reports
Feeding statistical models with grouped datasets
Pre-aggregating data for performance before insertion into analytics tables

Common Mistakes with SQL GROUP BY

Selecting columns that appear neither in GROUP BY nor inside an aggregate
Placing aggregate conditions in WHERE instead of HAVING
Assuming groups are sorted without ORDER BY
Expecting each NULL to form its own group when the database merges all NULLs into one
Using COUNT(column) when every row should be counted: it skips rows where the column is NULL, whereas COUNT(*) counts all rows in the group
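The COUNT difference can be seen side by side in a small sketch. It assumes Python's built-in sqlite3 module; the tickets table and its columns are hypothetical names chosen for illustration.

```python
import sqlite3

# Hypothetical table: some tickets have no assignee (NULL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tickets (team TEXT, assignee TEXT)")
conn.executemany(
    "INSERT INTO tickets VALUES (?, ?)",
    [("a", "ann"), ("a", None), ("b", None)],
)

# COUNT(*) counts every row in the group; COUNT(assignee) skips NULLs.
per_team = conn.execute("""
    SELECT team, COUNT(*) AS all_rows, COUNT(assignee) AS assigned
    FROM   tickets
    GROUP BY team
    ORDER BY team
""").fetchall()
print(per_team)  # [('a', 2, 1), ('b', 1, 0)]
```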

First Introduced In

SQL-86

Frequently Asked Questions

What does GROUP BY do?

GROUP BY collects rows sharing the same values into groups so aggregate functions compute one result per group.

Can I use GROUP BY without aggregates?

Yes. The query then behaves like SELECT DISTINCT, returning one row per unique combination of the grouped columns with no summarization.
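A quick way to confirm the equivalence is to run both forms and compare the results. This sketch assumes Python's built-in sqlite3 module and a hypothetical shipments table; the rows are sorted before comparing because neither query guarantees an order.

```python
import sqlite3

# Hypothetical sample data with duplicate (region, product) pairs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE shipments (region TEXT, product TEXT)")
conn.executemany(
    "INSERT INTO shipments VALUES (?, ?)",
    [("east", "bolt"), ("east", "bolt"), ("east", "nut"), ("west", "bolt")],
)

grouped = conn.execute(
    "SELECT region, product FROM shipments GROUP BY region, product"
).fetchall()
distinct = conn.execute(
    "SELECT DISTINCT region, product FROM shipments"
).fetchall()

# Same set of rows either way: one per unique combination.
print(sorted(grouped) == sorted(distinct))  # True
```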

What is the difference between WHERE and HAVING?

WHERE filters rows before grouping. HAVING filters groups after aggregation, typically with aggregate predicates.
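The two filters can give different answers on the same data, as this sketch suggests. It assumes Python's built-in sqlite3 module; the sales rows and the threshold of 100 are hypothetical.

```python
import sqlite3

# Hypothetical sample data: customer 1 has two small purchases.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (customer_id INTEGER, total_amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1, 50), (1, 60), (2, 200), (3, 30)],
)

# WHERE drops individual rows under 100 before any grouping happens.
where_rows = conn.execute("""
    SELECT customer_id, SUM(total_amount)
    FROM   sales
    WHERE  total_amount >= 100
    GROUP BY customer_id
    ORDER BY customer_id
""").fetchall()

# HAVING keeps all rows, then drops groups whose sum is under 100.
having_rows = conn.execute("""
    SELECT customer_id, SUM(total_amount)
    FROM   sales
    GROUP BY customer_id
    HAVING SUM(total_amount) >= 100
    ORDER BY customer_id
""").fetchall()

print(where_rows)   # [(2, 200)]
print(having_rows)  # [(1, 110), (2, 200)]
```

Customer 1 appears only in the HAVING result: neither purchase reaches 100 on its own, but their sum does.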

How do I order grouped results?

Add ORDER BY with the grouping columns or aggregate aliases to sort the final grouped dataset.
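As a minimal sketch of sorting by an aggregate alias, assuming Python's built-in sqlite3 module and hypothetical rows in an orders table:

```python
import sqlite3

# Hypothetical sample data: three paid, two cancelled, one open.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (status TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?)",
    [("paid",), ("open",), ("paid",), ("cancelled",), ("paid",), ("cancelled",)],
)

# Sort the groups by the aggregate alias, largest count first.
ranked = conn.execute("""
    SELECT status, COUNT(*) AS order_count
    FROM   orders
    GROUP BY status
    ORDER BY order_count DESC
""").fetchall()
print(ranked)  # [('paid', 3), ('cancelled', 2), ('open', 1)]
```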