GROUP BY aggregates rows that share the same column values, enabling per-group calculations like totals, counts, and averages.
The GROUP BY
clause collapses rows that have identical values in one or more columns so aggregate functions—COUNT
, SUM
, AVG
, MIN
, MAX
—return a single result per group. That makes it ideal for sales dashboards, customer metrics, and inventory rollups.
Use it to calculate lifetime customer value, daily order counts, product-level revenue, or any summary where you need figures broken down by a dimension like date, customer, or product.
SELECT aggregate_function(col) AS alias, group_column[, ...]
FROM table_name
[WHERE conditions]
GROUP BY group_column[, ...]
[HAVING aggregate_condition]
[ORDER BY sort_columns]
[ROLLUP | CUBE | GROUPING SETS];
SELECT c.id,
c.name,
SUM(o.total_amount) AS lifetime_value
FROM Customers c
JOIN Orders o ON o.customer_id = c.id
GROUP BY c.id, c.name
ORDER BY lifetime_value DESC;
List every non-aggregated selected column in the GROUP BY
. Filter aggregated results with HAVING
, not WHERE
. Index the grouping columns when the table is large. Prefer window functions when you need both raw and aggregated rows in the same query.
Omitting columns: ParadeDB will error if a selected column isn’t aggregated or grouped. Add it to GROUP BY
or wrap it in an aggregate.
Filtering with WHERE on aggregates: WHERE SUM(total_amount) > 1000
is invalid. Move the condition to HAVING
.
GROUP BY ROLLUP(year, month)
adds summary rows per month and a final yearly total, saving multiple UNIONs.
Yes. Use the expression in both SELECT and GROUP BY or wrap it in a CTE to reference the alias.
No. Add an ORDER BY clause to guarantee result order; otherwise ParadeDB returns groups in arbitrary order.
Use ROLLUP, CUBE, or GROUPING SETS with GROUP BY to insert subtotal and total rows without extra UNION queries.