WHERE filters rows before aggregation, HAVING filters aggregated groups after GROUP BY.
Understand when to filter rows with WHERE and when to filter aggregated groups with HAVING.
WHERE evaluates each row before any GROUPBY or aggregate function runs, while HAVING evaluates the result set after aggregation. Because HAVING sees grouped data, it can reference aggregate functions such as COUNT(), SUM(), or AVG(), which WHERE cannot.
Use WHERE for non-aggregate conditions like date ranges, status flags, or simple comparisons. Switch to HAVING when you must filter on aggregate results—for example, “customers with more than 5 orders” or “products whose total sales exceed $10 000.”
HAVING executes after GROUPBY. The database groups rows, calculates aggregates, then applies HAVING. This order lets HAVING access summary values but also means unnecessary grouping happens if row-level filters were not applied earlier with WHERE.
WHERE usually outperforms HAVING because it reduces the data set earlier, allowing the optimizer to skip grouping unwanted rows. Always push row-level predicates into WHERE first, then reserve HAVING for true aggregate filters.
Aggregates like COUNT(), SUM(), MIN(), and MAX() cannot appear in WHERE. If your condition requires one of these functions, place it in HAVING. For mixed cases, combine both clauses: put simple filters in WHERE and aggregate filters in HAVING.
The query below finds active customers created after 2023-01-01 who placed more than five orders:SELECT customer_id,
WHERE filters rows by status and date first; HAVING then filters grouped customers by order_count.
COUNT(*) AS order_count
FROM orders
WHERE customer_status = 'active'
AND created_at > '2023-01-01'
GROUP BY customer_id
HAVING COUNT(*) > 5;
Always place row-level predicates in WHERE to leverage indexes. Only use HAVING for conditions that reference aggregates. Combine both clauses for optimal performance and clarity. Document your choice so teammates understand execution order, especially when collaborating in tools like Galaxy.
Filtering at the proper stage cuts query runtimes, lowers compute cost, and ensures accurate results. Misplacing predicates forces the database to group unnecessary rows, slows dashboards, and inflates cloud bills. Mastering WHERE vs HAVING is foundational for performant analytics pipelines and is frequently tested in technical interviews.
Yes, but it’s rare. Without GROUP BY, HAVING acts like a WHERE on the aggregated result set, usually returning one row.
The optimizer may deduplicate, but it’s redundant and can confuse readers. Keep each predicate where it belongs.
Galaxy’s AI copilot flags aggregate misuse, suggests moving filters to WHERE, and visualizes the execution plan so you can verify performance.
No. Use window functions in SELECT or a subquery, then filter the outer query’s results with WHERE.