SQL DISTINCT removes duplicate rows from a result set, returning only unique combinations of the selected columns. Place the DISTINCT keyword right after SELECT to deduplicate data for reporting, aggregation, or sub-queries. Use cautiously on large tables because it forces a sort or hash step that may slow queries.
SQL DISTINCT removes duplicate rows from the result set, guaranteeing each returned row is unique for the selected columns. This deduplication happens after filtering in the logical query order.
Use DISTINCT when you need a list of unique values, deduplicated reports, or inputs for IN clauses. Avoid it if the table already enforces uniqueness with constraints.
Most databases implement DISTINCT by sorting or hashing the result set on the requested columns, then discarding duplicates. Both operations add CPU, memory, and possible disk I/O.
Place DISTINCT immediately after SELECT. Example:
SELECT DISTINCT city FROM customers;
DISTINCT can span several columns; uniqueness applies to their combined values.
SELECT DISTINCT first_name, last_name FROM users;
Wrap the column in COUNT(DISTINCT column). Example:
SELECT COUNT(DISTINCT status) AS status_count FROM orders;
DISTINCT can appear inside SUM, AVG, or other aggregates to eliminate duplicate inputs.
SELECT SUM(DISTINCT amount) AS unique_sales FROM invoices;
Combine DISTINCT with INSERT INTO … SELECT or use CTEs plus ROW_NUMBER() to permanently delete duplicates.
DISTINCT can slow queries because it forces a sort or hash step. Indexes on the DISTINCT columns mitigate cost by avoiding full table scans.
Project only needed columns, filter early, index DISTINCT columns, and avoid DISTINCT inside subqueries that feed GROUP BY aggregates.
GROUP BY without aggregates, EXISTS subqueries, or window functions like ROW_NUMBER() offer deduplication without DISTINCT.
DISTINCT returns unique rows, is simple to use, but may harm performance. Indexes, minimal column selection, and aggregate awareness keep queries fast.
Yes, as long as the data type supports equality comparisons. Large objects may slow sorting.
Absolutely. Place the alias after the column list: SELECT DISTINCT country AS unique_country FROM customers;
.
No. COUNT(DISTINCT) requires a specific column or expression.
Use EXPLAIN
plans before and after adding DISTINCT, and watch for Sort or HashAggregate operations.