Filtering in Amazon Redshift uses the WHERE clause (and related predicates) to return only rows matching specified conditions.
Filtering means adding a WHERE, HAVING, or similar qualifier that keeps just the rows you need, cutting scan time and cost.
Write SELECT column_list FROM table_name WHERE condition. Redshift evaluates the condition for every row and returns the matches.
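A minimal sketch, assuming an orders table with order_id and status columns:

```sql
-- Return only the rows whose status is 'shipped'
SELECT order_id, status
FROM orders
WHERE status = 'shipped';
```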
Use AND, OR, and parentheses to chain tests. Redshift follows standard SQL precedence, evaluating NOT before AND and AND before OR; parentheses override that order.
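For example, with the same hypothetical orders table:

```sql
-- Parentheses force the OR to be resolved first; without them,
-- AND would bind tighter than OR and change the result
SELECT order_id
FROM orders
WHERE (status = 'shipped' OR status = 'delivered')
  AND total_amount > 100;
```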
Compare timestamp columns with >=, <=, or BETWEEN, or use DATE_TRUNC() for period buckets.
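Two sketches, assuming a timestamp column named order_ts:

```sql
-- Orders placed in March 2023, using a half-open range
SELECT order_id
FROM orders
WHERE order_ts >= '2023-03-01' AND order_ts < '2023-04-01';

-- Bucket rows by month with DATE_TRUNC
SELECT DATE_TRUNC('month', order_ts) AS order_month, COUNT(*)
FROM orders
GROUP BY 1;
```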
Use IS NULL or IS NOT NULL; a regular = comparison against NULL evaluates to NULL (unknown), so such rows are never returned.
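For example, assuming a nullable shipped_at column:

```sql
-- Rows where shipped_at is unknown
SELECT order_id FROM orders WHERE shipped_at IS NULL;

-- Returns no rows: shipped_at = NULL evaluates to NULL, never TRUE
SELECT order_id FROM orders WHERE shipped_at = NULL;
```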
HAVING filters post-aggregation, perfect for keeping only groups that meet a metric threshold, e.g., customers with total spend over $500.
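That threshold example as a sketch, assuming customer_id and total_amount columns:

```sql
-- Keep only customers whose aggregated spend exceeds $500
SELECT customer_id, SUM(total_amount) AS total_spend
FROM orders
GROUP BY customer_id
HAVING SUM(total_amount) > 500;
```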
Narrow predicates reduce disk I/O. Place the most selective conditions first, and avoid wrapping sort key columns in functions so zone-map pruning stays effective.
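For example (a sketch, assuming order_ts is the table's sort key):

```sql
-- Wrapping the sort key in a function blocks zone-map pruning:
--   WHERE DATE_PART('year', order_ts) = 2023
-- The equivalent plain range lets Redshift skip whole blocks:
SELECT order_id
FROM orders
WHERE order_ts >= '2023-01-01' AND order_ts < '2024-01-01';
```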
1) Store dates in UTC.
2) Normalize strings to one case.
3) Avoid leading-wildcard LIKE '%abc' patterns.
4) Pre-compute heavy expressions in derived columns.
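Two of these practices in SQL form (a sketch; the products table and email_lc column are hypothetical names):

```sql
-- 3) A leading wildcard forces a full scan; an anchored prefix can
--    still benefit from zone maps on a sorted column
SELECT sku FROM products WHERE sku LIKE 'ABC%';

-- 2) + 4) Filter on a pre-computed, case-normalized column instead
--    of applying LOWER() to every row at query time
SELECT user_id FROM users WHERE email_lc = 'alice@example.com';
```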
SELECT o.id, o.total_amount
FROM Orders o
-- Half-open range: avoids missing late-day rows if order_date is a timestamp
WHERE o.order_date >= '2023-01-01'
  AND o.order_date < '2024-01-01'
  AND o.total_amount > 500;
Try Redshift’s DISTSTYLE KEY or sort keys on filtered columns to boost performance even more.
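A hypothetical DDL sketch: distribute on the join key and sort on the most frequently filtered column so zone maps prune effectively.

```sql
CREATE TABLE orders (
    order_id     BIGINT,
    customer_id  BIGINT,
    order_date   DATE,
    total_amount DECIMAL(12,2)
)
DISTSTYLE KEY
DISTKEY (customer_id)
SORTKEY (order_date);
```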
Redshift uses zone maps and sort keys instead of traditional B-tree indexes. Proper sort keys make filters faster.
Use json_extract_path_text() on JSON stored as text, or the native SUPER type with dot notation, then apply WHERE on the extracted value.
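Both approaches as a sketch, assuming a text column payload and a SUPER column payload_super:

```sql
-- JSON stored as VARCHAR text
SELECT order_id
FROM orders
WHERE json_extract_path_text(payload, 'status') = 'shipped';

-- Native SUPER column, navigated with dot notation
SELECT order_id
FROM orders
WHERE payload_super.status = 'shipped';
```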
Yes. Most drivers accept placeholders (e.g., $1) that bind values safely, eliminating SQL injection risks.
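A minimal sketch of the binding principle, using Python's built-in sqlite3 as a stand-in database (a Redshift driver such as psycopg2 works the same way, just with %s placeholders instead of ?):

```python
import sqlite3

# In-memory stand-in database to demonstrate parameter binding
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, total_amount REAL)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(1, 250.0), (2, 750.0), (3, 900.0)])

# The value is bound by the driver, never concatenated into the SQL
# string, so malicious input cannot alter the query structure.
threshold = 500
rows = conn.execute(
    "SELECT id FROM orders WHERE total_amount > ? ORDER BY id",
    (threshold,),
).fetchall()
print(rows)  # [(2,), (3,)]
```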