Techniques and functions for testing, replacing, and preventing NULL values in Amazon Redshift queries.
NULL represents unknown data. Aggregations, joins, and comparisons ignore or mis-evaluate NULLs, so query results can surprise you unless you handle them explicitly.
Use COALESCE(expr1, expr2, ...) to return the first non-NULL value. It supports unlimited arguments and stops scanning once a non-NULL appears, making it ideal for fallback logic across many columns.
NVL(expr, replacement) is a two-argument shortcut.Both arguments must share a compatible type. Prefer NVL for simple defaults and COALESCE for multi-level fallbacks.
Use column IS NULL to filter missing data and column IS NOT NULL to keep known values. Never use the equality operator (=) for NULL comparisons; it always returns NULL, not TRUE.
NULLIF(a, b) returns NULL when a equals b, letting you skip invalid arithmetic.Combine it with COALESCE to substitute safe defaults.
1) Replace NULLs as late as possible to preserve data fidelity. 2) Pick default values that won’t distort analytics (e.g., 0 for totals, empty string for text). 3) Document NULL logic in views so downstream users understand the intent.
.
COALESCE stops at the first non-NULL value, so its cost is minimal. Index usage remains intact because Redshift evaluates functions row by row.
Redshift supports NVL for Oracle familiarity. Internally, it rewrites to COALESCE with two arguments, so performance and semantics are identical.
COUNT(column) ignores NULLs. Use COUNT(*) - COUNT(column) to measure how many NULLs a column contains.