Normalize data in PostgreSQL by scaling numeric columns to a common range (min-max) or distribution (z-score) for easier comparison and analysis.
Normalization rescales values so order totals, product prices, or customer spend share a comparable scale. This simplifies outlier detection, model training, and dashboard visuals.
Use it before feeding ML models, during KPI comparisons across time, or when plotting metrics with vastly different units.
WITH stats AS (
SELECT MIN(total_amount) AS min_val,
MAX(total_amount) AS max_val
FROM Orders
)
SELECT o.id,
(o.total_amount - s.min_val) / NULLIF(s.max_val - s.min_val, 0) AS total_amount_norm
FROM Orders o CROSS JOIN stats s;
The NULLIF guard prevents divide-by-zero if all totals are identical.
WITH stats AS (
SELECT AVG(total_amount) AS avg_val,
STDDEV_POP(total_amount) AS std_val
FROM Orders
)
SELECT o.id,
(o.total_amount - s.avg_val) / NULLIF(s.std_val, 0) AS total_amount_z
FROM Orders o CROSS JOIN stats s;
Z-scores center data at 0 with a standard deviation of 1, highlighting outliers.
ALTER TABLE Orders ADD COLUMN total_amount_norm numeric;
WITH stats AS (
SELECT MIN(total_amount) AS min_val,
MAX(total_amount) AS max_val
FROM Orders
)
UPDATE Orders o
SET total_amount_norm = (o.total_amount - s.min_val) /
NULLIF(s.max_val - s.min_val, 0)
FROM stats s;
Add a generated column for automatic updates in PostgreSQL 12+: ALTER TABLE Orders ADD COLUMN total_amount_norm numeric GENERATED ALWAYS AS ((total_amount - (SELECT MIN(total_amount) FROM Orders)) / NULLIF((SELECT MAX(total_amount) FROM Orders) - (SELECT MIN(total_amount) FROM Orders),0)) STORED;
Use window functions for one-pass computations:
SELECT id,
(total_amount - MIN(total_amount) OVER ()) /
NULLIF(MAX(total_amount) OVER () - MIN(total_amount) OVER (), 0) AS total_amount_norm
FROM Orders;
Parameters:
• Column to scale (e.g., total_amount
)
• Aggregate function (MIN
, MAX
, AVG
, STDDEV_POP
)
• NULLIF divisor guard
• Optional PARTITION BY
to normalize per customer or month.
Store raw data separately; create views or generated columns for normalized values. Always handle zero variance with NULLIF
. Document the scaling method in column comments.
Divide-by-zero: Forgetting NULLIF when max = min
. Fix: wrap the denominator with NULLIF(expr,0)
.
Mixed scales: Normalizing per-row instead of over the desired partition. Fix: use OVER ()
for global or OVER (PARTITION BY)
for subgroup scaling.
No. Unless you run an UPDATE, queries leave the raw figures untouched.
Yes—store min/max or mean/stddev and apply the inverse formula to recover original values.
Absolutely—add a WHERE
clause or normalize inside a date-filtered CTE.
Window functions scan the data once, so performance is generally good. Indexes are not required because aggregates read all rows anyway.
Yes—create a view that calculates normalized columns on the fly, keeping the base table unchanged.
Only scale columns used in comparative analytics or ML models. Operational fields such as IDs or counts rarely need normalization.