How to NORMALIZE Unicode Strings in Oracle

How do I use Oracle’s NORMALIZE function to standardize Unicode text?

NORMALIZE returns a Unicode string converted to the specified normalization form (NFC, NFD, NFKC, or NFKD), making text comparisons reliable.

Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)

Oops! Something went wrong while submitting the form.

Description

Example H2

Example H3

What problem does NORMALIZE solve?

Strings containing accents or composed characters can compare or sort unpredictably. NORMALIZE converts every code point to a canonical form, so “é” and “e01” become identical.

When should I call NORMALIZE?

Call it before storing or comparing user-entered names, product titles, or email addresses to avoid duplicate keys, case issues, and mismatched joins.

What is the exact syntax?

Use NORMALIZE(char_expr [, form]). form is optional and defaults to NFC. Valid forms: NFC, NFD, NFKC, NFKD.

Why choose NFC over NFD?

NFC keeps characters in their composed form (single code point) and is recommended for storage. NFD splits characters into base + combining marks, useful for advanced search functions.

How do I normalize accented customer names?

Wrap the column in NORMALIZE during inserts or updates:INSERT INTO Customers (name, email) VALUES (NORMALIZE(:name,'NFC'), :email);

Can I normalize data in bulk?

Yes. Use an UPDATE with SET column = NORMALIZE(column):UPDATE Customers SET name = NORMALIZE(name, 'NFC');

Does NORMALIZE affect indexes?

After bulk normalization, rebuild function-based indexes or unique keys that rely on the text column to ensure the index uses the new values.

Best practices for NORMALIZE

Normalize at the point of entry, stick to one form (NFC), create deterministic function-based indexes, and document the policy so every service layer follows it.

Why How to NORMALIZE Unicode Strings in Oracle is important

How to NORMALIZE Unicode Strings in Oracle Example Usage


-- Find duplicate customer names that differ only by composition
WITH normalized AS (
  SELECT id, NORMALIZE(name, 'NFC') AS norm_name
  FROM Customers
)
SELECT a.id AS cust1, b.id AS cust2, a.norm_name
FROM normalized a
JOIN normalized b ON a.norm_name = b.norm_name AND a.id < b.id;

How to NORMALIZE Unicode Strings in Oracle Syntax


NORMALIZE(char_expr [, form])

-- Parameters
char_expr   : Any CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB expression.
form        : Optional VARCHAR2 literal. One of 'NFC' (default), 'NFD', 'NFKC', 'NFKD'.

-- Example in ecommerce context
UPDATE Customers
SET name = NORMALIZE(name, 'NFC');

Common Mistakes

Using different normalization forms in different parts of the application causes inconsistent comparisons. Pick one form (preferably NFC) and enforce it everywhere.
Normalizing only during SELECT queries instead of on INSERT/UPDATE leads to permanent data discrepancies. Always normalize on write or run a one-time bulk update.