How to NORMALIZE Unicode Strings in Oracle

Galaxy Glossary

How do I use Oracle’s NORMALIZE function to standardize Unicode text?

NORMALIZE returns a Unicode string converted to the specified normalization form (NFC, NFD, NFKC, or NFKD), making text comparisons reliable.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.

Description

Table of Contents

What problem does NORMALIZE solve?

Strings containing accents or composed characters can compare or sort unpredictably. NORMALIZE converts every code point to a canonical form, so “é” and “e01” become identical.

When should I call NORMALIZE?

Call it before storing or comparing user-entered names, product titles, or email addresses to avoid duplicate keys, case issues, and mismatched joins.

What is the exact syntax?

Use NORMALIZE(char_expr [, form]). form is optional and defaults to NFC. Valid forms: NFC, NFD, NFKC, NFKD.

Why choose NFC over NFD?

NFC keeps characters in their composed form (single code point) and is recommended for storage. NFD splits characters into base + combining marks, useful for advanced search functions.

How do I normalize accented customer names?

Wrap the column in NORMALIZE during inserts or updates:INSERT INTO Customers (name, email) VALUES (NORMALIZE(:name,'NFC'), :email);

Can I normalize data in bulk?

Yes. Use an UPDATE with SET column = NORMALIZE(column):UPDATE Customers SET name = NORMALIZE(name, 'NFC');

Does NORMALIZE affect indexes?

After bulk normalization, rebuild function-based indexes or unique keys that rely on the text column to ensure the index uses the new values.

Best practices for NORMALIZE

Normalize at the point of entry, stick to one form (NFC), create deterministic function-based indexes, and document the policy so every service layer follows it.

Why How to NORMALIZE Unicode Strings in Oracle is important

How to NORMALIZE Unicode Strings in Oracle Example Usage


-- Find duplicate customer names that differ only by composition
WITH normalized AS (
  SELECT id, NORMALIZE(name, 'NFC') AS norm_name
  FROM Customers
)
SELECT a.id AS cust1, b.id AS cust2, a.norm_name
FROM normalized a
JOIN normalized b ON a.norm_name = b.norm_name AND a.id < b.id;

How to NORMALIZE Unicode Strings in Oracle Syntax


NORMALIZE(char_expr [, form])

-- Parameters
char_expr   : Any CHAR, VARCHAR2, NCHAR, NVARCHAR2, CLOB, or NCLOB expression.
form        : Optional VARCHAR2 literal. One of 'NFC' (default), 'NFD', 'NFKC', 'NFKD'.

-- Example in ecommerce context
UPDATE Customers
SET name = NORMALIZE(name, 'NFC');

Common Mistakes

Frequently Asked Questions (FAQs)

Does NORMALIZE change the string length?

Yes, NFD and NFKD often increase length because they decompose characters into base + combining marks. Plan VARCHAR sizes accordingly.

Is NORMALIZE expensive?

It is CPU-bound and scales linearly with text length. For bulk jobs, batch commits and monitor CPU usage.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.