How to Normalize Data in MySQL

Galaxy Glossary

How do I normalize data in MySQL?

Normalization splits wide, redundant tables into smaller, related ones so each fact is stored once, preventing anomalies and saving space.

Sign up for the latest in SQL knowledge from the Galaxy Team!
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.

Description

Table of Contents

What does “normalize data” mean in MySQL?

Normalization restructures tables so that each fact lives in exactly one place, eliminating redundancy, update anomalies, and inconsistent data. In practice you split wide, repeated-field tables into smaller ones linked by primary and foreign keys and aim for third normal form (3NF).

When should I normalize my ecommerce tables?

Normalize before production if you design a fresh schema or during migrations when you notice duplicate customer or product details spread across many rows. Normalization improves storage, speeds updates, and prevents mismatched totals during financial reporting.

How do I extract unique entities?

Use SELECT DISTINCT or GROUP BY to pull unique customers, products, or dates into dedicated tables, then reference them with foreign keys. This step is the bridge between a denormalized import and a clean relational model.

Example: Split customer details into Customers

CREATE TABLE Customers (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
email VARCHAR(255) UNIQUE,
created_at DATETIME DEFAULT NOW()
);

INSERT INTO Customers (name, email, created_at)
SELECT DISTINCT customer_name, customer_email, MIN(order_date)
FROM Orders_raw;

How do I link child tables after normalization?

Add foreign keys and indexes to preserve referential integrity and maintain query speed. Always update application queries to use the new surrogate keys rather than the old text columns.

Example: Create Orders referencing Customers

CREATE TABLE Orders (
id INT AUTO_INCREMENT PRIMARY KEY,
customer_id INT NOT NULL,
order_date DATE,
total_amount DECIMAL(10,2),
CONSTRAINT fk_orders_customer
FOREIGN KEY (customer_id) REFERENCES Customers(id)
);

What’s the safest workflow to normalize live data?

1) Create new normalized tables. 2) Back up existing data. 3) Migrate data with INSERT…SELECT inside a transaction. 4) Add triggers or views to keep old and new structures in sync during cut-over. 5) Switch applications to the new tables. 6) Drop legacy tables once validated.

Why How to Normalize Data in MySQL is important

How to Normalize Data in MySQL Example Usage


-- Normalize Products and OrderItems
CREATE TABLE Products (
  id INT AUTO_INCREMENT PRIMARY KEY,
  name VARCHAR(150),
  price DECIMAL(10,2),
  stock INT
);

INSERT INTO Products (name, price)
SELECT DISTINCT product_name, product_price
FROM OrderItems_raw;

CREATE TABLE OrderItems (
  id INT AUTO_INCREMENT PRIMARY KEY,
  order_id INT,
  product_id INT,
  quantity INT,
  CONSTRAINT fk_oi_order FOREIGN KEY (order_id) REFERENCES Orders(id),
  CONSTRAINT fk_oi_product FOREIGN KEY (product_id) REFERENCES Products(id)
);

INSERT INTO OrderItems (order_id, product_id, quantity)
SELECT r.order_id, p.id, r.quantity
FROM OrderItems_raw r
JOIN Products p ON p.name = r.product_name;

How to Normalize Data in MySQL Syntax


-- 1. Extract unique customers from denormalized Orders_raw
INSERT INTO Customers (name, email, created_at)
SELECT DISTINCT customer_name, customer_email, MIN(order_date)
FROM Orders_raw
GROUP BY customer_name, customer_email;

-- 2. Create Orders table referencing Customers
CREATE TABLE Orders (
  id INT AUTO_INCREMENT PRIMARY KEY,
  customer_id INT NOT NULL,
  order_date DATE,
  total_amount DECIMAL(10,2),
  CONSTRAINT fk_orders_customer FOREIGN KEY (customer_id)
    REFERENCES Customers(id) ON DELETE RESTRICT ON UPDATE CASCADE
);

-- 3. Migrate order rows
INSERT INTO Orders (customer_id, order_date, total_amount)
SELECT c.id, o.order_date, o.total_amount
FROM Orders_raw o
JOIN Customers c ON c.email = o.customer_email;

Common Mistakes

Frequently Asked Questions (FAQs)

Can I normalize without downtime?

Yes. Create new tables, migrate data inside a transaction, and use triggers or views so old queries keep working until you cut traffic to the new schema.

Does normalization hurt SELECT performance?

Proper indexing keeps joins fast. The smaller row size often outweighs the extra join cost, especially on SSD-backed servers.

Should every table reach 3NF?

Most transactional tables should. For read-heavy analytics, selective denormalization or materialized views may perform better.

Want to learn about other SQL terms?

Trusted by top engineers on high-velocity teams
Aryeo Logo
Assort Health
Curri
Rubie Logo
Bauhealth Logo
Truvideo Logo
Welcome to the Galaxy, Guardian!
You'll be receiving a confirmation email

Follow us on twitter :)
Oops! Something went wrong while submitting the form.