Practical SQL techniques to locate, protect, anonymize, and delete personal data in a MySQL database to meet GDPR obligations.
GDPR fines can reach 4% of global revenue. You must safeguard personal data, prove you minimized it, and delete it on request. MySQL stores PII in tables such as Customers and Orders, so compliance is critical.
Query INFORMATION_SCHEMA to scan column names for keywords like “email,” “name,” or “address.” Save results in a review table so you can track remediation status.
SELECT table_name, column_name
FROM INFORMATION_SCHEMA.COLUMNS
WHERE table_schema = 'shop'
AND column_name REGEXP 'email|name|phone';
Keep only what you need. Remove obsolete columns or tables, and archive raw logs outside the production schema. Use ALTER TABLE to drop or rename sensitive columns.
ALTER TABLE Customers
DROP COLUMN phone_number;
Pseudonymization masks data but allows recovery; anonymization removes any link. Use AES_ENCRYPT for reversible encryption and SHA2 or random tokens for irreversible masking.
-- Reversible encryption
UPDATE Customers
SET email = TO_BASE64(AES_ENCRYPT(email, 'secret_key'))
WHERE id = 42;
-- Irreversible hashing
UPDATE Customers
SET name = SHA2(name, 256)
WHERE created_at < DATE_SUB(NOW(), INTERVAL 3 YEAR);
Create a stored procedure that deletes or anonymizes customer data across Customers, Orders, and OrderItems while keeping referential integrity through NULLs or surrogate keys.
CALL forget_customer(42);
-- Example procedure
CREATE PROCEDURE forget_customer(IN cust_id INT)
BEGIN
DELETE FROM Orders WHERE customer_id = cust_id;
DELETE FROM Customers WHERE id = cust_id;
END;
Enable general_log only for sessions with elevated privileges, or use binary logs and a dedicated audit table triggered by SELECT, UPDATE, or DELETE events.
CREATE TRIGGER log_customer_access
AFTER SELECT ON Customers
FOR EACH ROW
INSERT INTO audit_log(user, action, accessed_at)
VALUES (CURRENT_USER(), 'SELECT Customers', NOW());
Automate data retention checks with EVENTS, restrict permissions via GRANT, encrypt backups, and document all processes in a GDPR register. Review your schema after every release.
No. Encrypt only fields that contain personal data. Over-encrypting slows queries and complicates indexing.
Yes, if they are irreversibly hashed and you inform users. However, they still count as personal data if they can be re-identified.
No. You must also update processes: get consent, monitor access, and document retention. SQL changes are one piece of a larger compliance strategy.