Using Google Sheets’ REGEX functions to systematically strip noise, validate patterns, and standardize phone numbers into a consistent format.
Google Sheets supports the full PCRE2 regular-expression engine through functions like REGEXREPLACE
, REGEXEXTRACT
, and REGEXMATCH
. With a single formula you can purge punctuation, enforce country codes, or re-format inconsistent inputs into a clean, analytics-ready column.
Dirty phone numbers break text messages, hinder deduplication, and cause CRM integrations to fail. Manual cleaning doesn’t scale, and simple SUBSTITUTE
chains miss edge cases. Regular expressions (regex) let you codify every quirk—from stray spaces to optional extension markers—in one maintainable rule. Automating this step reduces data entry errors and protects downstream dashboards, email campaigns, and customer-support workflows.
\d
→ any digit 0-9\D
→ any non-digit[^0-9]
→ explicit negation of digits*
→ 0 or more+
→ 1 or more{n}
→ exactly n repetitions{n,}
→ at least n repetitionsParentheses group sub-patterns and capture matches for reuse in the replacement string, e.g. (\d{3})(\d{3})(\d{4})
.
Start by removing parentheses, dashes, spaces, and plus signs:
=REGEXREPLACE(A2, "[^0-9]", "")
Result: (415) 555-1234
→ 4155551234
US numbers missing the leading 1
? Use alternation to add it when length=10:
=IF(LEN(B2)=10, "1" & B2, B2)
For multi-country sheets, pair REGEXMATCH
with specific prefixes.
E.164 expects +
plus country code and no punctuation:
="+" & REGEXREPLACE(C2, "[^0-9]", "")
Now +14155551234
is ready for Twilio or your CRM.
Preserve extensions while cleaning:
=ARRAYFORMULA(IFNA(REGEXREPLACE(A2:A, "[\s-]*(ext|x|extension)[\s-]*([0-9]+)", ";ext=\\2")))
415-555-1234 x99
→ +14155551234;ext=99
The pattern ^(?:\+?1\s*)?(?:\(?\d{3}\)?)[\s.-]*\d{3}[\s.-]*\d{4}$
matches most US numbers regardless of punctuation. Use it inside REGEXMATCH
to validate a column before bulk SMS campaigns.
Need to reformat only numbers already containing a country code? Combine REGEXMATCH
inside an IF
wrapper.
=ARRAYFORMULA(
IF(A2:A="","",
"+" & REGEXREPLACE(
IF(LEN(REGEXREPLACE(A2:A, "[^0-9]", ""))=10,
"1" & REGEXREPLACE(A2:A, "[^0-9]", ""),
REGEXREPLACE(A2:A, "[^0-9]", "")
),
"[^0-9]", "")
)
))
This single formula ingests raw user input, prepends the US country code when needed, and outputs E.164-compliant strings for every row in the sheet.
/* */
) in Apps Script or in a Notes field so future teammates understand the logic.ARRAYFORMULA
to avoid copy-pasting down thousands of rows.REGEXMATCH
to spot invalid rows before you overwrite them with bad data.Regex is slow: Sheets evaluates regex formulas cell-by-cell but stays performant up to tens of thousands of rows.
Digits only is enough: Not true—extensions, leading zeros, and country codes require nuanced handling.
One pattern fits all countries: Each numbering plan has its own rules; build country-specific regexes or use a library when coverage must be global.
Imagine exporting Shopify orders: the phone field includes "+1 (415) 555-1234"
, "415.555.9876"
, and "0044 20 7930 4832"
. Feeding these directly into a messaging API leads to failures. A two-step regex—first detecting the country code (^(00|\+)?(\d+)
) and then stripping non-digits—standardizes everything into clean E.164 strings.
While Galaxy focuses on SQL rather than spreadsheets, the mindset is identical: codify messy text transformations as deterministic logic instead of manual edits. If you later move your Sheets data into a database, comparable REGEXP_REPLACE
statements can live inside Galaxy’s SQL editor and benefit from its AI-powered suggestions.
Accurate phone numbers are critical for SMS marketing, customer support, and fraud detection. Regex automates the cleanup process, reducing manual work, preventing failed messages, and ensuring that downstream systems receive uniform, reliable data.
Use REGEXREPLACE(A2, "[^0-9]", "")
. The pattern [^0-9]
targets every non-digit, and the empty replacement string deletes them.
Yes. After stripping non-digits, test the length; if it equals 10, prepend “1”. Otherwise, assume the user supplied a code.
Sheets handles thousands of regex evaluations quickly. For six-figure rows consider moving to BigQuery and writing a REGEXP_REPLACE
query inside Galaxy’s SQL editor.
Galaxy’s SQL editor lets you run similar REGEXP_REPLACE
statements directly in your database. Its AI copilot can suggest patterns and validate results, extending the same cleaning logic beyond Google Sheets.