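Standard SQL doesn't define a string-splitting function, although several dialects ship their own (for example, `STRING_SPLIT` in SQL Server 2016 and later, or `string_to_array` in PostgreSQL). This concept explores common workarounds for when no such built-in is available, using string functions like `SUBSTRING` and `CHARINDEX`, and recursive CTEs.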
Splitting strings by delimiters is a common task in SQL, but unlike many programming languages, standard SQL has no portable 'split' function. Where a dialect-specific built-in is unavailable, we need to leverage string manipulation functions. One common approach uses `SUBSTRING` and `CHARINDEX` to extract substrings based on the delimiter's position. Another method, particularly useful when a string contains several delimiters, is a recursive Common Table Expression (CTE), which repeatedly extracts substrings until the delimiter is no longer found. This makes the splitting more dynamic and flexible, especially when the number of substrings varies from one string to the next.
Understanding string manipulation techniques like splitting is crucial for data cleaning and transformation tasks. It allows you to extract meaningful information from strings stored in your database, enabling more complex queries and data analysis.
Most relational databases let you combine `SUBSTRING()` with `CHARINDEX()` (or `INSTR()` / `POSITION()`, depending on the dialect) to locate the delimiter and extract the left-most token. By repeatedly applying these two functions, or wrapping them in a loop or CTE, you can peel off each segment of the original string without relying on a proprietary `SPLIT` function.
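As a minimal sketch of the left-most-token pattern, the snippet below runs the query against an in-memory SQLite database through Python's standard `sqlite3` module, purely so the example is runnable anywhere. SQLite's `instr()` and `substr()` stand in for `CHARINDEX()` and `SUBSTRING()`; note that `CHARINDEX()` takes its arguments in the opposite order (search term first). The helper name `split_first`, the sample values, and the single-character comma delimiter are assumptions made for illustration.

```python
import sqlite3

# SQLite dialect: instr(haystack, needle) and substr(string, start, length).
# Appending a delimiter before searching guarantees that instr() finds one,
# so a value without any comma is returned whole instead of being cut short.
FIRST_TOKEN_SQL = """
SELECT
    substr(:s, 1, instr(:s || ',', ',') - 1) AS first_token,
    substr(:s, instr(:s || ',', ',') + 1)    AS remainder;
"""


def split_first(value):
    """Return (first_token, remainder) for a comma-separated string."""
    conn = sqlite3.connect(":memory:")
    try:
        return conn.execute(FIRST_TOKEN_SQL, {"s": value}).fetchone()
    finally:
        conn.close()


if __name__ == "__main__":
    print(split_first("red,green,blue"))
```

Applying the same expression to `remainder` would peel off the second token, which is why this style stays manageable only for the first one or two segments.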
If you only need the first or second token, a single pass with `SUBSTRING()` and `CHARINDEX()` is fine. But when the input contains an unknown or variable number of delimiters, like a comma-separated ID list, a recursive Common Table Expression shines. The CTE repeatedly finds the next delimiter, emits the token, trims the processed part, and continues until no delimiter remains, returning each piece as its own row. This approach is set-based, easy to integrate with joins, and scales better than nested scalar functions.
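The recursive pattern described above can be sketched as follows. This is one possible formulation, not the only one: it again uses SQLite through Python's `sqlite3` module so it can be run as-is, and the CTE name `split`, its columns `n`, `token`, and `rest`, and the helper `split_to_rows` are all hypothetical. In SQL Server the same idea would be written with `CHARINDEX()` and `SUBSTRING()` and without the `RECURSIVE` keyword.

```python
import sqlite3

SPLIT_SQL = """
WITH RECURSIVE split(n, token, rest) AS (
    -- Anchor row: no token yet; the input gets a trailing delimiter so the
    -- last segment is terminated the same way as all the others.
    SELECT 0, NULL, :s || ','
    UNION ALL
    -- Recursive step: emit the text before the next delimiter as a token
    -- and keep everything after that delimiter for the next iteration.
    SELECT n + 1,
           substr(rest, 1, instr(rest, ',') - 1),
           substr(rest, instr(rest, ',') + 1)
    FROM split
    WHERE rest <> ''          -- stop once nothing is left to process
)
SELECT token
FROM split
WHERE n > 0                   -- skip the anchor row
ORDER BY n;
"""


def split_to_rows(value):
    """Return each comma-separated segment of `value` as its own row."""
    conn = sqlite3.connect(":memory:")
    try:
        return [row[0] for row in conn.execute(SPLIT_SQL, {"s": value})]
    finally:
        conn.close()


if __name__ == "__main__":
    for token in split_to_rows("10,20,30"):
        print(token)
```

Because the result is an ordinary row set, the final `SELECT` could just as well be joined against another table on `token`. Keeping the ordinal `n` is a design choice that preserves the original order of the segments; it can be dropped if order does not matter.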
Galaxy’s context-aware AI Copilot can auto-generate the entire recursive CTE or `SUBSTRING()`/`CHARINDEX()` pattern for you. Simply describe the source column and delimiter in natural language, and the Copilot suggests a ready-to-run query inside the editor. It also adapts the syntax to your database dialect, annotates each step for readability, and saves the snippet to a shared Collection so teammates can reuse or endorse it, with no more copying SQL into Slack or Notion.