Splitting a Column Into Columns

Every now and then, there may be some occasions where someone has stuffed some delimited strings into a single column. Here a few ways this could happen:

  • Someone is avoiding normalization. Oops.
  • There’s an “EmailTo” column where it’s possible to have multiple email addresses in one field.
  • It’s easier to parse in SQL Server? That could be a stretch. I’m reaching.

Whatever the reason is, we can figure out how to split a column into multiple columns. If you’re an Excel expert, you might be familiar with this task. You know Excel, right? That spreadsheet tool that we share a love-hate relationship with.

In Excel, we can accomplish this task pretty easily by using the Convert Text to Columns Wizard and specifying a delimiter. The nice thing about that is we don’t need to tell it how far out to split. It’ll create as many columns as it needs for the delimiter you specify. If there’s data already in the field that would be overwritten, it’s even kind enough to warn you and let you go ahead anyway.

Not quite as easy using SQL Server.

We can still get this done in SQL Server, but it’s not going to be pretty. I’ll be using the ROW_NUMBER() function along with a PIVOT to accomplish this. To actually split the strings by a delimiter, I’ll be using the STRING_SPLIT function.

Prep time.

Let’s go ahead and build a sample table and fill it with some random strings.

CREATE TABLE dbo.Junk
(
	JunkId int NOT NULL IDENTITY
        CONSTRAINT PK_Junk PRIMARY KEY CLUSTERED,
	Trash varchar (128) NOT NULL
);

INSERT INTO dbo.Junk (Trash)
VALUES ('Green, Yellow, Blue'),
       ('Purple, Red'),
       ('Cyan'),
       ('Silver, Turquoise');

Pretty straight forward. Looking at the table, we should see this:

JunkIdTrash
1Green, Yellow, Blue
2Purple, Red
3Cyan
4Silver, Turquoise

Remember, what we want to see is something like this:

JunkIdTrashUnoTrashDosTrashTres
1GreenYellowBlue
2PurpleRedNULL
3CyanNULLNULL
4SilverTurquoiseNULL

Here’s the code to do that.

WITH CTE
AS
(
SELECT J.JunkId,
       TRIM(SS.value) AS Trash,
       ROW_NUMBER() OVER (PARTITION BY J.JunkId ORDER BY (SELECT 1/0)) AS RowNum
FROM dbo.Junk AS J
CROSS APPLY STRING_SPLIT(J.Trash, N',') AS SS
)
SELECT PVT.JunkId, 
       PVT.[1] AS TrashUno, 
       PVT.[2] AS TrashDos, 
       PVT.[3] AS TrashTres
FROM CTE
PIVOT
(
    MIN(CTE.Trash) FOR RowNum IN ([1], [2], [3])
) AS PVT

An explanation is due.

The first thing I wanted to focus on was splitting the strings. I decided to use STRING_SPLIT here. It’s being used with CROSS APPLY since STRING_SPLIT is a table-valued function. To be on the safe side, the “value” column is being trimmed as well. ROW_NUMBER came into play to keep these strings with their associated “JunkId”.

Since all of this magic is happening within a CTE, I can PIVOT these results. These rows can now be rotated or transposed using that helpful “RowNum” column. I used MIN as my aggregation function, but MAX would work as well for this. Since I’ve only got three possible results from ROW_NUMBER, I only needed three pivoted columns. This could work for as many columns as you wish. However, you could run into resource limitations or just be approached by an angry DBA. Whichever comes first.

Exit strategy.

Before you do something like this, please think first about “why” you’re doing this. Are multiple columns necessary? To split something like a “LastName, FirstName” field, most likely. A lot of times, you should consider putting these into multiple rows rather than columns. It all depends on your unique scenario.

Have you ever had to do this before? Have you used another method? I’d love to hear about it.

Thanks for reading!

Leave a Reply

Your email address will not be published. Required fields are marked *