How to split a string in SQL Server the right way

I’ve seen many string-splitting functions for SQL Server, and many of them rebuild the string being split over and over, which causes terrible performance because memory is re-allocated on every iteration. That is the wrong way to split a string in SQL Server.

Today I wrote a much more efficient SQL Server string-splitting routine. If you need varchar data, you can change nvarchar to varchar with no problem.

I compared this against a table-valued parameter and it actually outperformed the table-valued parameter! The reason, I believe, is that I have a very large string (about 10,000 items once it’s split), and that is a LOT of Transact-SQL for SQL Server to parse.
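For reference, the table-valued-parameter approach I compared against looks roughly like this. This is only a sketch: the type name, procedure name, and column name are made up for illustration, not taken from my actual test.

```sql
-- Hypothetical TVP setup (all names are illustrative)
CREATE TYPE dbo.StringList AS TABLE (Value nvarchar(MAX) NOT NULL);
GO

CREATE PROCEDURE dbo.UseStringList
    @Values dbo.StringList READONLY  -- TVPs must be declared READONLY
AS
BEGIN
    -- Join or select against @Values like any other table
    SELECT v.Value FROM @Values AS v;
END
```

The caller (for example, ADO.NET) passes the whole set in as one structured parameter, so SQL Server does not have to parse per-row INSERT statements.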

SET NOCOUNT ON

-- Change nvarchar(MAX) to nvarchar(50), varchar(50), or whatever exactly matches the string column you will be searching against
DECLARE @SplitStringTable TABLE (Value nvarchar(MAX) NOT NULL)
DECLARE @StringToSplit nvarchar(MAX) = 'your|string|to|split|here'
DECLARE @SplitEndPos int
DECLARE @SplitValue nvarchar(MAX)
DECLARE @SplitDelim nvarchar(1) = '|'
DECLARE @SplitStartPos int = 1

SET @SplitEndPos = CHARINDEX(@SplitDelim, @StringToSplit, @SplitStartPos)

WHILE @SplitEndPos > 0
BEGIN
	SET @SplitValue = SUBSTRING(@StringToSplit, @SplitStartPos, (@SplitEndPos - @SplitStartPos))
	INSERT @SplitStringTable (Value) VALUES (@SplitValue)
	SET @SplitStartPos = @SplitEndPos + 1
	SET @SplitEndPos = CHARINDEX(@SplitDelim, @StringToSplit, @SplitStartPos)
END

-- Grab everything after the last delimiter (2147483647 is the max int, i.e. read to the end of the string)
SET @SplitValue = SUBSTRING(@StringToSplit, @SplitStartPos, 2147483647)
INSERT @SplitStringTable (Value) VALUES (@SplitValue)

SET NOCOUNT OFF

-- You can select or join with the values in @SplitStringTable at this point.
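As a sketch of that last step, a join would look like the following. The dbo.Customers table and its ExternalKey column are hypothetical names used purely for illustration:

```sql
-- Hypothetical usage: dbo.Customers, CustomerId, Name, and ExternalKey are illustrative names
SELECT c.CustomerId, c.Name
FROM dbo.Customers AS c
INNER JOIN @SplitStringTable AS s
    ON s.Value = c.ExternalKey;
```

Note that the join must run in the same batch that declared @SplitStringTable, since table variables are scoped to the batch.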

 

13 Comments

Aaron Bertrand
6 years ago

Can you show the table-valued parameter, the code you used that did the same thing, and how exactly you measured the performance? This iterative approach has certainly not been anywhere near the top performers in my testing… so much so that I even left it out of my round-up two years ago.

Aaron Bertrand
6 years ago
Reply to  jjxtra

Table-valued parameter or table variable? I’m still not clear I understand exactly what you were comparing to what.

Aaron Bertrand
6 years ago
Reply to  jjxtra

For example, if you have 10,000 individual insert statements, then of course that will take longer (in fact I bet 99.9% of the time was spent processing all of those individual insert statements). The point of the TVP is that you have this set of 10,000 things already as a set somewhere (e.g. a DataTable in C#). Then you just pass that *set* in, and no looping, or 10,000 inserts, or string parsing are required.

Aaron Bertrand
6 years ago
Reply to  jjxtra

Of course there is. How did you build the string of 10,000 GUIDs?

Aaron Bertrand
6 years ago
Reply to  jjxtra

So in C# you did something to build that list of 10,000 strings, right? Put them in a DataTable instead, and pass it as a Structured parameter type to a table-valued parameter. This should not produce “generated SQL” of 10,000 individual inserts.

Aaron Bertrand
6 years ago
Reply to  jjxtra

Well, that’s where your looping of 10,000 inserts comes from, not from the use of a table variable.

Aaron Bertrand
6 years ago
Reply to  jjxtra

Yes, Jeff has admitted that his approach is not designed to handle strings longer than 8K well at all (and my testing proved that as well). And you would need that because you can’t fit 10,000 GUIDs in 8,000 characters.