Pages

Men

rh

7/09/2013

SQL Server CLR and TSQL functions to parse a delimited string

Problem

There are several tips and articles on the internet that discuss how to split a delimited list into multiple rows. This tip shows two approaches to this problem a T-SQL function and a CLR function.

Solution

The following examples show two different approaches to splitting a delimited list into multiple rows.  Both approaches return the same result set, but this tip will show the different techniques to deploy the functions.

T-SQL Version

There are several versions of a T-SQL approach that you can find on the internet, but here is one that parses a string based on a delimiter and returns one row per value.  The code has been put together to show you how this is done versus finding the most optimal approach for this problem. 

Just copy the code below and execute it in a query window.

CREATE FUNCTION [dbo].[fnParseStringTSQL] (@string NVARCHAR(MAX),@separator NCHAR(1))
RETURNS @parsedString TABLE (string NVARCHAR(MAX))
AS 
BEGIN
   DECLARE @position int
   SET @position = 1
   SET @string = @string + @separator
   WHILE charindex(@separator,@string,@position) <> 0
      BEGIN
         INSERT into @parsedString
         SELECT substring(@string, @position, charindex(@separator,@string,@position) - @position)
         SET @position = charindex(@separator,@string,@position) + 1
      END
     RETURN
END

CLR Version

This can be done either manually or you can do this using Visual Studio. If you deploy from Visual Studio a lot of these steps are simplified.

Step 1 - CLR code
Before we get started the first thing that needs to be done is to enable the CLR on your SQL Server.  This can be done by using the SQL Server Surface Area Configuration tool.  Refer to this tip CLR String Sort Function in SQL Server 2005 for more information.

Copy and save the VB.Net code below in a file called:  C:\fnParseString.vb or whatever you prefer.

Imports System.Collections
Imports Microsoft.SqlServer.Server
Partial Public Class UserDefinedFunctions
    <Microsoft.SqlServer.Server.SqlFunction( _
                FillRowMethodName:="GetNextToken", _
                 TableDefinition:="StringCol NVARCHAR(MAX)")> _
    Public Shared Function parseStringCLR(<SqlFacet(MaxSize:=-1)> ByVal Input As String, _
      ByVal Separator As Char) As IEnumerable
        Dim Result() As String
        Result = Input.Split(Separator)
        Return Result
    End Function
    Public Shared Sub GetNextToken(ByVal row As Object, ByRef TheToken As String)
        TheToken = CStr(row)
    End Sub
End Class

Step 2 - Compile CLR Code - (You only need to do this if you are not using Visual Studio to develop and deploy)

In order to use this code, the code has to be compiled first to create a DLL.

The following command is run from a command line to compile the CLR code using the vbc.exe application.  This is found in the .NET 2.0 framework directory.  This may be different on your server or desktop.  Also, this code should be compiled on the machine where the code will run.
So from a command line run the following command:

C:\WINDOWS\Microsoft.NET\Framework\v2.0.50727\vbc /target:library C:\fnParseString.vb

The code should now be compiled in a file called: C:\fnParseString.dll

Step 3 - Create Assembly and Function - (You only need to do this if you are not using Visual Studio to develop and deploy)

After the code has been compiled you need to create the assembly and the function with SQL Server.  To do this, run these commands in the database where you want to create the function. 
For the function you will see three components that are referenced CLRFunctions.UserDefinedFunctions .parseStringCLR .
  • CLRFunctions - the assembly reference
  • UserDefinedFunctions - the class reference in the VB code
  • parseStringCLR - the function reference in the VB code
CREATE ASSEMBLY CLRFunctions FROM 'C:\fnParseStringCLR.dll' 
GO 
CREATE FUNCTION [dbo].fnParseStringCLR(@string [nvarchar](4000), @separator [nchar](1))
RETURNS TABLE (
[StringCol] [nvarchar](max) NULL
) WITH EXECUTE AS CALLER
AS 
EXTERNAL NAME CLRFunctions.UserDefinedFunctions.parseStringCLR

Step 4 - Test It
To test the functions, run the following SELECT statements.

SELECT * FROM dbo.fnParseStringTSQL('SQL Server 2000|SQL Server 2005|SQL Server 2008|SQL Server 7.0','|')
GO
SELECT * FROM dbo.fnParseStringCLR('SQL Server 2000|SQL Server 2005|SQL Server 2008|SQL Server 7.0','|')
GO
SELECT * FROM dbo.fnParseStringTSQL('Apple,Banana,Pear',',')
GO
SELECT * FROM dbo.fnParseStringCLR('Apple,Banana,Pear',',')
GO

Here is the output from the above queries:
Result set for the first two queries


Result set for the second two queries



Here is another test that has over 1000 delimited items.  With this amount of data we can see that the CLR code is faster. Although each time you run this you may get different execution times, in my tests the CLR was always almost three times faster.

SELECT * FROM dbo.fnParseStringTSQL( '980,365,771,404,977,818,474,748,975,884,347,521,396,791,775,848,438,715,492,781,679,912,830,950,788,
387,738,435,799,475,429,415,967,373,721,882,717,708,837,393,740,360,954,990,794,378,327,741,468,516,
832,753,970,420,744,531,761,811,874,504,747,332,809,897,956,368,409,716,968,386,846,441,431,458,418,
961,479,507,505,367,454,523,425,872,804,981,964,328,362,861,927,841,765,423,939,326,766,714,405,344,
72,494,850,955,770,923,926,722,810,867,400,789,871,900,526,513,364,935,512,371,436,384,316,892,875,
988,906,374,895,911,844,798,745,342,319,330,760,983,478,978,783,973,986,868,467,855,853,388,449,773,
801,529,887,948,838,735,839,793,397,518,893,728,440,860,511,994,831,750,833,524,680,883,466,899,907,
375,933,985,942,414,486,905,894,739,462,885,940,357,514,443,749,754,946,947,725,343,473,452,807,931,
790,959,930,898,896,965,483,510,957,322,460,951,996,982,859,448,966,366,713,445,497,491,515,787,915,
743,470,972,932,944,489,485,732,525,758,772,379,806,325,411,976,805,755,395,752,389,958,995,417,902,
4,727,987,936,506,889,925,402,324,731,851,352,464,392,891,345,795,945,803,917,825,482,469,937,840,
434,383,496,488,450,828,377,774,391,706,934,484,522,729,819,842,490,908,422,763,757,826,920,916,350,
922,461,432,463,834,733,530,780,351,815,863,487,876,824,974,370,356,407,821,394,446,724,382,928,742,
465,817,528,778,969,341,372,999,849,711,442,730,428,857,869,993,759,918,406,710,862,949,796,878,919,
843,881,533,971,401,943,444,784,723,509,800,814,426,412,457,476,455,822,477,720,493,910,416,355,764,
410,421,984,782,736,707,776,471,989,456,903,886,317,914,430,802,827,779,880,399,953,459,962,904,519,
726,433,852,769,997,888,439,879,709,756,527,762,734,390,424,941,835,359,2,921,346,864,866,960,1,858,
719,369,532,979,413,358,323,398,534,873,447,321,403,495,348,991,913,823,797,820,816,847,385,865,376,
746,777,520,381,854,829,767,768,481,361,408,718,331,380,877,751,924,792,813,786,963,812,318,363,329,
929,3,535,427,992,909,508,437,938,320,998,419,901,517,349,856,808,480,870,737,785,453,890,952,451,
836,845,712,975,884,347,521,396,791,775,848,438,715,492,781,679,912,830,950,788,387,738,435,799,475,
429,415,967,373,721,882,717,708,837,393,740,360,954,990,794,378,327,741,468,516,832,753,970,420,744,
531,761,811,874,504,747,332,809,897,956,368,409,716,968,386,846,441,431,458,418,961,479,507,505,367,
454,523,425,872,804,981,964,328,362,861,927,841,765,423,939,326,766,714,405,344,472,494,850,955,770,
923,926,722,810,867,400,789,871,900,526,513,364,935,512,371,436,384,316,892,875,988,906,374,895,911,
844,798,745,342,319,330,760,983,478,978,783,973,986,868,467,855,853,388,449,773,801,529,887,948,838,
735,839,793,397,518,893,728,440,860,511,994,831,750,833,524,680,883,466,899,907,375,933,985,942,414,
486,905,894,739,462,885,940,357,514,443,749,754,946,947,725,343,473,452,807,931,790,959,930,898,896,
965,483,510,957,322,460,951,996,982,859,448,966,366,713,445,497,491,515,787,915,743,470,972,932,944,
489,485,732,525,758,772,379,806,325,411,976,805,755,395,752,389,958,995,417,902,4,727,987,936,506,
889,925,402,324,731,851,352,464,392,891,345,795,945,803,917,825,482,469,937,840,434,383,496,488,450,
828,377,774,391,706,934,484,522,729,819,842,490,908,422,763,757,826,920,916,350,922,461,432,463,834,
733,530,780,351,815,863,487,876,824,974,370,356,407,821,394,446,724,382,928,742,465,817,528,778,969,
341,372,999,849,711,442,730,428,857,869,993,759,918,406,710,862,949,796,878,919,843,881,533,971,401,
943,444,784,723,509,800,814,426,412,457,476,455,822,477,720,493,910,416,355,764,410,421,984,782,736,
707,776,471,989,456,903,886,317,914,430,802,827,779,880,399,953,459,962,904,519,726,433,852,769,997,
888,439,879,709,756,527,762,734,390,424,941,835,359,2,921,346,864,866,960,1,858,719,369,532,979,413,
358,323,398,534,873,447,321,403,495,348,991,913,823,797,820,816,847,385,865,376,746,777,520,381,854,
829,767,768,481,361,408,718,331,380,877,751,924,792,813,786,963,812,318,363,329,929,3,535,427,992,
909,508,437,938,320,998,419,901,517,349,856,808,480,870,737,785,453,890,952,451,836,845,712' ,',')


SELECT * FROM dbo.fnParseStringCLR( '980,365,771,404,977,818,474,748,975,884,347,521,396,791,775,848,438,715,492,781,679,912,830,950,788,
387,738,435,799,475,429,415,967,373,721,882,717,708,837,393,740,360,954,990,794,378,327,741,468,516,
832,753,970,420,744,531,761,811,874,504,747,332,809,897,956,368,409,716,968,386,846,441,431,458,418,
961,479,507,505,367,454,523,425,872,804,981,964,328,362,861,927,841,765,423,939,326,766,714,405,344,
72,494,850,955,770,923,926,722,810,867,400,789,871,900,526,513,364,935,512,371,436,384,316,892,875,
988,906,374,895,911,844,798,745,342,319,330,760,983,478,978,783,973,986,868,467,855,853,388,449,773,
801,529,887,948,838,735,839,793,397,518,893,728,440,860,511,994,831,750,833,524,680,883,466,899,907,
375,933,985,942,414,486,905,894,739,462,885,940,357,514,443,749,754,946,947,725,343,473,452,807,931,
790,959,930,898,896,965,483,510,957,322,460,951,996,982,859,448,966,366,713,445,497,491,515,787,915,
743,470,972,932,944,489,485,732,525,758,772,379,806,325,411,976,805,755,395,752,389,958,995,417,902,
4,727,987,936,506,889,925,402,324,731,851,352,464,392,891,345,795,945,803,917,825,482,469,937,840,
434,383,496,488,450,828,377,774,391,706,934,484,522,729,819,842,490,908,422,763,757,826,920,916,350,
922,461,432,463,834,733,530,780,351,815,863,487,876,824,974,370,356,407,821,394,446,724,382,928,742,
465,817,528,778,969,341,372,999,849,711,442,730,428,857,869,993,759,918,406,710,862,949,796,878,919,
843,881,533,971,401,943,444,784,723,509,800,814,426,412,457,476,455,822,477,720,493,910,416,355,764,
410,421,984,782,736,707,776,471,989,456,903,886,317,914,430,802,827,779,880,399,953,459,962,904,519,
726,433,852,769,997,888,439,879,709,756,527,762,734,390,424,941,835,359,2,921,346,864,866,960,1,858,
719,369,532,979,413,358,323,398,534,873,447,321,403,495,348,991,913,823,797,820,816,847,385,865,376,
746,777,520,381,854,829,767,768,481,361,408,718,331,380,877,751,924,792,813,786,963,812,318,363,329,
929,3,535,427,992,909,508,437,938,320,998,419,901,517,349,856,808,480,870,737,785,453,890,952,451,
836,845,712,975,884,347,521,396,791,775,848,438,715,492,781,679,912,830,950,788,387,738,435,799,475,
429,415,967,373,721,882,717,708,837,393,740,360,954,990,794,378,327,741,468,516,832,753,970,420,744,
531,761,811,874,504,747,332,809,897,956,368,409,716,968,386,846,441,431,458,418,961,479,507,505,367,
454,523,425,872,804,981,964,328,362,861,927,841,765,423,939,326,766,714,405,344,472,494,850,955,770,
923,926,722,810,867,400,789,871,900,526,513,364,935,512,371,436,384,316,892,875,988,906,374,895,911,
844,798,745,342,319,330,760,983,478,978,783,973,986,868,467,855,853,388,449,773,801,529,887,948,838,
735,839,793,397,518,893,728,440,860,511,994,831,750,833,524,680,883,466,899,907,375,933,985,942,414,
486,905,894,739,462,885,940,357,514,443,749,754,946,947,725,343,473,452,807,931,790,959,930,898,896,
965,483,510,957,322,460,951,996,982,859,448,966,366,713,445,497,491,515,787,915,743,470,972,932,944,
489,485,732,525,758,772,379,806,325,411,976,805,755,395,752,389,958,995,417,902,4,727,987,936,506,
889,925,402,324,731,851,352,464,392,891,345,795,945,803,917,825,482,469,937,840,434,383,496,488,450,
828,377,774,391,706,934,484,522,729,819,842,490,908,422,763,757,826,920,916,350,922,461,432,463,834,
733,530,780,351,815,863,487,876,824,974,370,356,407,821,394,446,724,382,928,742,465,817,528,778,969,
341,372,999,849,711,442,730,428,857,869,993,759,918,406,710,862,949,796,878,919,843,881,533,971,401,
943,444,784,723,509,800,814,426,412,457,476,455,822,477,720,493,910,416,355,764,410,421,984,782,736,
707,776,471,989,456,903,886,317,914,430,802,827,779,880,399,953,459,962,904,519,726,433,852,769,997,
888,439,879,709,756,527,762,734,390,424,941,835,359,2,921,346,864,866,960,1,858,719,369,532,979,413,
358,323,398,534,873,447,321,403,495,348,991,913,823,797,820,816,847,385,865,376,746,777,520,381,854,
829,767,768,481,361,408,718,331,380,877,751,924,792,813,786,963,812,318,363,329,929,3,535,427,992,
909,508,437,938,320,998,419,901,517,349,856,808,480,870,737,785,453,890,952,451,836,845,712' ,',')


Using the functions against data in a table

A reader asked how these functions can be used against data that exists in a table.  Here is a simple example of populating a table and then using these functions:

CREATE TABLE #temp (id int, stringData varchar(100))
INSERT INTO #temp VALUES (1,'SQL Server 2000,SQL Server 2005,SQL Server 2008,SQL Server 7.0,SQL Server 2012,SQL Server 2008 R2')
INSERT INTO #temp VALUES (2,'Apple,Banana,Pear')


SELECT y.id, fn.string 
FROM #temp AS y CROSS APPLY dbo.fnParseStringTSQL(y.stringData, ',') AS fn


DROP TABLE #temp


SummaryWith the above you can see the two different approaches to tackling this issue. For the most part you are probably fine with either code set, but do some testing to see which approach works best.


Source cocollected from mssqltips.com

No comments :

Post a Comment