Samstag, 11. März 2017

T-SQL: PARSING VALUES BASED ON NTH OCCURENCE OF A CHARACTER IN A STRING



I recently had to parse values stored in a column that had following pattern: -'Word1-Word2-Word3-Word4-' etc.
SQL Server already has the built in function CHARINDEX() which is quite helpful, but the issue is that it only returns the position of the 1st occurence.
Technically I could nest one CHARINDEX() in another one, but that would create a big piece of code, is a little hard to read plus not a dyamic solution. So, I decided to create a function that returns the values based on input parameters:

CREATE FUNCTION dbo.ParseStringValue(
@SplitCharacter nvarchar(5),
@TargetString nvarchar(50),
@NthOccurence int)

RETURNS nvarchar(50)

AS
BEGIN

DECLARE @Pos1 int = 0,
@Pos2 int = 0,
@Count int = 0,
@Result nvarchar(25)

WHILE @Count <= @NthOccurence
BEGIN

SET @pos1 = @pos2
SET @pos2 = CHARINDEX(@SplitCharacter, @TargetString, @pos1 + 1)
SET @Count = @Count +1

END

SET @Result = SUBSTRING(@TargetString, @pos1+1, (@pos2-1)- @pos1)

RETURN (@Result)
END


I will use this string for our sample '-Aziz-Sharif-blogspot-com-' and will search for the value behind the 4th occurence of the split character '-'

The function will be called in the SELECT part and I need to pass 3 parameters 
@SplitCharacter: The seperator value '-'
@TargetString: The string value - here a fixed string for demonstration, could be a column as well
@NthOccurence: The nth occurence of a character in a string I wish to be returned

SELECT dbo.ParseStringValue('-', '-Aziz-Sharif-blogspot-com-', 4)

The while loop will go through the same piece of code and stop until it reached the nth occurence:

WHILE @Count <= @NthOccurence
BEGIN

SET @pos1 = @pos2
SET @pos2 = CHARINDEX(@SplitCharacter, @TargetString, @pos1 + 1)
SET @Count = @Count +1

END


I will run the query in debug mode and break down the values of each run.











Within the loop I define the positions of the split character. Since I need to find the values between 2 characters, I have declared 2 integer variables: @pos1 and @pos2.


SET @pos1 = @pos2
SET @pos2 = CHARINDEX(@SplitCharacter, @TargetString, @pos1 + 1)


@pos2 is passed to @pos1. So @pos1 stores the most recent value and @pos2 returns the next position after @pos1 of the split character I'm searching for:
SET @pos2 = CHARINDEX(@SplitCharacter, @TargetString, @pos1 + 1)
The @pos1 + 1 part will define ther start position .

The final relevant values stored are:
@pos1 = 22 -> '-Aziz-Sharif-blogspot-com-'
@pos1 = 26 -> '-Aziz-Sharif-blogspot-com-'

Since I need to extract the value between the 2 split characters, I need to add 1 position to the start position and deduct 1 from the end position:
SET @Result = SUBSTRING(@TargetString, @pos1+1, (@pos2-1)- @pos1)

The function returns the value after the 4th occurence in the defined string