DOWNLOAD THE CODE:
Download the Code 24517.zip

Let's see what happens when you run the query against the larger data set in titles_big:

SELECT SUM(dbo.WordRepeatedNumTimes
  (title, 'the'))
FROM titles_big

The query still runs, but it took 2 minutes and 23 seconds to run on my laptop. A simple SUM() of the price column returned in a few milliseconds. So, you can learn a few lessons from this example:

  • UDFs let you extend T-SQL in many useful ways.
  • You shouldn't perform an activity just because you can. T-SQL is a set-based language, and you must consider the performance implications of performing non­set-based activity such as this solution in the database.
  • Usually you can find more than one way to solve a SQL problem. The solution I wrote works and solves the reader's problem. However, in this case, the reader is probably better off thinking outside the box and implementing full-text indexing.

You can also create a set-based solution for counting the number of times a word appears in a string. SQL Server Magazine contributing editor Itzik Ben-Gan suggested this clever and simple solution. The following query shows how many times the word the appears in the title column:

SELECT title, (LEN(title) - LEN(REPLACE(title, 'the', ')))/LEN('the')
FROM titles_big

And the next query sums the occurrences of the across the entire table:

SELECT SUM((LEN(title) - LEN(REPLACE(title, 'the', ')))/LEN('the'))
FROM titles_big

You simply replace all occurrences of the target word with an empty string, then compare the length of the string before and after the replacements. So if the original string is nine characters longer than the replacement string and the target word the is three characters long, you can see that you removed three occurrences of the target word. You can usually replace a row-by-row approach with set-based logic, as Ben-Gan's example shows.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Great function! Excellent use of udf.

Kenneth Johnson

 
 

ADS BY GOOGLE