UDFs help you work with a denormalized table
T-SQL programmers often need to handle data that comes in a denormalized form. Certain tables might appear in a denormalized form because they store data that was imported from legacy systems, for example, or from text files. Also, many online transaction processing (OLTP) environments aren't properly normalized even though they should be. Let's look at an example problem that involves temperature measurement samples stored in a denormalized form. I'd like to thank SQL Server MVPs Linda Wierzbicki and Steve Kass, who came up with some of the tricks that I use in this article.
Temperature Samples
The Bottoms Up fictional winery has vineyards at several sites. The owner gathers temperatures from each site as often as 30 times a day and performs various statistical calculations based on the measured temperatures. He uses the results to determine how much to water and fertilize the grapes. A legacy system records the temperatures and produces a daily text file in which each record contains a site ID and a comma-delimited array of temperature measurements in Celsius. The system then loads the text file to a table in SQL Server.
The table holds one column containing the site ID and another, character-based column containing the comma-delimited list of temperatures. You need to write queries that calculate various statistics such as average, minimum, and maximum temperatures for each site. Run the script in Listing 1 to create the TempSamples table and populate it with sample data.
The current table format doesn't even meet the first normal form (1NF) requirement of having no repeating groups. A normalized form of the table, such as the one that the code in Listing 2 creates, would be much more convenient to code against. So, you decide to write some code that will populate a copy of the TempSamples tablelet's call it NormTempSampleswith a normalized version of the data.
Solution That Uses a UDF
One way to approach the problem is to work out a way to extract a specific element from the comma-delimited list of elements. If you're working with SQL Server 2000, you can write a user-defined function (UDF) that performs the task. The function accepts two arguments: the ordinal position of the element that you want to extract and the string containing the comma-delimited list of elements. Run the script that Listing 3, page 18, shows to create the fn_getelement() UDF. Later, I discuss another solution that works in both SQL Server 2000 and 7.0.
Let's look at how the UDF works. First, the function validates the input. I'll discuss just one validity check that might need explanation. The logical expression
IF @ord > LEN(@str) - LEN(REPLACE(@str, ',', '')) + 1
returns true only if the specified element ordinal position is beyond the end of the list. The trick to correctly using this expression is to subtract the length of the string (without the commas) from the length of the original string (with the commas). In this case, you get the number of elements minus 1, so you need to add 1 back in. Next, the function finds the starting position of the desired element by using a WHILE loop that advances from one comma to the next until the number of iterations reaches the element in the desired position. Finally, the function extracts the desired element from the comma-delimited list, converts the character-based element to the integer data type, and returns the element.
After you've extracted a desired element from the list, you face another obstacle. You need to duplicate each site's row as many times as you have measurements for that site; then, you need to generate the appropriate sample number for each row so that you can invoke the function once for each sample. For example, if site A has five elements, you need to generate five rows with sample numbers ranging from 1 to 5. Then you invoke the function, supplying the list of measurements and the sample number as arguments. You can do all this by first creating an auxiliary table containing the numbers from 1 to the maximum possible number of measurements30, in this case. Running the code in Web Listing 1 (available online) generates the Nums auxiliary table and populates it with the desired range of numbers.
Prev. page  
[1]
2
next page