• subscribe
April 26, 2001 12:00 AM

Adding Performance

SQL Server Pro
InstantDoc ID #20131
Downloads
20131.zip

Use mathematics to optimize T-SQL code

Editor's Note: Send your experts-only T-SQL tips to Itzik Ben-Gan at blackbelt@sqlmag.com. If we use your tip in the magazine, you'll recieve $100 and an exclusive T-SQL Black Belt shirt.

Adding Performance
How many ways can you solve a given business problem by using T-SQL? In most cases, you have several options. But how many times have you considered using mathematics to help you solve business problems? You probably don't often consider that approach if your problem isn't purely mathematical. However, with the aid of mathematics, you can sometimes reduce the amount of T-SQL code you need to solve a problem and optimize the code's performance considerably. In this article and the next, I'll present some business problems and show you how to use a mathematical approach to solve them.

Calculating a Group Product
This example, which Shelly Beckwith of Paternity Testing Corporation and Ted Malone (founder of SQLtrain.com) provided, comes from a biological testing facility where a table is created in a SQL Server database to store the results of individual biological tests. Each test has a specific test ID and a case number the tester assigns before the tests begin. Any case number can have multiple test results. You need to find a final result that is the product of all individual test results for a given case number. Listing 1 contains a script that creates the Testresults table and populates it with sample data.

This problem calls for an aggregate query because it requires the aggregate product of all individual test results for each case number. T-SQL provides aggregate functions such as SUM(), MIN(), MAX(), and COUNT(), but it doesn't include a PRODUCT() aggregate function. However, you can approach this problem from several other directions. For example, you can declare a cursor on a query that sorts the Testresults table's rows by case ID and test ID and assign the value 1 to a variable. Then, use T-SQL code to create a loop to fetch the cursor rows one by one, multiplying the variable by the current row's value. Every time the caseid column value changes, the T-SQL code inserts into a temporary table a row that contains the cumulative product of the test result values, which is stored in the variable, and resets that variable back to 1.

Another approach is to use a temporary table and a loop instead of cursors. The solution in Listing 2 demonstrates the temporary table approach, which the biological testing facility originally used. The code uses a SELECT INTO statement to copy all the rows from the Testresults table into the #T1 temporary table, then adds another column called used and populates it with zeros. The code processes the rows in #T1 in a loop, starting with the row that has the lowest test ID and continuing with the next test ID until it has processed all rows. In each iteration of the loop, the code updates the current row's result column to its current value multiplied by the test result value of the previous processed row that has the same case ID. If the code has processed no rows with the same case ID, it multiplies the value by 1. When the loop finishes, the row with the maximum test ID within each case ID holds the product of all the test result values with the same case ID. The test result values in this specific scenario can't be negative, so the maximum test result value for each case ID in the #T1 table is the desired product, and the GROUP BY query in Listing 2 returns the desired result, which Table 1 shows.

You could also use a mathematical approach to tackle this problem. If you could use one aggregate query to produce the desired result, you'd almost certainly have better performance than the other approaches I discussed. For example, T-SQL provides the SUM() aggregate function. If you could use SUM() to calculate a product, your work would be done. I'll save you the trouble of taking out your mathematics books to search for formulas that translate addition to multiplication: You can find one such formula in the world of logarithms.



ARTICLE TOOLS

Comments
  • Diana
    6 years ago
    Sep 05, 2006

    Hi Diana,

    Sorry for the late reply; I was offline (hiking)...

    Here's my reply to Jeff:

    Since the log functions in T-SQL are based on imprecise floating point
    datatypes you should expect a certain level of inaccuracy, and should
    not rely on such calculations for fractional values. I'd recommend
    taking a look at the series of articles about custom aggregations and
    considering the alternative methods (cursors, pivoting, user defined
    aggregates).

    Regards,
    Itzik

  • Jeff
    6 years ago
    Jul 25, 2006

    If the values are fractional such as 0.001, Log returns 0. Is there a way to calculate products of small fractional numbers??? I tried increasing the precision for the scale, but it will then mess up large numbers.

    Any input would be appreciated.

    -Jeff

  • Joe Celko
    11 years ago
    Aug 25, 2001

    Here is another version of the PRD() aggregate function:

    Here is a version of the aggregate product function in SQL. You will need to have the logorithm and exponential functions. They are not standards, but they are very common.

    The idea is that there are three special cases - all positive numbers, one or more zeroes, and some negative numbers in the set.
    You can find out what your situation is with a quick test on the sign() of the minimum value in the set.

    Within the case where you have negative numbers, there are two sub-cases: (1) an even number of negatives or (2) an odd number of negatives. You then need to apply some High School algebra to determine the sign of the final result.

    SELECT CASE MIN (SIGN(nbr))
    WHEN 1 THEN EXP(SUM(LN(nbr))) -- all positive numbers
    WHEN 0 THEN 0.00 -- some zeroes
    WHEN -1 -- some negative numbers
    THEN (EXP(SUM(LN(ABS(nbr))))
    * (CASE WHEN
    MOD (SUM(ABS(SIGN(nbr)-1)/ 2)), 2) = 1
    THEN -1.00 ELSE 1.00 END)
    ELSE NULL END AS big_pi
    FROM NumberTable;

    --CELKO--

You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...