• subscribe
April 26, 2001 12:00 AM

Adding Performance

SQL Server Pro
InstantDoc ID #20131
Downloads
20131.zip

Using Logarithms
First, let's review some logarithm concepts. Given the equation ay = x, where x is a positive real number and a is a number other than 1, the logarithm of x to base a is y. You can represent this relationship as loga x = y. Let's call this equation EQ1. One rule of exponents says that loga (xy) = loga x + loga y. This equation, which I'll call EQ2, seems promising because it turns multiplication into addition and vice versa. T-SQL provides the LOG10() function, which calculates to base 10 the logarithm of an argument you pass to it. You can sum all the LOG10() result values for each case ID as the following query shows:

SELECT
	caseid,
	SUM(LOG10(result)) AS sum_log
FROM Testresults
GROUP BY caseid

Now, let's represent the test result values in a certain case ID as v1, v2, ..., vn. If you place those values into EQ2, you get log10(v1 *v2 * ... *vn) = log10 v1 + log10 v2 +...+ log10 vn. For each case ID, you already know the result of the equation's right side. According to EQ1, if you raise 10 to the power of the right side's result, you get the product of all values for that particular case ID. In T-SQL terms, you could revise the query as follows:

SELECT
	caseid,
	POWER(10., SUM(LOG10(result))) AS 'final result'
FROM Testresults
GROUP BY caseid

This query is sufficient if all the test result values are positive, but if at least one test result value is zero, the query returns a domain error because the LOG10() function doesn't accept zero or negative values. Let's insert the following zero result value row into the Testresults table:

INSERT INTO Testresults VALUES(3, 0)

Listing 3 shows the previous query as I revised it to handle zeros. It uses a CASE expression to calculate the number of zero-value occurrences in the result column for each case ID. If at least one zero test-result value exists in a certain case ID, the script returns a zero value as the final result; otherwise, it calculates the product of all test-result values.

In the biological testing facility scenario, test results can't be negative, but you can generalize the previous query to let it support negative test result values as well. First, insert the following negative-result value row into the Testresults table:

INSERT INTO Testresults VALUES(2, -1)

The script in Listing 4 uses a CASE expression to calculate the number of negative test-result values. If the number of negative test-result values is odd, the script multiplies the product of the absolute test-result values by -1; otherwise, it multiplies the product by 1.

Now that I've demonstrated how you can use mathematics to shorten your T-SQL code and improve its performance, you can apply this type of approach to many different scenarios. Next month, I'll discuss another example that can benefit from such an approach.



ARTICLE TOOLS

Comments
  • Diana
    6 years ago
    Sep 05, 2006

    Hi Diana,

    Sorry for the late reply; I was offline (hiking)...

    Here's my reply to Jeff:

    Since the log functions in T-SQL are based on imprecise floating point
    datatypes you should expect a certain level of inaccuracy, and should
    not rely on such calculations for fractional values. I'd recommend
    taking a look at the series of articles about custom aggregations and
    considering the alternative methods (cursors, pivoting, user defined
    aggregates).

    Regards,
    Itzik

  • Jeff
    6 years ago
    Jul 25, 2006

    If the values are fractional such as 0.001, Log returns 0. Is there a way to calculate products of small fractional numbers??? I tried increasing the precision for the scale, but it will then mess up large numbers.

    Any input would be appreciated.

    -Jeff

  • Joe Celko
    11 years ago
    Aug 25, 2001

    Here is another version of the PRD() aggregate function:

    Here is a version of the aggregate product function in SQL. You will need to have the logorithm and exponential functions. They are not standards, but they are very common.

    The idea is that there are three special cases - all positive numbers, one or more zeroes, and some negative numbers in the set.
    You can find out what your situation is with a quick test on the sign() of the minimum value in the set.

    Within the case where you have negative numbers, there are two sub-cases: (1) an even number of negatives or (2) an odd number of negatives. You then need to apply some High School algebra to determine the sign of the final result.

    SELECT CASE MIN (SIGN(nbr))
    WHEN 1 THEN EXP(SUM(LN(nbr))) -- all positive numbers
    WHEN 0 THEN 0.00 -- some zeroes
    WHEN -1 -- some negative numbers
    THEN (EXP(SUM(LN(ABS(nbr))))
    * (CASE WHEN
    MOD (SUM(ABS(SIGN(nbr)-1)/ 2)), 2) = 1
    THEN -1.00 ELSE 1.00 END)
    ELSE NULL END AS big_pi
    FROM NumberTable;

    --CELKO--

You must log on before posting a comment.

Are you a new visitor? Register Here