Using Logarithms
First, let's review some logarithm concepts. Given the equation ay = x, where x is a positive real number and a is a number other than 1, the logarithm of x to base a is y. You can represent this relationship as loga x = y. Let's call this equation EQ1. One rule of exponents says that loga (xy) = loga x + loga y. This equation, which I'll call EQ2, seems promising because it turns multiplication into addition and vice versa. T-SQL provides the LOG10() function, which calculates to base 10 the logarithm of an argument you pass to it. You can sum all the LOG10() result values for each case ID as the following query shows:
SELECT
caseid,
SUM(LOG10(result)) AS sum_log
FROM Testresults
GROUP BY caseid
Now, let's represent the test result values in a certain case ID as v1, v2, ..., vn. If you place those values into EQ2, you get log10(v1 *v2 * ... *vn) = log10 v1 + log10 v2 +...+ log10 vn. For each case ID, you already know the result of the equation's right side. According to EQ1, if you raise 10 to the power of the right side's result, you get the product of all values for that particular case ID. In T-SQL terms, you could revise the query as follows:
SELECT
caseid,
POWER(10., SUM(LOG10(result))) AS 'final result'
FROM Testresults
GROUP BY caseid
This query is sufficient if all the test result values are positive, but if at least one test result value is zero, the query returns a domain error because the LOG10() function doesn't accept zero or negative values. Let's insert the following zero result value row into the Testresults table:
INSERT INTO Testresults VALUES(3, 0)
Listing 3 shows the previous query as I revised it to handle zeros. It uses a CASE expression to calculate the number of zero-value occurrences in the result column for each case ID. If at least one zero test-result value exists in a certain case ID, the script returns a zero value as the final result; otherwise, it calculates the product of all test-result values.
In the biological testing facility scenario, test results can't be negative, but you can generalize the previous query to let it support negative test result values as well. First, insert the following negative-result value row into the Testresults table:
INSERT INTO Testresults VALUES(2, -1)
The script in Listing 4 uses a CASE expression to calculate the number of negative test-result values. If the number of negative test-result values is odd, the script multiplies the product of the absolute test-result values by -1; otherwise, it multiplies the product by 1.
Now that I've demonstrated how you can use mathematics to shorten your T-SQL code and improve its performance, you can apply this type of approach to many different scenarios. Next month, I'll discuss another example that can benefit from such an approach.