SideBar    The Logical Puzzle
DOWNLOAD THE CODE:
Download the Code 93521.zip

Randomization is a common need in programming languages such as T-SQL. Prime examples of T-SQL randomization needs include generating a random integer value in a range 1 through n, returning randomly sorted data, returning random n rows, and returning random n rows per group. Let's walk through each of those randomization needs and determine how to provide solutions in T-SQL. In particular, I'll focus on how to improve the solutions by using some of the new T-SQL features in SQL Server 2005, including the use of TOP with input expressions, the APPLY operator, and the TABLESAMPLE option.

Generating a Random Integer Value
Given an input integer value @n, you're tasked with generating a random value in the range 1 through @n. This classic request is probably one of the most common randomization needs. However, the solution in T-SQL isn't particularly straightforward.

One common solution among programmers is to use the RAND function, as follows:

DECLARE @n AS INT; 
SET @n = 10; 
SELECT CAST(RAND()*@n AS INT) + 1; 

RAND generates a float value in the range 0 through 1. By multiplying the result of RAND by @n, you get a float value in the range 0 through @n. Converting a float value to an integer doesn't round the result but rather floors it (i.e., gets rid of the fraction). Adding 1 ensures that the minimum result value will be 1 and not 0.

This solution has two problems. One problem is that the RAND function isn't truly random.The function accepts an integer seed as input, then calculates a float value based on the input in a deterministic manner. For example, when you run

SELECT RAND(42); 

several times, you'll always get the 0.71435594503451 float value back.

Even when you invoke the RAND function without specifying a seed as input, SQL Server calculates a new seed value based on the previous RAND invocation. For example, if you run

SELECT RAND(42); 
SELECT RAND(); 
SELECT RAND(); 

several times, you'll always get the same results back: 0.71435594503451, 0.04100 99860282736, and 0.649384188470534.

Sometimes, you might want to rely on such behavior. For example, if you need to generate repeatable values for a demonstration or for performance tests, all you need to do is invoke the RAND function with a known seed at the beginning of your script and later invoke it without a seed. However, if you expect truly unpredictable random values, RAND without an input won't give you what you need.

If you need to generate an unpredictable random float value in the range 0 through 1, you need to provide a random integer seed to the RAND function. At first, this situation appears to be a catch-22, but it isn't.You can generate a random integer seed with the expression CHECKSUM(NEWID()).NEWID() generates a new globally unique identifier (GUID) with each invocation, and the CHECKSUM function generates an integer checksum value based on the input.Thus, the following expression generates a random float value in the range 0 through 1:

SELECT RAND(CHECKSUM(NEWID())); 

Another problem with the original solution I provided for generating a random integer value in the range 1 through @n is that RAND generates a float value in the range 0 through 1— inclusive.That is, the float values 0E0 and 1E0 are included in the range of values that RAND can return. Therefore, the original solution will return @n + 1 if RAND returns exactly 1E0. Of course, the probability to get exactly 1E0 back from RAND is very low—so low that you might be willing to take the risk. But if you want to ensure that the result will be in the range 1 through @n, you can subtract a very small fraction from @n.

Here's the revised solution for returning a random integer value in the range 1 through @n, addressing both problems:

DECLARE @n AS INT; 
SET @n = 10; 
SELECT CAST(RAND(CHECKSUM(NEWID())) *
  (@n - 0.00000000001) AS INT) + 1; 

This expression is lengthy, and you might consider such a solution awkward. Fortunately, I've learned a simpler solution from SQL Server MVP Steve Kass:

DECLARE @n AS INT; 
SET @n = 10; 
SELECT 1 + ABS(CHECKSUM(NEWID())) % @n; 

The CHECKSUM(NEWID()) expression generates a random integer value in the range of values supported by a four-byte integer datatype. The ABS(CHECKSUM (NEWID())) % @n expression returns a random integer in the range 0 through (@n - 1). Add 1, and you get a random integer in the range 1 through @n.

   Prev. page   [1] 2 3     next page
 
 

ADS BY GOOGLE