A simple technique for populating tables and creating queries
Many T-SQL developers run into situations in which a simple one-column rowset containing sequential numbers (e.g., from 1 to 1000) or a set of sequential dates in a given range would be useful in performing a tricky SELECT operation or populating a table with test data. The most common solution in such cases is to create a temporary table and a simple loop to generate the required values. However, you can use Cartesian product operations to generate number, date, and time sequences without loops, then you can write user-defined functions (UDFs) that use such sequences to create sophisticated queries.
The Cartesian product of two rowsets is a rowset containing all possible combinations of rows from the first two rowsets. These row combinations are called tuples. Figure 1, page 2, shows how a Cartesian product operation works. In Figure 1, joining rowset A and rowset B produces the Cartesian product that rowset C shows.
In T-SQL, you can use the CROSS JOIN logical operator to obtain the Cartesian product of two or more rowsets. The following code example shows how to obtain the Cartesian product of two rowsets:
SELECT * FROM A CROSS JOIN B
The next example shows how to obtain the Cartesian product of three rowsets:
SELECT * FROM A CROSS JOIN B CROSS JOIN C
Generating Number and Date Sequences in T-SQL
To generate a sequence of numbers, you can use an operation that creates a Cartesian product, as Listing 1 shows. The first block of SELECT statements, combined by UNION ALL operators, results in a one-column rowset that contains the 10 numbers 0, 1, 2, 3, 4, 5, 6, 7, 8, and 9. The second block of SELECT statements results in a rowset containing the numbers 0, 10, 20, 30, 40, 50, 60, 70, 80, and 90. The CROSS JOIN operator after the first block of SELECT statements produces a two-column rowset, which contains the following pairs: (0, 0), (0, 10), ... (0, 90); (1, 0), (1, 10), ... (1, 90); ... (9, 0), (9, 10), ... (9, 90). The outer SELECT statement sums the members of each pair to obtain a sequence of numbers from 0 to 99.
You can use this technique to generate a wider range of sequences. For example, to obtain a sequence of numbers from 0 to 9999, add two SELECT blocksone for hundreds and one for thousandsas Listing 2 shows. Again, the outer SELECT statement sums members of each tuple to obtain the number sequence. Each tuple now contains four members because four source rowsets are used in the operation that created the Cartesian product.
To generate a date sequence, you can use the number sequence technique and the T-SQL function DATEADD(). Suppose you need to generate a sequence of dates from 1/1/2001 to 3/1/2001. First, you generate the sequence of numbers from 0 to 59. Then, use DATEADD() to add each number to the starting date 1/1/2001 as a day count. Listing 3 shows the query that produces this date sequence. The inner SELECT statements are the sequence generators that produce numbers from 0 to 59. The outer SELECT statement uses DATEADD() to generate dates from 1/1/2001 to 3/1/2001 (a total of 59 days). By changing the value of DATEADD()'s first parameter to mm, ww, or hh, you can easily obtain month, week, or hour sequences.
UDFs for Sequence Generation
You've probably noticed that the longer the sequence you want to generate, the longer and more cumbersome the T-SQL code becomes. In fact, when you use T-SQL SELECT statements to generate a sequence, the T-SQL code segment might be significantly larger than the code segment that uses the sequence values.
Fortunately, you can manage code length by using one of the most powerful features in SQL Server 2000table-valued UDFs. As its name suggests, this type of function returns a table; therefore, you can use table-valued UDFs anywhere that you can use a table or view in a T-SQL query. The optimal way to manage code length is to implement a sequence generator as a table-valued UDF. Then, you can simply reference the sequence generator's UDF in a FROM clause instead of writing a long, bulky code segment every time you need to generate a sequence.
Listing 4, page 4, shows two UDFs that implement a sequence generator. The fn_p() function generates one-column rowsets such as (0, 1, 2, 3, 4, 5, 6, 7, 8, 9) or (0, 10, 20, 30, 40, 50, 60, 70, 80, 90). These rowsets serve as "dimensions" for Cartesian product operations. First, the fn_p() function uses the value of the @exp parameter and the T-SQL function POWER() to calculate the power of 10. Then, fn_p() uses a block of SELECT statements linked by UNION ALL operators to generate the resulting rowset. The outer INSERT statement copies generated rows into the return variable @result. So, with a value of 0, the fn_p() function returns the rowset (0, 1, 2, 3, 4, 5, 6, 7, 8, 9), fn_p(1) returns the rowset (0, 10, 20, 30, 40, 50, 60, 70, 80, 90), and so on.