DOWNLOAD THE CODE:
Download the Code 94376.zip

Solution 2: Sophisticated Recursive Query
My next solution is more sophisticated than the previous one and is also based on a recursive query. Listing 3 shows the solution code, including the function’s definition. The code defines a recursive CTE called SQR that generates as many rows as the ceiling of the square root of @max. Then a second CTE called Product produces a Cartesian product (cross join) of two instances of SQR. This product will have at least @max rows—or a bit more, since I used the expression CEILING(SQRT(@ max)). Without calculating the ceiling, you might get fewer rows in the product. So it’s better to return more rows than you might need, then use the TOP option in the outer query to obtain exactly the number of rows you need.

The outer query uses the ROW_ NUMBER function to actually generate the sequence of numbers. Notice that the ROW_NUMBER function uses a column called const in the ORDER BY clause. This column is nothing more than a constant that I produced in the SQR CTE. The optimizer is smart enough to figure out that it’s a constant, thus avoiding an explicit sort operation altogether. This solution is very efficient.

Run the following code to test the function (again, with results discarded):

SELECT n FROM
  dbo.fn_nums(10000000) AS Nums
OPTION (MAXRECURSION 0);

The query ran for eight seconds in my test system. That’s very fast—however, you need to specify the MAXRECURSION hint here as well.

Solution 3: Nonrecursive Query
My last solution doesn’t rely on recursive CTEs; therefore, its benefit is that you don’t need to specify any hints in the query against the function. Listing 4 shows the solution code.

The code defines a CTE called C0, which returns two rows with a single column called const that holds a constant value. Here, the CTE’s purpose is simply to generate two rows; it doesn’t really matter what those rows contain. Then a series of CTEs (C1, C2, C3, C4, C5, C6) double the rows of the CTE defined before them by performing a cross join between two instances of the previous CTE. The number of rows you can potentially get via a CTE Cn is 2^(2^n). For example, having six CTEs besides C0 can potentially yield 18, 446,744,073,709,552,000 rows. Having five CTEs besides C0 can potentially yield about four billion rows. In other words, six CTEs would be sufficient in practical terms for any number of rows that you might need.

The outer query then uses the TOP option to return the number of requested rows (@max), and a ROW_NUMBER function actually generates the sequence of integers (again, by using a column holding a constant in the ORDER BY clause). The nice thing about the optimization of this code is that the optimizer is smart enough to figure that it needs to generate only @max rows. It doesn’t bother to generate any more rows, although logically it might seem as if the code is doing so. Therefore, this solution is very fast.

To test the function, run the following code (again, with results discarded):

SELECT n FROM
 dbo.fn_nums(10000000) AS Nums;

This code runs for nine seconds for 10 million rows. It’s slightly slower than the previous solution, but its important advantage is that you don’t need to specify any hints in the query against the function.

Fast and encapsulated
I’ve examined three different solutions for generating a virtual auxiliary table of numbers via a table-valued function. The first solution uses a simple recursive query that iterates once per row. This solution is the slowest and requires you to specify a hint in the query against the function. The second solution produces a Cartesian product between two instances of a recursive CTE containing as many rows as the square root of @max. This solution is very efficient, but it also requires you to specify a hint in the query against the function. The third solution uses a series of CTEs, each doubling the number of rows from the previous CTE. This solution doesn’t rely on recursive CTEs—rather, nonrecursive ones. It’s both very fast and has the advantage that you don’t need to specify a hint in the query against the function, so we can safely announce it as the winner!

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

I believe there is an error in code listing #3, "Nums" should be "SQR".

[Itzik: you're right; I'll ask for a fix. Thanks!]

MarkGodfrey2

Article Rating 3 out of 5

Hello - here is Itzik's correction. We'll get the file fixed.

The error was: first FROM clause referred to Nums and should refer to SQR. Regards, Itzik

DianaMay

Article Rating 5 out of 5

 
 

ADS BY GOOGLE