Feel free to experiment with the arguments in this solution's code, testing how changing the number of steps affects the result, and so on. Recall that the query that returns the table of steps is only part of the solution. You now have the missing part denoted as <query that calculates steps> in the following pseudo code:
SELECT step, count(*) AS cnt
FROM Samples JOIN (<query that calculates
steps>) AS Steps
ON value > f AND value <= t
WHERE measid = @measid
AND dt >= @fromdt AND dt < @todt
GROUP BY step
Replace the missing part with the query that Listing 4 shows. Listing 5 shows the complete solution, which produces the desired histogram that Figure 2 shows. Of all the measures for the requested counter and period, divided into five steps, one measure (26) fell within the first step's range, two measures (33, 35) fell in the second, and two measures (47, 50) fell in the fifth.
If, like me, you prefer the modular development approach to simplify your code and its maintenance, you could write a user-defined function (UDF) that accepts as arguments the number of steps, measurement ID, and date range and returns a steps table. Run the code that Listing 6 shows to create the fn_steps function.
To test the function, run the following code and verify that you get the results that Figure 1 shows:
SELECT * FROM fn_steps(1, 5, '20030101', '20030102')
Now you can use the fn_steps table instead of the derived table Steps, as the code in Listing 7 shows. You get the histogram that Figure 2 shows.
Solution 2: Calculating the Step Number On the Fly
This puzzle has another solution that doesn't involve generating a table of steps; it calculates the step number on the fly. With this solution, you don't need an auxiliary numbers table. Notice the FROM clause in callout A of Listing 8, page 18. You cross-join two derived tablesone called S, containing the rows from Samples that match the provided arguments, and the other called R, containing the minimum measurement value and the whole range size. The following expression, which you write in the SELECT list, calculates the step number:
floor((value-mn) / (1.0*range/@numsteps)) + 1
AS step
This expression uses logic similar to what you used to calculate the boundaries in the previous solution. The expression (value-mn) gives you the position of the value within the range, and (1.0*range/@numsteps) gives you the step size. By dividing the first operand by the second, flooring the result, and adding 1, you get the step number. Now that you have attached to each row a step number matching the criteria from Samples, you can use this query in a derived table called RS so that you can group the result by the calculated step number. Listing 8 shows the complete solution, in which the outer query groups the result by step number and counts the rows for each step.
About Performance
Of the two solutionsone using an auxiliary table and a modular approach and the other calculating step numbers on the flythe second solution is shorter and more elegant. To find out which performs better, you can use the script that Listing 9 shows to populate the Samples table with 1 million rows for testing. The rows contain measurement IDs in the range 1 to 10, with one sample per counter recorded every minute (1440 a day) and dates in the range January 1, 2003, to March 11, 2003. I compared the two solutions from Listing 5 and Listing 8, using periods ranging from 1 day to 1 month. In my tests, both solutions performed well, with the second performing a bit better both in terms of duration and in terms of I/O, so I'd probably use that one.