Last month I presented a task to calculate the maximum number of concurrent sessions for each application. I provided code to create and populate a table called Sessions. Web Listing 1 contains the code to create and populate the Sessions table with a small set of rows just to check the correctness of the solutions. Web Listing 2 contains the code to create a helper table function called GetNums, which returns a table result with a sequence of integers of a requested size. Web Listing 3 contains the code to populate the Sessions table with a large set of rows to test the performance of the solutions.
As a reminder, the task at hand involves calculating, for each application, the maximum number of concurrent sessions. That is, for each application, calculate the maximum number of simultaneously active sessions. Recall that in case one session ends at exactly the same point in time that another session starts, you need to implement a rule dictating whether both are considered active at that point. For our purposes, the assumption is that they aren’t. For the small set of sample data given in Web Listing 1, the desired output is shown in Table 1.
Table 1: Desired Output from Solution| app |
mx |
| app1 |
4 |
| app2 |
3 |
Last month I presented two solutions that I’ve been using for years. One is a set-based solution that uses a subquery with a count aggregate. I referred to that solution as the original set-based solution. I explained that the algorithmic complexity of that solution (or rather, the way its execution plan scales) is quadratic. That is, if you increase the number of rows per partition (application) by a factor of f for the same period of time, the run time increases by a factor of f2. So beyond very small partitions, the solution doesn’t scale well.
The second solution that I presented was a cursor-based solution. I explained that the algorithmic complexity of that solution is linear. That is, if the number of rows per partition increases by a factor of f, the run time also increases by a factor of f. The cursor-based solution scales better than the original set-based solution, but it has the obvious disadvantages of cursors related to readability, maintainability, and not being in accord with the relational model.
For years I looked for a set-based solution that performs better than the cursor-based solution for all partition sizesto no availuntil recently. In this article I present a new set-based solution with linear complexity that performs better than the cursor-based solution. I developed the solution based on insights of Darryl Page from the UK, who attended one of my classes in which I gave this problem as an exercise. I also present another set-based solution that’s based on language elements that SQL Server doesn’t yet support (as of SQL Server 2008). Once such support is introduced, the solution is likely to perform better than all others.
New Set-Based Solution
Listing 1 contains the new set-based solution that I recently developed based on Darryl Page’s insights. Figure 1 shows the execution plan for the solution.