• subscribe
February 19, 2003 12:00 AM

Calculating Concurrent Sessions

When is using cursors the best solution?
SQL Server Pro
InstantDoc ID #37636
Downloads
37636.zip

Usually, set-based solutions perform better than iterative solutions that use cursors and loops, but there are exceptions. One problem for which I haven't found a set-based solution that performs better than the iterative one is the subject of this month's exercise. Performance aside, the problem is quite a brainteaser, and I urge you to try to find set-based solutions for it to test your T-SQL skills. I'd like to thank Assaf Fraenkel, a senior consultant at Microsoft Consulting Services, who got me acquainted with the puzzle and suggested one approach to solving it. Thanks also to Dejan Sarka and Dieter Nöth, whose ideas are embedded in the solutions I present.

The Max Concurrent Sessions Problem
The problem involves calculating the maximum number of concurrent sessions for each application that an organization uses. For this problem, a table called Sessions stores information about application use. Each row contains one session's worth of data, including the application, user, host, start time, and end time. Run the script that Listing 1 shows to create the Sessions table and populate it with sample data.

To distinguish between the rows in the Sessions table, I used the key_col column, a surrogate key that has the IDENTITY property. Theoretically, one user can open more than one session at the same time against the same application from the same host. Your task is to calculate the maximum number of concurrent sessions per application, regardless of user, host, or period. In other words, calculate the maximum number of sessions that were active at any time for each application.

Table 1 shows the desired results. The IT department uses this information to calculate payments to the applications' suppliers, which charge by the maximum number of concurrent sessions. You can assume that the table doesn't contain more than 1 month's worth of sessions. You now have all the information required to solve the problem.

Solution 1: Custom Time Slots
One approach to solving the problem is to generate all possible time slots between the minimum and maximum existing start times in the table, then count the number of sessions in which each time slot occurs for each application. A time slot is the start time of a time slice. Because the table stores session start and end times as a smalldatetime data type, they're accurate to the minute. For example, after you populate the Sessions table, the minimum and maximum session start times should be 2003-02-12 08:30 and 2003-02-12 15:30, respectively. The code you write should split the period between the minimum and maximum start times into 1-minute slots: 2003-02-12 08:30, 2003-02-12 08:31, 2003-02-12 08:32, ..., 2003-02-12 15:29, 2003-02-12 15:30. Next, count the number of sessions that took place during each time slot for each application. To verify that a certain session took place during a given time slot, use the following expression:

timeslot >= session_starttime AND
timeslot < session_endtime

Finally, of all the counts you got for each application and time slot, return the maximum count for each application.

To generate all possible time slots in the table, first create an auxiliary table with as many numbers as the maximum number of minute slots you anticipate. Because the table contains no more than a month's worth of data, you have no more than 44,640 (60*24*31) minute slices. If you want to test the solution against, for example, 3 years' worth of data, make sure you populate the Nums table with 3 years' worth of numbers (60*24*365*3= 1,576,800). Running the script that Listing 2 shows creates the Nums auxiliary table with a month's worth of numbers.

The code that Listing 3 shows creates the VTimeSlots view, which calculates the actual minute time slots in the period between the minimum and maximum start times in the Sessions table. The query that Listing 4 shows calculates the desired result. The innermost subquery matches each time slot in the VTimeSlots view with all sessions that took place at that time. The query groups the result—derived table TSAPP—by the timeslot and app columns to count the number of concurrent sessions per application. The query then groups the result—derived table C—by app to calculate the maximum number of concurrent sessions per application.

This solution implements a fairly simple concept; however, it isn't scalable. The problem is in the potentially huge number of time slots generated; a 31-day month has 44,640 minute time slots, and the query looks for matching sessions for each of those. Think of the implications of handling longer periods or using the datetime data type, which has an accuracy of 3 1/3 milliseconds.



ARTICLE TOOLS

Comments
  • Kate Luxemburg
    9 years ago
    Mar 17, 2003

    Addendum to my previous comment: The max number of session-pairs to test for concurrency, where n is the number of sessions, is (n * n-1)/2 With use of a variable that decrements the max possible concurrency as the number of session-pairs remaining to be examined for concurrency decreases, and a variable that compares the max possible concurrency to the max concurrency found so far, it is possible to declare the max found and break out of the loop in as few as n compares (that is where all sessions are concurrent).

  • Kate Luxemburg
    9 years ago
    Mar 17, 2003

    Correction to addendum: The min number of session-pair compares is n-1, not n. (I am working with a laptop keyboard and "skating rink" mouse-equivalent as I enter text in this comment window-- that makes it difficult to edit the comment.)

  • Kate Luxemburg
    9 years ago
    Mar 16, 2003

    The solution is clearly and methodically presented BUT consider that it is overkill to solve the problem by examining all possible minutes in the interval for concurrency, given that limit of concurrency is the total number of sessions in the interval for the app-- that is, the limit is reached if there exists any one second in which all the sessions are "on." Order the sessions for each app by start time and number them sequentially. In a table with more than one app, the primary key would be appName + seqNo. For each app, session 1 start < session 2 start < session 3 start ad fin. Examine session pairs for concurrency rather than timeslices. Session 2 is concurrent with session 1 if session 2's start time < session 1's end time. Session 3 is concurrent with 1&2 if its start time < the min(end time) of 1&2. Say there are six sessions for a particular app during the interval. The max number of tests necessary to determine max concurrency is 15-- that is, compare 1&2, 1&3, 1&4, 1&5, 1&6, 2&3, 2&4, 2&5, 2&6, 3&4, 3&5, 3&6, 4&5, 4&6, and 5&6. This query generates the set to examine:

    select s.seqNo,c.seqNo,
    s.startTime, c.startTime,
    isnull(s.endTime,getdate()), isnull(c.endTime,getdate())
    from appSessionsInterval s join appSessionsInterval c
    on s.seqno < c.seqno
    order by s.seqNo, c.seqNo

    Then I used a cursor on this set to count concurrency. Send me email if you want the Transact-SQL.

You must log on before posting a comment.

Are you a new visitor? Register Here