• subscribe
October 15, 2009 12:00 AM

Calculating Concurrent Sessions, Part 1

Discover a set-based and a cursor-based solution
SQL Server Pro
InstantDoc ID #102734

When you’re faced with a T-SQL querying problem, finding a set-based solution (as opposed to a cursor-based solution) is recommended for a number of reasons. First, set-based solutions are in accord with the relational model. Also, they typically require less code and less maintenance than cursor-based solutions. In addition, set-based solutions tend to perform better.

For many years, I couldn’t find a good-performing set-based solution for a particular querying problem that involves maximum concurrent sessions. I had an adequate set-based solution in terms of its logic, but it didn’t perform well. The only good-performing solution I had was cursor-based. I recently used this problem as an exercise in a class I was teaching, and one of my students—Darryl Page from the UK—had an idea that ultimately led to a good-performing set-based solution. In this article, I cover my original set-based solution, the cursor-based solution, and the performance and scaling ramifications of both solutions. Next month I’ll present the improved set-based solution based on Darryl’s insight, as well as a set-based solution that relies on standard language elements that aren’t yet implemented in SQL Server (as of SQL Server 2008) but hopefully will be supported in a future version.

The Problem
The task at hand involves a table called Sessions that keeps track of user sessions against an application. Suppose that every month you need to produce a report for the previous month’s activity, including the maximum number of concurrent sessions for each application. That is, for each application, you must calculate the maximum number of sessions that were active at the same time during the period of interest. You might need such information, for example, for billing purposes for per-user license types. Use the code in Listing 1 to create the Sessions table in the tempdb database (for test purposes) and populate it with sample data.

For simplicity, assume that the Sessions table contains only sessions during the period of interest, so you don’t need to filter a specific date range. Note that you should explicitly define the term concurrent for your customers. For example, if a session ends at the exact point in time that another session starts, are the sessions considered concurrent? For the purposes of our task, let’s assume that two such sessions are not considered concurrent. In this case, Table 1 shows the desired output for your solution, given the sample data in Listing 1.

Table 1: Desired Output from Solution
app mx
app1 4
app2 3

The indexing strategy that would help speed up both a set-based and a cursor-based solution is quite straightforward. Create one index on the key-list (app, starttime, endtime) and another on (app, endtime, starttime). Note that you must populate the Sessions table with a larger volume of sample data if you want to run performance tests. Use the code in Listing 2 to create a helper table function called GetNums that returns a sequence of integers of a requested size.

Use the code in Listing 3 to populate the Sessions table with a requested number of rows by querying the GetNums helper function. The code in Listing 3 populates the Sessions table with 10,000 rows, but you can adjust the number according to your needs by assigning your desired value to the variable @numrows. The sessions are associated with 10 different applications using randomization (1 + ABS(CHECKSUM(NEWID())) % 10), so about one tenth of the rows are associated with each application. I didn’t make the number of partitions (applications) variable, because adding partitions has a linear effect on all solutions (linear complexity), as I will show. A more interesting topic for discussion is how changing the partition size affects the different solutions’ performance. Changing the value of @numrows changes the partition size, because the number of partitions remains constant (10).



ARTICLE TOOLS

Comments
  • R Barry
    3 years ago
    Dec 01, 2009

    You know, after looking at my solution again, I am not sure that I should have included "keycol" in it, it performs much better without it (leveraging the indexes, no doubt).

  • R Barry
    3 years ago
    Nov 28, 2009

    Heh. That dog-gone Peter Larsson almost always manages to one-up me! :-)

  • Tim
    3 years ago
    Nov 12, 2009

    My solution which is not as good as the one already posted reduced the r/2 factor based on the maximum duration found for each application. Not exactly linear, but close:

    WITH AppTimestamps AS
    (SELECT app, starttime AS ts FROM dbo.Sessions),
    MaxDurs AS
    (SELECT app, max(endtime-starttime) AS maxdur FROM dbo.Sessions group by app),
    Both AS
    (Select A.app, ts, maxdur From AppTimeStamps A
    Join MaxDurs M on A.app = M.app),
    Counts AS
    (SELECT app,
    (SELECT count(*)
    FROM dbo.Sessions AS S
    WHERE T.app = S.app
    AND S.starttime > (T.ts-T.maxdur)
    AND T.ts >= S.starttime
    AND T.ts < S.endtime)
    AS cnt
    FROM Both AS T)

    SELECT app, MAX(cnt) AS mx
    FROM Counts
    GROUP BY app

  • R Barry
    3 years ago
    Oct 26, 2009

    Hmm, not sure how you post code here, all of my codes line-breaks got stripped-out.

  • R Barry
    3 years ago
    Oct 26, 2009

    This seems to work for me:

    ;WITH cteSplitStartStop AS
    (
    SELECT app, endtime as daytim, 0 as IsStart, keycol
    FROM IBGSessions
    UNION ALL
    SELECT app, starttime, 1 as IsStart, keycol
    FROM IBGSessions
    )
    , cteRunningCounts AS
    (
    SELECT app
    , IsStart
    , ROW_NUMBER() OVER(PARTITION BY app ORDER BY daytim, IsStart, keycol) as AllEvents
    , ROW_NUMBER() OVER(PARTITION BY app, IsStart ORDER BY daytim, keycol) as TypEvents
    FROM cteSplitStartStop
    )
    SELECT app
    , MAX( 2*TypEvents - AllEvents ) AS MaxSessions
    FROM cteRunningCounts
    WHERE IsStart = 1 -- only check the starts
    GROUP BY app;

    [Reply:

    Excellent! I got similar solutions from Ben Flanaghan and Peter Larsson on behalf of Arnold Fribble. This type of solution warrants its own article, and I’m planning to cover it in a future article as the third part in the series.

    Cheers,
    Itzik]

You must log on before posting a comment.

Are you a new visitor? Register Here