• subscribe
January 18, 2005 12:00 AM

Nonexistent Ranges

With logic, even querying nonexistent data is no problem
SQL Server Pro
InstantDoc ID #44817
Downloads
44817.zip

Solution 3: Using an Auxiliary Table of Dates
The third solution applies to SQL Server 2005 only. This solution involves using an auxiliary table that contains all possible dates you need to cover. In this example, you need only 31 rows to contain all possible dates in January 2005, but in practice, the table would likely contain several years' worth of data. Still, the auxiliary table would be small (hundreds to perhaps thousands of rows) even if you need to cover several years.

Run the code that Listing 6 shows to create the Dates auxiliary table and populate it with 10 years' worth of data. The first step of this solution is to return all nonexistent sales dates during January 2005 for each store. You need to cross-join the Dates auxiliary table with the Stores table and use a NOT EXISTS filter that checks that there's no row in Sales that contains the date from Dates and the store ID from Stores. Run the code that Listing 7 shows to return all nonexistent dates for each store during January 2005.

Now that you have all individual dates when each store wasn't open, you're actually dealing with the same problem I covered last month—grouping ranges of consecutive dates—so you can apply any of the solutions I discussed in "Grouping Ranges." The code in Listing 8 applies the row-numbering solution, in which you calculate a grouping factor by subtracting from the sales date a number of days equal to the row's number (according to date order for each store separately).

At callout A in Listing 8, the Common Table Expression (CTE) I named GrpCTE combines the previous step's query, which returns individual nonexistent dates, and the calculation of the grouping factor (date minus row number). The outer query, at callout B, simply groups all consecutive dates for each store by specifying store ID and the grouping factor in the GROUP BY clause. In the SELECT list, you calculate each gap's starting and ending points by requesting the MIN and MAX dates for each group.

Which Solution Is Best?
When I need to pick one of several solutions, I usually examine two aspects: performance and simplicity. I tested all three of these sample solutions against much larger tables. Solutions 1 and 3 have similar run times and are approximately twice as fast as Solution 2. However, Solution 2 is by far the simplest of the three. If the performance of all solutions were acceptable (i.e., if they all finished fast enough to meet the users' needs), I'd probably choose Solution 2. Otherwise, Solution 1 seems best because it applies to all versions of SQL Server.

As you probably noticed, the toughest part of solving this month's T-SQL puzzle was the logical manipulation required to devise a solution. After you've devised your strategy, translating the logical solution to T-SQL is easy. To help you keep in practice and improve your logical deduction capabilities, I provided an extra logical challenge last month in the form of a pure logic puzzle. You can find the solution in the "Logical Puzzle" sidebar, along with a new logic problem to puzzle over.



ARTICLE TOOLS

Comments
  • David
    7 years ago
    Jan 21, 2005

    Generating sequential dates is something that should be in my toolbox. I made a function out of it as follows:
    CREATE Function dbo.udtBuildSeqDateTable(@StartDate as Datetime, @EndDate as DateTime)
    Returns @DateSeq Table (SeqDate DateTime)
    AS Begin
    Declare @Rcnt int
    --Insert the start date into temp table
    Insert Into @DateSeq(SeqDate) Values(@StartDate)
    Set @Rcnt=@@RowCount
    --error checking
    If @EndDate<@StartDate Return
    --double the number of rows with each insert
    WHILE @Rcnt < @EndDate-@StartDate BEGIN
    INSERT INTO @DateSeq(SeqDate) Select SeqDate+@Rcnt From @DateSeq order by SeqDate
    Select @Rcnt=@RCnt+@@RowCount
    END
    --Get rid of any excess rows
    Delete From @DateSeq Where SeqDate>@EndDate
    Return
    End
    --------------
    As an example of how to use the function, I needed to know which dates were missing from my Batch table:
    SELECT Seq.SeqDate AS MissingDates
    FROM PBME.dbo.udtBuildSeqDateTable('20030501', '20031001') AS Seq LEFT OUTER JOIN
    dbo.Batch ON Seq.SeqDate = dbo.Batch.BatchDate
    WHERE (dbo.Batch.BatchDate IS NULL)
    -----
    You could then use the algorithm from the previous article to show ranges.

    Dave C.

  • David
    7 years ago
    Jan 21, 2005

    thought provoking aricle

You must log on before posting a comment.

Are you a new visitor? Register Here