Solution 3: Using an Auxiliary Table of Dates
The third solution applies to SQL Server 2005 only. This solution involves using an auxiliary table that contains all possible dates you need to cover. In this example, you need only 31 rows to contain all possible dates in January 2005, but in practice, the table would likely contain several years' worth of data. Still, the auxiliary table would be small (hundreds to perhaps thousands of rows) even if you need to cover several years.
Run the code that Listing 6 shows to create the Dates auxiliary table and populate it with 10 years' worth of data. The first step of this solution is to return all nonexistent sales dates during January 2005 for each store. You need to cross-join the Dates auxiliary table with the Stores table and use a NOT EXISTS filter that checks that there's no row in Sales that contains the date from Dates and the store ID from Stores. Run the code that Listing 7 shows to return all nonexistent dates for each store during January 2005.
Now that you have all individual dates when each store wasn't open, you're actually dealing with the same problem I covered last monthgrouping ranges of consecutive datesso you can apply any of the solutions I discussed in "Grouping Ranges." The code in Listing 8 applies the row-numbering solution, in which you calculate a grouping factor by subtracting from the sales date a number of days equal to the row's number (according to date order for each store separately).
At callout A in Listing 8, the Common Table Expression (CTE) I named GrpCTE combines the previous step's query, which returns individual nonexistent dates, and the calculation of the grouping factor (date minus row number). The outer query, at callout B, simply groups all consecutive dates for each store by specifying store ID and the grouping factor in the GROUP BY clause. In the SELECT list, you calculate each gap's starting and ending points by requesting the MIN and MAX dates for each group.
Which Solution Is Best?
When I need to pick one of several solutions, I usually examine two aspects: performance and simplicity. I tested all three of these sample solutions against much larger tables. Solutions 1 and 3 have similar run times and are approximately twice as fast as Solution 2. However, Solution 2 is by far the simplest of the three. If the performance of all solutions were acceptable (i.e., if they all finished fast enough to meet the users' needs), I'd probably choose Solution 2. Otherwise, Solution 1 seems best because it applies to all versions of SQL Server.
As you probably noticed, the toughest part of solving this month's T-SQL puzzle was the logical manipulation required to devise a solution. After you've devised your strategy, translating the logical solution to T-SQL is easy. To help you keep in practice and improve your logical deduction capabilities, I provided an extra logical challenge last month in the form of a pure logic puzzle. You can find the solution in the "Logical Puzzle" sidebar, along with a new logic problem to puzzle over.