When requesting a MAX or MIN aggregate of a column in a nonpartitioned table, provided that you have an index on the aggregated column, the optimizer uses a plan that scans only one row in the leaf of the index. Consider the following query as an example:
SELECT MAX(col2) FROM dbo.T1;
Suppose that T1 is a nonpartitioned table and that you created a nonpartitioned index on T1(col2). The optimizer will generate a plan that scans the leaf of the index from the tail backwards and stops after one row. Similarly, if you ask for MIN(col2), the optimizer will generate a plan that scans the leaf of the index from the head forward and stops after one row. In both cases the plan is extremely efficient and involves very little work.
Partitioned tables and indexes are different; unless you apply the aggregate to the partitioning column, you’ll get an inefficient plan in both SQL Server 2008 and SQL Server 2005. In this article I explain how MAX and MIN aggregates against partitioned tables are optimized, and I provide a better-performing workaround.
Sample Data and Test Environment
Run the code in Listing 1 to generate the sample data that I use in my example. Note that this script will take a few minutes to run. The first part of the code in Listing 1 creates a database called TestMinMax. The second part creates a helper table function called GetNums, which returns a sequence of integers in the range 1 through @n, where @n is an integer input. The third part creates a partitioned table called T1, with the partitioning column being col1, and populates it with 10,000,000 rows in five partitions. The code creates a clustered index on col1, partitioned the same as the table by col1, as well as a nonclustered index on col2, also partitioned the same as the table by col1.
My test machine has an Intel Core i7 processor (quad core with hyperthreading—total of eight logical processors), 4GB of RAM at 1333MHz, and a single 7200RPM hard drive. The execution plans and performance measures that I present are for queries run against SQL Server 2008 SP1 but are similar for SQL Server 2005 SP3.
Querying the Partitioned Table
If you ask for a MIN or MAX aggregate of the partitioning column (col1 in our case), you get an efficient plan that scans only one row in the first or last nonempty partition. As an example, consider the following query (call it Query 1) and its execution plan, which Figure 1 shows.
SELECT MAX(col1) AS mx
FROM dbo.T1;
If you examine the properties of the Clustered Index Scan operator, you’ll notice that only one partition of the index idx_col1 is accessed (partition 5), and only one row is scanned in that index. I got the following I/O and time statistics for this query on my test machine:
I/O: Table 'T1'. Scan count 1, logical reads 3
Time: CPU time = 0 ms, elapsed time = 80 ms.
This plan is extremely efficient.