• subscribe
March 16, 2010 11:09 PM

MAX and MIN Aggregates Against Partitioned Tables

An efficient workaround provides improved performance
SQL Server Pro
InstantDoc ID #103574

In many cases, the list of partitions isn’t static; instead, partitions are added and removed fairly frequently. So instead of using a constant list of partitions, you can simply query the sys.partitions view dynamically like so:

SELECT MAX(A.pmx) AS mx
FROM sys.partitions AS P
  CROSS APPLY ( SELECT MAX(T1.col2) AS pmx
                FROM dbo.T1
                WHERE $PARTITION.PF1(T1.col1) = P.partition_number ) AS A
WHERE P.object_id = OBJECT_ID('dbo.T1')
  AND P.index_id = INDEXPROPERTY( OBJECT_ID('dbo.T1'), 'idx_col2', 'IndexID' );

This query (call it Query 5) queries the sys.partitions view, filtering only the partitions associated with the index idx_col2 in the table dbo.T1. The rest is the same as in Query 4. Figure 5 shows the execution plan for Query 5. 

As you can see, this plan is very similar to the one for Query 4, only instead of the Constant Scan operator, you see a Clustered Index scan of the clustered index on the system table sys.sysrowsets. I got the following performance measures for this query:

I/O: Table 'T1'. Scan count 5, logical reads 15, Table 'sysrowsets'. Scan count 1, logical reads 2
Time: CPU time = 0 ms, elapsed time = 118 ms.

You can calculate a MIN aggregate in a similar manner:

SELECT MIN(A.pmn) AS mn
FROM sys.partitions AS P
  CROSS APPLY ( SELECT MIN(T1.col2) AS pmn
                FROM dbo.T1
                WHERE $PARTITION.PF1(T1.col1) = P.partition_number ) AS A
WHERE P.object_id = OBJECT_ID('dbo.T1')
  AND P.index_id = INDEXPROPERTY( OBJECT_ID('dbo.T1'), 'idx_col2', 'IndexID' );

 

For Now, the Workaround Works

In this article I describe an optimization shortcoming in SQL Server 2008 SP1 and SQL Server 2005 SP3, related to MAX and MIN aggregate calculations against a partitioned table. With partitioned tables, the optimizer doesn’t use an index on the aggregate column efficiently unless the aggregated column also happens to be the partitioning column. I provide a workaround that results an efficient plan. Hopefully, Microsoft will enhance SQL Server in the future to address such calculations more efficiently without the need for a workaround.



ARTICLE TOOLS

Comments
  • White
    2 years ago
    May 04, 2010

    Another workaround is to not partition the index that will be used for MIN or MAX aggregates, or create a duplicate non-partitioned index. The QO is smart enough to choose the more efficient plan (with TOP) on the extra index.

    The non-partitioned index will have to be dropped and re-created either side of any SWITCH operations, but in mitigation these tend to occur in a maintenance window anyway.

    The advantage of this method is that existing queries do not have to be rewritten. The APPLY idea is technically neat but not exactly convenient :)

  • Wild
    2 years ago
    Apr 02, 2010

    It would have been more efficient to use a text version for Listing 1 instead of a jpg file!!

You must log on before posting a comment.

Are you a new visitor? Register Here