DOWNLOAD THE CODE:
Download the Code 96767.zip

 Executive Summary:

Microsoft SQL Server 2005 provides an automatic statistics-updating feature to help ensure optimum performance of SQL Server queries. You can further improve the quality of SQL Server index statistics—to help tune query performance—by using UPDATE STATISTICS and two new trace flags in Microsoft SQL Server 2005 SP1.

If you've ever researched performance tuning, you're well aware that correct indexing is probably the most important area on which to spend your tuning efforts. Good performance requires good indexes - but for SQL Server's optimizer to recognize the usefulness of your indexes, you need high-quality statistics. Starting in SQL Server 7.0, SQL Server updated index statistics automatically, which improved the chances of your statistics being up to date. Prior to version 7.0, statistics could only be updated manually; thus, the first task on most performance-troubleshooting checklists was to update the statistics on all tables needed by the troublesome queries.

Automatic updating of your index statistics is a great feature, but it isn't perfect. We'll look at how SQL Server's optimizer determines when statistics should be updated, then learn about some new trace flags in SQL Server 2005 SP1 that give you more control over the automatic - statistics-updating functionality.

Keeping Statistics Updating On
By default, all SQL Server databases are created with the AUTO_UPDATE_STATISTICS database option enabled. For most tables, the work required to update statistics won't affect your system's performance enough for you to consider turning off this option. Of course, there are exceptions, but generally you should assume that this option should stay on unless you're certain that your applications run better with it off. In most cases, the performance degradation you'll experience if statistics aren't regularly updated will be much worse than performance problems you might encounter by leaving this option on.

If you believe automatic statistics updating impedes your system performance, you have a couple of alternatives for capturing statistics. One is to turn off automatic updates for a single table or index, then use ALTER INDEX and set the STATISTICS_NORECOMPUTE option to ON. Another choice is to use a SQL Server 2005 feature called AUTO_UPDATE_STATISTICS_ASYNC, which can be set with ALTER DATABASE. If this option is on, SQL Server will automatically update statistics in the background, and the query that detected that statistics needed updating won't wait for the statistics updating to occur. Individual queries won't be affected by the updating of statistics, and the system as a whole will benefit by having the statistics regularly updated. You can read more about ALTER INDEX and ALTER DATABASE in SQL Server Books Online (BOL).

Although having automatic statistics updating is rarely detrimental, one problem with this feature is that it's activated only after a substantial number of changes have occurred in the index key values. Although very small tables might have their index statistics updated more aggressively, any table with more than 500 rows needs to have at least 20 percent of its rows modified, or the number of rows needs to increase by at least 20 percent in order for SQL Server to detect that statistics are out of date. This means that for a 100,000-row table, you have to update or insert at least 20,000 rows.

If the updates or inserts are evenly distributed throughout the table, statistics updating might not be necessary. For example, if you modify 10 percent of the rows in the table, and the updates are evenly spread throughout the table, the original statistics could still continue to provide satisfactory estimates of the number of rows that would satisfy a query.

Updating Statistics After a Range of Values Changes
However, if the data changes so that there are many new values greater than the previous maximum data value, it might be impossible to obtain good plans until the statistics are updated. Let's look at an example.

Run the code in Listing 1 to build a table called dbo.details in the AdventureWorks database, and copy the rows from Sales.SalesOrderDetail into the table. Listing 1 then builds a clustered and nonclustered index on the table.

The dbo.details table has 121,317 rows and 1,258 pages. You should be aware that the nonclustered index on SalesOrderID will be used when only a few rows from the dbo.details table are accessed. For this table, the cutoff is about 350 rows. A clustered index scan, which is the same as a table scan, is used for more than about 350 rows. Look at the execution plans for the following two queries to verify this.

SELECT * FROM details
WHERE SalesOrderID > 75000
SELECT * FROM details
WHERE SalesOrderID < 56000

The first query returns 301 rows and uses a nonclustered index seek and a lookup into the clustered index. The second query returns 57,877 rows and uses a clustered index scan.

The maximum value in the SalesOrderID column is 75,123. Let's update several thousand rows to be greater than that maximum. The following query updates 14,148 rows to be greater than the previous maximum.

UPDATE dbo.details
SET SalesOrderID = 76000
WHERE SalesOrderID < 47000
GO

When you look at the plan and number of logical reads for the following SELECT statement, you'll see that SQL Server continues to perform a nonclustered index seek because the statistics don't reflect the large number of rows greater than the previous maximum.

SET STATISTICS IO ON
SELECT * FROM dbo.details
WHERE SalesOrderID > 75000
GO

Continued on page 2.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.