Executive Summary:
|
Microsoft SQL Server 2005 provides an automatic statistics-updating feature to help ensure optimum performance of SQL Server queries. You can further improve the quality of SQL Server index statistics—to help tune query performance—by using UPDATE STATISTICS and two new trace flags in Microsoft SQL Server 2005 SP1.
|
If you've ever researched performance tuning,
you're well aware that correct indexing is
probably the most important area on which
to spend your tuning efforts. Good performance
requires good indexes - but for SQL Server's optimizer
to recognize the usefulness of your indexes, you need high-quality statistics. Starting in SQL
Server 7.0, SQL Server updated index statistics
automatically, which improved the chances of your
statistics being up to date. Prior to version 7.0,
statistics could only be updated manually; thus, the
first task on most performance-troubleshooting
checklists was to update the statistics on all tables
needed by the troublesome queries.
Automatic updating of your index statistics is a
great feature, but it isn't perfect. We'll look at how SQL
Server's optimizer determines when statistics should
be updated, then learn about some new trace flags in
SQL Server 2005 SP1 that give you more control over
the automatic - statistics-updating functionality.
Keeping Statistics Updating On
By default, all SQL Server databases are created
with the AUTO_UPDATE_STATISTICS database
option enabled. For most tables, the work required
to update statistics won't affect your system's performance
enough for you to consider turning off this
option. Of course, there are exceptions, but generally
you should assume that this option should stay
on unless you're certain that your applications run
better with it off. In most cases, the performance
degradation you'll experience if statistics aren't
regularly updated will be much worse than performance
problems you might encounter by leaving
this option on.
If you believe automatic statistics updating
impedes your system performance, you have a
couple of alternatives for capturing statistics. One
is to turn off automatic updates for a single table
or index, then use ALTER INDEX and set the
STATISTICS_NORECOMPUTE option to ON.
Another choice is to use a SQL Server 2005 feature
called AUTO_UPDATE_STATISTICS_ASYNC,
which can be set with ALTER DATABASE. If this
option is on, SQL Server will automatically update
statistics in the background, and the query that
detected that statistics needed updating won't wait
for the statistics updating to occur. Individual queries
won't be affected by the updating of statistics,
and the system as a whole will benefit by having
the statistics regularly updated. You can read more
about ALTER INDEX and ALTER DATABASE
in SQL Server Books Online (BOL).
Although having automatic statistics updating
is rarely detrimental, one problem with this feature
is that it's activated only after a substantial number
of changes have occurred in the index key values.
Although very small tables might have their index
statistics updated more aggressively, any table
with more than 500 rows needs to have at
least 20 percent of its rows modified, or the
number of rows needs to increase by at least
20 percent in order for SQL Server to detect
that statistics are out of date. This means that for a
100,000-row table, you have to update or insert at
least 20,000 rows.
If the updates or inserts are evenly distributed
throughout the table, statistics updating might not
be necessary. For example, if you modify 10 percent
of the rows in the table, and the updates are evenly
spread throughout the table, the original statistics
could still continue to provide satisfactory estimates
of the number of rows that would satisfy a query.
Updating Statistics After a Range
of Values Changes
However, if the data changes so that there are many
new values greater than the previous maximum data value, it might be impossible to obtain good plans until
the statistics are updated. Let's look at an example.
Run the code in Listing 1 to build a table called
dbo.details in the AdventureWorks database, and
copy the rows from Sales.SalesOrderDetail into the
table. Listing 1 then builds a clustered and nonclustered
index on the table.
The dbo.details table has 121,317 rows and 1,258
pages. You should be aware that the nonclustered
index on SalesOrderID will be used when only a few
rows from the dbo.details table are accessed. For this
table, the cutoff is about 350 rows. A clustered index
scan, which is the same as a table scan, is used for
more than about 350 rows. Look at the execution
plans for the following two queries to verify this.
SELECT * FROM details
WHERE SalesOrderID > 75000
SELECT * FROM details
WHERE SalesOrderID < 56000
The first query returns 301 rows and uses a nonclustered
index seek and a lookup into the clustered
index. The second query returns 57,877 rows and
uses a clustered index scan.
The maximum value in the SalesOrderID column
is 75,123. Let's update several thousand rows to be
greater than that maximum. The following query
updates 14,148 rows to be greater than the previous
maximum.
UPDATE dbo.details
SET SalesOrderID = 76000
WHERE SalesOrderID < 47000
GO
When you look at the plan and number of logical
reads for the following SELECT statement, you'll see
that SQL Server continues to perform a nonclustered
index seek because the statistics don't reflect the large
number of rows greater than the previous maximum.
SET STATISTICS IO ON
SELECT * FROM dbo.details
WHERE SalesOrderID > 75000
GO
Continued on page 2.
Prev. page  
[1]
2
next page