DOWNLOAD THE CODE:
Download the Code 97809.zip

Executive Summary:
Microsoft SQL Server 2005 can create statistics on unindexed table columns--called column statistics-which along with index statistics can help the SQL Server query optimizer determine the most efficient query plans. To enable column statistics, you need to make sure the auto_create_statistics database option is set to ON. Column statistics can help the query optimizer choose a better plan in specific cases, such as for queries involving multiple tables or tables that already have indexes.


Using statistics on unindexed columns can help determine the best query plans

SQL Server 2005 generates and automatically maintains statistics for every index you create, a feature that can help you in your query performance-tuning efforts. (See “Making the Most of Automatic Statistics Updating,” October 2007, InstantDoc ID 96767 for more information about automatic updating of index statistics.) In addition to providing statistics for indexes, SQL Server can create statistics on unindexed table columns—called column statistics. Along with index statistics, column statistics help the SQL Server optimizer create efficient query plans. By understanding how and when SQL Server creates column statistics and how they’re used, you can recognize situations when you could use them to improve the optimizer’s performance.

Working with Column Statistics

By default, SQL Server automatically creates statistics every time an unindexed column is referenced in a query’s WHERE or ON clause. This default applies only when the auto_create_statistics database option is set to ON, which is the default setting in the model database and thus in all new databases you create.

The examples I’ll employ in my discussion of column statistics use copies of the Person.Contact and Human Resources.Employee tables from the AdventureWorks database. To follow along, first make sure that auto_create_statistics is set to ON in the AdventureWorks database, by running the following statement:

ALTER DATABASE AdventureWorks
SET auto_create_statistics ON;

Next, run the code in Listing 1, which creates the two tables in the database. You can see all statistics that exist on a specified table by using the sp_helpstats system procedure. By default, sp_helpstats shows only the column statistics, but when you add a second parameter, 'ALL', the procedure will return both column statistics and index statistics.

To see how sp_helpstats works, run the following code to view the statistics on the dbo.Contact table before you run any queries against it:

EXEC sp_helpstats
‘dbo.Contact’, ‘ALL’;

(Some code in this article wraps to multiple lines because of space constraints in print.) You should see the message This object does not have any statistics or indexes. However, if you query the dbo.Contact table, then examine the statistics again, as the following code shows, you should see different results.

SELECT * FROM dbo.Contact
WHERE FirstName like ‘K%’;
GO
EXEC sp_helpstats ‘dbo.Contact’;

Your output should look similar to that in Table 1.

The results (without the 'ALL' parameter) show you the automatically created statistics on the FirstName column, but they’d also return any statistics you created manually by using the CREATE STATISTICS command. The automatically created statistics have a system-generated name that always starts with the eight characters “_WA_Sys_.” These eight characters are followed by a hexadecimal value indicating the ordinal position of the column in the table (the fourth column, in this case) and a hex value representing the table’s object_id. For the output in Table 1, the hex value 76EBA2E9 is equivalent to the decimal value 1995154153, which the object_name function confirms is the object_id of the dbo.Contacts table. Manually created statistics can have any name you choose as long as it’s a legal identifier name.

SQL Server 2005 also provides a catalog view called sys.stats that lets you list your column statistics, but to get the basic information provided by the sp_helpstats procedure, you’d need to join sys.stats with sys.stats_columns and sys.columns and include special processing for the cases where manually created statistics included multiple columns. If you also want to return information on index statistics, you must also include the sys.indexes view.

For the most part, SQL Server manages column statistics the same as index statistics, regardless of whether column statistics are automatically or manually created. If the auto_update_statistics database option is set to ON, column statistics will be automatically updated as your data changes. The point at which SQL Server will update the statistics is based on the same recompilation threshold that index statistics use—namely, when 20 percent of the rows in the table are modified. The UPDATE STATISTICS command will update all statistics on a table, both index and column statistics. Manually created statistics can be on multiple columns, but unlike index statistics, there’s no column or byte limit on column statistics.

Statistics Changes in SQL Server 2005

Microsoft completely rewrote the algorithms for SQL Server 2005 that determine how it gathers statistics, but you shouldn’t notice this change in your applications when you migrate from an earlier SQL Server version, except for perhaps better performance because the algorithms make the statistics more accurate.

There’s one difference you might notice after upgrading to SQL Server 2005, if you’re used to examining the database catalog and index ID values in sysindexes. In SQL Server 2000, the sysindexes.indid column is a tinyint, and values 2 through 250 are used for both index ID and statistics ID values. A clustered index always has a value in the indid column of 1. So although the documentation says you can have 249 nonclustered indexes on a single table, if you have many statistics on a table, in SQL Server 2000 you can run out of possible indid values before you create anywhere near 249 nonclustered indexes.

Continue to page 2

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

DOWNLOAD THE CODE and Listing links do not work :-(

seequell

Article Rating 3 out of 5

Thanks, interesting artical

shaunt_khaldtiance@hotmail.com

Article Rating 5 out of 5