DOWNLOAD THE CODE:
Download the Code 7632.zip

Use these tips to get the best performance from your database

It can be difficult to determine whether a table scan or an index will provide better query performance. The answer depends, in part, on the percentage of rows returned, but another important factor is whether you use a clustered or a nonclustered index. The SQL Server 7.0 query optimizer doesn't always choose the fastest method. I've done some testing to determine when to second-guess the optimizer, and I share my conclusions here.

Table Scan and Index Access
The SQL Server optimizer's job is to choose the best way to execute a query. The optimizer uses indexes to improve query execution time. When you query a table that doesn't have indexes, or if the optimizer decides not to use an existing index or indexes, the system performs a table scan.

Index access is an access method in which SQL Server uses an existing index to read and write data pages. Because index access significantly reduces the number of I/O read operations, it often outperforms a table scan.

In a table scan, SQL Server sequentially reads the table's data pages to find rows that belong to the result set. SQL Server 7.0's use of Index Allocation Map (IAM) pages significantly increases table scan performance. (For more information about IAM pages, see Inside SQL Server, "The New Space Management," April 1999.) SQL Server 7.0 reads IAM pages to build a sequential list of disk addresses, which lets the database system optimize I/O performance when it uses large sequential read operations. In earlier releases of SQL Server, data pages are in a double-linked chain and SQL Server reads each page individually to move the pointer to the next page, resulting in a series of single-read operations.

Clustered and Nonclustered Indexes
Because an index's contribution to query performance depends significantly on its type, I'll first introduce two different index types: clustered and nonclustered. When you use the CLUSTERED option in a CREATE INDEX statement, you specify a clustered index for columns. A clustered index physically sorts the table's contents in the order of the specified index columns. SQL Server lets you create only one clustered index per table because it can't physically order the table rows in more than one way. SQL Server navigates from the b-tree structure's root, which was built for a particular clustered index to the leaf nodes. (SQL Server uses a b-tree data structure to construct each clustered or nonclustered index.) A clustered index property has leaf nodes that contain data pages.

When you use the NONCLUSTERED option in a CREATE INDEX statement, you specify a nonclustered index (it's also the default). A nonclustered index has the same index structure as a clustered index, but with two important differences. Nonclustered indexes don't change the row's physical order in the table, and the nonclustered index's leaf level consists of an index key plus a bookmark.

Generally, a bookmark shows where to find the row that corresponds to the nonclustered index key. The bookmark of a nonclustered index key can have two forms, depending on the table's form. If a clustered index exists for other columns of the table, the nonclustered index's bookmark points to the b-tree structure of the table's clustered index. If the table doesn't have a clustered index, the bookmark is identical to the row's Relative Identifier (RID).

SQL Server uses a nonclustered index to search for data in one of two ways. If you have a heap (a table without a clustered index), SQL Server first traverses the nonclustered index structure and then uses the row identifier to retrieve a row. If you have a table with a clustered index, however, SQL Server traverses the nonclustered index structure, then traverses the index structure of the table's clustered index. When SQL Server uses a clustered index to search for data, it starts from the root of the corresponding b-tree structure and usually after three or four read operations, it reaches the leaf nodes where the data is stored. For this reason, traversing a clustered index's index structure is usually significantly faster than traversing the index structure of the corresponding nonclustered index.

From this information on table scans, index access, and clustered and nonclustered indexes, you can see that answering which access method is faster isn't straightforward. One of the most important factors to determine whether an index provides better query performance than a corresponding table scan is SQL Server's choice between a clustered and a nonclustered index.

The following examples use the orders table in the Northwind database, but for performance reasons, I slightly modified the table. First, I use only a subset of the original table's columns because you don't need all the columns in the orders table for this example. Second, you need to increase the number of rows in the table to show any significant differences between the performance of the different types of operations. Listing 1 and Listing 2, page 42, show the queries to create and load the table, respectively.

You first create the nonclustered index with the following Transact SQL (T-SQL) statement:

CREATE INDEX i_orders_orderid ON orders(orderid)

For the second test, you first drop the existing nonclustered index, then create the clustered index with the following T-SQL statements:

DROP INDEX orders.i_orders_orderid
CREATE CLUSTERED INDEX c_orderid ON orders(orderid)
   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

fine to search and get its useful

Anonymous User

Article Rating 4 out of 5

What does your analysis show when query is only one of thousands executing against same table in a tight window of time. Don't table scans create more collisions between those compettative processes ?

GSRSIS

Article Rating 3 out of 5