DOWNLOAD THE CODE:
Download the Code 7632.zip

Table Scan vs. Nonclustered Index Access
Whether a table scan will perform better than a nonclustered index access depends on the table size, the row's storage pattern, the row's length, and the percentage of rows the query returns. First, because a table scan can profit more from the read-ahead process for a large table, the percentage of returned rows for an efficient table scan in a large table is lower than the percentage in medium-sized tables. Second, processing I/O operations logically (i.e., using a nonclustered index) depends on the row's physical location on the disk. If the rows are widely dispersed, the number of I/O operations will be high and the percentage of returned rows where table scan is more efficient than the corresponding nonclustered index access will be lower than in a case where the row's storage pattern is contiguous. Third, row length influences how many rows can fit into a single data page. When more rows fit on a page, read I/O operations during a table scan perform better than when only a few rows fit on a single page.

My tests show that a table scan often starts to perform better than a nonclustered index access when at least 10 percent of the rows are selected. I also found that the optimizer switches from nonclustered index access to table scan prematurely (i.e., when nonclustered index access still shows better response time than the corresponding table scan). In many cases, the optimizer forces a table scan for queries with result sets of approximately 5 percent, although the table scan becomes more efficient than index access at selectivities of 8 to 10 percent.

You can use the INDEX query hint to improve performance. (SQL Server supports several query hints that you can use to force the optimizer to use a particular index.) The T-SQL statement with the INDEX(i_orders _orderid) query hint forces SQL Server to use the existing nonclustered index i_orders _ orderid, as Screen 2 shows.

Screen 1 shows the execution plan with the table scan; Screen 2 also shows a typical execution plan for the nonclustered index with the Bookmark Lookup step. (Recall that a nonclustered index's leaf level consists of an index key plus a bookmark that shows where SQL Server has to look to find the row corresponding to that key.) The optimizer first uses a table scan; after I applied the INDEX query hint, it used the index access. The selectivity of both queries was approximately 10 percent.

The response time of the query that executes by using the indexed access was slightly better than the other query's response time. The optimizer decided to perform the table scan although SQL Server defined the nonclustered index, as Screen 2 shows.

Table Scan vs. Clustered Index Access
For several reasons, the clustered index usually performs better than the nonclustered index. When the system scans a clustered index, it doesn't need to leave the b-tree structure to scan data pages because the pages already exist at the leaf level. Also, a nonclustered index requires more I/O operations than the corresponding clustered index. Either the nonclustered index needs to read data pages after traversing the b-tree structure, or if a clustered index for another table's columns exists, the nonclustered index also needs to read the clustered index's b-tree structure.

For these reasons, you can expect a clustered index to perform better than a table scan even when selectivity is low. (Low selectivity means that the percentage of returned rows is high.) My tests confirmed the assumption that when selectivity is 75 percent or less, a clustered index access is faster than a table scan. However, my tests showed that the optimizer continues to use clustered index access in the cases where selectivity is greater than 75 percent.

Screen 3, page 44, shows the query that has about 80 percent selectivity. Although my tests show that this query's table scan performs slightly better than the corresponding clustered index access, the optimizer chooses the existing index. Therefore, if you have an existing clustered index and selectivity of a query that uses the index as 75 percent or more, I recommend that you test to determine whether a table scan will perform better than clustered index access. If the table scan performs better than the corresponding clustered index access, use the INDEX(0) query hint to deactivate clustered index access and force the use of a table scan instead. The following T-SQL statement forces the use of a table scan for the query in Screen 3:

SELECT * FROM orders (index (0))
WHERE orderid BETWEEN 11000000 and 88000000

Generally, the query optimizer chooses the fastest execution plan, but the optimizer isn't perfect. For frequently used and complex queries, you need to test within specific selectivity ranges to determine whether the optimizer's decision is correct. For a clustered index, if the query's selectivity is greater than 75 percent, compare the index access and table scan execution times to see which is faster. For a nonclustered index, compare the nonclustered index access and table scan when the query's selectivity is between 5 percent and 10 percent to see which is faster. You'll find these tips will help you achieve the best performance from your database system.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

fine to search and get its useful

Anonymous User

Article Rating 4 out of 5

What does your analysis show when query is only one of thousands executing against same table in a tight window of time. Don't table scans create more collisions between those compettative processes ?

GSRSIS

Article Rating 3 out of 5

 
 

ADS BY GOOGLE