Get the inside scoop about using covering indexes

Using nonclustered indexes in a certain way can make them almost as useful as clustered indexes. As I mentioned in "Time for a Tune-Up," August 2001, if the keys of a nonclustered index include all the columns referenced in a query, SQL Server doesn't have to access the table's data pages at all. Because the index's leaf level contains all the necessary information, which is logically stored in order of the first index key, that leaf level is almost like a table with a clustered index. A nonclustered index that contains all the data a query needs is called a covering index; a query that such an index supports is called a covered query.

Covering Indexes
In "Time for a Tune-Up," I looked at a couple of examples of when a covering index on a single column can be useful. One such situation is when you're interested only in the values in one column. Let's say I want to know the ID values for all customers in the Northwind database whose ID value starts with the letter C. The only information I need is in the customerID column, so a nonclustered index on customerID would cover this query:

SELECT  DISTINCT customerID FROM  Orders
WHERE  customerID LIKE 'C%'

Covering indexes are also often useful when you're dealing with aggregates. Remember that a nonclustered index contains a row with the index key value for every row in the base table, even if duplicates exist. Suppose you want to know how many rows in the Orders table have the customerID value ERNSH. A nonclustered index on customerID will contain an index row for every instance of ERNSH, so SQL Server can just count the entries in the index's leaf level:

SELECT  count(*) FROM  Orders
WHERE  customerID = 'ERNSH'

The previous two examples show queries that are covered by nonclustered indexes on a single column. Even more useful as covering indexes are composite indexes. Let's build a composite index on the Northwind database's Customers table on the country and city columns:

CREATE INDEX location_index ON Customers(country, city)

SQL Server can then use this composite index in queries such as

SELECT  city FROM  Customers
WHERE  country = 'Brazil'

If you run the above query with STATISTICS PROFILE and STATISTICS IO on, the execution plan shows that SQL Server used an index seek on the index location_index:

SELECT [city]=[city] FROM [Customers] WHERE [country]=@1
 |--Index Seek(OBJECT:([Northwind].[dbo]
 .[Customers].[location_index]), SEEK:
 ([Customers].[Country]=Convert([@1])) ORDERED FORWARD)

SQL Server used the index to traverse the index levels and find the first index row containing Brazil. Then, SQL Server could move horizontally through the index, retrieving the city values from the index rows until it encountered a country value that wasn't equal to Brazil.

Nothing in the query's plan specifically tells you that SQL Server is using a covering index, but you can compare this plan with the plan of the following similar noncovered query, which needs data from a column that isn't in the index:

SELECT CompanyName, city FROM Customers
WHERE  country = 'Brazil'

This query's plan shows a bookmark lookup operation as its middle step. The covered query, which was selecting just the city column, didn't show a bookmark-lookup, which means that SQL Server never accessed the data. In a bookmark-lookup operation, SQL Server takes the pointer from the nonclustered index row and uses it to access data in the table.

Another difference between the covered and noncovered queries is the number of logical reads they require. The covered query requires only one logical read, whereas the noncovered query requires 19 page accesses.

You can use covering indexes in two ways. For the previous covered query, SQL Server traverses the index from the root to the leaf. It stops looking when it gets to the leaf because it has found in the index all the data values it needs; it never needs to access the actual data pages. I refer to the situation in which SQL Server starts at the root and traverses each level to the leaf as a vertical use of a covering index.

SQL Server can also use a covering index by simply scanning the leaf level. The sysindexes table contains the page address of the first page at the leaf level; each page contains pointers to both the next page and the previous page. When scanning the leaf level, SQL Server doesn't traverse the index in the usual manner (i.e., starting at the root) but instead accesses the index more like it does during a table scan. The case when SQL Server just scans the leaf level is a horizontal use of a covering index.

Here's an example of a query that uses a covering index horizontally:

SELECT DISTINCT customerID FROM Orders

This query asks to see every distinct customerID value in the Orders table. One method SQL Server uses to find distinct values is to first sort the data. Because the key values in a nonclustered index's leaf level are already sorted, scanning the leaf level and returning only one row for each distinct customerID value in the leaf is a very straightforward operation for SQL Server.

SQL Server also horizontally scans a nonclustered index for queries that perform an aggregate on every data value in the index. If you had a nonclustered index on the Quantity column in the Order Details table, SQL Server could use that index to easily find the average quantity value. To test this idea, build the index, then run the aggregate query as follows:

CREATE INDEX QtyIndx ON [Order Details](Quantity)
GO
SELECT  AVG(Quantity) FROM  [Order Details]

The execution plan shows a nonclustered index scan of the QtyIndx index, then a stream aggregation that computes the average, but no access of the data itself.

Create a Cover-Up
Covering indexes provide the fastest response time possible in many cases, so if you have a crucial query that won't use your previously created indexes, consider creating a covering index. You might be able to create such an index by adding another column to an existing index, thereby creating a composite index. You can use a composite covering index for many kinds of queries. In the Northwind database, the Orders table has an index on the customerID column, which lets you execute this query quickly:

SELECT  employeeID  FROM Orders
WHERE customerID  = 'ALFKI'
   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

I can't see the next article on how the SQL Server query optimizer know whether an index will be useful.

Marrow Yung