• subscribe
January 18, 2005 12:00 AM

Implicit or Explicit Index Keys

Sometimes redundancy is a good good thing
SQL Server Pro
InstantDoc ID #44807
Downloads
44807.zip

Having proved that I was on the right track, I kept looking for a query that would perform better when I included the clustered index columns in the nonclustered index definition and used a specific column order. The following query shows a performance difference:

SELECT OrderID, OrderNum, UnitPrice, Discount
FROM BigDetails
WHERE OrderID = 11077
AND Discount = .02 AND UnitPrice <20

To verify the difference between explicitly including the clustered index key and not including it, rebuild the nonclustered index on only the two columns that aren't included in the clustered index:

CREATE INDEX OrderID_Price
ON BigDetails(OrderID, UnitPrice)
WITH DROP_EXISTING

If you SET STATISTICS IO ON and run the preceding SELECT query, you should see that it takes SQL Server 25 logical reads to return the 512 rows of data.

Now rebuild the index to include the clustered columns, but in a different order, so that the columns used in the WHERE clause are listed before those not used and the most selective columns are listed before the least selective:

CREATE INDEX OrderID_Price
ON BigDetails(OrderID, Discount, UnitPrice, OrderNum)
WITH DROP_EXISTING

If you now rerun the SELECT query, you should see that it takes SQL Server only four logical reads to return the same 512 rows of data, which is about 20 percent of the work to run the query with the original index.

Just like the ORDER BY queries, these are covered queries. One advantage of including the clustered index columns in the nonclustered indexes (explicitly or not) is that you increase your chance of having a covered query. For this reason, I limited my testing to finding covered queries that would show a performance difference.

You can get performance benefits from explicitly listing the clustered index columns in a nonclustered index definition, but only if you want them in something other than the default order. My testing showed that many queries behave the same no matter what order you define the keys in, but some perform better with a different ordering. If you're considering building a covering index for a query, make sure you test various permutations of the index column order before deciding on the best solution.

Explicitly including your clustered index key columns in your nonclustered index definitions has one more advantage. Suppose you know that you'll always need a composite index on Column1, Column2, and Column3. If Column3 is your clustered index key, you could leave it out of the nonclustered index definition and still have the index you need. However, if you later drop or redefine your clustered index, you might not have the composite index on the same three columns anymore. By explicitly including all the necessary columns in the nonclustered index definition, you'll always have those columns no matter what happens to any other indexes.

Another MVP pointed out the drawback of including the columns. SQL Server has a size limitation of 900 bytes for all index key columns combined. If you don't explicitly list the clustered index columns in the nonclustered index definition, they don't count against the 900-byte limit (but SQL Server treats the columns as index keys for the nonclustered index). If you list the columns explicitly and they bring your total key size to more than 900 bytes, index creation fails.



ARTICLE TOOLS

Comments
  • Jonathan
    4 years ago
    Jan 07, 2008

    Excellent article! I only wish I would have gotten to it sooner. :)

  • Angela
    6 years ago
    Mar 17, 2006

    great article

You must log on before posting a comment.

Are you a new visitor? Register Here