In my July 2001 column, "Are You in Tune?" (InstantDoc ID 21038), which began a series about query tuning, I mentioned that one of the best ways to ensure that your queries are running as fast as possible is to check that your tables are properly indexed. However, I didn't discuss index structures in depth. A clear understanding of how SQL Server organizes indexes can be a big advantage in determining what kind of indexes can be useful for which queries. This month, I look at how the information in indexes is organized. And I discuss some of the tools you can use to examine the structures of your indexes and gain an even deeper understanding of the valuable information they contain.

Turn Over a New Leaf
SQL Server supports two kinds of indexes, clustered and nonclustered. SQL Server organizes indexes as trees, with one page at the root level, multiple pages at the leaf level, and zero or more levels in between. As Figure 1 shows, indexes are usually represented as an upside-down tree, with the single-page root at the top and the leaf level at the bottom. When I talk about one page being above another, I mean that it is closer to the root. The index pages above the leaf level are called node pages. Each index row in every node page contains two things: the index key value, which is the first key value on a page at the next level of the index, and the page address of that page, which starts with the specified key. In Figure 1, the intermediate level has three pages: 25, 57, and 102. The level above (which is the root) contains each of the first key values from these three pages plus the numbers of the pages that contain those values. SQL Server manages both clustered and nonclustered indexes as balanced trees, or B-trees, which means that all the branches always have the same number of levels as the other branches between the one-page root and the leaf level.

Clustered and nonclustered indexes have certain similarities in the content of their leaf levels. In both types, every key value appears in the leaf level, and all the keys are in order by their data types. For example, if the index is on the firstname column, the leaf level will have a row for every first-name value, in alphabetical order according to the collation of the column or database. If the index is on birthdate, every birth date will appear chronologically in the leaf level. Clustered and nonclustered indexes differ in what other values the leaf level contains. For a clustered index, the leaf level is the data, so the leaf level contains all the columns in the table. In a nonclustered index, each index row in the leaf level contains a pointer to the data.

The size of an index depends primarily on the size of the index key or keys. An index can have as many as 16 columns for its keys, and the key size is the sum of all the column sizes, measured in bytes. (See the "Data Types" section in SQL Server Books Online—BOL—for a list of how many bytes each type of data requires.) In addition to index keys, each node-level index row contains a pointer to a page at the next level of the index, and page pointers are 6 bytes each. Each index row contains some additional overhead bytes in the node levels. The amount of overhead varies depending on the type of index—clustered or nonclustered, unique or nonunique. If the table has a clustered index, each nonclustered leaf row contains the corresponding clustered key value as the pointer, so the size of your clustered key directly affects the size of every nonclustered index.

How Big Is Big?
SQL Server builds an index from the bottom up: It builds the leaf level first, then sorts it before building the next level. SQL Server doesn't have to build a clustered index's leaf level because the data already exists, but it does have to sort the entire data set. For a nonclustered index, SQL Server takes all the key values (including duplicates) from the data and stores them each in an index row along with the pointer to the location of the data row containing that key. A nonclustered index's leaf level can be quite large compared to the other levels of the index because the leaf level contains an entry for every key value in the underlying table. However, the nonclustered index's leaf level can be quite a bit smaller than the table itself.

For example, consider a table of 10,000 pages and 500,000 rows that has a clustered key of a 5-byte fixed-length character type. If this table also has a nonclustered index on a 5-byte fixed-length character column, the leaf-level rows (level 0) of the nonclustered index will be 11 bytes long because they'll each contain the nonclustered index key (5 bytes), the corresponding clustered index key (5 bytes), and 1 overhead byte. Because each page has 8096 bytes available and 8096 bytes divided by 11 index bytes per row is 736 index rows, an index page for this nonclustered index can hold 736 index rows at this leaf level. The index will need 500,000 index rows, one for each row of data, so the index will have 680 leaf pages.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.