• subscribe
March 21, 2001 12:00 AM

The Fill-Factor Truth

SQL Server Pro
InstantDoc ID #19851
Downloads
19851.zip

Creating an index with a low fill factor means that your table has room to grow before page splitting would become a cause for concern. However, if your pages are only partially filled, more pages are required to hold all your data, and the index can become quite large. Microsoft defined the fill-factor value of 0 as a compromise between having room to grow and not making the table and indexes much larger than necessary. With a fill factor of 0, the leaf level is full, but the pages in the upper levels of the index, which might also need to be split if they become full, have some room for growth.

In SQL Server releases before 6.5, a fill-factor value of 100 means that not just the leaf level of an index but all levels would be completely full. In general, you use this value only for tables that are primarily read-only. In SQL Server 6.5, Microsoft added another option to the CREATE INDEX command. If you also include the option WITH PAD_INDEX, whatever fill-factor value you specify applies to all levels of an index, not just the leaf level. So creating an index with a fill-factor value of 0 and including the PAD_INDEX option is the same as using a fill-factor value of 100. But what would a fill factor of 100 with PAD_INDEX mean? (Note that you can specify a FILLFACTOR value when defining a primary key or unique constraint on a table because SQL Server automatically builds an index to support those constraints. However, you can't specify the PAD_INDEX option along with a constraint definition; you can use it only in the CREATE INDEX statement.)

The Question
Starting with SQL Server 7.0, I began frequently seeing the question: What is the real difference between a fill factor of 0 and a fill factor of 100? Although SQL Server 2000 and 7.0 Books Online (BOL) specify that a fill factor of 100 means that all levels of the index are full, many people apparently don't think this is true. I kept meaning to figure it out for myself "one of these days," but that day took several years in arriving. Finally, on a recent cross-country flight, I ran some tests and found the answer to this question.

My testing consisted primarily of using two undocumented DBCC commands: DBCC IND and DBCC PAGE. These commands aren't listed in BOL, and Microsoft doesn't officially support them, although Microsoft articles occasionally mention DBCC PAGE. To use these undocumented commands, you must first enable trace flag 3604 to return all output from undocumented DBCC commands to the client screen. DBCC IND, which takes three arguments, tells you which pages belong to an index. The first two arguments are the database name and the table name; the third argument controls how much detail you want in the output. A value of -1 returns information about all levels of all indexes on a table. Listing 1 creates a table with 1000 rows in the Pubs database and generates sequentially increasing integer values from 1 to 1000. I wanted numeric values because they're easy to generate. However, I wanted to store them as character strings because the DBCC PAGE command displays character strings more clearly than numeric values, which would help if I used the DBCC PAGE command to see the pages' full contents.

After generating the 1000 rows, the script uses the default fill factor of 0 to create a unique clustered index, then uses DBCC IND to generate a list of all the pages that belong to the table and its indexes. DBCC IND returns 12 columns of output, but I was interested in only four of the columns for my testing:

  • PagePID is the page ID; I used this value with the second DBCC command.
  • IndexID is 0 for the data pages, 1 for pages in the upper levels of the clustered index, and greater than 1 for nonclustered indexes.
  • PageType is 1 for data pages, 2 for index pages, and 10 for Index Allocation Map (IAM) pages.
  • IndexLevel is the index level at which the page occurs. If IndexID is 0, IndexLevel is also 0. The highest IndexLevel value of any IndexID is for the root page; only one page will have that IndexLevel value.

The DBCC IND command in the script in Listing 1 returns page numbers for four pages with PageType 2; of those, three have IndexLevel 0 and one has IndexLevel 1. In addition, if you're working with a database that has more than one data file, you'll also be interested in the PageFID column, which identifies the file containing the page.

Even though the leaf level of a clustered index is the data, the output of the DBCC IND command doesn't treat the data pages as index pages. However, these data pages are the ones that the fill factor applies to because they're the leaf level of the clustered index. You can identify these data pages by their PageType of 1, their IndexID of 0, and their IndexLevel of 0.

On the Trail
The next step of my investigation was to choose one of the index's leaf-level pages and supply it as an argument to the DBCC PAGE command. In SQL Server 2000 and 7.0, this command takes four arguments. The first is the database name or database ID, the second is the file number, and the third is the page number, taken from the output of DBCC IND. The fourth argument is the desired format for the output. For example, a value of 0 means you want to see just the page header. Here's the command I ran:

DBCC PAGE (pubs, 1, 288, 0)

The value on the page header that I was interested in is called m_slotCnt; it tells the number of rows on that page. When I ran the script in Listing 1 and then used DBCC PAGE to look at one of the data pages, I saw that the number of rows on the page was 20. When I ran a similar DBCC PAGE on one of the index pages at the lower level (i.e., not the root), I saw that the number of rows was 19. This was a result I expected; the data pages were full. With rows containing 390 bytes of data, plus about 10 bytes of overhead, only 20 of them can fit on a page, and my data pages have that maximum number. The index rows are about the same size because the index is on the char(390) column, but instead of having 20 index rows on a page, I have only 19 because SQL Server left room for one additional index row.

Using these two tools, DBCC IND and DBCC PAGE, I was able to answer the question: What really is the difference between a fill factor of 0 and a fill factor of 100? Running the same script, but explicitly supplying a FILLFACTOR of 100, I found the number of rows in the data pages and the index pages to be the same as for a clustered index with a FILLFACTOR of 0. I believe I've given you enough information about how to use these tools so that you can confirm this answer for yourself and possibly do more thorough testing. You could also investigate what happens when you specify the PAD_INDEX option when you have a FILLFACTOR of 100. If you don't have the time right now and aren't planning any cross-country flights with your laptop, you can wait until next month to find out what happens with a FILLFACTOR of 100 and the PAD_INDEX option specified. I'll finish showing you the research I did and the results I came up with, and I'll also show you a couple of other questions I was able to answer by using the DBCC PAGE command.



ARTICLE TOOLS

Comments
  • Anonymous User
    7 years ago
    Mar 02, 2005

    This article is really useful for who are seeking a job,they can prepare for interview.

You must log on before posting a comment.

Are you a new visitor? Register Here