• subscribe
April 26, 2001 12:00 AM

More Fill-Factor Mysteries

SQL Server Pro
InstantDoc ID #20242
Downloads
20242.zip

First, let me give you some background about page splitting in SQL Server. SQL Server always keeps the keys in an index sorted, and an index's leaf level contains every key value in sorted order. Remember that for a clustered index, the leaf level is the data, so SQL Server sorts all the data by the clustered index key. If a new row is inserted into the table, that row must go in its proper sorted position in the leaf level of every index on the table. The problems I present in the next section are relevant both to the table itself, as the leaf level of a clustered index, and to the leaf level of any nonclustered indexes. But for simplicity, I'll discuss just the clustered index leaf level.

If a page is completely full and an INSERT operation tries to insert a new row on that page, SQL Server must split the page. This split involves finding another free page and linking it into the logical sequence of pages for this index. SQL Server needs to adjust the m_nextPage and m_prevPage values so that the new page now follows the split page, and the page that formerly followed the split page now follows the new page. Instead of using DBCC PAGE to look at the m_nextPage and m_prevPage values, you can refer to the output from DBCC IND, which shows the next page's Page ID in the NextPagePID column and the previous page's Page ID in the PrevPagePID column.

After the new page is part of the list of pages at the leaf level, SQL Server moves approximately half the rows from the full page onto the new page. Now both pages are about half full, and the new row can go onto one of the half-full pages.

Roll It Back
But what happens if, after all this work, the INSERT transaction is canceled and SQL Server must roll back the insert? In SQL Server releases before 7.0, SQL Server would undo the page split, put all the rows back where they came from, and unlink the new page from the list. But if another process immediately needed to insert into that full page, SQL Server would have to split the page all over again. Starting with SQL Server 7.0, after the page split has occurred and SQL Server has moved the rows from the full page to the new page, that activity is committed. Even if SQL Server rolls back the INSERT operation, the page remains split.

Here's what I suspect the other instructor might have been thinking. In general, you use a fill factor of 100 only for a table that is completely—or almost completely—static. Inserting a row might be a rare occurrence, and if SQL Server rolled back the insert after the page split, you'd no longer have perfectly full data pages. In that case, undoing the split might make sense because future inserts would be rare.

To demonstrate an example of splitting completely full pages, let's create a table like the one I created last month, but insert only even numbers as values. Then, insert rows with odd-numbered values between the even-numbered rows. The script that Listing 1 shows creates the table. On the table, the script builds a clustered index that has a fill factor of 100, so all the data pages will be completely full. We don't need many rows, so let's just insert rows containing all the even numbers from 2 to 100.

For this example, you only need to use the DBCC IND command and find out how many data pages are in the leaf level of the clustered index. Running DBCC IND on the table shows you five rows, representing five pages: One is the Index Allocation Map (IAM) page, one is an index page, and the other three are data pages. Listing 2 shows a transaction containing a statement to insert a row with a value of 11 on the first page of the table between the "10" row and the "12" row. Before terminating the transaction, the script executes DBCC IND again; you can see that the table now has four data pages. The script rolls back the transaction, and the table still has four data pages: SQL Server didn't undo the split.

Now run the script from Listing 1 again, but this time take out the comment markers to drop the previous table and set the fill factor to 100 for the clustered index. DBCC IND will initially show you three data pages again. Running the code in Listing 2 will give you identical results: The INSERT causes a page split and a new data page, and that new data page remains even after you roll back the transaction.

Listing 3 shows three more inserts you can use as a final test to see what happens after SQL Server splits the page. First, the script inserts the 11 again, then it inserts a 23. Because SQL Server had already split the first page of the table, it has room to insert both rows, so no additional page splitting occurs. However, when the script tries to insert a row containing 55, that row must go on the original second page of the table, which is still full. Another split occurs, and DBCC IND shows that the table now has five data pages.

One way you can know that a page has split is to run DBCC IND to determine how many pages are in the table after an INSERT or UPDATE operation. But you might want to know which rows moved to the new page and which ones stayed on the original page. You might also wonder where the split occurred. If SQL Server splits a page in half, does that mean half the number of rows or half the number of bytes? The two answers could be different if you have variable-length rows and the page contains both short and long rows. To answer these questions, I'll examine the DBCC PAGE command in more detail next month.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here