• subscribe
July 20, 2006 12:00 AM

Stretching the 8K Row Limit

Discover how SQL Server 2005 stores row-overflow data
SQL Server Pro
InstantDoc ID #50490
Downloads
50490.zip

Over the last few months, I've discussed how SQL Server 2005 keeps track of storage space that tables and indexes use and that the new metadata structures are a bit more complex than those in SQL Server 2000, but also offer greater flexibility. In "Managing Data Space," June 2006, InstantDoc ID 50009, I explain that each table (or index) can have data stored in three different forms: regular IN_ROW data, large object (LOB) data, and row-overflow (ROW_OVERFLOW) data, which is new in SQL Server 2005.You might find the row-overflow capability particularly useful if you have tables containing multiple variable-length columns, whose total size is usually within the maximum row size of 8060 bytes but may occasionally exceed that. Let's take a closer look at how SQL Server works with the new row-overflow data.

Bigger Rows
The new ability to store some of a row's data off the actual data page is the first change in the maximum row length that we've seen since SQL Server 7.0. In SQL Server 7.0, the size of data pages changed from 2K to 8K. Along with the increase in page size, the maximum size of a data row increased correspondingly: In SQL Server 7.0, the maximum row size increased to 8060 bytes from 1962 bytes in SQL Server 6.5. Both of these maximum-size values include several bytes of overhead stored with the row on the physical pages, so that the total size of all the table's defined columns needed to be slightly less than the maximum size. In fact, SQL Server 2005 changed the error message you get if you try to create a table with rows that exceed the maximum, which in some cases (as you'll see shortly) is still limited to the same maximum that it was in SQL Server 2000. For example, in SQL Server 2005, if you execute the CREATETABLE statement in Listing 1, using column definitions that add up to exactly 8060 bytes, you'll get the error message that Figure 1 shows. (Note that the error messages and some code lines in this article wrap to multiple lines because of space constraints.)

In this message, you can see the number of overhead bytes (7) that SQL Server 2005 wants to store with the row itself and realize that for some rows, the previous size limit still applies. SQL Server 2005 stores only variable-length columns in the special row-overflow pages, and only if all the fixed-length columns will fit into the regular IN_ROW data limit.The table in Listing 1 has all fixed-length columns.

Now let's look at a table that has all variable-length columns. Run the code in Listing 2 to create a table with rows that have a maximum defined length of much greater than 8060 bytes. In fact, if you ran this CREATETABLE statement on SQL Server 7.0, you'd get an error and the table wouldn't be created at all. In SQL Server 2000, the table will be created, but you'll get a warning.Although my columns are all varchar, varbinary, nvarchar and sql_variant columns, columns that use the (CLR) user-defined types can potentially also be stored on row-overflow data pages.

As you see when you run the code in Listing 2, you get no such warning in SQL Server 2005. If you then insert a row that has variable-length columns with sizes that add up to more than 8060 bytes, the row in the statement in Listing 3 will be successfully inserted.

Where's My Data?
To determine whether SQL Server is storing any data in row-overflow data pages for a particular table, we can use the DBCC IND command, which I discuss in detail in "Index Internal Information," January 2005, InstantDoc ID 44572.The SQL Server 2005 version of DBCC IND returns three additional columns that reflect the new structures, including partitions and row-overflow data. These three new columns are PartitionNumber, PartitionID, and iam_chain_type. iam_chain_type describes the type of data stored on the page and has one of these values: in-row data, row-overflow data, or LOB data.

The script in Listing 4 creates a new table, similar to one I created in "Index Internal Information," to hold the rows that DBCC IND returns. Note that the characters sp_ at the beginning of an object name in the master database mean that the object can be accessed from any database without qualifying it with the source database name. DBCC IND returns one row for each page belonging to a specified table or index. The following statement runs DBCC IND on the bigrows table and saves the result in sp_table_pages:

INSERT INTO sp_table_pages 
    exec ('dbcc ind 
    ( tempdb, bigrows, -1)' ) 

To see only relevant columns from the output—PageFID (the page's file ID), PagePID (the page ID), iam_chain_type, and pageType—you can run this statement:

SELECT PageFID, PagePID, 
  iam_chain_type, pageType 
FROM sp_table_pages 

Table 1 shows the output for the previous statement. (The actual page numbers you get will most likely be different.) In the DBCC IND output, both row-overflow pages and LOB pages come back with a pageType value of 3; thus we also need to see the iam_chain_type value to identify whether the page is row-overflow or LOB data. Index Allocation Map (IAM) pages report the same iam_chain_type value as the type of page they're keeping track of, and we need the pageType value to see the 10 that indicates an IAM page.The output in Table 1 shows that there's one page of regular in-row data, with its IAM page, and one page of row-overflow data,with its IAM page.

SQL Server stores variable-length columns on row-overflow pages only under certain conditions. The determining factor is the length of the row itself. It doesn't matter how full the regular page is into which SQL Server is trying to insert the new row. SQL Server will construct the row normally, and only if the row itself needs more than 8060 bytes will some of its columns be stored off the row and on the overflow pages. Each column in the table is either completely on the row or completely off the row.This means that a 4000-byte variable-length column can't have half its bytes on the regular data page and half on a row-overflow page. If a row is less than 8060 bytes and there's no room on the page where SQL Server is trying to insert it, SQL Server will apply its usual page-splitting algorithms.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here