DOWNLOAD THE CODE:
Download the Code 92772.zip

As I discussed last month in “Stretching the 8K Row Limit,” InstantDoc ID 50490, SQL Server 2005 uses row-overflow pages to store data in variable-length columns that exceed the 8060-byte maximum row size.But even with the ability to store more than 300 columns of variable-length data of up to 8000 bytes each, in every row, you might still want to circumvent the 8K-per-column limitation sometimes. You can do so by using SQL Server’s support for large object (LOB) data and the MAX specifier with varchar columns.

Large Objects in SQL Server
SQL Server has long been able to store large objects in LOB columns, by using the text and image data types (since SQL Server 4.0) and, additionally, the Unicode ntext data type (since SQL Server 7.0). Let’s take a closer look at LOB data by running the code in Listing 1, which creates the hugerows table with a fixed-length column of 1000 bytes, a variable-length column of up to 8000 bytes, and a text field and inserts one row into the table.

By default, the text data is stored completely off the row on another type of page, with only a 16-byte pointer in the row itself. So the data row inserted in Listing 1 will be only 2016 bytes long, plus a few overhead bytes, and fits on the regular data page. The LOB data will be stored on one or more additional LOB pages. One limitation in the INSERT statement in Listing 1 is although I’ve attempted to insert 10,000 bytes into the text field, which can hold 2GB of data, the REPLICATE function can’t return a value of type text and quietly truncates anything longer than 8000 bytes to that size. So if I run the following SELECT statement, SQL Server reports that the length of column c is 8000 bytes:

SELECT datalength(c) FROM hugerows

The code in Listing 2 executes DBCC IND on the hugerows table, captures the output, and returns some of the most relevant columns. Table 1 shows Listing 2’s output: one page for the regular in-row data, one page for the Index Allocation Map (IAM) for the in-row data,one IAM page for the LOB data, and two pages for the LOB data. (Your output, of course, will show different values for the actual page numbers.)

LOB Data Storage
Beginning with SQL Server 7.0, LOB data is stored in a tree structure, similar to the trees used to store indexes. This structure makes accessing part of the data efficient if you know the offset within the column. If the amount of data in a LOB column is less than about 40KB, the text pointer in the data row points to an 84- byte text root structure. This structure forms the root node of the B-tree structure, and each row has its own root. The root node points to the blocks of text, ntext, or image data. One of the LOB data pages in Table 1 is a page containing the root structure for the first row; the others are the actual LOB data—the string of 8000 c’s.

Let’s add another row to the hugerows table, by using the statement in Listing 3. Now when you run the code in Listing 2 again, you should get only one more row of output, indicating the LOB page needed to store the new text data with the string of f’s.The root structure for this second row can be stored on the same page as the first row’s root structure; no new page is needed. Although the data for LOB columns is arranged logically in a B-tree, physically both the root node and the individual blocks of data can be spread throughout the LOB pages for the table—wherever space is available. In this case, since the actual data took almost an entire page, SQL Server put the root structures on a separate page, but that might not always happen. If the text data took only half a page, the root structure could be stored on the same page, and you wouldn’t need a separate LOB page just for the root structures. The size of each block of data is determined by the size written by an application.

To see all three types of pages, insert one more row into the hugerows table, as Listing 4 shows.This time, the varchar column is so large that even without the text column,the row no longer fits in one page.The fixed- length column needs 1000 bytes,so an additional 8000 bytes in the varchar column won’t fit. Run the code in Listing 2 again, and you’ll get nine rows of output, similar to the output that Table 2 shows.

You now have one IAM page and one data page for the regular in-row data,an IAM page and a data page for the row-overflow data, and five LOB pages: an IAM page, a page containing the three root structures, and three pages for the actual text data.

Although LOB columns can hold up to 2GB of data in a column, they aren’t easy to work with. Many functions don’t take LOB data types as input, few functions can return a LOB data type as output (see the previous REPLICATE function),and LOB data columns can’t be accessed from the inserted and deleted tables in the body of an AFTER trigger. As of SQL Server 2000, LOB data columns can be accessed from the inserted and deleted tables in the body of an INSTEAD OF trigger. To do most of the initial population and manipulation of LOB data, you must use the special operators READ- TEXT, WRITETEXT, and UPDATE-TEXT,which are employed differently from SELECT,INSERT and UPDATE. For more information about these special operators, see “BLOB Access Technologies,”July 2003, InstantDoc ID 39058;“Importing Word Documents into SQL Server,” March 2003, InstantDoc ID 37903;and “Off the Record,” January 2003, InstantDoc ID 26997.

   Prev. page   [1] 2     next page
 
 

ADS BY GOOGLE