As I discussed last month in “Stretching the 8K Row Limit,”
InstantDoc ID 50490, SQL Server 2005 uses row-overflow pages to store data in
variable-length columns that exceed the 8060-byte maximum row size.But even
with the ability to store more than 300 columns of variable-length data of up
to 8000 bytes each, in every row, you might still want to circumvent the 8K-per-column
limitation sometimes. You can do so by using SQL Server’s support for
large object (LOB) data and the MAX specifier with varchar columns.
Large Objects in SQL Server
SQL Server has long been able to store large objects in LOB columns, by using
the text and image data types (since SQL Server 4.0) and, additionally, the
Unicode ntext data type (since SQL Server 7.0). Let’s take a closer look
at LOB data by running the code in Listing 1,
which creates the hugerows table with a fixed-length column of 1000 bytes, a
variable-length column of up to 8000 bytes, and a text field and inserts one
row into the table.
By default, the text data is stored completely off the row on another type
of page, with only a 16-byte pointer in the row itself. So the data row inserted
in Listing 1 will be only 2016 bytes long,
plus a few overhead bytes, and fits on the regular data page. The LOB data will
be stored on one or more additional LOB pages. One limitation in the INSERT
statement in Listing 1 is although I’ve
attempted to insert 10,000 bytes into the text field, which can hold 2GB of
data, the REPLICATE function can’t return a value of type text and quietly
truncates anything longer than 8000 bytes to that size. So if I run the following
SELECT statement, SQL Server reports that the length of column c is 8000 bytes:
SELECT datalength(c) FROM hugerows
The code in Listing 2 executes DBCC IND
on the hugerows table, captures the output, and returns some of the most relevant
columns. Table 1 shows Listing
2’s output: one page for the regular in-row data, one page for the
Index Allocation Map (IAM) for the in-row data,one IAM page for the LOB data,
and two pages for the LOB data. (Your output, of course, will show different
values for the actual page numbers.)
LOB Data Storage
Beginning with SQL Server 7.0, LOB data is stored in a tree structure, similar
to the trees used to store indexes. This structure makes accessing part of the
data efficient if you know the offset within the column. If the amount of data
in a LOB column is less than about 40KB, the text pointer in the data row points
to an 84- byte text root structure. This structure forms the root node of the
B-tree structure, and each row has its own root. The root node points to the
blocks of text, ntext, or image data. One of the LOB data pages in Table
1 is a page containing the root structure for the first row; the others
are the actual LOB data—the string of 8000 c’s.
Let’s add another row to the hugerows table, by using the statement
in Listing 3. Now when you run the code in
Listing 2 again, you should get only one
more row of output, indicating the LOB page needed to store the new text data
with the string of f’s.The root structure for this second row can be
stored on the same page as the first row’s root structure; no new page
is needed. Although the data for LOB columns is arranged logically in a B-tree,
physically both the root node and the individual blocks of data can be spread
throughout the LOB pages for the table—wherever space is available. In
this case, since the actual data took almost an entire page, SQL Server put
the root structures on a separate page, but that might not always happen. If
the text data took only half a page, the root structure could be stored on the
same page, and you wouldn’t need a separate LOB page just for the root
structures. The size of each block of data is determined by the size written
by an application.
To see all three types of pages, insert one more row into the hugerows table,
as Listing 4 shows.This time, the varchar
column is so large that even without the text column,the row no longer fits
in one page.The fixed- length column needs 1000 bytes,so an additional 8000
bytes in the varchar column won’t fit. Run the code in Listing
2 again, and you’ll get nine rows of output, similar to the output
that Table 2 shows.
You now have one IAM page and one data page for the regular in-row data,an
IAM page and a data page for the row-overflow data, and five LOB pages: an IAM
page, a page containing the three root structures, and three pages for the actual
text data.
Although LOB columns can hold up to 2GB of data in a column, they aren’t
easy to work with. Many functions don’t take LOB data types as input,
few functions can return a LOB data type as output (see the previous REPLICATE
function),and LOB data columns can’t be accessed from the inserted and
deleted tables in the body of an AFTER trigger. As of SQL Server 2000, LOB data
columns can be accessed from the inserted and deleted tables in the body of
an INSTEAD OF trigger. To do most of the initial population and manipulation
of LOB data, you must use the special operators READ- TEXT, WRITETEXT, and UPDATE-TEXT,which
are employed differently from SELECT,INSERT and UPDATE. For more information
about these special operators, see “BLOB Access Technologies,”July
2003, InstantDoc ID 39058;“Importing Word Documents into SQL Server,”
March 2003, InstantDoc ID 37903;and “Off the Record,” January
2003, InstantDoc ID 26997.
Prev. page  
[1]
2
next page