To efficiently manage storage for your databases, you need to understand what objects take up disk space and how SQL Server stores those objects. In SQL Server 2000, for example, one simple system table tracks space usage, only two objects consume disk space, and only three types of pages exist to store user data.This structure is relatively easy to manage, but it also has its limitations, especially regarding how SQL Server stores and retrieves large object (LOB) data.
SQL Server 2005's enhanced storage model expands the number and types of objects that consume space, gives you more flexible options for storing variable-length LOB data, and adds functionality to store partitioned data in multiple, different locations. Let's review SQL Server 2000's basic storage model, then look at how SQL Server 2005 manages disk space for data.
Storage Space in SQL Server 2000
In SQL Server releases preceding SQL Server 2005, only two things in a database actually use storage space: tables and indexes. In addition, there are only a few ways that the information in tables and indexes can be stored.
SQL Server 2000 provides three types of pages for storing application information: index pages, data pages, and a third type of page for storing LOB data. LOB data is data defined as one of three data types: text, ntext, or image.
This relatively simple storage model requires only one system table, sysindexes, to keep track of all SQL Server's space-consuming objects. Sysindexes holds one row for each table and one row for each index on the table. It also optionally stores one row per table to keep track of any LOB data the table contains. Every row in sysindexes contains information about how much disk space the table, index, or LOB data consumes and where you can find the pages for that structure.
If a table has a clustered index, the table's data is considered part of the index, so the data rows are actually index rows in sysindexes. For a table with a clustered index, sysindexes has a row with an index ID (indid) value of 1. If a table has no clustered index, there is no organization to the table data; we call such a table a heap. A heap in sysindexes has an indid value of 0. Every additional index has a row in sysindexes that contains an indid value from 2 to 250. (See "Digging Up the Dirt on Indexes," December 2004, InstantDoc ID 44002, for information about how to interpret sysindexes' first and root columns to determine which file and page number contain the first page of the table and the root of the index.)
The sysindexes column that contains the indid value is a tinyint data type, which means it can hold values up to only 255.The maximum indid value for an index is 250 because the values 251-254 are reserved, and SQL Server 2000 uses the special indid value of 255 to keep track of pages holding LOB data. SQL Server uses the same sysindexes columns that keep track of the space a table or index uses to keep track of the space consumed by all the LOB data in any row or column of the table. (For more information about sysindexes, see "Something About Sysindexes," April 2005, InstantDoc ID 45408.)
The simplicity of the sysindexes table is one of its strengths. However, SQL Server 2005 addresses several problems with the sysindexes table structure.
One problem is that the sysindexes table also stores rows for column statistics not associated with an index, which means these statistics must have a unique indid value. If you have a lot of statistics on a table, you could run out of indid values before you build all the indexes you need on the table.
Sysindexes also doesn't allow for other types of pages or for redesign if the relationship between pages changes. Finally, because SQL Server 2000 considers LOB data as a special kind of data associated with a table, indexes can't contain LOB data. In SQL Server 2000 and earlier releases, you can't create an index on a text, image, or ntext column, so the sysindexes structure remained workable. But things changed in SQL Server 2005.
New Ways of Storing Data
SQL Server 2005 still needs to store regular rows for data and indexes. But SQL Server 2005's new varchar(max) data type lets you define a column to hold both regular row data and LOB data. In addition, you can define indexes to have columns of type varchar(max), which means an index can contain LOB data. (For more information about varchar(max), see the Web-exclusive T-SQL 2005 column, "MAX Specifier Boosts Variable-Length Data Type," September 2005, InstantDoc ID 47915.)
Another new mechanism lets you define multiple large varchar fields, but instead of using MAX as the maximum length, you can use an integer up to the SQL Server 2000 maximum length of 8000.This technique lets you populate multiple large varchar columns in a row so that the total length of the row exceeds the maximum length that SQL Server can store on a page. SQL Server simply stores any varchar fields that can't fit on the row on special pages called ROW_OVERFLOW pages, which I discussed a bit in "Piecing Together Fragmentation," December 2005, InstantDoc ID 48129. SQL Server can also store index rows on ROW_OVERFLOW pages.
SQL Server 2005 also lets you partition a table or index and have its rows stored in multiple different locations. Sysindexes has no way to indicate that a single structure occupies multiple storage areas.
Thus, in SQL Server 2005, instead of considering indexes as a subsidiary storage structure to tables, you need to consider tables and indexes on an equal footing. Both tables and indexes need to store regular rows, LOB data, and ROW_OVERFLOW data. And both tables and indexes can be partitioned.
Prev. page  
[1]
2
next page