SideBar    Storing BLOBs in the Database or the File System?
DOWNLOAD THE CODE:
Download the Code 92995.zip

Like the jelly-like Blob monster in those old Steve McQueen horror movies, mixing binary large objects (BLOBs) with relational database data has traditionally been unnatural and difficult to manage. BLOBs and relational database data are very different entities. Relational data usually consists of text or numbers and tends to be small. In contrast, BLOB data is most often pictures in .jpg, .tiff, or .bmp format—such as product images on a Web site—which can be quite large.

As the sidebar "Storing BLOBs in the Database or the File System?" discusses, integrating BLOB data and relational database data has its pros and cons. And until recently, SQL Server’s limited data types for storing BLOBs made working with them more difficult than working with traditional relational database data. But with the introduction of three new data types, SQL Server 2005 treats all data the same—whether your BLOB contains images, large amounts of text data, audio files such as MP3s, or even program executables (.exes). In this article, you’ll see how easy it is to use one of the new data types, VARBINARY(MAX), to import BLOB data into and retrieve it from a SQL Server 2005 database. You can download a simple example project that demonstrates this functionality. But before jumping into the code, let’s review the key elements of SQL Server’s BLOB storage

What about BLOB?
SQL Server stores standard INT, CHAR, and VARCHAR data directly within a row. However, this approach limits the maximum capacity of each data type to 8000 bytes, slightly less than SQL Server’s maximum row size of 8060 bytes. (Note that with SQL Server 2005’s row-overflow feature, the maximum row size can exceed 8060 in certain cases; for information about this feature, see Kalen Delaney's "Stretching the 8K Row Limit," InstantDoc ID 50490.)

Although more than adequate for most data types, the 8KB limit is a problem for most BLOB data. To accommodate the needs of larger BLOB data, Microsoft designed SQL Server to handle BLOB storage differently than it handles storage for more common data types. Figure 1 shows an overview of how SQL Server 2005 and earlier releases store the IMAGE and VARBINARY(MAX) BLOB data types.

As you can see in Figure 1, SQL Server doesn’t store large BLOB data on the same data page as the data for the rest of the row. Instead, it stores BLOB data as a collection of 8KB pages organized in a B-tree structure. Each row’s BLOB column contains a 16-byte pointer to the root B-tree structure, which tracks the blocks of data that comprise the BLOB. If the data is less than 64 bytes, SQL Server stores it as part of the root structure. Otherwise, the root structure contains a series of pointers to the data blocks that comprise the binary object.

For BLOBs smaller than SQL Server’s 8KB page size, you have a couple of options for storing the BLOB data inline, as you would standard text and numeric data. For the old TEXT, NTEXT, and IMAGE data types, which SQL Server 2005 continues to support, you can use the text-in-row feature to store the data inline. And for the new VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types, you can use the backward-sounding large value types out of row option. Storing smaller BLOB data inline improves performance, avoiding the extra I/O needed to read the BLOB data record. (For more information about SQL Server’s text-in-row feature, see Kalen Delaney’s "Text in Row Internals," InstantDoc ID 37635.)

Note that BLOBs are sometimes called large objects (LOBs) or character large objects (CLOBs). The LOB designation can refer to both CLOBs and BLOBs. However, technically, CLOBs can contain only text data such as XML, whereas BLOBs can contain any type of data, including binary data. SQL Server 2005 can store LOB data by using the older TEXT, NTEXT, and IMAGE data types or the new VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types. Table 1 lists the attributes for SQL Server’s LOB data types.

In pre–SQL Server 2005 releases, IMAGE is the primary data type for BLOB storage and can contain 2GB of virtually any type of binary data. Although similar to IMAGE, TEXT and NTEXT data types can store only text data, not binary data. TEXT can accommodate up to 2GB of non-Unicode text data, and NTEXT can accommodate up to 1GB of Unicode text data.

However, these earlier BLOB data types have some frustrating limitations. You can’t use them as local variables in stored procedures or T-SQL batches, as part of an index, or in WHERE, ORDER BY, COMPUTE, or GROUP BY clauses. (The exception: You can use TEXT and NTEXT with the LIKE keyword.) However, the biggest problem with IMAGE, TEXT, and NTEXT data types is that to access them, you have to use a different programming model than you use to access other SQL Server data types.

SQL Server 2005’s VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data types eliminate these limitations while still letting you store large amounts of data. Unlike with earlier BLOB data types, you access the various (MAX) data types the same way as other SQL Server data types. In addition, you can declare the new data types as local variables. And you can use SQL Server string-handling functions such as SUBSTRING() on VARCHAR(MAX) and NVARCHAR(MAX) columns. VARCHAR(MAX) and VARBINARY(MAX) both provide up to 2GB of storage, whereas NVARCHAR(MAX) provides 1GB of character storage. Now, let’s see how you can use the new VARBINARY(MAX) data type with T-SQL and ADO.NET to import BLOB data into a SQL Server 2005 database, then use ADO.NET to retrieve it.

Using T-SQL to Import BLOBs The first step in working with BLOBs and the VARBINARY(MAX) data type is create a table containing a VARBINARY(MAX) column and use the column as target to import BLOB data. Use the following code to create the sample table MyBLOBTable:

CREATE TABLE MyBLOBTable
  (blob_id int IDENTITY(1,1),
  blob_description varchar(256),
  blob_data varbinary(max))

Because of T-SQL’s limited ability to address the IMAGE, TEXT, and NTEXT data types, these older BLOB types almost forced you to use ADO or ADO.NET to load BLOB data. But because you can use the new (MAX) data types as local variables, you can either use T-SQL to load them directly or use ADO or ADO.NET to load them.

Listing 1’s code uses T-SQL and the OPENROWSET function to load data into the VARBINARY(MAX) blob_data column. The INSERT statement inserts the results of the SELECT statement, which uses the OPENROWSET function to read the contents of the bitmap file win2003.bmp from the file system. Notice that OPENROWSET uses the new BULK rowset provider, which reads data from the file system, to return a file’s contents as a scalar value. The SINGLE_BLOB keyword tells the BULK provider that it will be reading binary data.

   Prev. page   [1] 2     next page
 
 

ADS BY GOOGLE