Like the jelly-like Blob monster in those old Steve McQueen horror movies, mixing binary large objects
(BLOBs) with relational database data has traditionally been unnatural and difficult to manage. BLOBs
and relational database data are very different entities. Relational data usually consists of text or numbers
and tends to be small. In contrast, BLOB data is most often pictures in .jpg, .tiff, or .bmp format—such
as product images on a Web site—which can be quite large.
As the sidebar "Storing BLOBs in the Database or the File System?" discusses, integrating BLOB
data and relational database data has its pros and cons. And until recently, SQL Server’s limited data types
for storing BLOBs made working with them more difficult than working with traditional
relational database data. But with the introduction of three new data types, SQL Server
2005 treats all data the same—whether your BLOB contains images, large amounts of
text data, audio files such as MP3s, or even program executables (.exes). In this article,
you’ll see how easy it is to use one of the new data types, VARBINARY(MAX), to
import BLOB data into and retrieve it from a SQL Server 2005 database. You can download a simple example
project that demonstrates this functionality.
But before jumping
into the code, let’s review the
key elements of SQL Server’s
BLOB storage
What about BLOB?
SQL Server stores standard INT, CHAR, and VARCHAR data directly within a row.
However, this approach limits the maximum capacity of each data type to 8000
bytes, slightly less than SQL Server’s maximum row size of 8060 bytes.
(Note that with SQL Server 2005’s row-overflow feature, the maximum row
size can exceed 8060 in certain cases; for information about this feature, see
Kalen Delaney's "Stretching the 8K Row Limit," InstantDoc ID 50490.)
Although more than adequate for most data types, the 8KB limit is a problem
for most BLOB data. To accommodate the needs of larger BLOB data, Microsoft
designed SQL Server to handle BLOB storage differently than it handles storage
for more common data types. Figure 1 shows an overview of how SQL Server 2005
and earlier releases store the IMAGE and VARBINARY(MAX) BLOB data types.
As you can see in Figure 1, SQL Server doesn’t store large BLOB data
on the same data page as the data for the rest of the row. Instead, it stores
BLOB data as a collection of 8KB pages organized in a B-tree structure. Each
row’s BLOB column contains a 16-byte pointer to the root B-tree structure,
which tracks the blocks of data that comprise the BLOB. If the data is less
than 64 bytes, SQL Server stores it as part of the root structure. Otherwise,
the root structure contains a series of pointers to the data blocks that comprise
the binary object.
For BLOBs smaller than SQL Server’s 8KB page size, you have a couple
of options for storing the BLOB data inline, as you would standard text and
numeric data. For the old TEXT, NTEXT, and IMAGE data types, which SQL Server
2005 continues to support, you can use the text-in-row feature to store the
data inline. And for the new VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX)
data types, you can use the backward-sounding large value types out of row option.
Storing smaller BLOB data inline improves performance, avoiding the extra I/O
needed to read the BLOB data record. (For more information about SQL Server’s
text-in-row feature, see Kalen Delaney’s "Text in Row Internals,"
InstantDoc ID 37635.)
Note that BLOBs are sometimes called large objects (LOBs) or character large
objects (CLOBs). The LOB designation can refer to both CLOBs and BLOBs. However,
technically, CLOBs can contain only text data such as XML, whereas BLOBs can
contain any type of data, including binary data. SQL Server 2005 can store LOB
data by using the older TEXT, NTEXT, and IMAGE data types or the new VARCHAR(MAX), NVARCHAR(MAX),
and VARBINARY(MAX) data types. Table 1 lists the attributes for SQL Server’s
LOB data types.
In pre–SQL Server 2005 releases, IMAGE is the primary data type for
BLOB storage and can contain 2GB of virtually any type of binary data. Although
similar to IMAGE, TEXT and NTEXT data types can store only text data, not binary
data. TEXT can accommodate up to 2GB of non-Unicode text data, and NTEXT can
accommodate up to 1GB of Unicode text data.
However, these earlier BLOB data types have some frustrating limitations. You
can’t use them as local variables in stored procedures or T-SQL batches,
as part of an index, or in WHERE, ORDER BY, COMPUTE, or GROUP BY clauses. (The
exception: You can use TEXT and NTEXT with the LIKE keyword.) However, the biggest
problem with IMAGE, TEXT, and NTEXT data types is that to access them, you have
to use a different programming model than you use to access other SQL Server
data types.
SQL Server 2005’s VARCHAR(MAX), NVARCHAR(MAX), and VARBINARY(MAX) data
types eliminate these limitations while still letting you store large amounts
of data. Unlike with earlier BLOB data types, you access the various (MAX) data
types the same way as other SQL Server data types. In addition, you can declare
the new data types as local variables. And you can use SQL Server string-handling
functions such as SUBSTRING() on VARCHAR(MAX) and NVARCHAR(MAX) columns. VARCHAR(MAX)
and VARBINARY(MAX) both provide up to 2GB of storage, whereas NVARCHAR(MAX)
provides 1GB of character storage. Now, let’s see how you can use the new
VARBINARY(MAX) data type with T-SQL and ADO.NET to import BLOB data into a SQL
Server 2005 database, then use ADO.NET to retrieve it.
Using T-SQL to Import BLOBs The first step in working with BLOBs and the VARBINARY(MAX)
data type is create a table containing a VARBINARY(MAX) column and use the column
as target to import BLOB data. Use the following code to create the sample table
MyBLOBTable:
CREATE TABLE MyBLOBTable
(blob_id int IDENTITY(1,1),
blob_description varchar(256),
blob_data varbinary(max))
Because of T-SQL’s limited ability to address the IMAGE, TEXT, and NTEXT
data types, these older BLOB types almost forced you to use ADO or ADO.NET to
load BLOB data. But because you can use the new (MAX) data types as local variables,
you can either use T-SQL to load them directly or use ADO or ADO.NET to load
them.
Listing 1’s code uses T-SQL and the
OPENROWSET function to load data into the VARBINARY(MAX) blob_data column. The
INSERT statement inserts the results of the SELECT statement, which uses the
OPENROWSET function to read the contents of the bitmap file win2003.bmp from
the file system. Notice that OPENROWSET uses the new BULK rowset provider, which
reads data from the file system, to return a file’s contents as a scalar
value. The SINGLE_BLOB keyword tells the BULK provider that it will be reading
binary data.
Prev. page  
[1]
2
next page