SQL Server 2008's FILESTREAM storage offers exciting new performance improvements for storing BLOB data as full-blown files managed by the OS. But as great as the FILESTREAM storage feature is, it’s not exactly trivial to use and requires some legwork to set up correctly. Let’s look at what you need to do to set up, configure, and code to take advantage of FILESTREAM storage.
Why Use FILESTREAM Storage?
Conventional wisdom has long maintained that storing and retrieving files and images within a relational database incurs significant performance penalties. That logic seems sensible, but I had always wondered whether it was based on empirical evidence or mere superstition.
In 2006, Microsoft Research published an excellent white paper that tackled this subject in detail. Its authors conclude that databases could learn a thing or two from file systems, which are much better at managing fragmentation than are relational database engines.
Interestingly enough, they also determined that file systems could learn from database engines by virtue of how they handle smaller bursts of data. (See “To BLOB or not to BLOB: Large Object Storage in a Database or File System.” ) The rough translation: File systems generally work best with blocks of data greater than 1MB in size, while databases work best with blocks of data less than 256K in size. To address this situation, SQL Server 2008 offers highly performant storage and retrieval of smaller bursts of data and adds FILESTREAM storage, letting SQL Server work closely with the file system to achieve the best performance from both worlds.
FILESTREAM Storage Hurdles
Setting up FILESTREAM storage correctly, however, can cause a few headaches. For example, security considerations in distributed environments require a bit more planning to make sure that client applications are able to take advantage of streamed files that are proxied through SQL Server. Similarly, because FILESTREAM storage represents a joint interaction between SQL Server and the host OS, setting up FILESTREAM storage on your servers requires admin permissions at the OS level.
Furthermore, if you want to take advantage of Win32 streaming capabilities, you need to use Windows integrated security (as SQL Server logins obviously interact with the underlying file system). These minor limitations, however, help to safeguard your infrastructure and data. A bigger hurdle is that neither SQL Server Management Studio (SSMS) nor SQL Server itself really has any UI, or native way, to let you stream the contents of a file into a table that’s been marked with the FILESTREAM attribute on one of your varbinary(max) columns.
In other words, if you've got a .jpg file that you want to store within SQL Server, there's no native functionality to convert that image's byte stream into something that you could put, for example, into an INSERT statement. Accordingly, you have to use middle-tier and client applications to leverage this new storage functionality. The problem with accessing BLOB data from within client and middle-tier applications is that the transactional choreography and overhead needed to safely and securely stream file contents in and out of the native file system via SQL Server adds a bit more complexity than you'd normally see when working with queries that involve simple data readers.
FILESTREAM Storage Benefits
Those hiccups aside, FILESTREAM storage is worth using in many scenarios. Besides its performance benefits, it also lets you exceed the 2GB limit previously associated with storing BLOBs, so you can store BLOBs as large as the OS, or file system, permits.
FILESTREAM storage also provides transactional safety when storing files and offers increased file manageability—including the obvious benefit of file backup and restore functionality. Other great benefits include the ability to leverage replication, log shipping, and the power of full-text indexing directly against files stored and managed by SQL Server. (However, operations against FILESTREAM file groups aren’t supported by database snapshots or database mirroring; see SQL Server 2008 Books Online "Using FILESTREAM with other SQL Server Features.")
Configuring Your Server for FILESTREAM Storage
Given that the file system directly manages FILESTREAM data, with SQL Server handling file names, paths, and even security (by proxy), the first thing you need to do when using FILESTREAM storage is to grant SQL Server permission to work directly with the host OS’s file system. Because this is an OS-level change, enabling FILESTREAM storage requires administrative rights on the host server.
As you can see from Figure 1, to enable FILESTREAM functionality you open SQL Server Configuration Manager, then access the Properties page for the individual SQL Server instance that you wish to enable. As part of this configuration process, you specify what kinds of access and streaming rights are available to SQL Server and remote clients.
When it comes to enabling FILESTREAM functionality, you have three choices: 0, which disables FILESTREAM support for this instance; 1, which enables FILESTREAM functionality for T-SQL access; and 2, which enables FILESTREAM support for T-SQL and Win32 access.
After you make the change from within SQL Server Configuration Manager, a new share is created on the host system with the name specified. This isn’t a typical share though, and is intended only to allow very low-level streaming interaction between SQL Server and authorized clients. As such, Microsoft recommends that only the service account used by the SQL Server instance have access to this share. Moreover, because this change takes place at the OS, or service, level you can’t enable it from within SQL Server, and you need to restart your SQL Server instance for the change to take effect.