One of my least favorite features of Microsoft SQL Server is that it lets
you apply a NoLock hint within a query. The NoLock hint tells SQL Server
that, for the current query, you'd like to read the data from a given table
while ignoring any locks. In systems in which there are multiple processes
trying to simultaneously access the same resource, NoLock is advantageous
because it allows user queries to operate without needing to wait for updates
to finish. The disadvantage is that in many cases users read data that hasn't
been committed to the database or is only partially in place, which can
produce unexpected results.
I usually find the NoLock hint in use by Web applications that are well
behaved but the business tier has a custom-reporting capability, import
capability, or both. For example, a common pattern is that in order to speed
custom reporting, the system occasionally reads from five or six related
tables, then inserts or updates all the data used to create reports into a
single table. Rather than update the reporting table as changes occur on the
source table, the developers thought they could speed the system by batching
these updates. Similarly, for data import, there's one giant statement that
takes a table or a file of data from another system and attempts to insert
all the rows in a single SQL statement.
The problem with batching updates and using giant statements is that SQL
Server holds locks until the entire transaction completes. As a result,
users' queries against the involved tables are suspended, making it seem like
the application has frozen. The easy solution is to add the NoLock hint,
which allows these queries to occur while these massive updates statements
run their course.
At this point, a quick review seems in order. When you're creating a
transaction, you want to follow the Atomicity, Consistency, Isolation, and
Durability (ACID) principle. Atomicity doesn't refer to an atom bomb but
rather that a transaction should be processed completely or not at all. Thus,
you should work with the smallest possible unit. For example, it's better to
process one order and its accompanying set of order details rather than
10,000 orders and all of their accompanying details.
Granted, NoLock usually lets organizations survive with few noticeable
errors. However, I had been looking forward to SQL Server 2005 because it was
supposed to provide an alternative to using NoLock. In SQL Server 2005,
Microsoft introduced Snapshot Isolation. The basic idea is that transactions
are held in suspense; thus, a long-running transaction (e.g., 10,000 orders)
wouldn't hold locks during the transaction's update cycle, allowing reads to
be done against clean data.
Unfortunately, Snapshot Isolation has two problems that keep it from being
a good replacement for NoLock. The first problem is that enabling the
Snapshot Isolation option in a database requires an additional 14 bytes of
space per row in every table. Given that most companies that have broken the
transaction rules have also created some very wide tables, this space
requirement can be a problem.
The second and bigger problem is how this feature is implemented.
Specifically, when data is written to a database, SQL Server 2005 writes all
the data to a temporary table in TempDB instead of updating the actual
database. When the transaction completes, SQL Server 2005 moves the data from
TempDB storage to the database. This means that the system will likely need
additional hardware to provide more I/O capacity.
Unfortunately, I suspect that the majority of organizations that could
really benefit from Snapshot Isolation won't be willing to pay the price to
implement it. Of course, the real solution is to not have long-running
transactions, but then everyone agrees with that in principle. For more
information about Snapshot Isolation in SQL Server 2005, I suggest that you
go to http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnsql90/html/sql2k5snapshotisol.asp.
End of Article