These read-consistency problems have to do with changes
in the storage engine choices when instructed by the relational
engine (specifically by the optimizer) to perform index scans
(both clustered and nonclustered) with Ordered: False.
The optimizer will use an index scan whenever all the
index leaf data is needed. When the data isn’t required to
be scanned in order, the Index Scan operator would show
Ordered: False in the tooltip box. When the data is required
to be scanned in order (e.g., to satisfy an ORDER BY
request, GROUP BY, DISTINCT, merge join), the Index
Scan operator would show Ordered: True. As an example,
examine the execution plans that Web Figure 1 shows for the following three
queries:
USE AdventureWorks;
-- Query 1
SELECT * FROM Sales.SalesOrderHeader;
-- Query 2
SELECT SalesOrderNumber FROM Sales.
SalesOrderHeader;
-- Query 3
SELECT * FROM Sales.SalesOrderHeader
ORDER BY SalesOrderID;
Observe that the first two index scans (i.e., first clustered,
second nonclustered) show Ordered: False, whereas the
third shows Ordered: True. With the first two scans, as far
as the optimizer is concerned, the data can be returned in
any order, but with the third scan, it needs to be returned
in index key order.
Ordered: False doesn’t necessarily mean that the
storage engine in charge of the actual row operations
(fetching the data) will utilize the more efficient allocation-
order scans based on IAM pages. The storage engine
needs to also take into account your consistency expectations
based on the isolation level the query runs under.
An allocation-order scan running while inserts are issued
against the table/index might result in getting the same row
multiple times or skipping rows.
As an example of getting the same row multiple times,
suppose you have the following index keys in pages of
an index shown in file order: Page#101(keys: 50, 60, 70,
80), Page#107(keys: 10, 20, 30, 40), Page#121(keys: 90,
100, 110, 120). The logical index order of the pages is
Page#107(keys: 10, 20, 30, 40)-Page#101(keys: 50, 60,
70, 80)-Page#121(keys: 90, 100, 110, 120). Suppose an
allocation-order scan starts and reads pages 101 and 107.
So far, it has read the rows with the keys 50, 60, 70, 80,
10, 20, 30, 40. At this point (before the scan is finished),
another process inserts a row with the key value 65. Suppose
the target page 101 is full, so it undergoes a split.
It’s important to note that in a Read Committed isolation,
shared locks aren’t kept until the statement is done but
rather until SQL Server finishes reading the locked resource
(row or page). So, even if SQL Server decides to lock pages
when reading, once the page is read, the lock is released.
The split process allocates a new page (can be either
before or after the page that got split in the file)—say, page
135—and moves half the rows from the source page to the
new one, and inserts the new row to one of the two pages
based on its key value. The layout of the data in the file is
now Page#101(keys: 50, 60, 65), Page#107(keys: 10, 20, 30,
40), Page#121(keys: 90, 100, 110, 120), Page#135(keys: 70,
80). The linked list will reflect the right logical index key
order, so the layout in the index is Page#107(keys: 10, 20,
30, 40)-Page#101(keys: 50, 60, 65)- Page#135(keys: 70,
80)-Page#121(keys: 90, 100, 110, 120). The scan continues
reading the rest of the pages—121 and 135, reading the
rows with the keys 90, 100, 110, 120, 70, and 80. You realize
that the rows with the keys 70 and 80 were read twice.
Similarly, if a page that wasn’t yet reached by the
scan splits, and the new page is allocated before the scan
position, the scan will end up skipping or not reading the
rows that moved to the new page. For these potential consistency
problems, in Read Committed isolation or higher,
the storage engine would opt for an index-order scan when
the optimizer requests an index scan with Ordered: False.
There are exceptions, which I’ll discuss later.
The reason an index-order scan won’t allow these
problems is that the pages are read in logical index key
order following the linked list. If a page splits after the
scan visited it, both pages (the one that got split and the
new one) won’t be visited again because they’re behind in
logical order. If a page splits before the scan reached it, the
scan will read both pages in order.
An index-order scan won’t be as efficient as an allocation-
order scan because fragmentation will negatively impact
an index-order scan. However, these particular consistency
problems won’t happen. Other consistency problems
might happen, but I’ll discuss those in a later article.
What might surprise you is that when an index scan
with Ordered: False is running under Read Uncommitted,
the storage engine assumes that you have no consistency
expectations whatsoever but rather that you’re aiming at
the best possible performance. Therefore, the engine will
use an allocation-order scan and these consistency problems
might happen if rows are inserted while the scan is
running. There’s an exception in SQL Server 2005: If the
index is small (64 pages or fewer), the storage engine will
use an index-order scan because it’s more efficient than an
allocation-order scan with so little data. Unfortunately,
when you see an index scan in the plan with Ordered:
False, there’s no way to tell in the plan whether the storage
engine will, in practice, use an allocation-order scan or an
index-order scan.
It should be noted that table scans against a heap don’t
incur the consistency problems that this article describes
for the simple fact that heaps don’t incur page splits, as I
explained earlier. However, keep in mind that clustered
tables have some advantages over heaps—for example,
they let you control the way inserts are distributed over
disk drives. Also, even when the table is organized as a
heap, if there are nonclustered indexes, those can incur
splits, and scans of such indexes could incur the aforementioned
read-consistency problems.
Ready for Examples?
Next month, I’ll provide tangible examples with code
you can run to see the problems with your own eyes, and
I’ll suggest alternatives to avoid these problems. Several
people are due thanks for their part in shedding light on
the read-consistency problems I’ve covered in this article:
Lubor Kollar, Srikumar Rangarajan, Tony Rogerson,
Davide Mauri, Andrea Benedetti, Gianluca Hotz, Maciej
Pilecki, and Paul Randal.
End of Article
Prev. page
1
[2]
next page -->