Many DBAs and T-SQL developers use
nonclustered indexes to avoid table scans,
thereby improving query performance.
Frequently, this type of performance tuning
works well. However, with large tables,
costly bookmark lookups often occur as
a result. I devised a technique to improve
the performance of queries that run slow
because of bookmark lookups. This technique also demonstrates that DBAs and
T-SQL developers often know more than
the query optimizer. Unlike the query
optimizer, they can make assumptions based
on their knowledge of the data and devise
creative solutions.
Before I tell you about the technique,
let me tell you a bit about the problem.
My client's data warehouse environment
contained several very large and very wide
tables. The clustered indexes for the tables
were based on an identity column. The
client chose to use the identity column to
improve batch performance and reduce
fragmentation during INSERT operations.
In addition, because the tables contained
multiple types of date-time columns (e.g.,
Julian, local DateTime, Coordinated Universal Time—UTC) to meet specific needs,
there wasn't an obviously suitable candidate
to use to replace the identity columns as the
range-based clustered key.
As you would expect, the client created
nonclustered keys for columns, which were
typically found by using WHERE clauses
that included a join condition. Almost every
query used the following type of filter to
return data from the large tables:
WHERE location = @location
AND some_date_time
BETWEEN @start and @end
Nonclustered indexes were created for the
(some_date_time, location) columns. This
was optimal because the selectivity of the
location column was low with respect to
the size of the table.
In this case, the client's use of indexes and T-SQL code to leverage the indexes, avoid
table scans, and improve query performance
was by the book. However, the client still
was experiencing performance problems
with some of the crucial stored procedures.
Specifically, several of the data-load stored
procedures took 90 minutes to run, which,
in turn, was affecting the timeliness of report
generation.
After looking at the execution plan for
these queries, I was able to determine the
problem. As Table 1 shows, table scans weren't
occurring and index seeks were occurring
on the nonclustered indexes as planned.
However, the nonclustered index seeks were followed by bookmark lookups, which were
drastically impacting the I/O and subtree
cost, thus slowing down queries.
I looked at several books and online
articles and discovered three ways to avoid
bookmark lookups:
- Create fully covering nonclustered
indexes.
- Create many separate, small (i.e., nonwide) nonclustered indexes so that the
query optimizer can use index joins.
- Change the clustered index key.
In this case, creating covering nonclustered
indexes and creating many small indexes
weren't viable solutions because numerous
queries were returning many different columns. For example, one query might return
12 columns and another query might return
15 columns, nine of which differ from the
columns returned by the first query.
Because the client's stored procedures
used many different date-time columns
to meet specific needs and because of the
concern with fragmentation, changing the
clustered key wasn't a palatable option to
the client. Had the stored procedures used
the same date-time column for every query,
this would have been a possible solution.
So, I decided to creatively leverage the
clustered index. In the client's environment, the data was inserted into large, source
tables per date (some_date_time) and location (location). Because of this setup, the
data being extracted from these large tables
physically exists on the disk in relatively contiguous chunks per date range. However, the
query optimizer didn't know this, so it used
bookmark lookups to retrieve the actual data
for the queries. To circumvent the bookmark
lookups, I took the following steps:
- I rewrote part of the stored procedures'
code so that they retrieved the clustered
identity key's MAX and MIN values before
they ran queries against large tables. As
Listing 1 shows, these values are stored in
the @max_value and @min_value variables.
The identity column is clustered, so this type
of query to retrieve the clustered key's MAX
and MIN values per the (some_date_time,
location) columns is fully covered by the
nonclustered indexes. As you might know,
this is possible because the clustered key
values are implicitly stored within the nonclustered indexes.
- I added the MAX and MIN values in
the @max_value and @min_value variables
to the WHERE clauses. Listing 2 shows an
example of a rewritten WHERE clause.
These changes let the query optimizer
perform a clustered index seek directly
against the clustered index key that's based
on the identity column. As Table 2 shows,
the costly bookmark lookups have been
eliminated. Another positive consequence of
the changes is the addition of parallelism.
This simple solution addressed the major
cause of the performance problems. The time
it takes the data-load stored procedures to
run has decreased from 90 minutes to only 3
minutes (or less)—a 97 percent improvement
in performance. This simple solution also
proves that DBAs and T-SQL developers can
often come up with clever solutions because
they know their data well.