A few months ago, one of my customers began
experiencing performance problems with their
new reporting environment. After rolling out
an ad hoc reporting tool, some of the business users
began building reports that were very complex and, in
some cases, resulted in tens of thousands of rows being
returned. My customer wanted to prevent these types of
reports from being executed. With this requirement in
mind, I began investigating whether SQL Server execution
plans could provide a solution.
Let’s take a look at how execution plans can be used
to estimate both the complexity of queries and the size of
query result sets. I’ll first show you how to use execution
plans within SQL Server Management Studio (SSMS),
then show you how execution plans can be used from a
client application.
Execution Plan
Overview
When you run a query
against the relational database
engine, SQL Server
carries out two major steps.
The first step is to compile
the query into a plan.
The second step involves
executing the plan. SQL
Server 2005 can return a
query execution plan in
three formats: text, XML,
and graphical. The easiest
way to look at an execution
plan is to view its graphical
representation. With SQL
Server 2005, you use SSMS
to do so. For example, after opening a new query window
in SSMS, select Query, Include Actual Execution Plan
and run the following query against the AdventureWorks
sample database:
SELECT ST.Name AS TerritoryName,
SP.SalesPersonID,
SP.CommissionPct
FROM Sales.SalesTerritory
ST INNER JOIN
Sales.SalesPerson SP ON
SP.TerritoryID = ST.TerritoryID
ORDER BY ST.Name;
You can download this query along with the rest of the
sample code at InstantDoc ID 97267. This query is in the
Showplan_Sample1.sql file.
You can see the result set and execution plan for this
query in Figure 1 and Figure 2, respectively. Although an
in-depth discussion of understanding execution plans is
out of this article’s scope, you can find more information
about execution plans by reading “Optimizing a Suboptimal
Query Plan,” February 2007, InstantDoc ID 94775.
However, let me
point out a few
basics. First, an
execution plan is
returned for each
query submitted in a query batch. (In our example, we
submitted only one query.) An execution plan is read from
top to bottom, then right to left. For example, the first
step in this execution plan is to perform an Index Scan
of the AK_SalesTerritory_Name index to return the Territory
name and ID columns. The results of this step are
combined with the next step (a Cluster Index Scan against
the SalesPerson clustered index). Then, working right to
left, a nested loop is used to join the two tables together.
Finally, the results of this nested loop are fed into the final
SELECT operation.
Each step lists a cost value, which represents the cost
of a step relative to the entire query. In our example, the
AK_SalesTerritory_Name Index Scan represents 44 percent
of the total query cost. Similarly, if you run a batch
of queries, you’ll see the query cost relative to the entire query batch listed inside the top of each execution plan. (I’ll
discuss more about what this cost value actually measures
later.) Connecting each step is an arrow, whose thickness
denotes the relative number of rows coming from each step.
As you can imagine, information about each step in a plan
can be very useful when trying to troubleshoot and optimize
a query.
Note that you can view additional details for any step
in an execution plan by selecting or hovering over the step
with your cursor. In Figure 2, you can see the step details
for the AK_SalesTerritory_Name Index Scan. Several
columns of information are displayed here. Note that
if you want to see an explanation of each column, click
a step, then make sure the Properties window is visible
within SSMS. To make this window visible, select View,
Properties Window in SSMS. Note that many estimated
values (e.g., Estimated CPU Cost) are displayed, as well
as several actual values. Estimated values show what the
query engine expects to happen, which leads us into a
discussion of estimated execution plans.
Estimated Execution Plans
To view an estimated execution plan, select Query, then
choose Display Estimated Execution Plan in SSMS. Unlike
an actual plan, the query isn’t run; instead, only an estimated
plan is returned. For large, complex queries, you’ll
notice the estimated plan is returned very quickly (relative
to running the actual query and viewing the actual plan).
Also, when you look at the step details of an estimated
plan, you’ll see only estimated values (i.e., none of the actual
values associated with actual plans are returned).
As you can imagine, estimated values might not always
match the actual values because several conditions, such as
the cost of applying these filters and the number of rows
returned, won’t be known until after the query is executed.
The query engine calculates these estimated values based
on statistics (SQL Server maintains statistical information
about indexes and column data stored in the database), so
the accuracy of the estimates can be affected if the statistics
are out of date. For more information about statistics,
see “Making the Most of Automatic Statistics Updating,”
October 2007, InstantDoc ID 96767.
To help meet my customer’s requirement of identifying
complex queries and/or queries that return large numbers
of rows, we’re going to focus on two estimated values: Estimated
Number of Rows and Estimated Subtree Cost. The
Estimated Number of Rows value is self-explanatory. Estimated
Subtree Cost is defined as the estimated cumulative
cost of an operation and all child operations. In Figure 3,
I’ve highlighted the SELECT operation, which is the final
step of our query. The estimated number of rows for this
query is 14. The estimated subtree cost is 0.0075236. But
how is this cost measured?
According to SQL Server 2005 Books Online (BOL),
cost is measured as the “estimated elapsed time, in seconds,
required to complete a query on a specific hardware configuration.”
This estimate is based on a test machine that
the SQL Server developers used in their lab. Therefore, this
cost value is a loose estimate. It’s important to note that the
actual execution plan cost can vary greatly depending on
your environment at the time the query is running.
I encourage you to run several different queries on
your machine and examine the estimated values versus the
actual values, as well as the actual time it takes to run the
query. You’ll then have a better idea of what type of cost
threshold is realistic for your environment.
Using Estimated Execution Plans
in an Application
You should now have an understanding of how execution
plans work and how to view the graphical form of an
execution plan within SSMS. Now I’ll walk you through
how to use an execution plan within an application. I
mentioned that execution plans can be returned in text, XML, and graphical format. There are two types of
text plans: one that’s generated using the SET SHOWPLAN_
TEXT command and one that’s generated using
the SET SHOWPLAN_ALL command. SET SHOWPLAN_
TEXT returns a limited amount of information,
which doesn’t include the estimated values we need. Using
SET SHOWPLAN_ALL returns estimated values in a
tabular format. It’s a possible candidate for consumption
in an application. However, the XML version of a
plan provides the richest level of detail (graphical plans
are actually built from the XML version of a plan) and
produces a single XML document for the query batch.
There are existing examples of extracting data from an
XML plan, which we can use as a starting point. In
particular, I have developed samples using some of the
code found in the Microsoft article “Processing XML
Showplans Using SQLCLR in SQL Server 2005” (msdn2.microsoft.com/en-us/library/ms345130.aspx).
Continued on page 2.