Try injecting code through the revised procedure by
running the following command:
EXEC dbo.usp_getorders N'0); RAISERROR( 'This
could have been much worse than a RAISEERROR
statement.', 16, 1); --';
This time you get the following error:
Msg 50000, Level 16, State 1, Procedure usp_
getorders, Line 11
Input may contain SQL injection. Procedure
aborted.
However, if the input isn’t restricted to digits and
commas (e.g., when the elements in the array are supposed
to be character strings), you’ll find it difficult
to guarantee a validation that leaves absolutely no
exposure to SQL injection. (For more information, see
SQL Server Books Online—BOL—“SQL Injection” at msdn.microsoft.com/en-us/library/ms161953.aspx.)
Performance. When a query has a plan in cache, in
order for subsequent query invocations to be able to
reuse the cached plan, the query strings (that of the
query for which the plan was cached and those of the
subsequent query invocations) need to be the same.
The exception is when SQL Server auto parameterizes
the query, but this exception doesn’t apply to our procedure.
For each unique input to our procedure, a different
query string is constructed, and therefore a new
plan is produced and cached. Instead of optimizing the
query only once and reusing the cached plan in all subsequent
invocations of the procedure, SQL Server will
reoptimize almost every time the procedure is invoked,
thus generating a large number of execution plans. To
demonstrate plan caching and reuse behavior, I’ll use
the version of the procedure that you create by running
the code in Listing 3.
As you can see in the procedure’s code, I incorporated
a GUID as a block comment within the query.
Based on this GUID, it will be easy to locate the plans
specifically associated with the query of interest in
cache. After creating the procedure with the GUID
as a comment, run the following code to execute the
procedure with two different inputs:
EXEC dbo.usp_getorders N'10248,10250,10249';
EXEC dbo.usp_getorders
N'10260,10270,10265,10290';
You can locate the plans associated with our procedure
and dynamic batch by querying sys.syscache
objects (or master.dbo.syscacheobjects, in versions
before SQL Server 2005). This object has a row for
each plan in cache. You can look for plans in which
the SQL text (sql attribute) contains the specific GUID
you planted as a comment. Assuming you executed the
stored procedure twice with different arguments as I suggested, run the following code to inspect the plans
associated with our query in cache:
SELECT objtype, usecounts, sql
FROM sys.syscacheobjects -- in 2000 query master.
dbo.syscacheobjects
WHERE SQL LIKE '%DDAF1499-A7FF-4A9C-AE83-
FCE5A33F5151%'
AND sql NOT LIKE '%sys%';
For each plan in cache, the query returns the object
type (Proc for procedure plan, Prepared or Adhoc
for a prepared statement such as one prepared by
sp_executesql), a count of how many times the plan
was used, and the SQL text for which the plan was
created. This query returns the output in Table 2. Note
that in SQL Server 2008 the objtype attribute will show
Adhoc instead of Prepared.
Dynamic SQL operates in its own batch and isn’t
considered part of the batch of the calling procedure;
therefore, you get separate plans for the procedure (Proc
object type) and the dynamic batches (Prepared or
Adhoc object type). The procedure plan is just a shell
in our case, simply encapsulating a call to the dynamic
batch. As you can see, SQL Server did reuse the procedure
plan for the second invocation of the procedure.
However, you can see that there’s a separate prepared
statement plan for each unique dynamic batch, meaning
that the actual query plan wasn’t reused. Think of the implications for performance when the procedure is
invoked very frequently with different sets of order
IDs; consider the cost of all those compilations, and
the memory required to hold all the plans.
Note that as of SQL Server 2005, recompiles are
done at the statement level rather than the batch/
procedure level; this method enables a RECOMPILE
query hint (in addition to the existing RECOMPILE
procedure option). When the RECOMPILE query
hint is specified, SQL Server doesn’t bother to keep the
plan in cache. Because with the dynamic SQL solution
chances for plan reuse are very low to begin with, at least
you can mitigate the memory pressure issue by adding
OPTION(RECOMPILE) at the end of the query. Run
the code in Listing 4 to create the revised stored procedure
(notice the change in GUID as well).
Then, execute the revised procedure with different
inputs:
EXEC dbo.usp_getorders N'10248,10250,10249';
EXEC dbo.usp_getorders N'10260,10270,10265,10290';
Query the cached plans:
SELECT objtype, usecounts, sql
FROM sys.syscacheobjects -- in 2000 query master
.dbo.syscacheobjects
WHERE SQL LIKE '%995F4DF8-6518-47EE-AB9E-
2E2876DA1700%'
AND sql NOT LIKE '%sys%';
You get the output in Table 3, indicating that although
the procedure’s plan was reused (the shell for the call
to the dynamic batch), the plans for the prepared statements
weren’t cached.
Control. To work around the problem of limited
control over the order of the rows in the output, you
can add an ORDER BY clause to the query to sort
the output by existing attributes from the Orders table.
However, there’s no straightforward way for you to
ask to sort the output by the order of appearance of
the order ID in the input array. For example, given the
input array 10248,10250,10249, you want the output
to be sorted with 10248 first, 10250 second, and 10249
third. Although this task isn’t impossible, solutions
based on the existing implementation of the procedure
are tricky and inefficient.
Coming Up
Although using dynamic SQL to handle an unknown
number of input elements in a stored procedure is a
common solution, I don’t recommend it. You should
consider a less problematic solution, such as using a
T-SQL split function, a CLR split function, or SQL
Server 2008’s table valued parameters. I’ll discuss these
alternatives in subsequent articles.