DOWNLOAD THE CODE:
Download the Code 42349.zip

One of my favorite SQL Server topics to teach about is the query optimizer, whose job is to determine which query plans SQL Server will use to execute the queries your applications submit for processing. A main optimizer task is to determine which indexes, if any, SQL Server will use to find the relevant data in each table involved in a query. In my July 2001 column, "Are You in Tune?" (InstantDoc ID 21038), I introduced the concept of a SARG, which is my term for a particular form of search argument condition in a query's WHERE clause. In most cases, a column must appear in a SARG before the query optimizer will consider using an index on that column.

A SARG limits a search because it specifies an exact match, a range of values, or a conjunction of two or more items joined by AND. A SARG contains a constant expression (or a variable that resolves to a constant) that SQL Server compares to a column by using an operator. SARGs take the form column inclusive_operator <value> or <value> inclusive_operator column. The column name appears on one side of the operator, and the value appears on the other side. The following operators are considered inclusive operators: =, >, <, =>, <=, BETWEEN, and sometimes LIKE (if the wildcard isn't at the beginning of the string). The most important condition is that the column must appear alone on one side of the operator. If the column doesn't appear by itself, the expression isn't a SARG, and SQL Server won't even consider using an index on the column. The sample code in this article illustrates when and how SARGs can be useful.

To run this article's examples, run the script that Listing 1 shows to create a copy of the Northwind database's Orders table. This copy, NewOrders, has an integer IDENTITY column added, and its OrderID column contains character strings. To keep the OrderID values unique even though the same data appears five times, the code concatenates a different digit onto the OrderID string each time it copies the Orders table.

In many cases, you can convert a non-SARG into a SARG so that SQL Server can use indexes to process a query. Consider this query against the newly built NewOrders table:

SELECT * FROM NewOrders
WHERE substring(OrderID, 1, 5) = '10248'

The plan for this query shows that SQL Server will perform a table scan. If you use SET STATISTICS IO ON before executing the query, you should see 106 logical reads—one for each page in the table. To change the query so that it contains a SARG but remains logically equivalent, you'd run the following:

SELECT * FROM NewOrders
WHERE OrderID LIKE '10248%'

The plan for the modified query shows an index seek. STATISTICS IO reveals that only seven logical page reads returned the same five rows of data. You get a better plan and better performance because the SARG directs the optimizer to evaluate the index on OrderID, and the optimizer determines that the index is useful. In cases like this example, you can easily rewrite your query so that it contains a SARG. However, in some queries, the lack of a SARG isn't obvious; these queries can be hard to detect and correct. I'll discuss some examples of such queries later in this article.

Change Is Good...Sometimes
Sometimes changes in the way SQL Server processes your queries internally can mean expressions that seemed to contain SARGs no longer contain any. In my columns "Inside Optimization" (October 2003, InstantDoc ID 39822) and "Inside Optimizer Enhancements" (November 2003, InstantDoc ID 39906), I discussed query optimizer changes in SQL Server 2000, most of which were improvements. However, reader Tim Kehoe told me about one optimizer change that hurt his application's performance. Before SQL Server 2000, in a comparison between a column and a value of a different data type, SQL Server always implicitly converted the value (a constant or expression) to the column's data type. However, the rules for implicit conversion changed in SQL Server 2000. Under the topic "Data Type Precedence," SQL Server Books Online (BOL) states:

"When two expressions of different data types are combined by an operator, the data type precedence rules specify which data type is converted to the other. The data type with the lower precedence is converted to the data type with the higher precedence. If the conversion is not a supported implicit conversion, an error is returned. When both operand expressions have the same data type, the result of the operation has that data type."

The BOL topic goes on to list all the data types in order of precedence. The list shows that integer types have a higher precedence than character types. (For more information about data types and data-type precedence, see Wayne Snyder's May 2001 article, "Problem Children," InstantDoc ID 20014.) Tim had queries like this:

SELECT * FROM NewOrders
WHERE OrderID = 102533

Although OrderID is a character column by definition, this query compares it to a constant, which SQL Server assumes to be an integer. Because SQL Server needs both values to be the same data type, it will convert the OrderID column to an integer type, as if you'd written this query:

SELECT * FROM NewOrders
WHERE CONVERT(int, OrderID) = 102533

In fact, the query plan for the first query includes the conversion, so this query no longer contains a SARG. The solution in this case is to not let SQL Server assume that the data type of the constant is integer but instead to force SQL Server to recognize the constant as a character string. In this case, you simply put quotes around the value. When you run the following two queries, you get significantly different plans and performance:

SELECT * FROM NewOrders
WHERE OrderID = 102533

SELECT * FROM NewOrders
WHERE OrderID = '102533'

The first query above uses an index scan and requires 15 logical reads, whereas the second uses an index seek to access the data directly and needs only 3 logical reads to find the data.

   Prev. page   [1] 2 3     next page
 
 

ADS BY GOOGLE