• subscribe
March 22, 2006 12:00 AM

CLR or Not CLR: Is That the Question?

Knowing when to choose the CLR—and when not to—unlocks its potential
SQL Server Pro
InstantDoc ID #49429
Downloads
49429.zip

Before you test the UDF, run the following code to enter the context of the TSQLCLR database, and to turn on the STATISTICS TIME session option:

USE TSQLCLR; 
SET STATISTICS TIME ON; 

Next, run the following query, which strips all special characters from the phone numbers in the 100,000-row Customers table:

SELECT Phone,
 dbo.fn_RemoveChars(Phone,
 N'%[^0-9a-zA-Z]%') AS
 CleanPhone 
FROM dbo.Customers; 

On my test machine, this query ran for 53 seconds; two of those seconds were for generating the output.To measure the runtime without generating the output, turn on the Discard results after execution option in SQL Server Management Studio (SSMS) under Tools, Options, Query Results, SQL Server, Results to Grid.

Next, examine the definition of the fn_RegexReplace CLR UDF in the C# code at callout A in Listing 3 (or the code at callout A in Listing 4 for the VB implementation). The UDF's simple definition accepts three inputs: input (input string), pattern (regular expression pattern to find), and replacement (replacement string pattern). The UDF first determines whether one of the inputs is NULL, in which case it returns a NULL. If none of the inputs is NULL, the UDF simply invokes the Regex.Replace function, which internally handles the string-pattern replacement based on regular expressions. For simplicity, I didn't include exception-handling code in any of the routines in this article, although you would in production code.

One reason this UDF is dramatically more powerful than the T-SQL one is that it supports regular expressions, which have versatile capabilities. Unfortunately, covering regular expressions is outside the scope of this article. But suffice it to say that you can do much more with this UDF than you can with the T-SQL one. Now, test the function by running the following query:

SELECT Phone,
 dbo.fn_RegexReplace(Phone,
 N'[^0-9a-zA-Z]', N') AS
 CleanPhone 
FROM dbo.Customers; 

This query ran in under 3 seconds on my test machine, with two seconds for generating the output. In other words, it ran about 50 times faster than the T-SQL UDF.

Stored procedures. Just as I was about to develop my own CLR stored procedure to demonstrate what you shouldn't do with CLR code, I stumbled on a perfect example in SQL Server 2005 Books Online (BOL). The stored procedure queries a table, uses a loop to aggregate all values from one of the table's columns, and returns the result value in an output parameter. I made some revisions to the original stored procedure, including directing the query to our 1-million-row OrderDetails table and aggregating two columns, Quantity and UnitPrice, instead of one to return both result values as output parameters.You can find the definition of the usp_OrderDetailsTotals CLR stored procedure in the C# code at callout B in Listing 3 (or the VB code at callout B in Listing 4).

The procedure's logic is simple: It creates a connection to the database in which the stored procedure resides; loads the result set of the query to a SqlDataReader object called reader; and iterates through the records in reader, adding in each iteration the current record's values to the output parameters—simple iterative aggregation logic.

The author of the original procedure probably intended to demonstrate the syntax for creating a CLR stored procedure and the use of output parameters. However, such logic is a classic example of what you should never do in .NET without a compelling reason—for example, if you need to perform complex computations (or some manipulation that makes more sense with CLR code) for every record in the data reader object. I have to apologize for publicly pointing out bad code in BOL, but it's vital to understand this is exactly the type of activity that requires T-SQL, both for performance and simplicity.

To test the CLR procedure, run the following code:

DECLARE @sumqty AS BIGINT, 
  @sumprice AS MONEY; 
EXEC dbo.usp_OrderDetailsTotals
  @sumqty = @sumqty OUTPUT, 
  @sumprice = @sumprice OUTPUT; 
SELECT @sumqty AS sumqty, 
  @sumprice AS sumprice; 

On my system, this code ran for just over 4 seconds. In T-SQL, you don't need a stored procedure for this task. Instead, use the following simple query:

SELECT SUM(Quantity) AS sumqty,
  SUM(UnitPrice) AS sumprice 
FROM dbo.OrderDetails; 

This query ran for about half a second on my system; that's about eight times faster than the CLR code. By using a set-based query, you let the optimizer do what it does best: choosing an efficient execution plan for your query. If you use the interative code in the CLR implementation, you force a specific execution plan and incur the overhead of a record-by-record manipulation.

The Real Question
By now, you've realized that the real question isn't "CLR or not CLR?" T-SQL is more powerful and efficient in data manipulation scenarios, whereas the CLR is richer and more efficient in scenarios that are computation intensive or involve procedural or iterative logic or string manipulation.When you take the time to learn how to identify whether T-SQL or the CLR is the best choice for a given situation, you'll be able to get the best performance out of your queries.

Author's Note: I'd like to express my deepest gratitude to my good friend Dejan Sarka, a fellow Solid Quality Learning mentor who taught me how elements such as XML and the CLR work with relational database models.



ARTICLE TOOLS

Comments
  • ian
    5 years ago
    Nov 07, 2007

    Excellent intro

  • Miki
    6 years ago
    May 24, 2006

    Running your code I found that if SET STATISTICS TIME is ON then the TSQL fn took 30 sec to complete and while the session settings had SET STATISTICS TIME OFF the TSQL fn took 4 seconds to complete.
    The .NET fn took 2 seconds on my machine regrdless of the above mentioned session settings.

  • Oscar
    6 years ago
    Apr 14, 2006

    It's a great article, I think that old-t-sql developers recognize the advantages of new T-SQLCLR.

  • Oscar
    6 years ago
    Apr 14, 2006

    It's a great article, I think that old-t-sql developers recognize the advantages of new T-SQLCLR.

You must log on before posting a comment.

Are you a new visitor? Register Here