SideBar    5 Steps for Developing and Deploying CLR Code in SQL Server
DOWNLOAD THE CODE:
Download the Code 49429.zip

The integration of the .NET Common Language Runtime (CLR) in SQL Server 2005 lets you develop .NET objects such as user-defined types (UDTs), user-defined aggregates (UDAs), user-defined functions (UDFs), stored procedures, and triggers in any .NET language, including C# and Visual Basic. Programmers who are experienced with .NET development probably find CLR integration appealing. But programmers and DBAs who are mainly experienced with T-SQL development sometimes find the idea appalling—like giving matches and knives to kids. However, in the hands of mature, responsible people, knives and matches can be useful tools. Likewise, the CLR is a powerful tool when you use it appropriately. To help you learn when it's best to use the CLR and when to choose T-SQL, I've created a couple of simple examples that demonstrate the differences you'll see in T-SQL and CLR performance and functionality in different scenarios.

For these examples, I'm assuming you have basic experience in developing and deploying CLR objects in SQL Server. For your convenience, I provide .NET code in both C# and VB, and you can download all the code examples at http://www.sqlmag.com, InstantDoc ID 49429.

Setting Up the Sample Database, Objects, and Routines
First, to use CLR objects in SQL Server, you need to turn on the CLR enable server configuration option, which is disabled by default, by using the following statement:

EXEC sp_configure 'CLR enable', 1; RECONFIGURE; 

Next, you need to set up the sample database, TSQLCLR, which contains the Customers and OrderDetails tables that I refer to in this article's examples. Run the code in Listing 1, which should take a few minutes to finish.

The Customers table in the TSQLCLR database is populated with 100,000 rows from the Customers table in the Northwind database. Similarly, the OrderDetails table in TSQLCLR is populated with 1 million rows from the Order Details table in Northwind. Typically, it's a bad idea to just duplicate rows from a source table to generate sample data for performance tests; doing so can lead to skewed performance results because the plans that the optimizer generates can vary based on data density. However, for our purposes, the data-duplication approach works fine.

Finally, you need to create the T-SQL and CLR-based routines that I discuss in this article. To create the fn_RemoveChars T-SQL UDF, run the code that Listing 2 shows. Then, deploy the C# code that Listing 3 shows or the VB version in Listing 4. If you're not familiar with deploying CLR objects in SQL Server, you'll find step-by-step instructions in the sidebar "5 Steps for Developing and Deploying CLR Code in SQL Server," page 18.

When CLR and When T-SQL?
To figure out which scenarios require TSQL code and which will benefit from CLR code, consider which activities each is best at. T-SQL is more powerful and efficient than .NET code when the bulk of the activity involves data access and manipulation. But T-SQL is a declarative language, not a procedural one, and isn't designed for computation-intensive activities such as procedural or iterative logic, string manipulation, or data formatting in general. For these activities, .NET is more efficient. Let's take a look at a couple of examples.

String manipulation. The first example involves string manipulation, which is a weak point forT-SQL and a strong point for .NET. T-SQL supports basic string-manipulation capabilities through built-in functions and predicates.When the built-in T-SQL functions meet your needs, you should use them because you'll find it hard to beat the performance of built-in functions with your own code, regardless of the development environment you use. Moreover, by using built-in T-SQL predicates (e.g., LIKE), you let the optimizer work its magic by optimizing the execution of your code. But if you need the ability to perform more complex string-manipulation tasks than T-SQL supports, it's a different story. Attempting to use T-SQL code will typically require you to implement awkward, iterative logic. T-SQL simply doesn't have the tools for such purposes.

However, .NET provides a superior environment for string manipulation. For example, it supports regular expressions, which are powerful tools for pattern matching and replacement. Even if you don't rely on .NET for string manipulation and decide instead to implement a similar iterative or procedural algorithm that uses both T-SQL and .NET, you'll see that the .NET code is simply much faster in such activities. Note that ANSI SQL supports regular expressions,but that support isn't implemented yet in T-SQL.

Suppose you need to be able to perform a complex string replacement based on patterns. T-SQL's REPLACE function accepts three character-string arguments (let's call them str1, str2, and str3).The REPLACE function replaces all occurrences of str2 in str1 with str3. However, both str2 and str3 are constant strings, not patterns. T-SQL has no built-in function that lets you replace all occurrences of a pattern with another pattern.For example, consider the Phone column in the Customers table that you created earlier. Suppose you want to remove all special characters from the Phone column—that is, all characters that aren't a digit (0-9) or an alpha character (a-z or A-Z).You might need to do this type of replacement to identify "logical" duplicates of phone numbers, which users might enter with slight variations. For example,the phone numbers (503) 555-7555, 503-555-7555, and 503.555.7555 should all be treated as identical. Similarly, the phone numbers 555-FILM and 555 FILM should also be treated as identical, whereas 555-FILM and 555-FILK should be treated as different.When you strip the special characters from the phone numbers, you'll get strings such as 5035557555 and 555FILM, which you can compare to determine logical uniqueness.

Let's start with the T-SQL implementation of the fn_RemoveChars UDF, which Listing 2 shows. The UDF accepts the inputs @string and @pattern and removes all occurrences of characters in @string that match the pattern you provide for @ pattern.The UDF mainly relies on the PATINDEX built-in function, which returns the first position of a pattern within a string. To iterate through the occurrences of the pattern, the UDF invokes PATINDEX in a loop and removes the character where it finds the pattern. It's difficult to avoid iterative logic to implement this task. Furthermore, the UDF's functionality is limited—it lets you remove characters only where it finds the pattern. You'll also find it hard to replace patterns with a different pattern rather than with a constant string.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

It's a great article, I think that old-t-sql developers recognize the advantages of new T-SQLCLR.

DosFlores

Article Rating 5 out of 5

It's a great article, I think that old-t-sql developers recognize the advantages of new T-SQLCLR.

DosFlores

Article Rating 5 out of 5

Running your code I found that if SET STATISTICS TIME is ON then the TSQL fn took 30 sec to complete and while the session settings had SET STATISTICS TIME OFF the TSQL fn took 4 seconds to complete. The .NET fn took 2 seconds on my machine regrdless of the above mentioned session settings.

miki@easy-forex.com

Article Rating 5 out of 5

Excellent intro

ianstirk

Article Rating 4 out of 5

 
 

ADS BY GOOGLE