• subscribe
January 20, 2009 12:00 AM

More Options for Handling Arrays as Inputs

Use a CLR split UDF or table-valued parameters
SQL Server Pro
InstantDoc ID #100975
Downloads
100975.zip

The CLR implementation of the fn_split function is several times faster than the T-SQL implementation. Like the T-SQL implementation, the CLR implementation is secure because it doesn’t expose the system to SQL injection. In addition, the CLR implementation allows efficient reuse of execution plans. (For information about enhancements to CLR table-valued user-defined functions in SQL Server 2008, see the web sidebar “Ordered Common Language Runtime Table-Valued User-Defined Functions.”)

 

SQL Server 2008 Table-Valued Parameters

SQL Server 2008 introduces support for table types and table-valued parameters. A table type is simply a way to persist a table definition in the database for later use as the type for a table variable or a table-valued parameter. For example, the following code creates a table type called OrderIDs:

 

IF TYPE_ID('dbo.OrderIDs') IS NOT NULL DROP TYPE dbo.OrderIDs;
CREATE TYPE dbo.OrderIDs AS TABLE
(
  pos INT NOT NULL PRIMARY KEY,
  orderid INT NOT NULL UNIQUE
);

 

As an example for using the table type, the following code declares a table variable of the OrderIDs type and manipulates the variable with INSERT and SELECT statements:

 

DECLARE @T AS dbo.OrderIDs;
INSERT INTO @T(pos, orderid) VALUES(1, 10248),(2, 10250),(3, 10249);
SELECT * FROM @T;

 

This code returns the output in Table 2.

 

The benefit of table types isn’t only the ability to use them as types for table variables, but more importantly the ability to use them as types for input parameters of stored procedures and functions. Thus, you can pass multiple values as input to your routines instead of relying on cumbersome solutions such as passing strings with comma-separated lists of values. As an example, run the code in Listing 3 to create a new version of the usp_getorders stored procedure that accepts a table-valued parameter as input instead of a string with a comma-separated list of values.

 

Note the use of the READONLY clause, which is mandatory in SQL Server 2008. SQL Server 2008 allows reading from table-valued parameters but not writing to them.

 

When you need to execute a stored procedure, you must declare a variable of the table type, populate it with rows, and pass it as input to the stored procedure. As an example, run the code in Listing 4 to execute the stored procedure with two different inputs.  You can query the syscacheobjects view to observe the plan reuse behavior:

 

SELECT objtype, usecounts, sql
FROM sys.syscacheobjects -- in 2000 query master.dbo.syscacheobjects
WHERE SQL LIKE '%84B72B28-8D27-45FC-89BB-2A84226AE0E8%'
  AND sql NOT LIKE '%sys%';

 

Table 3 shows the output of this query. The usecounts attribute in the table indicates that the plan was used twice.

 

Internal handling of table-valued parameters is similar to handling of table variables. For example, SQL Server doesn’t maintain distribution statistics (histograms) on table-valued parameters, and their scope is limited to the local batch.

 

SQL Server 2008 client APIs are also enhanced to support table-valued parameters. Like the CLR split UDF approach, the implementation of stored procedures using table-valued parameters is secure, and it allows efficient reuse of previously cached execution plans.

 

Recommendation

The necessity of passing an unknown number of values as input to a routine is common in T-SQL. You can use one of four techniques that I demonstrated to accomplish this task: dynamic SQL, a T-SQL split function, a CLR split function, and table-valued parameters. The number of options available depends on the version of SQL Server you’re using. I don’t recommend the dynamic SQL approach because it has security and performance problems. The T-SQL split function is secure and efficient, but it’s complicated. The CLR version of the split function is faster and less complicated than the T-SQL version, but it requires a non-T-SQL implementation. If you’re running SQL Server 2008, I recommend the table-valued parameter solution because it’s secure and efficient, and it uses T-SQL exclusively. For more information about using arrays as inputs in SQL Server stored procedures, see SQL Server MVP Erland Sommarskog’s research and findings, at www.sommarskog.se/arrays-in-sql.html.



ARTICLE TOOLS

Comments
  • TIM
    3 years ago
    Mar 09, 2009

    Our top SP behind a retail website takes a delimited string that may contain dupes, which are filtered out. I modified the C# method from this article to utilize the efficiency of a HashTable to prevent duplicate element values.

    However! The performance of the CLR solution (mine or the article's) in production was very poor. The original SP uses a temp table inside a WHILE loop with SUBSTRING to parse each element. The existence of the value is checked before each insert to the temp table (though it may be cheaper to SELECT DISTINCT inside a subquery). This method keeps TEMPDB busy (but well under disk capacity), the CPUs averaging around 8%, the DB around 8K logical page reads/s and procedure cache hit rate near 97%. I tried several varieties of the CLR implementation (some keeping the temp table, others using the UDF inline), but ALL shot the CPUs to 20%+, logical page reads to 25K/s and the cache hit rate down to 82%.

    I'm sure the CLR is more efficient at parsing, but there seems to be a huge overhead when it's invoked several times/sec. Any similar experiences or suggestions?? This is on an 8-proc, 64-bit server.


    [SqlFunction(
    FillRowMethodName = "ArrSplitDFillRow",
    DataAccess = DataAccessKind.None,
    TableDefinition = "pos INT, element NVARCHAR(50)"
    )]
    public static IEnumerable fn_splitDistinct([SqlFacet(MaxSize = -1)]SqlString inpStr, SqlString charSeparator)
    {
    string locStr;
    string[] splitStr;
    char[] locSeparator = new char[1];
    locSeparator[0] = (char)charSeparator.Value[0];

    if (inpStr.IsNull)
    locStr = "";
    else
    locStr = inpStr.Value;

    splitStr = locStr.Split(locSeparator, StringSplitOptions.RemoveEmptyEntries);
    Hashtable SplitString = new Hashtable();

    int i = 1;
    foreach (string s in splitStr) {
    if (!SplitString.ContainsKey(s)) {
    SplitString.Add(s,i);
    ++i;
    }
    }
    return SplitString;
    }

    public static void ArrSplitDFillRow(Object obj, out int pos, out string item) {
    pos = (int)((DictionaryEntry)obj).Value;
    item = (string)((DictionaryEntry)obj).Key;

You must log on before posting a comment.

Are you a new visitor? Register Here