Executive Summary:
|
There are many ways to concatenate row values in Microsoft SQL Server databases. With the hope of finding the most efficient row-concatenation solution, a SQL Server Magazine reader tested two common approaches: using a self-reference variable within a SELECT clause and using FOR XML. He found that although these solutions produce the same result, the time it takes to get those results varies significantly.
|
Let's say you have a table with a column named
color_name and you want to store all the row
values as a list of comma-delimited values. For example,
if the column contains the row values of Yellow,
Blue, and Red, the final result would look like:
@cv = {Yellow,Blue,Red}
There are many ways to concatenate row values to
get results like this. Although the various solutions
produce the same result, the time it takes to get
those results can vary significantly. With the hope
of finding the most efficient row-concatenation
solution, I tested two common approaches: using a
self-reference variable within a SELECT clause and
using FOR XML.
I began by running the SampleData.sql script in
Listing 1 to create a table named T1 and populate
it with 50,000 rows. (If you run this code so that
you can test the solutions on your system, note that
SampleData.sql takes a few minutes to run.)
Next, I turned on the Discard results after execution
option in SQL Server Management Studio
(SSMS) so that the time it took to generate the
output wouldn't be taken into consideration. After
enabling that option, I cleared the data cache and
ran Solution1.sql, which Listing 2 shows, to concatenate
rows by using a self-reference variable within
a SELECT clause. It took 10 seconds for this code
to run on my system.
It's worth noting that some people prefer to use
the following two system-defined functions to concatenate
rows:
DECLARE @cv varchar(MAX)
SELECT @cv =
COALESCE(@cv + ',', ') +
ISNULL(col1,'<NULL>') FROM T1
SELECT @cv
This code took even longer to run on my system (16
seconds) because of the functions' overhead.
I then tested Solution2.sql, which Listing 3 shows, to concatenate rows by using FOR XML.
It took less than 1 second for this code to run on
my system, which is 10 times faster than the first
solution.
The first solution is not only slower but also
leads to undefined results. This is due to the
kind of assignment used within the SELECT
clause. As Microsoft's Eugene Zabokritski notes,
"The results of an assignment in the SELECT
list... depend on what plan the optimizer chooses
to execute the query, so technically speaking,
the results are undefined." (For more information
about undefined results, go to http://connect.microsoft.com/SQLServer/feedback/ViewFeed
back.aspx?FeedbackID=126129.)
When you have a small number of row values, it
probably doesn't matter which row-concatenation solution
you choose. However, when you're working with
tables that contain a large amount of data, consider
using FOR XML to concatenate row values.