A tabular report lists information in rows, like the rows of a database table. The column headers in a tabular report correlate to table column names. A cross-tabulation report (crosstab report, for short) is a two-dimensional matrix or spreadsheet that has look-up criteria listed across the top in the column headers and down the left side in row headers. The data you're looking upwhich is usually summarized by an aggregate function such as SUM(), AVERAGE(), or COUNT()occupies the inner cells of the matrix.
For example, suppose you work for the world-famous Pubs bookstore chain, and the CEO wants to know how well each of the company's stores are selling the books of the company's authors. You could create a 3-column tabular report by using a view like the one that Listing 1, page 22, creates. This view, vwSales, lists total book sales grouped by store and by author, with each author-store-sales tuple listed in one row. Figure 1, page 22, shows the tabular report that Listing 1's view generates. However, your report would be easier to read and would convey more information if you presented it as a crosstab report, such as the one that Figure 2, page 22, shows. The crosstab report lists the store names horizontally in column headers instead of alongside the author names in the rows. So, to find sales for author Stearns MacFeather at Bookbeat, you'd look at the cross section of the Stearns MacFeather row and the Bookbeat column.
Crosstab reports fall into two categories: fixed-width and variable-width. For fixed-width reports, you know at design time the number and names of the columns in the report. Using a T-SQL query to produce a fixed-width crosstab report is straightforward because you can hard-code a CASE expression embedded in an aggregate function to evaluate each column of output. Listing 2 shows an example of using CASE expressions in a fixed-width crosstab query for our Pubs bookstore application. The first CASE expression in Listing 2 says: If the value in the Store column equals "Barnum's," then return the value in the Sales column; otherwise, return 0. The crosstab query uses SUM() functions in its column expressions, so there's no need to pre-aggregate the data in the view it uses for its data source. So, in its FROM clause, the crosstab query in Listing 2 uses the vwSales2 view that Listing 3 creates, instead of vwSales. If you're using COUNT() or AVG() as your aggregate function, leave "Else 0" off the CASE expression; otherwise, the answers will average or count transactions that have sales values of 0, which don't really exist. When you drop "Else 0" from the CASE expression, you get NULL values instead of zeros for author-store combinations for which no records exist.
But let's say that Pubs has a hundred bookstores, with new ones opening and others closing each month. For this situation, you need a variable-width crosstab report that dynamically reads the store names from the data and produces a column for each one, no matter how many exist that month. Let's explore two very different methods of producing variable-width crosstab reports. The first example uses dynamic SQL in a stored procedure to create a crosstab query string that contains CASE expressions. The EXEC command executes the query string to return the report. Instead of CASE expressions or dynamic SQL, the second example uses the new relational features in ADO.NETthe data-access component in Visual Studio .NETto cross-tabulate the data. Along the way, I point out advantages and disadvantages of each method.
Using Dynamic SQL
The stored procedure procXTabDSQL, which Listing 4 shows, uses the vwSales2 view and a local cursor that loops through a list of store names to generate a crosstab query string. The query string starts with the row header as the first column. The procedure then opens the cursor on the list of store names. The cursor fetches the individual store names into the variable @StoreName. Each iteration generates a string containing a SUM() function wrapped around a CASE expression, then concatenates that string to the varchar variable @strSQL. Each SUM() function generates a column of output in the final report.
The CASE expression in the first SUM() function is for the Barnum's bookstore column of the report. SQL Server interprets the embedded apostrophe in "Barnum's" as the delimiter for the comparison string and tries to compare the data stored in stor_name to Barnum instead of Barnum's. To fix this problem, the code must replace the single quote in the name with two single quotes. The first quote in a double-singlequote sequence is an escape character for the second quote; it tells the SQL Server string parser to interpret the second quote as a character rather than a string delimiter. The apostrophe in the column label [Barnum's] is OK because it's a column label embedded in square column brackets. Listing 2 contains the correct CASE expression for the Barnum's column of the report.
The difficulty grows when you try to generate this SQL query from a string that is itself delimited with single quotes. You must provide an escape sequence for the comparison string's leading and trailing single quotes. The comparison string is contained in a variable, so you must programmatically replace its single quote with two single quotes by using a REPLACE() function. Unfortunately, the REPLACE() function also uses single quotes as string delimiters for its string-literal arguments. So to replace one single quote with two single quotes in the @StoreName variable, you have to use the awkward syntax
REPLACE(@StoreName,'''','''''')
The ASCII number for the single quote is 39. To improve the readability of the CASE expressions in Listing 4's code, I selectively use CHAR(39) in place of some of the single quotes.
After the code has finished looping through the store names, it concatenates the rest of the SQL string to the varchar variable. When the SQL string for the crosstab query is complete, the code calls the EXECUTE procedure to generate the crosstab result set.