DOWNLOAD THE CODE:
Download the Code 42901.zip

Data integrity, data consistency, and avoiding anomalies are the most important considerations when you store data in your database. However, when you query data, you usually think about your application's users and the information that you provide them. You often want to let them look at the data in a different format than the raw form in which it's stored. When you query data for analysis purposes, you typically want to produce either a small result set that fits on a screen or an intermediate result set that can be more easily manipulated by another query than your source data can. For these purposes, you might want to pivot your data. When you pivot data, you rotate rows to columns or columns to rows. That is, you can rotate data from multiple rows to multiple columns in a single row or unpivot data from multiple columns of a single row into result rows. You can use the pivot technique to rotate data in columns and rows for analysis purposes. Many analysis tools provide pivot capabilities, such as Microsoft Excel's PivotTable, Microsoft Access's Transform command, and the Pivot Table ActiveX Control. However, if you want to pivot or unpivot data in SQL Server 2000, you have to write complex T-SQL expressions. SQL Server 2005 provides native T-SQL PIVOT and UNPIVOT operators that simplify these operations. In this article, I discuss some techniques for pivoting data in SQL Server 2000 and introduce you to SQL Server 2005's PIVOT operator.

Pivoting Data in SQL Server 2000
To demonstrate pivoting techniques, let's use the Orders table that Listing 1 creates and populates. Say you want to analyze order quantities by customer and order year. You can group the data according to custid and YEAR(orderdate), then return the SUM(qty) value, but the total quantity for each combination of customer and order year will appear in a separate result row. For example, customer A ordered in 2002, 2003, and 2004. Hence, customer A has three rows in the result set-one row for each year. Assume that for each customer, you want one result row that has one column for each order year-2002, 2003, and 2004-and total quantities for each year in separate columns, as Figure 1 shows. Writing a query that produces these results in SQL Server 2000 requires some T-SQL acrobatics. But I can show you some techniques that will solve the problem. The first solution uses an aggregate query that contains CASE expressions:

SELECT custid,
  SUM(CASE WHEN orderyear = 2002 THEN qty END) AS [2002],
  SUM(CASE WHEN orderyear = 2003 THEN qty END) AS [2003],
  SUM(CASE WHEN orderyear = 2004 THEN qty END) AS [2004]
FROM (SELECT custid, YEAR(orderdate) AS orderyear, qty FROM dbo.Orders) AS D
GROUP BY custid

This query defines the derived table D and extracts from the Orders table only the columns of interest for the pivot operation (custid, orderyear, and qty). To return one row for each customer, the query groups the data by custid. The SELECT list performs the magic of extracting quantities from multiple source rows and creating multiple result columns within a single row. The query contains a CASE expression for each order year. This example involves three order years: 2002, 2003, and 2004. The CASE expressions return the order quantity only if the base row contains the order year that the expression contains. Otherwise, the CASE expression returns NULL because you didn't specify an ELSE clause. You can explicitly specify ELSE NULL if you want, but because you want to get a NULL when the CASE doesn't find a match, you might prefer to use the shorter syntax. Then the SUM() function summarizes all the order quantities and ignores the NULLs. In the query, the first SUM(CASE) expression summarizes the order quantities for order year 2002, the second expression summarizes for 2003, and the third expression summarizes for 2004. So you get all the yearly order quantities for a particular customer in one result row.

This example uses only 3 years, so the query is short. A larger number of years will make the query much longer. However, you can shorten the query by applying a different pivoting technique-create a matrix table that contains a row and a column for each order year. A matrix table is a helper table that contains a row for each source attribute that you're about to rotate and a column for each target attribute that you're about to generate. In this example, we're rotating source year rows into target year columns so you have three rows (for the years 2002, 2003, and 2004) and three columns (for the same years). You get nine row and column intersections in a three-by-three table. The matrix table will help you isolate the column values that you want to aggregate into the different result columns. Listing 2 creates and populates the Matrix table with three rows for the order years 2002, 2003, and 2004. The Matrix table has an orderyear column and a column that uses the naming format yorderyear for each order year that you want to handle. In our example, the table contains the columns orderyear, y2002, y2003, and y2004. The table has a row for each year-three rows in this case-and stores the order year in the orderyear column, a 1 in the column whose value is the same year as its name, and NULLs in all other columns. For example, the row for order year 2002 has 2002 in the orderyear column, a 1 in the column y2002, NULL in the column y2003 and NULL in the column y2004. Issuing a SELECT * statement against the Matrix table produces the results that Figure 2 shows.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

"dynamic pivoting"-- I wonder if that's anything like an article idea I pitched about a year ago (and which you declined). Having had to do a lot of ad hoc pivots of large tables for my work, I wrote a stored procedure that will pivot anything you throw at it. So you can use a stored procedure call like this in the pubs database:

exec dbo.spTranspose @Query = ' SELECT s.stor_id, st.stor_name, s.title_id, s.qty FROM pubs.dbo.sales s JOIN pubs.dbo.stores st ON st.stor_id = s.stor_id JOIN pubs.dbo.titles t ON t.title_id = s.title_id WHERE t.type = ''business''' , @Group_Columns = 'stor_id, stor_name' , @ID = 'title_id' , @Data = 'qty'

...and get results like this:

stor_id stor_name BU1032 BU1111 BU2075 BU7832 ------- ---------------------------------------- ------ ------ ------ ------ 6380 Eric the Read Books 5 NULL NULL NULL 7896 Fricative Bookshop NULL NULL 35 15 8042 Bookbeat 10 25 NULL NULL

...with no additional effort. You can even send spTranspose a @ResultTable parameter and save your results into a table of your choice to further work with.

I can send source code (again), if you're interested.

brian j. parker

Not so new! It will be more interesting when I will be able to use a parametric pivot table (I mean with a parametric number of years). Thank you

Stefano Sammarco

Stefano: you could definitely use my spTranspose. A call like this would let you do what you want using Ben-Gan's test data, you just have to change the WHERE clause in the @Query parameter:

exec cmv.dbo.spTranspose @Query = 'SELECT custid, YEAR(orderdate) AS orderyear, sum(qty) as "qty" FROM dbo.Orders WHERE Year(orderdate) in (2003,2004) GROUP BY custid, Year(orderdate)', @Group_Columns = 'custid', @ID = 'orderyear', @Data = 'qty'

If you want the source code, just let me know where to send it. (brian.j.parker at gmail dot com)

Of course the support for PIVOT will be the way to go in the future, but I've had to do a LOT of ad hoc pivoting where I work and will continue to need to do so until SQL Server 2005 is in our hands! I've gotten a ton of mileage from spTranspose and I'm surprised I've never seen anything like it in any of the books I've read.

beeporama

Article Rating 2 out of 5

The question this article highlights is, given the the all too common need to pivot data, how is it possible that this feature was left out of SQL Server 2000 in the first place, given that other 'lesser' MS products already have the feature? And Brian, don't take things so personally. I've been on the publishing side of the fence myself, and your article idea not being accepted was likely a matter of timing vs. merit. BTW, search for 't-sql pivot' in Google and you'll find lots of approached to the problem.

dterrie

Article Rating 3 out of 5

Hey brian, Am currently using SQL Server 2000 and exactly looking for a generic procedure which does this transposition. Do you mind sending me the stored proc code to crush_spin@yahoo.com.

Thanks in advance.

crush_spin

Hi,

Brian, I’m sorry your article wasn’t published. It was probably because we covered a routine providing dynamic pivoting in the past (InstantDoc #15608, October 2000). As dterrie guessed, timing has a lot to do with the publication of this article. The T-SQL 2005 column’s purpose is to describe new T-SQL features in SQL Server 2005 and compare them to existing ones where relevant. Since SQL Server 2005 introduced the PIVOT and UNPIVOT operators, of course I so fit to discuss it in detail and compare to the existing alternatives.

Cheers, Itzik Ben-Gan

BG_SQL

First few years of programming SQL; Now getting into reporting services this quarter. Very useful. THANKS!

dianagele

Article Rating 5 out of 5

 
 

ADS BY GOOGLE