In discussing what isn’t allowed within a view (as I did in my February column "Viewer Advisory," InstantDoc ID 37660), I'm often asked about the importance of COMPUTE and COMPUTE BY clauses. These aren't allowed within a view’s definition, yet they're important clauses to understand for detailed reporting. For summarizing data, many people use only GROUP BY when COMPUTE BY might be a better alternative. I’m partial to COMPUTE and COMPUTE BY, so I always take a minute to explain their use. When you want more than just the summary data (GROUP BY returns only one row for each grouping), using COMPUTE and COMPUTE BY is the best way to get the details and the aggregates.

To execute this article's examples, you can download the ComputeByExamples.sql commented script file at InstantDoc ID 38150. For simplified explanations of the results, I recommend running the examples in text mode (using Query Analyzer’s Query dropdown menu, choose the Results in Text option or Ctrl+T). Later, I explain how grid mode (the default option for Query Analyzer) works for these examples.

SQL Server supports a non-tabular way of returning aggregations over sets of data. Often, people use GROUP BY to produce summary data, but GROUP BY doesn’t always give enough detailed information. The GROUP BY clause is an ANSI-standard way of producing summary data, but it doesn’t return the details on which the summary data is based. For example, the code in Listing 1 produces one row for each ProductID. In this case, the grouping row shows the sum of items sold (quantity) for each ProductID in the Order Details table. With this query, SQL Server will return a sum of the number of products sold and produce only a sum—just one row—for each ProductID (and only for those that have rows in Order Details). For products without rows in the Order Details table, the query won’t return the ProductID and sum. For products with many sales, the query returns only one row—with the total sold (the sum of quantity).

This result set is extremely useful if you want only the total. However, if you need more details about these sales, the GROUP BY result set doesn’t provide them. For example, if you sold 580 of product ID 23, how would you know whether the total represented one order with a quantity of 580, 580 orders with a quantity of 1 each, or something in between? In some cases, you need the summary data and the details.

As an option, adding the count(), avg(), min(), and max() aggregates gives you more information about each product. But you can’t produce detailed information, such as the order on which the maximum number was sold or the customer to whom you sold it, without writing a separate query. Listing 2 shows how you can add more aggregates, but the results still give only limited information. If you’re trying to get an estimate, this is probably sufficient, but what if you want details about each order in addition to the sum? One might argue that you’re data mining here—looking for information about your data—and that programmatic data mining is better (and I’d probably agree). However, many of my customers still want paper-based or scrollable reports with the details and the sums. COMPUTE and COMPUTE BY offer a nice way to produce these reports.

COMPUTE and COMPUTE BY are T-SQL extensions whose result sets produce summary data while including all the details. The syntax mirrors this concept; instead of asking for aggregates in the SELECT list, you ask for them separately (in the COMPUTE clause) and SQL Server produces a separate result for each aggregate (a "logically" broken report). These reports might be several pages long, but you can flip through the pages and see summary data that’s easy to read because the report format breaks where the summary data exists. The summary data stands out from the rest of the report almost as if it were highlighted, but it’s really just another result set. Often, this report is called a control break report.

Technically, only COMPUTE BY breaks the report into groups, but even COMPUTE produces an extra result set for the grand aggregate (e.g., the grand total or overall average). Being able to quickly find the summary data and also have the details is the benefit of using the COMPUTE and COMPUTE BY clauses. You can combine COMPUTE and COMPUTE BY, even within the same query; COMPUTE offers grand totals and COMPUTE BY offers subtotals.

Consider the GROUP BY clause that Listing 1 shows. For each ProductID, the GROUP BY produces a sum. The result set doesn’t give you the breakdown of sales for each of these ProductIDs. In Listing 3, I removed the GROUP BY so that the query would return just the list of Product ID and Quantity for each order. In the result set, all you get is a ProductID with its quantity. However, if you’re using COMPUTE BY instead of GROUP BY, you can add attributes. Listing 4 adds the OrderDate, CompanyName, ProductName, and CategoryName to the SELECT list. Listing 4 shows a query that gathers the details; with a GROUP BY, this information wouldn't be displayed. Listing 4 gives you the details but not the sum, but what if you want both?

Here’s where COMPUTE and COMPUTE BY come in handy. For the columns you want summary data for, you can add the aggregation in the COMPUTE or COMPUTE BY clause. Before you can write this query, you must meet the first requirement of a COMPUTE or COMPUTE BY clause—including in the SELECT list the columns where you want summary data. (Note that in a GROUP BY clause, you don’t need to list the column defining the grouping.) You need to list each column only once for COMPUTE and COMPUTE BY, even if you want to see multiple aggregates of this data.

Listing 5 shows two additional clauses. The first COMPUTE clause is a COMPUTE BY, meaning that SQL Server will perform the computation for each grouping you defined in the BY portion. This query returns a sum of quantity for each Product ID, similar to the sum that the original GROUP BY in Listing 1 returned. However, this result set is significantly larger because it includes all the details about each order. The second COMPUTE clause is just a regular COMPUTE clause: It gives a grand total—in this case, the sum of all products.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Paragraph 2 of the article states: "..you can cownload the Computeby Examples.sql commented script file at InstantDoc ID 38150." 38150 is the InstantDoc ID for the same article, and I am unable to find any means to access the script file. Can you help?

Justin Randall

I loved the way the T-SQL was presented at ever increasing complexity with an excellent description. Came away with an excellent understanding of how and where to apply Compute and Compute by.

ingebme

Article Rating 5 out of 5