Many programmers continue to overlook helpful SQL Server features that have
been available for years. Most of these overlooked features can simplify your
queries, optimize their performance, and improve your productivity. One such
feature is T-SQL's GROUP BY ALL option. Ask around, and I bet you'll find few
T-SQL programmers who can explain what this option does.
By using the GROUP BY ALL option, you can create groups even if they're empty.
In other words, your query returns groups of zero rows even when the WHERE clause
filters out all the group's rows. Have I lost you yet? Never fear—read
on, and you'll soon understand the differences between using the GROUP BY clause
with and without the ALL option and learn some different ways to use GROUP BY
ALL. And for more TSQL challenges, see the Logical Puzzle and the
Web-exclusive sidebar "Catch That Bug."
GROUP BY and GROUP BY ALL
First, consider the following GROUP BY query, which doesn't use the ALL option:
SET NOCOUNT ON;
USE Northwind;
SELECT City, COUNT(*) AS NumEmps
FROM dbo.Employees
GROUP BY City;
This query groups employees by city and returns the city and count of employees
for each group, as Table 1 shows. Suppose you
want to consider employees hired only in or after 1993. You just add a simple
filter on the HireDate column, as the WHERE clause in the following query shows:
SELECT City, COUNT(*) AS NumEmps
FROM dbo.Employees
WHERE HireDate >= '19930101'
GROUP BY City;
This query filters out employees hired before 1993, groups the remaining employees
by city, then returns the city and count of employees for each group, as Table
2 shows.
Notice that the query doesn't return Kirkland and Tacoma, because no employees
from these cities were hired in or after 1993.Typically, the GROUP BY clause
operates only on rows that remain after you apply the WHERE filter. However,
by adding the ALL option to the GROUP BY clause, you can generate empty groups—or
groups of zero rows—for cities that the WHERE clause filters out. Table
3 shows the results of adding the ALL option to the GROUP BY clause:
SELECT City, COUNT(*) AS
NumEmps FROM dbo.Employees
WHERE HireDate >= '19930101'
GROUP BY ALL City;
Kirkland and Tacoma now appear in the query results, each with a count of 0, showing that no employees from these cities were hired in or after 1993. Including these cities in the result set indicates that the Employees table contains employees from these cities—just not employees hired in or after 1993.
The Best Option
You could get the same result without using the GROUP BY ALL option, but the
alternatives are more complex. For example, you can use the UNION ALL set operation
to return all cities, including those with zero employees that match the filter
criteria, as the code in Listing 1 shows.
In this example, two different queries return sets of data as input to the UNION
ALL operation. The first query returns cities and counts of employees hired
in or after 1993—without empty groups. The second query uses a NOT EXISTS
predicate to return only cities that have no employees hired in or after 1993.
In SQL Server 2005, you can simplify this query by using the new EXCEPT set
operation instead of the NOT EXISTS predicate, as the code in Listing
2 shows. You can also achieve the desired result by using a CASE expression
that yields a 1 if the hire date is in or after 1993 or a NULL otherwise.You
can then use the COUNT function on the CASE expression to consider employees
hired only in or after 1993:
SELECT City,
COUNT(CASE WHEN
HireDate >= '19930101'
THEN 1 END) AS NumEmps
FROM dbo.Employees
GROUP BY City;
However, all these alternatives are more complex than using GROUP BY ALL.
Join In
The GROUP BY ALL option also has interesting applications in join queries. To
demonstrate these uses, first run the code that Listing
3 shows to add two employees to the Northwind database's Employees table—Itzik
Ben-Gan and Lilach Ben-Gan from Middle Earth, Hobitton—and to add an order
for Lilach in the Orders table. (Some code in this article wraps to several
lines because of space constraints.)
Now, examine the query in Listing 4, which
returns US employees and the count of orders for each of those employees, as
Table 4 shows. Because the query applies an
inner join, the ON filter eliminates employees who have no orders. And the WHERE
clause filters out non-US employees. So the result set doesn't include employees
who made no orders (Itzik) or employees from outside the United States (the
United Kingdom, Middle Earth).
Suppose you want to see all employees who placed orders, but you want the count
of orders only for US employees. To add non-US employees to the result, you simply
add the ALL option to the GROUP BY clause that Listing
5 shows. Table 5 shows this query's result
set. The WHERE clause filters out non-US employees, but the ALL option in the
GROUP BY clause adds those groups of employees back to the result. So Steven,
Michael, Robert, Anne, and Lilach (who aren't in Table
4) now appear in Table 5, with order counts
of 0.These employees handled orders, but they aren't located in the United States.
Notice, however, that Itzik doesn't appear in the result because he didn't handle
any orders.
Remember that GROUP BY ALL adds empty groups for rows eliminated in the WHERE
clause—it doesn't add empty groups for rows eliminated in the ON clause.
If you put the country filter in the ON clause and don't use a WHERE clause,
the ALL option becomes meaningless. For example, the following query generates
the same result as Table 4 shows, whether you
use the ALL option or not:
SELECT E.EmployeeID, E.FirstName,
E.LastName, E.Country,
COUNT(*) AS NumOrders
FROM dbo.Employees AS E
JOIN dbo.Orders AS O
ON O.EmployeeID =
E.EmployeeID
AND E.Country = 'USA'
GROUP BY ALL E.EmployeeID,
E.FirstName, E.LastName,
E.Country;
You might think you can produce the desired result by using an outer join without
using the GROUP BY ALL option at all, as Listing
6 shows. Remember that you want to see all employees who made orders, but
you want the count of orders only for US employees. However, as you can see
in the result that Table 6 shows, this query
includes Itzik even though he handled no orders. So this query returns all employees,
regardless of whether they made orders, with the count of orders only for US
employees.
Prev. page  
[1]
2
next page