DOWNLOAD THE CODE:
Download the Code 98711.zip

Grouping Data by More Than One Column
When selecting data for a query, you can group by more than one column. When you group by more than one column, you should include the ORDER BY clause to specify how you want the returned data sorted. If you don’t use this clause, the data isn’t returned any particular order.

For the next examples, let’s switch to the MovieReview table to see what some employees have been up to in their (ahem) spare time. Suppose you want to find out each employee’s favorite types of movies (i.e., genres) and the average rating he or she has given them. To determine the favorite genres, you need to count the number of times each type of movie was reviewed, which you can do with the COUNT function. You also need to use the AVG function to determine the average rating for each genre. To show the number of movie reviews per employee per genre, you need to group this data by two columns: EmployeeID and Genre. To make it easy to see each employee’s favorite types of movies, you can sort the data first by the returned values in EmployeeID column (column 1) then by the returned values in the Review column (column 3). Here’s what that query looks like:

SELECT EmployeeID, Genre,

COUNT(*) AS 'Reviews', AVG(Stars) AS 'Average Rating' FROM MovieReview GROUP BY EmployeeID, Genre ORDER BY 1,3

As you can see from the results in Figure 4, employee 1 has watched many different types of movies and seemed to enjoy the fiction and horror movies the most because they had the highest average rating. You can also see that employees 4 and 5 haven’t reviewed as many movies as the other employees.

Refining Results with the HAVING Clause

The HAVING clause is used to eliminate rows from the result set after the data has been aggregated and grouped. Any column defined in the SELECT list can be referenced in the HAVING clause. You can also reference aggregate functions.

For example, the results in Figure 4 show that not all the employees have reviewed the same number of movies. So, let’s produce a report that lists the average rating of every movie that has been rated by four or more employees.

As output, you want to see each movie’s name, the average number of stars it received, and the number of times it was reviewed. To get this output, you need to:

  • Use the GROUP BY clause to group the data by the MovieName column.
  • Use the AVG function to determine the average number of stars for each movie. However, instead of displaying the average number of stars as an integer (boring), you can use the REPLICATE function to display an asterisk (*) for every star the movie received. The REPLICATE function repeats a string value the specified number of times. To make sure this new Stars column is only 10 characters long, you need to use the LEFT function, which returns with the specified number of characters from the left part of the specified string.
  • Use the COUNT function to determine how many reviews each movie received.
  • Use the HAVING clause to display only those movies that were rated by at least four employees.
  • Use the ORDER BY clause to sort the movies by their average rating.

The query would look like

SELECT MovieName,

LEFT(REPLICATE('* ',AVG(Stars)),10) AS 'Stars', COUNT(*) AS 'Reviews' FROM MovieReview GROUP BY MovieName HAVING COUNT(*) >= 4 ORDER BY Stars

Figure 5 shows the results.

Now let’s get a list of the movies with the worst ratings—that is, those movies with an average rating of 3 or fewer stars. You can use the same query, except you need to use a HAVING clause that will display only those movies that received an average of three or fewer stars:

SELECT MovieName,

LEFT(REPLICATE('* ',AVG(Stars)),10) AS 'Stars' FROM MovieReview GROUP BY MovieName HAVING AVG(Stars) <= 3 ORDER BY Stars

As Figure 6 shows, there are some real stinkers.

Finally, let’s see what movies the employees enjoyed the most by changing the HAVING clause so that it displays only those movies that had an average rating of four stars or higher:

SELECT MovieName,

LEFT(REPLICATE('* ',AVG(Stars)),10) AS 'Stars' FROM MovieReview GROUP BY MovieName HAVING AVG(Stars) >= 4 ORDER BY Stars DESC

Notice that the ORDER BY clause sorts the data by the number of stars, but this time it displays them in descending order. Figure 7 shows the result.

A Powerful Feature
As you witnessed in this lesson, the GROUP BY clause is a powerful feature of T-SQL. Grouping data might have been a foreign concept to you prior to this lesson, but now you should be well on your way to writing complex useful reports in record time. Be sure to rub it into any COBOL programmers you might know.

 

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

 
 

ADS BY GOOGLE