• subscribe
July 24, 2003 12:00 AM

Survive the (Relational) Divide

Sometimes a hybrid solution is the answer
SQL Server Pro
InstantDoc ID #39300
Downloads
39300.zip

Relational division is a useful T-SQL programmer's tool for solving problems users encounter as they try to retrieve information from their database tables. In my previous two columns, I discussed different techniques for handling relational division, which involves matching sets that share some or all members. In "Set Members and Relationships," June 2003, InstantDoc ID 38515, I discussed solutions based on aggregations. In "A Different Setup," July 2003, InstantDoc ID 38812, I reviewed solutions based on correlated subqueries. This month, I present a relational-division puzzle, invite you to solve it by applying those earlier techniques, then show you an additional solution and some variations.

The Puzzle
In a fictitious training company called Luminata, the training staff stores information about courses and course materials in a SQL Server database. A table called Courses holds a row for each course that contains course attributes such as the course ID, title, and duration. Information about materials such as student textbooks and software CD-ROMs resides in a table called Materials, which contains a row of identifiers such as material ID, material name, supplier ID, and shipper ID for each material. A table called CourseMaterials tracks the specific materials required for each course in columns called materialid and courseid. Note that each material can appear in Materials only once but can appear in CourseMaterials multiple times because the same material might be required for multiple courses. Run the script that Listing 1 shows to create the Courses, Materials, and CourseMaterials tables and populate them with sample data.

Although courses usually require their own unique set of course materials, on rare occasions at Luminata, different courses share the same set of materials. The training staff wants to reduce the maintenance overhead involved with ordering and organizing class materials, so it has devised a streamlined way to identify courses that share the same materials: The staff assigns every course a minimum ID. For all courses that require unique material sets, the course ID is also the minimum ID. However, for courses that share the same course materials, the minimum ID assigned is the lowest course ID among them. For example, suppose courses with IDs 2 and 3 both require materials 1, 5, and 6. The minimum course IDs that identify the set of required materials would be 2 for both courses.

Here's your puzzle: Write code that for each course returns an ID and a minimum ID from the courses that share the same course materials. Before looking at my solutions, try to solve the puzzle on your own by using the aggregations and correlated subqueries techniques I showed you in my previous two articles. If you use Listing 1's sample data for your solution, you should obtain the results that Figure 1 shows. Courses 2 and 3 share the same set of course materials (i.e., the set {1, 5, 6}), so the staff assigns 2 as the minimum course ID for both courses.

Pure Set-Based Solutions
Now, compare your solutions with a couple of pure set-based solutions that I devised by using the aggregations and correlated subqueries techniques. My solutions are based on the following pseudo query:

SELECT courseid,
   COALESCE(
      (SELECT MIN(courseid)
      FROM Courses AS C2
      WHERE C2.courseid
     < C1.courseid
          AND <expression_that_
            returns_true_when_C1_
            and_C2_are_equal>
      ),
      courseid) AS mincourseid
FROM Courses AS C1

For each row in the instance of Courses called C1, a subquery calculates the minimum course ID from the courses that share the same set of materials in the instance called C2. Because the minimum course ID is required, the subquery examines only courses in C2 that carry a lower course ID than they have in C1. When the course in C1 is already the minimum course ID, the subquery returns NULL; in that case, the COALESCE() function returns the course ID from C1. This pseudo code is fairly straightforward; the tricky part is to write the expression that evaluates to true when the sets are equal.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...