DOWNLOAD THE CODE:
Download the Code 38656.zip

At first, you might wonder, why bother with this design at all? Why not just put all vehicle types in one table? Figure 2 shows a one-table implementation of the information from Figure 1's schema. Although you might save a table join by designing the database this way, this table isn't the most efficient from a relational standpoint. The NumberOfDoors field has no meaning for a motorcycle, and although motorcycle shoppers are certainly concerned with seat height, this information is irrelevant to a car. Nor have I ever seen a motorcycle with a truck bed. Consequently, you introduce lots of nulls into the table. Just imagine a supertype-subtype relationship that has 30, 40, or even 100 subtypes that hold 1000 not-in-common attributes. The inefficient storage and page splits you'd cause by having a table with 1000 or so columns could easily cancel out the benefit of saving a table join, especially if the not-in-common attributes use long character fields. Although I encourage you to consider all implementation possibilities, for this example, let's work with the four-table schema.

The ManyToMany Database Schema
Now that we've examined the supertype-subtype relationship, let's see an example of the performance benefits it offers over multiple M:N relationships. The examples in this article use two databases, ManyToMany and SuperSub. You can download CreateAndPopulateDBs.sql, which contains the code to create these databases and the article's stored procedures, at InstantDoc ID 38656. The examples work in both SQL Server 2000 and 7.0, and I've indexed the tables in both databases so that all join columns have an index on them. Also, to ensure that the indexing strategy doesn't affect the results, I avoided filtering result sets in the stored procedures other than on the primary key columns.

The example deals with a fictitious company that leases equipment to other businesses. This leasing company stocks various kinds of trucks, office furniture, and computers. In implementing the database schema for the company's main application, the designer used several M:N relationships, as Figure 3 shows. In this ManyToMany database schema, the Leases table is related through three M:N association tables to tables containing each type of equipment that customers can lease. Each association table records the item being leased and the quantity of the item being leased. (I omitted the Customer table for clarity.)

In a sense, nothing's wrong with this schema. It contains no major normal-form violations. (You could argue that Salesperson in the Leases table might be multivalued, that a truck could be painted two colors, or that in the future, furniture might have no weight; but let's stay focused.) Although the ManyToMany database schema serves its purpose, let's look at some common ways users might query this database and the subsequent performance results.

Listing 1 shows the T-SQL code that creates the stored procedure GetInvoiceDetails, which users execute frequently against the ManyToMany database. The GetInvoiceDetails procedure contains four blocks of code that return details about the equipment on a given lease as well as the grand total of the monthly rate billed for that lease. With the M:N schema, you need to use UNION to join three separate queries to return lease- detail information, as the first three blocks of the stored procedure show. Notice that each query adds a column called ITEM to identify what type of item is being leased. The fourth block retrieves the total monthly amount billed on each invoice by summing the unioned line-item total columns to create a super-aggregated amount. Using a leasenumber of 1, let's execute this query against the ManyToMany database:

USE ManyToMany
GO
EXEC GetInvoiceDetails 1

As Figure 4 shows, each item on leasenumber 1 is listed by type with line-item totals. The final row of the result set is the total monthly lease amount. A lease-detail report is a common request, so this would likely be one of the most-executed stored procedures for this database.

Now, let's look at the STATISTICS IO output of this stored procedure. Run the GetInvoiceDetails stored procedure with STATISTICS IO turned on:

SET STATISTICS IO ON
EXEC GetInvoiceDetails 1

You get the results that Figure 5 shows. (To ensure consistent results, I've already loaded the data into cache by running the procedure several times, so you have no physical reads). This fairly common lease-detail stored procedure generates 35 scans and 70 logical reads. As you'll soon see, this number of reads is much higher than necessary.

Prev. page     1 [2] 3     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

The zip file doesn't have the CreateAndPopulateDBs.sql script file. Can it be posted on the FTP site?

tester

First of all, good article. But I do have a question. In figure 6, the furniture, trucks and computers subtypes have their PKs named after their types (i.e. FurnitureID, TruckID, etc); however, on figure 1 the subtypes inherited their supertype's PK name (i.e. VehicleID). In the past, I've struggled to decide which format to use, mostly taking the former case, what rules do you adhere to when deciding whether to name the subtypes PK based on their type or based on their supertype's name?

csantos

Loved the article! I have used the super/sub relationships many times without knowing there was an actual official terminology.

datagod

Article Rating 5 out of 5

To datagod: you might find this article of interest too! Diana May Sr Technical Editor

[November 2006] Implementing Supertypes and Subtypes Why, how, and when By: Michelle A. Poolet Solutions by Design InstantDoc #93241 From the November 2006 edition of SQL Server Magazine

DianaMay

Article Rating 5 out of 5