• subscribe
June 17, 2003 12:00 AM

Redesigning a Catalog

Create an ERD that lets the database grow with your needs
SQL Server Pro
InstantDoc ID #39069

Many developers create databases to manage a limited number of products for a limited business need. You might want to keep track of supplies in the office-supply cabinet, for example, or the total value of your office furniture. However, as this body of information grows to support other business needs, the simple inventory design gets more complicated. And managing the data can grow cumbersome and inefficient if you don't handle the added complexity correctly.

A catalog database might be the most complicated database design on the planet. Part of this complexity comes from the diversity of catalog businesses. My personal experience working with office-supply companies, grocery chains, and retail clothing stores has taught me that no one template can fit all purposes. But the easiest way to explain how to normalize a complex model such as a catalog is to work through an example. So let's look at a simplified solution that has room for growth and expansion and that you can adapt to various needs. I won't try to address the user-interface problems that surround catalog order entry. An order-entry application is only as good as its foundation—the database. If the database is poorly designed and can't accommodate the limitations of the client interface, you can count on continual problems managing the data.

A Problematic Design
Todd, a database developer, needs a well-normalized database to support products in a clothing catalog. An article of clothing in the catalog might be priced the same, be made of the same type of cloth, and come in the same colors as other pieces in the catalog. However, a pair of slacks might come in various waist sizes and lengths for men (e.g., 34/36, 36/38, 38/40) and in various numbered sizes (e.g., 4, 6, 8, 10, 12, 14) and generalized lengths (e.g., petite, regular, long) for women. Todd wants to be able to search the catalog database for all items of a particular color or material type and have the query run efficiently, hitting as few tables as possible.

Table 1 shows the kind of data Todd is dealing with. I show only 20 rows in this example table; an actual product table in a catalog database might contain hundreds of thousands or even millions of rows. Notice the way Table 1 is organized—one row for each variation of each product. The amount of redundant data in the table is staggering. Besides having to deal with the redundancy and resulting waste of hard-disk space, Todd can anticipate running into data-integrity problems during updates and deletes—such problems are always associated with a table in second normal form (2NF). (If you're not familiar with normal forms, you can review the basics in "Why You Need Database Normalization," March 1999, InstantDoc ID 4887.)

Table 1 is in first normal form (1NF) because the data is atomic—one value per cell, based on the meanings that the column headings imply—and because no arrays of values (also called multivalued attributes) are embedded in the table. The table is also in 2NF because each attribute is fully functionally dependent on the primary key, which is the product code. In this table, the product code defines each item in the catalog. The table isn't in third normal form (3NF) because it contains a transitive dependency—a combination of the product description, size and color, and possibly unit price determines the reorder point for each item. We don't know enough about Todd's company policy to determine whether unit price has any bearing on the reorder point for each item. This is a problem, because until we find out how unit price and reorder point interact, we can't fully normalize this table.

One important note about catalog inventory: In a typical clothing catalog, a single catalog number describes a set of products. For example, in Table 1, catalog number PP44888 describes a collection of lady's crewneck sweaters that come in a variety of colors and sizes. Each size-and-color combination is identified by a unique value—the product code—for inventory control. It's possible that a garment manufacturer in Chile makes the pink and white sweaters and a factory in Sri Lanka makes the blue and lavender sweaters. The catalog number is what customers see and use to place orders; the product code is what Todd's company uses to manage inventory.

Table 1 is also badly designed. The column headings are too general, often forcing Todd to string together two or three different attributes in a single column. For example, the column Product Description contains the product category, product subcategory, gender, and material type for each item. A better way to represent catalog items is to design a set of tables that lets you represent each attribute clearly and concisely.

A Workable Solution
Figure 1 shows a conceptual model—an entity-relationship diagram (ERD) of a proposed redesign of Table 1. Sets of products make up the CATALOG entity. Each PRODUCT is defined by attributes in the PRODUCT DETAILS table, is produced by a MANUFACTURER, and belongs to a single PRODUCT CATEGORY. The one-to-many (1:M) relationship between PRODUCT CATEGORY and PRODUCT is simplified; it lets Todd classify product PP44888 under only one product category such as sweater. If this restriction is too limiting, Todd can convert the relationship between PRODUCT CATEGORY and PRODUCT to a many-to-many (M:N) relationship. He can then classify product PP44888 as both sweater and outerwear—and as many other categories as he wants.

I used a 1:M relationship between MANUFACTURER and PRODUCT because for inventory control, Todd needs to track which company produces which product. For example, if two different companies manufacture the same sweater, Todd's inventory records it as two different products. Now Todd can extend this basic database design to track pricing and delivery times because he knows which company produced which lot of lady's pale blue crewneck sweaters. Todd's catalog company probably operates on a small profit margin, so tracking such details is crucial to the success of his business.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...