How do you identify a list or a collection in a table? I define a list as a set of same-type attribute names that have a progressive counter (e.g., 1, 2, 3) appended to the end of the names. Thus, Disk1, Disk2, and Disk3 constitute a list of physical disks. In Figure 1, Disk1Capacity and its fellows are a variation of this naming convention in which the counters reside in the middle of the attribute name, but the progression—1, 2, 3—is still clearly distinguishable. One word of warning: In confirming your suspicions about a list, you need to verify that the progressive counters designate separate objects instead of breaking up an overly long attribute, such as Address1, Address2.

A list is a multivalued attribute that has been flattened so that the list's members can be stored in the same table. To correct that condition, you need to decompose a multivalued attribute into a separate table, then link the two tables in a 1:M parent-to-child relationship. In my example, after you remove the disk list in Figure 1 from the SERVER table (the parent), you convert the list into a table that represents the set of physical disk devices (the child). Then, you have to identify other attributes that describe a hard disk. As Figure 2 shows, SERVER—the parent table—has a 1:M relationship with DISK_DEVICES—its new child table. In DISK_DEVICES, you can capture not only each disk's capacity but also its physical device number, physical device name, capacity units, and the presence (or absence) of a boot drive. By separating device capacity—an integer data type—from capacity units—a char(2) data type—you can easily run a query that calculates total capacity, grouped by ServerName. Figure 2 shows DISK_DEVICES involved in an additional relationship—the VENDOR-to-DISK_DEVICES 1:M relationship. Because DISK_DEVICES is a separate table, you can associate each disk with its disk vendor.

Now that you know how to define, identify, and decompose a list, you can apply a similar process to a collection. I define a collection as a group of similar (not same-type) objects that describe a parent object. You treat a collection the same way that you handle a list, except that you typically need two tables instead of one to fully express the collection. The SERVER table contains a collection of devices—ProcessorType, Memory, CD_ROM, DVD, FloppyDrive, MouseType, Monitor, and NIC. Figure 3 shows that the collection has been removed from the SERVER table and that COMPONENT, a new table, has appeared in the model. The COMPONENT table contains columns in which you store the type of component (e.g., processor, monitor, mouse) and a brief description of each (Pentium III, flat-screen 15", and trackball optical 1.0 PS/2-/USB-compatible, respectively). Because you relate the COMPONENT table to the VENDOR table, you can show which vendor manufactured which component.

Decomposing the SERVER table to remove the collection of components into the COMPONENT table is the first step in the process. To relate the components to the servers they're part of, you have to create a table called INSTALLED_COMPONENT. The relationship between SERVER and COMPONENT is M:N. You know that the only way to express a M:N relationship is to create an associative table—in this case, INSTALLED_COMPONENT—which Figure 3 shows. Now you can write a query that joins the SERVER, INSTALLED_COMPONENT, and COMPONENT tables to find out which servers have optical trackball mouse devices installed. At last, you've removed the multivalued attributes (disguised as a list and a collection) from the SERVER table.

Simplify Your Data Management
Finding multivalued attributes that are embedded in tables as lists or collections can be a challenge. You have to know what the attribute name implies before you can definitely say that a set of attributes constitutes a list or a collection. After you've made that determination, you need to decompose the multivalued attribute into separate tables and relate those tables back to the parent table from which you removed the attribute. Although writing some queries might be easier against the simpler, "flattened" model, you'll be better able to manage your data and control data integrity if you properly decompose these multivalued attributes.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

This article is great, but what happen when you don't know the name of the attibutes or they are variable, for example the attributes for people on a marketing database (do you have a car?, what model? Insurance?, credit card?, etc, to name a few), can you help me with that?

Thanks in advanced

Luis Abrego

 
 

ADS BY GOOGLE