Get the data you need from less-than-perfect tables
Let's face reality: The world isn't perfect. All the database tables in the fifth normal form, hyper-efficient queries, and unlimited hardware budgets that you read about in articles and books are the way things should be. Now let's talk about how things often are. You'll inevitably come across some VSDBs (Very Strange Databases) in your career. Most of us have opened a database diagram in Enterprise Manager and wondered, "What in the world were they thinking?" I'm not blaming anyone. Database schemas tend to have unusual logic for several reasons, not just because their designers lacked knowledge of good database-design practices. Small companies usually don't anticipate explosive growth when designing their databases and applications. But a system that handled 100 users well can find itself in a disastrous situation when the company grows faster than expected. These companies rarely consider redesigns or upgrades, so a kind of patched-up database evolves. The Internet era has also contributed to the rise of shoddy database design. Companies expect Web sites to be up and running yesterday in the race for e-business, so developers adopt a mentality that a dear friend of mine calls the "shoot now and aim later" school of application design.
Whatever the reasons, quirkiness prevails in some parts of the database world. Let's look at some practical ways of dealing with two common but unfortunate situations that you'll likely encounter in dealing with relational databases: tables with multivalued columns and tables with repeating groups.
Why Be Normal?
As you know, a set of rules known as normal forms governs relational database design. I won't go into detail about these rules here, but the first ruleor first normal form (1NF)states that each attribute in a table should be atomic (of a single value) and nonrepeating. Hence, you need to model a one-to-many (1:M) relationship as two entities: one to contain the parent and another to contain the children.
Let's look at an example e-commerce situation involving an imaginary company that sells toys and novelties. (For the example to work, you need to download and run the .sql file that accompanies this articlesee "More on the Web" for instructions. Then, create a DSN called ToysDSN that points to the database containing the example tables. You also need a copy of the adovbs.inc include file in the Microsoft IIS directory that you're executing the Active Server PagesASPexamples from.) The company's legacy database uses an item ID number and an item option (e.g., size, color), which an application eventually writes to an Orders table. Figure 1 shows two tables, Items and Options, which contain the items and the options for each item, respectively.
This schema is a normalized example of a 1:M relationship; it's modeled with one entity for the items and another for the available item options. Item 100, a leisure suit, comes in three sizessmall, medium, and largeand the SQL to find the options for item 100 is straightforward:
SELECT ItemOption FROM Options WHERE
ItemId = 100
On your Web site, when a user clicks the leisure suit link to show the item details, you want to populate a drop-down list with the item options. With the normalized design, this process requires only the preceding simple SELECT statement, as the ASP code in Listing 1 shows.
Additions to the option list are also straightforward. Say that big and tall people began to email your company, requesting your leisure suit for parties and class reunions. After instructing your supplier to begin manufacturing plus sizes, you can just insert the new options like this:
INSERT INTO Options (ItemId, ItemOption)
VALUES (100, "Extra Large")
INSERT INTO Options (ItemId, ItemOption)
VALUES (100, "Extra Extra Large")
You can now offer the leisure suit in five sizes, andthis is importantyour application code has remained unchanged. The original SELECT statement will populate the drop-down list correctly, no matter how many sizes you put in the Options table.
Now let's look at the same scenario using a table in a state I call the first abnormal form (1ANF). Let's say that your legacy system uses an item ID and option code, but the database was modeled with only table AB1, which Figure 2 shows. As you can see, the designer defined all the items and item options in one entity, with one column for each option. You might even have the misfortune of needing to deal with a variant of a 1ANF table, like table AB2 in Figure 3. Table AB2 has three columns; the third one contains a character string of all available options delimited by a comma. You might wonder what's wrong with creating tables like these, but consider the following: How easy is adding an option? Removing one? Can users search for all items that offer a particular option? Looking at AB1 and AB2, you can see that the answers to these questions aren't clear.
So, how do you effectively deal with tables that are in the first abnormal form? Ideally, you split the original table into two tables and create a 1:M relationship. But in some cases, you won't have that option, as the DBA of the legacy system will be happy to tell you. Legacy databases often have many processes and applications that operate against them, making structural changes to the databases impractical. Proceeding on the assumption that you're stuck with a 1ANF table, let's address tables AB1 and AB2.
Prev. page  
[1]
2
3
4
next page