DOWNLOAD THE CODE:
Download the Code 44682.zip

In my work, I routinely redesign and redeploy production databases, and to do so, I write programs that transfer data from the old database to the new one. Readers have asked me to share data-transfer tips and techniques, so this month, I respond to that request. To show how to move data from an old database to a new one, I refer to the keyword-array example in my December 2004 article "The Search Path Less Traveled," InstantDoc ID 44273. In that article, I explained how keyword arrays such as the one that Figure 1 shows can adversely affect database performance and data-retrieval accuracy. I also explained how to design a set of normalized tables that will let you easily search by and manage keywords, maintain high data-retrieval accuracy, and minimize duplicate data. This month, I walk you through porting the data from the keyword array into this new set of normalized tables.

Before we start this project, let's briefly analyze Figure 1's ASSET table and determine why we'd want to normalize it. (For a full discussion about data normalization, see my article "Normalization: Why You Need It," March 1999, InstantDoc ID 4887.) Remember this rule of normalization: one topic, one table. The ASSET table breaks this rule by storing data about images and keywords. But a worse problem is the embedded many-to-many (M:N) relationship between image and keyword. As I explained in "The Search Path Less Traveled," an image is described by one or more keywords; a keyword can describe one or more images. As the M:N relationship becomes more complex—more keywords describe each image, and the same set of keywords describes more images—the ASSET table becomes harder to maintain, so we need to normalize it.

You can download this article's Web listings at InstantDoc ID 44682. We'll use the code in Web Listing 1 to create and load the ASSET table. Then, we'll use the code in Web Listing 2 to create the new, normalized tables. Finally, we'll use the code in Web Listing 3 to transfer the data from ASSET into the three new tables, IMAGE, KEYWORD, and IMAGEKEYWORD.

Doing the Prep Work
Figure 2 is a physical data model (it looks suspiciously like an entity-relationship diagram) of the new, normalized tables. The IMAGE table contains a description of each image. The KEYWORD table contains a list of keywords. To associate keywords with images, we need a table like IMAGEKEYWORD, in which each image can be associated with zero or many keywords and each keyword can describe zero or many images.

When you're transferring data from one database schema to another, it helps to visualize what you need to do before you start writing code. Figure 3 shows a mapping technique I use when planning complex data transfers. Each field of the original table (the blue bar) is mapped to another field in one of the new tables (the tan bars). I retain the ID field from the ARRAY table in the IMAGE table's ImageID field because of relationships with other tables that aren't part of this discussion. When transferring data, you have to remember this rule: You must maintain the valid table-to-table associations and relationships of the original database. As Figure 3 shows, we'll transfer data from the ARRAY table's name and asset_image fields to the IMAGE table's ImageName and ImageFile fields, respectively. I renamed these fields to better identify their business function in the new database schema. We'll transform the keyword array, ASSET.keywords, from an array of terms to a list of single words in the KEYWORD table's KeywordDscr field.

Now we have images and keywords mapped, but how do we associate the two? That's the purpose of the IMAGEKEYWORD table. We created this table to retain the M:N relationship and make it work. The IMAGEKEYWORD table stores the ImageID field from the IMAGE table and the KeywordID field from the KEYWORD table as pairs; for each image that one keyword describes, IMAGEKEYWORD will contain one record.

We're almost ready to begin reviewing the code listings, but before we do, I want to mention something about backward compatibility. You might be thinking that when I talk about an older database schema, I'm talking about a legacy system or at least one that's several years old. That's not always true. Occasionally, a client asks me to restructure brand-new SQL Server databases whose design doesn't let a company expand or extend their business processes. In one case, the schema was only a year old when I started working on its replacement. So when you transfer data from one database schema to another, you have to keep in mind that the current UI might not be changing immediately. The decision-makers in your company might decide to restructure the database first and map the existing UI to the new database schema. Then over time, they'll introduce the new schema's enhanced functionality to their users and customers through new Web interfaces. Therefore, when you're designing a new database and transferring data, you have to be aware of how you're going to support this essential backward compatibility.

   Prev. page   [1] 2 3     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.