DOWNLOAD THE CODE:
Download the Code 16548.zip

Periodically refresh the data in your data mart

In the first two articles in this series ("Relational to Dimensional," June 2000, and "Building a Cube from a Dimensional Database," October 2000), I described a simple solution to provide reporting and analysis for the fictitious Northwind Traders Company. The solution included creating a new database designed for querying, loading the database from the relational Northwind database, and using SQL Server 2000 Analysis Services to create an OLAP cube. This article completes the series by describing the steps you need to take to periodically refresh the data in the data mart so that you can reload the entire data set without updating the data from the source database.

Changing Dimensions
The dimensional, or star schema, database created for reporting and analyzing sales information across your company has dimensions for suppliers, products, employees, customers, and time, as Figure 1 shows. Except for time, this data changes as business parties begin or end relations with Northwind Traders. Depending on analysis requirements, a variety of rules could apply to these changes. New customers, suppliers, products, and employees are relatively easy to handle. You add them to the dimension tables so that you can capture related transactions in the fact table. When these business parties aren't in the picture, the rules can be more troublesome.

An easy solution to changing dimensions is to ignore those dimension members that go away and delete them and their facts, but this approach skews history. A product that Northwind Traders discontinued still has sales numbers for the time that the company sold the product, and this history factors into the numbers for the product's parents in the product dimension hierarchy.

What do you do if two customer or supplier companies merge into one entity? Although many possibilities exist, they all boil down to business requirements. How you deal with changing dimensions can vary on a case-by-case basis. For information about three options for dealing with slowly changing dimensions, see Ralph Kimball, The Data Warehouse Toolkit: Practical Techniques for Building Dimensional Data Warehouses (Wiley & Sons, 1996); Ralph Kimball et al., The Data Warehouse Lifecycle Toolkit: Expert Methods for Designing, Developing, and Deploying Data Warehouses (Wiley & Sons, 1998); and Amir Netz's Microsoft article "OLAP Services: Managing Slowly Changing Dimensions," at http://msdn.microsoft.com/library/techart/slowly2.htm. Each of the three options has a different effect on the ability to look back into history.

The first option is to simply replace an old value with the new one. For example, if a customer's marital status changes from single to married, you could just change his or her status. But that approach hides the person's historical buying behavior as a single person, which could be very different from the buying behavior of a married person.

The second option for dealing with a changing dimension is to add a dimension member for the customer with a status of married so that historical transactions associated with him or her as a single person still show up and his or her new transactions as a married person are captured moving forward. This option preserves historical information.

The third option is to store a reference to the previous value. In the case of the customer who marries, he or she might have a current marital status of married and a previous marital status of single. Because being married is definite, referring to the previous value isn't necessary. However, this option makes more sense for less cut-and-dried situations. For example, in The Data Warehouse Lifecycle Toolkit, Kimball uses the redrawing of sales district boundaries as a possible candidate for this option. Analysis Services also has features for dealing with changing dimensions.

Following the simple approach of my first two articles, I will preserve history in the dimensional database and add new business parties as changes arise, as in the second option above. This approach lets me add sales numbers to the dimensional database fact table based on the business parties that each transaction involved.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

How come you don't include the DTS Package Refresh Northwind_Star in the source code for download???

Spencer Tabbert