3 More Dimension Tables
We have three more dimension tables to load: Date, Product, and Currency—all structured the same as the Promotion package. There are two Control Flow tasks: an Execute SQL task that pre-deletes existing dimension rows and a Data Flow task that extracts, transforms, and loads. As you can see in Figure 6, the Data Flow step for Product is somewhat more complex than for Promotion, but it's still basically the same. All the packages described in this article are available for download from the Microsoft Data Warehouse Toolkit Book Website at http://www.msftdwtoolkit.com.
The only new piece of the Product package is the Lookup transform. If you refer back to Figure 2, you'll see that we source DimProduct from three tables. In the interest of keeping source system queries simple, our source query included only two of those three tables. (We have only about 500 rows for the Product dimension, so we actually could have joined a dozen tables without hurting anything.)
If you can't or won't join tables in the source query, you can often use the Lookup transform to perform that operation within the Data Flow, as you see in Figure 6. For every row in the flow, Lookup looks up one or more columns in a secondary table.When you set up the Lookup transform, you first specify the connection for the database in which the lookup table resides. Next, you specify a source query that brings back only the columns you need for the lookup table. The default configuration points to the entire table, but because the high-performance Lookup is cached, you want to conserve memory. Finally, you go to the Columns tab of the Lookup Transformation Editor and specify the join column. On the same tab, ensure that you check those columns in the Lookup table that you want to add to the current data flow.
After you've tested each dimension table's package individually, create a master package to tie together their execution, as Figure 7 shows.The master package contains Execute Package tasks. Execute Package tasks are easy to configure: You need only identify the package to be executed.
Finishing Up
You're off to a good start, but you have a lot more to do.Your next step is to finish up the historical load by working through the process of loading the fact table, which you can postpone for now.
In a future article, we'll discuss the next part of the ETL process, which includes creating packages to handle incremental processing and fact table loads.The challenge for incremental dimension processing is to take care of changes in dimension attribute values. An enterprise-class SSIS system will use variables extensively, and will let you replace the placeholder values for your auditing columns. Most production SSIS systems also use some scripting to tie things together.
Most SSIS systems use only a subset of the tasks and transforms that are available. In the simple packages that we described in this article, we used 3 of the 40 Control Flow tasks: Data Flow, Execute SQL, and Execute Package.And we used 7 of the 45 Data Flow objects: OLE DB Source and Destination, Multicast, Raw File Destination, Derived Column, Lookup, and Union.You can use these basic building blocks to get started.
End of Article
Prev. page
1
2
[3]
next page -->