• subscribe
September 21, 2006 12:00 AM

Build a Simple ETL System with SSIS

Get the big picture of basic package design and construction
SQL Server Pro
InstantDoc ID #93114
Downloads
93114.zip

3 More Dimension Tables
We have three more dimension tables to load: Date, Product, and Currency—all structured the same as the Promotion package. There are two Control Flow tasks: an Execute SQL task that pre-deletes existing dimension rows and a Data Flow task that extracts, transforms, and loads. As you can see in Figure 6, the Data Flow step for Product is somewhat more complex than for Promotion, but it's still basically the same. All the packages described in this article are available for download from the Microsoft Data Warehouse Toolkit Book Website at http://www.msftdwtoolkit.com.

The only new piece of the Product package is the Lookup transform. If you refer back to Figure 2, you'll see that we source DimProduct from three tables. In the interest of keeping source system queries simple, our source query included only two of those three tables. (We have only about 500 rows for the Product dimension, so we actually could have joined a dozen tables without hurting anything.)

If you can't or won't join tables in the source query, you can often use the Lookup transform to perform that operation within the Data Flow, as you see in Figure 6. For every row in the flow, Lookup looks up one or more columns in a secondary table.When you set up the Lookup transform, you first specify the connection for the database in which the lookup table resides. Next, you specify a source query that brings back only the columns you need for the lookup table. The default configuration points to the entire table, but because the high-performance Lookup is cached, you want to conserve memory. Finally, you go to the Columns tab of the Lookup Transformation Editor and specify the join column. On the same tab, ensure that you check those columns in the Lookup table that you want to add to the current data flow.

After you've tested each dimension table's package individually, create a master package to tie together their execution, as Figure 7 shows.The master package contains Execute Package tasks. Execute Package tasks are easy to configure: You need only identify the package to be executed.

Finishing Up
You're off to a good start, but you have a lot more to do.Your next step is to finish up the historical load by working through the process of loading the fact table, which you can postpone for now.

In a future article, we'll discuss the next part of the ETL process, which includes creating packages to handle incremental processing and fact table loads.The challenge for incremental dimension processing is to take care of changes in dimension attribute values. An enterprise-class SSIS system will use variables extensively, and will let you replace the placeholder values for your auditing columns. Most production SSIS systems also use some scripting to tie things together.

Most SSIS systems use only a subset of the tasks and transforms that are available. In the simple packages that we described in this article, we used 3 of the 40 Control Flow tasks: Data Flow, Execute SQL, and Execute Package.And we used 7 of the 45 Data Flow objects: OLE DB Source and Destination, Multicast, Raw File Destination, Derived Column, Lookup, and Union.You can use these basic building blocks to get started.



ARTICLE TOOLS

Comments
  • Neile Bermudes
    2 years ago
    Dec 31, 2010

    Just to add - if it just had a bit more detail, it would be a great article!

  • Neile Bermudes
    2 years ago
    Dec 31, 2010

    I agree with Jeff, many details missing in this article. For example, at point does the schema in AWOrders get created?? I can't use it as an ole db destination cause it doesn't contain any tables! It's definitely left me stuck...

  • Luca
    3 years ago
    May 18, 2009

    Too generic overview

  • Diana
    5 years ago
    Apr 19, 2007

    To vette67 - can you be more specific about the important steps you want detailed? We may have covered or will look at covering it in future articles.

  • JEFF
    5 years ago
    Apr 18, 2007

    The author provides little or no detail about how to do many important steps. I wouldnt call this a "how-to" article because it will leave beginner SSIS users stuck.

You must log on before posting a comment.

Are you a new visitor? Register Here