• subscribe
June 21, 2007 12:00 AM

Discover the Star Schema

A basic schema design for the data warehouse
SQL Server Pro
InstantDoc ID #96112

The star schema design that Figure 1 shows has several notable characteristics:

  1. Every table in the star schema has an identity primary key, which prevents quibbling about natural primary keys versus surrogate primary keys. In a data warehouse, the primary key assignment goes to the surrogate; if you have a natural key that you want to retain for querying, you designate it as an alternate key.
  2. Most of the columns in both the dimension tables and the fact table are nullable. Only the primary and alternate keys are mandatory, and only the primary key is unique. You need to assume that data will be loaded into this data warehouse structure from various sources, even from multiple generations of operational databases. Therefore, the constraints that you would normally apply to enforce business rules in the transactional database must be relaxed in the data warehouse. Most of the columns must be nullable because you might not have data for them.
  3. Relationships are optional on the parent side (i.e., zero-to-one to zero-to-many). If you have no control over the source data's referential integrity, you must allow for orphan records in the fact table and in lower levels of a dimensional hierarchy.
  4. Most columns in the fact table (i.e., the measures) are numbers. The fact table is the focus of a BI investigation, and BI analysts are looking for numbers and facts.
  5. Data redundancy is rampant throughout the star schema design; this redundancy is necessary for the data warehouse to reach an acceptable level of performance. The amount of data in a data warehouse is typically enormous compared with the amount of data in a transactional database. When you write a T-SQL query against the data warehouse's star schema, the redundancy minimizes the number of joins required to return the data, yielding much better performance than if you issued the same query against the source transactional database.

Schema design for a data warehouse need not be much different than schema design for a conventional transactional database. Because the data warehouse is an historical archive, you can retain some semblance of normalcy in the data warehouse schema design. In addition, you can create summary tables or columns, and you'll certainly want to add timestamps and identity values to individual records in each of a relational data warehouse's tables. A benefit of retaining a relational or near-relational data warehouse is that the warehouse is a readily available reporting database that you can use conventional T-SQL to query. With no programming learning curve and no special tool requirements, the data warehouse is an immediate ROI.

LEARNING PATH
SQL SERVER MAGAZINE RESOURCES:
"OLAP: Resistance Is Futile!" InstantDoc ID 5103
"Solutions by Design: Adding a Questionnaire to a Design," InstantDoc ID 7449
"Step-by-Step Data Warehousing," InstantDoc ID 8047


ARTICLE TOOLS

Comments
  • cwaters123
    9 months ago
    Aug 17, 2011

    The figures used as examples in the text are incorrect and seem to have nothing to do with the article content.

  • Marcos
    4 years ago
    Aug 26, 2008

    I agree with Bala. The articles are great but the website could use some user-friend techniques. Keep writing Michelle.

    Marcos Galvani

  • Diana
    5 years ago
    Dec 21, 2007

    Dear Bala,
    Thanks for taking the time to give your feedback. I'll be passing this on to our sales team...
    Diana May
    Sr Technical Editor
    SQL Server Magazine

  • BALA
    5 years ago
    Dec 17, 2007

    All these advertisements are really really annoying. I feel like navigating through a minefield to finish reading an article. Its not like this is a free website or something in order for me to put up with this annoyance. Could you please address this issue.

    Quite a few of your articles are thorough, professional and informative. I really enjoy reading your magazine articles.

    Thanks,
    Bala

  • Diana
    5 years ago
    Jul 05, 2007

    Dear Reader,
    Thanks for the feedback. Hope you'll find the whole series useful! Please let us know. Your feedback helps us direct our content to your needs.
    Diana - Technical Editor, SQL Server Magazine

You must log on before posting a comment.

Are you a new visitor? Register Here