• subscribe
June 03, 2008 12:00 AM

Retaining Relational Integrity in the Data Warehouse

SQL Server Pro
InstantDoc ID #99362

Retaining relational integrity between the fact table and its dimensions is very important. The data warehouse can become a trusted repository for enterprise data only if the data is clean and its integrity is intact. If, because of operational problems, you can't justify leaving the fact table foreign key constraints in place, plan to execute a periodic program that searches for violations of referential integrity and reports on them, so you can remedy the situation. It’s one thing to have a dimension record that’s not reflected in a fact table—that’s pretty typical. But it’s another thing to have a foreign key value in a fact table that has no associated record in a dimension. The valid orphan child record condition is unusual, even in a transactional database. In a data warehouse, it can cause problems for business intelligence (BI) software or when trying to build cubes for SQL Server Analysis Services (SSAS). Your best approach in this case, if orphan child records in the fact table are valid, is to create a record in the associated dimension with a value of unknown and during the extraction, transformation and loading (ETL) process, set the foreign key value in the fact table to point to the unknown dimension record.

My opinion might rub some readers the wrong way because many shops have ETL strategies that enforce referential integrity. They depend on the ETL applications in much of the same way as transactional applications enforce referential integrity in an OLTP database. However, as in the OLTP environment, all it takes is a single ETL process that fails to enforce referential integrity, and you’ve got the scenario I described in the preceding paragraph.

The other, more valid argument for not enforcing referential integrity at the data warehouse level is that ETL processes sometimes have to load data in such a way as to violate referential integrity. In such a situation, there’s no alternative to letting the ETL process enforce referential integrity. In this case, you must have strong standards and control over your ETL processes to prevent breaches of referential integrity.



ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...