DOWNLOAD THE CODE:
Download the Code 8029.zip

Extensibility. Extensibility is an important repository feature that lets users add information to the models to cover topics specific to their organization or tool. Using the Microsoft Repository software development kit (SDK), which is available at the Repository Web site (for this and other resources, see "For More Information"), you can extend the information models that the Repository uses. If your organization needs to store information about locations that aren't part of the base OIM, you can extend the model to add a new object and corresponding relationships to track that information. Sites can also add new properties to track information for existing objects (e.g., adding a new attribute to a table).

Integration with the Microsoft Data-Warehousing Framework
The Repository is a component of Microsoft's data-warehousing framework and an important part of its SQL Server strategy. Microsoft adds value for its database customers by bundling warehouse tools with SQL Server 7.0, such as Data Transformation Services (DTS), OLAP services, English Query, and the Repository. You can also access repository data through the SQL Server Enterprise Manager, as Screen 1 shows, by clicking the Metadata node under Data Transformation Services. From there, you can access the relational schema information and package and lineage information. You can explore the relationships between columns and the packages that update them.

Data Transformation Services. DTS is the extraction, transformation, and loading (ETL) tool that comes with SQL Server 7.0. Its flexibility and range of features help you populate a warehouse. You can save packages directly into the Repository. One of DTS's more compelling features, lineage, requires the Repository. Lineage lets you track how data in the warehouse was calculated and when it entered the warehouse.

To save DTS information into the Microsoft Repository, choose SQL Server Repository as the location for saving the package. Use the Advanced tab on the Package Properties to set the scanning options, which Screen 2 shows. Doing so causes DTS to call the OLE DB scanner to load all source and target catalogs into the Repository. If you don't set the scanning options, DTS creates DTS Local Catalogs as the reference for all source and target catalogs, which can make locating the databases impossible. Each subsequent save replicates this reference, so you can't keep comments and other descriptive information updated.

You can run into problems when you try to save certain DTS transformations to a repository. If you use a script to perform a simple transformation and you choose the source columns explicitly (not from a query), all the transformation data is captured, as you can see in the transformation model in "The Open Information Model." If you choose a query as the transformation source, that source becomes objects that aren't part of the OLE DB imported data. This choice makes following the connection back to the true source objects difficult. Also, the query isn't parsed to create a connection between the query columns and the columns you select the data from. So in many cases, the connection between source and target is available, but in some, it isn't. You can solve these problems by writing a program to resolve the references in a repository or by using a custom model along with the DTS model to store the source target mappings.

DTS also uses versioning for the package object, but it replicates all the subordinate objects with each save. You can then go back to any version of a package to see exactly how the data was transformed. This capability is important for tracking down problems with data months (or years) after it entered the warehouse. This versioning scheme won't cause many problems if you use the appropriate scanning options to import the relational schemas, which will then be appropriately versioned for continuity.

OLAP Services. OLAP Services gives you multi-dimensional analysis for warehouse data. By using a utility in the Repository SDK, you can import all the definitions for the OLAP data into a repository. With SQL Server 2000, OLAP Services stores meta data directly in the repository. One problem with the utility in SQL Server 7.0 is that it doesn't connect the measures and dimensions to the underlying columns they're based on. Making these connections can be a tedious manual task, or you can write a program to make the connections automatically. The information about which columns are used is available in the repository model, so you can easily retrieve the data.

English Query. English Query lets you define a semantic model for a database, then translates English phrases into SQL. The semantic model provides information about a system and how it is used, and can be valuable as you use the warehouse. The Repository SDK contains a utility to import data from English Query into a repository. SQL Server 2000 has an option within English Query to export and import models from a repository.

OLE DB (relational schema). Importing relational schemas into a repository gives you the base set of information necessary to begin documenting the warehouse. You can import database schemas from any OLE DB- or ODBC-compliant data source. To run the import from Enterprise Manager, right-click the Metadata tab under Data Transformation Services and choose Import Metadata. This import utility uses full versioning as it loads the data, so you can easily understand the changes to the database schemas over time. Because the utility uses versioning, it preserves all the descriptive information, such as comments, and rescanning the catalog doesn't affect the information.

Example Usage Scenarios
The following usage scenarios demonstrate how end users and warehouse analysts can benefit from the data in the Microsoft Repository. For most organizations, the end users benefit most from the assurance that they have the appropriate information and an understanding of the data they're viewing.

Following a data path from OLAP cube to source system. A business user is using an OLAP client to view data and needs to better understand how the data was calculated and where it came from. The user is familiar with the company's current operational systems, but isn't familiar with the warehouse. The organization has undergone numerous mergers and acquisitions, so several systems that populate the warehouse originated in different companies.

The user starts with the OLAP client that is displaying sales data from a cube. To get more information about what the data means and where it came from, the user could obtain from the Repository descriptive information about the cube and its measures and dimensions. This description would be more informative than the simple labels assigned to each item. For Total Sales, the description might note that returns don't appear until the end-of-quarter processing, or that the regions changed in 1997, so region comparisons before that year will return invalid results. The user can then follow the path from a particular measure or dimension to the warehouse, then through the transformations to the source systems. If users are interested in data from a specific source system, they can use the Repository to determine which cubes or reports display data from that system. Descriptions of the source systems, the transformations that moved the data, and related information are also available. This information gives users a frame of reference in terms of the current business, not just the warehouse.

Using DTS lineage to find source data and transformations. In another case, a user is viewing data from a report or cube and wants to know exactly where the data came from. This environment has seven source systems, each of which uses a different package to populate the warehouse. Using the lineage feature of DTS, the user can determine which package a system used to populate the warehouse with a given row. From a particular report row, the user can determine the lineage IDs for that warehouse data, then query the Repository to find out which packages from which source systems populated that data. This process gives the user a detailed analysis of the data source. An experienced warehouse analyst can also use this information to see the exact transformation package that populated that data, even if the package changed after populating the data.

Make the Best Decision
The Microsoft Repository and its integration with other warehousing tools give you the kind of information that's impossible to obtain in a typical warehousing environment, which stores designs and transformations primarily in spreadsheets. The goal of warehousing is to give as much information to as many users as possible, so that they can make business decisions based on the best information.

But without a means to understand and describe that data, users might get too much information or not trust the data and could base decisions on a faulty interpretation of the data's meaning. Repository technology can play a significant role in the delivery of warehouse information to end users, supplying consistent descriptions and a road map of the data being viewed. Microsoft and other vendors provide significant integration with data warehousing tools to make meta data easier to use.

End of Article

Prev. page     1 [2]     next page -->



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

dear all I want to install microsoft repository in my computer but i don't know which package or which software included it regards massoomy

repository

 
 

ADS BY GOOGLE