DTS takes the movement of data to a new level

SQL Server 7.0 includes many new tools to support enterprise database systems. Data Transformation Services (DTS) is one of the most exciting new offerings because it helps developers and database administrators (DBAs).

Microsoft recognized the difficulty of using bulk copy program (bcp) to move data into and out of SQL Server and created DTS as a solution. DTS fulfills Microsoft's need for a fast, robust, extensible architecture for moving data out of traditional relational databases and into Microsoft's new online analytical processing (OLAP) engine (i.e., SQL Server OLAP Services, formerly code-named Plato).

The DTS utility service features a highly graphical, wizard-based interface; improved performance; and programmable extensibility through a rich set of publicly available COM interfaces. Also, the new service provides an excellent facility for migrating data from one database to another. DTS provides import, export, and data transformation services between Object Linking and Embedding Database (OLE DB), Open Database Connectivity (ODBC), or ASCII data sources. In addition, the new service satisfies data warehousing needs such as data extracting, scrubbing, validating, aggregating, and summarizing.

DTS takes data migration to a new level by providing the means to import, export, and transform data between multiple sources—interactively and programmatically—through COM interfaces. DTS differs from its predecessors, such as bcp and replication, by being able to interact with non-relational data and to programmatically transform data. The new service lets users interact with most existing data sources—not just those using relational technology—and eliminates much of the complexity that bcp and replication introduced. DTS removes the need for intermediary tables; transformation code now becomes part of the transfer process. This change minimizes the number of processing steps, which reduces the likelihood of failure and ensures a high-performance data transfer.

DTS uses OLE DB technology to support non-relational databases; OLE DB, Microsoft's solution for universal data access (UDA), defines a set of COM interfaces (called data consumers and data providers) for accessing and offering any type of data. SQL Server 7.0 is a native OLE DB provider. DTS is an OLE DB consumer. Additionally, with SQL Server 7.0, Microsoft provides OLE DB interfaces to Oracle 7.3+, Microsoft Jet, and ODBC sources. Microsoft envisions that OLE DB will eventually replace ODBC as the industry standard for accessing data.

Using OLE DB, DTS gains independence. Acting as an OLE DB consumer, DTS does not require SQL Server. Furthermore, the new service interacts with any database (including SQL Server 4.2 and 6.x servers) with a compatible ODBC driver. (Test your configuration for compatibility before you commit to a specific driver.)

DTS provides a robust user interface for developing and administering transformation tasks. For ad hoc transformations and simple transfers, DTS offers import and export wizards to assist users in developing their transformation tasks.

For more complex task development, DTS provides DTS Package Designer, a development tool. From within the designer, you can define multiple data sources and targets and create elaborate transformation scripts using any COM-based scripting language. Additionally, you can define packages, database tasks, and external applications for the system to execute, and you coordinate these tasks using built-in scheduling and workflow functionality.

Evolution
DTS has evolved from the idea of a dedicated data pump between SQL Server and SQL Server OLAP Services and has become an independent data service that can extract, transform, and manage data movement between multiple heterogeneous data sources. (DTS does not run like an NT service. Instead, DTS runs as a COM server. Also, it can run as an external application via dtsrun.exe.) The evolution of DTS means it can perform roles in many different projects. For example, DTS might replace native replication to move data back and forth between two vendors' data stores, such as an Oracle database running under NT and an Informix database running under UNIX. In another example, DTS might scrub and validate text data from a legacy system before loading the data into a Web-based database application.

Although DBAs and developers will find many such uses for DTS, Microsoft has designated DTS as an essential component in its Data Warehousing Framework. DTS is part of the warehousing framework diagram in Figure 1. In data warehousing, developers face three main tasks—data acquisition, data transformation, and data presentation. Microsoft is positioning DTS as the tool for automating data acquisition and data transformation. In addition, DTS provides scripting, task scheduling, and workflow tools to manage such jobs and handle exceptions that might arise.

Data acquisition is the task of getting data from any data source and feeding it into a warehouse. Acting as an OLE DB consumer, DTS extracts data from any data source that provides a native OLE DB or ODBC interface. For example, a company might host an inventory tracking system in a DB2 database running on a mainframe. The company's accounting system might be on an Oracle database running under UNIX, and the company's human resources data might be in a SQL Server database. To pull all this information into a SQL Server warehouse, the developer uses the DTS Package Designer to create OLE DB connections to each data source. The developer then uses the data source connections to map source data fields to destination data fields in the SQL Server warehouse.

Data transformation is a set of tasks for scrubbing, validating, aggregating, and summarizing data after its extraction and before its arrival at a target storage location. DTS provides this functionality through scripting capability that accesses COM interfaces.

DTS offers an infrastructure for managing acquisition and transformation tasks. DTS steps define the logic flow of a transformation; each step has an associated task, and the task reports its result (i.e., success or failure, with more detail available). DTS lets you schedule and execute steps serially, in parallel, or according to precedence—providing flexibility and performance in logic flow. For example, if an entire transformation process consists of six steps, you can start the first three steps in parallel, and have the fourth step start after successful completion of the first three. The remaining two steps can begin in parallel after the fourth step has successfully completed. In addition, you can conditionally execute steps. For example, after the first step returns a status code, you can have DTS use the status code to dynamically determine whether to execute step two or step three. DTS scheduling and workflow facilities fill a huge gap in most enterprise-warehousing solutions.

   Prev. page   [1] 2     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Good

Rabi Pattnaik

Your Comments (required):it is gr8!!! it reall y helped me out,.. thanks.....

sunil vadgama

This article is very nice but still if author focus on parallel processing through DTS package, may help me lots.

Darsh

Article Rating 4 out of 5