New SQL Server 2000 tools and tasks make transformations faster and easier
Since its introduction in SQL Server 7.0, Data Transformation Services (DTS) has helped many SQL Server professionals migrate data from OLE DB-compliant data sources to any compliant destination. But in SQL Server 2000, DTS has really grown up. If you've used DTS in SQL Server 7.0, you'll notice many enhancements in the new release. DTS is full of new features and tasks that help you transform your data and perform other tasks such as FTP and sending messages to queues. Let's look at how Microsoft has enhanced DTS in SQL Server 2000 through improved tasks and tools such as the Dynamic Properties task, the Execute Package task, and the multiphase data pump. You could cut your DTS development time in half after you begin using SQL Server 2000 because the new built-in features alleviate the need for many custom scripts.
Global Variables
Global variables do the same thing in DTS packages that local variables do in stored procedures: Global variables let you reuse a value in a DTS package multiple times. In SQL Server 7.0, you need complex ActiveX scripts to reuse values, but now every task in a package can use variables. Global variables' importance has increased greatly in SQL Server 2000 because you can use them in nearly every task to make your package dynamic. To make a package dynamic, you use a new task called the Dynamic Package task. Other tasks, such as the Execute SQL task, have been enhanced to support global variables as parameters in queries. When a task is working inside a package, that task can't easily communicate with other tasks or packages. With global variables, you can store in memory such values as a server name or a connection string, then reuse those values from various tasks inside your package.
By specifying the /A switch in SQL Server 2000, you can pass global variables into a package when you use the Dtsrun command-line utility to execute a package. You can later use these passed-in global variables to set connection strings in your package. As I show in a moment, this capability gives you a wonderful advantage when you combine it with several Dynamic Properties tasks. The following syntax shows the essential components for passing a global variable into a package:
DTSRUN /S ServerName /U LoginID /N
PackageName /P Password /A
GlobalVariableName:TypeID=Value
The /A switch requires the global variable's name, its value, and its TypeID, which is the global variable's data type. Table 1, page 34, shows an abbreviated list of common TypeIDs that DTS uses. For a complete list, see the entry for the Dtsrun utility in SQL Server Books Online (BOL).
For example, you might use the following syntax to pass a package a string global variable with the TypeID of 8:
DTSRUN /S "(local)" /N "Dynamic Properties"
/A "gvCatalog":"8"="Pubs.dbo
.DynamicPropertiestask" /E
(You can use the /E switch to use Windows authentication instead of SQL Server authentication so that you don't have to hard-code a login name and password. Instead, the package will execute with the credentials of your current login.) By passing a global variable into a package in this way, you can change which tables a package loads the data into. You can create a package to load data into multiple tables. You can also use a scripting language such as VBScript or JScript to programmatically pass global variables into a package.
New Tasks
In SQL Server 2000, Microsoft has more than doubled the number of DTS tasks that install by default. The new tasks remove several limitations that exist in SQL Server 7.0. For example, some of these limitations require DTS developers to program complex code inside an ActiveX Script task to execute a package inside a package or to change a package's properties dynamically. In SQL Server 2000, DTS has two new tasks, Dynamic Properties and Execute Package, that can do these things for you automatically.
After you spend weeks developing and stabilizing a DTS package, the last thing you want to do is start the entire process again for another client. Reusability is an important part of DTS in SQL Server 2000. Dynamic packages let you create a package once and then use it over and over again. A sample dynamic load might consist of a DTS package, PackageA, that uses the ActiveX Script task to look in a directory every 10 minutes for a file to upload from the mainframe. After PackageA finds the file, it moves the file to a client directory. Then another ActiveX Script task launches PackageB, which performs the extraction of data to SQL Server or another OLE DB-compliant data source. However, before you can execute PackageB, you must pass to PackageB the global variables that contain the path and filename to extract. In SQL Server 7.0, you would pass these variables entirely through the ActiveX Script task. In SQL Server 2000, the Dynamic Package task takes care of the dynamic pieces of this scenario.
So, why would you use two packages? Extraction processes can grow quite large. Most packages that I develop for handling extraction processes are larger than 1MB, but SQL Server provides no way of caching such large packages when they execute, and they could be slow if executed whole. You don't want a 1MB package slowing down the production server by executing every 10 minutes. Instead, you can execute PackageA (which in most cases would be 20KB to 50KB) and have it execute PackageB only when it finds the file.
After PackageB receives the necessary global variables and executes, several things must occur before the load procedure can begin. The properties of each connection and task must change to match the data that you're loading. In SQL Server 7.0, you have to code ActiveX Script tasks before each connection or task that you want to modify. This process leaves a lot of room for errors, and your testing cycle could last weeks or months as you debug each task. Another problem with using the ActiveX Script task to modify the properties of a connection or task in DTS is that when you need to change some code, you have to modify the code in each ActiveX Script task in every package where the code exists. Again, you have a lot of room for error. Let's look more closely at the new tasks.
Dynamic Properties task. To cut down on coding and thereby minimize errors, Microsoft added the Dynamic Properties task to DTS. With the assistance of this task, you don't have to create bulky ActiveX Script tasks to dynamically set a DTS property, such as a username that you use to establish a connection. This task lets you change the value of any nonidentifying property that is accessible through the DTS object model (e.g., non-name/ID properties of a step, connection, task, package, or global variable). What once took 3 weeks to stabilize, you can now write and stabilize in less than a day. Using the Dynamic Properties task gives you faster performance than writing the same process with an ActiveX Script task because DTS doesn't resolve the ActiveX Script task until runtime.
Execute Package task. The new Execute Package task helps you with the second part of the example. Rather than creating an ActiveX Script task to execute a package from within a package, you can use the Execute Package task to serve the same purpose. When you use this task, you can remove redundant pieces of your package. For example, if you have a standard auditing procedure, you can create a separate child package for the procedure and execute that package from multiple other packages whenever you need it. Then, when you need to update your auditing code, you don't have to update the code in every package. Instead, you can update just the child package. You can also put secure parts of your package, such as payroll transformations, into separate packages that you can protect with a user password.
With the Execute Package task, you can also make a child package join a transaction if one exists. Transactions help you ensure that all your data from multiple tasks is either loaded or rolled back. You can use transactions to help ensure your data's integrity: If one step in a package fails, you can roll back the entire transformation. To enable transactions, go to Workflow Properties by right-clicking any step and selecting Workflow, Workflow Properties. Then, select Join Transaction if Present on the Options tab to tell the step to join a transaction, if one exists. If no transaction exists, DTS creates a new one. You also need to ensure that the Use Transactions option is selected on the Advanced tab in DTS Package Properties. (This option is enabled by default.)
Inside the Execute Package task, you can easily pass global variables to the child package, as Figure 1 shows. If you click the Inner Package Global Variables tab in the Execute Package task, DTS scans the child package for any global variables that it expects, and you can set them to your own values. Inner global variables let you set the global variables for the child package from the Execute Package task. Inner global variables are useful when you're calling auditing packages and you want to pass to the child package certain errors based on events in the parent package. Outer global variables send global variables from the parent package to the child package. If a global variable exists in the child package, DTS updates the child's global variable to the parent's value. If the global variable doesn't exist in the child package, DTS creates it.
Message Queue task. The Message Queue task is an elegant way of passing string messages, files, or global variables between packages or other programs. The Message Queue task works with the Microsoft Message Queue Service (MSMQ) that ships with Windows 2000 or the MSMQ that installs on Windows NT 4.0 as a part of the Microsoft Windows NT 4.0 Option Pack. This task can send messages to a queue for another package or program to pick up and process later, letting programs that don't support OLE DB interface with DTS. Through the Message Queue task, you can also have several packages that are running in parallel on multiple servers check in to a queue upon completion. Meanwhile, the parent package waits for the child packages to complete before it executes a cleanup script. This process is useful when you want to scale-out a data-load procedure in which you're loading millions of records.
File Transfer Protocol task. The File Transfer Protocol task lets you receive files from an FTP site or another Universal Naming Convention (UNC) path. With SQL Server 7.0, you have to use a line-command FTP application or develop your own file-copying or FTP component to perform the same task. A drawback of the File Transfer Protocol task, however, is that it can't send files to a remote FTP site or directory. The File Transfer Protocol task can only receive files.
Microsoft added five other minor tasks to SQL Server 2000 that work behind the scenes automatically with the Copy Database Wizard but that you can also incorporate into your packages. With the Transfer Databases, Transfer Jobs, Transfer Logins, Transfer Error Messages, and Transfer Master Stored Procedures tasks, you can transfer objects stored in the Master database from any SQL Server 2000 or 7.0 instance to another SQL Server 2000 instance.
Prev. page  
[1]
2
next page