• subscribe
December 20, 1999 11:22 AM

DTS Scripts and Tricks

SQL Server Pro
InstantDoc ID #7836
Downloads
7836.zip

DTS Threading Model
When developing with DTS and VB, you need to be aware of a subtle but potentially show-stopping difference in the threading models of the two tools. Briefly, threads let multiple tasks execute within one process. For example, in a multithreaded application, one thread might handle disk I/O while another handles network traffic. Multithreading allows parallelism within an application. In November 1999 ("DTS Error Handling Revealed"), we discussed the importance of choosing thread-safe data providers that fully implement the OLE DB Service Components interface, and the ramifications of using non-thread-safe providers. This warning extends to developers who are mixing DTS with VB.

The problem with mixing DTS and VB is that DTS is free-threaded (a variation of multithreading), whereas VB is apartment-threaded (a concept born of the COM architecture). The exact differences between the two and the definition of apartments are topics well beyond the scope of this article. For further reading on COM threading models, please refer to the Microsoft Developer Network (MSDN) Online Library under the topic Platform SDK/COM/COM Fundamentals/ Processes, Apartments, and Threads (http://msdn.microsoft.com/ library/psdk/com/aptnthrd_8po3.htm). However, the differences are serious enough that not handling them appropriately within your projects will lead to access violations at runtime. What makes this compatibility issue hard to diagnose is that Enterprise Manager and the Package Designer are free-threaded. Therefore, packages executing through the Package Designer won't experience the problem. Executing under VB, however, often generates this error:

Run-Time Error -2177221499 (80040005)

Provider generated code execution exception:
     EXCEPTION_ACCESS_VIOLATION

So, how do you avoid this access violation? It's simple. For any package that executes through VB or any package that references a custom task developed with VB, DTS is forced to run within the main thread. Programmatically, setting the step object's ExecuteOnMainThread attribute to TRUE does the same thing. Screen 8 shows how the Options tab of the Workflow Properties dialog box exposes this attribute within the Package Designer. The workflow properties belong to this package's Data Transform task. It's important to note that enabling the ExecuteOnMainThread attribute forces the associated tasks to run in serial rather than parallel, resulting in possible performance degradation (the performance degradation could be significant on an SMP machine).

Changing Columns into Rows
In relational database design, the first rule of normalization is to eliminate repeating groups. In an ideal world, all databases would follow this rule. However, in the real world, not all databases are relational and, even within many relational databases, repeating groups exist. This fact leaves database developers, especially in the data warehousing community, with the challenge of transforming repeating groups into normalized, relational tables, or to put it another way, splitting columns into rows.

At first glance, DTS might not seem to be a good candidate for solving this normalization problem. After all, as we've noted in several previous articles, transformation tasks are limited to a single operation or statement against the destination data store. To solve this problem, you'd potentially need many operations to properly convert one source row into several destination rows. DTS offers two possible solutions. The first solution uses a Data-Driven Query task with a stored procedure. With the data-driven query, DTS sends the entire source row to the server where the stored procedure resides, and then the stored procedure maps the columns into their appropriate INSERT statements. "The DTS Development Guide" (July 1999) covered the use of data-driven queries, so we won't go into detail here. Instead, we'll focus on the second solution to the splitter problem, using a transformation task's DTSTransformationStatus constants to control the flow of processing.

Within each transformation, you can control when the DataPump moves to the next row. By manipulating the DTSTransformationStatus value, you can force the transformation script to process the same row multiple times, thereby letting the transformation generate multiple INSERT statements from the same source row.

The following example illustrates this idea: At a hospital, multiple physicians are associated with a visit. The SourceData table, whose schema you see in Listing 1, represents the source system that supplies the data. The VisitPhysician table, whose schema is also in Listing 1, represents the destination in the warehouse.

First, build a package containing two connections, the first representing the sourcesystem and the second representing the warehouse. Next, add a Transform Task between the two connections and define a new ActiveX Script to handle the transformation. By default, DTS maps each source column to a destination column in column order. Before you add the transformation, you must delete the default column mappings by selecting and deleting each transformation line between the source and destination tables. Now, add the new transformation. As Screen 9 shows, when you're defining the transformation, select all columns in the source and destination tables and click New to create a new transformation of type ActiveX Script. Clicking New opens the ActiveX Script Transformation Properties dialog box. Place the VB Script from Listing 4 into this dialog box to implement the splitter.

In Listing 4, before the Main function declaration, establish and initialize a variable, nCounter, to keep track of where the code is in processing the row. The value 4 represents the number of output rows per source row. The function Main() begins by checking whether any rows are left to be inserted. When the function returns true, it first sets its return status to DTSTransformStat_SkipFetch. Here's the key to the splitter's functionality: By setting the status to SkipFetch, the transformation tells the DataPump not to get a new row from the source connection, but instead to resubmit the current row for additional processing. If nCounter equals 0, the transformation resets the nCounter value to 4 and sets the status to DTSTransformStat_SkipInsert. This time the transformation tells the DataPump not to write anything to the destination, but to move on to the next source row. The remainder of the function determines which PhysicianType to insert and sets the destination columns appropriately.

Using the data-driven query approach to solve this normalization problem offers a definite performance advantage over implementing the splitter by limiting the number of calls to the database. However, the splitter offers the flexibility of leveraging additional scripting and error-handling capabilities that aren't available to Transact SQL (T-SQL). Besides the splitter, you can use a similar technique to implement PivotTable-like functionality. A PivotTable goes in the opposite direction of the splitter—that is, where the splitter turns columns into rows, the PivotTable turns rows into columns. But we'll leave that example for a future article.

Being able to leverage common design and coding patterns is one of the hallmarks of a good developer. This month, we touched on several common scripting patterns that you can use on your DTS projects, but we've only scratched the surface. Send us email and tell us about your scripting tricks. We're always looking to expand our knowledge base and share that information with you.



ARTICLE TOOLS

Comments
  • Anonymous User
    8 years ago
    Nov 03, 2004

    To the Anonymous User, you should use global variables to set all values that are going to change when you move the package. Then, you simply have to update the global variables to whenever/wherever you move it.

  • Anonymous User
    8 years ago
    Oct 27, 2004

    I want to create a DTS package in the development SQL Server and move it to Production SQL Server later. How can i change the SQL Server Name, Source and Destination names used in the package with out much intervention from the Production DB Administrator.? Is there any way I can create some scripts like SQL Stored Procedure and edit it later.

  • Rohan Perera
    8 years ago
    Jun 05, 2004

    This article is very usefull for me. Thanks

  • Richard Burton
    9 years ago
    Oct 17, 2003

    Hi

    I'm searching the web to try and find any information on the 'Parameters' value in DTS package task 'Execute Process.

    So far I havent' found much, the batch job I'm calling from DTS is failing and so is my knowledge..!

    Cheers

  • Sambasiva Darbha
    11 years ago
    Jun 26, 2001

    Hi

    I have been trying for quite sometime now, to execute a DTS Package saved on my server under Local packages, from the command prompt using a batch file. It doesn't execute. It opensup a separate window but it doesn't execute. The syntax i had used to get it executed is as follows

    cmd /c \\\\abcdefghi\\MSSQL7\\binn\\dtsrun /S abcdefghi /E /N P1

    I had tried all combinations of flags for instance , saving it as a file and running and using USERID etc. But it doesn't seem to work.

    The server configuration we have is as follows..
    It's a Winnt CLUSTER of 2 servers , which i am accessing through their virtual Names. Some one told me that you cannot invoke the DTCRUN application on a virtual server. If this is true how do you schedule the package. Please help.

You must log on before posting a comment.

Are you a new visitor? Register Here