Executive Summary:
The debugging and logging capabilities in SQL Server Integration Services (SSIS) are greatly improved over those found in DTS. In SQL Server Integration Services (SSIS), you can debug packages, Control Flow tasks, and Data Flow tasks. SQL Server Integration Services (SSIS) also includes logging capabilities that display information about a package after it's been executed.
|
It’s been more than two years since SQL Server
2005 shipped with SQL Server Integration Services
(SSIS). However, many companies still
haven’t converted their DTS packages to SSIS, possibly
because the migration process can be painful or they
don’t have the time to learn about a new product.
Those of you who have made the conversion know
that SSIS definitely isn’t the “next version of DTS.”
Debugging and logging are just two of many areas
that have undergone a complete overhaul in SSIS. Let’s
take a high-level look at SSIS’s debugging and logging
capabilities. If you’re among those who are still using
DTS, prepare to be impressed.
Debugging Packages During Development
SSIS is far ahead of DTS in the area of debugging,
which is the process of identifying errors that prevent a
package from being executed or producing the desired
results. In DTS,
debugging typically
involves using
MsgBox statements
or the VBScript
Stop command to
simulate a breakpoint in scripts. Such statements had
to be removed from the package before it was put into
production. In contrast, debugging is built into SSIS, and
nothing has to be removed when the package is moved
to a production environment.
The problem with SSIS’s debugging tools is that
they aren’t consistent across all types of tasks, so
knowing when to use what tools can be a big help. Let’s
look at SSIS’s debugging capabilities at the package
level, followed by debugging within Control Flow tasks
and Data Flow tasks.
Package-Level Debugging
During the development process, SSIS provides red or yellow icons within the SSIS designer that tell you
when something is inherently wrong with the package.
To view the message associated with an icon in a collection
of packages, hover your mouse over the icon,
as Figure 1 shows. These messages are typically related
to data-source connections or data-type problems.
Because it can take time to sift through the messages
in the Output pane (which is located below the Error
List pane in the SSIS designer) during debugging to
see what caused a package to fail, SSIS lets you click
the Progress tab from the designer to view an outline
structure of the package, as Figure 2 shows,
and see where the failure took place.
When you debug a package in a Business Intelligence
Development Studio environment, you can
see which task is running and how far it’s progressed
by the background
color of the task:
Yellow indicates
that the task is
running, green
indicates that the
task completed
successfully, and
red indicates
that the task
completed with
errors. Next to
Data Flow tasks,
you also get
a count of the
rows that have
been processed
by the task. You
might notice that
SSIS often runs
multiple tasks simultaneously, whereas DTS runs only
one task at a time. The pipeline nature of SSIS is one of
the core architectural differences between it and DTS.
Debugging Control Flow Tasks
Control Flow tasks control the flow of the package.
SSIS’s debugging tools for Control Flow tasks closely
resemble those available in any respectable development
environment. Breakpoints and the Debug windows
can be especially helpful in debugging Control
Flow tasks.
Breakpoints tell SSIS to pause execution at the
indicated point in the package. When processing
encounters a breakpoint in Debug mode, processing pauses, and the Debug windows give you access to
additional information such as variable values, call
stacks, and status messages. You can set breakpoints at
the package, container, task, or Script task level. You
can even set breakpoints to pause after a given number
of encounters with an event, such as on the tenth iteration
of a task in a For Loop construct. Figure 3 shows
how you can interrogate an object to reveal its property
values during a breakpoint session.
I find the Locals window, which is accessible from
the Debug/Windows/Locals menu item, to be the most
useful of SSIS’s Debug windows. It displays values
for all the variables in the package and even lets you
modify variable values, which can be useful when skipping
ahead to a later point in the package or simulating
certain conditions during package development.
Debugging Data Flow Tasks
Data Flow tasks control how and when data is manipulated
in the package. The primary debugging tool
available in Data Flow tasks is the data viewer. I use
the data viewer during the development of Data Flow
tasks to see what the data looks like in the pipeline as it
flows from one task to another—usually just before the
task that actually loads the data into its destination.
You can add a data viewer between two Data Flow
tasks by right-clicking the connector between the tasks,
selecting Data Viewers, and then selecting the data
viewer you want to use. You can view the data in a grid,
chart, scatter plot, or histogram. I usually view data in
a grid, as shown in Web Figure 1, but I recommend playing with
all four data viewers to get a feel for when you should
use each one. Although I typically remove my data
viewers before deploying a package in production, you
don’t have to do so.
You can modify columns displayed by a data viewer
either as you set up the data viewer or after setup. To
modify a data viewer after you’ve created it, right-click
the connector, select Data Viewers, highlight the data
viewer, then click Configure.
Another Data Flow task debugging technique that
I use frequently during package development and
debugging is the RowCount task. The RowCount task
relates only to Data Flow tasks. The RowCount task
isn’t usually billed as a debugging tool but can be quite
useful as one. The RowCount task simply counts the
rows passed through the pipeline and puts the final row
count into a variable. I almost always use the RowCount
task as my initial data destination because it serves as a
way of examining the data, via a data viewer, without
actually loading the data anywhere. Because the Row-
Count task carries no measurable overhead, it can also
be used for baselining or to diagnose performance problems.
For more information about using the RowCount
task, see the Microsoft white paper “Integration Services:
Performance Tuning Techniques” (www.microsoft.com/technet/prodtechnol/sql/2005/ssisperf.mspx).
Continued on page 2