Error Logging
Because you can’t foresee all the conditions that will ultimately occur in a production environment, SSIS provides powerful and flexible logging capabilities to display information about a package after it’s been executed. The challenge is to log enough information to help you quickly diagnose and minimize the impact of problems that might occur in production.
Several error log providers let you specify where log messages will be written—to text files, SQL Server Profiler, a SQL Server table, the Windows event log, or XML files. You can indicate for which events log messages will be written and the amount of information that’s written. The type of events that trigger log messages can vary based on the Control Flow task type. For instance, a Data Flow task can log events such as OnPipelineRowsSent and PipelineInitialization.
To add logging to a package, click Logging on the SSIS menu and select one or more error log providers (i.e., log entry destinations), which enable you to write to a target destination. Select the check box next to the events you want to log, then click the Details tab and specify the events you want to log. Next, click Advanced to specify the columns to be logged, otherwise all the columns will be logged. After configuring logging, you can view log events in real time during development runs by right-clicking in the Control Flow designer and selecting Log Events.
Because the error log identifies each logged task by name, I recommend implementing a naming standard that uniquely identifies the running SSIS package. Depending on the events you choose to log, the error log can grow fairly rapidly. Be sure to log only the events that you need and occasionally prune old log entries. Because SSIS doesn’t include a process to do this out of the box, you must manually prune logs if your error log provider doesn’t provide the functionality to do so. For example, if the error log provider is configured to send log messages to SQL Server, rows in the msdb.dbo.sysdtslog90 table can be deleted after a specified period of time. You can also create logging configurations as templates to provide a consistent strategy across packages that perform similar functions and make log management easier.
Checkpoints
Checkpoints, another powerful SSIS feature, let you restart a failed package in production. SSIS package execution—especially extraction, transformation, and loading (ETL) package execution—can be lengthy. By its nature, ETL moves and transforms large amounts of data, which can be time-intensive. The failure of a package or task two hours into an ETL process could be catastrophic in a data warehouse system that’s required to be available by a set time. Checkpoints let you save the work that’s been accomplished so that when you restart the package after resolving the problem that caused it to fail, the process can pick up where it left off. Checkpoints aren’t enabled by default, however; they have to be turned on for each package.
Knowing how checkpoints work before you develop your packages is important because they can affect package design. Note that checkpoints can only be applied to Control Flow tasks. For example, a package can’t be restarted in the middle of a Data Flow task, which is considered to be a unit of work that’s either entirely successful or not. The lack of ability to checkpoint Data Flow tasks provides a good argument for componentizing packages to break logical tasks into Data Flow tasks or, ideally, to try to group packages together based on a Data Flow tasks’ packages. For example, you might modularize the packages by taking one big package and making into several smaller packages grouped together inside a controlling package. Then you could set a checkpoint on each module package.
Each package must be configured to be checkpointcapable on the Details tab of the Package window (shown in Figure 4), which can be accomplished in a SSIS process that includes parent and child packages. You can configure each package to create a checkpoint log to track execution information in case the package fails, as shown in Figure 4. If the entire process runs without error, the logs are removed when the ETL process is complete. Web Table 1 shows a high-level view of the properties you have to set for each package.
Debugging and Logging in SSIS
As you can see, SSIS provides more powerful debugging and logging capabilities than DTS, enabling you to debug packages more efficiently. Knowing when and how to use these tools can greatly reduce the time it takes to develop packages and resolve problems before and after the package is sent to production.