• subscribe
October 20, 2011 12:48 PM

SQL Server's Trace and Replay Tool

Identify performance problems before a SQL Server upgrade and hardware migration
SQL Server Pro
InstantDoc ID #139800

You finally received the approval to upgrade SQL Server and buy a new server. As you think about all the new SQL Server features that you want to take advantage and the benefits of having new hardware, your boss stops by. She asks, “Now that we can spend the money on this project, how are you going to be certain that we won’t have any issues? Are you sure that things will run faster on the new server? How will you know?” As your boss walks away, you break out in a cold sweat at the thought of the upgrade failing or performing poorly. “There must be an easy, low-cost way to predict performance,” you say to yourself as you start searching for a tool. The good news is that there is such a tool. You can use SQL Server Profiler’s replayable trace functionality.

As a consultant, I’m often asked to assist with SQL Server upgrades and hardware migrations. I’ve found that many IT shops just assume that a new version of SQL Server will perform better than the old one. What they fail to realize is that they might be running legacy code that isn’t efficient or won’t benefit from the new functionality in the new SQL Server version. In the case of a migration to new hardware, many system administrators will configure a new server the same as the old one, which can perpetuate poor server response. Unfortunately, the poor response isn’t realized until a load is placed on the new server. This often happens the day that users are allowed to access the new server. Failing to test new versions of SQL Server or new hardware can result in unpleasant surprises after the upgrade or migration.

Using Profiler’s replayable trace functionality (i.e., the trace replay template), you can test query compatibility and performance in a new SQL Server version or test a workload on new hardware. You can also use it to find T-SQL code that needs to be optimized or disks that need to be reconfigured. You can even use it as an inexpensive stress tester. Although there are other tools you can use for these tasks (e.g., RML Utilities, SQLIO, SQL Server Migration Assistant), they might not be immediately available for use on a new version of SQL Server. This isn’t an issue with the replayable trace functionality because it’s built into SQL Server. So, I’ll concentrate on how to use this powerful yet flexible tool. I’ll explain how the replayable trace process works, describe possible uses for it, and cover some “gotchas” that you need to be aware of.

 

The Process

Figure 1 outlines the replayable trace process in a graphical workflow. As you can see, you need to perform a backup, capture the workload, move the backup and trace files to the test server, perform the control run, perform a test run, and compare the performance data from those runs.

 

Figure 1: Understanding the replayable trace process
Figure 1: Understanding the replayable trace process

Performing the backup. You need to back up the database (or databases) from which you want to capture a workload. You can use any method as long as it creates a backup that you can use to restore the database back to the point right before the workload was captured. For example, you can perform a full backup right before capturing the workload or bring the last nightly full backup up-to-date with transaction log backups.

Capturing the workload. You need to capture the workload by collecting trace data. Because there are differences in the trace script generated by newer versions of SQL Server, it’s best to use the server-side trace script generated by your production SQL Server instance. For example, if your production server is running SQL Server 2005, then you should use a SQL Server 2005-generated trace script. Similarly, if your production server is running SQL Server 2008, then you should use a SQL Server 2008-generated trace script.

Moving the files. After the backups and trace files are created, you need to move them to the test server. The test server should be running the same software as the production server, including the same versions of SQL Server and Windows.

Performing the control run. On the test server, you need to first restore the database using the backup files. Then, you need to replay the workload from the trace files. This is referred to as the control run.

During the control run, you need to capture performance data. Ideally, you should capture this data from a remote client because collecting performance data on the same server that’s running the workload can skew the server’s performance. The performance data captured can be as simple as a SQL trace. Here are some SQL trace events that are often collected:

  • Cursors
  • Database
  • Lock:Cancel events and Deadlock-related events
  • Errors and warnings
  • Scans
  • Stored procedures (completed events)
  • Transactions
  • T-SQL (completed events)
  • Deprecation events (especially when upgrading)


ARTICLE TOOLS

Comments
  • pl800
    2 days ago
    May 22, 2012

    P. S. When Paul Randal sees that you've put fixed values for Page Life Expectancy thresholds...

  • pl800
    2 days ago
    May 22, 2012

    Lori,

    There is 1 problem here. SQL Server executes queries in the captured workload as fast as possible, and not at the same time intervals. Therefore any comparison between two servers with different hardware is not so relevant. Not only query execution times will be skewed, but most importantly the performance counters. The only reliable way to compare new hardware to old would be to do the control run on a intermediate test server that is the same as the production server hardware-wise, and do a test run on the new server (where DBs will be migrated to).

You must log on before posting a comment.

Are you a new visitor? Register Here