SQL Server performance tuning is a large, complex topic. The key to successful performance tuning is breaking the job into smaller, simpler tasks. In Microsoft's SQL Server customer lab, the SQL Server Customer Advisory Team uses SQL Server waits and queues information as part of a repeatable methodologya performance-tuning "toolbox"to systematically track down and resolve customers' SQL Server performance problems.
We periodically bring customer applications into our SQL Server customer lab to evaluate application and database designs, measure performance, and make scalability recommendations. In only a week's time, we install the customer's application and databases, configure them for performance testing, and use the waits-and-queues methodology to pinpoint application bottlenecks, often revealing architecture and design problems. Although this methodology isn't a replacement for good physical design, proper indexing, and well-written SQL, it provides valuable insight into possible performance problems related to cache management, query-plan reuse, recompilation, transaction management, and resource utilization.
Together, waits and queues paint a detailed picture of an application's performance. SQL Server tracks wait statistics that include more than 50 reasons, called wait types, that a user connection might wait for a specific resource. You can see the wait information for user connections in the waittype and waittime columns in the master database's sysprocesses system table. SQL Server also aggregates waits across all user connections, providing a performance profile for a given workload. Thus, SQL Server wait types identify and categorize user (or thread) waits from an application-workload or user perspective.
Performance Monitor counters and other information sources provide the queues part of the performance methodology. When SQL Server executes an SQL statement for a user connection, it uses system resources such as memory, I/O, and CPU. If a request for system resources can't be immediately satisfied, the resource subsystem puts the request in a queue until the resources become available. Performance Monitor provides system counters that report the performance of these system resources.
To get useful information from the waits and queues performance data, you have to correlate wait types, Performance Monitor counters, and specific ratios of these counters to draw conclusions about performance and application bottlenecks. Here's how to evaluate the information you get from wait types.
Wait Types
Each user has an associated row in the master..sysprocesses system table. If a SQL Server user connection or thread isn't running for any reason, SQL Server records a wait type reflecting the reason and associated wait time in master..sysprocesses' waittype and waittime columns, respectively. Another sysprocesses column, lastwaittype, contains a character description of the last wait type for a given thread; SQL Server doesn't reset the lastwaittype description until another wait state occurs. Thus, a non-blank lastwaittype means the thread had at least one wait state. The waittype column is a varbinary column containing the current wait type. Together, a wait type of 0x0000 and a wait time of 0 means the thread isn't waitingit's running or runnable. For any blocked user connection, sysprocesses also contains a wait type showing that the connection's execution is blocked pending the release of locks. (For tips about finding blocking processes, see Kalen Delaney's article "Track Down Troublemakers," August 2003, InstantDoc ID 39453.)
By itself, the sysprocesses table isn't a great source of information for evaluating overall system performance for several reasons. First, user connections are transientthey come, they go, they change. Second, during its life, a user connection can wait many times for different reasons. And finally, because the waittime column in sysprocesses isn't cumulative, it reflects only the current (or last) wait. Fortunately, an undocumented DBCC command provides a source of wait information that can be very useful for our waits-and-queues methodology.
The DBCC SQLPERF(waitstats) command contains aggregate waits for all user connections. You can use this command to identify Performance Monitor counters that affect performance. This DBCC option reads the part of SQL Server memory that contains cumulative wait information without adversely affecting system performance. However, using DBCC SQLPERF(waitstats) for waits and queues analysis has two disadvantages. First, you can't see the relative importance or ranking of waits by percentage (I explain the usefulness of breaking down waits by percentage later). Second, DBCC SQLPERF(waitstats) returns wait types such as synchronization objects over which you have no control. Although SQL Server uses synchronization objects (such as the WAITFOR object in the track_ waitstats stored procedure that I discuss in a moment) internally to guarantee the integrity of the SQL Server process and user data, you should exclude them from waits analysis because they can skew your results. To overcome these disadvantages, I wrote a stored procedure called track_waitstats that ranks wait types by percentage and eliminates wait types such as synchronization objects that we have no control over.