SideBar    Diagnosing Network Problems
DOWNLOAD THE CODE:
Download the Code 40925.zip

SQL Server performance tuning is a large, complex topic. The key to successful performance tuning is breaking the job into smaller, simpler tasks. In Microsoft's SQL Server customer lab, the SQL Server Customer Advisory Team uses SQL Server waits and queues information as part of a repeatable methodology—a performance-tuning "toolbox"—to systematically track down and resolve customers' SQL Server performance problems.

We periodically bring customer applications into our SQL Server customer lab to evaluate application and database designs, measure performance, and make scalability recommendations. In only a week's time, we install the customer's application and databases, configure them for performance testing, and use the waits-and-queues methodology to pinpoint application bottlenecks, often revealing architecture and design problems. Although this methodology isn't a replacement for good physical design, proper indexing, and well-written SQL, it provides valuable insight into possible performance problems related to cache management, query-plan reuse, recompilation, transaction management, and resource utilization.

Together, waits and queues paint a detailed picture of an application's performance. SQL Server tracks wait statistics that include more than 50 reasons, called wait types, that a user connection might wait for a specific resource. You can see the wait information for user connections in the waittype and waittime columns in the master database's sysprocesses system table. SQL Server also aggregates waits across all user connections, providing a performance profile for a given workload. Thus, SQL Server wait types identify and categorize user (or thread) waits from an application-workload or user perspective.

Performance Monitor counters and other information sources provide the queues part of the performance methodology. When SQL Server executes an SQL statement for a user connection, it uses system resources such as memory, I/O, and CPU. If a request for system resources can't be immediately satisfied, the resource subsystem puts the request in a queue until the resources become available. Performance Monitor provides system counters that report the performance of these system resources.

To get useful information from the waits and queues performance data, you have to correlate wait types, Performance Monitor counters, and specific ratios of these counters to draw conclusions about performance and application bottlenecks. Here's how to evaluate the information you get from wait types.

Wait Types
Each user has an associated row in the master..sysprocesses system table. If a SQL Server user connection or thread isn't running for any reason, SQL Server records a wait type reflecting the reason and associated wait time in master..sysprocesses' waittype and waittime columns, respectively. Another sysprocesses column, lastwaittype, contains a character description of the last wait type for a given thread; SQL Server doesn't reset the lastwaittype description until another wait state occurs. Thus, a non-blank lastwaittype means the thread had at least one wait state. The waittype column is a varbinary column containing the current wait type. Together, a wait type of 0x0000 and a wait time of 0 means the thread isn't waiting—it's running or runnable. For any blocked user connection, sysprocesses also contains a wait type showing that the connection's execution is blocked pending the release of locks. (For tips about finding blocking processes, see Kalen Delaney's article "Track Down Troublemakers," August 2003, InstantDoc ID 39453.)

By itself, the sysprocesses table isn't a great source of information for evaluating overall system performance for several reasons. First, user connections are transient—they come, they go, they change. Second, during its life, a user connection can wait many times for different reasons. And finally, because the waittime column in sysprocesses isn't cumulative, it reflects only the current (or last) wait. Fortunately, an undocumented DBCC command provides a source of wait information that can be very useful for our waits-and-queues methodology.

The DBCC SQLPERF(waitstats) command contains aggregate waits for all user connections. You can use this command to identify Performance Monitor counters that affect performance. This DBCC option reads the part of SQL Server memory that contains cumulative wait information without adversely affecting system performance. However, using DBCC SQLPERF(waitstats) for waits and queues analysis has two disadvantages. First, you can't see the relative importance or ranking of waits by percentage (I explain the usefulness of breaking down waits by percentage later). Second, DBCC SQLPERF(waitstats) returns wait types such as synchronization objects over which you have no control. Although SQL Server uses synchronization objects (such as the WAITFOR object in the track_ waitstats stored procedure that I discuss in a moment) internally to guarantee the integrity of the SQL Server process and user data, you should exclude them from waits analysis because they can skew your results. To overcome these disadvantages, I wrote a stored procedure called track_waitstats that ranks wait types by percentage and eliminates wait types such as synchronization objects that we have no control over.

   Prev. page   [1] 2 3     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

Great artiI was reading the latest article Opening Microsoft's Performance-Tuning Toolbox and was trying to locate the document "Wait Types, Perfmon Counters and Correlations" which is specified in the article. I cannot locate this document. Can you please tell me how I can get it?

Thank you.

Pete Williams

Thank you Tom for a very interesting article. I am using your methodology whenever we have a perfromance problem. I hope in future articles you' ll show us more of the tools/scripts in the "SQL Server Customer Advisory Team" tool box.

Cheers

Rachid

rachid.ould-hamouda

Excellent article. It has helped me determine which items are not responsible for application performance issues we are experiencing.

Matthew Robinette

Excellent article. It has helped me determine which items are not responsible for application performance issues we are experiencing.

In virtually every issue of SqlMag there is at least one article that pertains to an issue I am currently experiencing.

Thanks for the help!

Matthew Robinette

The article gives relative ratios of various counters (A > B is good) but how does 2/1 compare to 10/1? More precision please.

Mark Pollina

Could you give me the syntax for run the track_waitlist stored procedure?

thanks, jodi

Jodi Pritchard

Very relevant and proffessional

tom parnell

Wonderful! Finally after all these years Microsoft engineer(s) see the light. Wait event based tuning is there for almost ten years in the Oracle world. Tom gives a very good explanation about the how's and why's, and the first proper 'wait event' documentation ever! I hope this will eventually lead to a per session wait interface, and even more detailed documentation.

Mario Broodbakker, The Netherlands

It is a great article and as Mario pointed out it is in right direction. I hope to see session level waitstats capture feature in SQL 2005, together with object level IO Statistics.

In this article however, I have a question about waitstat capture procedure. It captures by default data 10 times every 10 minutes, however the report produced only prints detail about the last sample. The waits can't be summed from the first sampling to the last, it needs dbcc sqlperf (waitstats,clear) within the sampling loop to be more useful.

skhanal

Article Rating 4 out of 5