Ratio of compilations, recompilations, and batch requests. You can use the Performance Monitor counters SQLServer:SQL Statistics:SQL Compilations/sec and SQLServer: SQL Statistics:SQL Re-compilations/sec to diagnose compilation and recompilation problems. SQL Compilations/sec includes both initial compiles and recompiles for all stored procedures. Because SQL Re-compilations/sec includes only recompiles, subtracting SQL Re-compilations/sec from SQL Compilations/sec yields the number of initial compiles, as the following formula shows:
Initial Compiles(%)=(SQL
Compilations/sec - SQL Re-
compilations/sec) /Batch
Requests/sec
If the number of initial compiles is low compared to recompilations, you might have a recompilation problem. In contrast, a low recompilation number means that most compilations are initial compilations and that SQL Server isn't effectively reusing query plans.
If initial query-plan creation is high (when compared to the number of batch requests), SQL Server might not be reusing query plans effectively. You can use the following formula to aid your analysis:
Plan Reuse (%) = (Batch
Requests/sec - Initial
Compiles/sec) / Batch
Requests/sec
If Batch Requests/sec are higher than Initial Compiles/sec, you have some plan reuse. If Batch Requests/sec equals Initial Compilations/sec, you have no plan reuse. You can see which query plans SQL Server reuses by looking in the usecounts column in master..syscacheobjects. A usecounts value greater than 1 indicates plan reuse for executable plans.
Disk Queue Length vs. Avg. Disk Seconds/transfer. The Performance Monitor counters PhysicalDisk:Avg Disk Queue Length and PhysicalDisk:Avg Disk Seconds/transfer provide measures of I/O subsystem performance. A queue results when SQL Server can't immediately satisfy requests for I/O. As disk queue length increases, so does the Avg Disk Seconds/transfer value. Although many people rely on queue-length information to determine whether an I/O subsystem bottleneck exists, the best source of bottleneck information is the Avg Disk Seconds/transfer counter because it factors in the effects of queues. From an end-user point of view, the average time it takes to complete a read or write is a true reflection of I/O performance. For example, if you bulk-load data into SQL Server and both the bulk-load file and SQL Server database file are on the same drive, you could get a sustained disk queue length of 1 and poor Disk Reads/sec and Disk Writes/sec performance. The queue length of 1 might not seem to be a problem; but the Disk Reads/sec value might be as high as 25ms, indicating an I/O subsystem bottleneck. Table 3 shows other effects SQL Server might have on I/O resources.
Low page-life expectancy with high checkpoint pages and lazywrites. You can use the Performance Monitor counters Page Life Expectancy, Checkpoint Pages/sec, and Lazywrites/sec to find out how long data pages stay in cache and how much cache activity (checkpoints and lazywrites) is required to keep memory available for current activities. Low page-life expectancy combined with high checkpoint pages and lazywrites causes memory pressure. To improve performance, implement one or more of the following changes:
- Add more memory to the Windows server.
- Increase SQL Server memory.
- Avoid table and index scans by using proper indexing.
- Eliminate bad query plans.
These interesting ratios can help you find ways to diagnose more complex performance problems. If a performance problem persists, you can keep extending your search by looking at other factors that might affect your system, such as network limitations. The sidebar "Diagnosing Network Problems" describes a method for determining how big a part network limitations play in performance.
Application and Database Design Problems
Application and database design affect performance in a variety of ways depending on the nature of your business and how you use your database. When you use the waits-and-queues methodology, you have to consider the design constraints of the application or database you're using. Table 4 describes some application-design implications.
For example, in an OLTP environment, plan reuse is typically desirable when you're performing highly predictable inventory lookups or order entry. In these cases, result sets are uniform and plan reuse is important for scalability. In other environments, such as those that extensively use ad hoc queries and reporting, plan reuse might not be desirable. Plan reuse isn't good when result-set sizes can vary greatly depending on parameter values or when intermediate work tables vary in size. For example, finding out how many widgets your company sold yesterday in California is different from finding the number of widgets sold last year, and you don't want to reuse the same query plan to find these two values.
Database design problems can range from improper indexing to overly normalized database designs in an OLTP environment. The kind of correlations you discover in your waits and queues analysis will depend on how you use your database.
The key to using the waits-and-queues methodology is to corroborate wait statistics with queues information to draw meaningful conclusions about performance problems. For example, a high percentage of a certain SQL Server wait type signals the need for further Performance Monitor investigation of the usage of underlying resources such as the processor, I/O subsystem, and network.
In some cases, an experienced performance expert must look beyond the symptom to find the root problem, and the "Wait Types, Perfmon Counters, and Correlations" document can help you pinpoint causes of performance problems. The document's correlated performance information, possible conclusions and actions, and interesting ratios and comparisons sections shed light on root problems for given symptoms. Correlating wait types information with performance counters and related counter ratios provides a broad view of application performance.