• subscribe
November 18, 1999 02:35 PM

Tuning SQL Server 6.5 for High-Performance Queries

SQL Server Pro
InstantDoc ID #7630
Reduce I/O and query runtime

As a DBA, you probably tune your database server so that it quickly and efficiently processes queries that applications send. Whether your organization purchases applications from another company or writes applications in-house, you might tend to see the program as a black box that can't be modified. You can conduct performance tuning at the server level by improving disk I/O, increasing memory, and creating or modifying indexes. Ultimately, though, the application's performance depends on the database's design and the queries written for it. Database application developers who understand how SQL Server optimizes and processes queries usually have the best-performing software and the fewest surprises as their applications scale from small installations to large ones. With a few simple tweaks of your SQL Server 6.5 database and queries, you can get optimal performance.

But what about SQL Server 7.0? Because SQL Server 7.0 automatically tunes most of its internal settings, its performance-tuning methodology differs significantly from the methods that you use to tune SQL Server 6.5. The information in this article about selecting indexes and minimizing I/O in SQL Server 6.5 also applies to 7.0, but most of the other suggestions aren't as effective in SQL Server 7.0 because much of its architecture is completely new. You can get more information at http://www.microsoft.com/ sql/productinfo/perftuning.htm, which has several white papers on performance tuning.

What Am I Optimizing?
SQL Server 6.5 uses a cost-based query optimizer. The biggest cost in most queries is I/O operations involving the disk. Because the hard disk is hundreds of times slower than RAM, anything you can do to limit hard-disk use is good. At a basic level, therefore, you're first trying to optimize physical I/O—a page read from the hard disk—and second, logical I/O—a page read from memory.

To optimize your server, first make sure that you allocate as much RAM as possible to SQL Server. In all cases, a logical read is many times faster than a physical read. In an ideal situation, you have enough RAM to cache your entire database. Unfortunately, you can't stop at minimizing physical reads. Even a fast logical read takes some time, and thousands, or even millions, of reads take much more time.

As an example, consider this simple two-table join.

SELECT * FROM Employees a INNER JOIN
   Payroll_Checks b ON (a.empID = b.empID)

One of my clients asked me why this query was running so slowly. After I turned on SHOWPLAN and looked at the query plan, the answer was obvious: The empID column in the Payroll_Checks table had no index. The Employees table had about 10,000 rows and the Payroll_Checks table had about 750,000 rows. Without the index, SQL Server 6.5 performed a table scan on Payroll_Checks 10,000 times. When I canceled the query, the server had already performed 15 million logical I/O operations. Putting an index on the empID column dropped processing time to seconds and logical I/O operations to approximately 750,000.

The Magic Row Sizes
The first step in minimizing I/O is to ensure that the row is as compact as possible. Rows can't span pages in SQL Server, and each page in SQL Server 6.5 can store a maximum of 32 bytes for the row header and 2016 bytes for the data. Each data row also has overhead in the row. The maximum size that any single row can be is 1962 bytes, including its overhead bytes. This limit is imposed so that a row being inserted or deleted can also fit on a transaction log page. Although a single row can't exceed 1962 bytes, two rows can take the whole 2016 bytes available for data on a page. So certain row sizes can help you decrease I/O by large percentages. For example, if your rows are 1009 bytes long, only one row will fit on a page. If you shorten the row size by one byte, two rows can fit on a page. You cut the I/O for a table scan in half simply by removing one byte! Similar situations exist for 673 bytes, 505 bytes, 404 bytes, etc. If you can keep the row size under these amounts, you'll lower I/O by 33 percent, 25 percent, and 20 percent, respectively.

All rows have some overhead bytes, so you need to take those bytes into consideration when you make your calculations. Note that variable-length rows have more overhead than fixed-length rows. To find out whether you have wasted space on a page, run DBCC SHOWCONTIG to determine the average page density and average number of free bytes per page. The most likely candidates for wasted space are tables for which the average number of free bytes is close to the size of a record.

A similarly wasteful situation exists for tables without clustered indexes and in which a large percentage of the rows were deleted. Deletions leave empty space on a page, and because SQL Server doesn't reuse space in a page if the table doesn't have a clustered index, all new rows in the table end up on the last page. As a result, the table will have pages that are less than 100 percent full, so table scans will take more I/O operations. Before you attempt to shrink the row size for such tables, create a clustered index. Then run DBCC SHOWCONTIG again to see how much free space the table has.

Unexpected Update I/Os
Although you might think that your table doesn't need a clustered index because it never has deleted rows, you might be surprised to learn that the UPDATE statement can leave empty spaces in your tables. A seemingly straightforward update such as this one

UPDATE Payroll_Charges SET ChargeAcctCode = 
   9102 WHERE ChargeAcctCode = 7102

can potentially create a tremendous number of transaction log entries. The problem stems from the way SQL Server structures the series of operations to prevent violating constraints. Here's a simple example:

UPDATE Authors SET au_id = au_id + 1

If au_id is a primary key (or a unique key), updating the first row will probably violate the uniqueness constraint, especially because au_id is a monotonically increasing value. But the UPDATE statement is valid, so how can SQL Server execute it without violating the constraints? Here SQL Server 6.5 uses a deferred-mode update in which it splits the operation into a row delete followed by a row insert with the new value.

SQL Server handles this situation by writing to the transaction log no-op records that tell which operation needs to be performed. Then, after identifying all the affected rows and writing a delete and an insert record into the log, SQL Server returns to the beginning of the transaction and starts applying the delete operations. When it finishes the deletes, it applies the inserts. The operations are full deletes and full inserts, and they include modifying all affected indexes.

Deferred updates can degrade database and application performance because they cause the transaction log to consume a lot of space and they take more time than direct-mode updates. The longer an update takes, the longer the exclusive locks stay in place and the longer other users must wait to access the page, which increases the likelihood of deadlocks.

SQL Server 6.5 can carry out your update operations in four ways. The fastest is a direct update in- place, where no row movement takes place, and the only information that has to be logged is one record that indicates which bytes changed to new values. The slowest update is the deferred update I just described. The other two types of updates are both direct updates (no extra logging is done) but not in place, so some row movement will occur. A long list of conditions must be met for your update to be an update in place, and I won't document them here. The SQL Server 6.5 Books Online (BOL) section titled The update mode is direct lists some conditions that must exist for SQL Server to use a direct in-place update. However, BOL confuses direct with update-in-place, so the discussion isn't always accurate. The most complete published description of the different types of updates and the complete set of conditions for the various kinds of updates to occur is in Inside SQL Server 6.5 by Ron Soukup (Microsoft Press, 1997).

Two main conditions that must be met for your updates to happen in place are that you can't modify the key columns in a clustered index and you can't mark the table for replication. Modifying the clustered index forces SQL Server to move a row to its proper physical location in the table, a process that it always implements as a delete followed by an insert. And because replication reads the log and creates ODBC commands for the subscribers, the delete/ insert combination is the most straightforward way to describe the update operation. Both situations force the update to not be in-place.

A similar rule applies to variable-length and nullable columns. The updated column in multirow updates must be fixed-length to use an in-place update. SQL Server stores nullable columns as variable-length columns, even if you declare them with a fixed-length data type. SQL Server always uses deferred mode for multirow updates of nullable columns.

Knowing these restrictions and configuring your database accordingly pays handsome dividends when you're trying to squeeze performance from updates. Using methods that involve direct-mode updates can save you log write I/O, logical page read I/O on the log pages, and time during log backups and restores and database recoveries. When you design databases, consider making fixed-length, non-null columns the standard. If you're a programmer, know the properties of the columns you update, keep the effects in mind when you write UPDATE statements, and carefully choose when to issue your statements.


ARTICLE TOOLS

Comments
    There are no comments to display. Be the first one!
You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...