DOWNLOAD THE CODE:
Download the Code 93633.zip

Correct and Consistent Data Type Usage
In addition to working hard to isolate your columns and vertically partition for better column grouping, you should also decide what's the best data type for any given job. If you need a date but you don't need a precise time, or if your dates are all relatively recent, consider using smalldatetime instead of datetime to save 4 bytes per row. If you have multiple dates, you could save considerable space over the course of many rows.

Additionally, keep your data type choices consistent throughout your application. If your stored procedures use different (but "compatible") data types, SQL Server might have to do an implicit conversion during processing. This implicit conversion might require a data (or index) scan, whereas SQL Server could have used an index more efficiently (through a seek) if the column didn't need to be converted. SQL Server can certainly handle some of the implicit conversions required by the use of different data types, but explicitly converting or consistently using the same data type in all references will eliminate confusion and result in better performance.

For example, say a member table contains a varchar column that stores last name values. This column is indexed, and the last name Tripp is highly selective (there are only two rows that match the value Tripp).When you execute the two queries in Listing 1, you get the showplan output that Figure 1 shows. First, notice the batch cost of these two statements.The first statement is only 6 percent of the cost of the batch and the second statement is 94 percent. The only difference between the first and the second statement is in the implicit conversion, as the showplan tooltip shows.You might want to consider profiling your server and capturing the showplan XML events to filter for CONVERT_IMPLICIT. This technique might help you narrow your search for poorly written applications, stored procedures, and ad-hoc statements that aren't consistent in terms of search arguments and data types.

Column Nullability
Although nullability no longer affects the physical structure of your table (SQL Server versions before 7.0 used to track nulls by using a single space), nullability can still cause additional work in terms of using functions to change, replace, or ignore null values. I'm not recommending that you avoid nulls, I just want to stress that you should use them only when they make sense and when your data truly requires tracking the "third state" in your columns' domain of legal values. Also, the architecture of the null block (which is how SQL Server 7.0 and later tracks nulls) changed between SQL Server 7.0 and SQL Server 2000. In SQL Server 2005 and 2000, the null block has one bit for every column, regardless of whether that column allows nulls.The null block is at least one byte, and for bits that don't represent real columns, the value will be 1. If the bit represents a real column and the column value is null, the corresponding bit is set to 1.

For optimal performance, consistency is crucial. If you're creating scripts, the state of the column's nullability isn't a required table option (for the CREATETABLE statement) and as a result, the nullability is determined by session settings. The two session settings that might determine whether your column allows nulls are ANSI_NULL_DFLT_ON (which can be ON or OFF) and ANSI_NULL_DFLT_OFF (which also can be ON or OFF). Note that these two session settings can't be ON at the same time. However, both can be OFF at the same time. If both are off, then the database option ANSI null default will be used—if this option is ON. If all session settings are off and the database option is off, then the column won't allow nulls.The obvious problem with all of these rules is that they're complex (please don't bother trying to remember this). And more importantly, your script results might not be reproducible between systems, environments, and executions. Leaving the nullability option off also means that you might have data integrity problems with complex scripts and that you haven't clearly decided whether your columns should allow nulls. Column nullability isn't something that you should leave to chance.

LOB Data on the Data Page
SQL Server 2005 and 2000 both give you the ability to store text, ntext, and image data "on page." By storing the data on page with the in_row data structure, SQL Server could avoid an additional lookup into a separate table structure, thus reducing I/O. But you end up with the problem you have when your rows are too wide. Do you really need to use the LOB data on every request? Is it really beneficial to widen rows for LOB data if it's not often used in queries?

In SQL Server 2000, the default is that the data is always off page unless you use sp_tableoption to set the text in row table option; this option requires several bytes that dictate whether the LOB value is stored in the data row. For example, the string

sp_tableoption <tablename>, "text in row", 2000 

says that LOB values of up to 2000 bytes should be stored with the "in_row" structure.This setting might be useful if you have a LOB value that's almost always returned by every query and when the LOB value is typically under 2000. However, if the LOB value isn't frequently used, this option makes the "in_row" structure unnecessarily wide (for most requests) and can reduce the effectiveness of the cache by filling it with the unneeded data. As a result, I rarely recommend this option.

In SQL Server 2005, Microsoft introduced four new data types: varchar(max), nvarchar(max), varbinary(max), and XML. The first three data types replace text, ntext, and image, respectively. XML is a new type. Although the new "max" types are meant to replace the old types, they also introduce some new behaviors. First, by default, SQL Server 2005 automatically stores the new types with the in_row structures if the sum (in bytes) of the LOB value and the in_row structure is less than the in_row maximum of 8060 bytes per row. As in SQL Server 2000, if the LOB value is consistently (and relatively) narrow (in number of bytes) but infrequently queried, you might want to consider setting the new option as

sp_tableoption <tablename>, 
 'large value types out of row', 1 

to force SQL Server to store these values off the page. However, it's important to know your data and know your users in this case because data access patterns are crucial for dictating the best choice.

Finally, although SQL Server 2005's new LOB data types simplify usage—you use them much like a variable column with the same access and function support—they have one additional side effect when you use them in an index. SQL Server 2005 allows any column to be in the leaf level of an index—including LOB types. If a LOB type is in the leaf level of an index, the index won't support online operations.You would very rarely want a LOB column in the leaf level of a non-clustered index, but if you've followed some of Kalen Delaney's internals articles over the years (http://www.sqlmag.com), you know that the leaf level of the clustered index includes all columns of the table. Simply put, if a table has a LOB column, that table's clustered index doesn't support online operations. For this reason, I would once again consider vertical partitioning. For cases in which minimizing downtime is a top priority, consider creating one table that holds critical data and a second table of infrequently accessed columns and LOB columns. This technique will let you keep your more crucial columns online even when you're performing some maintenance operations.

Prev. page     1 2 [3] 4     next page



You must log on before posting a comment.

If you don't have a username & password, please register now.

Reader Comments

The article was very well written and assisted with the meeting I had the very next day regarding our performance pitfalls. Thank you!

stevenmarin@hotmail.com

Article Rating 5 out of 5

Excellent aticle. I am a seasoned dba, IDMS, Teradata, DB2 and now SQL 2000/ 2005, I still pick up a lot of tips from the article. Thanks.

barnett

Article Rating 5 out of 5

 
 

ADS BY GOOGLE