Once upon a time, a client engaged me to help justify the pending purchase of a $100,000 SAN. With 3,000–4,000 concurrent users accessing over 150GB of data, the client was starting to encounter performance problems. Hardware vendors had recommended a new SAN to tame the I/O bottleneck, and I was on hand to OK the proposed solution and help with the migration, should one be needed.
However, an evaluation of the client’s workload and current environment made it painfully obvious that although more than $35,000 worth of disks, RAID controllers, and other hardware was already in place, it simply wasn't being utilized efficiently. By using only the existing hardware, we increased performance and eliminated the “need” for an expensive SAN. (In hindsight, I should have charged a commission on the money saved, instead of merely charging by the hour.) By applying the following strategies, which I used to help my client, you can boost performance and possibly save yourself significant hardware costs.
Storage Is Not Created Equal
Disk drives keep getting less expensive, but the cost for swarms of high-end hard drives and RAID controllers is still a major concern for most organizations. However, not all hardware is a good choice for use with SQL Server. Table 1 lists some common types of RAID arrays, Table 2 lists common ways for storage to connect to computers, and Table 3 lists common disk rotational speeds. Each table tells you how good different hardware choices are for SQL Server. The tables make it obvious that not all storage is created equal—which is fine, as performance requirements vary from deployment to deployment. Accordingly, I like to approach disk optimization scenarios by identifying what I call “premium disk,” the most performant storage option available. The key to identifying premium disk is the fact that almost all SQL Server deployments have access to at least two kinds of disk. For example, in the disk configurations that Figure 1 shows, the premium disk on Server 1 is obviously the 220GB volume, while on Server 2 it's the 640GB volume.
Figure 1: Sample SQL Server Configurations
Server 1
8 cores; 64GB RAM
220GB RAID 1+0 with 15,000rpm Fibre Channel–attached drives
1TB SAN (RAID 5 on 10,000rpm iSCSI-attached drives)
Server 2
2 cores; 32GB RAM
640GB RAID 5 on 7200RPM SATA disks
800GB RAID 1 accessed via Network File Share
Premium disk capabilities vary widely, but you can usually count on two things. First, fast premium disks are also small because high-performance drives generally sacrifice capacity for speed. Second, there’s less premium disk available than non-premium disk and, if the system has been around for a while, there’s less premium disk capacity available than you want. The key to striking a balance between performance and cost lies in optimizing the performance of premium disk while off-loading as many tasks as possible to non-premium disk.
TABLE 1: Commonly Used RAID Types |
Type |
Pros |
Cons |
Recommendation |
RAID 0
(striping) |
Best performance. Reads and writes are split evenly over all disks in the array, making them very fast. |
No redundancy. Loss of a single drive results in complete data loss. |
Not recommended for use in SQL Server environments due to the risk of data loss. |
RAID 1
(mirroring) |
Reliable fault-tolerance because data is written to all disks in array. Reads can be split over all the disks in the array. |
Writes are slowed to the speed of a single disk because all writes are written to each disk. |
Works well for low-demand data files. Excellent for log files. |
RAID 5
(striping with parity) |
Excellent redundancy:
A single drive can fail without jeopardizing the entire array. Decent read performance. Very affordable. |
To enforce redundancy, RAID-5 solutions require two reads and two writes for every logical write operation, making writes very expensive. |
Affordable solution for high-performance workloads that don’t do lots of writes. Don't use in systems where writes represent more than 10 percent of overall activity. |
RAID 1+0
(mirroring plus striping) |
Combines strengths of RAID-0 and RAID-1 to deliver fault protection and great performance. |
Requires lots of disks and high-end controllers. One of the most expensive RAID solutions. |
Excellent for all high-performance database needs. Slight overkill for log files in some cases. |
TABLE 2: Common I/O Buses |
Bus |
Speed |
Summary |
SATA |
Up to 300Mbps1 |
Inexpensive successor to ATA. Easy to manage and configure. Currently there are no 15,000rpm SATA disks available and only a few 10,000rpm SATA disks. |
SCSI |
Up to 320Mbps2 |
Historically high performers. Slightly difficult to configure. More expensive than SATA but have best support for high-speed (10,000 and 15,000rpm) disks. Supports multiple devices per channel. |
Serial Attached SCSI |
Up to 375Mbps |
Successor to SCSI (despite slightly lower speeds). Easier to configure than SCSI but still maintains access to high-performance disk drives. Provides higher redundancy than SATA controllers and can use high-speed SCSI disks or cheaper SATA disks if desired. Marketed at enterprise servers and widely used today. |
iSCSI |
Up to 125Mbps3 |
Commonly used in SANs, where SCSI commands are relayed over Ethernet. Low throughput is acceptable for backups and other low-level needs, but not a good candidate for highly used SQL Server resources. |
Fibre Channel |
Up to 425Mbps4 |
Lots of configuration options exist; can be used to connect to both internal and external drives. Commonly used in clustering scenarios against high-speed SCSI disks for high-performance and high-availability solutions. Extremely fast, but expensive. |
For comparison purposes, USB is 60Mbps and FireWire 800 is almost 100Mbps.
1SATA 300. SATA 150 is only half of a SATA 300 bus and doesn’t support multipath I/O.
2with Ultra320 SCSI; other types of SCSI buses also exist.
3Over Gigabit Ethernet.
4with 4GBps Fibre Channel
TABLE 3: Disk Rotational Speeds |
Rotation Speed |
Rough Seek Time |
Summary |
5400rpm |
5+ ms |
Low speed, low heat, low performance. Common in laptops. |
7200rpm |
4 ms |
Suitable for servers where disk activity is minimal. Common in desktops. |
10,000rpm |
3 ms |
Capable of 125Mbps throughput. |
15,000rpm |
2 ms |
Trades capacity for quickest access speeds and throughput. |