• subscribe
March 25, 2009 12:00 AM

Maximize Storage Performance

Increase performance without increasing costs
SQL Server Pro
InstantDoc ID #100893

Once upon a time, a client engaged me to help justify the pending purchase of a $100,000 SAN. With 3,000–4,000 concurrent users accessing over 150GB of data, the client was starting to encounter performance problems. Hardware vendors had recommended a new SAN to tame the I/O bottleneck, and I was on hand to OK the proposed solution and help with the migration, should one be needed.

 

However, an evaluation of the client’s workload and current environment made it painfully obvious that although more than $35,000 worth of disks, RAID controllers, and other hardware was already in place, it simply wasn't being utilized efficiently. By using only the existing hardware, we increased performance and eliminated the “need” for an expensive SAN. (In hindsight, I should have charged a commission on the money saved, instead of merely charging by the hour.) By applying the following strategies, which I used to help my client, you can boost performance and possibly save yourself significant hardware costs.

Storage Is Not Created Equal
Disk drives keep getting less expensive, but the cost for swarms of high-end hard drives and RAID controllers is still a major concern for most organizations. However, not all hardware is a good choice for use with SQL Server. Table 1 lists some common types of RAID arrays, Table 2 lists common ways for storage to connect to computers, and Table 3 lists common disk rotational speeds. Each table tells you how good different hardware choices are for SQL Server. The tables make it obvious that not all storage is created equal—which is fine, as performance requirements vary from deployment to deployment. Accordingly, I like to approach disk optimization scenarios by identifying what I call “premium disk,” the most performant storage option available. The key to identifying premium disk is the fact that almost all SQL Server deployments have access to at least two kinds of disk. For example, in the disk configurations that Figure 1 shows, the premium disk on Server 1 is obviously the 220GB volume, while on Server 2 it's the 640GB volume.

 

Figure 1: Sample SQL Server Configurations
Server 1
8 cores; 64GB RAM
220GB RAID 1+0 with 15,000rpm Fibre Channel–attached drives
1TB SAN (RAID 5 on 10,000rpm iSCSI-attached drives)

Server 2
2 cores; 32GB RAM
640GB RAID 5 on 7200RPM SATA disks
800GB RAID 1 accessed via Network File Share

 

Premium disk capabilities vary widely, but you can usually count on two things. First, fast premium disks are also small because high-performance drives generally sacrifice capacity for speed. Second, there’s less premium disk available than non-premium disk and, if the system has been around for a while, there’s less premium disk capacity available than you want. The key to striking a balance between performance and cost lies in optimizing the performance of premium disk while off-loading as many tasks as possible to non-premium disk.

 

TABLE 1: Commonly Used RAID Types

Type

Pros

Cons

Recommendation

RAID 0 (striping)

Best performance. Reads and writes are split evenly over all disks in the array, making them very fast.

No redundancy. Loss of a single drive results in complete data loss.

Not recommended for use in SQL Server environments due to the risk of data loss.

RAID 1 (mirroring)

Reliable fault-tolerance because data is written to all disks in array. Reads can be split over all the disks in the array.

Writes are slowed to the speed of a single disk because all writes are written to each disk.

Works well for low-demand data files. Excellent for log files.

RAID 5 (striping with parity)

Excellent redundancy: A single drive can fail without jeopardizing the entire array. Decent read performance. Very affordable.

To enforce redundancy, RAID-5 solutions require two reads and two writes for every logical write operation, making writes very expensive.

Affordable solution for high-performance workloads that don’t do lots of writes. Don't use in systems where writes represent more than 10 percent of overall activity.

RAID 1+0 (mirroring plus striping)

Combines strengths of RAID-0 and RAID-1 to deliver fault protection and great performance.

Requires lots of disks and high-end controllers. One of the most expensive RAID solutions.

Excellent for all high-performance database needs. Slight overkill for log files in some cases.

 

TABLE 2: Common I/O Buses

Bus

Speed

Summary

SATA

Up to 300Mbps1

Inexpensive successor to ATA. Easy to manage and configure. Currently there are no 15,000rpm SATA disks available and only a few 10,000rpm SATA disks.

SCSI

Up to 320Mbps2

Historically high performers. Slightly difficult to configure. More expensive than SATA but have best support for high-speed (10,000 and 15,000rpm) disks. Supports multiple devices per channel.

Serial Attached SCSI

Up to 375Mbps

Successor to SCSI (despite slightly lower speeds). Easier to configure than SCSI but still maintains access to high-performance disk drives. Provides higher redundancy than SATA controllers and can use high-speed SCSI disks or cheaper SATA disks if desired. Marketed at enterprise servers and widely used today.

iSCSI

Up to 125Mbps3

Commonly used in SANs, where SCSI commands are relayed over Ethernet. Low throughput is acceptable for backups and other low-level needs, but not a good candidate for highly used SQL Server resources.

Fibre Channel

Up to 425Mbps4

Lots of configuration options exist; can be used to connect to both internal and external drives. Commonly used in clustering scenarios against high-speed SCSI disks for high-performance and high-availability solutions. Extremely fast, but expensive.

For comparison purposes, USB is 60Mbps and FireWire 800 is almost 100Mbps.
1SATA 300. SATA 150 is only half of a SATA 300 bus and doesn’t support multipath I/O.
2with Ultra320 SCSI; other types of SCSI buses also exist.
3Over Gigabit Ethernet.
4with 4GBps Fibre Channel

 

TABLE 3: Disk Rotational Speeds

Rotation Speed

Rough Seek Time

Summary

5400rpm

5+ ms

Low speed, low heat, low performance. Common in laptops.

7200rpm

4 ms

Suitable for servers where disk activity is minimal. Common in desktops.

10,000rpm

3 ms

Capable of 125Mbps throughput.

15,000rpm

2 ms

Trades capacity for quickest access speeds and throughput.

 



ARTICLE TOOLS

Comments
  • Kevin
    3 years ago
    Apr 20, 2009

    It is interesting point that , from my understanding after reading your statement, several files in same filegroup which is in same disk will make better performance because SQL Server give assign threads to each of the files, so one thread per file instead of one thread per disk; is my understanding right?

  • Michael
    3 years ago
    Apr 07, 2009

    @dnewtontmw: I was actually going to take umbrage at your comment as (in my mind) the advice I gave clearly didn't work off the one-thread-per-CPU mind-set. (And, truth is, I hate that PSS post anyhow, because it does state that multiple files/filegroups CAN/DOES speed up perf, but spends so much time focused on other details that I think that MANY readers focus too much on the words 'Urban Legend' and wronfully assume that asynch IO can't be leveraged to drastically increase throughput.)

    At any rate.. after I got off my high-horse, I reread what I had written and realized that I sadly did step in it, as my analogy of using multiple files concludes with the statement "instead of forcing query execution to wait on a single thread to pull data from disk."... which clearly sounds wrong from a technical standpoint. (Still kind of scratching my head about what I meant or was thinking when I wrote that...so feel free to throw crap at me as I deserve it.)

    That said, the point is: the use of multiple files can facilitate asynchronicity by increasing available I/O bandwidth (or by helping to keep the I/O path from getting saturated).

  • Doug
    3 years ago
    Apr 01, 2009

    Note however that the "one thread per SQL Server file" is not true, per Microsoft. For details, see:

    http://blogs.msdn.com/psssql/archive/2007/02/21/sql-server-urban-legends-discussed.aspx

    You could get better performance if you had multple files and those files were all on separate spindles or RAID arrays, but that's just for I/O bandwidth, not multi-threading issues.

  • Michael
    3 years ago
    Mar 31, 2009

    @Marcos, glad you liked the article. And yeah, this was a very basic, or introductory, article. The primary purpose was to introduce basic notions of maximizing performance from a wide variety of angles - and to help DBAs think in terms of maximizing their high-performance disks by offloading as much non-essential work as possible off of them. Covering some of the more advanced (and less covered) features would be a blast too...

  • Marcos
    3 years ago
    Mar 30, 2009

    Good article on an already well documented aspect of SQL Server's systems.
    Isn't any new feature, or maybe a "not so clear" aspect that deserves attention? I know it is much easier to write about something that has a lot of reference on the web, but I believe that you could explore less documented features, I know it will require more time, but ,maybe, the article will be more useful.

You must log on before posting a comment.

Are you a new visitor? Register Here
  • SP1?
    I know there is a SP1 for SQL 2008 R2 available....and there is a "feature pack" as well... ...
  • SQL database mirroring
    I have SQL Server 2008 R2 Enterprise 64bit on Windows 2008 R2 Enterprise 64bit.  Each SQL Server has...
  • Dell Compellent Disk Drive
    Does anybody has experience with Dell Compellent Disk Drive? Basically, this system manages all disk...
  • Sql server performance tuning
    I need to find a tool that help me to optimize sql server,queries,improve the performance and solve ...