Any large enterprise that plans to implement a business intelligence or data warehouse solution with 1TB or more of data might find itself looking at a significant investment for a server system that can provide the capacity and throughput that such a database environment requires. Large SAN and server solutions can certainly support large data warehouses, video streaming, or data-mining implementations. But such systems come at a high price. A system that can scan through the database at roughly 700MB per second of sequential I/O might cost anywhere from $500,000 to more than $1 million, and that's for just the hardware, not including the cost of software and maintenance. This enormous price tag led me and my colleague, Son Voba, to look for an alternative server solution.

Voba and I were impressed by the tremendous advances in software and hardware technology over the last 12 to 18 months, and we were inspired by the 2004 SATA disk throughput research of Microsoft Research Scientist Jim Gray and others in Microsoft Research. We wanted to build a prototype server that would consistently deliver more than 2GB per second sequential database reads and writes. Gray's research and testing used the Newisys 4300 server, SATA drives, and SATA disk controllers connected to 48 SATA disks with 48 SATA cables directly attached to the disks. Gray's configuration achieved a remarkable 2.5GB per second for sequential writes and slightly less than that for sequential reads.

We decided to build a white-box server (a non-branded system built from generic components) consisting of a beefy microcomputer with a 64-bit OS, massive main memory, database software, and direct-attached storage. This article describes that system: a prototype server running Windows Server 2003 x64 and SQL Server 2005 x64, with 32GB of RAM and high-capacity, direct-attached 7200 RPM SATA disks. This system, which our team has tested over the past 6 months, can compete with mega-servers often costing ten to twenty times more. Because of its low cost, such a system could become a contender for a big share of the database server solutions market.

The Prototype
In July of 2005, Son Voba, a handful of hardware vendors, and I put together our prototype server in a lab on the Microsoft Redmond Campus. The goal was to test SQL Server 2005 x64 with high-speed, high-capacity direct-attached disk storage by using SATA disk drives and Serial Attached SCSI (SAS) disk controllers, SAS expander boards, and cabling. We chose SATA drives because they have a high capacity (up to 500GB per drive today, with larger drives on the horizon). SATA drives are also several times less expensive per gigabyte than SCSI drives and are comparable to SCSI drives in sequential I/O bandwidth. This bandwidth is essential for data-warehousing workloads that scan gigabytes of database storage when processing queries or moving large data sets around. I won't attempt to compare the differences in flexibility, high availability, and reliability of SAN storage solutions and direct-attached disk solutions. For excellent explanations of SATA disk drives and SAS protocols, see the materials at the SCSI Trade Association (http://www.scsita.org) and the Serial ATA International Organization (http://www.sata-io.org).

To test our prototype, we used real customer databases, more than 2TB total, from a data-warehouse solution running on SQL Server 2000. In addition, we used a disk-stress utility called SQLIO (SQLIO.exe, available for download at http://www.microsoft.com/downloads) to measure disk throughput.

We built the prototype server during July and August of 2005 using a four-processor dual-core AMD server. Four hardware companies generously loaned us the hardware. Table 1 shows the cost of the components based on market price in December 2005. Newisys (a Sanmina-SCI Company) loaned us the 4300 Server, which came with 32GB of memory and four dual-core AMD Opteron 2.2GHz CPUs. Newisys also provided 64 400GB Seagate SATA disks. LSI Logic Corporation loaned us six SAS3442X Host Bus Adapters (HBAs, or disk controllers). Vitesse Semiconductor Company loaned us four SAS expander cards. And Chenbro Micom Company loaned us four 16-bay drive chassis. Figure 1 shows the configuration we created with these components.

We decided to use LSI SAS (serial-attached SCSI) HBAs and expanders because they are an emerging technology that's evolving from the traditional SCSI standard and because SAS lets you mix and match SATA and SAS disk drives connected to the same controller. We plugged the six SAS HBAs into the Newisys server by using four of the 133MHz PCI-X slots, one of the 100MHz PCI-X slots, and one 66MHz PCI-X slot. This configuration distributed the six HBAs across the PCI-X buses and the three PCI-X bridges on the motherboard, which Figure 1 shows. This distribution optimized data bandwidth and kept the processors fed with data. We found that SQL Server 2005 could keep up with more data, so the faster the HBAs and disks, the faster DRAM could process queries or data writes.

By using the SAS disk controllers from LSI and SAS expander boards from Vitesse, we were able to use one Molex cable from each LSI HBA in each of the six PCI-X slots to connect to one of the four SAS expander boards, reducing cable clutter to six cables from the 48 that Gray used. The SAS expander let us connect 16 disk drives to one expander. Because we had six HBAs from LSI and only four Vitesse expander boards, we were able to plug the additional Molex cables from disk controllers 5 and 6 into the same Vitesse SAS expanders as disk controllers 3 and 4.

The SAS expander boards let you daisy-chain more expanders, increasing the number of disk drives you can directly attach to a disk controller. However, our initial testing showed that one disk controller can keep up with at most 12 drives when doing sequential scans. Six cards can therefore fully exploit the bandwidth of 72 disk drives, and you can still double or triple the number of disks by daisy-chaining the expander boards connected to additional drives.
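The controller arithmetic above can be sketched in a few lines of Python. This is only a back-of-the-envelope illustration: the 12-drive figure comes from our initial testing, while the function names and the even-sharing assumption for drives beyond that point are hypothetical simplifications, not measured behavior.

```python
# From the article: one controller keeps up with at most 12 drives
# during sequential scans.
MAX_DRIVES_PER_CONTROLLER = 12


def saturating_drive_count(controllers: int) -> int:
    """Number of drives the given controllers can serve at full sequential speed."""
    return controllers * MAX_DRIVES_PER_CONTROLLER


def bandwidth_share_per_drive(controllers: int, drives: int) -> float:
    """Fraction of its full sequential rate each drive gets.

    ASSUMPTION (not from the article): once the controllers are saturated,
    their bandwidth is shared evenly among all attached drives.
    """
    return min(1.0, saturating_drive_count(controllers) / drives)


print(saturating_drive_count(6))          # 72
print(bandwidth_share_per_drive(6, 64))   # 1.0
print(bandwidth_share_per_drive(6, 144))  # 0.5
```

Under that simplification, doubling the drive count from 72 to 144 adds capacity but leaves each drive with roughly half of its sequential bandwidth.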

By zoning the Vitesse expanders, we were able to segregate the disk traffic on disk controllers 3, 4, 5, and 6 to eight physical disk drives each. The Vitesse SAS expander board basically lets you plug 16 physical SATA drives into the expander board by using InfiniBand cables and plug one cable from the expander board into each LSI HBA in the six PCI-X slots. In our configuration, each LSI HBA was capable of delivering 400MB per second or more of data throughput, double the current throughput of a typical Fibre Channel disk controller used to connect to a SAN.
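As a rough sketch, the zoning layout and the combined HBA ceiling can be written out in Python. The eight-drive zones for controllers 3 through 6 and the 400MB per second figure come from the text; the 16 drives assumed for controllers 1 and 2 are an inference from the 16-drive expanders and the 64-drive total, and the variable names are illustrative.

```python
# Per-HBA throughput quoted in the article ("400MB per second or more").
HBA_MB_PER_SEC = 400

# Drives visible to each controller.
# Controllers 3-6: zoned to 8 drives each (from the article).
# Controllers 1-2: 16 drives each is an INFERENCE, not a stated figure.
drives_per_controller = {1: 16, 2: 16, 3: 8, 4: 8, 5: 8, 6: 8}

total_drives = sum(drives_per_controller.values())
aggregate_mb_per_sec = HBA_MB_PER_SEC * len(drives_per_controller)

print(total_drives)          # 64
print(aggregate_mb_per_sec)  # 2400
```

If the inferred layout is right, the six HBAs together offer a ceiling of about 2.4GB per second, which leaves some headroom over the 2GB per second we were aiming for.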

Once all our hardware and cables were connected, we were able to use SQLIO.exe to demonstrate roughly 2.2GB per second for sequential reads, just over 2GB per second for sequential writes, and more than 24TB of useable disk space. It's important to point out that a direct-attached disk subsystem delivering more than 2GB per second of data throughput is incredibly fast, in many cases faster than SAN storage systems costing over forty times more than our solution. For some SAN systems, the cost per useable terabyte is close to $20,000. And as you'll recall, the total price for our prototype, including the disks and entire disk subsystem hardware, was roughly $46,100.
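To put those cost figures side by side, here is a rough calculation in Python. It uses only the numbers quoted above ($46,100, 24TB, and $20,000 per terabyte); because the prototype offered more than 24TB of useable space, the prototype's per-terabyte figure is an upper bound. The function and variable names are illustrative.

```python
PROTOTYPE_COST_USD = 46_100    # total prototype hardware price (from the article)
USEABLE_TB = 24                # "more than 24TB", so treat this as a floor
SAN_COST_PER_TB_USD = 20_000   # "close to $20,000" for some SAN systems


def cost_per_tb(total_cost_usd: float, useable_tb: float) -> float:
    """Hardware cost divided by useable capacity."""
    return total_cost_usd / useable_tb


prototype_per_tb = cost_per_tb(PROTOTYPE_COST_USD, USEABLE_TB)
ratio = SAN_COST_PER_TB_USD / prototype_per_tb

print(f"${prototype_per_tb:,.0f} per TB")  # $1,921 per TB
print(f"{ratio:.1f}x")                     # 10.4x
```

On these figures, the prototype comes in at under $2,000 per useable terabyte, roughly a tenth of the SAN figure.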

Even though our prototype exceeded our expectations, our throughput result was slightly less than the 2.5GB per second sequential I/O throughput that Jim Gray achieved, perhaps because we used SAS protocols. We plan future testing to push for higher throughput. But by using SAS, we were able to daisy-chain more SAS expander boards to the existing four boards we used in the prototype and reduce cable clutter from 48 cables to 6 from the HBAs to the expander boards and drive chassis. The net result is that we could have added a lot more disk drives than the 64 we used (perhaps double, up to 128 total), giving us more than 48TB of useable disk space.
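The capacity scaling is simple to check. The sketch below computes raw capacity only, from the 400GB drive size and the 64 and 128 drive counts given in the article; useable space will be somewhat lower, depending on formatting and disk configuration details that aren't covered here. The function name is illustrative.

```python
DRIVE_GB = 400  # Seagate SATA drive size used in the prototype


def raw_capacity_tb(drive_count: int, drive_gb: int = DRIVE_GB) -> float:
    """Raw capacity in decimal terabytes (1TB = 1,000GB)."""
    return drive_count * drive_gb / 1000


print(raw_capacity_tb(64))   # 25.6
print(raw_capacity_tb(128))  # 51.2
```

Those raw totals are consistent with the useable figures of more than 24TB for 64 drives and more than 48TB for 128 drives.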


Reader Comments

The article text and Figure 1 don't match. The article mentions 6 LSI SAS HBAs, and the figure shows only 3 (1 dual, 2 single). It would make sense to reduce the cards to 4 or fewer so that each could run in the host computer's 133MHz PCI-X slots.

Also, the article mentions 32GB of RAM, while the figure shows 64GB. Why the difference? Which setup was used to post the results?

trimai

Article Rating 5 out of 5

Hi - The author has this response to the discrepancy in the figures:

From Rich Johnson: The graphic I saw was a picture of memory cards that said 16 x 4 GB, which would be 64 GB of memory. It should be 8 x 4 GB cards. Is that what the reader saw? It's kind of small, but if you really look at the diagram and the fact that we refer to 32 GB of main memory, then that's a discrepancy. Regards, Rich Johnson, Architect, Business Intelligence Solutions for the Retail & Hospitality Industry

djmay

Article Rating 5 out of 5

In Table 1, the typical market cost of 64 Seagate 400GB drives is $11,340. This comes out to $178 per drive. Where can I obtain these drives at that cost? The best I can do is $230.

THEFox

Article Rating 5 out of 5

To trimai: to fully utilize the PCI-X bus, you need deep knowledge of the hardware platform (e.g., how many buses there are in the chipset, and which slots share the same bus). The bus is always wider than the HBA, so it makes great sense to use two HBAs on one bus. No one will stop you from using only 4 cards, but with two more, you get more throughput.

To THEFox: $230 is not far from $178, is it? Buying in bulk always gets cheaper, I reckon.

Jim Gray's paper goes into more detail. What's lacking from this article is the RAID configuration and the DB filegroup setup. And what about OLTP? How well will this system respond to random access?

SAN makers' easy-money days should be over ASAP.

xied75

Article Rating 5 out of 5

I'm also curious what (if any) RAID configuration was used, as well as the location of the database drives. If you didn't use any RAID, would you consider redoing the test with it applied so we could see how much of an impact this would have on performance? Enterprise customers would likely be willing to compromise on some amount of performance if we retain the reliability we get in a SAN. Thanks for considering & for the article itself.

toennitm

Article Rating 5 out of 5

I am also curious about the RAID configuration in this scenario. I heard that there are interoperability issues between LSI SAS HBAs and Vitesse SAS expanders, but I'm not sure they are the ones used in this setup.

jingtau

Article Rating 4 out of 5

Excellent Article, Thank you.

hughesg4

Article Rating 5 out of 5

I hope the author can update his article, as some things have changed. Vitesse no longer makes SAS expanders. And it looks like the Supermicro SC933 is a better fit for a 16-bay chassis since the LSI expander card is included; however, I think the backplane is only for SAS drives, not SATA.

I did manage to find a Chenbro expander card, but they sell two models, and if you want the one described in the article, it would take forever to ORDER one. I heard the lead time is about 6 weeks...

He said that they would come out with the results by plugging in SAS drives but no results or updates were forthcoming. I guess Chenbro wanted their chassis back.

Has anyone built this model yet with today's available resellers? And preferably on both SATA and SAS in your chassis so you can put your OLTP on SAS and your storage needs on SATA?

andrewn2008

Article Rating 5 out of 5

Hi andrewn2008, Thanks so much for your feedback. You ask a lot of great questions, which could end up being topics for future storage articles. We will definitely keep your comments in mind when we are acquiring and scheduling storage articles for SQL Server Magazine. Thanks again!

Megan Bearly Associate Editor, SQL Server Magazine mbearly@sqlmag.com

meganbearly

Article Rating 5 out of 5