How many circus clowns can you fit
into a Volkswagen Beetle? I'd bet good money that I could squeeze 10 clowns into
one, as long as they were small and I had
plenty of butter to grease the clowns up. (Editor's note: No clowns were harmed during the
writing of this article.) Adding each additional
clown would become more and more difficult, and predicting the theoretical limit of
how many clowns could squeeze into the car
would be extremely tough.
Obviously, 10,000 clowns wouldn't fit.
What if I reduced the workload—that is,
decreased the number of clowns I'm trying
to squeeze through my bottleneck (i.e., the
car) by a factor of 10. A 10x decrease in the
workload is substantial; I could surely fit just
1000 clowns into the Beetle. Right? All the
butter in the world wouldn't get that job
done.
What does all this clown imagery have to
do with SQL Server I/O? Oddly enough, it
can help us frame a problem that a tremendously large number of SQL Server customers suffer from—namely, I/O subsystems
that aren't properly equipped to handle the
workload placed on them.
The Wrong Approach
Many customers assume that "adding a faster
I/O subsystem" will magically solve their
problems. Perhaps they mistakenly believe
that stuffing 1000 clowns into a car must
be easier than stuffing 10,000 clowns in it.
Alas, hardware is pretty mystical and unapproachable to a lot of software people. We
don't visualize the bandwidth and bottleneck
problems in the same way.
Suppose your existing SQL Server–
induced workload is I/O-bound and that your I/O subsystem can't keep up. An
unacceptably large number of SQL Server
customers—and probably customers of other
major database platforms—will immediately
say, "I guess it's time to buy a better SAN."
They'll diligently research SAN options,
then spend a lot of time and money setting
up the new SAN. Whether the solution will
help is a crapshoot unless the customer does
the necessary due diligence to understand the
I/O characteristics of the workload he or she
expects the I/O subsystem to handle—contrasted against the SAN's actual performance
and throughput capabilities. Spending hundreds of thousands of dollars on a new I/O
subsystem that's five to twenty times faster
than your current system won't necessarily
have a material impact on the overall performance of your application if your new SAN
purchase is the circus equivalent of saying,
"10,000 clowns won't fit, so let's try 1000."
Sometimes, throwing hardware at a
problem works. You'll never squeeze 100
clowns into a Beetle, but you might be able
to cram 100 clowns into a school bus rated
to carry 50 people. It would be a tight fit,
and the bus's performance probably wouldn't
be up to your expectations, but you'd be
able to drive those clowns around town.
Sometimes, you'll get lucky while trying to
solve I/O problems by simply buying more
I/O—without having a concrete grasp of
your existing I/O workload requirements.
However, it's never a good idea. And you risk
having to explain to your boss why the new
six-figure SAN hasn't changed anything.
There's no magic formula that tells you
"how much SAN you need." However,
if your vendor is willing to take scads of
your company's hard-earned money without asking you for detailed, low-level I/O-usage
patterns—and you don't know how to size
the system yourself—you should consider
seeking outside assistance to ensure that your
new I/O capacity is sized and configured for
your needs.
The Right Approach
This column's intent isn't to teach you how
to be an I/O expert. An important skill
among successful IT pros is knowing the
limits of one's abilities, recognizing situations
in which the costs of making the wrong decision are substantial, and seeking appropriate
outside counsel as necessary. But I can't leave
you hanging entirely.
There are countless ways to look at I/O
information, and SQL Server 2005 provides a
rich set of tools for monitoring performance
metrics. However, I've always been a big fan
of fn_virtualfilestats, a handy function that lets
you quickly and easily track the number of
physical I/O read and write operations that
happen on a per-file level for SQL Server. It
tracks the actual number of bytes transferred
for reads and writes on a per-file basis. It also
gives you valuable information about the
total number of I/O stalls that each file in
your SQL Server instance has experienced.
All this information can be helpful in understanding the existing I/O patterns of your
current workload and in properly sizing any
new I/O subsystem purchase that you're
contemplating.
Join Us Online!
Save the clowns. Size your I/O subsystem properly. And be sure to visit SQL
Server Magazine performance-tuning forums
(http://sqlforums.windowsitpro.com) for
further advice and tips.