If you want to start an argument with a Microsoft Exchange Server administrator,
try giving unsolicited advice about storage design and configuration. Exchange
2000 Server and Exchange Server 2003 offer so many storage options that determining
an optimal configuration is often difficult. In addition, some common and enduring
Exchange misconceptions further complicate the decision-making process. In this
article, I discuss some lesser-known Exchange storage design principles that
will help you clarify what works so you can make the best design decisions for
your environment.
Storage Partitioning
Exchange Server 5.5 uses a monolithic database design, with a maximum of three
databases on each server: a mailbox database, a public folder database, and
a directory database. This design allows some truly scary configurations; for
example, I once had a customer with an average Exchange 5.5 mailbox database
size of 140GB.
Exchange 2000 introduced the concept of multiple storage groups (SGs), each
of which can contain multiple databases; Exchange 2003 uses the same mechanism.
The SG (a logical object that doesn't exist on the hard disk) is an instance
of the Exchange Information Store (IS) that runs within the store.exe process
and owns the transaction logs for all the mailbox and public folder databases
in the group. Each database is a separate logical object with a pair of physical
disk files (the .edb and .stm files). Many customers who upgrade from Exchange
5.5 to Exchange 2000 or Exchange 2003 accept the default migration settings.
I don't recommend this practice because you carry over Exchange 5.5's huge single
database instead of gaining the benefits of multiple databases.
You can back up or restore only one database at a time per SG. If you have
multiple SGs, you can back up or restore multiple databases simultaneously.
Suppose you divide 140GB of mailbox data among four SGs, each with one 35GB
database. Backing up all four databases takes the same total amount of time
as backing up one 140GB database; however, each individual backup takes about
a quarter of the time. If you back up to tape, you can add a second tape drive
to back up two databases in parallel and cut the backup time in half. But the
biggest performance gain occurs if you restore multiple databases in parallel.
If you've backed up multiple SGs, you can restore one database from each SG
at the same time and significantly reduce the overall restore time.
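To make the backup arithmetic concrete, here is a minimal Python sketch. The throughput figure of 35GB per hour and the function name wall_clock_hours are assumptions chosen for illustration, not measurements or anything Exchange provides. The sketch assigns each database to whichever tape drive frees up first and reports the elapsed time.

```python
def wall_clock_hours(db_sizes_gb, throughput_gb_per_hour, parallel_streams):
    """Estimate elapsed backup (or restore) time across parallel streams.

    Each database goes to the stream that currently finishes earliest,
    largest database first. Throughput per stream is assumed constant.
    """
    if parallel_streams < 1:
        raise ValueError("need at least one stream")
    streams = [0.0] * parallel_streams
    for size_gb in sorted(db_sizes_gb, reverse=True):
        earliest = streams.index(min(streams))
        streams[earliest] += size_gb / throughput_gb_per_hour
    return max(streams)


# Assumed throughput: 35GB per hour per tape drive (illustrative only).
RATE = 35

one_big_db = wall_clock_hours([140], RATE, 1)            # one 140GB database
four_dbs_one_drive = wall_clock_hours([35] * 4, RATE, 1)  # four SGs, one drive
four_dbs_two_drives = wall_clock_hours([35] * 4, RATE, 2)
four_dbs_four_drives = wall_clock_hours([35] * 4, RATE, 4)

print(one_big_db, four_dbs_one_drive, four_dbs_two_drives, four_dbs_four_drives)
```

Under these assumptions, one drive takes the same time either way, two drives halve it, and four drives cut it to a quarter. Real results depend on whether the tape drives, the disks, or the network is the bottleneck.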
Another good reason to partition your storage is that doing so can help you
abide by your service level agreements (SLAs). Suppose you have an SLA that
requires you to restore executives' access to email within an hour of an outage
but gives you a five-hour window for other employees. If you put the executives'
and employees' mailboxes into separate SGs, you can restore the databases independently.
Assuming that you have fewer executives than employees, you should be able to
restore the executives' email access according to your SLA.
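The SLA reasoning can be checked with a short Python sketch. All the numbers are hypothetical: a 15GB executive database, a 120GB employee database, and a restore rate of 30GB per hour from a single device. The completion_times helper is my own illustration. It restores databases one after another and records when each group gets its mail back.

```python
def completion_times(restore_order, throughput_gb_per_hour):
    """Return the elapsed hours at which each database finishes restoring.

    Databases are restored sequentially in the order given.
    """
    elapsed = 0.0
    finished = {}
    for name, size_gb in restore_order:
        elapsed += size_gb / throughput_gb_per_hour
        finished[name] = elapsed
    return finished


RATE = 30  # hypothetical restore throughput in GB per hour

# Separate SGs: restore the executives' database first.
split = completion_times([("executives", 15), ("employees", 120)], RATE)

# One combined database: nobody has mail until the whole restore finishes.
combined = completion_times([("everyone", 135)], RATE)

print(split)
print(combined)
```

With these figures, the executives are back within the one-hour window and the employees within five hours. The combined database brings everyone back at the same time, well past the executives' deadline.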
Microsoft's recommendation with the initial release of Exchange 2000 was to
create the smallest number of SGs possible because each additional SG required
a fixed allocation of between 100MB and 250MB of RAM, a significant amount
at the time. Exchange 2000 Service Pack 3 (SP3) includes RAM allocation process
modifications that dramatically reduce the amount of RAM required for additional
SGs. Now, Microsoft's recommendation is to create as many SGs as possible. To
draw on my earlier example, Microsoft recommends creating four SGs with one
database each instead of one SG with four databases because each Exchange SG
has its own set of logs. If you have only one database per SG, each database
essentially has its own set of logs. This configuration simplifies and expedites
disaster recovery because only one database's transaction logs must replay when
you restore the database.
RAID
Once upon a time, administrators debated whether using RAID with Exchange was
a good idea. That debate has long since been put to rest; administrators know
that RAID can add a valuable degree of protection to Exchange data. Now the
debate has turned to the type of RAID to use.
To determine which type of RAID to use, you need to remember that each RAID
level balances performance against recoverability. What's good for one data
type can be bad for another. Imagine a striped volume (RAID 0) with two disks. Striping
gives you great speed because applications can read from and write to all physical
disks at the same time. But if you lose one disk in the stripe set, you effectively
lose the whole volume. This design might be acceptable in situations in which
the performance boost would be beneficial but a transient disk failure wouldn't
be the end of the world (e.g., for SMTP queues on a gateway machine). However,
you'd have to be fairly risk-tolerant to put your databases on such a volume.
Microsoft's general recommendation is to use mirroring (RAID 1) for data when protection
is most important (e.g., transaction logs, the system volume). When data protection
and access speed are both important, use either RAID 5 or RAID 0+1. If you have
the budget, RAID 0+1 is preferable.
Logs and Databases
When you install or upgrade Exchange and accept the default log and database
locations, all your Exchange data is stored on one volume. However, Microsoft
has long recommended that you put transaction logs and databases on separate
volumes because of the differences in their access patterns. Log files are always
written to sequentially, and they're read (also sequentially) only during log
playback. Databases are written to and read from in essentially random patterns,
according to users' requests. Thus, putting your log files and databases on
the same disk volume is a bad idea for two reasons: Doing so impairs performance
and can compromise your ability to recover data in the event of a disk failure.
These risks are present even if you're using RAID arrays instead of plain physical
disks. Consider a case in which you have one large RAID 5 array with 10 disks
that contains transaction logs and databases for two SGs. A better configuration
from a performance and disaster recovery standpoint is to use two disks to make
a mirrored volume for the transaction logs, dedicate seven disks for a RAID
5 array for the databases, and keep one unallocated disk as a hot spare. Depending
on the database access patterns, you can also create separate RAID 5 volumes
(each with its own set of physical disks) for the databases.
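The capacity cost of the split layout is easy to work out. The following Python sketch assumes 10 identical disks and ignores controller and formatting overhead. The usable_disks and summarize helpers are illustrative names of my own, and the sketch models only the RAID levels used in this example.

```python
def usable_disks(level, disk_count):
    """Return how many disks' worth of capacity a volume exposes."""
    if level == "RAID 1":
        if disk_count != 2:
            raise ValueError("this sketch models a two-disk mirror only")
        return 1  # one disk holds the mirror copy
    if level == "RAID 5":
        if disk_count < 3:
            raise ValueError("RAID 5 needs at least three disks")
        return disk_count - 1  # one disk's worth of capacity goes to parity
    if level == "hot spare":
        return 0  # a spare stores nothing until a disk fails
    raise ValueError(f"unsupported level: {level}")


def summarize(layout):
    """Return (total disks, usable disks) for a list of (level, count)."""
    total = sum(count for _, count in layout)
    usable = sum(usable_disks(level, count) for level, count in layout)
    return total, usable


single_array = [("RAID 5", 10)]
split_layout = [("RAID 1", 2), ("RAID 5", 7), ("hot spare", 1)]

print(summarize(single_array))
print(summarize(split_layout))
```

The split layout gives up two disks' worth of capacity compared with the single array. In exchange, the logs get their own spindles, so a failure of the database array doesn't take the logs with it, and a spare is ready to take over if a disk fails.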
Keep in mind that a successful online full backup purges the transaction logs
that have already been committed to the databases. If you see a lot of log files
after a backup finishes, you need to investigate because the backup wasn't
successful. Never manually delete transaction log files without a good reason
for doing so, such as advice from Microsoft Customer Service and Support (CSS).
Even then, you need to ensure that you have a current copy of the logs stored
in a safe location before you delete them.