| Executive Summary: Although a SQL Server cluster isn’t cheap, it does prevent application downtime and loss of productivity. This article teaches you everything you need to know about creating a SQL Server cluster in your environment, including how clustering works, the hardware and software that are required, what you need to preconfigure, and how to install the SQL Server cluster. |
Servers and systems have outages—it’s a fact
of life in IT. Clustering SQL Server instances
gives us a measure of protection against these
failures. Even in well laid out environments in which
web and application servers have redundant backups,
the SQL Server system is usually just sitting there as
a single point of failure, despite taking the longest
time to repair and restore. (Getting a database server
set up and the database restored to it from tape will
take several hours or days, depending on the speed
of the restore and the size of the database.) You can
avoid hardware and application downtime by setting
up a SQL Server cluster in your environment. In this
article, I’ll explain how clustering works and the hardware
and software that’s needed to create a SQL Server
cluster. In addition, I’ll show you how to create a
failover cluster using SQL Server 2005 and Windows
Server 2003 Enterprise Edition.
How Clustering Works
When you get right down to it, clustering is a fairly
basic concept. A service runs on each node of the
cluster and checks to see if the SQL Server service is
running on any of the nodes. If the SQL Server service
isn’t running on any nodes, then it’s started on
one of the nodes. What’s actually happening under
the hood during this process is a bit more complex.
Learning Path
SQL SERVER MAGAZINE RESOURCES
For more information about clustering:
“Disk Configuration for Failover Clustering,”
InstantDoc ID 96381
“Active/Passive vs. Active/Active Clustering,”
InstantDoc ID 44938
“Clustering SQL Server,” InstantDoc ID 40034 |
When you install SQL Server in a clustered configuration,
the services are actually installed on all the nodes in the cluster. (Note that SQL Server must be installed
using the clustering option; you can’t change a
single-server installation into a clustered installation.)
If you go into Administrative Tools, Services on any
node, you’ll see all the SQL Server services in manual
startup mode. When SQL Server is running in a cluster,
the service runs on only one physical node at a time.
If you manually move the service from one node to
another, you’re actually stopping the service on one
machine and restarting it on another machine.
When the machine running SQL Server goes offline
(e.g., because of a hardware failure), the passive
SQL Server system will detect that the active SQL
Server system has gone offline, and take ownership of
the SQL Server service, as well as the other resources
in the resource group (e.g., IP address, network name,
disk drives), and bring the service back online. With
SQL Server 2000, this process can take several seconds
to several minutes, depending on how much
data is in the transaction log to be rolled forward and
back. SQL Server 2005 starts up much faster because
it rolls forward any completed transactions, brings
the database online, and then rolls back any completed
transactions.
Hardware and Software
Requirements
Clustering isn’t easy to set up, nor is it cheap. You need
at least two servers and a shared storage solution. Although
the servers don’t need to be identical, configuration
is easier if they are. It’s recommended that the
servers have at least the same number of CPUs and
the same amount of RAM. Technically, it isn’t necessary
for the CPU count and RAM to match, but it
does make it easier to estimate the load the backup
server can take. Using identical servers also gives you
the luxury of not having to worry about which server
the database is running on because all servers will perform
the same.
The storage can’t be the RAID array that you plug
into the server. Instead, it must be an array specifically
designed to be used in a cluster (i.e., shared storage).
This storage is most often handled by a Fibre
Channel or iSCSI SAN solution; however, the major
server vendors all have SCSI or Serial Attached SCSI
(SAS) solutions that can be used as well.
The version of SQL Server and the number of
nodes (servers) in your cluster will determine what
software you need to purchase. The same rules apply
to both 32-bit (x86) and 64-bit (x64 or Itanium)
systems. You must be running Windows Server 2008
Enterprise Edition, Windows Server 2003 Enterprise
Edition, or Windows 2000 Server Enterprise Edition
to cluster. If you want to cluster more than four nodes,
Server 2008 Enterprise Edition, which supports up to
16 nodes, or Windows 2003 Enterprise Edition, which
supports up to eight nodes, is required. If you’re using
a SCSI, SAS, or Serial ATA (SATA) storage array,
the maximum number of nodes that are supported
is two, no matter which OS or SQL Server edition
you’re running. Table 1 shows the number of nodes
supported by each version and edition of SQL Server.
To create a cluster of more than two nodes, both the
OS and SQL Server must be an edition and version
that supports being clustered in that configuration
(e.g., a three-node SQL Server 2005 cluster requires
Win2K Enterprise Edition or later and SQL Server
2000 Enterprise Edition or later).
For Windows 2003 and Win2K clusters, all the
servers must have a static IP address and be on the
same subnet that the cluster IP address will be on.
(Although Server 2008 supports cluster nodes being
on different subnets and using DHCP for the IP addresses,
I recommend using static IP addresses in the
same subnet for simplicity.) In addition, each SQL
Server instance will need an IP address and cluster
name. These are required so that each SQL Server instance
can fail over from one node of the cluster to
another node independently of the other SQL Server
instances.
Planning the Cluster
Just as with any other database, you’ll want your data
files, log files, and tempdb to be on different volumes
on the server and in different RAID groups within
the SAN. You don’t want the high writes of the tempdb
and the logs to get in the way of the reads for the
database. When dealing with very large databases, you might find that 26 drive letters just aren’t enough.
You might consider using mount points within your
drive letter to assign additional drive space and I/O
capacity to the drive. For more information about
mount points, see the web-exclusive sidebar “Creating
Mount Points” (InstantDoc ID
100398). Note that if you’re going to be installing
more than one SQL Server instance, make sure to allocate
each instance its own drive letter. When clustered,
each instance has to have its own drive because
each instance is in its own resource group.
Each node will need not only an IP address but
also a heartbeat IP address. A heartbeat IP address
is typically a different subnet than the IP subnet that
your network uses. The heartbeat IP address is a
cross-connect cable going between the servers in the
cluster. In a cluster with three or more nodes, a network
switch will be needed, as all the machines’ heartbeat
network adapter cards have to be on the same
network. The heartbeat network is the network that
the Windows cluster service uses to see if the other
machines within the cluster are working correctly.
In addition, you’ll need an IP address to cluster
the Windows OSs together, including the Quorum
drive and any Windows services that need to be clustered.
Also, you’ll need to assign an IP address to
SQL Server.
You’ll also need to have several names for your
cluster. Each node will need its own name, as does
the server that holds the Quorum and other clustered
Windows services. In addition, each SQL
Server instance will need its own host name because
each instance has its own IP address, so host names
can’t be shared. This situation is where a good naming
convention comes in very handy. Using
a generic name with an incrementing
number works fine for standalone servers,
but can be challenging to manage when
you start clustering. Extending the naming
convention to fit clusters is necessary. For example, name the hosts SQL01A and SQL01B and
name the Windows cluster SQL01. This naming
convention tells you that SQL01A and SQL01B are
the nodes and that SQL01 is the base of the cluster.
Name the SQL Server cluster SQL01V01. This
name tells you that SQL01V01 is the first server on
the Windows cluster. If you need to install a second
instance, you would name it SQL01V02, with
each instance being a separate server in the SQL01
cluster.
When configuring the cluster, you’ll be prompted
for a domain account, which will be used to monitor
the cluster. Don’t use your personal domain account;
instead, set up a separate account on the
domain for this purpose. This account will need administrative rights to all nodes in the cluster,
and it will need to be able to log in to SQL Server.
This account doesn’t need to be a member of the
sysadmin fixed server role; it only needs to be
able to connect to SQL Server. If the account doesn’t have rights to log in to the instance, the
database will fail back and forth between the
nodes until you force the database online and
add the login. If you plan to remove the BUILTIN Administrators group from SQL Server as part of your security lockdown, make sure to add this new
account as a login to the SQL Server instance that’s
removing the BUILTIN\Administrators group.
Finally, you need to preconfigure the Domain
Groups for Clustered Services page. These are Windows
domain groups that contain the domain accounts
that will run the various services. These groups will
be given administrative rights within the SQL Server
instance, so they should contain only the domain account
that the SQL Server instance will run under.
Three domain groups are needed; however, if the full
text service, SQL Server Agent service, and SQL Server
service all run under the same domain account, only
one domain group is needed.
Creating and Configuring
the Cluster
Now that you’ve planned for your SQL Server cluster,
you can configure the cluster and install SQL Server.
Open the Cluster Administrator from the Start, Programs,
Administrative Tools menu and select Create
new cluster from the Action drop-down menu, as
shown in Figure 1. Next, select the domain that the
servers are located in and enter the cluster name (in
this case SQL01) in the Cluster name text box of the
New Server Cluster Wizard. On the next screen, enter
the username, password, and domain of the account
that will be used to monitor the cluster.
Continue to page 2