SQL Server Multi-Instance Clustering
As Microsoft Clustering Services (MSCS) in Windows 2003 and Windows 2000 has matured, SQL Server has gained a lot of market space because it lets you maintain high data availability when you run SQL Server on a Windows failover cluster (not to be confused with the Network Load BalancingNLBclusters that Web farms use). As Figure 1 shows, a typical active-passive failover cluster configuration consists of two nodes: a primary node and a failover node. (For information about clustering, see the clustering articles by Brian Knight and Richard Waymire in Related Reading.) This architecture is dependable, but it can also be very costly; Microsoft doesn't require you to license SQL Server on the failover (or passive) node in a cluster, but you can't avoid the hardware cost, the cost of the OS (which must be Enterprise Edition to support clustering), and the administrative overhead for the passive server. If you have several mission-critical SQL Server instances in your enterprise, each requiring a clustered environment, maintaining two servers for each instance adds up quickly. A cost-saving alternative is multi-instance clustering (aka an active-active cluster), in which you have two nodes, each acting as a primary node for a SQL Server instance and as a failover node for the other instance, as Figure 2 shows. This architecture lets you fully utilize all your investments in hardware and the OS, and it has little added administrative overheadyou add only the cost of licensing SQL Server on the second node of the cluster. This architecture works well in environments that don't have large databases, high concurrency, or overly complex queries. But if any memory-hungry activity is happening on the clustered instances during a failover, as in the database-consolidation senario, the two SQL Server instances will compete for all the available RAM and processing facilities on the same server. Instantly, all activity in the databases in both SQL Server instances will slow to a crawl. After about the third call from an irate database user, what seemed like a great plan (saving money by using multi-instance clustering) turns out to be a really bad plan.
Conceptually and logically, nothing is wrong with the multi-instance clustering architecture, but physically it just doesn't work with the 4GB addressable memory space and limited processing horsepower of a 32-bit server. However, if you host the multi-instance cluster on 64-bit hardware, your cluster has a greater capacity to weather a failover with less effect on end users. Of course, you need to appropriately configure your servers with adequate excess memory so that you can realize these benefits, but these costs are nominal compared to the costs of maintaining an expensive active-passive model. In an active-active model, you immediately make up for the cost of the 64-bit hardware and increased memory because all hardware and software is in use rather than sitting in a passive mode.
Some shops that use Windows 2003 Enterprise Edition or Windows 2000 Datacenter Edition might employ an N+1 clustered architecture, in which, for example, the cluster might contain four nodesthree active nodes and one standby node that the others use for failing over. When you crunch the numbers, this architecture is less expensive than a two-node single-instance cluster (active-passive), even in a 32-bit architecture, because you're using more of the hardware you've invested in. But even this model can be further improved if you use 64-bit. If you have an N+1 cluster on 32-bit and more than one node fails over, you need to manually intervene (or do a fair amount of cluster.exe scripting) and carefully supervise the fail-back or restore process. Otherwise, you can quickly end up with all of your SQL Server instances running on only one node in the clusterand you might not have enough resources to run all your instances on the same 32-bit node. But hosting the cluster on 64-bit hardware gives you the memory and processing power you need to weather this type of convergent failover with less effect on end users, so you'll have the time to ride out a recoveryeven from a catastrophic hardware failure.
Time to Ante Up
With SQL Server 2005 just ahead, we have a lot to look forward to in 64-bitnotably, comprehensive product builds for Itanium2, Opteron, and Xeon-extendedand we'll have all the SQL Server 2005 subproducts, including Integration Services, Reporting Services, Notification Services, and Client Tools. If you're considering new server purchases, and your organization is hedging its bets by thinking about sticking with a 32-bit system, take another look at how 64-bit might serve both your current and future needs. You might discover that it's worth the gamble.