See correction to this article

The other day, a hardware failure brought down our Exchange server. This failure created a panic in our user community because we consider email availability as important as a dial tone. Had we been using a Windows NT cluster, we users would never have noticed the problem. By providing continuous availability through replication, an NT cluster could have saved us a lot of frustration and prevented the loss in productivity.

Today's NT clustering solutions solve one business computing problem: availability. By replicating data, applications, and even entire systems, clustering lets two or more systems watch each other's backs and take over the workload (user connections, applications, and services) if one system fails. This article reviews the clustering solutions currently available, categorizes them, and illustrates which business computing problems clustering can help solve now.

So What's a Cluster Anyway?
A cluster is a group of whole, standard computers that work together as a unified computing resource and that can create the illusion of being one machine, a single system image. (With NT clusters, the term whole computer, which is synonymous with node, means a system that can run on its own, apart from the cluster. If you're not familiar with clustering terms, you can refer to "Clustering Terms and Technologies.") This unified computing resource ensures availability because any node can take on the workload of any other node that happens to fail.

Clusters come in three configuration types: active/active, active/standby, and fault tolerant. Let's examine each one:

  • Active/active: All nodes in the cluster perform meaningful work. If any node fails, the remaining node (or nodes) continues handling its workload and takes on the workload from the failed node. Failover time is between 15 seconds and 90 seconds.
  • Active/standby: One node (the primary node) performs work, and the other (the standby, or secondary node) stands by waiting for a failure in the primary node. If the primary node fails, the clustering solution transfers the primary node's workload to the standby node and terminates any users or workload on the standby node. Failover time is between 15 seconds and 90 seconds.
  • Fault tolerant: A fault-tolerant cluster is a completely redundant system (disk and CPU) whose goal is to be available 99.999 percent of the time. That goal translates to fewer than 6 minutes of downtime per year. Both nodes of a fault-tolerant cluster simultaneously perform identical tasks; the nodes' workloads are redundant. Failover time is less than 1 second.
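
To check the arithmetic behind the 99.999 percent goal, here is a minimal sketch in Python. The function name and the 365-day year are my own assumptions for illustration; they don't come from any clustering product.

```python
# Assumption: a 365-day year (leap years ignored).
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the yearly downtime budget implied by an availability target."""
    unavailable_fraction = 1 - availability_percent / 100
    return MINUTES_PER_YEAR * unavailable_fraction


if __name__ == "__main__":
    # Compare a few availability targets, ending with the five-nines goal.
    for target in (99.0, 99.9, 99.99, 99.999):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% available -> {minutes:.2f} minutes of downtime per year")
```

At 99.999 percent, the budget works out to roughly 5.26 minutes per year, which agrees with the "fewer than 6 minutes" figure above.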

To illustrate the definition of a cluster, let's say you have one group of users relying on file and print services on Server A and another group of users accessing an Oracle database on Server B. Servers A and B are nodes in an active/active cluster. If Server A fails, Server B continues handling its own workload and picks up Server A's workload. The users accessing the Oracle database do not notice any change in their service; the file and print users experience, at most, a short delay.
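
The Server A and Server B scenario can be sketched as a toy model in Python. This is only an illustration of the active/active idea, not how Wolfpack or any other product implements failover; the `Node` and `ActiveActiveCluster` classes and the least-loaded placement rule are hypothetical choices of mine.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A whole computer in the cluster (hypothetical model)."""
    name: str
    workloads: list = field(default_factory=list)
    alive: bool = True


class ActiveActiveCluster:
    """Toy active/active cluster: every node does meaningful work."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def fail(self, name: str) -> None:
        """Mark a node as failed and move its workloads to the survivors."""
        failed = next(n for n in self.nodes if n.name == name)
        failed.alive = False
        survivors = [n for n in self.nodes if n.alive]
        if not survivors:
            raise RuntimeError("no surviving node to take over the workload")
        # Assumption: each orphaned workload goes to the least-loaded survivor.
        for workload in failed.workloads:
            target = min(survivors, key=lambda n: len(n.workloads))
            target.workloads.append(workload)
        failed.workloads = []


server_a = Node("Server A", ["file and print"])
server_b = Node("Server B", ["Oracle database"])
cluster = ActiveActiveCluster([server_a, server_b])

cluster.fail("Server A")
print(server_b.workloads)  # ['Oracle database', 'file and print']
```

After the simulated failure, Server B still carries its Oracle workload and has also taken over file and print. The model leaves out the 15-second to 90-second failover delay that real users would see.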

NT Clustering Solutions
As the need for availability becomes ever more crucial in the NT environment, many third-party vendors and Microsoft have introduced or are about to introduce clustering solutions for NT. To help you evaluate these clustering solutions, let me briefly explain Microsoft's clustering initiative, Wolfpack, and categorize its capabilities in comparison with those of some prominent third-party clustering solutions. (For reviews of several individual clustering products, including Wolfpack, see Lab Reports.)

Wolfpack
Wolfpack is Microsoft's two-node, active/active clustering solution and set of APIs for NT. Wolfpack's purpose is to provide high availability to your NT Server environment.

Wolfpack will have an effect in several significant areas. First, you can expect all server manufacturers who want to reach NT customers to offer Wolfpack-based clustering support this year. Even a year before its release, Wolfpack had the backing of Digital Equipment, Compaq Computer, Tandem, Intel, Hewlett-Packard, NCR, and IBM.

Theoretically, Wolfpack will work on any two Intel-based or any two Alpha-based servers, but you can't mix Intel and Alpha. However, in practical terms, the number of supported systems will be very restricted because to get on the Wolfpack Hardware Compatibility List (WHCL), each manufacturer must test complete configurations (system, disk subsystem, and SCSI adapter) for compatibility. This approach stands in contrast to NT's existing Hardware Compatibility List (HCL), which lets manufacturers list individual system components. For the WHCL's first release, Microsoft will let each manufacturer list only two configurations. Microsoft will support Wolfpack only for systems on the WHCL, so don't try to build your own Wolfpack clustering solution. Although these requirements will initially limit the selection of Wolfpack-compliant configurations, the WHCL will grow over time.

CORRECTIONS TO THIS ARTICLE:
In Mark Smith's article, "Clusters for Everyone," we incorrectly reported that Stratus uses a proprietary interconnect. In fact, Stratus uses standard, redundant 100Base-T connections. Although Isis Availability Manager (cluster software) runs only on Stratus hardware, that hardware can run multiple clustering software solutions, including Microsoft's Wolfpack clustering software. Finally, Stratus uses mirroring technology rather than SCSI switching, as was originally reported. For more information, visit the Stratus Web site at http://www.stratus.com.

 
 
