Meeting the high-availability challenge

Participating in global markets and maintaining a 24 x 7 presence for customers can be rewarding, but keeping servers running to meet those availability requirements can cause many headaches for systems administrators. System outages can mean lost revenue, lost productivity, lost data, and dissatisfied customers. One tightly packaged solution to the availability challenge is Stratus Technologies' ftServer 3210, an entry-level, fault-tolerant server with an average annual downtime of less than 1 hour.
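To put the downtime figure in perspective, the short Python sketch below converts annual downtime into an availability percentage and back. The function names are illustrative, and the calculation assumes a 365-day year; it is simple arithmetic, not a Stratus measurement method.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def availability(downtime_hours_per_year: float) -> float:
    """Return the fraction of the year the system is up."""
    return 1.0 - downtime_hours_per_year / HOURS_PER_YEAR


def downtime_hours(availability_fraction: float) -> float:
    """Return the annual downtime, in hours, implied by an availability fraction."""
    return (1.0 - availability_fraction) * HOURS_PER_YEAR


if __name__ == "__main__":
    # One hour of downtime per year, the upper bound quoted for the ftServer 3210.
    print(f"Availability at 1 hour/year: {availability(1.0):.4%}")
    # Downtime budget implied by a 99.99 percent availability target.
    print(f"Downtime at 99.99%: {downtime_hours(0.9999) * 60:.1f} minutes/year")
```

By this arithmetic, less than 1 hour of downtime per year corresponds to availability of roughly 99.99 percent.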

Product Architecture
Stratus Technologies builds the ftServer 3210 from the ground up to be a highly available, fault-tolerant platform for mission-critical applications. The product runs Windows 2000 Advanced Server to accommodate the plethora of applications that require high availability (e.g., securities trading, retail banking, messaging, health care, point of sale).

Like alternative solutions such as Win2K clustering, the ftServer 3210 is built on standard hardware (e.g., Intel Pentium III processors, DIMM memory, Ultra 160 SCSI, PCI bus, hot-swappable drives). The defining difference is in the implementation of redundant hardware components. Whereas a cluster might have a second server standing by for failover use, the ftServer 3210 uses paired components internally.

Stratus Technologies uses the term Dual Modular Redundancy (DMR) to refer to the wholly redundant, lockstep-operating CPU modules. Lockstep operation means that at any given time, the system executes identical and precisely parallel actions on each of the paired components. The server ships with VERITAS Software's VERITAS Volume Manager 2.7, which dual-initiates the disks by using redundant SCSI controllers and buses, then logically pairs the disks. The Intel PROSet II utility manages the redundant NICs to establish fault tolerance. Other components (e.g., floppy disk and CD-ROM drives) are standard simplex devices. You can hot-swap the majority of the ftServer 3210 modules for service and upgrade operations.
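The idea of lockstep operation can be illustrated with a toy software model. The Python sketch below is only a conceptual analogy: the real mechanism is implemented in hardware, and the class names, the `faulty` flag (standing in for hardware self-checking), and the accumulator workload are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CpuModule:
    """Toy stand-in for one of the paired CPU modules."""
    name: str
    accumulator: int = 0
    online: bool = True
    faulty: bool = False  # hypothetical flag standing in for hardware self-checking

    def step(self, value: int) -> int:
        self.accumulator += value
        return self.accumulator


class LockstepPair:
    """Runs every operation on all online modules and checks that they agree."""

    def __init__(self) -> None:
        self.modules = [CpuModule("A"), CpuModule("B")]

    def execute(self, value: int) -> int:
        # Take any module that reports a fault out of service before the step.
        for module in self.modules:
            if module.online and module.faulty:
                module.online = False
        live = [m for m in self.modules if m.online]
        if not live:
            raise RuntimeError("both modules have failed")
        # Every live module performs the identical operation.
        results = {m.step(value) for m in live}
        if len(results) != 1:
            raise RuntimeError("lockstep divergence between live modules")
        return results.pop()

    def replace(self, name: str) -> None:
        """Swap in a fresh module and synchronize it with the live one."""
        live = next((m for m in self.modules if m.online), None)
        if live is None:
            raise RuntimeError("no live module to synchronize from")
        for i, module in enumerate(self.modules):
            if module.name == name:
                self.modules[i] = CpuModule(name, accumulator=live.accumulator)


if __name__ == "__main__":
    pair = LockstepPair()
    pair.execute(5)
    pair.modules[1].faulty = True   # simulate a failure in module B
    print(pair.execute(7))          # module A carries on alone
    pair.replace("B")               # hot-swap: new B copies A's state
    print(pair.execute(1))          # both modules run in lockstep again
```

In this model, the caller of `execute` never sees a switchover: the surviving module simply keeps producing results, and a replacement module picks up the live module's state before rejoining the pair.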

The ftServer 3210 has some similarities to a Win2K-based cluster. The Win2K kernel is at the heart of both systems, and additional software is responsible for monitoring and responding to hardware problems. Stratus has built a layer of fault-tolerant services that exist below the application level in Win2K, so applications don't require modification to benefit from the ftServer 3210's availability. This software also maintains mean time between failures (MTBF) statistics that a systems administrator can analyze and use to tailor responses to crucial hardware concerns. Hardened device drivers for Stratus-supported PCI adapters provide self-monitoring duplexed operation and manage physical memory that the adapters access.
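For readers unfamiliar with MTBF, the following Python sketch shows the textbook calculation and how MTBF combines with mean time to repair (MTTR) to give steady-state availability. The function names and sample figures are made up for illustration; the article doesn't describe how the Stratus software computes or stores its statistics.

```python
def mtbf(operating_hours: float, failures: int) -> float:
    """Mean time between failures: total operating time divided by failure count."""
    if failures <= 0:
        raise ValueError("at least one failure is needed to compute MTBF")
    return operating_hours / failures


def steady_state_availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Textbook availability estimate: MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Illustrative numbers only: a component that failed twice in two years
    # of operation and takes one hour to replace.
    component_mtbf = mtbf(operating_hours=17520, failures=2)
    print(f"MTBF: {component_mtbf:.0f} hours")
    print(f"Availability: {steady_state_availability(component_mtbf, 1.0):.4%}")
```

An administrator could apply the same arithmetic to compare components and decide which ones warrant tighter alert thresholds or on-site spares.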

The characteristics that best differentiate the ftServer 3210 from a cluster relate to failure, recovery, and implementation time. With the ftServer 3210's lockstep operation, no switchover occurs when a component fails. Components that haven't failed simply continue to run the system until you replace the failed components. Recovering from a failure is less labor-intensive because in most cases you need only swap modules. The new module synchronizes with the live module without interruption. Another benefit of lockstep operation is that in the event of a failure, the system protects data that's in memory as well as data on disk. Because implementing the ftServer 3210 doesn't require scripting or application modification, implementation is fast. The ftServer 3210 also sidesteps the typical shared-disk storage problems that plague conventional cluster implementations because the redundant adapters and cables are all internal to the product.

Monitoring and Management
Stratus provides two Microsoft Management Console (MMC) snap-ins for monitoring and management: ftServer Management Console and Stratus ftServer Software Availability Manager. (The Software Availability Manager items are visible only through a pop-up menu when you right-click the Availability Manager icon in MMC.) The ftServer Management Console lets you drill down for specific hardware information, as Figure 1 shows, and the Software Availability Manager lets you set thresholds and alerting options for various monitored hardware and software components. I found both snap-ins to be intuitive and useful for monitoring and managing the server. The well-instrumented hardware provides more statistics than most people would ever want to see. You can also take advantage of this instrumentation through leading third-party management applications (e.g., Computer Associates' Unicenter, IBM's Tivoli), and you can run a Remote Management Installation, which gives you the tools to manage the ftServer 3210 from another system. Although you can customize monitoring to suit your needs, the ftServer 3210 also comes equipped with integrated service technology.
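Threshold-based alerting of the kind described above follows a simple pattern, sketched here in Python. This isn't the Software Availability Manager's interface, which is configured through MMC; the component names, metric names, and limits are hypothetical and serve only to show the logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    component: str   # hypothetical component label, e.g. "cpu-module-0"
    metric: str      # hypothetical metric name, e.g. "temperature_c"
    limit: float     # alert when the reading exceeds this value


def check_thresholds(readings: dict, thresholds: list) -> list:
    """Return one alert message for each reading that exceeds its limit.

    `readings` maps (component, metric) tuples to the latest measured value.
    Thresholds without a matching reading are skipped.
    """
    alerts = []
    for t in thresholds:
        value = readings.get((t.component, t.metric))
        if value is not None and value > t.limit:
            alerts.append(
                f"{t.component}: {t.metric}={value} exceeds limit {t.limit}"
            )
    return alerts


if __name__ == "__main__":
    thresholds = [
        Threshold("cpu-module-0", "temperature_c", 70.0),
        Threshold("disk-1", "corrected_errors", 10.0),
    ]
    readings = {
        ("cpu-module-0", "temperature_c"): 74.5,
        ("disk-1", "corrected_errors"): 3.0,
    }
    for alert in check_thresholds(readings, thresholds):
        print(alert)
```

Separating the threshold definitions from the check itself mirrors the way a management tool lets you adjust limits without changing how monitoring works.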
