Sidebar: "2 Servers Are Better Than 4"

Keeping mission-critical applications running is one of an IT department's most important responsibilities. Although clustering products provide an effective high-availability solution, the failover process can disrupt application processing for 30 seconds or longer. Depending on the client application's design, users might have to reconnect to the clustered application when it resumes on the new node, and if the failed node is at a remote site, you'll have to dispatch a technician to repair it. Furthermore, Windows 2000 Datacenter Server-based clusters require careful management to maintain their high level of reliability.

Several server vendors have developed specialized products that address some or all of these concerns. Marathon Technologies, NEC Technologies, and Stratus Technologies have introduced solutions that claim to deliver five-nines (99.999 percent) hardware reliability for departments and small-to-midsized businesses. Their solutions rely on fault tolerance rather than clustering technology and use Win2K Advanced Server with standard versions of your applications. Unlike clustering, in which a server failure halts applications temporarily while application processing shifts to an alternate node, fault-tolerant systems let applications run uninterrupted on a redundant subsystem. After you replace the failed parts, both clustered nodes and fault-tolerant systems halt processing temporarily. NEC and Stratus say that powering up and resynchronizing the new part (known as reintegration in fault-tolerant systems) can take as long as 12 seconds under Win2K AS. Marathon says its reintegration times are a few seconds at most. In contrast, failing back a cluster can halt application processing for 30 seconds or more.
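To put the vendors' five-nines claim next to the pause times quoted above, the following minimal sketch works out the arithmetic. It assumes a 365-day year and, purely for illustration, that the whole downtime budget is spent on pauses of a single fixed length; the function names are hypothetical and don't come from any vendor's tooling.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # 31,536,000 seconds; leap years ignored


def downtime_budget_seconds(nines: int) -> float:
    """Yearly downtime allowed at an availability of `nines` nines (5 -> 99.999%)."""
    unavailability = 10.0 ** -nines
    return unavailability * SECONDS_PER_YEAR


def pauses_within_budget(nines: int, pause_seconds: float) -> int:
    """Whole number of pauses of `pause_seconds` that fit in the yearly budget."""
    return int(downtime_budget_seconds(nines) // pause_seconds)


if __name__ == "__main__":
    budget = downtime_budget_seconds(5)
    print(f"Five-nines budget: {budget:.2f} seconds per year")
    # 30 s: cluster failover/failback figure; 12 s: NEC/Stratus reintegration figure
    for pause in (30.0, 12.0):
        print(f"{pause:>4.0f}-second pauses per year: {pauses_within_budget(5, pause)}")
```

Under these assumptions, the five-nines budget is a little over five minutes per year, so ten 30-second cluster interruptions would consume nearly all of it.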

When you compare clustering and fault-tolerant technologies, remember that Microsoft Cluster service addresses hardware and software failures, whereas fault-tolerant systems primarily address hardware reliability. Although the approaches that NEC, Marathon, and Stratus use in their fault-tolerant architectures should reduce the likelihood of a software failure, if you need Cluster service's high level of software reliability, you'll need to purchase cluster-aware versions of your applications, which is an added expense.

A Unique Approach
Marathon's Endurance 6200 system, which the Windows & .NET Magazine Lab tested ("Endurance 6200 3.0," July 2001, http://www.winnetmag.com, InstantDoc ID 21140), uses four servers that appear as one system to the application and the user. Two servers function as compute engines; the other two function as I/O Processors (IOPs). Marathon separates the application environment from most drivers, thereby shielding the application from driver-induced failures. The Endurance product is targeted at applications that run on single- or dual-processor server systems.

The compute engines have one or two processors, memory, core logic, and a disk drive; the IOPs contain from one to four CPUs, memory, disk controllers, storage, and NICs. All four servers are connected, and Marathon's proprietary NICs provide 50Mbps throughput. You must configure the two compute engines identically to each other, and likewise the two IOPs. Each compute engine is paired with an IOP into what Marathon calls a tuple. Both compute engines run the same applications in lockstep and store data on their respective IOPs. When a fault halts processing on one tuple, processing continues uninterrupted on the other.
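The tuple behavior described above can be pictured with a toy model. This is only a conceptual sketch, not Marathon's implementation: the class and method names are hypothetical, "lockstep" is reduced to applying each operation to every healthy tuple, and reintegration is reduced to copying the survivor's state.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTuple:
    """Hypothetical stand-in for one compute engine plus its IOP."""
    name: str
    healthy: bool = True
    stored: list = field(default_factory=list)  # data written to this tuple's IOP


class RedundantPair:
    """Two tuples that apply the same operations; either one can carry on alone."""

    def __init__(self) -> None:
        self.tuples = {"A": ToyTuple("A"), "B": ToyTuple("B")}

    def execute(self, op: str) -> str:
        """Apply `op` on every healthy tuple and return the name of one that served it."""
        survivors = [t for t in self.tuples.values() if t.healthy]
        if not survivors:
            raise RuntimeError("both tuples have failed")
        for t in survivors:
            t.stored.append(op)
        return survivors[0].name

    def fail(self, name: str) -> None:
        """Simulate a hardware fault on one tuple."""
        self.tuples[name].healthy = False

    def reintegrate(self, name: str) -> None:
        """Resynchronize a repaired tuple from a healthy one, then bring it back."""
        source = next(t for t in self.tuples.values() if t.healthy)
        target = self.tuples[name]
        target.stored = list(source.stored)
        target.healthy = True


if __name__ == "__main__":
    pair = RedundantPair()
    pair.execute("write-1")
    pair.fail("A")
    print(pair.execute("write-2"))   # served by the surviving tuple
    pair.reintegrate("A")
    print(pair.tuples["A"].stored)   # state matches the survivor after resync
```

In the real product, the brief pause occurs during the resynchronization step; the model above ignores timing entirely.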

When the Lab tested Endurance 6200 last year, the product performed flawlessly: application processing continued uninterrupted when we initiated hardware failures and paused only briefly when we reintegrated the downed tuple. In our tests, the longest application processing interruption during reintegration was just 4.5 seconds, much shorter than a cluster-failback interruption.

When we reviewed Endurance 6200, the only way to purchase the product was to buy identically configured server pairs with the product already integrated, which made Endurance 6200 expensive unless you implemented it as part of an existing plan to add new servers. Now you can purchase the software and Marathon Interface Cards (MICs) as a kit and retrofit existing servers for about $20,000 (you must still use identically configured server pairs for the compute engines and the IOPs). Endurance 6200 runs on Win2K Server, Win2K AS, Windows NT Server, and NT Server, Enterprise Edition (NTS/E).

Marathon recently announced a new software-only implementation of the Endurance product that will require two servers rather than four. (See the sidebar "2 Servers Are Better Than 4.") With the current Endurance hardware and software kit, your application needs only one license. If you purchase either product with a Win2K license, you'll need only one license for your OS, as well. If you choose to rely on your existing Win2K volume license agreement, the wording of that agreement will determine whether you'll need a second OS license. The new product (whose name hasn't been finalized) will be available for Win2K and will support Windows .NET Server (Win.NET Server) 2003 shortly after its release. Marathon has no plans to support NT with this product.

A Different Approach
NEC and Stratus have taken a different approach to fault-tolerant computing. Stratus licensed its fault-tolerant core logic chipset to NEC, and NEC has designed several two- and four-processor servers around it that both vendors market under their own names. Each vendor uses its own peripheral components or those of a selected OEM, however. These server designs use Win2K AS and unmodified, standard versions of applications. Unlike clustered systems, the NEC and Stratus products require just one OS and application license.

The first server based on both vendors' technology is a dual-processor server that employs Intel 800MHz Pentium III processors with 256KB cache. The Lab tested Stratus's version of this server, the ftServer 3210 (see "Stratus ftServer 3210," July 2002, http://www.winnetmag.com, InstantDoc ID 25335). NEC's version of this server is known as the Express5800/ft 320La.

These servers have user-replaceable component modules that let a bank branch manager, for example, replace a failed module and integrate the new module without having to send for an IT person. Each 8U (14") rack-mount server includes shoe-box-sized pairs of processor, I/O, and storage modules. Each processor module contains a motherboard with one or two processors, core logic, and memory; each storage module contains as many as three SCSI drives. Each I/O module contains two SCSI controllers along with video subsystems and PCI slots. The server also includes a pair of NICs for maximum redundancy. The NEC and Stratus servers use VERITAS Software's VERITAS Volume Manager (included with each server) to mirror the OS and data on the pair of disk subsystems.
