Using Exchange 2000 in Clusters
Exchange 2000 was the first release to support active-active clusters, meaning that every node in the cluster supports an Exchange virtual server at the same time. Unfortunately, active-active clusters ran into virtual-memory fragmentation problems within the Store, and this problem has prevented Exchange 2000 from scaling up as much as it should on a cluster.
As Exchange 2000 runs, Windows allocates and deallocates virtual memory to the Store to map mailboxes and other structures. Some allocations must be contiguous, such as the approximately 10MB of memory that's necessary to mount a database. However, as time goes by, providing the Store with enough contiguous virtual memory becomes difficult because the memory becomes fragmented. In concept, this fragmentation is similar to the fragmentation that occurs on disks, and it usually doesn't cause too many problems except during cluster state transitions.
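To make the difference between total free memory and contiguous free memory concrete, here is a minimal Python sketch. It models free virtual memory as a plain list of free-block sizes, which is a simplification invented for illustration and not how Windows or the Store actually tracks memory. The function names are hypothetical; the 10MB figure is the approximate mount requirement mentioned above.

```python
# Toy model: free virtual memory as a list of free-block sizes in MB.
# Illustration only; this is not how Windows or the Store tracks memory.

MOUNT_REQUIREMENT_MB = 10  # approximate contiguous memory needed to mount a database


def largest_free_block(free_blocks_mb):
    """Return the size of the largest contiguous free block, or 0 if none."""
    return max(free_blocks_mb, default=0)


def can_mount_database(free_blocks_mb, required_mb=MOUNT_REQUIREMENT_MB):
    """A mount succeeds only if a single free block is large enough."""
    return largest_free_block(free_blocks_mb) >= required_mb


# Same total free memory (64MB), very different outcomes.
fresh = [64]           # one contiguous region
fragmented = [8] * 8   # eight scattered 8MB holes

print(sum(fresh), can_mount_database(fresh))
print(sum(fragmented), can_mount_database(fragmented))
```

Both lists hold 64MB of free memory, but only the unfragmented one can satisfy a 10MB contiguous request, which is the situation a heavily loaded node can be in when a failover arrives.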
During a cluster state transition, the cluster must move the SGs that were active on a failed node to one or more other nodes in the cluster. SGs consist of sets of databases, so the Store has to initialize the SGs, then mount the databases so that users can access their mailboxes. You can track this activity through event ID 1133 in the Application event log. On a heavily loaded cluster, the Store might not be able to mount the databases because not enough contiguous virtual memory is available, in which case you'll see an event such as event ID 2065. Thus, you encounter a situation in which the cluster state transition occurs but the Store is essentially brain-dead because the databases are unavailable. This kind of situation occurs only on heavily loaded systems, but consolidating servers and building big, highly resilient systems are prime driving factors for considering clusters in the first place.
After receiving problem reports, Microsoft analyzed the situation and realized that a problem existed when Exchange runs in active-active mode. Microsoft began advising customers to limit cluster designs and to limit the number of concurrently supported clients to 1000 in Exchange 2000, 1500 in Exchange 2000 Service Pack 1 (SP1), and 1900 in Exchange 2000 SP2 and SP3.
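For capacity planning, the limits quoted above can be kept in a small lookup. The Python sketch below is only an illustration: the key "RTM" for the original release and the function name are labels chosen here, not Microsoft terminology taken from the guidance.

```python
# Concurrent-client limits for active-active clusters, as quoted in the text.
# "RTM" is a label chosen here for Exchange 2000 without a service pack.
RECOMMENDED_MAX_CLIENTS = {
    "RTM": 1000,
    "SP1": 1500,
    "SP2": 1900,
    "SP3": 1900,
}


def within_recommendation(service_pack, concurrent_clients):
    """Return True if the planned load stays within the quoted limit."""
    try:
        limit = RECOMMENDED_MAX_CLIENTS[service_pack]
    except KeyError:
        raise ValueError(f"unknown service pack: {service_pack!r}")
    return concurrent_clients <= limit


print(within_recommendation("SP1", 1400))
print(within_recommendation("RTM", 1200))
```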
The client numbers that Microsoft recommends are based on Messaging API (MAPI) loads. Because MAPI is the most functional and feature-rich protocol, MAPI clients usually generate the heaviest workload for Exchange. Microsoft Outlook Web Access (OWA) clients generate much the same type of demand as MAPI clients. However, other client protocols (e.g., IMAP4, POP3) typically generate lower system demand and place a lighter workload on the server. So, organizations that rely on these protocols might be able to support more client connections than the number of clients Microsoft recommends before the virtual-memory problem appears.
Exchange 2000 SP3 includes a new virtual-memory allocation scheme for the Store. This new scheme changes the way in which Windows allocates and deallocates memory. Experience to date demonstrates that servers running SP3 encounter fewer memory problems on high-end clusters. Thus, Microsoft highly recommends that organizations with large clusters upgrade to Exchange 2000 SP3 or, even better, upgrade the OS to Windows 2003 and deploy Exchange 2003, which better manages memory.
The problems with virtual-memory management have forced Microsoft to express views about how to set up active clusters. Essentially, Microsoft's advice is to keep a passive node available whenever possible, meaning that a two-node cluster should run in active-passive mode and a four-node cluster should have three active nodes and one passive node.
Available virtual memory begins to decline as the load on a cluster grows. Exchange logs event ID 9582 when less than 32MB of available memory is present, then logs the same event again when no contiguous blocks of virtual memory larger than 16MB exist inside the Store. After Exchange reaches this second threshold, the cluster can become unstable and stop responding to client requests, and you must reboot. You might also see event ID 9582 in two other situations:
- Event ID 9582 might appear immediately after a failover to a passive node if the passive node previously hosted the same virtual server. Each node maintains a stub store.exe process, and the structures within the process might have already been fragmented, leading to the error. If this error occurs, you can transition the virtual server to another node in the cluster, then restart the server that has the fragmented memory. If a passive node isn't available, you have to restart the active node. Exchange 2000 SP3 generates far fewer problems of this nature, so you're unlikely to see event ID 9582 triggered under anything but extreme load.
- Incorrect use of the /3GB switch in the boot.ini file can generate event ID 9582. If you're hosting Exchange 2003 on a Windows 2003 server that has more than 1GB of physical memory, you should set the /3GB switch and its associated /Userva= switch in the boot.ini file so that Windows 2003 has a better balance in its allocation of resources between kernel- and user-mode memory. For more information about these switches, see the Microsoft article "XADM: Event Viewer Log Entries Cite Virtual-Memory Fragmentation on an Exchange 2000 Server" (http://support.microsoft.com/?kbid=314736) and "XADM: Using the /Userva Switch on Windows 2003 Server-Based Exchange Servers" (http://support.microsoft.com/?kbid=810371).
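The two event ID 9582 thresholds described above can be expressed as a small classifier. This Python sketch assumes that both thresholds are measured against the largest contiguous free block inside the Store, which the text states only for the 16MB case. The stage names "warning" and "critical" are labels chosen for this sketch, not text from the event log.

```python
# Thresholds taken from the text; stage names are labels chosen for this sketch.
FIRST_THRESHOLD_MB = 32
SECOND_THRESHOLD_MB = 16


def classify_9582(largest_free_block_mb):
    """Map the largest contiguous free block (in MB) to a hypothetical alert stage."""
    if largest_free_block_mb <= SECOND_THRESHOLD_MB:
        return "critical"  # no block larger than 16MB remains
    if largest_free_block_mb < FIRST_THRESHOLD_MB:
        return "warning"   # less than 32MB available
    return "ok"


for size in (48, 24, 16, 4):
    print(size, classify_9582(size))
```

A monitoring script could feed this function a value sampled from the server and act on the "warning" stage, while there is still time to move the virtual server to another node in a controlled way.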
Note that some third-party products, particularly virus checkers, can affect virtual-memory usage. The sidebar "Monitoring Virtual Memory," page 56, discusses how to determine the amount of virtual memory a third-party product uses as well as how to monitor the amount of available virtual memory in a cluster.