Understanding DC Failover
If a site has two or more DCs, you generally don't have to worry about failover because a client will always choose a DC in its site, as long as one is available.
However, under certain circumstances, two or more DCs in a site can become unavailable while the clients are still functional (e.g., a blown data center circuit breaker or failed air conditioning). Because these situations are unlikely, designing for this type of failover isn't cost-effective.
So what happens in a typical site topology when DC failover occurs? If SPOKE1-DC1 is unavailable, the client attempts to query the next DC in the DC list. Remember that the list consists of sitewide DCs and domainwide DCs. Because the Spoke1 site contains only one DC and because that DC fails to respond to the client's Lightweight Directory Access Protocol (LDAP) over UDP pings, the client begins querying the domainwide DCs on the list. The remaining DCs on the list appear in random order, so the next DC that the client queries could be anywhere in the domain. The client will work through this list until a DC responds to its queries.
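The list ordering described above can be sketched in a few lines of Python. This is only an illustrative model, not the actual locator code: the function names are hypothetical, every DC name other than SPOKE1-DC1 is assumed for the example, and timing between queries is ignored here.

```python
import random


def build_dc_list(site_dcs, domain_dcs, rng=random):
    """Return the sitewide DCs first, then the remaining domainwide DCs in random order."""
    remaining = [dc for dc in domain_dcs if dc not in site_dcs]
    rng.shuffle(remaining)  # no domainwide DC is favored over any other
    return list(site_dcs) + remaining


def first_responder(dc_list, is_up):
    """Walk the list in order and return the first DC that answers, or None."""
    for dc in dc_list:
        if is_up(dc):
            return dc
    return None


if __name__ == "__main__":
    # Hypothetical topology: one DC in the client's site, the rest elsewhere in the domain.
    domain = ["SPOKE1-DC1", "HUB-DC1", "HUB-DC2", "SPOKE2-DC1", "SPOKE3-DC2"]
    dc_list = build_dc_list(["SPOKE1-DC1"], domain)
    # SPOKE1-DC1 is down, so the client falls through to a randomly ordered domainwide DC.
    print(first_responder(dc_list, lambda dc: dc != "SPOKE1-DC1"))
```

Running the example several times prints different DCs, which is the point: once the only DC in the site stops responding, the fallback choice depends on the random order of the domainwide entries.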
Querying the domainwide DCs on the list decreases the chance that the client will get the best possible DC because no DC is favored over any other, regardless of how close a particular DC might be to the client. The Windows 2003 and Win2K DC query interval (the interval that the client waits between queries before moving to the next DC on the list) compounds the difficulty of the situation. In Windows NT 4.0, the OS sent these queries immediately with no pause between them, which meant that the fastest-responding DC (presumably the closest) would win the session setup with the client. In Win2K, the client waits 100 milliseconds (ms) between DC queries. In Windows 2003, the client waits 400ms between queries for the first five DCs, then 200ms between the next five, then 100ms between the remaining DCs. In either Windows 2003 or Win2K, this interval lets the client easily pick an inappropriate DC.
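To make the intervals concrete, here is a small Python sketch that turns them into a schedule of send times. The function names and the OS labels ("nt4", "win2k", "win2003") are hypothetical, and the code assumes one reading of the Windows 2003 rule: the wait after each of the first five queries is 400ms, after each of the next five 200ms, and 100ms after that.

```python
def query_interval_ms(os_name, position):
    """Wait (in ms) after querying the DC at 0-based `position` in the DC list."""
    if os_name == "nt4":
        return 0      # queries go out back to back
    if os_name == "win2k":
        return 100    # fixed pause between queries
    if os_name == "win2003":
        if position < 5:
            return 400
        if position < 10:
            return 200
        return 100
    raise ValueError(f"unknown OS label: {os_name}")


def send_times_ms(os_name, count):
    """Return the time at which each of the first `count` queries is sent."""
    times, now = [], 0
    for position in range(count):
        times.append(now)
        now += query_interval_ms(os_name, position)
    return times


if __name__ == "__main__":
    for os_name in ("nt4", "win2k", "win2003"):
        print(os_name, send_times_ms(os_name, 7))
```

Under these assumptions, a Windows 2003 client doesn't send its sixth query until two full seconds after the first, so an early entry on the list has a long window in which to answer.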
Let's use Figure 2 to show how this behavior will influence DC selection. Assume the network latency between the client and the DCs in Spoke3 is 150ms. The DCs in the closer Hub site are only 75ms away from the client. Because SPOKE3-DC2 is at the top of the DC list, the client pings it first. In a Win2K network, SPOKE3-DC2 can't respond within the 100ms interval, so the client moves to the next DC on the list and pings HUB-DC2, a more appropriate choice. Before HUB-DC2 can respond, however, SPOKE3-DC2's response, which has a 100ms head start, returns to the client (at 150ms, versus 175ms for HUB-DC2) and the client establishes a session with that DC. In a Windows 2003 network, SPOKE3-DC2 has plenty of time to respond before the 400ms interval has expired and the client attempts to contact the next DC in the list. In either configuration, if your client can't find a DC in its own site and you don't manually influence the domainwide DC list, you might not get the most appropriate DC.
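The race in this example can be worked through with a short Python simulation. It is a simplified model under stated assumptions, not the real locator logic: the function name pick_dc is hypothetical, each latency figure is treated as the time from sending a query to receiving its response, and the client is assumed to stop sending queries as soon as any response has arrived.

```python
def pick_dc(dc_list, response_ms, interval_ms):
    """Return (dc, arrival_ms) for the first response the client receives.

    dc_list     -- DC names in the order the client queries them
    response_ms -- dict of DC name to response time in ms (None means no response)
    interval_ms -- fixed wait in ms between consecutive queries
    """
    send_at = 0
    best = None
    for dc in dc_list:
        # Once a response has arrived, the client stops querying further DCs.
        if best is not None and best[1] <= send_at:
            break
        delay = response_ms.get(dc)
        if delay is not None:
            arrival = send_at + delay
            if best is None or arrival < best[1]:
                best = (dc, arrival)
        send_at += interval_ms
    return best


if __name__ == "__main__":
    dc_list = ["SPOKE3-DC2", "HUB-DC2"]
    response_ms = {"SPOKE3-DC2": 150, "HUB-DC2": 75}
    print("NT 4.0 (0ms):    ", pick_dc(dc_list, response_ms, 0))
    print("Win2K (100ms):   ", pick_dc(dc_list, response_ms, 100))
    print("Win2003 (400ms): ", pick_dc(dc_list, response_ms, 400))
```

With no pause between queries, HUB-DC2 answers first at 75ms. With a 100ms pause, HUB-DC2's response would arrive at 175ms, after SPOKE3-DC2's at 150ms. With a 400ms pause, HUB-DC2 is never queried at all.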
Providing DC Failover Capability
A common misconception is that automatic site coverage (AutoSiteCoverage), which is an integral part of the Windows 2003 and Win2K directory service, will provide failover coverage if no DCs are available in the client's site. When you use AutoSiteCoverage, DCs in the site nearest to the client's site can automatically register themselves into the DC-less site. However, these DCs provide coverage only if no DCs are registered in the client's site. Because AutoSiteCoverage doesn't work if a DC exists in the client's site but isn't responding, AutoSiteCoverage doesn't help with DC failover. You can, however, manually force a DC to register itself to provide DC or GC services for another site. To do so, you must add the site names (separated by spaces) to the SiteCoverage registry value of type REG_MULTI_SZ in the HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Netlogon\Parameters registry subkey and perform some additional steps that I'll describe later to make the registration work correctly.
You can use any of three major techniques to provide DC failover capability for your network. Depending on your needs, you can use these techniques individually or in combination.
Method 1: Selective SRV registration. As I mentioned previously, controlling the contents of the DC list controls the client's DC failover behavior. In our example, the domainwide section of the DC list contains DCs from the distant Spoke2 and Spoke3 sites as well as the closer Hub site. What if you could prevent DNS from adding the Spoke2 and Spoke3 DCs but still add the Hub DCs to the DC list? You can, by using a technique to prevent the spoke-site DCs from registering their domainwide SRV records. In other words, if the spoke sites don't register their domainwide SRV records (e.g., _ldap._tcp.dc._msdcs.domain.name), DNS will exclude them from the domainwide section of the DC list. As a result, the domainwide section will contain only the Hub-site DCs, as Figure 3, page 76, shows. However, this technique doesn't prevent the spoke-site DCs from registering site-specific SRV records. Because the spoke-site DCs register their site-specific SRV records, when a spoke-site client requests a DC list, the spoke-site DCs appear in the sitewide DC section of the list.
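The effect of selective SRV registration on the DC list can be modeled with a small Python sketch. Everything here is illustrative: the function name, the dictionary layout, and the DC names other than SPOKE1-DC1, SPOKE3-DC2, and HUB-DC2 are assumptions made for the example, and the lists are sorted only to keep the output stable (the real domainwide section is in random order).

```python
def dc_list_sections(client_site, registrations):
    """Split the DCs a client would see into sitewide and domainwide sections.

    registrations -- dict of DC name to {"site": str, "domainwide": bool}, where
    "domainwide" says whether the DC registers its domainwide SRV records.
    """
    sitewide = sorted(
        dc for dc, reg in registrations.items() if reg["site"] == client_site
    )
    domainwide = sorted(
        dc
        for dc, reg in registrations.items()
        if reg["domainwide"] and dc not in sitewide
    )
    return sitewide, domainwide


if __name__ == "__main__":
    # Spoke-site DCs register only site-specific records; Hub DCs register both.
    registrations = {
        "SPOKE1-DC1": {"site": "Spoke1", "domainwide": False},
        "SPOKE2-DC1": {"site": "Spoke2", "domainwide": False},
        "SPOKE3-DC2": {"site": "Spoke3", "domainwide": False},
        "HUB-DC1": {"site": "Hub", "domainwide": True},
        "HUB-DC2": {"site": "Hub", "domainwide": True},
    }
    print(dc_list_sections("Spoke1", registrations))
```

In this model, a Spoke1 client still sees SPOKE1-DC1 in the sitewide section, but the only fallback candidates left in the domainwide section are the Hub DCs.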
For details regarding how to prevent DCs from registering certain SRV records in DNS, see the Microsoft Active Directory Branch Office Guide Series: Planning Guide, Chapter 2, "Structural Planning for Branch Office Environments" (http://www.microsoft.com/technet/treeview/default.asp?url=/technet/prodtechnol/ad/windows2000/deploy/adguide/default.asp).