More tricks to prepare for and recover from NT meltdowns
In "Recovering from NT Startup Failures, Part 1," September 1999, I discussed common causes of Windows NT startup failures and introduced you to several techniques that you can use to prevent and quickly recover from NT boot disasters. In this second installment, I provide more prevention and recovery tips, and discuss additional NT boot failure causes and the methods and troubleshooting tools you can use to quickly recover from them.
Be Prepared
As I concluded in part 1, the most important step in NT recovery happens long before a failure occurs: preparing for a problem before it begins. To prepare for tomorrow's worst possibilities, you need to take precautionary steps today, such as properly designing your NT systems' hardware and software setup, backing up crucial system configuration information, and developing a disaster-recovery toolkit that includes all the utilities you'll need to recover from common NT boot problems. These resources are your ace in the hole if things go awry.
Most NT users know the importance of maintaining up-to-date copies of the Emergency Repair Disk (ERD) for NT systems. This disk contains a copy of the system Registry and provides crucial information that you need to use NT Setup's Repair process to locate and repair a damaged NT installation. Most IT shops perform regular system backups and create updates of the NT ERD for their NT servers. However, many organizations consider this process tedious and time-consuming because the process requires administrators to physically visit each server and run rdisk.exe. Thus, critical servers' ERDs aren't always as up-to-date as they need to be. If this situation sounds familiar, consider an alternative method of collecting ERD information for your NT machines. Aelita Software Group's ERDisk utility, which Screen 1, page 84 shows, can perform remote, over-the-network ERD creations. In addition to storing ERD information to any drive location (local or network drive) that you specify, ERDisk can handle multiple machines' batch jobs, which you can schedule to run automatically. ERDisk can automate the ERD update process on all your networked NT systems, so you don't have an excuse for not having updated ERDs. (For Aelita's contact information, see "Recovery Resources.")
Cross-Backups
You need to be vigilant about maintaining updated ERDs for each of your NT systems, but your preventive maintenance shouldn't stop there. In part 1, I discussed methods for maintaining Registry backups that are convenient when you have to perform a recovery operation. For example, the Microsoft Windows NT Server 4.0 Resource Kit regback.exe utility lets you create uncompressed copies of individual Registry hive files. These uncompressed Registry copies are convenient when you need to replace Registry hives. (For more information about regback.exe, see the sidebar "The Regback Profile Quirk," page 86.) However, common sense dictates that storing backup data on the hard disk of the system you're backing up isn't the most fault-tolerant practice. Alternatively, consider using cross-backups, in which you copy important system configuration data, such as Registry backups, from one machine to another machine on the network. The principle behind this practice is that more is always better when it comes to backups, and the best place to store a system's backup is anywhere but on that system.
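As a rough illustration of the cross-backup idea, the following Python sketch copies every file from a local backup folder into a per-machine, timestamped folder on another system. It is only a sketch under stated assumptions: the function name `cross_backup`, the folder layout, and the example paths are hypothetical, and it assumes that a tool such as regback.exe has already written the hive copies to the source folder.

```python
import shutil
from datetime import datetime
from pathlib import Path


def cross_backup(source_dir, dest_root, machine_name, stamp=None):
    """Copy each file in source_dir to dest_root/machine_name/stamp.

    All names here are illustrative. dest_root would normally be a
    share on a different machine, so the copies survive the loss of
    the system being backed up.
    """
    source = Path(source_dir)
    # Default to a timestamp so repeated runs never overwrite each other.
    stamp = stamp or datetime.now().strftime("%Y%m%d-%H%M%S")
    target = Path(dest_root) / machine_name / stamp
    # Fail loudly rather than mix two backup sets in one folder.
    target.mkdir(parents=True, exist_ok=False)

    copied = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            shutil.copy2(item, target / item.name)  # keeps file timestamps
            copied.append(item.name)
    return target, copied
```

A call might look like `cross_backup(r"C:\regback", r"\\SERVER2\backups", "SERVER1")`, where both paths are placeholders for your own locations. Keeping each run in its own folder means an older, known-good set is still available if the newest copy turns out to contain the problem you are trying to recover from.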
If cross-backups appeal to you, consider extending this practice beyond Registry data to other types of crucial data. For example, I periodically make offline backups of my Microsoft Exchange Server databases (i.e., dir.edb, pub.edb, and priv.edb) to another server on the network. My backup software uses an Exchange agent to make online backups of Exchange Server, but I've discovered that a recent offline backup simplifies full Exchange Server recoveries (i.e., when you have to restore Exchange Server from scratch). Keep in mind that cross-backups should serve as an additional resource that complements your existing disaster-recovery plan; don't use cross-backups to replace your primary backup solution (e.g., tape backups).
If you don't want to junk up your systems with backup data, you can place this information on removable media, such as CD-Recordable (CD-R) and CD-Rewritable (CD-RW) discs, Zip and Jaz cartridges, magneto-optic (MO) cartridges, or similar media. This practice is a good idea because 3.5" disks, which are the only storage media that NT's ERD utility supports, don't have the reputation of being the most reliable media type.
Autostarting Services and Devices
In part 1, I talked about the following common causes of NT startup failures and the blue screen of death:
- Installing software that corrupts the HKEY_LOCAL_MACHINE portion of the Registry, particularly software that installs new services or drivers on the system.
- Changing a system's network configuration (e.g., in the Control Panel Network applet), followed by NT miswriting the configuration's network bindings in the Registry.
- Underlying file corruption that occurs on a key system file that was already in memory and working before the corruption.
In addition, I provided methods you can use to resolve these problems. The recovery methods I discussed involved wholesale replacement of Registry hive files.
This month, I highlight startup failures that result from a service or driver causing a STOP error when it initializes. Rather than completely restoring the Registry or overwriting entire Registry hives, you can edit the Registry to solve this problem. This solution might be preferable to replacing Registry hive files if you don't want to lose configuration settings or if you're not sure which service or driver is causing the problem.
In some cases, the STOP error results from a service or driver that loads before the GUI appears (i.e., when NT initializes the video display driver and shifts into graphics mode). In other cases, the error might occur after NT shifts into graphics mode; it can even happen during or after the logon process because some drivers and services might still be loading in the background after NT displays the logon prompt. This kind of failure can appear after you've installed a new service or driver or after you've reinstalled NT. Additional causes of a service/driver startup problem include software installations that install services or drivers that conflict with other services or drivers or with NT's service pack level, and changes to a system's hardware or software configuration that cause drivers or services that previously loaded successfully to become problematic. For example, physically changing the type of network card without first removing the driver causes the old driver to produce a STOP error.
Another situation that results in a STOP error is when you change a video card driver on a system with a remote control package installed (e.g., Symantec's pcANYWHERE32). Most remote control applications hook the current display driver during their installation, so problems result when you pull the original display driver out from under these applications. The originally hooked driver is no longer active, so rebooting the system results in a STOP error or blue screen. To safely change a video driver on a system with a remote control package installed, uninstall the remote control software, change the video driver, then reinstall the remote control software.
Renaming, Moving, or Deleting Offending Files
You can employ several methods to prevent a service or driver STOP error. One method is to rename, move, or delete the file to stop the service or driver from loading. If you know the name of the offending service or driver, you can try booting into DOS if the boot volume is FAT or try a parallel NT installation if the boot volume is NTFS, then rename the file to a temporary name. In many cases, this solution causes the STOP error to disappear but leaves a reference in NT's configuration to a service or driver that is no longer there. If you choose this method, be sure to reinstall the service or driver or completely uninstall it after you've booted into NT. This renaming method doesn't work and can cause problems in situations that involve multiple chained services or drivers, such as the previous remote control software example.
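The renaming step can be sketched in Python as follows. This is a minimal sketch, not a tested recovery tool: it assumes you run it from a parallel NT installation that can reach the original boot volume and has Python available, and the function names and the `.disabled` suffix are hypothetical choices. The long suffix would not fit DOS 8.3 filenames, so under DOS you would rename the file by hand instead.

```python
from pathlib import Path


def sideline_driver(driver_path, suffix=".disabled"):
    """Rename a driver or service file so NT can no longer load it.

    Returns the new path. The file is renamed, not deleted, so the
    change is easy to reverse.
    """
    original = Path(driver_path)
    if not original.is_file():
        raise FileNotFoundError(original)
    sidelined = original.with_name(original.name + suffix)
    if sidelined.exists():
        # Refuse to clobber an earlier sidelined copy.
        raise FileExistsError(sidelined)
    original.rename(sidelined)
    return sidelined


def restore_driver(sidelined_path, suffix=".disabled"):
    """Undo sideline_driver by stripping the temporary suffix."""
    sidelined = Path(sidelined_path)
    if not sidelined.name.endswith(suffix):
        raise ValueError(f"{sidelined} does not end with {suffix}")
    original = sidelined.with_name(sidelined.name[: -len(suffix)])
    sidelined.rename(original)
    return original
```

Renaming rather than deleting keeps the original file on hand, which helps if the file you sidelined turns out not to be the one causing the STOP error.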
Offline Registry Editing
Another method to resolve this service/driver startup problem is to edit the Registry to manually disable the service or driver. How do you edit the Registry if you can't boot NT? As long as you have an alternative method of accessing the volume that contains your original NT installation, you can edit the Registry. To gain access to Registry data from outside the original NT installation, you can boot to a parallel NT installation on the same system, or you can install a disk that contains the NT boot partition (i.e., the NT installation folder and Registry hive files) onto another NT system.
Gaining access to the Registry through a parallel NT installation on the same system is easier than using a disk because a parallel installation doesn't require physically moving disks between systems. However, whether the NT boot partition is FAT or NTFS, you must boot from NT to edit Registry data because you have to use an NT Registry editor to edit Registry data, which is impossible from outside NT. Unfortunately, no one has developed an NT Registry editor that runs under a different OS, such as DOS.
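The text so far does not name the Registry value involved in disabling a service or driver. As a hedged illustration: each entry under the Services key of the SYSTEM hive carries a Start value, where 0 is boot, 1 is system, 2 is automatic, 3 is manual, and 4 is disabled. The Python sketch below models that logic on a plain dictionary and does not touch a real Registry; the service names and function names are hypothetical.

```python
from enum import IntEnum


class StartType(IntEnum):
    """Start values used by NT service and driver entries."""
    BOOT = 0
    SYSTEM = 1
    AUTOMATIC = 2
    MANUAL = 3
    DISABLED = 4


def autostarting(services):
    """Return the names that load without user action (Start 0 to 2)."""
    return sorted(
        name for name, start in services.items()
        if start <= StartType.AUTOMATIC
    )


def disable(services, name):
    """Return a copy of the mapping with one entry set to DISABLED."""
    if name not in services:
        raise KeyError(name)
    updated = dict(services)
    updated[name] = StartType.DISABLED
    return updated


# Hypothetical entries, standing in for values read from an offline hive.
example = {"oldnic": 1, "remotectl": 2, "spooler": 3}
candidates = autostarting(example)      # entries that start on their own
patched = disable(example, "oldnic")    # what the edited hive would hold
```

Listing the autostarting entries first narrows the search when you are not sure which service or driver is at fault, and changing one Start value at a time makes it easier to tell which change resolved the STOP error.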