How to Troubleshoot Cluster Service Startup Issues in Windows Server 2003
When the Cluster service starts, it performs the following steps in order:
- Authenticate the Service account.
- Load the local copy of the cluster database.
- Use information in the local database to try to contact other nodes to begin the join procedure. If a node is contacted and authentication is successful, the join procedure is successful.
- If no other node is available, the Cluster service uses the information in the local database to mount the quorum device and updates the local copy of the database by loading the latest checkpoint file and replaying the quorum log.
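The ordering of these steps can be sketched as a small decision function. This is only an illustration of the sequence described above, not how the Cluster service is implemented. The names `StartupResult` and `startup_outcome` and all four parameters are hypothetical, and `reachable_nodes` is assumed to count only nodes that were both contacted and successfully authenticated.

```python
from enum import Enum


class StartupResult(Enum):
    # Hypothetical labels for the outcomes described in the steps above.
    AUTH_FAILED = "service account authentication failed"
    DATABASE_FAILED = "local cluster database could not be loaded"
    JOINED = "joined an existing cluster"
    FORMED = "formed the cluster by mounting the quorum device"
    QUORUM_FAILED = "quorum device could not be mounted"


def startup_outcome(account_ok, database_ok, reachable_nodes, quorum_mountable):
    """Walk the startup steps in order and return the first outcome reached."""
    # Step 1: the Service account must authenticate.
    if not account_ok:
        return StartupResult.AUTH_FAILED
    # Step 2: the local copy of the cluster database must load.
    if not database_ok:
        return StartupResult.DATABASE_FAILED
    # Step 3: if another node answers and authenticates, the node joins.
    if reachable_nodes > 0:
        return StartupResult.JOINED
    # Step 4: otherwise the node tries to mount the quorum device itself.
    if quorum_mountable:
        return StartupResult.FORMED
    return StartupResult.QUORUM_FAILED
```

Reading the function from top to bottom mirrors the troubleshooting order: a failure at an earlier step hides everything after it, so it makes sense to rule out authentication and database problems before looking at connectivity or the quorum.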
Troubleshooting Cluster Service Startup Issues
Important: This section, method, or task contains steps that tell you how to modify the registry. However, serious problems might occur if you modify the registry incorrectly. Therefore, make sure that you follow these steps carefully. For added protection, back up the registry before you modify it. Then, you can restore the registry if a problem occurs. For more information about how to back up and restore the registry, click the following article number to view the article in the Microsoft Knowledge Base:
- Verify that the cluster node that is having problems is able to properly authenticate the Service account. You can determine this by logging on to the computer with the Cluster service account, or by checking the System event log for Cluster service logon problem event messages.
- Verify that the %SystemRoot%\Cluster folder contains a valid Clusdb file and that the Cluster service attempted to start. Start Registry Editor (Regedt32.exe) and verify that the following registry key is valid and loaded:
  HKEY_LOCAL_MACHINE\Cluster
  The cluster hive should have a structure that is very similar to the one shown in Cluster Administrator. Make note of the network and quorum keys. If the database is not valid, you can copy and use the cluster database from a live node. If no node has a valid cluster database, see the following article in the Microsoft Knowledge Base:
  224999 How to Use the Cluster TMP file to Replace a Damaged Clusdb File
- If the node is not the first node in the cluster, check connectivity to other cluster nodes across all available networks. Use the Ping.exe tool to verify TCP/IP connectivity, and use Cluster Administrator to verify that the Cluster service can be contacted. Use the TCP/IP addresses of the network adapters in the other nodes in the Connect to dialog box in Cluster Administrator.
- If the node cannot contact any other node, the service continues with the form phase. It attempts to locate information about the quorum in the local cluster database, and then tries to mount the disk. If the quorum disk cannot be mounted, the service does not start. If another node has successfully started and has ownership of the quorum, the service does not start. This is usually caused by connectivity or authentication issues. If this is not the case, you can check the status of the quorum device by starting the service with the -fixquorum switch, and then attempt to bring the quorum disk online or change the quorum location for the service. Also, check the System event log for disk errors. If the quorum disk successfully comes online, it is likely that the quorum is corrupted. To correct this issue, see the following Microsoft Knowledge Base articles:
  Windows NT 4.0: 172951 How to Recover from a Corrupted Quorum Log
  Windows 2000: 245762 Recovering from a Lost or Corrupted Quorum Log
- Check the attributes of the Cluster.log file to make sure that it is not read-only, and make sure that no policy is in effect that prevents modification of the Cluster.log file. If either of these conditions exists, the Cluster service cannot start.
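The two file checks in the list above (the Clusdb file and the Cluster.log attributes) can be scripted. The following is a minimal sketch, not a Microsoft tool. The function name `check_cluster_files` is hypothetical, and the location of Cluster.log is passed in by the caller because this article does not state it. The sketch only confirms that a file named Clusdb exists, it does not validate the database, and it cannot detect a policy that blocks changes to the log.

```python
import os
import stat


def check_cluster_files(cluster_dir, log_path):
    """Return a list of problems found with the Clusdb and Cluster.log files."""
    problems = []

    # The article expects a Clusdb file in the %SystemRoot%\Cluster folder.
    # Existence is the only thing checked here, not validity.
    clusdb = os.path.join(cluster_dir, "Clusdb")
    if not os.path.isfile(clusdb):
        problems.append("Clusdb file not found in " + cluster_dir)

    # On Windows, Python clears the owner-write bit in st_mode when the
    # read-only file attribute is set, so this detects a read-only log.
    if os.path.isfile(log_path):
        mode = os.stat(log_path).st_mode
        if not mode & stat.S_IWUSR:
            problems.append("Cluster.log is read-only: " + log_path)

    return problems
```

A caller might build `cluster_dir` from `os.environ["SystemRoot"]` joined with `Cluster` and print each returned problem. An empty list means only that these two checks passed.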
Article ID: 266274 - Last Review: 03/16/2015 02:41:00 - Revision: 5.0