Cluster Service Stops Responding on a Cluster Node When You Restart the Active Node

Article ID: 822050

SYMPTOMS

When you restart the active node of a server cluster that consists of two or more nodes, you experience all the following symptoms:
  • If you are running Cluster Administrator on the remaining nodes, you receive the following error message when you try to connect to the cluster:
    Cluster 'ClusterName' is no longer available.
  • If you try to start Cluster Administrator, Cluster Administrator stops responding, and you may receive the following error message:
    An error occurred trying to open the cluster at 'ServerName':

    The interface is unknown.

    Error ID: 1717 (000006b5).
  • When you view the contents of C:\Winnt\Cluster.log, you see information similar to the following:
    [FM] OnlineGroup: Failed on resource e3f4af72-6454-4199-b9af-fa6f57032a65. Status 70
    Microsoft Clustering Service suffered an unexpected fatal error
    at line 701 of source module D:\nt\private\cluster\service\fm\group.c. The error code was 70.
  • When the restarted cluster node starts successfully, the Cluster Administrator program that is running on the other nodes responds as you expect.
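The log symptom above can be checked with a short script. The following is a minimal sketch in Python, not a Microsoft tool. It assumes the log entries look like the sample lines quoted in this article; the function name find_status_70 and the patterns are illustrative only, and real log lines may carry additional prefixes such as timestamps.

```python
import re

# Patterns are assumptions derived only from the sample lines in this article.
FAILED_ONLINE = re.compile(r"\[FM\] OnlineGroup: Failed on resource (\S+)\. Status (\d+)")
FATAL_ERROR = re.compile(r"The error code was (\d+)\.")


def find_status_70(log_text):
    """Return (resource IDs that failed with status 70, whether a fatal error 70 was logged)."""
    resources = [
        m.group(1)
        for m in FAILED_ONLINE.finditer(log_text)
        if m.group(2) == "70"
    ]
    fatal = any(m.group(1) == "70" for m in FATAL_ERROR.finditer(log_text))
    return resources, fatal


# Sample text copied from the symptom description above.
SAMPLE_LOG = (
    "[FM] OnlineGroup: Failed on resource "
    "e3f4af72-6454-4199-b9af-fa6f57032a65. Status 70\n"
    "Microsoft Clustering Service suffered an unexpected fatal error\n"
    "at line 701 of source module D:\\nt\\private\\cluster\\service\\fm\\group.c. "
    "The error code was 70.\n"
)

if __name__ == "__main__":
    print(find_status_70(SAMPLE_LOG))
```

To use the sketch against a real log, read the contents of C:\Winnt\Cluster.log into a string and pass it to find_status_70.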

CAUSE

This issue occurs if you pause one node of a server cluster and then you restart the active cluster node. When the active node restarts, the paused node tries to bring resource groups online. Because this node is paused, the node cannot make additional connections, and it cannot bring the Quorum disk group online. Error code 70 corresponds to the following error message:
The remote server has been paused or is in the process of being started.
Note These results can also occur in clusters that have more than two nodes. Even if a non-paused node is in a working state when the active node is restarted, the problem occurs if the paused node is the first node that is contacted to take ownership of the quorum disk. In that case, the non-paused node does not have the opportunity to arbitrate for the quorum disk.
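For quick reference, the error codes that appear in this article can be collected in a small lookup table. The following Python sketch is illustrative only. The codes and messages are taken from this article; the symbolic names are the standard Win32 identifiers and are added here as an assumption about what a reader may want to search for. The names KNOWN_ERRORS and describe_error are hypothetical.

```python
# Codes and messages quoted in this article. The symbolic names are the
# standard Win32 identifiers and do not appear in the article itself.
KNOWN_ERRORS = {
    70: ("ERROR_SHARING_PAUSED",
         "The remote server has been paused or is in the process of being started."),
    1717: ("RPC_S_UNKNOWN_IF",
           "The interface is unknown."),
    1753: ("EPT_S_NOT_REGISTERED",
           "There are no more endpoints available from the endpoint mapper."),
}


def describe_error(code):
    """Format a code in decimal and hexadecimal together with its message."""
    name, message = KNOWN_ERRORS.get(
        code, ("UNKNOWN", "Code not covered by this article.")
    )
    return f"{code} (0x{code:08x}) {name}: {message}"


if __name__ == "__main__":
    for code in (70, 1717, 1753):
        print(describe_error(code))
```

The hexadecimal form is included because Cluster Administrator reports error 1717 as 000006b5.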

RESOLUTION

To resolve this issue, resume the paused cluster node before you restart the active cluster node.

Note Before you resume a paused cluster node, first determine whether a cluster node is paused. To check the node status and then resume a paused node, follow these steps:
  1. Click Start, click Run, type cmd in the Open box, and then click OK.
  2. At the command prompt, type cluster node, and then press ENTER. Output that is similar to the following appears.

    Note The following sample output is based on a two-node cluster configuration. If you have more than two nodes, the additional nodes will also appear in the list.
    Node           Node ID Status
    -------------- ------- ---------------------
    CLUSTER-1            1 Paused
    CLUSTER-2            2 Up
    Note If the only cluster node that is not paused is in the process of restarting, you receive the following error message:
    System error 1753 has occurred.
    There are no more endpoints available from the endpoint mapper.
  3. At the command prompt, type cluster node node_name /resume (where node_name is the name of the cluster node) and then press ENTER.

    For example, type cluster node cluster-1 /resume, and then press ENTER. Information appears that is similar to the following:
    Resuming node 'cluster-1'...
    
    Node           Node ID Status
    -------------- ------- ---------------------
    CLUSTER-1            1 Up
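The check in step 2 and the command in step 3 can be combined in a script when a cluster has many nodes. The following Python sketch is a minimal example under stated assumptions: it assumes that the output of cluster node matches the sample layout shown above (node name, numeric node ID, status), and it only builds the resume command strings rather than running them. The function names parse_node_status and resume_commands are hypothetical.

```python
def parse_node_status(output):
    """Parse text in the sample `cluster node` layout into {node_name: status}."""
    nodes = {}
    for line in output.splitlines():
        parts = line.split()
        # Data rows have the form NAME ID STATUS, where ID is an integer.
        # The header and separator rows fail the isdigit() check and are skipped.
        if len(parts) >= 3 and parts[1].isdigit():
            nodes[parts[0]] = " ".join(parts[2:])
    return nodes


def resume_commands(output):
    """Build one `cluster node <name> /resume` command for each paused node."""
    return [
        f"cluster node {name} /resume"
        for name, status in parse_node_status(output).items()
        if status == "Paused"
    ]


# Sample output copied from step 2 of this article.
SAMPLE_OUTPUT = """\
Node           Node ID Status
-------------- ------- ---------------------
CLUSTER-1            1 Paused
CLUSTER-2            2 Up
"""

if __name__ == "__main__":
    for command in resume_commands(SAMPLE_OUTPUT):
        print(command)
```

Review the generated commands before you run them at the command prompt. If the only non-paused node is restarting, cluster node returns system error 1753 instead of a node list, and the sketch returns no commands.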

Properties

Article ID: 822050 - Last Review: October 30, 2006 - Revision: 3.2
APPLIES TO
  • Microsoft Windows 2000 Advanced Server
  • Microsoft Windows 2000 Datacenter Server
  • Microsoft Windows Server 2003, Datacenter Edition (32-bit x86)
  • Microsoft Windows Server 2003, Enterprise Edition (32-bit x86)
Keywords: kbprb KB822050
