If the heartbeat packet is not received within two heartbeat periods, and the Local Area Network (LAN) to which the server cluster is connected is configured for client-to-cluster communication, the Cluster service tests the ability of each node to communicate with external hosts. External hosts, by this definition, correspond to IP addresses that are obtained by using the method in the following example. A frequently used external host is the local router (default gateway).
- The cluster has two nodes, Node1 and Node2.
- HEARTBEAT CONNECTION is configured as a private network for heartbeat communication.
- PUBLIC CONNECTION is configured as a mixed network for client access.
- NIC1 is attached to Node1. NIC2 is attached to Node2. NIC1 and NIC2 are members of PUBLIC CONNECTION.
- Obtain all IP addresses that are bound to NIC1 to form IPLIST1.
- Obtain all IP addresses that are bound to NIC2 to form IPLIST2.
- Combine IPLIST1 and IPLIST2 to form IPLIST.
- Check the IP Route Table of Node1 to obtain the IP addresses (PINGLIST11) that are listed as Gateways and masked with the network mask of Interface NIC1 to match the subnet of NIC1 (the default gateway of NIC1 is included in this list). Check the current TCP Connection Table that is established with NIC1 to obtain the TCP Remote addresses (PINGLIST12). Combine PINGLIST11 and PINGLIST12 to form PINGLIST1.
- Check the IP Route Table of Node2 to obtain the IP addresses (PINGLIST21) that are listed as Gateways and masked with the network mask of Interface NIC2 to match the subnet of NIC2 (the default gateway of NIC2 is included in this list). Check the current TCP Connection Table that is established with NIC2 to obtain the TCP Remote addresses (PINGLIST22). Combine PINGLIST21 and PINGLIST22 to form PINGLIST2.
- Combine PINGLIST1 and PINGLIST2 to form PINGLIST.
- Combine IPLIST and PINGLIST to form UNIONLIST. Remove the duplicate items, remove the IP addresses that are bound to local NICs, and remove the IP addresses that are not in the LAN of PUBLIC CONNECTION. UNIONLIST lists all the IP addresses that can be "external hosts."
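The list-building steps above can be sketched in Python as follows. This is only an illustration, not the Cluster service's implementation: the function name `build_union_list`, its parameters, and the sample addresses are all hypothetical, and treating the addresses of both cluster NICs as "local" is an assumption about how that step is scoped.

```python
from ipaddress import ip_address, ip_network


def build_union_list(ip_list, ping_list, local_addresses, public_lan):
    """Return the candidate "external hosts" as a sorted list of strings.

    ip_list         -- IPLIST: addresses bound to NIC1 and NIC2
    ping_list       -- PINGLIST: gateway and TCP remote addresses
    local_addresses -- addresses bound to local NICs (assumed scope)
    public_lan      -- subnet of PUBLIC CONNECTION in CIDR notation
    """
    lan = ip_network(public_lan)
    local = {ip_address(a) for a in local_addresses}

    # Using sets removes duplicate items while combining the two lists.
    union = {ip_address(a) for a in ip_list} | {ip_address(a) for a in ping_list}

    # Drop addresses bound to local NICs and addresses outside the public LAN.
    external = {a for a in union if a not in local and a in lan}
    return [str(a) for a in sorted(external)]


# Hypothetical sample data for Node1 (NIC1) and Node2 (NIC2).
ip_list1 = ["192.168.1.11"]
ip_list2 = ["192.168.1.12"]
ping_list1 = ["192.168.1.1", "192.168.1.50", "10.0.0.7"]
ping_list2 = ["192.168.1.1", "192.168.1.11"]

union_list = build_union_list(
    ip_list1 + ip_list2,
    ping_list1 + ping_list2,
    local_addresses=ip_list1 + ip_list2,
    public_lan="192.168.1.0/24",
)
print(union_list)  # ['192.168.1.1', '192.168.1.50']
```

In this sample, only the default gateway and one TCP peer on the public subnet remain as candidate external hosts.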
Based on these requirements, there may be no external hosts available for determining the extent of the failure. In that case, if there is no alternate LAN for private cluster communication, the Cluster service must use the quorum device to arbitrate which node should remain up and running. Otherwise, an alternate available LAN is used for private cluster communications. Note that this process does not take into account the status of LANs designated for client use only.
Network Interface States
Unavailable: The owning node is down.
Failed: Reports that other interfaces on the LAN can communicate with each other or with external hosts, while the local interface cannot. The possible causes for this state are:
- Network adapter failure.
- Network adapter driver failure.
- Local cable failure.
- Port failure on the device that the network adapter is connected to.
Unreachable: Cannot communicate with at least one other interface whose state is neither Failed nor Unavailable.
Up: Can communicate with all other interfaces on the LAN whose states are neither Failed nor Unavailable. This is the normal operational state.
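As a rough illustration, the interface states above can be expressed as a small classification function. The names `InterfaceState` and `classify_interface`, the three boolean inputs, and the order in which the conditions are checked are assumptions made for this sketch; the article does not describe how the Cluster service evaluates them internally.

```python
from enum import Enum


class InterfaceState(Enum):
    UNAVAILABLE = "Unavailable"
    FAILED = "Failed"
    UNREACHABLE = "Unreachable"
    UP = "Up"


def classify_interface(owner_node_up, others_ok_but_local_isolated, reaches_all_live_peers):
    """Map connectivity observations to an interface state.

    owner_node_up                -- False when the owning node is down
    others_ok_but_local_isolated -- other interfaces can communicate with each
                                    other or with external hosts, but this one cannot
    reaches_all_live_peers       -- this interface reaches every peer that is
                                    neither Failed nor Unavailable
    """
    if not owner_node_up:
        return InterfaceState.UNAVAILABLE
    if others_ok_but_local_isolated:
        return InterfaceState.FAILED
    if not reaches_all_live_peers:
        return InterfaceState.UNREACHABLE
    return InterfaceState.UP
```

For example, `classify_interface(True, False, False)` yields `InterfaceState.UNREACHABLE`, the state both interfaces report when neither node can reach the other or any external host.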
Network States
Unavailable: All interfaces defined on this cluster network are Unavailable.
Down: All network interfaces defined on this cluster network have lost communication with each other and with all known external hosts. All connected network interfaces on up nodes are in either the Failed or the Unreachable state. Therefore, all Transmission Control Protocol/Internet Protocol (TCP/IP) address resources that are defined on the same subnet, and all resources that depend on these resources, do not work and are unavailable on the LAN.
Partitioned: One or more network interfaces are in the Unreachable state, but at least two interfaces can still communicate with each other or with an external host.
NOTE: This only applies to server clusters that have two or more nodes.
Up: All network interfaces defined on this cluster network that are not Failed and are not Unavailable can communicate. This is the normal operational state.
In the following examples, there is only one LAN in the server cluster that is configured for client to public communication, and this LAN is lost.
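The network states defined above can be derived from the states of the individual interfaces. The following Python sketch shows one way to do that. The function name `classify_network`, the `connectivity_remaining` flag, and the evaluation order are assumptions for illustration only, and inconsistent inputs are not validated.

```python
def classify_network(interface_states, connectivity_remaining):
    """Derive a network state from the states of its interfaces.

    interface_states       -- list of "Unavailable", "Failed", "Unreachable", "Up"
    connectivity_remaining -- True if at least two interfaces can still
                              communicate with each other or with an external host
    """
    if all(s == "Unavailable" for s in interface_states):
        return "Unavailable"

    # Interfaces on nodes that are up.
    live = [s for s in interface_states if s != "Unavailable"]

    # Every live interface is Failed or Unreachable and nothing communicates.
    if all(s in ("Failed", "Unreachable") for s in live) and not connectivity_remaining:
        return "Down"

    # Some interface is cut off, but part of the network still communicates.
    if "Unreachable" in live:
        return "Partitioned"

    # All interfaces that are neither Failed nor Unavailable can communicate.
    return "Up"
```

With these rules, a network whose interfaces are Failed and Up is reported as Up, and a network whose two interfaces are both Unreachable with no remaining connectivity is reported as Down.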
NOTE: Disabling media sense on each node in the cluster affects its behavior, and this behavior is noted in the examples listed below.
For more information about disabling media sense, click the following article number to view the article in the Microsoft Knowledge Base:
Node A and Node B
- Node A and node B lose communication.
- Node B can communicate with an external host.
- Node A cannot communicate with any external hosts.
- The node A network interface state changes to Unreachable, then to Failed, and then this network interface disappears from Cluster Administrator.
- The node B network interface state is Unreachable, and then Up.
- The Network state is Up.
- Any resource groups with TCP/IP address resources that depend on the failed network interface fail over to node B.
Node A and Node B
- Node A and node B lose communication.
- Node A and node B cannot communicate with any external hosts.
- The state of both node A and node B network interfaces is Unreachable, and they disappear from Cluster Administrator.
- The Network state is Down, and the network disappears from Cluster Administrator. When the LAN connection is restored, this LAN inherits the default network role, which is to be used for both client and private communication. If a different role is needed, it must be set manually.
- No resource groups fail over. TCP/IP address resources dependent on that network fail, and all resources that are dependent on that TCP/IP address are taken offline.
Results with Media Sense Disabled
- Both network interfaces are Unreachable until network connectivity can be re-established.
- Network state remains Down until the LAN connection is restored. This retains the network role configuration.
- The resources remain online.
For more information about the Windows NT 4.0 interface state algorithm, click the following article number to view the article in the Microsoft Knowledge Base:
Article ID: 242600 - Last Review: March 1, 2007 - Revision: 1