Note This article is directed toward fibre channel solutions because if you remove the cable in a small computer system interface (SCSI) solution, termination may be broken and the SCSI bus may be disrupted.
- Storage controllers
- Drivers for the HBA
- Related firmware revisions
- Multiple-path software that may be installed
When you remove the cable from the HBA, if the physical disk resource does not fail over automatically or if it fails over but does not fail back, contact the hardware vendor. The hardware controls the behavior of devices on a shared fibre bus (switches, storage controllers, HBAs, and other devices) when you remove the cable. The Cluster service operates accordingly if you remove and reinsert the cable to the HBA. The hardware vendor must determine and support the particular server cluster implementation and confirm that all devices (switches, storage controllers, HBAs, and other devices) handle the event properly. Typically, problems occur if HBAs, switches, and storage controllers are configured incorrectly or if you are using incorrect versions of drivers and firmware for these devices.
Microsoft Product Support Services (PSS) does not have access to all configurations of a fibre cluster (switches, storage controllers, HBAs, and other devices). The vendor who designed and implemented the fibre solution must have tested the solution. It is the hardware vendor's responsibility to verify that the physical disk will work as expected if you remove the fibre cable.
Hardware vendors use the following different types of drivers for HBAs:
- A miniport driver: This driver contains the hardware-specific information for the fibre card and it interfaces with Scsiport.sys to communicate with Windows.
- A port driver: This driver bypasses Scsiport.sys to communicate with Windows; the driver implements Scsiport.sys functionality internally.
There are two sets of connections between the server cluster nodes and the actual storage device. One connection is between the nodes and a switch; the second connection is between the switch and the storage controller. Therefore, there are two places for cable failure. The failure may occur between the HBA and the switch or between the switch and the storage controller. The following sections describe the behavior that occurs when you remove and reinsert cables for these kinds of connections.
Note For multiple-path scenarios, there are more connections that may be affected. See the "Removing and Reinserting a Cable in a Multiple-Path Environment" section later in this article for more information.
Removing the cable between the HBA and the switch
If you remove a cable between the HBA and the switch (or if the cable fails), the HBA driver logs several events to Windows. The HBA miniport driver generates a "BusChangeDetected" notification, which indicates that a target device has been added or removed from the bus. However, if the HBA miniport driver reports a "ResetDetected" generic status notification, this notification indicates that the HBA has detected a bus reset on the SCSI bus. After this notification is generated, the HBA miniport driver is still responsible for completing any active requests. Port drivers do not issue "BusChangeDetected" notifications; instead, they process device removals internally. Port drivers typically generate "IoInvalidateDeviceRelations" notifications in Windows.
Regardless of the type of driver, if the HBA driver properly reports to Windows that a device has been removed, the clustering software detects that the disk is no longer available, and it fails the disks over to another member in the cluster. If a SCSI reserve that is issued by Clusdisk.sys (the cluster disk driver) fails, the failure is detected in approximately three seconds.
- LooksAlive: This routine is a cursory status check that runs every five seconds (by default). This routine checks that the disk status is not marked as "Failed," which indicates a loss of the periodic SCSI reserve.
- IsAlive: This routine is a more complete check that occurs every 60 seconds (by default). This routine checks that the disk status is not marked as "Failed," which indicates periodic SCSI reserve failure. If the status is not marked as "Failed," FindFirstFile runs on the root of the disk to make sure the file system is still mounted and that the disk is accessible.
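The difference between the two routines can be illustrated with a minimal Python sketch. This is not the Cluster service's implementation: the DiskResource class, its fields, and the function names are hypothetical, and os.scandir only stands in for the FindFirstFile call that the article describes. The interval constants are the defaults quoted above.

```python
import os
from dataclasses import dataclass

# Default polling intervals quoted in the article, in seconds.
LOOKS_ALIVE_INTERVAL = 5
IS_ALIVE_INTERVAL = 60


@dataclass
class DiskResource:
    """Hypothetical stand-in for a physical disk resource."""
    root: str             # root path of the mounted volume
    failed: bool = False  # set when the periodic SCSI reserve is lost


def looks_alive(disk: DiskResource) -> bool:
    """Cursory check: only inspects the 'Failed' status flag."""
    return not disk.failed


def is_alive(disk: DiskResource) -> bool:
    """Fuller check: status flag first, then enumerate the volume root."""
    if disk.failed:
        return False
    try:
        # Stand-in for running FindFirstFile on the root of the disk.
        with os.scandir(disk.root) as entries:
            next(entries, None)
        return True
    except OSError:
        # The root cannot be enumerated, so the disk is not accessible.
        return False
```

In this sketch, a disk whose root can no longer be enumerated still passes looks_alive but fails is_alive, which is why the more complete check can catch a failure that the cursory check misses.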
Removing the cable between the switch and the storage controller
If you disconnect or remove the cable from the switch to the storage controller, the HBA receives a Registered State Change Notification (RSCN) from the switch. When the HBA receives the RSCN, the HBA driver notifies Windows of any changes it detects. Detection is a very complex operation that has many external dependencies, which the hardware vendor will have to verify. If the switch does not issue an RSCN or if the HBA driver does not notify Windows that the device has been removed, the LooksAlive routine or the IsAlive routine fails on the resource, and failover occurs to the other node (as described earlier).
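The two detection paths in this scenario can be summarized as a small decision function. This is only an illustrative Python sketch of the logic described above; the function name, parameters, and return strings are hypothetical and do not correspond to any Windows or vendor API.

```python
def detection_path(rscn_issued: bool, driver_notifies: bool) -> str:
    """Return how a removed switch-to-controller cable is noticed.

    rscn_issued:     the switch sent an RSCN to the HBA.
    driver_notifies: the HBA driver told Windows the device was removed.
    """
    if rscn_issued and driver_notifies:
        # Windows learns about the removal directly from the HBA driver.
        return "driver notification"
    # Otherwise the resource health checks eventually fail the disk.
    return "LooksAlive/IsAlive failure"
```

Either path ends in a failover to the other node; the sketch shows only which mechanism notices the failure first.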
Reinserting the cable between the HBA and the switch
The Cluster service allows you to reconnect or replace a cable between the HBA and the switch, and then allows the node to take ownership of the physical disk resource. The HBA driver must consider several issues for a Plug and Play rescan to occur. The HBA miniport driver must issue a "BusChangeDetected" notification when the cable is reconnected (an HBA port driver issues an "IoInvalidateDeviceRelations" notification) so that Windows is notified that a change has been made to the shared bus. When Windows receives this notification, the disks are redetected. If the HBA driver does not properly notify Windows of the device insertion, you may be able to manually initiate the discovery of the disk. To do so, make Windows rescan the storage system by using the Disk Management utility, which forces Windows to detect devices on the shared bus. If issues occur when you rescan the devices after the cable has been reinserted, contact the hardware manufacturer. Typically, issues occur with this functionality if the HBAs and the switches are configured incorrectly or if you are using incorrect drivers or firmware.
Reinserting the cable between the switch and the storage controller
When you reinsert the cable between the switch and the storage controller, the switch issues an RSCN so that the HBA miniport driver can notify Windows by using a "BusChangeDetected" notification (an HBA port driver issues an "IoInvalidateDeviceRelations" notification), which indicates that the devices are now available. If the driver does not notify Windows that the devices are available, you may be able to rescan the storage system by using the Disk Management utility. If you do so, Windows is forced to detect devices on the shared bus. If you experience issues rescanning the devices after you reinsert the cable, contact the hardware manufacturer.
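The reinsertion behavior for both driver types can be condensed into a short Python sketch. It only restates the text above as a lookup table and a decision; the DriverType enum, the next_step function, and the returned strings are hypothetical names chosen for illustration, not part of any driver interface.

```python
from enum import Enum


class DriverType(Enum):
    MINIPORT = "miniport"  # works through Scsiport.sys
    PORT = "port"          # implements Scsiport.sys functionality itself


# Notification each driver type is expected to raise on a bus change.
NOTIFICATION = {
    DriverType.MINIPORT: "BusChangeDetected",
    DriverType.PORT: "IoInvalidateDeviceRelations",
}


def next_step(driver: DriverType, notified_windows: bool) -> str:
    """Describe what happens after the cable is reinserted."""
    if notified_windows:
        return f"Windows redetects the disks after {NOTIFICATION[driver]}"
    # Manual fallback described in the article.
    return "rescan the disks in Disk Management"
```

If the manual rescan also fails to find the disks, the article's guidance is to contact the hardware manufacturer.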
Removing and reinserting a cable in a multiple-path environment
You can use multiple-path (also known as MPIO) software to add redundancy to the shared bus to help maintain a high level of availability. Some hardware vendors offer multiple-path software that enables you to use multiple HBAs to the shared disk (the use of this software is acceptable). If you remove the cable from one of the HBAs, all data can be rerouted through another HBA. However, if any problems seem to be related to multiple-path software, Microsoft PSS requires the hardware vendor to work on the issue. In extreme cases, Microsoft PSS may ask you to disable the multiple-path software temporarily to see if it is causing the problem.
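The rerouting idea can be sketched in a few lines of Python. Real multiple-path software runs in the storage driver stack and applies vendor-specific policies, so this is only a conceptual model: the Path class, its fields, and select_path are hypothetical names, and the policy shown (use the first connected path) is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Path:
    """Hypothetical model of one HBA path to the shared disk."""
    hba: str
    connected: bool


def select_path(paths: Sequence[Path]) -> Optional[Path]:
    """Return the first path whose cable is still connected, if any."""
    for path in paths:
        if path.connected:
            return path
    # No path is left, so the disk is unreachable from this node.
    return None
```

For example, with two HBAs where the first cable has been removed, select_path returns the second path; with both cables removed it returns None, which corresponds to the single-path failure cases described in the earlier sections.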
WARNING You may experience problems if you disable the multiple-path software before you unplug all but one path to the storage. Contact the hardware manufacturer and see the following Microsoft Knowledge Base article for details:
For more information about how to use the disk on the shared bus, click the following article numbers to view the articles in the Microsoft Knowledge Base:
Article ID: 294173 – Last Review: November 12, 2009 – Revision: 1