Article ID: 311081 - Last Review: June 28, 2007 - Revision: 4.6

Troubleshooting Multiple Cluster Symptoms on the Same SAN

This article was previously published under Q311081

SUMMARY

This article describes multiple-cluster scenarios in which certification requirements are not met and the disks are allowed to see nodes from more than one cluster on the same SAN. A multiple-cluster configuration is more than one set of MSCS clusters assigned to one or more fiber-attached host bus adapters (HBAs). The same SAN devices can also be attached to mainframe computers or UNIX operating systems, which can present challenges because of differences in SCSI command sets. These anomalies can be caused by firmware revisions or by an inability to properly zone or mask the bus resets that MSCS uses to control the disks. Without this protection in place (proper masking, zoning, or a combination of both), the following problems could occur:
MORE INFORMATION

The issues that are described in the "Summary" section of this article may appear to resolve themselves after Chkdsk.exe runs, but they may then return several weeks later and repeat the same pattern on one or more clustered nodes. These issues may be seen in any pattern, but Event IDs 26, 50, and 51 are most prevalent. You may also see event warnings and error messages that are similar to the following:

Event ID: 51
Source: Disk
Description: An error was detected on device \Device\Harddisk9\DR9 during a paging operation.

Event ID: 50
Source: Disk
Description: {Lost Delayed-Write Data} The system was attempting to transfer file data from buffers to \Device\Harddisk\Volumex. The write operation failed, and only some of the data may have been written to the file.

Event ID: 26
Source: Application Popup
Description: Application popup: Windows - Delayed Write Failed : Windows was unable to save all the data for the file \Device\HarddiskVolumex\SQLDatabases\System\machine\LOG. The data has been lost. This error may be caused by a failure of your computer hardware or network connection. Please try to save this file elsewhere.

Event ID: 9
Source: HBA Driver
Description: The device, \Device\Scsi\HBA driver, did not respond within the timeout period.

Event ID: 15
Source: Disk
Description: The device, \Device\Harddiskx\DRx, is not ready for access yet.

Event ID: 1066
Source: ClusSvc
Description: Cluster disk resource Disk x: is corrupt. Running ChkDsk /F to repair problems.

Event ID: 1123
Source: ClusSvc
Description: The node lost communication with cluster node 'machine' on network 'heartbeat'.

Event ID: 1122
Source: ClusSvc
Description: The node (re)established communication with cluster node 'machine' on network 'heartbeat'.
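
Because these symptoms can recur weeks apart and on different nodes, it can help to tally the event IDs listed above from exported event logs. The following Python sketch is one possible way to do that and is not part of the original article. It assumes the logs have been exported to CSV with columns named EventID and Source (both column names are assumptions, as are the sample rows and the examplehba source name). Event ID 9 is matched on any source, because the article gives only a placeholder for the HBA driver name.

```python
import csv
import io
from collections import Counter

# Event IDs and sources as listed in this article. None means "any source",
# because the article gives only a placeholder ("HBA Driver") for Event ID 9.
WATCHED_EVENTS = {
    9: None,
    15: "Disk",
    26: "Application Popup",
    50: "Disk",
    51: "Disk",
    1066: "ClusSvc",
    1122: "ClusSvc",
    1123: "ClusSvc",
}


def tally_watched_events(csv_file, id_column="EventID", source_column="Source"):
    """Count rows whose event ID and source match WATCHED_EVENTS.

    csv_file is any iterable of CSV lines with a header row. The column
    names are assumptions about the export format, not a fixed standard.
    """
    counts = Counter()
    for row in csv.DictReader(csv_file):
        try:
            event_id = int(row[id_column])
        except (KeyError, TypeError, ValueError):
            continue  # skip rows without a usable numeric event ID
        if event_id not in WATCHED_EVENTS:
            continue
        expected_source = WATCHED_EVENTS[event_id]
        source = (row.get(source_column) or "").strip()
        if expected_source is None or source.lower() == expected_source.lower():
            counts[(event_id, source)] += 1
    return counts


# Made-up sample rows, for illustration only.
SAMPLE_EXPORT = """EventID,Source,Node
51,Disk,NODE1
51,Disk,NODE1
1123,ClusSvc,NODE2
7036,Service Control Manager,NODE1
9,examplehba,NODE2
50,Other,NODE1
"""

if __name__ == "__main__":
    counts = tally_watched_events(io.StringIO(SAMPLE_EXPORT))
    for (event_id, source), count in sorted(counts.items()):
        print(f"Event ID {event_id} ({source}): {count}")
```

Running the same tally against exports from each node, and comparing the counts over time, may make the recurring pattern described above easier to spot. A match only indicates that the listed events occurred; it does not by itself confirm a zoning or masking problem.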
