ConfigurationThe following issues must be addressed so multiple computers can successfully boot from a SAN:
- To boot multiple computers from a SAN, the SAN must either be configured in a switched environment, or it must be directly attached from each host to one of the storage sub-system's Fibre Channel ports. The use of Fiber Channel - Arbitrated Loop (FC-AL) is not supported when booting multipe servers from the SAN because it does not allow the hosts that are attached to the SAN to be properly segregated from each other. A switched environment allows the hosts to be separate from each other. Booting to a SAN with a Fiber Channel-Arbitrated Loop topology is only supported when booting a single server from the SAN.
- The host must have exclusive access to the disk that it is booting from. No other host on the SAN should be able to detect or have access to the same logical disk. This can be accomplished by using a type of Logical Unit Number (LUN) management such as LUN masking, zoning or some combination of these methods. LUN management is normally configured at the switch, storage subsystem and/or Host Bus Adapter (HBA) level and not within Windows. Windows provide no capabilities for mapping LUNs.
- Multi-path software and multiple HBAs improve your chances of recovery from a path failure. The purpose of having multiple HBAs in a single host is to have redundancy and (possibly) increased throughput. However, if a failure occurs and a path to the SAN is lost, there may be a period of time where the drives on the SAN are not accessible. This path failure may cause problems with the Windows server. Behavior of multi-path software varies greatly between vendors. Check the Windows Catalog (formerly Hardware Compatibility List or HCL) for Storage/RAID systems to make sure that the multi-path driver is in the Windows Catalog with the storage system. If you cannot find the multi-path software, contact your SAN vendor.
To see the Storage/RAID Catalog, please view the following Microsoft Web site:
- If the hosts that are attached are part of a Windows 2000 cluster solution, you must use one HBA for the boot process and a separate HBA for shared storage.
- If the hosts that are attached are part of a Windows 2000 cluster solution and are using the Microsoft multipath I/O (MPIO) feature, you need four HBAs.
TroubleshootingThis section describes several issues that may prevent a Windows server from successfully booting from a SAN:
- A very common problem when you configure a SAN is that it is possible that multiple hosts may have access to the same logical disk. This usually occurs because proper LUN management was not employed. The default behavior of Windows is to attach and mount every logical unit that it detects when the HBA driver loads. If multiple hosts mount the same disk, file system damage can occur. It is up to the configuration of the SAN to insure that only one host can access a particular logical disk at a time. Symptoms of multiple hosts accessing the same logical disk are: Disk Management displays the same logical disk on multiple hosts.Plug and Play notification that new hardware is found may occur on multiple hosts when you add or configure a new logical disk.When you try to access a logical disk by using My Computer or Windows Explorer, you may receive an "Access Denied", "Device not Ready", or similar error message that may indicate that other hosts have access to the same logical disk.
- Your computer stops responding (hangs) or has slow response times. This can indicate that there is a high latency to the pagefile, and this may be accompanied by events in the System Log such as: Event ID: 51If the preceding error messages are in the System Log, it indicates that Windows was trying to access a disk and there was a problem. If the disk that is referenced is on the SAN, it could indicate a latency issue. If an Event ID 51 is shown, this indicates that Memory Manager was attempting to copy data to or from memory and had a problem. Another indicator of pagefile latency issues is if the Windows server has a system failure, and either of the following error messages are displayed on a blue screen:
Event Type: Warning
Event Source: Disk
Description: An error was detected on device \Device\Harddisk0\DR0 during a paging operation.
Event ID: 11
Description: The driver detected a controller error on Device\ScsiPort0.
Event ID: 9
Description: The device, \Device\ScsiPort0, did not respond within the timeout period.0x00000050 PAGE_FAULT_IN_NONPAGED_AREAA possible resolution is to place the pagefile on the host's local hard disk. Windows needs reliable access to the pagefile as data is paged in or out of memory. Having the pagefile local to the host guarantees that access is not influenced by other devices and hosts on the SAN.
Note If the pagefile is not on the same partition as the boot partition (typically c:\Windows or c:\WINNT), the creation of a Memory.dmp file will not occur. A Memory.dmp file is used for troubleshooting a Windows computer that has a STOP error. For information about how to configure your computer for a crashdump, see Windows Help.
For more information about Windows server clusters in a SAN environment, click the following article numbers to view the articles in the Microsoft Knowledge Base: