Slow performance or “Lost Communication,” “IO Error,” “Detached,” or “No Redundancy” errors for Storage Spaces Direct deployments that use Intel P3x00 NVMe devices

Applies to: Windows Server 2016 Datacenter

Summary


Microsoft has identified a critical issue that affects some Storage Spaces Direct (S2D) users who are using hardware based on the Intel P3x00 family of NVM Express (NVMe) devices with firmware versions before “Maintenance Release 8”.

Note Individual OEMs may have devices that are based on the Intel P3x00 family of NVMe devices with unique firmware version strings. Contact your OEM for more information about the latest firmware version.

If your deployment uses hardware based on the Intel P3x00 family of NVMe devices, we recommend that you immediately apply the latest available firmware (at least “Maintenance Release 8”).

Symptoms


When this issue occurs, your cluster may experience any of the following symptoms:

  • Slow workload performance
  • Virtual disks in the cluster that have an Operational Status value of Detached or No Redundancy
  • Physical disks that report a status of Lost Communication or IO Error
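
As a rough illustration of how these symptoms could be spotted in bulk, the following Python sketch scans disk records for the status values listed above. It assumes the records were exported to JSON beforehand (for example, from the output of the Get-PhysicalDisk and Get-VirtualDisk cmdlets). The record layout, the sample data, and the function name find_symptomatic are assumptions made for illustration, not a documented export format.

```python
import json

# Status values taken from the symptom list in this article.
SYMPTOM_STATUSES = {"Detached", "No Redundancy", "Lost Communication", "IO Error"}


def find_symptomatic(records):
    """Return the names of disks whose OperationalStatus matches a known symptom."""
    flagged = []
    for record in records:
        status = record.get("OperationalStatus", "")
        # Assumed export format: the status may be one string or a list of strings.
        statuses = [status] if isinstance(status, str) else list(status)
        if any(s in SYMPTOM_STATUSES for s in statuses):
            flagged.append(record.get("FriendlyName", "<unknown>"))
    return flagged


# Hypothetical sample data, not output from a real cluster.
sample = json.loads("""
[
  {"FriendlyName": "Disk-A", "OperationalStatus": "OK"},
  {"FriendlyName": "Disk-B", "OperationalStatus": "Lost Communication"},
  {"FriendlyName": "Volume-1", "OperationalStatus": ["No Redundancy"]}
]
""")

print(find_symptomatic(sample))
```

A match only indicates that a disk shows one of the symptoms. It does not by itself confirm that this firmware issue is the cause.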

Updating storage device firmware


For more information on updating storage device firmware in an automated manner with Storage Spaces Direct (S2D), see the following article:

Automated firmware updates with Storage Spaces Direct.

For a step-by-step walkthrough of updating storage device firmware in an automated manner with Storage Spaces Direct (S2D), see the following video:

Update Drive Firmware Without Downtime in Storage Spaces Direct

More Information


Microsoft has observed reports of unexpectedly long tail latencies for the Intel P3x00 family of NVMe devices with firmware versions prior to “Maintenance Release 8”. In some cases, these latencies exceed 30 seconds. This can cause Windows to mark the device as unresponsive.

After multiple unsuccessful attempts to reuse the hardware, Windows stops using the device within the cluster. If enough devices become unresponsive, the availability of virtual disks can be affected.
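
The sequence described above can be pictured with a small simulation. This Python sketch is not how Windows implements the check. It treats any I/O slower than 30 seconds as an unsuccessful attempt, which is an assumption based on the latency figure cited above, and MAX_ATTEMPTS, the state names, and the function name device_state are hypothetical values chosen for illustration.

```python
UNRESPONSIVE_AFTER_SECONDS = 30.0  # assumption based on the latencies cited above
MAX_ATTEMPTS = 3                   # hypothetical; the real retry count is not stated here


def device_state(io_latencies_seconds):
    """Classify a device from a sequence of I/O latencies (in seconds).

    Each I/O slower than the threshold counts as one unsuccessful attempt,
    and a fast I/O resets the count. After MAX_ATTEMPTS consecutive slow
    I/Os the device is treated as no longer in use.
    """
    consecutive_slow = 0
    for latency in io_latencies_seconds:
        if latency > UNRESPONSIVE_AFTER_SECONDS:
            consecutive_slow += 1
            if consecutive_slow >= MAX_ATTEMPTS:
                return "stopped"
        else:
            consecutive_slow = 0
    # A slow streak that has not yet reached the limit leaves the device flagged.
    return "unresponsive" if consecutive_slow else "healthy"


print(device_state([0.001, 0.002]))      # all I/Os fast
print(device_state([0.001, 35.0]))       # one slow I/O at the end
print(device_state([31.0, 40.0, 33.0]))  # three consecutive slow I/Os
```

The sketch shows why a handful of very slow I/Os is enough to take a device out of service even though most requests complete quickly.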

Status


Microsoft has confirmed that this is a hardware issue that affects the Microsoft products that are listed in the "Applies to" section. Intel has identified the root cause and has confirmed that it is addressed in firmware versions based on “Maintenance Release 8”.

Known hardware impacted:

  • Hardware based on the Intel P3x00 family of NVMe devices (example: P3500, P3600, P3700 NVMe in all capacities)

Not impacted:

  • Hardware based on the Intel S3x00 family of SATA devices (example: S3500, S3600, S3700 SATA in all capacities)
  • Hardware based on the Intel P4x00 family of NVMe devices
  • Hardware based on the Intel S4x00 family of SATA devices
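
To triage an inventory against the two lists above, a sketch along the following lines may help. It is written in Python and assumes that the family name (such as P3700) appears in the model string, and that the "x" in each family name stands for a single digit. OEM devices may report model strings that do not contain the family name at all, so an "unknown" result should be checked with the OEM. The function name classify_model and the sample strings are hypothetical.

```python
import re

# Families named in this article; "x" is treated as a single digit (an assumption).
IMPACTED = re.compile(r"\bP3\d00\b", re.IGNORECASE)
NOT_IMPACTED = re.compile(r"\b(?:S3|P4|S4)\d00\b", re.IGNORECASE)


def classify_model(model):
    """Map a model string to 'impacted', 'not impacted', or 'unknown'."""
    if IMPACTED.search(model):
        return "impacted"
    if NOT_IMPACTED.search(model):
        return "not impacted"
    return "unknown"


# Hypothetical model strings for illustration only.
for name in ["Intel P3700 1.6TB", "Intel S3500 480GB", "Intel P4500", "Vendor-X NVMe"]:
    print(name, "->", classify_model(name))
```

An "impacted" result identifies only the device family. Whether a given device is actually affected still depends on its firmware version.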