Update resolves heavy memory use in ReFS on a computer that is running Windows Server 2016

Applies to: Windows Server Datacenter 2016, Windows Server Standard 2016

Summary


You notice heavy memory use by the Resilient File System (ReFS) on a computer that is running Windows Server 2016. You may also notice that an ReFS volume becomes unresponsive or freezes when you perform backups. This can occur specifically when you use a backup application that performs large block-clone operations.

This update improves ReFS performance by more thoroughly unmapping multiple views of a file.

How to get the update


This update is included in the February 22, 2018, cumulative update.

More Information


This update includes optional tunable registry parameters to address large ReFS metadata streams that were previously documented in KB 4016173 and KB 4035951.

Important

  • A restart is required for these parameter changes to take effect.
  • These parameters can be used in any combination because they do not overlap functionally.
  • These parameters must be set consistently on every node of a failover cluster.
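
The cluster consistency requirement above can be checked with a small comparison routine. The following is a minimal sketch only. It assumes that the values were already collected from each node by some other means, and the function name, node names, and sample data are hypothetical.

```python
# Hypothetical helper: compares ReFS tunables that were already collected
# from each cluster node. Collecting the values is out of scope here.

def find_inconsistent_parameters(node_settings):
    """Return {parameter: {node: value}} for parameters whose values differ.

    node_settings maps a node name to a dict of parameter name -> value.
    A parameter that is missing on a node is reported as None.
    """
    all_names = set()
    for settings in node_settings.values():
        all_names.update(settings)

    mismatches = {}
    for name in sorted(all_names):
        per_node = {
            node: settings.get(name)
            for node, settings in node_settings.items()
        }
        # More than one distinct value means the nodes disagree.
        if len(set(per_node.values())) > 1:
            mismatches[name] = per_node
    return mismatches


# Sample data is invented for illustration only.
nodes = {
    "node1": {"RefsEnableLargeWorkingSetTrim": 1, "RefsNumberOfChunksToTrim": 8},
    "node2": {"RefsEnableLargeWorkingSetTrim": 1, "RefsNumberOfChunksToTrim": 4},
}
print(find_inconsistent_parameters(nodes))
```

An empty result means that every collected parameter has the same value on all nodes.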

Tunable parameters

Parameter

Description

RefsEnableLargeWorkingSetTrim

This option causes ReFS to try to complete an MM unmap of all metadata streams at every checkpoint. This option produces the expected result only if the volume is idle and has no mapped pages.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem

Value Name: RefsEnableLargeWorkingSetTrim
Value Type: REG_DWORD

Value Data: 1
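
One way to apply a value such as this one is the built-in reg add command. The following minimal sketch only builds the argument list and does not run it. The helper name build_reg_add_command is hypothetical, and the short root name HKLM is used in place of HKEY_LOCAL_MACHINE.

```python
# Subkey from this article, written with the HKLM abbreviation that reg.exe accepts.
FILESYSTEM_KEY = r"HKLM\System\CurrentControlSet\Control\FileSystem"


def build_reg_add_command(key, value_name, dword):
    """Build a 'reg add' argument list that writes a REG_DWORD value.

    Hypothetical helper for illustration; it does not touch the registry.
    """
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must fit in 32 unsigned bits")
    return [
        "reg", "add", key,
        "/v", value_name,      # value name
        "/t", "REG_DWORD",     # value type
        "/d", str(dword),      # value data (decimal)
        "/f",                  # overwrite without prompting
    ]


command = build_reg_add_command(FILESYSTEM_KEY, "RefsEnableLargeWorkingSetTrim", 1)
print(" ".join(command))
```

The resulting list could be passed to subprocess.run from an elevated prompt on the server. A restart is still required before the change takes effect.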

RefsNumberOfChunksToTrim

ReFS uses lazy MM unmap logic. Therefore, when ReFS cycles the namespace to complete an MM unmap, it unmaps at a certain granularity. The amount of virtual address (VA) space that is unmapped is determined by the following formula:

RefsNumberOfChunksToTrim * 128MB (for volume of size > 10 TB)

RefsNumberOfChunksToTrim * 64MB (for volume of size < 10 TB)

This option works if the VA range that is being unmapped has no active references (that is, mapped metadata pages).

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem

Value Name: RefsNumberOfChunksToTrim
Value Type: REG_DWORD
Value Data: 4 (decimal)

Note Setting RefsNumberOfChunksToTrim to larger values causes ReFS to trim more aggressively. This reduces the memory that is being used. Set the trim value to an appropriate number: 8, 16, 32, and so on.
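
To see how the trim value scales, the formula for RefsNumberOfChunksToTrim can be worked through in a few lines. This is a sketch under two assumptions that the article does not state: sizes use binary units (1 MB = 1024 * 1024 bytes), and a volume of exactly 10 TB is grouped with the smaller volumes. The function name is hypothetical.

```python
MIB = 1024 ** 2
TIB = 1024 ** 4


def unmapped_va_bytes(chunks_to_trim, volume_size_bytes):
    """Virtual address space unmapped per cycle, per the formula in this article.

    Assumption: a volume of exactly 10 TB uses the 64 MB chunk size,
    because the article specifies only "> 10 TB" and "< 10 TB".
    """
    chunk = 128 * MIB if volume_size_bytes > 10 * TIB else 64 * MIB
    return chunks_to_trim * chunk


# Compare the example value 4 with the larger values suggested in the note.
for chunks in (4, 8, 16, 32):
    large = unmapped_va_bytes(chunks, 20 * TIB) // MIB
    small = unmapped_va_bytes(chunks, 4 * TIB) // MIB
    print(f"{chunks:>2} chunks: {large} MB (20 TB volume), {small} MB (4 TB volume)")
```

For example, a value of 4 corresponds to 512 MB on a volume larger than 10 TB and 256 MB on a smaller volume.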

RefsEnableInlineTrim

With this option, ReFS sends down an MM trim inline while it unmaps a metadata page. This is the most aggressive option, and it can cause a performance regression if ReFS is used on high-performance media, such as SSD or NVMe.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem

Value Name: RefsEnableInlineTrim
Value Type: REG_DWORD
Value Data: 1

Recommendations:

  • If a large active working set causes poor performance, try to set RefsEnableLargeWorkingSetTrim = 1.
  • If this setting does not produce a satisfactory result, try different values for RefsNumberOfChunksToTrim, such as 8, 16, 32, and so on.
  • If this still does not provide the effect that you want, set RefsEnableInlineTrim = 1.
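
The recommended order can be written down as an ordered list of steps to try one at a time. This is an illustrative sketch only. The list stops at 32 although the article says "and so on", and the names ESCALATION_STEPS and next_step are hypothetical.

```python
# Ordered steps taken from the recommendations above. Each change needs a
# restart before its effect can be judged.
ESCALATION_STEPS = [
    ("RefsEnableLargeWorkingSetTrim", 1),
    ("RefsNumberOfChunksToTrim", 8),
    ("RefsNumberOfChunksToTrim", 16),
    ("RefsNumberOfChunksToTrim", 32),
    ("RefsEnableInlineTrim", 1),  # most aggressive option, tried last
]


def next_step(steps_already_tried):
    """Return the next (value name, value data) pair, or None when none remain."""
    if steps_already_tried < 0:
        raise ValueError("steps_already_tried must not be negative")
    if steps_already_tried < len(ESCALATION_STEPS):
        return ESCALATION_STEPS[steps_already_tried]
    return None


print(next_step(0))
print(next_step(4))
```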

RefsDisableCachedPins

This option disables cached pins. Cached pins were a major cause of the large active working set.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem

Value Name: RefsDisableCachedPins
Value Type: REG_DWORD
Value Data: 1

RefsProcessedDeleteQueueEntryCountThreshold

This option adds a heuristic to the ReFS checkpointing logic that causes ReFS to run a checkpoint when the delete queue reaches a certain size. It addresses cases in which I/O operations become stuck on ReFS because the checkpoint logic stalls while it processes a large delete queue.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\FileSystem

Value Name: RefsProcessedDeleteQueueEntryCountThreshold
Value Type: REG_DWORD
Value Data: 2048 (decimal)

Note Setting RefsProcessedDeleteQueueEntryCountThreshold to lower values causes ReFS to run checkpoints more frequently. Start with a value of 2048. If necessary, reduce the value to 1024, and then to 512.
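
The step-down sequence in the note is a simple halving. The following sketch generates the candidate values. The function name is hypothetical, and the floor of 512 is taken only because the note stops at that value.

```python
def threshold_candidates(start=2048, floor=512):
    """Yield thresholds from start down to floor, halving each time."""
    if start < 1 or floor < 1:
        raise ValueError("start and floor must be positive")
    value = start
    while value >= floor:
        yield value
        value //= 2


print(list(threshold_candidates()))
```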

DuplicateExtentBatchSizeinMB
(Only applicable to Microsoft Data Protection Manager)

Large duplicate-extents calls introduce latency into the system because other operations have to wait until these long-running operations are completed. This option reduces the size of each duplicate-extents call.

Note DPM will set this registry value by default as part of UR4.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\Software\Microsoft\Microsoft Data Protection Manager\Configuration\DiskStorage

Value Name: DuplicateExtentBatchSizeinMB
Value Type: REG_DWORD
Value Data: 100 (decimal)

Note The default value for DuplicateExtentBatchSizeinMB is 2000 (2 GB). Any value from 1 to 4095 is accepted.
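
The effect of a smaller batch size can be illustrated with simple arithmetic. The sketch below validates the accepted range from the note and estimates how many calls a clone operation would be split into. The splitting model (total size divided by batch size, rounded up) is an assumption made for illustration, not a description of the internal behavior of DPM or ReFS, and both function names are hypothetical.

```python
def validate_batch_size_mb(batch_mb):
    """Check the range that the note gives for DuplicateExtentBatchSizeinMB."""
    if not 1 <= batch_mb <= 4095:
        raise ValueError("DuplicateExtentBatchSizeinMB must be from 1 to 4095")
    return batch_mb


def estimated_calls(total_mb, batch_mb):
    """Estimate the number of calls for total_mb of cloned data.

    Assumed model: each call covers at most batch_mb, so the count is
    the ceiling of total_mb / batch_mb.
    """
    validate_batch_size_mb(batch_mb)
    if total_mb < 0:
        raise ValueError("total_mb must not be negative")
    return -(-total_mb // batch_mb)  # integer ceiling division


print(estimated_calls(2000, 2000))  # default batch size
print(estimated_calls(2000, 100))   # value suggested in this article
```

Under this model, each call is shorter with the smaller batch size, so other operations wait less between calls.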

TimeOutValue

This option extends the disk time-out that is controlled by the TimeOutValue registry entry.

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk

Value Name: TimeOutValue
Value Type: REG_DWORD
Value Data: 0x78 (hexadecimal)

Note The default value for TimeOutValue is 0x41 (65 decimal). 0x78 translates to 120 decimal.
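
The hexadecimal and decimal forms can be confirmed, and the value expressed as a .reg file fragment, with a short sketch. The helper name render_reg_dword is hypothetical. The output follows the standard registration entries (.reg) format, in which DWORD data is written as eight hexadecimal digits.

```python
DISK_KEY = r"HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\Disk"


def render_reg_dword(key_path, value_name, dword):
    """Return .reg file text that sets one REG_DWORD value (hypothetical helper)."""
    if not 0 <= dword <= 0xFFFFFFFF:
        raise ValueError("a REG_DWORD must fit in 32 unsigned bits")
    return (
        "Windows Registry Editor Version 5.00\n"
        "\n"
        f"[{key_path}]\n"
        f'"{value_name}"=dword:{dword:08x}\n'
    )


# The hexadecimal values in this article and their decimal equivalents.
default_timeout = int("0x41", 16)
extended_timeout = int("0x78", 16)
print(default_timeout, extended_timeout)

print(render_reg_dword(DISK_KEY, "TimeOutValue", extended_timeout))
```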