FIX: Heavy memory usage in ReFS on Windows Server 2016 and Windows 10

Applies to: Windows 10, version 1809Windows Server 2019, all versionsWindows Server 2016 More

Symptoms


You notice heavy memory usage on a computer that's running Windows Server 2016 or Windows 10.

Cause


To provide greater resiliency for its metadata, the Resilient File System (ReFS) in Windows Server 2016 uses allocate-on-write semantics for all metadata updates. This means that ReFS never makes in-place updates to metadata. Instead, it makes all writes to newly allocated regions. 

However, allocating-on-write causes ReFS to issue more metadata I/O to new regions of the volume than write-in-place file systems do. Additionally, ReFS uses block caching logic to cache its metadata in RAM. This is not as resource-efficient as file caching logic. 

Together, the ReFS block caching logic and allocate-on-write semantics cause ReFS metadata streams to be large. ReFS uses the cache manager to create the metadata streams, and the cache manager lazily unmaps inactive views. In some situations, this lazy unmapping causes the active working set on the server to grow. This creates memory pressure that can cause poor performance.

Resolution


This issue is addressed in cumulative update 4013429 that was released on March 14, 2017. The update introduces three tunable registry parameters. (See the "Workaround" section.)

Cumulative update 4013429 is available through Windows Update. You can also download it directly through the Microsoft Update Catalog

For more information, see the following article in the Microsoft Knowledge Base:

How to set the tunable parameters


This update provides three tunable registry parameters to address large ReFS metadata streams. You can use the following optional methods to set the parameters. These parameters can be used in any combination because they don't overlap functionally. 

Warning Serious problems might occur if you modify the registry incorrectly by using Registry Editor or by using another method. These problems might require that you reinstall the operating system. Microsoft cannot guarantee that these problems can be solved. Modify the registry at your own risk.
Important 

  • A restart is required for these parameter changes to take effect.
  • These parameters must be set consistently on every node of a failover cluster.

Option 1

This option causes ReFS to try a complete an MM unmap of all metadata streams at every checkpoint. This option will produce the expected result only if the volume is idle and does not have any mapped pages. 

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
Value Name: RefsEnableLargeWorkingSetTrim
Set RefsEnableLargeWorkingSetTrim = 1
Value Type: REG_DWORD


Option 2

ReFS has a lazy MM unmap logic. Therefore, when ReFS cycles the entire namespace to complete an MM unmap, it unmaps at a certain granularity. The amount of virtual address space that is unmapped is determined by the following formula:

RefsNumberOfChunksToTrim * 128MB (for volume of size > 10 TB)
RefsNumberOfChunksToTrim * 64MB (for volume of size < 10 TB)

This option works if the VA range that's being unmapped does not have any active references (that is, mapped metadata pages).

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
Value Name: RefsNumberOfChunksToTrim
Value Type: REG_DWORD
DEFAULT (if not set or 0): 4

Note Setting RefsNumberOfChunksToTrim to higher values causes ReFS to trim more aggressively. This reduces the amount of memory that's being used. Set the trim value to an appropriate number: 8, 16, 32, and so on.


Option 3

In this option, ReFS sends down an MM trim inline while it unmaps its metadata page. This is the most aggressive option because it can cause performance regression if ReFS is used on high-performance media, such as an SSD or NVMe. 

Specify the indicated values in the following subkey:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
Value Name: RefsEnableInlineTrim
Value Type: REG_DWORD
Set RefsEnableInlineTrim = 1

Recommendation:

If a large active working set causes poor performance, first try to set RefsEnableLargeWorkingSetTrim = 1.

If this setting doesn’t produce a satisfactory result, try different values for RefsNumberOfChunksToTrim, such as 8, 16, 32, and so on.

If this still doesn’t provide the desired effect, set RefsEnableInlineTrim = 1

More Information


To update its metadata, ReFS uses allocate-on-write instead of writing-in-place to improve its resiliency to corruption.

Writing-in-place is susceptible to torn writes. This occurs if a power failure or an unexpected dismount causes a write to be only partially completed.

Allocating-on-write enables ReFS to reliably maintain metadata consistency after a power failure or an unexpected dismount. This is because ReFS can still reference the previous, consistent metadata copy.