INF: SQL Server may display memory corruption and recovery errors

Summary

Microsoft SQL Server may display one of the following messages in the SQL Server error log:

854: Machine supports memory error recovery. SQL memory protection is enabled to recover from memory corruption.

856: SQL Server has detected hardware memory corruption in database '%ls', file ID: %u, page ID; %u, memory address: 0x%I64x and has successfully recovered the page.

855: Uncorrectable hardware memory corruption detected. Your system may become unstable. Please check the Windows event log for more details.

More Information

On computers that have newer hardware and are running Windows Server 2012 or a later version, the hardware can notify the operating system and applications that memory pages (operating system pages) are marked as bad or damaged. Applications such as SQL Server can register for these bad memory page notifications by using the following API set:
  • GetMemoryErrorHandlingCapabilities
  • RegisterBadMemoryNotification
  • BadMemoryCallbackRoutine
SQL Server 2012 and later versions support these notifications. During startup, SQL Server checks whether the hardware supports this feature. If it does, SQL Server writes the following message to the error log:

2014-05-04 10:06:01.54 Server Machine supports memory error recovery. SQL memory protection is enabled to recover from memory corruption.
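
The startup check described above can be illustrated from user code. The following is a minimal sketch in Python that calls GetMemoryErrorHandlingCapabilities through ctypes. It is not SQL Server's implementation. The helper names memory_error_capabilities and patrol_scrubber_present are hypothetical, and treating the MEHC_PATROL_SCRUBBER_PRESENT bit as the indicator of support is an assumption made for illustration.

```python
import ctypes
import sys

# Capability bit documented for GetMemoryErrorHandlingCapabilities.
MEHC_PATROL_SCRUBBER_PRESENT = 0x1


def memory_error_capabilities():
    """Return the capability mask, or None when the API is unavailable."""
    if sys.platform != "win32":
        return None  # The API exists only on Windows.
    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    try:
        query = kernel32.GetMemoryErrorHandlingCapabilities
    except AttributeError:
        return None  # Older Windows versions do not export this function.
    query.argtypes = [ctypes.POINTER(ctypes.c_ulong)]
    query.restype = ctypes.c_int  # Win32 BOOL
    mask = ctypes.c_ulong(0)
    if not query(ctypes.byref(mask)):
        raise ctypes.WinError(ctypes.get_last_error())
    return mask.value


def patrol_scrubber_present(mask):
    """Assumption: this bit means the hardware can report failed memory."""
    return mask is not None and bool(mask & MEHC_PATROL_SCRUBBER_PRESENT)


if __name__ == "__main__":
    caps = memory_error_capabilities()
    print("capability mask:", caps)
    print("patrol scrubber present:", patrol_scrubber_present(caps))
```

On a computer that is not running Windows, the sketch returns None instead of failing, so it can be read and run anywhere.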

Currently, only the buffer pool takes action when SQL Server receives these notifications. When it receives a notification, SQL Server has to iterate through the whole buffer pool and discover the address for each allocated buffer. Then, SQL Server uses the QueryWorkingSetEx API to check whether any of the memory pages that back the data page are marked as bad. The PSAPI_WORKING_SET_EX_BLOCK output structure that corresponds to a memory page has its Bad member set to 1 if any damage is reported for that page.
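
To make the Bad member concrete, the following minimal Python sketch decodes the flags value that QueryWorkingSetEx returns for one page. It assumes the bit-field order given in the Windows SDK declaration of PSAPI_WORKING_SET_EX_BLOCK (Valid: 1 bit, ShareCount: 3, Win32Protection: 11, Shared: 1, Node: 6, Locked: 1, LargePage: 1, Reserved: 7, Bad: 1), which places Bad at bit 31. The function names are hypothetical.

```python
# Assumed layout: the fields that precede Bad occupy 31 bits in total,
# so Bad is bit 31 of the flags value.
VALID_BIT = 0
BAD_BIT = 31


def page_is_valid(flags):
    """True when the Valid bit is set in the flags value."""
    return bool((flags >> VALID_BIT) & 1)


def page_is_bad(flags):
    """True when the Bad bit is set in the flags value."""
    return bool((flags >> BAD_BIT) & 1)


if __name__ == "__main__":
    healthy = 0x1                     # Valid only
    damaged = 0x1 | (1 << BAD_BIT)    # Valid and Bad
    print(page_is_bad(healthy), page_is_bad(damaged))
```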

If that buffer or data page has not been modified and has no I/O in progress, SQL Server can discard and decommit the data page. Then, SQL Server logs the following message:

SQL Server has detected hardware memory corruption in database '%ls', file ID: %u, page ID; %u, memory address: 0x%I64x and has successfully recovered the page.

When queries require that data page again, the buffer pool reads the page from disk and brings its contents back into the buffer pool. If the on-disk version of the page is also damaged, SQL Server may log additional errors such as error 824.

If the bad page is used not by the buffer pool but by some other cached object or structure, SQL Server logs the following message:

Uncorrectable hardware memory corruption detected. Your system may become unstable. Please check the Windows event log for more details.
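
The decision flow described in this section can be summarized in a small simulation. The following Python sketch is a model written for illustration only and is not SQL Server code. The Buffer class, the handle_bad_memory function, and the choice to leave a modified or I/O-busy buffer in place are assumptions, because this article does not describe what happens to such buffers.

```python
from dataclasses import dataclass


@dataclass
class Buffer:
    database: str
    file_id: int
    page_id: int
    address: int
    dirty: bool = False            # page modified since it was read
    io_in_progress: bool = False   # page currently being read or written


# Message texts are copied from this article.
MSG_856 = ("SQL Server has detected hardware memory corruption in database "
           "'{db}', file ID: {file}, page ID; {page}, memory address: "
           "0x{addr:x} and has successfully recovered the page.")
MSG_855 = ("Uncorrectable hardware memory corruption detected. Your system "
           "may become unstable. Please check the Windows event log for "
           "more details.")


def handle_bad_memory(buffers, bad_addresses):
    """Return (surviving_buffers, log_lines) for one notification."""
    log = []
    survivors = []
    owned = {buf.address for buf in buffers}
    for buf in buffers:
        if buf.address not in bad_addresses:
            survivors.append(buf)  # backing memory is healthy
        elif not buf.dirty and not buf.io_in_progress:
            # Clean and idle: discard it; a later read reloads it from disk.
            log.append(MSG_856.format(db=buf.database, file=buf.file_id,
                                      page=buf.page_id, addr=buf.address))
        else:
            survivors.append(buf)  # assumption: behavior not documented here
    # Bad memory that no buffer owns belongs to some other structure.
    if any(addr not in owned for addr in bad_addresses):
        log.append(MSG_855)
    return survivors, log


if __name__ == "__main__":
    pool = [Buffer("db1", 1, 7, 0x1000), Buffer("db1", 1, 8, 0x2000)]
    remaining, messages = handle_bad_memory(pool, {0x1000, 0x9000})
    for line in messages:
        print(line)
```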

If the server is reporting memory errors, contact the computer hardware vendor and take appropriate actions such as running memory diagnostics, updating the BIOS and firmware, and replacing bad memory modules.

The following two extended events are available starting with SQL Server 2012. They are raised for each page that is either fixed or identified as corrupted but cannot be fixed.



You can use SQL Server trace flag 849 to prevent SQL Server from registering with the operating system for memory error notifications. When this trace flag is enabled, SQL Server does not receive bad memory notifications from the operating system. Therefore, we do not recommend that you use this trace flag under typical circumstances.

By default, SQL Server receives these notifications on supported hardware.

When SQL Server registers for these memory error notifications, the lazy writer system process does not perform constant page checks. For more information about constant page checks, see the following article in the Microsoft Knowledge Base:
2015759 How to troubleshoot Msg 832 (constant page has changed) in SQL Server

Properties

Article ID: 2967651 - Last Review: 20 May 2014 - Revision: 1
