FIX: "Could not redo log record" error, and replica is suspended in SQL Server 2016, 2014, or 2012

Symptoms

Consider the following scenario:

  • You are using Microsoft SQL Server 2016, 2014, or 2012.
  • You have a database that is part of the AlwaysOn Availability Group.
  • On the primary replica, you shrink the database files to reduce their size.
  • The primary replica sends all the changes that are recorded in the transaction log to the secondary replica.
  • On the secondary replica, the redo threads apply the changes from the transaction log to the database that is part of the availability group.
In this scenario, the replica is suspended.

Additionally, you may receive an error message that resembles the following:
<time stamp> spid41s Error: 3456, Severity: 21, State: 1.
<time stamp> spid41s Could not redo log record (#), for transaction ID (#), on page (#), database '<dbname>' (database ID #). Page: LSN = (#), allocation unit = #, type = #. Log: OpCode = #, context #, PrevPageLSN: (#). Restore from a backup of the database, or repair the database.
<time stamp> spid41s AlwaysOn Availability Groups data movement for database '<dbname>' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online.
<time stamp> spid41s Error: 3313, Severity: 21, State: 2.<time stamp> spid41s During redoing of a logged operation in database '<dbname>', an error occurred at log record ID (#). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.

Cause

This problem occurs when changes are applied during the redo process if the database engine encounters out-of-order LSNs on the system pages (GAM, PFS).

Resolution

The issue was first fixed in the following cumulative update of SQL Server:

Recommendation: Install the latest cumulative update for SQL Server


The update prevents this problem from occurring. If the problem has already occurred, follow these steps to rejoin the AlwaysOn Availability Group:
  1. Remove the existing AlwaysOn secondary replica.
  2. Run the following command on the affected data files to remove unallocated space from the database:
    DBCC SHRINKFILE(<file_id>, TRUNCATEONLY) 
  3. Back up the database and log files.
  4. Restore the database and logs on the AlwaysOn secondary replica.
  5. Join the AlwaysOn Availability Group.

Status

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

References

Learn about the terminology that Microsoft uses to describe software updates.
Propriétés

ID d'article : 3173471 - Dernière mise à jour : 6 janv. 2017 - Révision : 1

Commentaires