Symptoms
Consider the following scenario:
-
You are using Microsoft SQL Server 2016, 2014, or 2012.
-
You have a database that is part of the AlwaysOn Availability Group.
-
On the primary replica, you shrink the database files to reduce their size.
-
The primary replica sends all the changes that are recorded in the transaction log to the secondary replica.
-
On the secondary replica, the redo threads apply the changes from the transaction log to the database that is part of the availability group.
In this scenario, the replica is suspended.
Additionally, you may receive an error message that resembles the following:<time stamp> spid41s Error: 3456, Severity: 21, State: 1.
<time stamp> spid41s Could not redo log record (#), for transaction ID (#), on page (#), database '<dbname>' (database ID #). Page: LSN = (#), allocation unit = #, type = #. Log: OpCode = #, context #, PrevPageLSN: (#). Restore from a backup of the database, or repair the database. <time stamp> spid41s AlwaysOn Availability Groups data movement for database '<dbname>' has been suspended for the following reason: "system" (Source ID 2; Source string: 'SUSPEND_FROM_REDO'). To resume data movement on the database, you will need to resume the database manually. For information about how to resume an availability database, see SQL Server Books Online. <time stamp> spid41s Error: 3313, Severity: 21, State: 2.<time stamp> spid41s During redoing of a logged operation in database '<dbname>', an error occurred at log record ID (#). Typically, the specific failure is previously logged as an error in the Windows Event Log service. Restore the database from a full backup, or repair the database.Cause
This problem occurs when changes are applied during the redo process if the database engine encounters out-of-order LSNs on the system pages (GAM, PFS).
Resolution
The issue was first fixed in the following cumulative update of SQL Server:
Each new cumulative update for SQL Server contains all the hotfixes and all the security fixes that were included with the previous cumulative update. We recommend that you download and install the latest cumulative updates for SQL Server:
The update prevents this problem from occurring. If the problem has already occurred, follow these steps to rejoin the AlwaysOn Availability Group:
-
Remove the existing AlwaysOn secondary replica.
-
Run the following command on the affected data files to remove unallocated space from the database:
DBCC SHRINKFILE(<file_id>, TRUNCATEONLY)
-
Back up the database and log files.
-
Restore the database and logs on the AlwaysOn secondary replica.
-
Join the AlwaysOn Availability Group.
Status
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.
References
Learn about the terminology that Microsoft uses to describe software updates.