AlwaysOn availability groups may be reported as NOT SYNCHRONIZING after you apply SQL Server 2014 CU5, SQL Server 2012 SP2 CU4, or SQL Server 2012 SP2 CU3

Symptoms


Consider the following scenario:
  • You are running Microsoft SQL Server 2014 or Microsoft SQL Server 2012 Service Pack 2 (SP2) on a server that hosts the secondary replica of an availability group as part of a rolling upgrade.
  • You have applied one of the following updates to the SQL Server installation:
    • SQL Server 2014 Cumulative Update 5
    • SQL Server 2012 Service Pack 2 Cumulative Update 4
    • SQL Server 2012 Service Pack 2 Cumulative Update 3
    Important The hotfix that's mentioned in this article replaces these cumulative updates. Do not install these updates if you have not already done this. 
     
  • To finish installing the cumulative update, you restart this secondary replica.
  • You fail over the availability group that is transitioning the updated secondary replica to the primary role.
In this scenario, you may experience one or more of the following symptoms on the server that's running SQL Server and that's now hosting the primary replica of your availability group: 
  • The secondary replicas are reported as "NOT SYNCHRONIZING."
  • When you query sys.dm_exec_requests, you notice intermittent lock blocking between users sessions and a session whose command is reported as “DB_STARTUP.” You may also notice blocking between the CHECKPOINT and DB_STARTUP commands.
  • Deadlocks that involve the session that recovered one of your availability databases are reported in the SQL Server error log. These logs resemble the following:

    <date/time> spid<xx> Recovery is writing a checkpoint in database <dbname/dbid>. This is
    an informational message only. No user action is required.
    <date/time> spid<xx> Recovery completed for database <dbname/dbid> in <x> second(s) (analysis
    <x> ms, redo <x> ms, undo <x> ms.) This is an informational message only. No user action is required.

    <date/time> spid<xx> Error: 1205, Severity: 13, State: 28.
    <date/time> spid<xx> Transaction (Process ID <xx>) was deadlocked on lock resources with another
    process and has been chosen as the deadlock victim. Rerun the transaction.
  • If your availability database is enabled for Microsoft SQL Server Service Broker, messages in your availability database may not be processed successfully. If you start the Profiler tracing tool, and then you capture the "Broker:Message Classify" event, the following event is captured:

    9791, The broker is disabled in the sender's database
Note This is not a systematic problem. You may be able to apply these cumulative updates on an AlwaysOn configuration without experiencing this problem. If you have already applied these cumulative updates, and you did not notice this problem, your system is not affected and this information does not apply to you.

Cause

This problem occurs because a race condition sometimes occurs between the system threads and the user connections. This prevents the patching logic of the cumulative update from obtaining the locks that are required to complete the update process.

Resolution

To resolve this problem, apply the following critical on-demand (COD) hotfix:

3034679 FIX: AlwaysOn availability groups may be reported as NOT SYNCHRONIZING

Important You must apply this COD hotfix instead of the following cumulative updates:
  • SQL Server 2014 Cumulative Update 5
  • SQL Server 2012 Service Pack 2 Cumulative Update 4
  • SQL Server 2012 Service Pack 2 Cumulative Update 3


Note If you have already applied these cumulative updates, you must use the following work around to resolve this problem.

Workaround

Because this problem is caused by contention between the user session and the upgrade session against the availability databases while the databases transition to the primary role, you must eliminate this contention to enable the databases to recover from this state.

To work around this problem, follow these steps:
  1. Try the following methods in the given order.

    Method 1: Eliminate database access
    When databases are experiencing the symptoms that are mentioned in the “Symptoms” section, use one or both of the following steps as necessary to eliminate the lock-blocking condition:  
    • Query sys.dm_exec_requests to locate sessions in which lock blocking occurs in the availability databases. Use the KILL statement to end these sessions.
    • Disable or stop the application that is accessing the availability databases.

    If Method 1 does not resolve the problem, go to Method 2.


    Method 2: Restart the SQL Server host server
    When application access and user access are still disabled, restart the instance of SQL Server that is hosting the affected availability databases. To do this, follow these steps: 
    1. Disable automatic failover of the availability group.
    2. Restart the affected instance of SQL Server that is hosting the primary replica.
    3. Enable automatic failover of the availability group.
  2. After the affected databases recover completely, reestablish application and user connectivity.

Status

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

References

For more information about the cumulative updates that are affected by this problem, see the following Microsoft Knowledge Base articles:
속성

문서 ID: 3033492 - 마지막 검토: 2015. 2. 5. - 수정: 1

피드백