Troubleshooting automatic failover problems in SQL Server 2012 AlwaysOn environments

Summary

Microsoft SQL Server 2012 AlwaysOn availability groups can be configured for automatic failover. If a health issue is detected on the instance of SQL Server that hosts the primary replica, the primary role can be transitioned to the automatic failover partner (a secondary replica). However, this transition does not always succeed. In some cases, the secondary replica transitions only to the resolving role and never reaches the primary role. When this occurs, no replica holds the primary role unless the original primary replica returns to a healthy state, and the availability databases are inaccessible.

This article lists some common causes of unsuccessful automatic failover and describes the steps that you can take to diagnose each cause.

More Information

The symptoms when an automatic failover is triggered successfully

When an automatic failover is triggered because of a health issue on the instance of SQL Server that hosts the primary replica, the secondary replica transitions to the resolving role and then to the primary role. Additionally, informational messages that resemble the following are written to the SQL Server error log:

The state of the local availability replica in availability group '<Group name>' has changed from 'RESOLVING_NORMAL' to 'PRIMARY_PENDING'
The state of the local availability replica in availability group '<Group name>' has changed from 'PRIMARY_PENDING' to 'PRIMARY_NORMAL'

Note The secondary replica transitions successfully from the RESOLVING_NORMAL state, through the PRIMARY_PENDING state, to the PRIMARY_NORMAL state.
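
The check described above can be sketched in a short script. This is a minimal illustration, not a Microsoft tool. It assumes that the log lines were already exported to text and that they follow the message format quoted above. The function names and the group name 'AG1' are hypothetical.

```python
import re

# Matches the state-change message format quoted in this article.
STATE_CHANGE = re.compile(
    r"availability group '(?P<group>[^']+)' has changed from "
    r"'(?P<old>[A-Z_]+)' to '(?P<new>[A-Z_]+)'"
)


def extract_transitions(log_lines):
    """Return (group, old_state, new_state) tuples in log order."""
    transitions = []
    for line in log_lines:
        match = STATE_CHANGE.search(line)
        if match:
            transitions.append((match["group"], match["old"], match["new"]))
    return transitions


def failover_completed(transitions, group):
    """True if the last recorded state for the group is PRIMARY_NORMAL."""
    states = [new for g, _, new in transitions if g == group]
    return bool(states) and states[-1] == "PRIMARY_NORMAL"


# Hypothetical sample: the two messages above, with 'AG1' as the group name.
sample = [
    "The state of the local availability replica in availability group 'AG1' "
    "has changed from 'RESOLVING_NORMAL' to 'PRIMARY_PENDING'",
    "The state of the local availability replica in availability group 'AG1' "
    "has changed from 'PRIMARY_PENDING' to 'PRIMARY_NORMAL'",
]
print(failover_completed(extract_transitions(sample), "AG1"))  # True
```

If the last recorded state for the group is anything other than PRIMARY_NORMAL, the failover did not finish on this replica.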

The symptoms when automatic failover is unsuccessful

If an automatic failover is not successful, the secondary replica does not transition to the primary role. Instead, the replica reports that it is in Resolving status. Additionally, the availability databases report that they are in Not Synchronizing status, and applications cannot access these databases.

For example, in the following image, SQL Server Management Studio reports that the secondary replica is in Resolving status because the automatic failover process was unable to transition the secondary replica into the primary role:
Image
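
The two symptoms described above can also be expressed as a simple check. This is a rough sketch under stated assumptions: the role and the database states were already collected as text (for example, from SQL Server Management Studio), and the function name and sample values are hypothetical.

```python
def normalize(status):
    """Fold case and underscores so 'Not Synchronizing' and
    'NOT_SYNCHRONIZING' compare equal."""
    return status.strip().upper().replace("_", " ")


def failover_appears_unsuccessful(replica_role, database_states):
    """True when the replica is in Resolving status and every listed
    availability database is in Not Synchronizing status."""
    role_is_resolving = normalize(replica_role) == "RESOLVING"
    databases_down = all(
        normalize(state) == "NOT SYNCHRONIZING" for state in database_states
    )
    return role_is_resolving and databases_down


# Hypothetical values, as they might be read from the replica after a failed failover.
print(failover_appears_unsuccessful("Resolving", ["Not Synchronizing"]))  # True
print(failover_appears_unsuccessful("Primary", ["Synchronized"]))         # False
```

A result of True only confirms the symptoms. The cause still has to be diagnosed by using the cases that are listed in this article.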

This article describes several possible reasons that automatic failover may not succeed, and how to diagnose each cause.

Case 1: "Maximum Failures in the Specified Period" value is exhausted
Case 2: Insufficient NT Authority\SYSTEM account permissions
Case 3: The availability databases are not in a SYNCHRONIZED state

Properties

Article ID: 2833707 - Last Review: Apr 22, 2013 - Revision: 1

Microsoft SQL Server 2012 Developer, Microsoft SQL Server 2012 Enterprise, Microsoft SQL Server 2012 Express, Microsoft SQL Server 2012 Service Pack 1, Microsoft SQL Server 2012 Standard, Microsoft SQL Server 2012 Web, Microsoft SQL Server 2012 Enterprise Core
