Troubleshooting AlwaysOn availability databases in a "recovery pending" or "suspect" state in SQL Server 2012

Article translations Article translations
Article ID: 2857849 - View products that this article applies to.
Expand all | Collapse all

Summary

Assume that an availability database that is defined in an AlwaysOn availability group transitions to a "recovery pending" or "suspect" state in Microsoft SQL Server 2012. If this occurs on the availability group’s primary replica, database availability is affected. In this situation, you cannot access the database through the client applications. Additionally, you cannot drop or remove the database from the availability group.

For example, assume that Microsoft SQL Server is running and that an availability database is set to the "recovery pending" or "suspect" state. When you query the dynamic management views (DMVs) at the primary replica by using the following SQL script, the database might be reported in a NOT_HEALTHY and RECOVERY_PENDING state or in a SUSPECT state as follows:

select dc.database_name, d.synchronization_health_desc, d.synchronization_state_desc, d.database_state_descfrom sys.dm_hadr_database_replica_states d join sys.availability_databases_cluster dcon d.group_database_id=dc.group_database_id and d.is_local=1
database_name   synchronization_health_desc    synchronization_state_desc     database_state_desc
-------------------- ------------------------------ ------------------------------ ---------------------
<Database name> NOT_HEALTHY NOT SYNCHRONIZING RECOVERY_PENDING
(1 row(s) affected)
Collapse this imageExpand this image
Image 1


Additionally, this database may be reported as being in the Not Synchronizing / Recovery Pending or Suspect state in Microsoft SQL Server Management Studio.

Collapse this imageExpand this image
Image 2


When the database is defined in an availability group, the database cannot be dropped or restored. Therefore, you have to take specific steps to recover the database and return it to production use. This article describes the errors and limitations of an availability database that is in a "recovery pending" or "suspect" state and how to restore the database to full functionality in an availability group.

More information

The following content discusses the errors and limitations of an availability database that is in a "recovery pending" state in various situations.

Database status prevents restoring database

Collapse this imageExpand this image
assets folding start collapsed
You try to run the following SQL script in order to restore the database that has the RECOVERY parameter:

restore database <Database name> with recovery
When you run this script, you receive the following error message because the database is defined in an availability group:

Msg 3104, Level 16, State 1, Line 1
RESTORE cannot operate on database '<Database name>' because it is configured for database mirroring or has joined an availability group. If you intend to restore the database, use ALTER DATABASE to remove mirroring or to remove the database from its availability group.

Msg 3013, Level 16, State 1, Line 1
RESTORE DATABASE is terminating abnormally.

Collapse this imageExpand this image
assets folding end collapsed

Database status prevents dropping database

Collapse this imageExpand this image
assets folding start collapsed
You try to run the following SQL script in order to drop the database:

drop database <Database name>
When you run this script, you receive the following error message because the database is defined in an availability group:

Msg 3752, Level 16, State 1, Line 1
The database '<Database name>' is currently joined to an availability group. Before you can drop the database, you need to remove it from the availability group.

Collapse this imageExpand this image
assets folding end collapsed

Database status prevents removing database from availability group

Collapse this imageExpand this image
assets folding start collapsed
You try to run the following SQL script to remove the database from the availability group:

alter database <Database name> set hadr off
When you try to run this script, you receive the following error message because the availability database belongs to the primary replica:

Msg 35240, Level 16, State 14, Line 1
Database '<Database name>' cannot be joined to or unjoined from availability group '<Availability group name>'. This operation is not supported on the primary replica of the availability group.

Because of this error message, you may be compelled to fail over the database. After the database is failed over, the replica that owns the "recovery pending" database is in the secondary role. In this situation, you try to execute the following SQL script again in order to remove the database from the availability group at the secondary replica:
alter database <Database name> set hadr off
However, you still cannot remove the database from the availability group, and you receive the following error message because the database is still in a "recovery pending" state:

Msg 921, Level 16, State 112, Line 1
Database '<Database name>' has not been recovered yet. Wait and try again.
Collapse this imageExpand this image
assets folding end collapsed

Resolution

To resolve this issue, use the following methods, as appropriate for the situation.

The database is in the secondary role

Collapse this imageExpand this image
assets folding start collapsed
To resolve this issue, take the following general actions:
  • Remove from the availability group the replica that is hosting the damaged database when the database is in the secondary role.
  • Resolve any issues that are affecting the system and that might have contributed to the database failure.
  • Restore the replica to the availability group.
To take these actions, connect to the new primary replica, and then run the "ALTER AVAILABILITY GROUP" SQL script to remove the replica that is hosting the failed availability database. To do this, follow these steps.

Note These steps assume that the primary replica first hosts the damaged database. Therefore, a failover must first occur to transition the replica that is hosting the damaged database into a secondary role. 
  1. Connect to the server that is running SQL Server and that is hosting the secondary replica.
  2. Run the following SQL script:
    alter availability group <Availability group name> failover
  3. Run the following SQL script to remove the replica that is hosting the damaged database from the availability group:
    alter availability group <Availability group name> remove replica on '<SQL Server node name>'
  4. Resolve any issues on the server that is running SQL Server and that might contribute to the database failure. 
  5. Add the replica back into the availability group.
Collapse this imageExpand this image
assets folding end collapsed

The primary replica is the only replica in the availability group

Collapse this imageExpand this image
assets folding start collapsed
If the primary replica hosts the damaged database and is the only working replica in the availability group, the availability group must be dropped. After the availability group is dropped, your database can be recovered from a backup, or other emergency recovery efforts can be applied to restore the databases and to resume production.

To drop the availability group, un the following SQL script :

drop availability group <Availability group name>
At this point, you can try to recover the problematic database. Or, or you can restore the database from the last-known good backup copy.
Collapse this imageExpand this image
assets folding end collapsed

Preserve the listener when you drop the availability group

Collapse this imageExpand this image
assets folding start collapsed
When you drop an availability group, the listener resource is also dropped and interrupts application connectivity to the availability databases.

To minimize application downtime, use one of the following methods to sustain application connectivity through the listener and drop the availability group:

Method 1: Associate the listener with a new availability group (role) in Failover Cluster Manager
This method lets you maintain the listener while dropping and re-creating the availability group.
  1. On the instance of SQL Server to which the existing availability group listener is directing connections, create a new, empty availability group. To simplify this process, use the Transact-SQL command to create an availability group that has no secondary replica or database:

    use master
    go
    create availability group ag for replica on 'sqlnode1' with 
    (endpoint_url = 'tcp://sqlnode1:5022', availability_mode=asynchronous_commit, failover_mode=manual)
  2. Start Failover Cluster Manager, and then click Roles in the left pane. In the pane that lists the roles, select the original availability group.
  3. In the bottom-middle pane under the Resources tab, right-click the availability group resource, and then click Properties. Click the Dependencies tab, delete the dependency to the listener, and then click OK.

    Collapse this imageExpand this image
    2930548
  4. Under the resources, right click the listener, click More Actions, and then click Assign to Another Role.
  5. In the Assign Source to Role dialog box, select the new availability group, and then click OK.

    Collapse this imageExpand this image
    2930549
  6. In the Roles pane, select the new availability group. In the bottom-middle pane, under the Resources tab, you should now see the new availability group and the listener resource. Right-click the new availability group resource, and then click Properties.
  7. Click the Dependencies tab, select the listener resource from the drop-down box, and then click OK.

    Collapse this imageExpand this image
    2930562
  8. In Microsoft SQL Server Management Studio, use Object Explorer to connect to the instance of SQL Server that hosts the primary replica of the new availability group. Click AlwaysOn High Availability, click the new availability group, and then click Availability Group Listeners. You should find the listener.
  9. Right-click the listener, click Properties, type the appropriate port number for the listener, and then click OK.

    Collapse this imageExpand this image
    2930563
This makes sure that applications that use the listener can still use it to connect to the instance of SQL Server that is hosting the production databases without interruption. The original availability group can now be completely removed and re-created. Or, the databases and replicas can be added to the new availability group.

Important If you re-create the original availability group, you should reassign the listener back to the availability group role, set up the dependency between the new availability group resource and the listener, and then reassign the port to the listener. To do this, follow these steps:
  1. Start Failover Cluster Manager, and then click Roles in the left pane. In the pane that lists the roles, click the new availability group that hosts the listener.
  2. In the bottom middle pane under the Resources tab, right-click the listener, click More Actions, and then click Assign to Another Role. In the dialog box, choose the re-created availability group, and then click OK.
  3. In the Roles pane, click the re-created availability group. In the bottom middle pane, under the Resources tab, you should now see the re-created availability group and the listener resource. Right-click the re-created availability group resource, and then click Properties.
  4. Click the Dependencies tab, select the listener resource from the drop-down box, and then click OK.
  5. In SQL Server Management Studio, use Object Explorer to connect to the instance of SQL Server that hosts the primary replica of the re-created availability group. Click AlwaysOn High Availability, click the new availability group, and then click Availability Group Listeners. You should find the listener.
  6. Right-click the listener, click Properties, type the appropriate port number for the listener, and then click OK.
Method 2: Associate the listener with an existing SQL Server Failover Clustered Instance (SQLFCI)

If you are hosting your availability group on a SQL Server Failover Clustered Instance (SQLFCI), you can associate the listener clustered resource with the SQLFCI clustered resource group while you drop and then re-create the availability group.
  1. Start Failover Cluster Manager, and then click Roles in the left pane.
  2. In the pane that lists the roles, select the original availability group.
  3. In the bottom middle pane under the Resources tab, right-click the availability group resource, and then click Properties.
  4. Click the Dependencies tab, delete the dependency to the listener, and then click OK.
  5. In the bottom middle pane under the Resources tab, right click the listener, click More Actions and then click Assign to Another Role.
  6. In the Assign Resource to Role dialog box, click the SQL Server FCI instance, and then click OK.

    Collapse this imageExpand this image
    2930565
  7. In the Roles pane, select the SQL Server Failover Clustered Instance (SQLFCI) group. In the bottom middle pane, under the Resources tab, you should now see the new listener resource.
This makes sure that applications that use the listener can still use it to connect to the instance of SQL Server that hosts the production databases without interruption. The original availability group can now be completely removed and re-created. OR, the databases and replicas can be added to the new availability group.

Important After the availability group is re-created, reassign the listener back to the availability group role. Then set up the dependency between the new availability group resource and the listener, and re-ssign the port to the listener:
  1. Start Failover Cluster Manager, and then click Roles in the left pane.
  2. In the pane that lists the roles, click the original SQL Failover Clustered Instance role.
  3. In the bottom middle pane, under the Resources tab, right-click the listener, click More Actions, and then click Assign to Another Role.
  4. In the dialog box, click the re-created availability group, and then click OK.
  5. In the Roles pane, select the new availability group. 
  6. Under the Resources tab, you should see the new availability group and the listener resource. Right-click the new availability group resource, and then click Properties.
  7. Click the Dependencies tab, select the listener resource from the drop-down box, and then click OK.
  8. In SQL Server Management Studio, use Object Explorer to connect to the instance of SQL Server that hosts the primary replica of the new availability group. 
  9. Click AlwaysOn High Availability, click the new availability group, and then click Availability Group Listeners. You should find the listener.
  10. Right-click the listener, click Properties, type the appropriate port number for the listener, and then click OK.
Method 3: Drop the availability group, and then re-create the availability group and listener with the same listener name

This method will result in a small outage for applications that are currently connected because the availability group and listener are dropped and then re-created:
  1. Drop the availability group.

    Note This will also drop the listener.
  2. Immediately create a new, empty availability group that includes the listener definition, on the same server that hosts the production databases. For example, assume that your availability group listener is "aglisten." The following Transact-SQL statement creates an availability group with no primary or secondary database, but it also creates a listener named "aglisten." Applications can use this listener to connect.

    use master
    go
    create availability group ag for replica on 'sqlnode1' with 
    endpoint_url = 'tcp://sqlnode1:5022', availability_mode=asynchronous_commit, failover_mode=manual)
    listener 'aglisten' (with ip ((n'11.0.0.25', n'255.0.0.0')), port=1433)go
  3. Recover the damaged database. Then add it and the secondary replica back to the availability group.
Collapse this imageExpand this image
assets folding end collapsed

Properties

Article ID: 2857849 - Last Review: February 4, 2014 - Revision: 2.0
Applies to
  • Microsoft SQL Server 2012 Developer
  • Microsoft SQL Server 2012 Enterprise
  • Microsoft SQL Server 2012 Service Pack 1
  • Microsoft SQL Server 2012 Standard
  • Microsoft SQL Server 2012 Web
  • SQL Server 2012 Enterprise Core
Keywords: 
kbsurveynew kbtshoot kbexpertiseadvanced KB2857849

Give Feedback

 

Contact us for more help

Contact us for more help
Connect with Answer Desk for expert help.
Get more support from smallbusiness.support.microsoft.com