Troubleshooting SQL Server AlwaysOn Issues

Ισχύει για: Microsoft SQL ServerSQL Server 2012 EnterpriseSQL Server 2014 Enterprise Περισσότερα

What does this guide do?

Help you solve some of the common issues you may run into when working with AlwaysOn Configuration on SQL server.

Who is it for?

SQL server admins who need to create, configure and manage AlwaysOn Availability groups.

How does it work?

We’ll begin by asking you the SQL issue that you are facing. Then we’ll take you through a series of steps that are specific to your situation.

Estimated time of completion:

15-30 minutes.

Notes:

Notes:

If you are looking for documentation on setting up AlwaysOn configuration please review the following documents:

Getting Started with AlwaysOn Availability Groups (SQL Server) - The document provides answers to many questions you may have related to Availability groups, setup etc. Following all the steps in this doc and reviewing Prerequisites, Restrictions, and Recommendations for AlwaysOn Availability Groups (SQL Server) will help prevent a lot of issues that you may run into with setting up and maintaining availability groups in your environment.

Additional resources:


Is this information helpful?

Typical configuration problems include AlwaysOn Availability Groups are disabled, accounts are incorrectly configured, the database mirroring endpoint does not exist, the endpoint is inaccessible (SQL Server Error 1418), network access does not exist, and a join database command fails (SQL Server Error 35250). Review the following document for help on troubleshooting these issues.

Troubleshoot AlwaysOn Availability Groups Configuration (SQL Server)

Additional links:


Did this resolve your issue?

One of the most common configuration issues customers encounter is availability group listener creation. The errors are similar to the following:

Msg 19471, Level 16, State 0, Line 2The WSFC cluster could not bring the Network Name resource with DNS name '' online. The DNS name may have been taken or have a conflict with existing name services, or the WSFC cluster service may not be running or may be inaccessible. Use a different DNS name to resolve name conflicts, or check the WSFC cluster log for more information.      
Msg 19476, Level 16, State 4, Line 2The attempt to create the network name and IP address for the listener failed. The WSFC service may not be running or may be inaccessible in its current state, or the values provided for the network name and IP address may be incorrect. Check the state of the WSFC cluster and validate the network name and IP address with the network administrator.

The majority of time, listener creation failure resulting in the messages above are due to a lack of permissions for the Cluster Name Object (CNO) in Active Directory to create and read the listener computer object. For troubleshooting this problem, please review the following articles:

Additional links:

Create or Configure an Availability Group Listener (SQL Server)


Did this resolve your issue?

If you notice that the automatic failover is not working as expected either during testing or in production, review the following articles:

Improper configuration of “Maximum failures in the specified period” is one of the leading causes for primary not automatically failing over to the secondary. The default value for this settting is N-1, where N is the number of replica. For additional information refer to the following blog post on MSDN. 


Did this resolve your issue?

After you configure the availability group listener for an AlwaysOn Availability Group in Microsoft SQL Server 2012, you may be unable to ping the listener or connect to it from an application. You may get an error that is similar to the following:

Sqlcmd: Error: Microsoft SQL Native Client : Login timeout expired.

To troubleshoot this and similar errors review the following:

Additional information:


Did this resolve your issue?

  1. Lot of issues related to AlwaysOn occur due to improper configuration of the listener. If you are having connection issues to the listener,
    1. Ensure you read all the limitations of ILB listener and followed all the steps documented in the following article paying particular attention to dependency configuration, IP address, and various other parameters in the PowerShell script.
      Configure an ILB listener for AlwaysOn Availability Groups in Azure
    2. If unsure, you may want to delete and recreate the listener as per the above document.
  2. If you recently moved your VM to a different service or if the IP addresses changed, you need to update the value of the IP address resource to reflect the new address and you need to recreate the load balanced endpoint for your AG. You can update the IP address using the Get/Set commands as follows:
    Get-ClusterResource "IPResourceName" | Set-ClusterParameter -name Address –value “w.x.y.z”

Recommended docs:


Did this resolve your issue?

After an automatic failover or a planned manual failover without data loss on an availability group, you may find that the failover time exceeds your recovery time objective (RTO). To troubleshoot the causes and potential resolutions review the following Troubleshooting guide on MSDN:


Did this resolve your issue?

You may notice that changes on primary replica are not getting propagated to secondary in a timely manner. To troubleshoot and resolve these problems, try the following:


Did this resolve your issue?

You can reduce the transaction log sizes by configuring regular backups at either primary or secondary servers.

Review the following topics for additional information:

Is this information helpful?

Congratulations! We are glad this guide is helpful and your issue is resolved.

We're sorry you cannot solve the issue by this guide. For more information about AlwaysOn Availability Groups , please see the following:




Did this resolve your issue?
Check if the databases are taking a long time to recover by reviewing the messages in the SQL ErrorLog.

Did this resolve your issue?
  1. Is it possible to have two Listeners for one availability group?
    Yes, you can setup multiple listeners for the same availability group. Please refer below link on how configure availability group with multiple Listeners.
    https://blogs.msdn.microsoft.com/sqlalwayson/2012/02/03/how-to-create-multiple-listeners-for-same-availability-group-goden-yao/
  2.  Is it possible to have a separate NIC cards for always on traffic and Client connectivity?
    Yes, you can have dedicated NIC card for AlwaysOn trafffic. Please refer below link on how to use a dedicated NIC for always on traffic.
    https://blogs.msdn.microsoft.com/alwaysonpro/2013/11/01/configure-availability-group-to-communicate-on-a-dedicated-network/
  3. What editions support Always On failover cluster instances?
    The following topic in SQL Server Books Online has more information:
    Editions and Supported Features for SQL Server 2016
  4. How to recover in case of a failure on all nodes of your cluster?
    Review the following article:
    How to restore a replica to the primary role after the quorum is lost in SQL Server
  5. Where can I find information on support for distributed transactions in AG configurations?
    You can review the following articles:
    Cross-Database Transactions Not Supported For Database Mirroring or AlwaysOn Availability Groups (SQL Server)
    Cross-Database Transactions and Distributed Transactions for Always On Availability Groups and Database Mirroring (SQL Server)
  6. How to update AlwaysOn configurations?
    Upgrading Always On Availability Group Replica Instances
  7. How to add TDE (Transparent Data Encryption) enabled database to AG configuration?
    To add TDE enabled DB to AG, you can refer below article:
    How to configure Always On for a TDE database
  8. How to configure alerts for checking if the secondary is lagging behind the primary?You can use the following script
    SELECT ag.name AS ag_name, ar.replica_server_name AS ag_replica_server, dr_state.database_id as database_id, is_ag_replica_local = CASE WHEN ar_state.is_local = 1 THEN N’LOCAL’ ELSE ‘REMOTE’ END , ag_replica_role = CASE WHEN ar_state.role_desc IS NULL THEN N’DISCONNECTED’ ELSE ar_state.role_desc END, dr_state.last_hardened_lsn, dr_state.last_hardened_time, datediff(s,last_hardened_time, getdate()) as ‘seconds behind primary’ FROM (( sys.availability_groups AS ag JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id ) JOIN sys.dm_hadr_availability_replica_states AS ar_state ON ar.replica_id = ar_state.replica_id) JOIN sys.dm_hadr_database_replica_states dr_state on ag.group_id = dr_state.group_id and dr_state.replica_id = ar_state.replica_id;

  9. How to get alerted if the state of the database is other than synchronized?
    You can use the following script:
    SELECT ag.name AS ag_name, ar.replica_server_name AS ag_replica_server, dr_state.database_id as database_id, is_ag_replica_local = CASE WHEN ar_state.is_local = 1 THEN N’LOCAL’ ELSE ‘REMOTE’ END , ag_replica_role = CASE WHEN ar_state.role_desc IS NULL THEN N’DISCONNECTED’ ELSE ar_state.role_desc END , ar_state.connected_state_desc, ar.availability_mode_desc, dr_state.synchronization_state_desc FROM (( sys.availability_groups AS ag JOIN sys.availability_replicas AS ar ON ag.group_id = ar.group_id ) JOIN sys.dm_hadr_availability_replica_states AS ar_state ON ar.replica_id = ar_state.replica_id) JOIN sys.dm_hadr_database_replica_states dr_state on ag.group_id = dr_state.group_id and dr_state.replica_id = ar_state.replica_id; 
    You can also review the following links for additional methods to monitor AlwaysOn groups
    Always On Policies for Operational Issues with Always On Availability Groups (SQL Server)
    External link: Using Policy Based Management for SQL Server Availability Groups Data Loss Alerts
    Additional references:
    Behavior of Dynamic Witness on Windows Server 2012 R2 Failover Clustering
    Dell Documentation: Setting the SQL AG listener DNS Time to Live (TTL) and other follow-up tasks