This article describes the improvements that are included in this update for Always On Availability Groups on a Pacemaker cluster in Microsoft SQL Server.
This update includes the following improvements:
- The online_database_retries resource property is no longer used. This property is ignored if it's set. Before Cumulative Update 8 (CU8), this property could be used to control how long the start and monitor actions wait for all databases in the availability group to come ONLINE. These actions now keep waiting until the action time-out that's configured on the Pacemaker resource expires.
- The monitor_timeout resource property is renamed to connection_timeout to better reflect its usage. The original name is still accepted for backward compatibility.
- Before CU8, the monitor action time-out value could not be shorter than the monitor_timeout property value. Users who wanted the monitor action to fail faster than the recovery time would have used online_database_retries to do this. Because online_database_retries is no longer used in Cumulative Update 8, this restriction on the monitor action time-out is removed.
- The promote action now waits for databases to come ONLINE after it promotes the availability group replica.
- The demote action now sets the replica to the RESOLVING role instead of the SECONDARY role for faster failovers. The original primary remains in the RESOLVING role until a new replica is promoted to the PRIMARY role. After that, the original primary restarts into the SECONDARY role automatically. This restart is triggered by a failure of the monitor action on the original primary. Cluster-monitoring tools such as crm_mon report this failure, but it should not be considered a cause for concern.
We recommend that users who set nondefault values for the online_database_retries resource property or the monitor_timeout resource property, or who set nondefault values for any of the resource action time-outs, apply the following changes:
- Set connection_timeout to a value that is greater than the maximum time (in seconds) that it takes for databases in the availability group to complete recovery.
- Set the start and promote action time-outs to a value that is greater than the maximum time (in seconds) that it takes for databases in the availability group to complete recovery.
For example, if the databases in the availability group take 15 minutes (900 seconds) to recover, the settings should be:
- op start timeout=900s interval=0s
- op promote timeout=900s interval=0s
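The recommendation above can be turned into a small calculation. The following Python sketch is illustrative only and is not part of the update. The function names and the margin_seconds parameter are assumptions introduced here: the margin is added so that each value is strictly greater than the measured recovery time, whereas the example in this article uses the recovery time itself. The sketch only formats the values. It does not apply them to a cluster.

```python
import math
from typing import Dict, List


def recommended_settings(max_recovery_seconds: float, margin_seconds: int = 60) -> Dict[str, int]:
    """Derive connection_timeout and the start/promote action time-outs.

    max_recovery_seconds: longest observed time for the databases in the
        availability group to complete recovery.
    margin_seconds: hypothetical safety margin (an assumption of this sketch)
        that keeps every value strictly greater than the recovery time.
    """
    if max_recovery_seconds <= 0:
        raise ValueError("max_recovery_seconds must be positive")
    if margin_seconds <= 0:
        raise ValueError("margin_seconds must be positive so the result exceeds the recovery time")

    # Round the recovery time up to whole seconds, then add the margin.
    timeout = math.ceil(max_recovery_seconds) + margin_seconds
    return {
        "connection_timeout": timeout,
        "start_timeout": timeout,
        "promote_timeout": timeout,
    }


def render_op_lines(settings: Dict[str, int]) -> List[str]:
    """Format the action time-outs in the same style as the example in this article."""
    return [
        f"op start timeout={settings['start_timeout']}s interval=0s",
        f"op promote timeout={settings['promote_timeout']}s interval=0s",
    ]


if __name__ == "__main__":
    # Databases that take 15 minutes (900 seconds) to recover.
    settings = recommended_settings(900)
    print(f"connection_timeout={settings['connection_timeout']}")
    for line in render_op_lines(settings):
        print(line)
```

The size of the margin is a judgment call. A larger margin tolerates slower recoveries but delays failure detection by the same amount.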