Symptoms
Assume that you have Microsoft SQL Server 2014, 2016 or 2017 installed. You may experience one or more of the following issues:
-
The SQL Server instance appears unresponsive and a "Non-yielding Scheduler" error occurs. You may have to restart the server to recover.
-
Rollback of a transaction may take a long time to complete. In most cases restarting the instance will allow the database to recover much faster than the rollback. Note that there are many reasons a rollback may take a long time to complete, see the "More Information" section below for details on monitoring rollbacks before attempting to restart.
-
You may see high waits on spinlocks such as SOS_OBJECT_STORE.
Resolution
This problem is fixed in the following cumulative updates for SQL Server:
Each new cumulative update for SQL Server contains all the hotfixes and all the security fixes that were included with the previous cumulative update. Check out the latest cumulative updates for SQL Server:
Latest cumulative update for SQL Server 2017
Service pack information for SQL Server
This update is fixed in the following service pack for SQL Server:
Service packs are cumulative. Each new service pack contains all the fixes that are in previous service packs, together with any new fixes. Our recommendation is to apply the latest service pack and the latest cumulative update for that service pack. You do not have to install a previous service pack before you install the latest service pack. Use Table 1 in the following article for finding more information about the latest service pack and latest cumulative update.
How to determine the version, edition, and update level of SQL Server and its components
More Information
There are many reasons why a rollback can take a long time such as a long-running transaction, a large number of VLFs in the transaction log file, slow I/O etc. In order to verify that the issue described in this article is the root cause of a slow rollback, we suggest that the following techniques be used to monitor for the progress of the rollback operation:
-
From sys.dm_exec_requests, identify the session_id whose command is set to "KILLED/ROLLBACK" and ensure that the session is accumulating both IO and CPU time indicating progress. If IO is not changing, then it may be an indication that you are encountering the issue described in this article.
-
Query sys.dm_tran_database_transactions to identify the current state of the rollback using a query like the following:
SELECT getdate() as CurrentTime, database_transaction_next_undo_lsn,database_transaction_begin_lsn,t.transaction_id,database_transaction_begin_time,database_transaction_log_record_count,db_name(t.database_id)
FROM sys.dm_tran_database_transactions t
JOIN sys.dm_exec_requests s
ON t.transaction_id=s.transaction_idWHERE t.database_id=db_id('<Database Name') and s.session_id=<Session_id performing the rollback operation>
Note:
In the above query,
database_transaction_next_undo_lsn is the LSN of the next record to undo. database_transaction_begin_lsn is the LSN of the begin record for the transaction in the transaction log.
database_transaction_next_undo_lsn should be decreasing with each snapshot of this query. Rollback will complete successfully when the database_transaction_next_undo_lsn reaches database_transaction_begin_lsn.
The goal here is to take a few snapshots of the previous query within a predetermined interval and then use the delta of the LSNs processed in database_transaction_next_undo_lsn within that interval and extrapolate the time taken in order to estimate the time it will take for the database_transaction_next_undo_lsn to reach the database_transaction_begin_lsn.
If the rollback is progressing at a decent rate between each snapshot, we suggest that the rollback be allowed to complete on its own without restarting the SQL Server instance.
See the following articles for more information on long-running recovery:
-
SQL Server (2000, 2005, 2008): Recovery/Rollback Taking Longer Than Expected
-
Tracking database recovery progress using information from DMV
Status
Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.
References
Learn about the terminologythat Microsoft uses to describe software updates.