KB4338890 - FIX: "Non-yielding Scheduler" error and SQL Server appears unresponsive in SQL Server 2014, 2016 and 2017

Symptoms

Assume that you have Microsoft SQL Server 2014, 2016 or 2017 installed. You may experience one or more of the following issues:

The SQL Server instance appears unresponsive and a "Non-yielding Scheduler" error occurs. You may have to restart the server to recover.
Rollback of a transaction may take a long time to complete. In most cases restarting the instance will allow the database to recover much faster than the rollback. Note that there are many reasons a rollback may take a long time to complete, see the "More Information" section below for details on monitoring rollbacks before attempting to restart.
You may see high waits on spinlocks such as SOS_OBJECT_STORE.

Resolution

This problem is fixed in the following cumulative updates for SQL Server:

Cumulative Update 9 for SQL Server 2017

Cumulative Update 2 for SQL Server 2016 SP2

Each new cumulative update for SQL Server contains all the hotfixes and all the security fixes that were included with the previous cumulative update. Check out the latest cumulative updates for SQL Server:

Latest cumulative update for SQL Server 2017

Latest cumulative update for SQL Server 2016

Latest cumulative update for SQL Server 2014

Service pack information for SQL Server

This update is fixed in the following service pack for SQL Server:

Service Pack 3 for SQL Server 2014

Service packs are cumulative. Each new service pack contains all the fixes that are in previous service packs, together with any new fixes. Our recommendation is to apply the latest service pack and the latest cumulative update for that service pack. You do not have to install a previous service pack before you install the latest service pack. Use Table 1 in the following article for finding more information about the latest service pack and latest cumulative update.

How to determine the version, edition, and update level of SQL Server and its components

More Information

There are many reasons why a rollback can take a long time such as a long-running transaction, a large number of VLFs in the transaction log file, slow I/O etc. In order to verify that the issue described in this article is the root cause of a slow rollback, we suggest that the following techniques be used to monitor for the progress of the rollback operation:

From sys.dm_exec_requests, identify the session_id whose command is set to "KILLED/ROLLBACK" and ensure that the session is accumulating both IO and CPU time indicating progress. If IO is not changing, then it may be an indication that you are encountering the issue described in this article.
Query sys.dm_tran_database_transactions to identify the current state of the rollback using a query like the following:

SELECT getdate() as CurrentTime, database_transaction_next_undo_lsn,database_transaction_begin_lsn,t.transaction_id,database_transaction_begin_time,database_transaction_log_record_count,db_name(t.database_id)

FROM sys.dm_tran_database_transactions t

JOIN sys.dm_exec_requests s ON t.transaction_id=s.transaction_id

WHERE t.database_id=db_id('<Database Name') and s.session_id=<Session_id performing the rollback operation>

Note:

In the above query,

database_transaction_next_undo_lsn is the LSN of the next record to undo. database_transaction_begin_lsn is the LSN of the begin record for the transaction in the transaction log.

database_transaction_next_undo_lsn should be decreasing with each snapshot of this query. Rollback will complete successfully when the database_transaction_next_undo_lsn reaches database_transaction_begin_lsn.

The goal here is to take a few snapshots of the previous query within a predetermined interval and then use the delta of the LSNs processed in database_transaction_next_undo_lsn within that interval and extrapolate the time taken in order to estimate the time it will take for the database_transaction_next_undo_lsn to reach the database_transaction_begin_lsn.

If the rollback is progressing at a decent rate between each snapshot, we suggest that the rollback be allowed to complete on its own without restarting the SQL Server instance.

See the following articles for more information on long-running recovery:

Status

Microsoft has confirmed that this is a problem in the Microsoft products that are listed in the "Applies to" section.

References

Learn about the terminologythat Microsoft uses to describe software updates.

KB4338890 - FIX: "Non-yielding Scheduler" error and SQL Server appears unresponsive in SQL Server 2014, 2016 and 2017

Symptoms

Resolution

Need more help?

Want more options?

Was this information helpful?

Thank you for your feedback!

Symptoms

Resolution

About cumulative updates for SQL Server:

About Service packs for SQL Server:

Need more help?

Want more options?

Was this information helpful?

Thank you for your feedback!