Orchestrator 2012 Runbook Server goes offline due to deadlocks

Symptoms

The System Center 2012 Orchestrator Runbook Service takes itself offline due to database errors resulting from being the victim of one or more deadlocks. Orchestrator platform events as viewed in the Runbook Designer or the Orchestration Console reports the following events:

Summary: Runbook Server <computer> is experiencing frequent errors while accessing the database
Details: The Runbook Server is experiencing frequent errors while accessing the database. Verify the status of the database server. If the problem persists, contact Technical Support.

A review of the deadlocks show a stored procedure of dbo.sp_UnpublishPolicyRequest being executed with an objectlock against the dbo.POLICY_REQUEST_ACTION_SERVERS table from each of the locks.

Cause

Numerous Runbook Instances are being completed at the same time and executing the dbo.sp_UnpublishPolicyRequest stored procedure to remove the completed instance. The foreign key relationship from the dbo.POLICY_REQUEST_ACTION_SERVERS table with a cascading delete option but no index against the SeqNumber column of the foreign key is causing a full table lock to occur rather than a more selective row lock. When the multiple executions attempt to convert their intent to lock (IX) locks to exclusive (X) locks, the deadlock results.

Resolution

Use SQL Server Management Studio to execute the following T-SQL against your Orchestrator database to create an index on the SeqNumber column of the dbo.POLICY_REQUEST_ACTION_SERVERS table.

CREATE NONCLUSTERED INDEX [IX_POLICY_REQUEST_ACTION_SERVERS_SeqNumber] ON [dbo].[POLICY_REQUEST_ACTION_SERVERS]
(
       [SeqNumber] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY]

More Information

Example of what kind of content you will see when examining SQL deadlock graph details:

<process id="process3cd0188" taskpriority="0" logused="1596" waitresource="OBJECT: 10:2142630676:0 " waittime="85" ownerId="255020956" transactionname="DELETE" lasttranstarted="2013-06-05T22:20:51.713" XDES="0x454c4d4d8" lockMode="X" schedulerid="4" kpid="12168" status="suspended" spid="183" sbid="0" ecid="0" priority="0" trancount="2" lastbatchstarted="2013-06-05T22:20:51.713" lastbatchcompleted="2013-06-05T22:20:51.710" lastattention="1900-01-01T00:00:00.710" clientapp="Orchestrator" hostname="COMPUTER" hostpid="3596" loginname="DOMAIN\username" isolationlevel="read committed (2)" xactid="255020956" currentdb="10" lockTimeout="4294967295" clientoption1="673185824" clientoption2="128056">
  <executionStack>
    <frame procname="Orchestrator.Microsoft.SystemCenter.Orchestrator.Runtime.Internal.CompleteJob" line="22" stmtstart="1054" stmtend="1230" sqlhandle="0x03000a0044da8e2771265d0124a0000001000000000000000000000000000000000000000000000000000000">
DELETE FROM
              [dbo].[POLICY_PUBLISH_QUEUE]
       WHERE
              [SeqNumber] = @SeqNumber    </frame>
    <frame procname="Orchestrator.dbo.sp_UnpublishPolicyRequest" line="19" stmtstart="820" stmtend="1080" sqlhandle="0x03000a00ac66a25373265d0124a0000001000000000000000000000000000000000000000000000000000000">
EXEC [Microsoft.SystemCenter.Orchestrator.Runtime.Internal].[CompleteJob] @JobId, @SeqNumber, @PolicyID, 'S-1-5-500', @JobStatus    </frame>
  </executionStack>
  <inputbuf>
Proc [Database Id = 10 Object Id = 1403152044]   </inputbuf>
</process>

<objectlock lockPartition="0" objid="2142630676" subresource="FULL" dbid="10" objectname="Orchestrator.dbo.POLICY_REQUEST_ACTION_SERVERS" id="lockbdedf6280" mode="IX" associatedObjectId="2142630676">


Properties

Article ID: 2860832 - Last Review: Jul 9, 2013 - Revision: 1

Microsoft System Center 2012 Orchestrator, Microsoft System Center 2012 Orchestrator Service Pack 1

Feedback