RAPID PUBLISHING ARTICLES PROVIDE INFORMATION DIRECTLY FROM WITHIN THE MICROSOFT SUPPORT ORGANIZATION. THE INFORMATION CONTAINED HEREIN IS CREATED IN RESPONSE TO EMERGING OR UNIQUE TOPICS, OR IS INTENDED TO SUPPLEMENT OTHER KNOWLEDGE BASE INFORMATION.
In System Center Operations Manager 2007, a condition can occur when there is a large number of computer groups and a high volume of state changes on computer objects. This results in one or both of the following conditions:
1. High CPU and disk utilization on the root management server coming from the HealthService process.
2. Queues back up on the gateway, management server, and root management server system(s). This is visible in the \Health Service Management Groups(Management Group Name)\Send Queue % Used counter.
Every time the state changes on one of the top-level monitors for a computer (for example, Availability, Performance, Security, or Configuration), that state change is rolled up to every computer group of which the computer is a member. This can result in a cascading series of state changes. In environments where the volume of state changes is significant and the number of computer groups rolling up the changes is high, the conditions described may occur.
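As a rough illustration of why the load grows so quickly, the following Python sketch estimates how many rollup operations a given volume of state changes can generate. It assumes, as a simplification that the article does not state, that each top-level state change triggers exactly one rollup operation per computer group containing the computer. The function name and the sample figures are hypothetical.

```python
def estimate_rollup_operations(state_changes_per_day, groups_per_computer):
    """Estimate rollup operations per day.

    Simplifying assumption (not from the article): every top-level monitor
    state change on a computer causes one rollup operation in each computer
    group that contains the computer.
    """
    return state_changes_per_day * groups_per_computer


if __name__ == "__main__":
    # Hypothetical figures: 50,000 top-level state changes per day,
    # with each computer a member of 20 computer groups.
    print(estimate_rollup_operations(50_000, 20))
```

Under this assumption the load scales with the product of the two values, which is why both removing unnecessary groups and disabling rollup on groups that do not need it reduce it.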
In order to reduce the impact of the issue consider one of the following actions:
1. Remove any unnecessary computer groups from the management group.
2. Disable the state rollup behavior for the computer groups for which rollup is not necessary to reduce the overall load.
Use overrides to enable and disable the four dependency monitors.
on state.basemanagedentityid = basemanagedentity.basemanagedentityid
inner join managedtype with(nolock)
on basemanagedentity.basemanagedtypeid = managedtype.managedtypeid
and typename = 'Microsoft.Windows.Computer'
inner join monitor with(nolock)
on monitor.monitorid = state.monitorid
and monitorname in
group by datepart(year, timegenerated), datepart(month, timegenerated), datepart(day, timegenerated)
order by datepart(year, timegenerated), datepart(month, timegenerated), datepart(day, timegenerated)
If the number is larger than 200,000 per day, the management group is probably experiencing a very high volume of state changes.
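The threshold check can be sketched in Python as follows. This is a minimal example that assumes the daily counts have already been exported from the query into a dictionary keyed by (year, month, day). The function name, the dictionary shape, and the sample values are hypothetical; only the 200,000 figure comes from the text above.

```python
STATE_CHANGE_THRESHOLD = 200_000  # per-day figure quoted in the article


def days_over_threshold(daily_counts, threshold=STATE_CHANGE_THRESHOLD):
    """Return the (year, month, day) keys whose count exceeds the threshold.

    daily_counts is assumed to map (year, month, day) tuples to the number
    of state changes recorded for that day.
    """
    return sorted(day for day, count in daily_counts.items() if count > threshold)


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real management group.
    sample = {
        (2009, 5, 1): 150_000,
        (2009, 5, 2): 250_000,
        (2009, 5, 3): 310_000,
    }
    for day in days_over_threshold(sample):
        print(day, "exceeds", STATE_CHANGE_THRESHOLD, "state changes")
```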
Determine if CPU and disk usage are high on the root management server
To determine if CPU and disk usage are high on the root management server, use perfmon to check the following:
· Check the “Processor/% Processor Time (_Total)” counter. CPU usage is high if the average value of this counter is larger than 40%.
· Check the “LogicalDisk/% Idle Time (<OpsMgr installation disk>)” counter. Disk usage is high if the average value of this counter is less than 80%.
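The two checks above can be expressed as a short Python sketch. It assumes the counter samples have already been collected (for example, exported from a perfmon log) as plain lists of percentages. The function and constant names are hypothetical; the 40% and 80% limits are the ones given above.

```python
from statistics import mean

CPU_AVG_LIMIT = 40.0        # average % Processor Time above this is high
DISK_IDLE_AVG_LIMIT = 80.0  # average % Idle Time below this is high disk usage


def rms_load_check(cpu_samples, disk_idle_samples):
    """Return (cpu_high, disk_high) for the sampled counter values.

    Both arguments are assumed to be non-empty lists of percentages
    collected over the observation window.
    """
    cpu_high = mean(cpu_samples) > CPU_AVG_LIMIT
    disk_high = mean(disk_idle_samples) < DISK_IDLE_AVG_LIMIT
    return cpu_high, disk_high


if __name__ == "__main__":
    # Hypothetical samples for illustration only.
    print(rms_load_check([55.0, 62.5, 48.0], [70.0, 65.0, 77.5]))
```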
Identify if the queues are backing up
To identify if the queues are backing up, use perfmon to check whether the “Health Service Management Groups/Send Queue % Used (<Management Group Name>)” counter is consistently larger than 10% on the root management server, management servers, and gateway servers.
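The article does not define "consistently", so the following Python sketch adopts one possible reading: the queue is treated as backing up when at least 90% of the collected samples exceed the 10% limit. The 90% fraction, the function name, and the sample values are assumptions made for illustration.

```python
def queue_backing_up(samples, limit=10.0, min_fraction=0.9):
    """Return True if the send queue looks consistently above the limit.

    samples: Send Queue % Used readings collected over time.
    limit: the 10% figure from the article.
    min_fraction: assumed share of samples that must exceed the limit
        for the condition to count as "consistent" (hypothetical value).
    """
    if not samples:
        return False
    above = sum(1 for value in samples if value > limit)
    return above / len(samples) >= min_fraction


if __name__ == "__main__":
    # Hypothetical readings: a sustained backlog versus a single spike.
    print(queue_backing_up([12.0, 15.0, 30.0, 11.0]))
    print(queue_backing_up([2.0, 50.0, 3.0, 4.0]))
```

Requiring a fraction of samples rather than a single reading avoids flagging a brief spike that drains on its own.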
Note: Operations Manager 2007 SP1 is not affected because the rollup behavior was disabled for that release. That change has since been found to be a significant regression, so the rollup behavior was re-enabled for Operations Manager 2007 R2.
MICROSOFT AND/OR ITS SUPPLIERS MAKE NO REPRESENTATIONS OR WARRANTIES ABOUT THE SUITABILITY, RELIABILITY OR ACCURACY OF THE INFORMATION CONTAINED IN THE DOCUMENTS AND RELATED GRAPHICS PUBLISHED ON THIS WEBSITE (THE “MATERIALS”) FOR ANY PURPOSE. THE MATERIALS MAY INCLUDE TECHNICAL INACCURACIES OR TYPOGRAPHICAL ERRORS AND MAY BE REVISED AT ANY TIME WITHOUT NOTICE.
TO THE MAXIMUM EXTENT PERMITTED BY APPLICABLE LAW, MICROSOFT AND/OR ITS SUPPLIERS DISCLAIM AND EXCLUDE ALL REPRESENTATIONS, WARRANTIES, AND CONDITIONS WHETHER EXPRESS, IMPLIED OR STATUTORY, INCLUDING BUT NOT LIMITED TO REPRESENTATIONS, WARRANTIES, OR CONDITIONS OF TITLE, NON INFRINGEMENT, SATISFACTORY CONDITION OR QUALITY, MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE, WITH RESPECT TO THE MATERIALS.