OpsMgr: Monitoring Alerts may not auto-resolve

Summary

System Center Operations Manager is designed to handle state changes from agents that arrive out of order by discarding state changes that are older than the last one received. This prevents the scenario of a monitor being marked in an incorrect state as a result of state changes arriving out of order.


This behavior also carries over to alerts generated by a monitor that has the setting 'Automatically resolve the alert when the monitor returns to a healthy state' enabled. It is implemented by comparing the value in the database view AlertView.TimeResolutionStateLastModified with the time received from the agent that initiated the monitor state change. If the value in AlertView.TimeResolutionStateLastModified is newer than the time received in the state change, the alert will not be automatically closed.
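The comparison rule described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in this article, not the actual Operations Manager implementation; the function name should_auto_resolve and its parameter names are hypothetical.

```python
from datetime import datetime


def should_auto_resolve(time_resolution_state_last_modified: datetime,
                        state_change_time: datetime) -> bool:
    """Return True if the alert may be closed automatically.

    Hypothetical helper that mirrors the rule in the article: a stored
    TimeResolutionStateLastModified value that is newer than the time
    carried by the incoming state change blocks auto-resolution.
    """
    return not (time_resolution_state_last_modified > state_change_time)


# Normal case: both timestamps come from the agent, so the healthy
# state change is newer than the stored value and the alert closes.
raised = datetime(2012, 7, 9, 10, 0, 0)
healthy = datetime(2012, 7, 9, 10, 5, 0)
normal_case = should_auto_resolve(raised, healthy)

# Out-of-order case: the stored value is newer, so the alert stays open.
out_of_order_case = should_auto_resolve(healthy, raised)
```

In this sketch normal_case is True and out_of_order_case is False, which matches the behavior described for state changes that arrive out of order.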

Generally this is not a problem because the only entity to modify the ResolutionState of an Alert will be the monitor itself, and the time values will always use the same source (the agent) for the comparisons.

Problems arise when the ResolutionState is modified via an SDK call on a different computer. The time written to AlertView.TimeResolutionStateLastModified will be the time from the computer that made the change via the SDK. Because of time drift between the source of the alert and the computer modifying it, the value in the database can be newer than a state change event that has just been received from the monitor. In that case the alert is left open, even though its associated monitor is in a healthy state.
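A worked example may make the effect of time drift easier to see. The scenario below is hypothetical: it assumes the computer making the SDK call has a clock running 3 minutes ahead of the agent, and that the monitor returns to healthy 1 minute after the ResolutionState is changed. The variable names are illustrative only.

```python
from datetime import datetime, timedelta

# Assumption: the SDK computer's clock runs 3 minutes ahead of the agent's.
drift = timedelta(minutes=3)

# Actual moment at which the ResolutionState is changed through the SDK.
true_modify_time = datetime(2012, 7, 9, 10, 0, 0)

# The stored value is stamped with the SDK computer's clock: 10:03:00.
stored_last_modified = true_modify_time + drift

# One minute later the monitor returns to healthy. The state change
# carries the agent's time, which is assumed accurate here: 10:01:00.
state_change_time = true_modify_time + timedelta(minutes=1)

# The stored value is newer than the state change, so the alert stays open.
alert_left_open = stored_last_modified > state_change_time
print(alert_left_open)  # True
```

Although the monitor became healthy after the alert was modified, the stored timestamp is 2 minutes ahead of the state change, so the alert is not closed.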

More Information

The SDK, the Connector SDK, and the Get-Alert Operations Manager PowerShell cmdlet retrieve instances of the Microsoft.EnterpriseManagement.Monitoring.MonitoringAlert class (described at http://msdn.microsoft.com/en-us/library/microsoft.enterprisemanagement.monitoring.monitoringalert.aspx).

The ResolutionState of these instances can be 0 (open), 255 (closed), or other values defined by an SDK application. When the ResolutionState of an alert is modified by a call to the SDK (PowerShell cmdlets also use the SCOM SDK), the time value placed into AlertView.TimeResolutionStateLastModified is that of the computer making the SDK call. Subsequent comparisons then involve two different time sources (the computer making the change via the SDK and the original computer that hosts the monitor that generated the alert), with no way to know whether the two sources are synchronized.

Other fields of an alert, for example 'Owner' and CustomFieldx (x=1-10), can be changed without exposing this issue. If possible, avoid changing the ResolutionState of an alert.

If an SDK application must modify the ResolutionState of an alert, then that application must also be responsible for closing the alert, because the mechanism that automatically closes the alert when the monitor changes to healthy will no longer be reliable.
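One way an SDK application could take on this responsibility is to periodically look for alerts that are still open although their monitor is healthy, and close them itself. The sketch below shows only the selection logic on an in-memory model. The AlertRecord type, its fields, and the alerts_to_close function are hypothetical and are not part of the Operations Manager SDK; retrieving the alerts and the monitor state, and actually closing the alerts, would have to be done through the SDK.

```python
from dataclasses import dataclass
from typing import List

CLOSED = 255  # ResolutionState value for a closed alert, per this article


@dataclass
class AlertRecord:
    """Hypothetical in-memory view of an alert; not an SDK type."""
    alert_id: str
    resolution_state: int      # 0 = open, 255 = closed, other = custom
    monitor_is_healthy: bool   # assumed to be looked up separately


def alerts_to_close(alerts: List[AlertRecord]) -> List[AlertRecord]:
    """Select alerts that are not closed although their monitor is healthy."""
    return [a for a in alerts
            if a.resolution_state != CLOSED and a.monitor_is_healthy]


# Example data: 42 stands for a custom state set by the SDK application.
sample = [
    AlertRecord("A1", 42, True),       # stale: monitor healthy, alert open
    AlertRecord("A2", 42, False),      # monitor still unhealthy, keep open
    AlertRecord("A3", CLOSED, True),   # already closed
]
stale_ids = [a.alert_id for a in alerts_to_close(sample)]
```

With the sample data above, only alert A1 is selected for closing.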

This is not an issue for alerts generated by monitors without the option 'Automatically resolve the alert when the monitor returns to a healthy state' enabled, as these alerts will never be automatically resolved by monitor state changes.

Properties

Article ID: 2709521 - Last Review: 9 Jul 2012 - Revision: 1
