A System Center Operations Manager 2007 management server repeatedly logs 21042 events for agents on clustered servers

Symptoms

A System Center Operations Manager 2007 (OpsMgr 2007) management server repeatedly logs events that are similar to the following:

Event Type: Information
Event Source: OpsMgr Connector
Event Category: None
Event ID: 21042
Date:  9/22/2011
Time:  2:44:13 PM
User:  N/A
Computer: SERVERNAME
Description:
Operations Manager has discarded 5 items in management group <MG Name>, which came from <Agent FQDN>.  These items have been discarded because no valid route exists at this time.  This can happen when new devices are added to the topology but the complete topology has not been distributed yet.  The discarded items will be regenerated.

The agents that are named in the 21042 event descriptions run on servers that use Microsoft Failover Clustering or Microsoft Cluster Service.

If you capture verbose trace logs on the management server, the native trace log (TracingGuidsNative.log) contains messages that are similar to the following:

[0] [1428] [1312] [09/22/2011-14:44:13.325] [MOMConnector] [] [Verbose] [] [CConnectorSolutionImpl::onIncomingRead] [ConnectorSolutionImpl_cpp2186]Processing message of datatype 6:0, destination <FQDN of cluster node>

[0] [1428] [1312] [09/22/2011-14:44:13.325] [MOMConnector] [] [Error] [] [CConnectorSolutionImpl::validateDataItemRouting] [ConnectorSolutionImpl_cpp5148]Received a DATATYPE_CONTRIBUTOR_STATE_REQUEST from <Health Service ID>

[0] [1428] [1312] [09/22/2011-14:44:13.325] [MOMConnector] [] [Verbose] [] [CConnectorSolutionImpl::validateDataItemRouting] [ConnectorSolutionImpl_cpp5151]<Jobs><Job ID="6C8BEDC9-5332-7143-D192-84C218878A5A" BatchID="2027CBF5-B248-4A0E-8F9A-32A0F2C56E41" TaskID="4BE723CD-BA53-F7FB-6A4A-4A5F062E77EF" TargetInstanceID="<Health Service ID>" JobCategory="0" ><Overrides><Override><Path>MonitorId</Path><Value>A6C69968-61AA-A6B9-DB6E-83A0DA6110EA</Value></Override></Overrides></Job></Jobs>

[0] [1428] [1312] [09/22/2011-14:44:13.325] [MOMConnector] [] [Error] [] [CConnectorSolutionImpl::onIncomingRead] [ConnectorSolutionImpl_cpp2210]Bad routing for a data item detected

[0] [1428] [1312] [09/22/2011-14:44:13.340] [Common] [] [Verbose] [] [Common::EventLogUtil::LogEvent] [EventLogUtil_cpp321]Logging informational event 21042 with args "5", "<MG Name>","<FQDN of a different node on the same cluster>", "NULL", "NULL", "NULL", "NULL", "NULL", "NULL"

In these trace log messages, the FQDN in the first message (the destination) and the FQDN in the last message (the agent named in the 21042 event) identify different physical nodes in the same failover cluster.
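
The comparison described above can be scripted when a trace log contains many of these sequences. The following Python sketch is illustrative only and is not a Microsoft tool. Its regular expressions are derived from the sample trace lines in this article, so other builds may need different patterns. The function name and the node names in the usage note are hypothetical.

```python
import re

# Patterns are assumptions based on the sample trace lines in this article.
DEST_RE = re.compile(r"Processing message of datatype \S+, destination (\S+)")
EVENT_RE = re.compile(
    r'Logging informational event 21042 with args "([^"]*)",\s*"([^"]*)",\s*"([^"]*)"'
)


def pair_destinations_with_events(lines):
    """Pair each 'Processing message' destination with the agent named
    in the next 21042 event that is logged after it.

    Returns a list of (destination_fqdn, event_agent_fqdn) tuples.
    """
    pairs = []
    last_destination = None
    for line in lines:
        match = DEST_RE.search(line)
        if match:
            # Remember the destination until the matching 21042 line appears.
            last_destination = match.group(1)
            continue
        match = EVENT_RE.search(line)
        if match and last_destination is not None:
            # The third argument of the event is the agent FQDN.
            pairs.append((last_destination, match.group(3)))
            last_destination = None
    return pairs


def mismatched_pairs(pairs):
    """Keep only the pairs in which the two FQDNs differ (case-insensitive)."""
    return [(d, a) for d, a in pairs if d.lower() != a.lower()]
```

For example, passing the lines of TracingGuidsNative.log to pair_destinations_with_events and then to mismatched_pairs would return a pair such as ("node1.contoso.com", "node2.contoso.com") for the sequence shown above, where both names are placeholder cluster nodes. Whether the two names belong to the same cluster still has to be confirmed separately.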

Cause

This is a known issue in OpsMgr 2007 that occurs when an agent that is part of a cluster runs a dependency monitor. In some cases, the unit monitors that contribute to the state of the dependency monitor run on a different node in the cluster. The agent on that node then sends state data to the other agent through the management server. When this occurs, the management server discards the state data and logs a 21042 event.

Resolution

If the agent names in the 21042 events are cluster nodes, you can safely ignore these messages.
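
To decide which 21042 events fall under this guidance, the agent name in each event description can be checked against a list of known cluster nodes. The following Python sketch assumes that the description text matches the sample event in the Symptoms section and that you supply the cluster node names yourself. The function name and all server names are hypothetical.

```python
import re

# Pattern is an assumption based on the sample event description in this article.
DESC_RE = re.compile(
    r"Operations Manager has discarded (\d+) items? in management group (.+?), "
    r"which came from (\S+)\.\s"
)


def classify_21042(descriptions, cluster_nodes):
    """Split the agents named in 21042 event descriptions into two lists.

    ignorable:    agents that are known cluster nodes
    needs_review: agents that are not in the supplied cluster node list

    Descriptions that do not match the expected text are skipped.
    """
    known = {name.lower() for name in cluster_nodes}
    ignorable, needs_review = [], []
    for text in descriptions:
        match = DESC_RE.search(text)
        if not match:
            continue
        agent = match.group(3)
        if agent.lower() in known:
            ignorable.append(agent)
        else:
            needs_review.append(agent)
    return ignorable, needs_review
```

Agents that end up in the needs_review list are not covered by this article, and their 21042 events may have a different cause.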

More Information

http://bugcheck/bugs/ServerManagement/171794
http://bugcheck/bugs/ServerManagement/158582

From 158582:
"Anytime an MP defines a dependency monitor hosted on an agent, and that agent is part of a cluster, dependency monitor registration is non-deterministic. This means that one agent may end up hosting the dependency monitor, and another agent may host the contributors. This leads to the contributors endlessly trying to send state information to the other agent (via the RMS). The agent owning the dependency monitor will always throw away that state change data (by design), as signaled by event 21042."

Properties

Article ID: 2622195 - Last Review: Oct 3, 2011 - Revision: 1
