Troubleshooting UNIX/Linux agent discovery in Operations Manager 2012

To monitor UNIX or Linux computers in System Center 2012 Operations Manager (OpsMgr 2012), the computers must first be discovered, and the OpsMgr 2012 agent must be installed. The Computer and Device Management Wizard is used to discover and install agents on UNIX and Linux computers. However, discovery may not always find all eligible clients.

If you experience client discovery issues, this guide is for you.

What does this guide do?

Troubleshoots problems in System Center 2012 Operations Manager where UNIX or Linux computers can’t be discovered.

Who is it for?

Admins of System Center 2012 Operations Manager who help resolve UNIX/Linux agent discovery issues.

How does it work?

We’ll begin by asking the type of issue you are facing. Then we’ll take you through a series of steps that are specific to your situation to resolve your issue.

Estimated time of completion:

30-45 minutes.

Welcome to the guide

Select the type of issue you are experiencing below.

The target address is unreachable

In this situation you will typically receive an error similar to the following:

The WinRM client cannot complete the operation within the time specified. Check if the machine name is valid and is reachable over the network and firewall exception for Windows Remote Management service is enabled. 

Most likely causes include the following:

  • The host is unreachable due to incorrect name resolution, network outage or host outage.
  • A network or host-based firewall is blocking TCP port 1270 connectivity to the target host. 
To resolve this issue, verify that the Management Server can ping the agent host using its Fully-Qualified Domain Name (FQDN). Also verify that neither a network firewall nor a host-based firewall is blocking TCP port 1270.

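The port check described above can also be scripted. The following is a minimal Python sketch and is not part of Operations Manager. It assumes Python 3 is available on the computer you run it from, and the function name tcp_port_open is illustrative only. It confirms that a TCP connection can be opened; it does not test WinRM or the agent itself.

```python
import socket


def tcp_port_open(host: str, port: int = 1270, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (host name is a placeholder for your agent host's FQDN):
#   tcp_port_open("lx1.contoso.com", 1270)
```

A result of False does not say which cause applies. Name resolution failures, host outages, and firewalls all produce the same result, so check each of the likely causes listed above.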
Certificate Errors or Certificate Signing Errors

Select the type of certificate issue you are experiencing below.

Signed certificate verification operation was not successful

When certificate verification fails you will typically get an error similar to the following:

Agent verification failed. Error detail: The server certificate on the destination computer (lx1.contoso.com:1270) has the following errors: The SSL certificate could not be checked for revocation. The server used to check for revocation might be unreachable. The SSL certificate contains a common name (CN) that does not match the hostname.

One common cause of this error is that the agent certificate’s CN value does not match the provided or resolved Fully-Qualified Domain Name. To verify this, confirm that the agent host’s hostname and domain name match the Fully-Qualified Domain Name resolved through DNS.

You can view the basic details of the certificate on the UNIX or Linux computer by entering the following command: 

openssl x509 -noout -in /etc/opt/microsoft/scx/ssl/scx.pem -subject -issuer -dates

When you do this, you will see output that is similar to the following:

subject= /DC=name/DC=newdomain/CN=newhostname/CN=newhostname.newdomain.name
issuer= /DC=name/DC=newdomain/CN=newhostname/CN=newhostname.newdomain.name
notBefore=Mar 25 05:21:18 2008 GMT
notAfter=Mar 20 05:21:18 2029 GMT

Using this information, validate the hostnames and dates and ensure that they match the name being resolved by the Operations Manager management server. If the hostnames do not match, use one of the following actions to resolve the issue:

  • If the UNIX or Linux hostname is correct but the Operations Manager management server is resolving it incorrectly, either modify the DNS entry to match the correct FQDN or add an entry to the hosts file on the Operations Manager server.
  • If the UNIX or Linux hostname is incorrect, do one of the following:
        • Change the hostname on the UNIX or Linux host to the correct one and create a new certificate.
        • Create a new certificate with the desired hostname.
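Comparing the certificate subject with the resolved name can be sketched in Python as follows. This is an illustration under stated assumptions, not an Operations Manager tool. It assumes you have captured the text printed by the openssl command above, and the function names are hypothetical. It only compares CN values and does not check the certificate dates.

```python
import re


def subject_common_names(openssl_output: str) -> list[str]:
    """Extract every CN value from the 'subject=' line of the openssl output."""
    for line in openssl_output.splitlines():
        if line.startswith("subject="):
            # Handles both "/CN=value" and "CN = value, ..." subject layouts.
            return re.findall(r"\bCN\s*=\s*([^/,\s]+)", line)
    return []


def certificate_matches(openssl_output: str, resolved_fqdn: str) -> bool:
    """Return True if any subject CN equals the resolved FQDN, ignoring case."""
    wanted = resolved_fqdn.rstrip(".").lower()
    return any(cn.lower() == wanted for cn in subject_common_names(openssl_output))
```

Pass the FQDN that the management server resolves for the host as resolved_fqdn. If the function returns False, use one of the actions above to correct the name or the certificate.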
If the certificate was created with an incorrect name, you can change the hostname and re-create the certificate and private key. To do this, run the following command on the UNIX or Linux computer:

/opt/microsoft/scx/bin/tools/scxsslconfig -f -v

You can also change the hostname and domain name on the certificate by using the -h and -d switches, as in the following example:

/opt/microsoft/scx/bin/tools/scxsslconfig -f -h <hostname> -d <domain.name>

Once complete, restart the agent by running the following command:

/opt/microsoft/scx/bin/tools/scxadmin -restart

Alternatively, if the FQDN is not registered in reverse DNS, you can add an entry to the hosts file on the management server to provide name resolution. The hosts file is located in the \Windows\System32\Drivers\etc folder. Each entry consists of the IP address followed by the FQDN. For example, to add an entry for the host named “newhostname.newdomain.name” with an IP address of 192.168.1.1, add the following to the end of the hosts file:

192.168.1.1 newhostname.newdomain.name

Signed certificate verification operation was not successful

Another common cause of this error is that the certificate has been signed by an untrusted authority. This can happen when multiple Management Servers are members of the Resource Pool used for discovery but certificate trust has not been configured between them. To verify this, confirm that all Management Servers in the Resource Pool used for discovery trust each other’s certificates.

More information on how to manage resource pools for UNIX and Linux computers can be found in the following TechNet document:

Managing Resource Pools for UNIX and Linux Computers

Certificate signing operation was not successful

When the certificate signing operation is not successful, it is usually caused by one of two problems:

  • The user account specified for discovery has insufficient privileges to perform the file operations involved in signing.
  • Sudo elevation privileges for the user account specified for discovery were not correctly configured.

Resolution:

Verify the user account by inspecting the StdErr output in the error details to identify the cause of the failure. 

Also verify the sudo privilege configuration for the account used for certificate signing.

Network Name Resolution Errors

Select the type of network name resolution issue you are experiencing below.

The target address is not resolvable

These issues typically fall into one of two categories:

  1. Error Description: Failed to resolve IP address 192.168.25.25 to name
    This can occur when an IP address for the host was entered for discovery but the IP address is not resolvable to a name in DNS (reverse lookup).
    To resolve this issue, correct the name resolution (DNS) configuration for the reverse lookup zone, ensuring that an IP address to name mapping exists for the affected host.
  2. Error Description: Failed to resolve name server.contoso.com to IP address
    This can occur if an FQDN for the host was entered for discovery but the name is not resolvable to an IP address in DNS (forward lookup).
    To resolve this issue, correct the name resolution (DNS) configuration for forward lookup, ensuring that a host name to IP address mapping exists for the host.

DNS configuration: Forward DNS resolution does not match reverse DNS resolution

In this situation you will typically receive an error similar to the following:

The provided hostname ServerName resolved to the IP address of 10.137.216.102. The hostname ServerName.contoso.com returned by reverse lookup of the IP address 192.168.x.x did not match the provided hostname. Verify the DNS configuration and try the request again. 

The most common cause for this type of error is that the records for the host in the forward and reverse DNS lookup zones do not match.

To resolve this issue, correct the records in the forward and reverse lookup zones in DNS so that the host names and IP addresses match.
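The three DNS conditions covered in this section (forward lookup failure, reverse lookup failure, and a forward/reverse mismatch) can be checked with a short Python sketch. This is an illustration only, assuming Python 3 on the management server. The function name and the returned strings are hypothetical, and the forward and reverse parameters exist only so that the logic can be exercised without a live DNS server.

```python
import socket


def check_dns_consistency(name, forward=socket.gethostbyname,
                          reverse=socket.gethostbyaddr):
    """Describe whether forward and reverse DNS agree for the given host name."""
    try:
        ip = forward(name)                 # name -> IP address (forward lookup)
    except OSError:
        return f"forward lookup failed for {name}"
    try:
        reverse_name = reverse(ip)[0]      # IP address -> primary name (reverse lookup)
    except OSError:
        return f"reverse lookup failed for {ip}"
    if reverse_name.rstrip(".").lower() != name.rstrip(".").lower():
        return f"mismatch: {name} -> {ip} -> {reverse_name}"
    return "consistent"
```

Run it with the same FQDN that was entered in the discovery wizard. The result reflects the resolver configuration of the computer where the script runs, so run it on the management server that performs the discovery.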

SSH Connectivity Errors

Select the error you are receiving below

Failed during SSH discovery. Exit code: -1073479162

Error Description:

Failed during SSH discovery. Exit code: -1073479162
Standard Output:
Standard Error:
Exception Message: An exception (-1073479162) caused the SSH command to fail - No connection could be made because the target machine actively refused it.

Possible Causes:

  • The ssh daemon is not running on the target system.
  • A network or host-based firewall is preventing ssh connections on TCP port 22.
Resolutions:
  • Verify that the ssh daemon is running.
  • Verify that neither a network firewall nor a host-based firewall is blocking TCP port 22.

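One way to confirm both points at once is to connect to TCP port 22 and read the identification line that an SSH server sends when a connection opens. The following Python sketch is an illustration only; the function name read_ssh_banner is hypothetical, and it assumes the banner arrives in the first packet, which is a simplification.

```python
import socket


def read_ssh_banner(host: str, port: int = 22, timeout: float = 5.0):
    """Return the server's 'SSH-...' identification line, or None if unavailable."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as conn:
            conn.settimeout(timeout)
            data = conn.recv(255)
    except OSError:
        # Refused, timed out, or unresolvable: no ssh daemon reachable on this port.
        return None
    for line in data.decode("ascii", errors="replace").splitlines():
        if line.startswith("SSH-"):
            return line.strip()
    return None
```

A returned banner means the daemon is running and reachable from the computer where the script runs. A result of None means the connection failed or something other than an SSH server answered.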
Failed during SSH discovery. Exit code: -1073479118

Error Description:

Failed during SSH discovery. Exit code: -1073479118
Standard Output:
Standard Error:
Exception Message: An exception (-1073479118) caused the SSH command to fail - Server sent disconnect message: type 2 (protocol error : Too many authentication failures for root)

Possible Causes:

  • The user account specified for discovery is not permitted to log in via ssh.
  • The user account specified for discovery was entered with an invalid user name or password.
Resolutions:
  • Verify that the user is permitted to log in via ssh.
  • Verify the credentials that were entered and that the user is defined on the target host.

Failed during SSH discovery. Exit code: 1

Error Description:

Failed during SSH discovery. Exit code: 1
Standard Output: Sudo path: /usr/bin/
Standard Error: sudo: sorry, you must have a tty to run sudo
Exception Message:

Cause:

Sudo elevation was selected in the user credential input, but the requiretty option was not disabled for the user in sudoers.

Resolution:
Edit the sudoers file on the target host (using the visudo command) and add the following line, replacing <username> with the name of the user account specified for discovery:

Defaults:<username> !requiretty

For more information, see How to Configure sudo Elevation and SSH Keys in the Microsoft TechNet documentation library.

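To confirm that the sudoers change is in place, the following Python sketch looks for a Defaults line that disables requiretty for a given user. It is a deliberately simplified check, not a sudoers parser: it ignores included files, aliases, and host or runas scopes, and it does not consider whether a later line re-enables the option. The function name is hypothetical.

```python
def requiretty_disabled_for(sudoers_text: str, username: str) -> bool:
    """Return True if a Defaults line disables requiretty globally or for username."""
    for raw in sudoers_text.splitlines():
        parts = raw.strip().split(None, 1)
        if len(parts) != 2 or not parts[0].startswith("Defaults"):
            continue  # comments, blank lines, and non-Defaults entries
        scope = parts[0][len("Defaults"):]  # "" (global) or ":user1,user2"
        options = [opt.strip() for opt in parts[1].split(",")]
        if "!requiretty" not in options:
            continue
        if scope == "" or (scope.startswith(":") and username in scope[1:].split(",")):
            return True
    return False
```

Reading the sudoers file normally requires root privileges, so pass the file contents to the function from a session that has them. Always make changes through visudo, which validates the syntax before saving.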
.[?1034hopsuser@lx1:~> su - root -c 'sh /tmp/scx-opsuser/GetOSVersion.sh

Error Description:

.[?1034hopsuser@lx1:~> su - root -c 'sh /tmp/scx-opsuser/GetOSVersion.sh; EC=$?; rm -rf /tmp/scx-opsuser; exit $EC'
Password:
exit
su: incorrect password
opsuser@lx1:~> exit
logout

Possible Cause:

Su elevation was selected in the user credential input, but an invalid root password was provided for su elevation.

Resolution:

Verify the root password that was entered in the elevation configuration dialog.

Failed during SSH discovery. Exit code: -2147221248

One common -2147221248 exception error you might see is below.

Failed during SSH discovery. Exit code: -2147221248
Standard Output:
Standard Error: Could not chdir to home directory /home/username: No such file or directory

Cause:

The user account specified for discovery does not have a home directory.

Resolution:

Verify that the user has a home directory at /home/<username> and that the user is able to write to this directory.
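The home directory check can be sketched in Python as follows. This is an illustration only; the function name is hypothetical, and the script must be run on the target host as the user account specified for discovery, because it tests the permissions of the current user.

```python
import os


def home_directory_problems(path: str) -> list[str]:
    """Return a list of problems found with the given home directory path."""
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path} does not exist or is not a directory")
    elif not os.access(path, os.W_OK | os.X_OK):
        # Write plus search permission is needed to create files in the directory.
        problems.append(f"{path} is not writable by the current user")
    return problems
```

An empty list means the directory exists and the current user can write to it.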

Failed during SSH discovery. Exit code: -2147221248

Another common -2147221248 exception error you might see is below:

Failed during SSH discovery. Exit code: -2147221248
Standard Output:
Standard Error: root's password:
Exception Message: Operation timed out

Cause:

Sudo elevation was selected in the user credential input, but the user account specified for discovery is not correctly configured to use passwordless sudo elevation, or the required sudo elevation privileges were not granted to the user account used in discovery.

Resolution:

Review sudo elevation configuration documentation and verify user configuration for sudo. Note that passwordless sudo must be configured.
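The SSH discovery errors in this section can be summarized as a lookup from error text to likely cause. The following Python sketch is only a convenience for triage; the function name and the wording of the returned strings are illustrative, and the markers are taken from the error samples shown in this guide.

```python
def diagnose_ssh_failure(error_text: str) -> str:
    """Map SSH discovery error text to the likely cause described in this guide."""
    text = error_text.lower()
    checks = [
        # Specific Standard Error text is checked before the generic exit codes.
        ("must have a tty", "requiretty is still enabled for the account in sudoers"),
        ("su: incorrect password", "the root password supplied for su elevation is wrong"),
        ("could not chdir to home directory", "the account has no home directory"),
        ("operation timed out", "passwordless sudo is not configured for the account"),
        ("-1073479162", "sshd is not running or TCP port 22 is blocked"),
        ("-1073479118", "the account cannot log in via ssh or the credentials are wrong"),
    ]
    for marker, cause in checks:
        if marker in text:
            return cause
    return "unknown; see the full error details"
```

Paste the complete error details from the discovery wizard as error_text. Exit code -2147221248 is not used as a marker because it appears with more than one cause.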

WSMan Connectivity Errors

Select the error you are receiving below

The agent responded to the request but the WSMan connection failed due to : Access is Denied

Possible causes of this error include:

  1. The agent is installed, and the agent certificate has been signed, however the user credential provided for agent verification is invalid.
  2. The user account specified for discovery was configured to authenticate with an SSH key, but the user credential provided for agent verification is invalid.
  3. There is a permission problem or incorrect PAM configuration on the UNIX side.
Resolution:

First verify that the user name and password for agent verification were entered correctly and that the user is a valid user on the target host. If that does not resolve the issue, verify that sudo elevation has been configured correctly. Also check the messages log on the UNIX/Linux computer. For example, on AIX you can find the log under /var/adm/messages. On other operating systems the location may vary. Look for lines such as the following:

Sep 3 14:49:07 server auth|security:debug /opt/microsoft/scx/bin/omiserver PAM: pam_authenticate: error Authentication failed. 

If you see similar lines in the messages log, the PAM configuration file is missing information about OMIServer. The PAM configuration file can be found under /etc/pam.d.

The easiest way to add the information about OMIServer back to the PAM configuration file is to reinstall the SCX agent from scratch on that computer. If that is not easily possible, you can copy the lines pertaining to OMI from a working computer to the non-working computer.
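Searching the messages log for these PAM failures can be sketched in Python as follows. This is an illustration only; the function name is hypothetical, and the match is based solely on the sample log line shown above, so the exact wording may differ on other platforms.

```python
def pam_failures(log_lines):
    """Yield log lines in which omiserver reports a PAM authentication failure."""
    for line in log_lines:
        if ("omiserver" in line
                and "pam_authenticate" in line
                and "Authentication failed" in line):
            yield line.rstrip("\n")


# Example use on AIX, where this guide locates the log under /var/adm/messages:
#   with open("/var/adm/messages", errors="replace") as log:
#       for hit in pam_failures(log):
#           print(hit)
```

Reading the messages log usually requires elevated privileges on the UNIX/Linux computer.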

WSMan Only Discovery failed for 192.168.x.x

Possible causes of this issue include:

  • The Discovery Type option was set to “Only computers with an installed agent and signed certificate” and the target host has the agent installed, however the target host certificate has not been signed. In order to use the WSMan-only discovery option “Only computers with an installed agent and signed certificate”, the agent must be installed and the certificate manually signed.
  • The Discovery Type option was set to “Only computers with an installed agent and signed certificate” but the target host does not have the UNIX/Linux agent currently installed.
  • The Discovery Type option was set to “Only computers with an installed agent and signed certificate” but the UNIX/Linux agent is not currently running.
  • The Discovery Type option was set to “Only computers with an installed agent and signed certificate” but the target host is unreachable, a network or host-based firewall is preventing connectivity or the UNIX/Linux agent is currently down.
Resolutions:
  • Manually sign the certificate.
  • Verify that the UNIX/Linux agent has been installed.
  • Change the option to “Discover all computers” to allow the Discovery Wizard to perform the certificate signing.
  • Verify that the UNIX/Linux agent is running and that the target host is reachable.
  • Verify that neither a network firewall nor a host-based firewall is preventing access on TCP port 1270.

Other Errors

Please select your error

The task cannot be executed against the object(s) because the target of the task does not match any of the classes of the object

Error Description:

The task cannot be executed against the object(s) because the target of the task does not match any of the classes of the object. 

Cause:

In a System Center 2012 Operations Manager management group, this can occur if the UNIX/Linux management packs imported are Operations Manager 2007 R2 versions.

Resolution:

Import the System Center 2012 versions of the UNIX/Linux operating system management packs.

The agent is installed and the computer is already being monitored by Operations Manager

Error Description:

The agent is installed and the computer is already being monitored by Operations Manager. 

Cause:

The target host has already been discovered in this Management Group.

Resolution:

No action is required. Agent upgrade or migration to an alternate resource pool can be performed from the UNIX/Linux Servers view in the Administration pane of the Operations Console.

Unable to enumerate Installable agent types

Error Description:

Unable to enumerate Installable agent types. The associated resource pool may still be initializing. If you had selected a newly created resource pool, please wait a few minutes before using it. 

Causes:

  • The Resource Pool used in discovery is not healthy (e.g. a majority of member servers are offline).
  • The Resource Pool used in discovery was recently created and it has not yet fully initialized.
Resolutions:
  • If the Resource Pool used in discovery was recently created, retry the discovery after several minutes to allow the pool to initialize. 
  • Otherwise, check the Operations Manager Event Log on the servers that are members of the Resource Pool used for discovery for indications of the source of the problem.
Failed to find a matching supported agent instance in the imported management packs

Error Description:

Failed to find a matching supported agent instance in the imported management packs. Import the Management Pack(s) for this platform in order to discover this computer.

Possible Causes:

  • The target host is running an unsupported operating system.
  • The correct management pack for the target host’s operating system has not been imported.
  • The correct management pack for the operating system has recently been imported but has not yet fully loaded.
Resolutions:
  • Confirm that the target host is running a supported operating system.
  • Import the management pack for the target host’s operating system and version.
  • If the management pack was just imported it may still be loading. Wait a few minutes and rerun discovery.
Article ID: 10080 - Last Review: March 8, 2016 - Revision: 65
