Event 31551 when Operations Manager tries to connect to the Data Warehouse database


A System Center Operations Manager management server cannot connect to or communicate with the SQL cluster that hosts the Data Warehouse database. In this situation, event ID 31551 is logged in the Operations Manager log, together with a description that resembles the following for various workflow names:

Log Name: Operations Manager
Source: Health Service Modules
Event ID: 31551
Task Category: Data Warehouse
Level: Error
Keywords: Classic
User: N/A
Computer: server.Contoso.com
Failed to store data in the Data Warehouse. The operation will be retried.
Exception 'SqlException': A network-related or instance-specific error occurred while establishing a connection to SQL Server. The server was not found or was not accessible. Verify that the instance name is correct and that SQL Server is configured to allow remote connections. (provider: SQL Network Interfaces, error: 26 - Error Locating Server/Instance Specified)

One or more workflows were affected by this.

Workflow name: Microsoft.SystemCenter.DataWarehouse.CollectEventData
Instance name: server.Contoso.com
Instance ID: {8A13A832-776E-096E-32E7-DC479FCD6DBC}
Management group: SupportGroup


The focus here should be on the following string:

error: 26 - Error Locating Server/Instance Specified

This error is often thought to occur because a remote connection is not enabled on the server. However, this error is actually generated when the client cannot receive an SSRP response UDP packet from SQL Browser. This behavior typically occurs because UDP port communication is blocked between the management server and the SQL cluster that hosts the Operations Manager DW.

Be aware that this error occurs only when you try to connect to a SQL Server named instance. It should not occur when you connect to the default instance. This is, even if the connection attempt fails at this stage (for example, because of an error locating the specified server or instance), it will continue trying to connect by using the default values (for example, by using the default TCP port of 1433, the default pipe name for Named Pipes, and so on). Other error messages may be generated because of a later failure, but not this error message. 


To resolve this issue, you must resolve whatever problem is causing UDP port communication to fail between the management server and the SQL cluster. In most cases, it's fairly easy to isolate the problem by following these steps:
  1. Make sure that the server name is correct (for example, make sure there's no error in the name). 
  2. Make sure that the instance name is correct and that the instance actually exists on your target computer. Be aware that some applications convert \\ to \. If you are not sure about your application, try both "server\instance" and "server\\instance" in the connection string.
  3. Make sure that the server is reachable. Make sure that DNS can be resolved correctly and that you can ping the server.
  4. Make sure that the SQL Browser service is running on the server.
  5. If the firewall is enabled on the server, make sure there's an exception for sqlbrowser.exe and/or UDP port 1434.

You can download the PortQry utility from the following Knowledge Base article to test steps 4 and 5:

832919 New features and functionality in PortQry version 2.0

After you have PortQry, run the following command:

portqry.exe -n serverName -p UDP -e 1434

If this command returns information and contains the target instance, you can rule out the scenarios in steps 4 and 5. This means that SQL Browser is running and that the firewall is not blocking SQL Browser UDP packets. 

After you've finished with these steps, the error should no longer occur. The management server may still fail to connect to the SQL server, but if so, a different error message should be triggered t at this point. If the management server still fails to connect, replace "server\instance" with "tcp:server\instance" or with "np:server\instance," and then see whether that succeeds with either TCP or the NP protocol. 

More Information

This issue is caused by a combination of the following:
  • Windows Cluster specifics
  • How the SQL Server named instance is discovered
When you connect to SQL Server named instances, the client components rely on SQL Browser to discover the server and its parameters. The discovery process runs as follows: 

  • The client sends a UDP packet to SQL Browser on the target computer. When the named instance is on a Windows cluster, the packet is sent to the cluster IP, or more specifically, to the IP address that corresponds to the virtual machine that's running SQL Server. However, SQL Browser is not cluster-aware, and it listens on IP ANY.
  • When SQL Browser receives the UDP request packet, it sends a response UDP packet back the client. Although the destination IP address is the client's IP address, the source IP address is changed. It is now the IP address of the network adapter on the physical computer instead of the virtual SQL Server IP address. 
  • The source IP address of the response UDP packet is determined by the Windows OS, based on the routing table. Because both the virtual SQL Server IP address and the IP address that's attached to the physical network adapter are typically on the same subnet and therefore belong to same route, the physical IP address is selected. Depending on the security settings on the client and server computers, this response UDP packet may be dropped by either a third-party firewall or by IPsec because the peer IP address is changed. Note that Windows Firewall will not drop the packet.
  • Be aware that if the client is a Windows Vista-based computer, IPsec may drop the packet if IPsec policy is enabled on the client and if it cannot establish a trust connection between the client and server. To work around this problem, manually specify the TCP port or pipe name in the connection string.

Номер статьи: 3084547 — последний просмотр: 13 авг. 2015 г. — редакция: 1

Отзывы и предложения