Article ID: 2816845

Summary

This update fixes the following resiliency issues that may occur when you manage a Microsoft HPC Pack 2008 R2 cluster that contains Microsoft Azure compute nodes. Many of these issues are fixed in HPC Pack 2012 and are described in this rollup package to support customers who cannot upgrade currently.

We recommend that you install this update if you use HPC Pack 2008 R2 together with Microsoft Azure.

More information

Issues that are fixed in this update

Job-scheduling issues

  • This update improves tolerance for network latency and communication failures between HPC services and Microsoft Azure. It addresses the following symptoms:
    • The job state is incorrect or seems to be "stuck."
    • Jobs remain in the running state after their tasks are completed or failed.
    • Jobs fail with a "parent job cannot be validated" exception.
    • Jobs finish successfully but are marked as failed after a high-availability (HA) failover.
    • Jobs remain in the running state after database access errors.
    • Jobs remain in the draining state and prevent taking compute nodes offline.
    • Jobs in the running state cannot be canceled.
    • Jobs cannot be canceled when the compute node is running CPU-intensive jobs.
    • Tasks fail on Microsoft Azure compute nodes with an "Exception: Safe handle has been closed" error message when the task is created.
    • Clusrun jobs fail on compute nodes in Microsoft Azure.
  • The following issues that involve exceptions and memory leaks are addressed:
    • Crash in job scheduler during a large deployment to Microsoft Azure
    • Job scheduler memory leak after multiple jobs that are running in Microsoft Azure are canceled
    • Database time-out exceptions when there are large deployments to Microsoft Azure
    • Exception when a job state is viewed from the command line
    • Exception that the job identifier is invalid when a task is created
    • "Object reference not set to an instance of an object" error message for a failed job
    • Job validation failure with the following message: "Node AZURECN-xxxx specified in required/requested nodes could not be found. Check the required/requested nodes to ensure the names are correct and try again"

Cluster-management issues

  • This update improves tolerance for network latency and communication failures between HPC services and Microsoft Azure. It addresses the following symptoms:
    • Compute nodes in Microsoft Azure appear unreachable but are available in the portal.
    • Microsoft Azure compute nodes remain in an online state and cannot be deleted or stopped if the head node in a high-availability cluster fails.
    • When one of several deployments fails, the list of Microsoft Azure compute nodes becomes out of sync between the management service and the job scheduler service.
    • Compute nodes in Microsoft Azure repeatedly switch between the reachable and unreachable states. This occurs because a wrong deployment ID is reported when the creation of a deployment fails and the action is retried.
    • There is a long delay between trying to stop a Microsoft Azure compute node and the failure of the operation.
    • When an availability policy is enabled, a deployed Microsoft Azure compute node in an offline state does not come online after the start time has passed.
    • Microsoft Azure compute nodes cannot be added after HA failover during a large deployment.
    • A configuration package is not applied after a Microsoft Azure compute node is ready.
    • A failure or time-out occurs when proxy certificates are uploaded to Microsoft Azure.
  • The following issues that involve exceptions and memory leaks are addressed:
    • Invalid XML in Microsoft Azure configuration file when the startup script parameter contains special characters
    • Memory leak or response stoppage during certain PowerShell cmdlet operations
    • Memory leak in hpcmanagement.exe when there are many node templates
    • Crash in administration console during reconnection to an HA head node by using the virtual cluster name
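
The invalid-XML item in the list above is a general hazard whenever a configuration file is built by string concatenation. This article does not show how HPC Pack generates the file, so the following Python sketch only illustrates the class of problem. The Setting element, the attribute names, and the sample parameter value are assumptions made for the example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import quoteattr


def build_setting(name: str, value: str) -> str:
    """Return a <Setting> element (hypothetical name) with quoted attributes."""
    # quoteattr escapes &, < and > and chooses a quote character
    # that is safe for the value it wraps.
    return f"<Setting name={quoteattr(name)} value={quoteattr(value)} />"


def is_well_formed(fragment: str) -> bool:
    """Report whether the fragment parses as XML."""
    try:
        ET.fromstring(fragment)
        return True
    except ET.ParseError:
        return False


# Hypothetical startup script parameter that contains XML special characters.
raw_value = 'setup.cmd /opt:"a&b" <fast>'

# Naive concatenation copies the special characters into the markup.
naive = f'<Setting name="StartupScript" value="{raw_value}" />'

# Escaping the value keeps the fragment well formed.
safe = build_setting("StartupScript", raw_value)

print(is_well_formed(naive))
print(is_well_formed(safe))
```

Parsing the escaped fragment returns the original parameter value unchanged, so the startup script would receive the same string that the administrator typed.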

SOA runtime issues

  • When Microsoft Azure nodes start, they cannot synchronize the service-oriented architecture (SOA) service package or application packages from Microsoft Azure Storage. This issue is more likely to occur in deployments that have many compute node role instances or that have a large deployment package to upload. This update makes the synchronization more resistant to failures when Microsoft Azure Storage is accessed.
  • When the session's broker is running on a clustered broker node, SOA message-level preemption is enabled, and an SOA task is preempted, the task does not exit as expected. Instead, the scheduler kills the task when the task cancel grace period expires. This update resolves the problem by making the task exit gracefully.
  • When the session's broker is running on a clustered broker node, the "auto shrink" feature of SOA does not work because the broker cannot make a task exit. This update resolves the problem by making the task exit gracefully.
  • The BrokerResponseEnumerator.MoveNext method and the BrokerResponse.Result property return the error message "Heartbeat lost for broker node" when clients that use the SOA session API try to retrieve more than 632 responses.
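
The first item in the list above says that package synchronization is now more resistant to failures when Microsoft Azure Storage is accessed. This article does not describe how that is done. One common way to tolerate transient storage errors is to retry with exponential backoff, sketched below in Python. The function names, the retry limits, and the simulated download are all assumptions for illustration and are not HPC Pack code.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation until it succeeds or max_attempts is reached."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except OSError:
            # ConnectionError and TimeoutError are subclasses of OSError.
            if attempt == max_attempts:
                raise
            # Double the delay on each attempt, cap it, and add jitter so
            # that many nodes do not retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(delay * random.uniform(0.5, 1.0))
    raise ValueError("max_attempts must be at least 1")


# Hypothetical download that fails twice before it succeeds.
calls = {"count": 0}


def flaky_download() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated storage time-out")
    return "package synchronized"


# A no-op sleep keeps the demonstration fast.
result = retry_with_backoff(flaky_download, sleep=lambda seconds: None)
print(result, "after", calls["count"], "attempts")
```

The jitter matters most in the situation that the item describes: when many compute node role instances start together, identical retry intervals would send all of their requests to storage at the same time.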

Update information

This update is available from the following Microsoft websites:

Microsoft Download Center

The following file is available for download from the Microsoft Download Center:
Download the update package now.

Microsoft Support

A supported update is available from Microsoft Support. However, this update is intended to correct only the problems that are described in this article.

If the hotfix is available for download, there is a "Hotfix download available" section at the top of this Knowledge Base article. If this section does not appear, contact Microsoft Customer Service and Support to obtain the update.

Note If additional issues occur or if any troubleshooting is required, you might have to create a separate service request. The usual support costs will apply to additional support questions and issues that do not qualify for this specific hotfix. For a complete list of Microsoft Customer Service and Support telephone numbers or to create a separate service request, go to the following Microsoft website:
http://support.microsoft.com/contactus/?ws=support
Note The "Hotfix download available" form displays the languages for which the update is available. If you do not see your language, it is because an update is not available for that language.

Prerequisites

You must have Microsoft HPC Pack 2008 R2 Service Pack 4 installed to apply this update.

Restart information

You may have to restart the computer after you apply this update.

Replacement information

This update replaces the following update:

2802121 FIX: An update is available for HPC Pack 2008 R2 clusters that contain Microsoft Azure nodes

File information

The English version of this hotfix has the file attributes (or later file attributes) that are listed in the following table. The dates and times for these files are listed in Coordinated Universal Time (UTC). When you view the file information, it is converted to local time. To find the difference between UTC and local time, use the Time Zone tab in the Date and Time item in Control Panel.
For the x86-based version of Microsoft HPC Pack 2008 R2

File name | File version | File size | Date | Time | Platform
Adminui | 3.4.4226.0 | 7,440,672 | 29-Mar-2013 | 22:34 | Not applicable
Clustermodel | 3.4.4226.0 | 655,360 | 29-Mar-2013 | 21:43 | Not applicable
Configurationservice | 3.4.4226.0 | 233,472 | 29-Mar-2013 | 21:43 | Not applicable
Hpc.internal | 3.4.4226.0 | 19,968 | 29-Mar-2013 | 21:41 | Not applicable
Hpc.property | 3.4.4226.0 | 245,760 | 29-Mar-2013 | 22:29 | Not applicable
Hpc.session | 3.4.4226.0 | 393,216 | 29-Mar-2013 | 22:29 | Not applicable
Hpcpack.exe | 3.4.4226.0 | 122,880 | 29-Mar-2013 | 21:41 | x86
Hpcschedulercore.dll | 3.4.4226.0 | 1,024,000 | 29-Mar-2013 | 21:43 | x86
Manageapi | 3.4.4226.0 | 258,048 | 29-Mar-2013 | 21:43 | Not applicable
Microsoft.hpc.azure.datamovement.dll | 3.4.4226.0 | 36,864 | 29-Mar-2013 | 21:41 | x86
Microsoft.hpc.azuremanagementbroker.dll | 3.4.4226.0 | 131,072 | 29-Mar-2013 | 21:42 | x86
Microsoft.hpc.nodemanager.remotingexecutor.dll | 3.4.4226.0 | 110,592 | 29-Mar-2013 | 21:44 | x86
Microsoft.hpc.scheduler.communicator.azure.dll | 3.4.4226.0 | 118,784 | 29-Mar-2013 | 21:43 | x86
Microsoft.hpc.scheduler.communicator.dll | 3.4.4226.0 | 13,312 | 29-Mar-2013 | 21:43 | x86
Microsoft.hpc.scheduler.communicator.remoting.dll | 3.4.4226.0 | 61,440 | 29-Mar-2013 | 21:43 | x86
Microsoft.hpc.scheduler.communicator.remotingazure.dll | 3.4.4226.0 | 90,112 | 29-Mar-2013 | 21:43 | x86
Microsoft.hpc.svcbroker.dll | 3.4.4226.0 | 368,640 | 29-Mar-2013 | 21:41 | x86
Patcher | 3.4.4226.0 | 36,864 | 29-Mar-2013 | 21:46 | Not applicable

For the x64-based version of Microsoft HPC Pack 2008 R2

File name | File version | File size | Date | Time | Platform
Adminui | 3.4.4226.0 | 7,440,672 | 29-Mar-2013 | 22:21 | Not applicable
Clustermodel | 3.4.4226.0 | 655,360 | 29-Mar-2013 | 21:41 | Not applicable
Configurationservice | 3.4.4226.0 | 233,472 | 29-Mar-2013 | 21:41 | Not applicable
Hpc.internal | 3.4.4226.0 | 19,968 | 29-Mar-2013 | 21:40 | Not applicable
Hpc.property | 3.4.4226.0 | 245,760 | 29-Mar-2013 | 22:15 | Not applicable
Hpc.session | 3.4.4226.0 | 393,216 | 29-Mar-2013 | 22:15 | Not applicable
Hpcazureruntime.cspkg | Not applicable | 37,006,329 | 29-Mar-2013 | 22:50 | Not applicable
Hpcpack.exe | 3.4.4226.0 | 122,880 | 29-Mar-2013 | 21:40 | x86
Hpcschedulercore.dll | 3.4.4226.0 | 1,024,000 | 29-Mar-2013 | 21:42 | x86
Manageapi | 3.4.4226.0 | 258,048 | 29-Mar-2013 | 21:41 | Not applicable
Microsoft.hpc.azure.datamovement.dll | 3.4.4226.0 | 36,864 | 29-Mar-2013 | 21:40 | x86
Microsoft.hpc.azuremanagementbroker.dll | 3.4.4226.0 | 131,072 | 29-Mar-2013 | 21:40 | x86
Microsoft.hpc.nodemanager.remotingexecutor.dll | 3.4.4226.0 | 106,496 | 29-Mar-2013 | 21:42 | x86
Microsoft.hpc.scheduler.communicator.azure.dll | 3.4.4226.0 | 118,784 | 29-Mar-2013 | 21:41 | x86
Microsoft.hpc.scheduler.communicator.dll | 3.4.4226.0 | 13,312 | 29-Mar-2013 | 21:41 | x86
Microsoft.hpc.scheduler.communicator.remoting.dll | 3.4.4226.0 | 61,440 | 29-Mar-2013 | 21:41 | x86
Microsoft.hpc.scheduler.communicator.remotingazure.dll | 3.4.4226.0 | 90,112 | 29-Mar-2013 | 21:41 | x86
Microsoft.hpc.svcbroker.dll | 3.4.4226.0 | 368,640 | 29-Mar-2013 | 21:40 | x86
Patcher | 3.4.4226.0 | 36,864 | 29-Mar-2013 | 21:44 | Not applicable
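
The note that times are listed in UTC, and the "or later file attributes" wording, can both be checked with a short script. The following Python sketch is an illustration only: the helper names are hypothetical, and the fixed UTC-7 offset is an assumed example time zone. It converts a date and time from the file lists to another time zone and compares version strings numerically.

```python
from datetime import datetime, timedelta, timezone, tzinfo
from typing import Optional, Tuple


def table_time_to_local(
    date_text: str, time_text: str, tz: Optional[tzinfo] = None
) -> datetime:
    """Convert a UTC date and time from the file list to another time zone.

    When tz is None, the system's local time zone is used.
    """
    # %b expects English month abbreviations under the default C locale.
    parsed = datetime.strptime(f"{date_text} {time_text}", "%d-%b-%Y %H:%M")
    return parsed.replace(tzinfo=timezone.utc).astimezone(tz)


def parse_version(text: str) -> Tuple[int, ...]:
    """Split a dotted version string into integers for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def is_same_or_later(found: str, listed: str) -> bool:
    """Return True if the found version is the listed version or a later one."""
    return parse_version(found) >= parse_version(listed)


# Assumed example zone: a fixed offset of UTC-7.
example_zone = timezone(timedelta(hours=-7))

local = table_time_to_local("29-Mar-2013", "22:34", example_zone)
print(local.strftime("%d-%b-%Y %H:%M"))

print(is_same_or_later("3.4.4226.0", "3.4.4226.0"))
print(is_same_or_later("3.4.10000.0", "3.4.4226.0"))
```

Comparing tuples of integers instead of raw strings avoids a common mistake: as text, "3.4.10000.0" sorts before "3.4.4226.0", although it is the later version.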


References

For more information about software update terminology, see Description of the standard terminology that is used to describe Microsoft software updates.

Properties

Article ID: 2816845 - Last Review: June 20, 2014 - Revision: 3.0
Applies to
  • Microsoft HPC Pack 2008 R2
Keywords: kbautohotfix kbqfe kbhotfixserver kbfix kbbug kbexpertiseinter kbsurveynew atdownload KB2816845
