Troubleshooting Exchange protection and recovery issues in DPM

What does this guide do?

Troubleshoots Exchange protection and recovery issues in System Center 2012 Data Protection Manager (DPM 2012 or DPM 2012 R2).

Who is it for?

Admins of System Center 2012 Data Protection Manager who help resolve issues with Exchange protection and recovery.

How does it work?

We’ll begin by asking which protection or recovery issue you are facing. Then we’ll take you through a series of steps specific to your situation to resolve it.

Estimated time of completion:

15-30 minutes.

Getting started

With System Center 2012 Data Protection Manager (DPM 2012 or DPM 2012 R2), most problems with Exchange data sources are caused by an issue on the Microsoft Exchange server itself. The Exchange event logs from the time of the DPM failure usually point to the root cause. Common examples are missing logs, databases or database copies that are not in a healthy state, or something that prevents the Exchange writer or information store from truncating the logs after a backup completes.

Because of this, the first step when troubleshooting Exchange protection issues is to check the Exchange server and resolve any issues found there.

If you still have a problem after checking the Exchange server, select the type of Exchange protection or recovery issue below.

Cannot Enumerate Exchange Nodes

Protection problems are often caused by DPM being unable to enumerate the Exchange nodes.

If you do not see any nodes or databases, verify that the agent is installed on the desired node. Also verify that the database you’re attempting to enumerate exists on the node that has the agent installed.


Did this solve your problem?

Failure to see Passive Nodes

When you create a protection group for an Exchange Database Availability Group (DAG) and select a database that has multiple passive copies on other nodes, you may find that DPM displays only the active node.

This is usually caused by a registry value that gets created outside DPM when some other method is employed to back up and restore Exchange data (e.g. Windows Server Backup).

The registry value that gets created is named EnableVSSWriter and it is located at:

HKLM\Software\Microsoft\ExchangeServer\v14\Replay\Parameters\

There is no EnableVSSWriter value in this location by default, so if you see it, it was probably created manually and set to 0 (off).

If this is the case, change the registry value to 1 (hex), which turns the writer back on. Alternatively, you can simply remove the registry value.
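
As a rough sketch of the check described above, the following Python snippet interprets the text printed by reg query for this value. It assumes the output has already been captured to a string (for example from reg query "HKLM\Software\Microsoft\ExchangeServer\v14\Replay\Parameters" /v EnableVSSWriter). The function name and the returned labels are hypothetical and are not part of DPM or Exchange.

```python
import re


def vss_writer_state(reg_query_output: str) -> str:
    """Classify the EnableVSSWriter value found in captured `reg query` text.

    Returns "absent" (the default, nothing to change), "enabled" or "disabled".
    The three labels are illustrative only.
    """
    # reg query prints value rows such as:
    #     EnableVSSWriter    REG_DWORD    0x0
    match = re.search(
        r"^\s*EnableVSSWriter\s+REG_DWORD\s+(0x[0-9a-fA-F]+)\s*$",
        reg_query_output,
        flags=re.MULTILINE | re.IGNORECASE,
    )
    if match is None:
        return "absent"
    return "disabled" if int(match.group(1), 16) == 0 else "enabled"
```

A result of "disabled" corresponds to the case above where the value should be set to 1 or removed.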


Did this solve your problem?

Recovery Point Failures

A common cause of recovery point failures is a port conflict. When this occurs, you will see an error similar to the following:

No connection could be made because the target machine actively refused it (0x8007274D)

You can get this error if network ports 5718 and 5719 on the Exchange server are being used by another program. To verify this, run netstat -ano from a command prompt and identify which process is using these ports. If another program is using the ports, you will have to decide which program to reconfigure so that it stops using them.
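
The netstat check above can be sketched in Python as follows. This is a minimal illustration that assumes the output of netstat -ano has been saved to a string. The function name find_port_owners is hypothetical, and the snippet only reports the PIDs; it does not decide which program should give up the ports.

```python
DPM_PORTS = {5718, 5719}


def find_port_owners(netstat_output: str, ports=DPM_PORTS) -> dict:
    """Map each watched local port to the set of PIDs bound to it."""
    owners = {}
    for line in netstat_output.splitlines():
        fields = line.split()
        # Data rows start with the protocol; header and blank lines are skipped.
        if len(fields) < 4 or fields[0] not in ("TCP", "UDP"):
            continue
        # The local address is the second column and the PID is the last one.
        local_address, pid = fields[1], fields[-1]
        port_text = local_address.rsplit(":", 1)[-1]
        if port_text.isdigit() and pid.isdigit() and int(port_text) in ports:
            owners.setdefault(int(port_text), set()).add(int(pid))
    return owners
```

Each PID returned can then be matched to a process name, for example in Task Manager. A PID that belongs to the DPM agent itself is expected and does not indicate a conflict.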

OPTION 1: Reconfigure DPM to use other ports

In DPM 2007 and DPM 2012, run SetAgentConfig to configure DPM to use another port. See the following articles for more information:

947682 - The DPM protection agent service cannot start in System Center Data Protection Manager 2007

2966014 - Update Rollup 3 for System Center 2012 R2 Data Protection Manager

OPTION 2: Reconfigure the conflicting application to use another port

This may involve setting the registry to restrict those ports, or restarting or removing the conflicting application. Consult the application vendor for more information on reconfiguring port usage.

Note that DPM is the registered owner of ports 5718 and 5719: 

http://www.iana.org/assignments/service-names-port-numbers/service-names-port-numbers.xml


Did this solve your problem?

Passive node backups on Exchange 2007 fail with Event ID 2034

In a Microsoft Exchange Server 2007 Cluster Continuous Replication (CCR) cluster environment, when you run a Volume Shadow Copy Service (VSS) backup on a passive node, the backup might fail and the following event is logged in the Application Event Log:

Log Name: Application

Source: MSExchangeRepl

Event ID: 2034

Task Category: Exchange VSS Writer

Level: Error

Keywords: Classic

Description:

The Microsoft Exchange Replication Service VSS writer (instance ) failed with error code FFFFFFFC when processing the backup completion event.

This issue can occur if the Exchange Server incorrectly returns False when it calls the SetWriterFailure method of the CVssWriter class that has VSS_E_WRITERERROR_RETRYABLE specified.

In this scenario, install the following update rollup to resolve the issue:

2530488 - Description of Update Rollup 3 for Exchange Server 2007 Service Pack 3


Did this solve your problem?

Cyclic Redundancy Check errors

A problem related to Cyclic Redundancy Check errors with Exchange server protection is often indicated by an error message similar to the one below:

DPM encountered an error while performing an operation for \\?\GLOBALROOT\Device\HarddiskVolumeShadowCopyGUID.log on <servername>

(ID 2033 Details: Data error (cyclic redundancy check) (0x80070017))

The most likely cause of this error is a corrupt Exchange transaction log on the Exchange server. However, pay attention to the path in the error message. If the path looks similar to the one below, the log may instead be corrupt on the DPM server itself:

C:\Program Files\Microsoft DPM\DPM\Volumes\Replica\Microsoft Exchange Replica Writer\vol_<VOL GUID>\<GUID>\Full\C-Vol\E0400121F44.log

To resolve this issue, rename the log file that DPM mentions in the error message and copy a known good version from another DAG member. If the corrupt file is on the DPM server’s replica volume, replace the file in the same way and rerun the consistency check.
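
To illustrate the path check described above, here is a small Python sketch that classifies the path from the error message. It assumes the replica folder layout shown in the sample path (Microsoft DPM\DPM\Volumes\Replica). The function name and the returned labels are hypothetical.

```python
# Folder fragment taken from the sample replica path in this guide (assumption:
# the DPM installation uses this layout, whatever the drive or parent folder).
REPLICA_MARKER = "\\microsoft dpm\\dpm\\volumes\\replica\\"


def corrupt_log_location(error_path: str) -> str:
    """Return "dpm-replica" if the path lies under the DPM replica volume,
    otherwise "exchange-server"."""
    # Normalize separators and case so the comparison is not sensitive to either.
    normalized = error_path.replace("/", "\\").lower()
    return "dpm-replica" if REPLICA_MARKER in normalized else "exchange-server"
```

A result of "dpm-replica" means the file to replace is the copy on the DPM server rather than the one on the Exchange server.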


Did this solve your problem?

The Replica is Inconsistent

If the replica is inconsistent you will likely see an error similar to the following:

Data consistency verification check failed for LOGS of Exchange Mailbox Database <database name> on <Exchange Node>, (ID 30146 Details: Unknown error (0xfffffdf0) (0xFFFFFDF0))

You may also see the following in the DPM server Application Event Log:

Log Name: Application

Source: McLogEvent

Event ID: 259

Level: Error

Description: The file \??\Volume<DPM Volume>\<Guid>\Full\<log volume of Exchange><Database name> \Log#.log\<file> contains the <virus name>. Undetermined clean error, deleted successfully. Detected using scan engine version 5400.1158 DAT version 7285.0000

followed by

Log Name: Application

Source: ESE

Event ID: 518

Level: Error

Task Category: Logging/Recovery

Description: eseutil (12472) JetDBUtilities - 13804: The log file \\?\<DPM Volume>\logs\<database name>\<Log #>.log is missing (error -528) and cannot be used. If this log file is required for recovery, a good copy of the log file will be needed for recovery to complete successfully.

This usually occurs when an anti-virus application is installed and deletes log files that it flags as infected.

There are a few different options available to resolve this issue:

  1. Set DPM anti-virus exclusions per http://technet.microsoft.com/en-us/library/hh757911.aspx.
  2. Disable the anti-virus application on the server.
  3. Run an integrity check on the Exchange log files for the affected database.

NOTE: There are several other 30146 errors that could have slightly different resolutions, and many times it points to possible corruption on the Exchange server side. See http://technet.microsoft.com/en-us/library/hh859405.aspx for more information.


Did this solve your problem?

The Replica is Inconsistent

If the replica is inconsistent then you may also see an error similar to the following:

Type: Recovery point

Status: Failed

Description: Backup failed as another copy of 'user' database is currently being backed up. (ID 32628 Details: Internal error code: 0x80990D51)

Typically this is caused when the BackupInProgress flag is set to True on an Exchange database. To verify this, run the following PowerShell command and check the BackupInProgress value in the output:

Get-MailboxDatabase 'dbname' -Status | fl

If this is not the case then there is most likely something wrong with the Exchange writer or services.

To clear the flag, restart the store itself or restart the MSExchangeIS service on the Exchange server. Alternatively, you can move ownership of the database to another DAG member to get a database with the flag cleared and allow backups to run.
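
As a minimal sketch of the verification step above, the following Python snippet reads the list-formatted text produced by the Get-MailboxDatabase command and extracts the BackupInProgress value. It assumes the command output has been captured to a string in the usual "Name : Value" layout. The function name is hypothetical.

```python
def backup_in_progress(fl_output: str):
    """Return True or False from captured Format-List text,
    or None if the BackupInProgress property is not present."""
    for line in fl_output.splitlines():
        # Split on the first colon only, so values containing colons stay intact.
        name, separator, value = line.partition(":")
        if separator and name.strip().lower() == "backupinprogress":
            text = value.strip().lower()
            if text in ("true", "false"):
                return text == "true"
    return None
```

A result of True matches the flag condition described above. False or None suggests looking at the Exchange writer or services instead.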


Did this solve your problem?

Synchronization failures

When synchronization failures occur you will see errors similar to the following:

DPM has detected a discontinuity in the log chain for Exchange Mailbox Database <mailboxname> on <Servername> since the last synchronization. (ID 30216 Details: Unspecified error (0x80004005))

The most likely culprit here is a break in the Exchange transaction logs.

There are two options to resolve this issue:

Option 1: Turn on circular logging and flush the logs

  1. Disable DPM protection.
  2. Turn on circular logging on the DB.
  3. If needed, dismount and remount the DB.
  4. Turn off circular logging.
  5. Enable DPM protection and run an Express full.

Option 2: Clear logs older than the missing log file

  1. Run eseutil /k "x:\path\path\ENN" > output.txt, where ENN is the base name of the Exchange checkpoint file (for example, E01 for E01.chk).
  2. Examine the output file to identify gaps in the log stream. For example:
    E010000A.log
    E010000B.log - E010000D.log missing
    E010000E.log
  3. Identify the latest gap and move all log files prior to it to a different directory.
  4. Confirm that the remaining logs are in order by running the eseutil /k command as specified in the initial step.
  5. Run a DPM express full backup.
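
The gap search in step 2 can also be sketched against a plain list of log file names. The Python snippet below is an illustration only: it works from file names rather than from eseutil output, it assumes the names are the log prefix followed by a hexadecimal generation number and .log, and the function name find_log_gaps is hypothetical.

```python
import re


def find_log_gaps(file_names, prefix="E01"):
    """Return (first_missing, last_missing) generation pairs, as uppercase hex
    strings without zero padding, for every gap in the log sequence."""
    pattern = re.compile(re.escape(prefix) + r"([0-9A-Fa-f]+)\.log$", re.IGNORECASE)
    # Collect the generation numbers; names such as E01.log or E01.chk do not match.
    generations = sorted(
        {int(m.group(1), 16) for name in file_names if (m := pattern.match(name))}
    )
    gaps = []
    for previous, current in zip(generations, generations[1:]):
        if current - previous > 1:
            gaps.append((format(previous + 1, "X"), format(current - 1, "X")))
    return gaps
```

The last pair in the returned list is the latest gap. Log files with a generation number below it are the ones that step 3 moves to a different directory.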

Did this solve your problem?

General Mailbox Issues

Many Exchange server recovery issues involve user mailboxes that are not visible in the DPM console recovery tab.

If you try to enumerate user mailboxes on the Recovery tab of the DPM console and nothing is listed, the storage group or database was most likely renamed, which is not supported. To correct the issue in DPM 2012, complete the following:

  1. Remove protection for the Exchange database while retaining its data.
  2. Ensure that the Exchange services or the server has been restarted. The VSS writers do not pick up the new name after a rename in Exchange until this is done.
  3. Re-protect the data source and run a consistency check. Recoverable items should reappear.

Note: There will be a gap in the enumeration for any days on which backups ran but the Exchange writer metadata had not been updated. These recovery points cannot be restored but will eventually be removed as retention expires.


Did this solve your problem?

Some User Mailboxes are Missing

If you try to enumerate user mailboxes on the Recovery tab of the DPM console and only some of the mailboxes are listed as recoverable items, the Read Exchange Information permission may be missing for Authenticated Users under Security -> Advanced for those users in Active Directory.

All mailbox-enabled users require the Read Exchange Information permission. To resolve this, complete the following:

  1. In Active Directory, open the properties of the OU -> Advanced and add an additional Authenticated Users group.
  2. Click Edit.
  3. In the Permission Entry for User Structure window, select Properties.
  4. Select Descendant User Objects.
  5. Select the permission Read Exchange Information.
  6. Click OK three times to complete the operation.
  7. Create a recovery point with an express full backup.

At this point all mailboxes should be visible.

Did this solve your problem?

Restore to Original Location Fails

When trying to restore to the original location, you may discover that the option does not appear in the DPM console. Most of the time this is caused by an incorrect selection.

Ensure that All Protected Exchange Data is highlighted and that Latest is selected for the recovery time. Then select the database and click Recover.

Note: The database being overwritten must have the Allow overwrite from recovery flag selected in the Exchange database properties for the restore to work.


Did this solve your problem?

Congratulations!

Your Exchange protection or recovery issue is resolved.

Sorry

It appears that we are unable to resolve your issue by using this guide. For more help resolving this issue please see our TechNet support forum or contact Microsoft Support.

Properties

Article ID: 10061 - Last review: February 24, 2016 - Revision: 46