Exchange Offline Address Book (OAB) generation failures caused by attributes containing stale or bad data: events 9126, 9330, and 9339 cite error 8004010e
The Exchange Offline Address Book (OAB) generation process runs periodically on Exchange 2010 mailbox servers and generates the following events:
OALGen encountered error 8004010e while calculating the offline address list for address list '\Global Address List'.
OALGen encountered error 8004010e (internal ID 500139c) accessing Active Directory ContosoHUB03 for '\Global Address List'.
Active Directory ContosoHUB03 returned error 8004010e while generating the offline address list for '\Global Address List'. The last recipient returned by the Active Directory was 'UserName'. This offline address list will not be generated.
The call to QueryRows is failing with error 0x8004010e (MAPI_E_NOT_ENOUGH_RESOURCES).
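The error value in these events is a standard 32-bit HRESULT. As a minimal sketch (the function name `decode_hresult` is illustrative and not part of any Exchange or MAPI tool), the following Python splits the value into its failure bit, facility, and code fields, which can help when matching the value against published error tables:

```python
def decode_hresult(value: int) -> dict:
    """Split a 32-bit HRESULT into its failure bit, facility, and code fields."""
    return {
        "failure": bool((value >> 31) & 0x1),  # bit 31 set means failure
        "facility": (value >> 16) & 0x7FF,     # bits 16-26
        "code": value & 0xFFFF,                # bits 0-15
    }


# The value cited in events 9126, 9330, and 9339.
fields = decode_hresult(0x8004010E)
print(fields["failure"], fields["facility"], hex(fields["code"]))
```

Facility 4 is the interface-defined facility (FACILITY_ITF), which is why the meaning of code 0x10e depends on the component, here MAPI, that returned it.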
Oabgen uses the QueryRows function to return data (attribute values) from Active Directory in order to generate the Offline Address List (OAL). If the returned data is invalid, event 9339 with error 8004010e is logged. The most common causes of invalid data are:
Attributes on user or group objects contain references to:
- Unresolvable distinguished names (DNs): the DN in the attribute points to an object not present in the directory.
- DNs that have been DEL mangled.
- DNs that point to an object that was removed from AD, where references to that object were never cleaned up.
You will see this latter scenario referred to as one of the following:
- Lingering Links
- Lingering Linked Values
More specifically, single- and multi-valued linked attributes, such as “Manager” on a user account or “Member” on a group object, contain stale references to objects that are no longer present in Active Directory. Such stale references can occur on many attributes and object classes. As of today, this problem most commonly occurs on the following objects and attributes:
Complete Attribute list that may contain stale references:
AD database divergence among DCs can result when directory partitions defined in the forest do not replicate end to end within a rolling tombstone lifetime (TSL) number of days, or when time jumps prematurely purge knowledge of deletions before end-to-end replication completes. Such long-term conditions can cause lingering objects. Lingering objects are very common and can cause this problem. However, there are other, lesser-known causes of “bad data” in Active Directory that are often confused with lingering objects and that do not show up in a check for lingering objects (when running repadmin /removelingeringobjects).
Other potential causes of invalid data in AD:
- A linked attribute contains the DN of an object that no longer exists in Active Directory. These stale references are referred to as lingering links.
- An object created on one DC that never got replicated to other DCs hosting a writable copy of the NC but does get replicated to DCs/GCs hosting a read-only copy of the NC. The originating DC goes offline prior to replicating the originating write to other DCs that contain a writable copy of the partition.
- An object deleted on one DC that never got replicated to other DCs hosting a writable copy of the NC for that object. The deletion replicates to DCs/GCs hosting a read-only copy of the NC. The DC that originated the object deletion goes offline prior to replicating the change to other DCs hosting a writable copy of the partition.
There are two major problems to contend with that can lead to considerable time to resolution:
Problem 1: Identify all objects and/or attributes containing bad data that would cause oabgen to fail.
Problem 2: If lingering objects were identified, then proceed with lingering object removal. However, if the identification phase reveals lingering links, proceed with Attribute cleanup.
This stale data may exist on objects residing in read-only Global Catalogs, on DCs with writable copies of a directory partition or both.
Once the attributes causing Oabgen to fail have been identified, your first goal should be to vet the validity and consistency of forward-link attribute values across all replicas hosting writable copies of the object's home directory partition. Then focus on DCs hosting a read-only copy of the NC.
- Identify all attributes on all objects that contain stale references causing oabgen to fail
- Determine whether any DC hosting a writable copy of the NC for the object also contains attributes with invalid references
- If they do, then delete the bad reference (DN) from the attribute
- If the DCs that are writable for this object do not contain the invalid references and they only exist on DCs hosting a read-only copy of the partition, then additional steps are required
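The decision in the list above can be sketched in Python. This is an illustration only: the function name, the DC names, and the idea of holding each DC's attribute values in a set are assumptions, and collecting those values (for example from repadmin /showattr output) is not shown.

```python
from typing import Dict, Set


def plan_cleanup(stale_dns: Set[str],
                 values_by_dc: Dict[str, Set[str]],
                 writable_dcs: Set[str]) -> str:
    """Pick the cleanup path for one attribute of one object.

    stale_dns:    DNs already identified as invalid references.
    values_by_dc: attribute values as seen on each DC (hypothetical input).
    writable_dcs: DCs hosting a writable copy of the object's NC.
    """
    # DCs whose copy of the attribute still holds at least one stale DN.
    dirty = {dc for dc, values in values_by_dc.items() if values & stale_dns}
    if not dirty:
        return "no stale references found"
    if dirty & writable_dcs:
        return "delete the bad DN from the attribute on a writable DC"
    return "stale values only on read-only copies: additional steps required"


# Hypothetical example: DC1 is writable, GC1 hosts a read-only copy.
stale = {"CN=kim,DC=na,DC=corp,DC=contoso,DC=com"}
seen = {
    "DC1": {"CN=adam,DC=contoso,DC=com"},
    "GC1": {"CN=adam,DC=contoso,DC=com", "CN=kim,DC=na,DC=corp,DC=contoso,DC=com"},
}
print(plan_cleanup(stale, seen, writable_dcs={"DC1"}))
```

In this example the stale value survives only on the read-only copy, so the sketch reports the harder of the two paths.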
Event 9339 reports one object leading to the problem. However, the problem is usually much more wide-spread than this. The challenge here is to identify all users/groups containing invalid references that will lead to the errors.
Potential identification mechanisms:
- OABValidate: This is the best tool to use when the problem is widespread. This tool was very recently enhanced to address this specific problem.
- NSPItool: This tool runs until it discovers one object with a problem. Re-run NSPItool against that specific object to have it dump all values. The values that it errors out on are the cause of the 9339 event. After you correct the problem found in the first object, you will have to re-run NSPItool to identify more objects with problems.
- OABInteg: This tool is not valid for troubleshooting this particular issue.
- CSVDE or LDIFDE export of the OAB and then look for DEL mangled references (DEL mangled references are only one example of bad data, so this is usually not a good method of identification).
- LDP dumpdatabase (Microsoft support assistance may be required).
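For the CSVDE or LDIFDE export approach listed above, a small script can scan the export for DEL-mangled references. The sketch below assumes that mangled values appear in the export text with the escaped marker `\0ADEL:` followed by a GUID; verify this against your own export before relying on it. The function name and sample lines are hypothetical. As noted above, DEL-mangled references are only one form of bad data, so an empty result does not prove the export is clean.

```python
import re
from typing import Iterable, List, Tuple

# Assumed text form of a DEL-mangled RDN: "\0ADEL:" followed by a GUID.
DEL_MANGLED = re.compile(
    r"\\0ADEL:[0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12}"
)


def find_del_mangled(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (line number, matched marker) for every DEL-mangled reference."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in DEL_MANGLED.finditer(line):
            hits.append((number, match.group(0)))
    return hits


# Hypothetical export lines; the GUID is made up for illustration.
sample = [
    r"member: CN=adam,DC=contoso,DC=com",
    r"member: CN=kim\0ADEL:6f2a1c3e-0b7d-4c55-9e1a-3d2f8b7a9c10,CN=Deleted Objects,DC=na,DC=corp,DC=contoso,DC=com",
]
for number, marker in find_del_mangled(sample):
    print(number, marker)
```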
In some cases NSPItool and/or oabvalidate will fail to identify a problematic attribute. You may be able to identify the attribute with an LDP database dump of ntds.dit:
Use LDP to dump the database with the dumpdatabase command. Find the Distinguished Name Tag (DNT) of the object reported in the event. Look at the BDNTs (backlink DNTs) for this object. Go to the DNT entry for each BDNT and identify any that have an Object value of False.
A script that parses the text from the database dump would make this an easier task.
- Look for an Object value of False (the object is a phantom and not present in the DB)
- CNT (reference count) greater than 0 (something still references this phantom)
- Look at the BDNT (backlink DNT); ignore the Deleted Objects container
- Create object hierarchy using DNT and PDNT stopping at DNT 2 (root object)
- List all objects that meet these conditions. List all objects that reference these objects.
- Report Name and ObjectGUID of both in CSV importable format.
- Use repadmin /showattr * and / or repadmin /showobjmeta * to report data for the object. Compare differences.
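The core of the checks listed above can be sketched in Python once the dump has been parsed. The sketch does not parse the dump file itself. It assumes the rows have already been loaded into a dictionary keyed by DNT with hypothetical field names `PDNT`, `CNT`, `OBJ`, and `RDN`, and that link rows are available as (referencing DNT, referenced DNT) pairs. Mapping the real dump columns onto this shape, filtering the Deleted Objects container, and reporting ObjectGUIDs are left to the reader.

```python
from typing import Any, Dict, List, Tuple

Record = Dict[str, Any]


def build_path(dnt: int, records: Dict[int, Record]) -> str:
    """Rebuild a readable path by walking PDNT pointers up to DNT 2 (root)."""
    parts: List[str] = []
    seen = set()
    while dnt in records and dnt != 2 and dnt not in seen:
        seen.add(dnt)  # guard against loops in malformed input
        parts.append(str(records[dnt]["RDN"]))
        dnt = int(records[dnt]["PDNT"])
    return ",".join(parts)


def referenced_phantoms(records: Dict[int, Record],
                        links: List[Tuple[int, int]]) -> List[Tuple[str, str]]:
    """List (referencing object, phantom) pairs.

    A phantom here is a row with OBJ False and a reference count above zero.
    """
    report = []
    for source, target in links:
        row = records.get(target)
        if row is not None and row["OBJ"] is False and int(row["CNT"]) > 0:
            report.append((build_path(source, records), build_path(target, records)))
    return report


# Hypothetical parsed rows: DNT 21 is a phantom still referenced by DNT 20.
rows = {
    10: {"PDNT": 2, "CNT": 3, "OBJ": True, "RDN": "DC=com"},
    11: {"PDNT": 10, "CNT": 3, "OBJ": True, "RDN": "DC=contoso"},
    20: {"PDNT": 11, "CNT": 1, "OBJ": True, "RDN": "CN=FailBoatDL"},
    21: {"PDNT": 11, "CNT": 1, "OBJ": False, "RDN": "CN=kim"},
    22: {"PDNT": 11, "CNT": 1, "OBJ": True, "RDN": "CN=adam"},
}
for referencing, phantom in referenced_phantoms(rows, [(20, 21), (20, 22)]):
    print(referencing, "->", phantom)
```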
Workaround until cleanup can be performed:
- Continue to use Exchange 2003 or Exchange 2007 mailbox server for OAL generation.
Determine whether any writable DCs contain objects with attributes containing invalid references.
Search all DCs by object DN or objectGUID. Repadmin /showobjmeta can be used for issues with group membership, otherwise use repadmin /showattr:
- Repadmin /showattr * "<GUID=ObjectGUID>" /atts /allvalues /gc /long >attr.txt
- Repadmin /showobjmeta * "<GUID=ObjectGUID>" >objmeta.txt
If there is a single DC hosting a writable copy of the partition where the object exists with improper attribute references, then cleanup may be as simple as:
- Delete or clear the invalid reference on this DC and outbound replicate the changes.
However, if the problem only exists on the GCs hosting a read-only copy of the partition where the groups exist, then there is quite a bit of work to do:
There is no easy resolution to this problem. The following are viable workarounds and each has its own pros and cons. Review the following four methods and the table below to help you choose the best solution for your environment.
Method 1: Delete and recreate
Delete the object. Verify that the object no longer exists on all DCs. Recreate the object and repopulate attribute values. If the objects are security principals, then the object will have a new SID with this method. If objects or files are permissioned with the old SID then this method is not desirable.
Method 2: Delete and restore with an Authoritative Restore
Delete the objects. Verify that the objects no longer exist on all DCs. Perform an authoritative restore of the objects on a DC that hasn't processed the deletion.
Objects are completely restored to the state that exists on the recovery DC. This method also restores backlinks (i.e. where a group was a member of another group).
Note: If the DCs are running Windows Server 2003, they will most likely all need to be patched with a QFE version of ntdsa.dll before implementing recovery procedures. The recovery DC will need an updated version of ntdsutil.exe.
- Use LDP to obtain the following for each affected object: ObjectGUID and Distinguished Name
- Use repadmin to generate replication metadata for an object on all DCs
Repadmin /showobjmeta * "DNofObject" >c:\ALLDCsmetab4deletion.txt
- Identify and prepare a recovery DC
Verify object and valid attribute values exist on a DC hosting a writable copy of the partition.
Use repadmin to disable inbound replication, and then boot this DC into DS Restore Mode (or stop the Active Directory Domain Services service on Windows Server 2008 or later).
- Delete the object on another DC hosting a writable copy of the NC
- Allow end-to-end replication of the deletion to take place
- Verify object's removal with repadmin /showobjmeta *
To verify the objects no longer exist on the GCs:
repadmin /showobjmeta * "DNofObject" >c:\ALLDCsmetaAfterdeletion.txt
* All DCs that host the partition the object was in should report status 8333 “Directory Object Not Found”
* All DCs that don’t host the partition will report status 8439 “The distinguished name specified for this replication operation is invalid”
* If metadata is returned you must wait until all DCs process the deletion
* If a different status code is returned you will need to investigate on a per DC basis
- Perform an authoritative restore of the object(s) on the DC that is booted into DS Restore mode
- Boot the recovery DC into normal mode and allow replication of the changes to occur
- Import any ldifde files that were created as part of the authoritative restore process
- Re-enable inbound replication on the recovery DC
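The verification step in Method 2 (checking that the deletion has reached every DC) follows the status rules given above for 8333 and 8439. A minimal Python sketch of that rule set follows. It assumes the per-DC status has already been extracted from the repadmin /showobjmeta output into a dictionary, with `None` standing for "metadata was returned"; the function names and DC names are hypothetical.

```python
from typing import Dict, Optional

OBJECT_NOT_FOUND = 8333      # "Directory Object Not Found": DC hosts the partition
INVALID_DN_FOR_REPL = 8439   # DC does not host the partition


def classify_deletion_status(status_by_dc: Dict[str, Optional[int]]) -> Dict[str, str]:
    """Map each DC's status to a verdict; None means metadata was returned."""
    verdicts = {}
    for dc, status in status_by_dc.items():
        if status is None:
            verdicts[dc] = "wait"          # deletion not yet processed
        elif status == OBJECT_NOT_FOUND:
            verdicts[dc] = "deleted"
        elif status == INVALID_DN_FOR_REPL:
            verdicts[dc] = "not hosted"
        else:
            verdicts[dc] = "investigate"   # any other status code
    return verdicts


def deletion_complete(status_by_dc: Dict[str, Optional[int]]) -> bool:
    """True only when every DC reports 8333 or 8439."""
    verdicts = classify_deletion_status(status_by_dc)
    return all(v in ("deleted", "not hosted") for v in verdicts.values())


# Hypothetical results gathered from all DCs.
results = {"DC1": 8333, "GC1": None, "FABDC1": 8439}
print(classify_deletion_status(results))
print(deletion_complete(results))
```

In this example GC1 still returns metadata, so the authoritative restore should not proceed yet.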
Method 3: Delete and restore with adrestore.exe
SID is retained but most attributes will have to be repopulated. If backlinks are present and need to be restored then a Microsoft internal utility may need to be used prior to object deletion. (Microsoft Commercial Technical Support assistance may be required)
Method 4: Global Re-host
Un-host the partition from all GCs in the forest simultaneously. Re-host from DCs hosting a writable copy of the partition where the objects exist.
The following un-host and re-host procedures will need to be performed on all DCs in the forest that contain a read-only copy of the partition. Failing to clean up even one GC can cause the problem to recur after the cleanup steps have been performed.
- Verify that all DCs that host a writable copy of the NC have valid attribute values for the affected objects
- Repadmin /unhost DSA <Naming Context>
Verify that no other GCs host the partition prior to re-hosting the partition. There should be an event ID 1660 logged in the Directory Services event log on every DC where the partition was un-hosted.
Event ID 1658 is the status event logged in the Directory Services event log to indicate how many objects still need to be removed before the partition is completely removed. Event ID 1660 is logged in the Directory Services event log when the partition has been successfully removed from the database.
Repadmin /options <DSA> +disable_ntdsconn_xlate
Repadmin /replicate <Dest DSA> <Source DSA> <Naming Context> /readonly
Alternatively you could do the following:
- Verify that all DCs that host a writable copy of the NC have valid attribute values for the affected objects
- Disable outbound replication on all DCs that host a read-only copy of the partition
- Run the following on each of these DCs
- Repadmin /rehost DSA <Naming Context> <Good Source DSA Address>
- Verify the issue has been resolved on each DC using repadmin /showobjmeta or repadmin /showattr
- Re-enable outbound replication on all DCs that host a read-only copy of the partition.
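Because the rehost must be run on every DC that hosts a read-only copy of the partition, it can help to generate the command lines from a list instead of typing them one by one. The sketch below only builds and prints the command text in the form shown above; it does not run repadmin. The function name, DC names, and naming context are hypothetical.

```python
from typing import Iterable, List


def rehost_commands(gcs: Iterable[str], naming_context: str, good_source: str) -> List[str]:
    """Build one 'repadmin /rehost' command line per GC, for review before use."""
    return [
        f'repadmin /rehost {gc} "{naming_context}" {good_source}'
        for gc in gcs
    ]


# Hypothetical GC list and known good source DC.
for line in rehost_commands(
    ["gc1.corp.contoso.com", "gc2.na.corp.contoso.com"],
    "DC=contoso,DC=com",
    "dc1.contoso.com",
):
    print(line)
```

Printing rather than executing keeps a reviewable record of exactly which DCs were covered, which matters here because missing a single GC can let the bad data return.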
There are multiple ways to resolve this problem. The following table lists both valid and invalid ways to resolve the issue. Invalid methods are displayed so that time is not wasted performing them.
Scenario: Invalid attribute value exists on a writable copy of the NC
Method: Remove just the invalid attribute values from the attribute in question on a DC hosting a writable copy of the NC.
Pros: This is the preferred solution. If this is an option, performing this step should also resolve the issue on DCs hosting a read-only copy of the partition.
Cons: This only works if the bad data exists on an attribute of an object held on a DC hosting a writable copy of the partition (not in a GC's read-only copy of the partition).
Scenario: Invalid attribute value exists only on a read-only copy of the NC
Method: Check for and remove lingering objects.
Pros: Easy step to implement if the problem is caused by lingering objects (check with /advisory_mode first).
Cons: Does not clean up every condition; abandoned objects and lingering linked values remain. Requires you to be in strict mode. Strict mode does not block inbound replication of abandoned objects held by a GC.
Method: Initiate a full replication cycle using repadmin with a known good source. You will need to create a replication connection using repadmin /add if one doesn't already exist, and then run: repadmin /replicate destinationDC sourceDCFQDN PartitionDN /readonly /full
Pros: Easy step to implement. If this does not correct the attribute data, a rehost or object deletion may be required.
Cons: This command may take a very long time to complete if the partition in question contains a large number of objects.
Method: Unhost and rehost the partition from a known good source.
Pros: Ensures the GC hosts a valid copy of the partition. Good solution in small environments or where data divergence is limited to a few DCs.
Cons: Challenging and time-consuming in a large environment, as it may require all GCs to be cleaned up at the same time. It may also be necessary to disable outbound replication on the same GCs for the duration of the cleanup procedure, because a "clean" GC can re-replicate bad data from a "dirty" GC.
Method: Delete the object from a DC containing a writable copy of the NC.
Pros: Easy solution where the problem is isolated to attribute values on a single object.
Cons: Depending on the object type, this solution may cause additional problems.
Method: Delete and then authoritatively restore the object on a DC containing a writable copy of the NC:
1. Prior to object deletion, verify that the object and valid attribute values exist on a secondary DC, and then boot this DC into DS Restore Mode.
2. Delete the object on another DC hosting a writable copy of the NC.
3. Allow end-to-end replication of the deletion to take place.
4. Verify the object's removal with repadmin /showobjmeta *
5. Perform an authoritative restore of the object(s) on the DC that is booted into DS Restore Mode.
Pros: This will resolve the problem as long as you correctly identified all objects containing attributes with invalid data. LDIFDE files are created automatically during the authoritative restore and aid in complete recovery of forward-link / back-link pairs.
Cons: There is downtime while the objects are in their deleted state. This may require you to install several QFEs on the recovery DC and replica DCs to update ntdsa.dll and ntdsutil.exe.
Method: Delete the object, and then use adrestore.exe to un-delete the object from a DC containing a writable copy of the NC. Then re-populate attribute values using ldifde.
Pros: This will resolve the problem as long as you correctly identified all objects containing attributes with invalid data.
Cons: There is downtime while the objects are in their deleted state. This action requires a good export of the object. Where groups are nested, you also need an export of that group's membership to correct backlinks (groupadd.exe can help with this part).
Method: Replfix solution documented in KB 914024.
Notes: The solution provided in 914024 does not resolve this issue. It was created for one specific customer and fails to resolve the problem. It only works if both the source and destination DCs host a writable copy of the partition.
Method: NULL out the attribute values on the object from a DC hosting a writable copy of the NC.
Notes: This will not remove lingering link values if the Forest Functional Level is 2003 or later (as link-value replication (LVR) will be enabled).
Sample experience with issue caused by Lingering-linked values:
An Active Directory forest consists of root domain Contoso.com with child domain corp.Contoso.com, grandchild domain na.corp.contoso.com and tree domain fabrikam.com. A universal group (which could also be a distribution or security enabled group) is created in the contoso.com domain and the membership consists of
Viewing the member attribute for the universal group shows 4 members. The fabrikam.com domain is forcibly demoted and the user object na.corp.contoso.com\kim is deleted from the na.corp.contoso.com domain, at a time when end-to-end replication does not take place for a tombstone lifetime (TSL) number of days. On GCs hosting a read-only copy of the NC, the member attribute of the universal group continues to show 4 members, although only two of the 4 listed members, contoso.com\adam and corp.contoso.com\john, are valid.
Note: The sample problem above involves users added to groups in the domain partition, but the problem itself exists for both single- and multi-valued attributes on objects in any writable domain partition.
Group object DN: CN=FailBoatDL,OU=Groups,DC=contoso,DC=com
DNs referenced in Attribute: (Group membership)
Object exists in this NC (naming context / domain): contoso.com
After domain deletion and the deletion of another user object:
Group membership on DCs hosting a writable copy of the NC:
Group membership on DCs hosting a read-only copy of the NC:
Exchange 2010’s oabgen fails, citing the object “MayberryFailBoatDL” as the culprit.
Article ID: 2553698 - Last Review: 09/06/2011 18:50:00 - Revision: 10.0