ADDS: Deploying the first Windows Server 2008 R2 or later DC in an existing forest may temporarily halt AD replication to strict mode destination DCs for up to 12 hours


Symptoms


Abstract

This article describes a behavior where adding the first Windows Server 2008 R2 or later domain controller to an existing forest may temporarily halt Active Directory replication to strict mode destination DCs. Windows Server 2008 R2 and later domain controllers, which support the Active Directory Recycle Bin feature, stamp the isRecycled attribute on all objects located in the deleted objects container. Stamping the attribute creates a replication event to partner DCs. Each domain controller independently purges deleted objects when garbage collection executes, which happens every 12 hours after the last boot. Because DCs are last booted at different times, isRecycled stamps can be outbound replicated to strict mode destination DCs that have already purged those same objects from their copy of the Active Directory database. The condition resolves itself within 12 hours, once all domain controllers have garbage collected the same objects.

A somewhat related problem is that the addition of new schema changes for new OS and application versions can add new indexes to Active Directory databases which can generate high disk utilization on the NTDS.DIT volume. To mitigate this problem, KB 2846725 describes deferred index creation on Windows Server 2008 R2 DCs that is built into Windows Server 2012 and later OS versions.

Secondly, in addition to causing strict mode replication failures for up to 12 hours because domain controllers purge tombstoned objects individually, stamping isRecycled on large populations of deleted objects can create a significant replication event that can result in a replication event log.

Finally, Exchange servers and other server roles register as change notification clients for 2 DNTs in the configuration partition. Introducing the first Recycle Bin-aware DC to an AD forest writes isRecycled on objects subject to change notification.

Symptoms

  1. An existing Active Directory forest consists exclusively of pre-Windows Server 2008 R2 domain controllers. Strict replication is enabled on at least one domain controller in the forest.

  2. Objects, including user accounts, are deleted from Active Directory partitions. These deleted objects move to the deleted objects container and are removed from Active Directory by the garbage collection daemon once tombstone lifetime (TSL) number of days have passed.

  3. Windows Server 2008 R2 or a later version of ADPREP /FORESTPREP is executed.

  4. The first post-Windows Server 2008 DC is added to the forest, which has the side-effect of stamping the isRecycled attribute on live objects AND deleted objects that reside in the deleted objects container.  This includes objects that are at the cusp of TSL expiration and about to be garbage collected. This update triggers an outbound replication event to replica DCs hosting common partitions.

  5. Shortly after step 4, NTDS Replication Event 1988 is logged on strict mode destination DCs that received a request to inbound replicate an update to an object that the destination DC has already seen, deleted and garbage collected. The update comes from the source DC cited in the event. The DN paths in the 1988 events for this scenario are all "delete mangled". Text from a sample 1988 event is shown below:

    Event Type:       Error
    Event Source:    NTDS Replication
    Event Category: Replication
    Event ID:           1988
    Date:                <date>
    Time:               <time>
    User:                NT AUTHORITY\ANONYMOUS LOGON
    Computer:         <hostname of DC that logged event - i.e. the destination DC in the context of replication>
    Description:
    Active Directory Replication encountered the existence of objects in the following partition that have been deleted from the local domain controllers (DCs) Active Directory database.  Not all direct or transitive replication partners replicated in the deletion before the tombstone lifetime number of days passed.  Objects that have been deleted and garbage collected from an Active Directory partition but still exist in the writable partitions of other DCs in the same domain, or read-only partitions of global catalog servers in other domains in the forest are known as "lingering objects". 
     
    This event is being logged because the source DC contains a lingering object which does not exist on the local DCs Active Directory database.  This replication attempt has been blocked.
     
     The best solution to this problem is to identify and remove all lingering objects in the forest.
     
    Source DC (Transport-specific network address):
    <object guid of source DCs NTDS Settings object or CNAME record in DNS>._msdcs.<forest root domain>
    Object:
    <DN path of updated object being outbound by source DC that has been seen deleted and garbage collected by destination DC>
    Object GUID:
    <32-character long object GUID for object being updated by source DC that has been garbage collected by the destination DC>

  6. REPADMIN /SHOWOBJMETA output, run against the objects cited in the 1988 events, shows that the isRecycled attribute is populated on objects that were deleted approximately TSL number of days in the past. A closer check reveals that the object cited in the 1988 event is temporarily "live" (for up to 12 hours) on the source DCs cited in the 1988 events but missing on the destination DCs that logged the 1988 event.
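
The fields that matter in step 5 can be pulled out of the event text with a short script. The following is a minimal Python sketch, not a supported tool. It assumes the event description has been saved as plain text with each label on its own line, as in the sample above, and that a delete-mangled RDN is rendered as `<name>\0ADEL:<object GUID>`. The object name `jsmith` and the GUID values are hypothetical.

```python
import re

# Assumption: a delete-mangled RDN appears as "<name>\0ADEL:<objectGUID>"
# in the event text.
DELETE_MANGLED = re.compile(
    r"\\0ADEL:([0-9a-fA-F]{8}(?:-[0-9a-fA-F]{4}){3}-[0-9a-fA-F]{12})"
)

# Labels as they appear in the sample 1988 event, mapped to short keys.
LABELS = {
    "Source DC (Transport-specific network address):": "source_dc",
    "Object:": "object_dn",
    "Object GUID:": "object_guid",
}


def parse_1988_event(text):
    """Return a dict with the value that follows each known label line."""
    fields = {}
    lines = [line.strip() for line in text.splitlines()]
    for index, line in enumerate(lines):
        key = LABELS.get(line)
        if key and index + 1 < len(lines):
            fields[key] = lines[index + 1]
    return fields


def is_delete_mangled(dn):
    """True if the DN contains a delete-mangled RDN."""
    return DELETE_MANGLED.search(dn) is not None


# Hypothetical event excerpt used only for illustration.
SAMPLE = r"""
Source DC (Transport-specific network address):
fae66221-a00b-41a5-b83e-3908d3eea3f5._msdcs.contoso.com
Object:
CN=jsmith\0ADEL:1f0d2c3a-5b6e-4c7d-8e9f-0a1b2c3d4e5f,CN=Deleted Objects,DC=contoso,DC=com
Object GUID:
1f0d2c3a-5b6e-4c7d-8e9f-0a1b2c3d4e5f
"""

event = parse_1988_event(SAMPLE)
print(event["source_dc"], is_delete_mangled(event["object_dn"]))
```

The extracted source DC address and object GUID are the inputs for the PING and REPADMIN /SHOWOBJMETA checks described in the Resolution section.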

Cause


The introduction of a Windows Server 2008 R2 or later domain controller updates the isRecycled attribute on live objects as well as deleted objects currently residing in the deleted objects container, including those deleted approximately TSL number of days before the promotion of the first Windows Server 2008 R2 DC into an AD forest.

The garbage collection daemon on each Active Directory domain controller runs every 12 hours after the last OS startup and independently purges objects that were deleted at least TSL number of days earlier.

When the first Windows Server 2008 R2 DC is promoted, it updates the isRecycled attribute on all deleted objects with a value of "1".  The problem occurs when this change is replicated to down-stream DCs that do not have the object, because the object has already been garbage collected.


An up-to 12 hour race condition exists that can block AD replication when source DCs that have not yet garbage collected objects (deleted at the cusp of TSL expiration) outbound replicate isRecycled stamps to strict mode destination domain controllers that have seen, deleted and garbage collected those same objects.
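
The timing can be illustrated with a small model. The Python sketch below is a simplification under stated assumptions, not a description of the directory service internals: it assumes garbage collection runs at exactly boot time plus a multiple of 12 hours, and that an object is purged at the first run on or after its TSL expiry. The function names, boot times, deletion time and the TSL of 180 days are all hypothetical.

```python
from datetime import datetime, timedelta

GC_INTERVAL = timedelta(hours=12)  # default garbage collection period


def purge_time(boot, deleted, tsl_days, interval=GC_INTERVAL):
    """First modeled garbage collection run at or after the object's TSL expiry.

    Simplified model: runs occur at boot + k * interval for k = 1, 2, ...
    """
    eligible = deleted + timedelta(days=tsl_days)
    # Ceiling division on timedeltas, done with integer floor division.
    runs = max(1, -((boot - eligible) // interval))
    return boot + runs * interval


def exposure_window(source_boot, dest_boot, deleted, tsl_days):
    """Interval in which the destination has purged the object but the source
    has not, or None if the source purges first (or at the same time)."""
    source_purge = purge_time(source_boot, deleted, tsl_days)
    dest_purge = purge_time(dest_boot, deleted, tsl_days)
    if dest_purge < source_purge:
        return dest_purge, source_purge
    return None


# Hypothetical values: the two DCs booted on different days and times.
window = exposure_window(
    source_boot=datetime(2024, 6, 1, 8, 0),
    dest_boot=datetime(2024, 6, 10, 9, 30),
    deleted=datetime(2024, 1, 1, 9, 0),
    tsl_days=180,
)
if window:
    start, end = window
    print("destination purged:", start, "source purges:", end, "gap:", end - start)
```

In this model the gap is always shorter than one garbage collection interval, which matches the "up to 12 hours" bound. Replication is only blocked if the isRecycled stamp is written and replicated while that gap is open.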

Resolution


How to determine if you're experiencing this problem vs. generic lingering objects

This problem is identified by the isRecycled attribute being stamped on an object that was deleted approximately TSL number of days in the past.  This object is "live" on the source DC (the first Windows Server 2008 R2 or later DC promoted or upgraded into the forest) but has been seen, deleted and garbage collected by a strict mode destination DC.

Use the steps below to determine if you are experiencing this exact scenario as opposed to lingering objects caused by lack of end-to-end replication in the preceding TSL number of days.

  1. Record the TSL value for the forest:

    c:\>repadmin /showattr . "CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,DC=<forest root domain>,DC=<TLD>" /atts:tombstonelifetime

  2. Use the PING -a command to resolve the fully qualified CNAME address in the 1988 event to its hostname

    Example: if the value in the "Source DC (Transport-specific network address):" field of the 1988 event is "fae66221-a00b-41a5-b83e-3908d3eea3f5._msdcs.contoso.com", type

    c:\>ping -a fae66221-a00b-41a5-b83e-3908d3eea3f5._msdcs.contoso.com

    Record the host name of the domain controller returned by the PING command

  3. Run repadmin /showobjmeta against the source DC and the ObjectGuid cited in the same 1988 event logged by the destination DC

    c:\>repadmin /showobjmeta <hostname of source DC> "<GUID=Object GUID cited in same 1988 event used in step 2>"

    The date that the object was deleted is denoted by the date in the "LastKnownParent" entry

    Note: Objects residing in the deleted objects container can only be found by their object GUID. Do not try to locate deleted objects by DN path.

    "IsRecycled" has the same "last modified" date stamp as the last modified date on the IsDeleted attribute. The Windows Server 2008 R2 domain controller made the last originating change to IsRecycled.

    The object in question may be live on the source DC for up to 12 hours after the 1988 event is logged on the destination DC. If the object cannot be found on the source DC by its object GUID, it may have been permanently deleted by the garbage collection daemon.

  4. Use a calendar to count the number of days between the date stamps for "LastKnownParent" and "IsRecycled" from the SHOWOBJMETA output in step 3 where


      LastKnownParent = the date that the object was deleted
      IsRecycled = attribute updated by the introduction of a Windows Server 2008 R2 DC


    If the difference between the date for "IsRecycled" and the date for "LastKnownParent" is right at the forest-wide value for tombstoneLifetime, then you have likely encountered this issue.

    You can confirm this by running the same repadmin command from step 3 against the destination DC that logged the 1988 event:

    C:\>repadmin /showobjmeta <hostname of destination DC> "<GUID=Object GUID cited in same 1988 event used in step 2>"

    If the object cited in the 1988 event exists on the source DC but not the destination DC, identified by the object not being found by /showobjmeta, and the date difference between LastKnownParent and IsRecycled is right at the cusp of the TSL value for your forest, then you've encountered this issue.

    Proceed to "resolution steps" to solve the problem.
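
The comparison in step 4 can also be written down as a short check. The Python sketch below is only an illustration of that heuristic. The function name, the one-day tolerance (`slack_days`) and the example dates are assumptions, as is the TSL of 180 days; use the value recorded in step 1 and the dates read from the SHOWOBJMETA output.

```python
from datetime import date


def matches_isrecycled_race(tsl_days, last_known_parent, is_recycled,
                            on_source, on_destination, slack_days=1):
    """Apply the checks from steps 1 to 4.

    last_known_parent: date the object was deleted (LastKnownParent entry)
    is_recycled:       date IsRecycled was stamped
    on_source:         object found by GUID on the source DC
    on_destination:    object found by GUID on the destination DC
    slack_days:        assumed tolerance for "right at" the TSL value
    """
    age_days = (is_recycled - last_known_parent).days
    at_cusp = abs(age_days - tsl_days) <= slack_days
    return at_cusp and on_source and not on_destination


# Hypothetical values: deleted on 1 January, stamped 180 days later.
print(matches_isrecycled_race(
    tsl_days=180,
    last_known_parent=date(2024, 1, 1),
    is_recycled=date(2024, 6, 29),
    on_source=True,
    on_destination=False,
))
```

A result of False suggests the 1988 event deserves the usual lingering object investigation rather than the steps that follow.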


Resolution steps 

  1. Wait up to 12 hours for the source domain controllers cited in the NTDS Replication 1988 events to garbage collect the affected objects

    OR

  2. Accelerate garbage collection on the source DCs referenced in the 1988 events, which have yet to garbage collect objects deleted TSL number of days in the past, by using the RootDSE "DoGarbageCollection" control:

    1. In Ldp.exe, when you click Modify on the Browse menu, leave the Distinguished name box empty

    2. In the Edit Entry Attribute box, type "DoGarbageCollection" (without the quotation marks)

    3. In the Values box, type "1" (without the quotation marks)

    4. Set the Operation value to Add, click the Enter button, and then click Run.

    You can also trigger RootDSE modifications with repadmin /setattr from the console of any given DC using the syntax

    >repadmin /setattr "" "" doGarbageCollection add "1" 

    OR

  3. Increase the garbage collection interval prior to the introduction of the first Windows Server 2008 R2 DC. 

    The garbage collection interval can be configured by entering a value in the garbageCollPeriod attribute at:

    CN=Directory Service,CN=Windows NT,CN=Services,CN=Configuration,DC=<forest root domain>,DC=<TLD>

               Default - 12 hours
               Minimum - 1 hour (tested)
               Maximum - Not documented.  (24 hours tested)
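
If option 2 has to be repeated on several DCs, the repadmin command shown above can be wrapped in a script. The Python sketch below is one possible wrapper, not a supported tool. It only rebuilds the exact command line from option 2, the function names are hypothetical, and it defaults to a dry run that executes nothing. It must be run from the console of the DC that should garbage collect, as in the original command.

```python
import shutil
import subprocess


def gc_command():
    """Argument list for: repadmin /setattr "" "" doGarbageCollection add "1"."""
    return ["repadmin", "/setattr", "", "", "doGarbageCollection", "add", "1"]


def trigger_garbage_collection(dry_run=True):
    """Return the command; run it only if dry_run is False and repadmin is on PATH."""
    argv = gc_command()
    if dry_run or shutil.which("repadmin") is None:
        return argv  # nothing is executed in this branch
    subprocess.run(argv, check=True)  # raises CalledProcessError on failure
    return argv


# Dry run: prints the argument list without touching the directory.
print(trigger_garbage_collection())
```

Passing the arguments as a list keeps the two empty strings intact, which the command needs in order to target the local DC and the RootDSE.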