How to troubleshootTo troubleshoot this issue, following these steps:
- In the replication monitor on the Distribution Agent for the subscriber, extract the transaction sequence number and command ID that encountered the error:
Note You can obtain the same transaction sequence number from the distributor server by using the following query:
- Extract the commands that map to the transaction sequence number on the distributor server. Use the transaction sequence number from step 1 as the parameter for this command:
- From the output of step 2, identify the command that is failing by using the command ID from step 1. Refer to the command_id column in the result set.
- Validate the article information on the publisher. Use the article ID that you obtain from step 2, and check the details of the article that you are trying to update:
- Validate the primary key on the publisher.
You have two pieces of information: the table on which you are trying to perform the update, and the primary key value. You can query the table by using the primary key value, and locate the row on the publisher database. For example:SELECT * FROM tbl_sample WHERE column_name = <primary_key_value>
- Check the problem at the subscriber.
Execute the same query on the subscriber database, and compare it to the result that you receive from the publisher database.
- Manually insert the missing row at the subscriber. This may enable the Distribution Agent to retry the failed command and move forward with the replication.
Note There may be other rows that are missing and that have to be manually inserted at the subscriber if there are more failures.
- Instruct the Distribution Agent to skip this error and continue to replicate the rest of the changes. The Distribution Agent accepts the skiperrors parameter. You can use this parameter to pass error code 20598. This may keep the replication setup intact while you wait for an opportunity to manually synchronize the missing rows.
Note You have to carefully evaluate the downstream effects of referential integrity and the triggers that are present on the affected table before you continue.
Data collection to investigate the cause of this issueIf this problem occurs repeatedly, you should collect the following data for analysis by the Microsoft SQL Server support team so that they can try to identify the cause of the problem:
- The backup of the distribution database when this problem occurs. (This should be after the error is reported and before the subscription is reinitialized.)
- Transaction log backups of the publisher and the subscriber. (These should be for at least the 24 hours that led up to the time of the problem.)
- Profiler traces that show the activity of the replication agents on the publisher, the subscriber, and the distributor. (Please make sure that the profiler is running even before the problem starts. Ideally, you want to start the profiler around the same time as the reindex job start time.)
- Outputs from the previous five steps for identifying the affected table and the primary key value that is missing
- Output of the catalog views from both the publisher and the subscriber databases:
- Verbose outputs of replication agent logs
Known issues that are addressedThe following problems occur in older versions of SQL Server:
- KB 982857 Fix: Distribution Agent fails with error code 20598 when a publication database is configured with the Read Committed Snapshot option
- KB 2922526 FIX: Missing rows on subscriber when you drop and re-add article from another publication or add subscription
- KB 2484392 FIX: "xact_seqno" column in the "MSrepl_errors" table does not record a skipped row in SQL Server 2008 or in SQL Server 2008 R2 if you use the "-SkipErrors" parameter in the Distribution Agent
- KB 956376 FIX: The SQL Server 2005 Distribution Agent and the SQL Server 2008 Distribution Agent do not skip error 20598 when you configure a SQL Server 2000 transactional publication to skip error 20598
- KB 959893 FIX: Some changes may not be replicated to the subscriber correctly in SQL Server 2005 or in SQL Server 2008 when the sum of the parameters exceeds 2,100
- KB 959032 FIX: When you run the "dbcc dbreindex" command or the "alter index" command, some transactions are not replicated to the subscribers in a transactional replication in SQL Server 2005 or in SQL Server 2008
- KB 954054 FIX: Some changes are not replicated to the subscribers when you synchronize data for a transactional replication in SQL Server 2005