Databases contain unsynchronized data
Generally, during replication:
A stored procedure runs on server A.
Server A puts that stored procedure into its queue.
When server A is able to reach server B, it attempts to transmit the contents of its queue to server B.
If server B verifies that the transmission occurred correctly, server B inserts the data into its receive queue.
Server B then responds to server A stating that it received the data correctly.
Server A then removes the data from its queue.
Meanwhile, server B starts processing the data in its receive queue.
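The steps above can be sketched as a toy model. This is only an illustration of the queue-and-acknowledge flow, not the product's implementation: the Node class, its attributes (send_queue, receive_queue, reachable) and its methods are all hypothetical names chosen for this example.

```python
from collections import deque


class Node:
    """Toy model of one replication node (hypothetical, for illustration only)."""

    def __init__(self, name):
        self.name = name
        self.reachable = True         # whether peers can currently talk to this node
        self.send_queue = deque()     # procedures waiting to be sent to the peer
        self.receive_queue = deque()  # procedures received but not yet processed
        self.applied = []             # procedures that have been run on this node

    def run_procedure(self, proc, args):
        """Run a procedure locally, then queue it for replication."""
        self.applied.append((proc, args))
        self.send_queue.append((proc, args))

    def receive(self, batch):
        """Store a batch in the receive queue and acknowledge it."""
        self.receive_queue.extend(batch)
        return True

    def transmit(self, peer):
        """Send the queue to the peer; remove entries only after an acknowledgement."""
        if not peer.reachable:
            return  # nothing is sent, so the entries stay queued
        batch = list(self.send_queue)
        if batch and peer.receive(batch):
            for _ in batch:
                self.send_queue.popleft()

    def process_received(self):
        """Run everything waiting in the receive queue."""
        while self.receive_queue:
            self.applied.append(self.receive_queue.popleft())


a, b = Node("A"), Node("B")
a.run_procedure("PslStoreAdd", ("namespace", "key", "S", "value"))
b.reachable = False
a.transmit(b)         # B is unreachable: the entry remains in A's queue
b.reachable = True
a.transmit(b)         # B acknowledges: A removes the entry from its queue
b.process_received()  # B runs the procedure from its receive queue
```

In this model, an entry leaves the sender's queue only after the receiver acknowledges it, which is why a deleted or corrupted queue can leave the two nodes with different data.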
Problems can occur when:
The queue is deleted or corrupted.
A server is taken out of replication for a while and as a result, updates to it are not queued.
Something goes wrong and the stored procedure is not executed properly on the secondary server. In this case the problem may fix itself the next time the data is updated.
The \<instance>\db\ directory contains two files that can be used to investigate why certain nodes do not have the same set of data as other nodes.
The file iddb-failed-procs-receivequeue.log keeps track of all failed replication procedures that were successfully received from the sending node, but were not processed on the local instance. These include (but are not limited to) procedure errors such as:
Invalid key constraints
Accessing nonexistent records
Full databases
The files iddb-failed-procs-<node ID>.log are used to track replication procedures that failed on the sender side, mainly because the procedures were not added to the queue to be sent to the receiving server. These kinds of errors include (but are not limited to):
Corrupt queue files
Corrupt header files
You can use these files to identify the procedure causing a problem, then determine how to correct it, either by adding or correcting the failed record, or by replaying the stored procedure. The values needed to replay the failed stored procedure can be found in the iddb-failed-procs-* files.
The following is an example of the contents of these files:
2011-07-06 22:21:52, PslStoreAdd, "namespace" "key" "S" "value" "4e14e000", constraint violation
Each entry in these files contains the following fields, in order:
Date
Time
Procedure name
Arguments to the procedure
Database error message
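As a rough aid for working with these entries, the following sketch splits one line into the fields listed above. It assumes that entries look like the example shown (comma-separated fields, with each argument in double quotes and no embedded quote characters); the function name parse_failed_proc and the regular expression are illustrative and are not part of the product.

```python
import re
from datetime import datetime

# Assumed layout: "<date> <time>, <procedure>, <quoted arguments>, <error message>"
LINE_RE = re.compile(
    r'^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\s*'
    r'(?P<proc>[^,]+),\s*'
    r'(?P<args>(?:"[^"]*"\s*)*),\s*'
    r'(?P<error>.*)$'
)


def parse_failed_proc(line):
    """Return a dict of fields for one log entry, or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    return {
        "timestamp": datetime.strptime(m.group("ts"), "%Y-%m-%d %H:%M:%S"),
        "procedure": m.group("proc").strip(),
        # Pull out the text between each pair of double quotes.
        "arguments": re.findall(r'"([^"]*)"', m.group("args")),
        "error": m.group("error").strip(),
    }


entry = parse_failed_proc(
    '2011-07-06 22:21:52, PslStoreAdd, '
    '"namespace" "key" "S" "value" "4e14e000", constraint violation'
)
print(entry["procedure"], entry["arguments"], entry["error"])
```

Lines that do not fit the assumed layout return None, so they can be reviewed by hand rather than being silently misread.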
These logs are not automatically scrubbed, so they could grow without bound if the database experiences many problems.
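Because nothing trims these files, it can help to check their sizes periodically. The sketch below only reports sizes and does not delete or rotate anything; the function name failed_proc_log_sizes is hypothetical, and the directory passed to it is whatever path holds the instance's db directory on your system.

```python
from pathlib import Path


def failed_proc_log_sizes(db_dir):
    """Return {file name: size in bytes} for the iddb-failed-procs-*.log files in db_dir."""
    return {
        path.name: path.stat().st_size
        for path in sorted(Path(db_dir).glob("iddb-failed-procs-*.log"))
    }
```

For example, calling failed_proc_log_sizes with the db directory and printing the result on a schedule gives an early warning before the files become large.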