Resynchronizing databases

You can resynchronize the primary node’s database to any other node if they become unsynchronized.

Determine if resynchronization is needed

To determine whether, or by how much, the databases are out of sync, you can use an SQL query, run at the same time on all databases, to count the rows of each table in the database. The following is a suggested query (run it under an account with the sysadmin role):

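-- Row count per user table, excluding staging (*_stg) and cache (*_cache) tables.
SELECT OBJECT_NAME(id) AS [TableName], rowcnt AS [RowCount]
FROM sysindexes
WHERE indid IN (0, 1)  -- heap (0) or clustered index (1): one row per table
  AND OBJECT_NAME(id) NOT LIKE '%[_]stg'    -- [_] matches a literal underscore
  AND OBJECT_NAME(id) NOT LIKE '%[_]cache'
  AND OBJECTPROPERTY(id, 'IsUserTable') = 1
ORDER BY rowcnt DESC;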

The primary node will have data in staging tables (*_stg) that the secondaries never receive. Except for those tables, row counts should match (with a margin of 5-10 rows; current operations on one server may not have replicated yet, or the query on one database ran slightly earlier than on another database). If counts don’t match, send the row counts to support@bravurasecurity.com to determine if the differences are important enough to require a resynchronization.
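
Comparing the query results by hand is tedious when there are many tables. The following is a minimal sketch, not part of the product, of how the comparison described above could be automated once the results from two nodes have been exported. The function name `compare_row_counts`, the table names, and the default margin of 10 rows are assumptions for illustration only.

```python
def compare_row_counts(primary, secondary, margin=10):
    """Return {table: (primary_count, secondary_count)} for every table whose
    row counts differ by more than `margin` rows. A table that is missing
    from one node is reported with None for that node."""
    mismatches = {}
    for table in sorted(set(primary) | set(secondary)):
        # Staging and cache tables are expected to differ, so skip them.
        if table.endswith(("_stg", "_cache")):
            continue
        p = primary.get(table)
        s = secondary.get(table)
        if p is None or s is None or abs(p - s) > margin:
            mismatches[table] = (p, s)
    return mismatches


# Hypothetical query results (table name -> row count) from two nodes.
primary_counts = {"account": 1200, "audit_log": 50000, "queue_stg": 40}
secondary_counts = {"account": 1195, "audit_log": 48000, "queue_stg": 0}

for table, (p, s) in compare_row_counts(primary_counts, secondary_counts).items():
    print(f"{table}: primary={p} secondary={s}")
```

In this example only `audit_log` is reported: `account` differs by 5 rows, which is within the margin, and `queue_stg` is a staging table.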

You can use limitedsynccheck to check the consistency of a limited subset of tables across all nodes.

Preparation

Before resynchronizing, take the following steps:

  1. Check the size of the database files (<instance>.mdf) on the primary (sending) backend database, and make sure that both the sending application node and the receiving nodes have at least twice that much free space. In Microsoft SQL Server Management Studio, you can check file size by expanding Databases, right-clicking the database, then selecting Properties > Files.

    Also check the space available at the location of the temp files (.ldf) on the backend database, since the resynchronization process will send most of the large tables over.

    Bravura Security Fabric saves the database to disk on the sending side and loads it from disk on the receiving side, where it must be loaded into the temp/transaction file (.ldf) and then flushed to the main database file (.mdf). After the database has been saved, there must be enough space left for the database engine and the Database Service to function properly. By default, the Database Service requires at least 10% free space to be able to replicate. When both the backend database and the Bravura Security Fabric node store their files on the same disk, partition, or share, the free space on that disk must be at least:

    • On the sending node, 10% of the disk size plus twice the size of the sending database’s .mdf files

    • On the receiving node(s), 10% of the disk size plus three times the size of the .mdf files.

  2. Identify the application nodes that will have to go down and exclude them from any load balancer that end users connect through, so that sessions in progress are not interrupted by the Database Service going down. When the Database Service is down on a node, anyone attempting to log in to that node’s Web UI will get a "product not available" error.

  3. If your replication architecture is hybrid (more than one application node using the same database), stop the Database Service on any nodes that share the source and destination’s databases and also remove those nodes from the load balancer.

    Stopping the Database Service stops all product services that depend on it. After that happens, start only the Database Service, so that it can perform the resynchronization operation.

    Note:

    • Reduce logged errors by stopping the IIS service and the Health check task in Windows Task Scheduler on all the nodes that should not contact their databases.

    • Remember to start up all Bravura Security Fabric services, IIS and Health check after the resynchronization process ends successfully.
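
The free-space rule in step 1 can be sketched as a short calculation. This is an illustrative helper, not a product tool: the names `required_free_bytes` and `has_enough_space` are assumptions, and the formula applies only to the case described above, where the backend database and the Bravura Security Fabric node store their files on the same disk and the Database Service uses its default 10% free-space threshold.

```python
import shutil

GIB = 1024 ** 3

# Multiples of the .mdf size needed on top of the 10% reserve (from step 1).
MDF_MULTIPLIER = {"sending": 2, "receiving": 3}


def required_free_bytes(disk_size, mdf_size, role):
    """Minimum free bytes: 10% of the disk plus 2x (sending) or
    3x (receiving) the total size of the .mdf files."""
    return disk_size // 10 + MDF_MULTIPLIER[role] * mdf_size


def has_enough_space(path, mdf_size, role):
    """Check the disk that holds `path` against the rule above."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free
    return usage.free >= required_free_bytes(usage.total, mdf_size, role)


# Hypothetical sizes: a 500 GiB disk and 40 GiB of .mdf files.
for role in ("sending", "receiving"):
    needed = required_free_bytes(500 * GIB, 40 * GIB, role)
    print(f"{role}: at least {needed // GIB} GiB free")
```

With these sample sizes, the sending node needs 50 + 80 = 130 GiB free and each receiving node needs 50 + 120 = 170 GiB free.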

Version note

The Resynchronize (limited) option was implemented in Bravura Security Fabric 12.5.0.

Resynchronizing

Once you have completed the preparatory steps listed above, start the resynchronization:

  1. Log in to the Web UI of your primary node (or the node that contains the most data) as a product administrator.

    To identify the current primary node, go to Manage the system > Maintenance > Scheduled jobs > PSUPDATE, and see the node in the Run task on this node drop-down list. Disable the PSUPDATE job until you verify that the resynchronization succeeded. Log in to the primary application node as the server administrator and disable the Windows Task Scheduler’s External database replicator task.

    Also disable the Health check task if it was enabled so it does not generate logged errors while the Database Service is down.

  2. Verify in the browser address field (or in the middle of the web-based interface footer) that you are logged into the node that contains the data you want to send to the other nodes.

    Note

    If you log into the web-based interface of a new node that you are just adding to replication and click Resynchronize, the data from the old nodes will be returned to defaults, as the initial data from the new node will be copied over to all other nodes selected for resynchronization, overwriting any data there.

  3. Click Manage the system > Maintenance > Database replication.

  4. Select the destination node or nodes that must be resynchronized on the Database replication page.

  5. Resynchronize using one of the following options:

    • Click Resynchronize.

      This creates a database export of the source node database and sends it to the destination node. The source node will be temporarily offline during the export process. This button is only enabled on the primary node.

    • Click Resynchronize (limited).

      This creates a database export of the source node database that skips the transfer of larger audit data, which makes resynchronization between nodes faster. A full resynchronization must be performed at a later date to synchronize this additional data.

  6. Wait while the destination node imports the data.

    Depending on many variables, primarily the size of the dataset being resynchronized and the network speed, the import operation may take several hours.

    The destination node will be offline during the import process.

  7. In idmsuite.log on the sending node, check that all tables that start to be written to disk are successfully written. On the receiving node, check that all tables that are received are also loaded to the backend database (the largest tables will finish loading last). If anything prevents some of the tables from loading, the resynchronization process has failed; depending on the table that failed to load, contact support@bravurasecurity.com immediately.

  8. Decide which node will remain the primary node, and enable the PSUPDATE task on that node (see step 1). Enable the Windows Task Scheduler’s External database replicator on the primary node and make sure it is disabled on all secondary nodes.

    If you have configured the Health check task and are normally acting on its warning emails, re-enable the Health check task on all application nodes’ Windows Task Scheduler.

    Warning

    When resynchronization is initiated, the data on the destination node is dropped and replaced with the data from the primary node.