Single Bravura Security Fabric database goes offline

From the point of view of the Bravura Security Fabric server, this is indistinguishable from a failed link to the database. System monitoring logs may show whether the problem lies in connectivity to the database or in the database itself.
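As an illustration of how monitoring might tell the two cases apart, the following is a minimal Python sketch and not part of the product. It assumes the database listens on a TCP port and that the caller supplies a hypothetical `run_query` callable that executes a trivial query with whatever database driver is in use. The function name and return labels are also assumptions made for this example.

```python
import socket
from typing import Callable


def classify_db_failure(host: str, port: int,
                        run_query: Callable[[], None],
                        timeout: float = 3.0) -> str:
    """Return 'unreachable', 'database', or 'ok' for a database endpoint.

    'unreachable' covers both a broken network path and a listener that
    is not running; a TCP probe alone cannot separate those two cases.
    """
    try:
        # Step 1: can a TCP connection to the database listener be opened?
        with socket.create_connection((host, port), timeout=timeout):
            pass
    except OSError:
        return "unreachable"

    try:
        # Step 2: the listener answers, so check that a trivial query works.
        # run_query is supplied by the caller (hypothetical helper).
        run_query()
    except Exception:
        return "database"

    return "ok"
```

A result of `database` suggests the link is healthy and the fault lies in the database itself. A result of `unreachable` still needs a check on the database host to decide between a network fault and a stopped service.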

What stops working

What continues to work

Possible Causes

Data loss

Resolution

  • Users can no longer log into this server.

  • Users can no longer retrieve passwords from this server.

  • This server can no longer push password updates to target systems for which it is responsible.

  • Other servers detect that replication to this server is impossible, so they start queuing updates destined for it and display alarm messages warning that they will stop functioning normally when the queue fills.

  • If the queue is allowed to fill (which could take several hours to several days, depending on activity level and queue size), other servers will suspend services: users will be unable to log in, since logins are logged in a replicated fashion, and will be unable to check out passwords.

    Effectively, the entire system will go into an alert state until the dead server is repaired or removed from replication.
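The time before the queue fills depends on queue size and activity level, as noted above. A rough estimate can be sketched in Python as follows. The function and its parameters are hypothetical: the product's actual queue units and configuration names are not given here, so the sketch only assumes that capacity, current backlog, and fill rate are measured in the same unit.

```python
import math


def hours_until_queue_full(capacity: float, queued: float,
                           fill_rate_per_hour: float) -> float:
    """Estimate the hours left before a replication queue reaches capacity.

    All three arguments are assumed to use the same unit (for example,
    queued updates or megabytes).
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    remaining = max(capacity - queued, 0.0)
    if remaining == 0:
        return 0.0  # the queue is already full
    if fill_rate_per_hour <= 0:
        return math.inf  # nothing is being queued, so it never fills
    return remaining / fill_rate_per_hour


# Example: room for 1000 updates, 250 already queued, 50 arriving per hour.
print(hours_until_queue_full(1000, 250, 50))  # prints 15.0
```

The estimate is linear, so it is only as good as the assumed fill rate. Activity usually varies through the day, and a measured peak rate gives a more conservative figure.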

Other servers continue to function normally, unless their replication queues reach their limit.

If the queue fills on other servers, they switch to DB COMMIT SUSPEND mode. At that point, the only possible action is to remove the non-functional server from replication.

A problem occurs on the database server used as a back end for a single Bravura Security Fabric server. This takes the database offline and incapacitates the Bravura Security Fabric server in question.

No data loss, or minimal data loss if updates on target systems were not yet committed to the database when the damaged server went offline.

Database problems may be due to hardware or operating system faults on the database server (assuming that it is separate from the Bravura Security Fabric server). They may be as simple as a full file system or may be more complex.

Diagnosing database problems is outside the scope of this document. Repair the database if possible (see Time available to fix problems).

If the database link cannot be fixed in time, promptly remove the affected Bravura Security Fabric server from the replication configuration on the other Bravura Security Fabric servers. Instructions for this are in Removing a node from replication. At a later date, return the server to the replicating set by following the instructions in Synchronizing a new node with an existing set of Bravura Security Fabric replicas.
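The choice between repairing in place and removing the node can be expressed as a simple comparison. The Python sketch below is illustrative only: the function name, the returned labels, and the one-hour default safety margin are assumptions made for this example, not product settings.

```python
def choose_action(hours_to_repair: float,
                  hours_until_queue_full: float,
                  safety_margin_hours: float = 1.0) -> str:
    """Pick 'repair' or 'remove_from_replication' for the affected server.

    Repair is chosen only if it is expected to finish, with a margin to
    spare, before replication queues on the other servers fill up.
    """
    if hours_to_repair + safety_margin_hours <= hours_until_queue_full:
        return "repair"
    return "remove_from_replication"


# Example: a 2-hour repair fits comfortably inside a 15-hour window.
print(choose_action(2, 15))   # prints repair
# Example: a 20-hour repair does not, so the node should be removed.
print(choose_action(20, 15))  # prints remove_from_replication
```

A larger safety margin favors early removal, which keeps the remaining servers out of DB COMMIT SUSPEND mode at the cost of resynchronizing the node later.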