Redundancy

Introduction

This section covers the use of redundancy to increase the reliability of the system.



Overview

Redundancy is the duplication of critical components or functions of a system with the intention of increasing reliability of the system, usually in the form of a backup or fail-safe. The Witness software incorporates redundancy at all levels of the system. This ensures that if any individual software component fails, another part of the system will take over that role until the system can be repaired.

Management Servers

For our Management Servers, there are two components: primary and alternate. Crucially, each must run on a physically separate server. Both run similar software, but the output from the alternate remains inactive during normal operation. The primary monitors itself and, while healthy, periodically sends an activity message (a heartbeat) to the alternate. When the primary detects a fault, all of its outputs stop, including the activity message. When the activity message ceases, the alternate waits for a brief delay, then activates its output and takes over from the primary.
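The alternate's side of this hand-over can be pictured with a small sketch. It is only an illustration of the behaviour described above, not the Witness implementation: the class name `AlternateWatchdog`, its methods, and the `takeover_delay` value are all assumptions made for the example, and fail-back after repair is not modelled.

```python
from dataclasses import dataclass


@dataclass
class AlternateWatchdog:
    """Hypothetical model of how the alternate watches the primary's heartbeat."""

    takeover_delay: float        # seconds of silence tolerated (assumed value)
    last_heartbeat: float = 0.0  # time the last activity message arrived
    active: bool = False         # alternate output is inactive in normal operation

    def on_heartbeat(self, now: float) -> None:
        # The primary only sends this message while it considers itself healthy.
        self.last_heartbeat = now

    def tick(self, now: float) -> bool:
        # Activate the output once the heartbeat has been silent for longer
        # than the delay. Once active, the alternate stays active.
        if not self.active and now - self.last_heartbeat > self.takeover_delay:
            self.active = True
        return self.active
```

Calling `tick` at regular intervals with the current time returns `False` while heartbeats keep arriving and `True` once the silence exceeds the configured delay.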

Track Engines

All of our Track Engines are active, and each backs up another. Back-ups can be configured cyclically or in pairs. Each Track Engine must also be on a physically separate server.
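The difference between the two back-up layouts can be sketched as a mapping from each Track Engine to the engine that backs it up. The function names and the engine labels (`TE1`, `TE2`, ...) are hypothetical and chosen only for illustration; the real configuration is made in the product, not in code.

```python
from __future__ import annotations


def cyclic_backups(engines: list[str]) -> dict[str, str]:
    """Map each engine to the next one in the list, wrapping around at the end."""
    if len(engines) < 2:
        raise ValueError("at least two Track Engines are needed")
    n = len(engines)
    return {engines[i]: engines[(i + 1) % n] for i in range(n)}


def paired_backups(engines: list[str]) -> dict[str, str]:
    """Map engines in adjacent pairs so the two members back each other up."""
    if len(engines) < 2 or len(engines) % 2 != 0:
        raise ValueError("paired back-up needs an even number of Track Engines")
    mapping: dict[str, str] = {}
    for a, b in zip(engines[0::2], engines[1::2]):
        mapping[a] = b
        mapping[b] = a
    return mapping
```

With three engines, the cyclic layout gives TE1 backed up by TE2, TE2 by TE3, and TE3 by TE1. The paired layout needs an even number of engines, which is one reason a cyclic layout may suit odd-sized installations better.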

Redundancy

  • Management Servers: Each Management Server is duplicated, and so has a back-up Management Server. Between the primary Management Server and the alternate Management Server, a heartbeat pings back and forth every two seconds. If a heartbeat does not return within 10 seconds, the receiving Management Server marks the other as 'down'. If the primary Management Server goes down, the back-up Management Server swiftly takes control of the primary's Track Engines and Radar, ensuring no break or failure in the system.

  • Track Engines: Track Engines back up one another. Under Topology, you can configure which Track Engines back up which. In this example, the Track Engines back up one another cyclically, which is probably the most logical back-up configuration. If one Track Engine fails, the designated back-up Track Engine takes control of the failed Track Engine's Radar to ensure no break or failure in the system. However, a Track Engine can support only double its allocated number of Radar, so it cannot back up two 'downed' Track Engines.

  • Topology Manager: This is the software that constantly manages the heartbeat and initiates a take-over by the back-up Management Server if the primary Management Server fails.
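The Track Engine capacity rule above can be made concrete with a short sketch. It is an illustration under stated assumptions, not the product's logic: the function `take_over`, its arguments, and the engine and Radar labels are hypothetical, and the sketch simply refuses a take-over that would push the back-up past double its allocated number of Radar.

```python
from __future__ import annotations


def take_over(
    radars: dict[str, list[str]],
    allocated: dict[str, int],
    backups: dict[str, str],
    failed: str,
) -> dict[str, list[str]]:
    """Return new Radar assignments after the Track Engine `failed` goes down."""
    backup = backups[failed]          # designated back-up for the failed engine
    moved = radars[failed]            # Radar that need a new home
    limit = 2 * allocated[backup]     # double the back-up's allocated number
    if len(radars[backup]) + len(moved) > limit:
        raise RuntimeError(f"{backup} cannot take on the Radar of {failed}")
    # Copy the assignments so the caller's data is left unchanged.
    updated = {engine: list(rs) for engine, rs in radars.items()}
    updated[backup] = updated[backup] + moved
    updated[failed] = []
    return updated
```

In a three-engine cyclic layout with two Radar each, a first failure is absorbed because the back-up ends up with four Radar, exactly double its allocation. If that back-up then fails as well, the next engine would need six Radar against a limit of four, so the second take-over is rejected.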

Schematic

Third Party Systems

If you have your own management system connected to the Management Server, your software must be able to detect when the primary Management Server fails and the system transfers to the back-up Management Server. This is because your management system will also have to transfer to the back-up Management Server. Third party management systems cannot be connected to both the primary and alternate Management Servers at the same time, because only the primary will allow external connections.
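One way a third party client might handle this is to probe the two servers in order and connect to whichever one currently accepts connections. The sketch below is an assumption about how such a client could be structured, not part of the Witness software: `choose_server`, the `accepts` probe, and the host names are hypothetical.

```python
from typing import Callable


def choose_server(primary: str, alternate: str, accepts: Callable[[str], bool]) -> str:
    """Return the address of whichever Management Server accepts connections.

    `accepts` is a caller-supplied probe (for example, a connection attempt
    with a short timeout) that returns True if the server lets the client in.
    """
    for address in (primary, alternate):
        if accepts(address):
            return address
    raise ConnectionError("neither Management Server accepts connections")
```

A client would call this again whenever its current connection drops, for example `choose_server("mgmt-primary.example", "mgmt-alternate.example", probe)`, so that it follows the system across to the back-up Management Server.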


 
