Fault Tolerant Systems


Fault tolerance enables a system to continue operating properly when one or more of its components fail. If its operating quality decreases at all, the decrease is proportional to the severity of the failure, whereas in a naïvely designed system even a small failure can cause total breakdown. Fault tolerance is particularly sought after by our clients with high-availability or infrastructure-critical systems.

Portland Engineering’s fault-tolerant control engineering design enables a system to continue its intended operation, possibly at a reduced level, rather than failing completely when some part of the system fails. Our computer systems are designed to remain more or less fully operational after a partial failure, with perhaps a reduction in throughput or an increase in response time. That is, the system as a whole does not stop because of a problem in either the hardware or the software.

Within the scope of an individual system, we achieve fault tolerance by anticipating exceptional conditions and building the system to cope with them and, in general, by aiming for self-stabilization so that the system converges towards an error-free state. However, if the consequences of a failure are catastrophic, or the cost of making a single component sufficiently reliable is very high, a better solution may be to use some form of redundancy instead. In any case, where the consequences of a failure are catastrophic, the system must be able to revert to a safe mode. Reversion is similar to roll-back recovery, but it can be a human action if humans are present in the loop.
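As a rough illustration of these ideas, the Python sketch below combines redundancy, graceful degradation, and reversion to a safe mode in one control step. It is a minimal sketch under stated assumptions, not a description of any delivered system: the names `read_with_redundancy`, `control_step`, and `SAFE_OUTPUT`, the proportional gain, and the use of `IOError` to signal a failed sensor are all hypothetical.

```python
from typing import Callable, Optional, Sequence, Tuple

# Hypothetical safe-mode output (for example, "valve closed").
SAFE_OUTPUT = 0.0


def read_with_redundancy(
    sensors: Sequence[Callable[[], float]]
) -> Optional[Tuple[float, int]]:
    """Try each redundant sensor in order.

    Returns (reading, index of the sensor that answered), or None if
    every sensor failed.
    """
    for index, read in enumerate(sensors):
        try:
            return read(), index
        except IOError:
            # Contain the fault: one failed sensor must not stop the loop.
            continue
    return None


def control_step(
    sensors: Sequence[Callable[[], float]],
    setpoint: float,
    gain: float = 0.5,
) -> Tuple[float, str]:
    """Compute one proportional control output and report the operating mode."""
    result = read_with_redundancy(sensors)
    if result is None:
        # No usable measurement: revert to the safe mode.
        return SAFE_OUTPUT, "safe"
    reading, index = result
    # The primary sensor is index 0; any backup means degraded operation.
    mode = "normal" if index == 0 else "degraded"
    return gain * (setpoint - reading), mode
```

When the primary sensor fails and a backup answers, the step still produces a control output but reports a degraded mode. Only when every sensor fails does it fall back to the safe output, where an operator could take over if one is in the loop.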

PEI’s Fault Tolerant Systems offer no single point of failure, fault isolation to the failing component, fault containment to prevent propagation of the failure, and availability of reversion modes. Our Fault Tolerant Systems are characterized in terms of both planned and unplanned service outages and are fundamentally based on the concept of redundancy.
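One common way to express planned and unplanned outages as a single figure is availability: the fraction of a period during which the system was in service. The sketch below assumes that simple definition. The function name `availability` and the example figures are illustrative only and do not describe any particular installation.

```python
def availability(total_hours: float, planned_outage_hours: float,
                 unplanned_outage_hours: float) -> float:
    """Fraction of the period the system was in service.

    Assumes planned and unplanned outages do not overlap.
    """
    if total_hours <= 0:
        raise ValueError("total_hours must be positive")
    downtime = planned_outage_hours + unplanned_outage_hours
    if downtime > total_hours:
        raise ValueError("outages cannot exceed the total period")
    return (total_hours - downtime) / total_hours


# Example: 1000 hours with 10 planned and 10 unplanned outage hours.
example = availability(1000.0, 10.0, 10.0)
```

Reporting the two outage types separately as well as combined makes it visible whether downtime comes mostly from scheduled maintenance or from failures.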

 
