Safe failure fraction (SFF) is an arbitrary metric used within IEC 61508 (and other guidance) to set architectural rules (or constraints). It is based on diagnostic coverage and was introduced by standards in the safety-related systems area. It combines the proportion of revealed ‘dangerous’ failures with those that are not ‘dangerous’.
What is Safe Failure Fraction?
Safe Failure Fraction is a measure derived from the ratio of safe failures and diagnosed dangerous failures to the total number of failures. Mathematically:
SFF = (Safe Failures + Diagnosed Dangerous Failures) / (Safe Failures + Dangerous Failures)
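A higher SFF indicates that a smaller share of all failures are undetected dangerous failures. Because SFF is a ratio, it says nothing about how often the system fails in absolute terms, so a high SFF alone does not make a system more reliable.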
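The formula above can be sketched in a few lines of Python. This is a minimal illustration, not a validated tool: the function name `safe_failure_fraction` and the example figures are hypothetical, and the three inputs are assumed to be failure rates expressed in the same unit.

```python
def safe_failure_fraction(safe, dangerous_detected, dangerous_undetected):
    """Return SFF as a fraction between 0 and 1.

    All three arguments are failure rates (or counts) in the same unit.
    """
    if min(safe, dangerous_detected, dangerous_undetected) < 0:
        raise ValueError("failure rates must be non-negative")
    total = safe + dangerous_detected + dangerous_undetected
    if total == 0:
        raise ValueError("total failure rate must be greater than zero")
    # Safe failures and diagnosed dangerous failures form the numerator.
    return (safe + dangerous_detected) / total


# Illustrative figures in FIT (failures per 10^9 hours), not from any real device.
sff = safe_failure_fraction(safe=200, dangerous_detected=600, dangerous_undetected=200)
print(f"SFF = {sff:.0%}")  # SFF = 80%
```

Only the undetected dangerous failures are missing from the numerator, so reducing them (for example through better diagnostics) is what raises the SFF.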
Breaking Down the Key Components
Safe Failures: A ‘safe’ failure is a failure of an element and/or subsystem and/or system that plays a part in implementing the safety function that:
- results in the spurious operation of the safety function to put the EUC (or part thereof) into a safe state or maintain a safe state; or
- increases the probability of the spurious operation of the safety function to put the EUC (or part thereof) into a safe state or maintain a safe state.
Diagnosed Dangerous Failures: These are failures that could potentially jeopardize system safety but are identified through diagnostic mechanisms before leading to a hazardous situation.
Total Failures: The sum of all safe failures and all dangerous failures (both diagnosed and undiagnosed) within the system.
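Since SFF is based on diagnostic coverage, the three components above can also be derived from a safe failure rate, a dangerous failure rate and a coverage figure. The sketch below assumes diagnostic coverage means the fraction of dangerous failures that the diagnostics detect. The function name `sff_from_coverage` and the numbers are hypothetical.

```python
def sff_from_coverage(lambda_safe, lambda_dangerous, diagnostic_coverage):
    """Split dangerous failures by diagnostic coverage and return
    (sff, lambda_dd, lambda_du)."""
    if not 0.0 <= diagnostic_coverage <= 1.0:
        raise ValueError("diagnostic coverage must lie between 0 and 1")
    if lambda_safe < 0 or lambda_dangerous < 0:
        raise ValueError("failure rates must be non-negative")
    total = lambda_safe + lambda_dangerous
    if total == 0:
        raise ValueError("total failure rate must be greater than zero")
    # Diagnosed dangerous failures: the share of dangerous failures
    # that the diagnostics reveal.
    lambda_dd = diagnostic_coverage * lambda_dangerous
    # Undiagnosed dangerous failures: whatever the diagnostics miss.
    lambda_du = lambda_dangerous - lambda_dd
    sff = (lambda_safe + lambda_dd) / total
    return sff, lambda_dd, lambda_du


# Illustrative figures only: 400 FIT safe, 600 FIT dangerous, 90% coverage.
sff, dd, du = sff_from_coverage(400, 600, 0.90)
print(f"SFF = {sff:.1%}, diagnosed = {dd:.0f} FIT, undiagnosed = {du:.0f} FIT")
```

With these inputs the SFF comes out at about 94%, higher than the 90% coverage figure, because safe failures also count towards the numerator.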
Applying Safe Failure Fraction
The SFF is calculated in the same way for every component, but the SIL that can be claimed for a given SFF depends on the complexity of the component:
- Type A Components: These are well-defined components with predictable failure modes and accessible fault data. Examples include simple mechanical parts or electrical circuits.
- Type B Components: More complex components that may lack comprehensive failure data or predictable fault behavior, such as microprocessors or programmable devices.
In the following Tables ‘m’ refers to the number of items which need to succeed. The Tables provide the maximum Safety Integrity Level (SIL) which can be claimed for each safe failure fraction band. The word ‘simplex’ implies no redundancy and is referred to as Hardware Fault Tolerance 0. The expression ‘m + 1’ implies 1 out of 2, 2 out of 3, etc. redundancy and is referred to as Hardware Fault Tolerance 1. Similarly, ‘m + 2’ implies 1 out of 3, 2 out of 4, etc. and is referred to as Hardware Fault Tolerance 2.
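The lookup that such Tables support can be sketched in Python. The SIL values below are a transcription of the architectural constraint tables for Type A and Type B components as they are commonly reproduced from IEC 61508-2. Treat them as an assumption and check them against the standard before relying on them. The function name `max_claimable_sil` is hypothetical.

```python
# Each row: (upper bound of the SFF band, max SIL for HFT 0, 1, 2).
# None means that combination is not allowed.
# ASSUMPTION: values transcribed from IEC 61508-2; verify before use.
INF = float("inf")
CONSTRAINTS = {
    "A": [
        (0.60, (1, 2, 3)),     # SFF < 60%
        (0.90, (2, 3, 4)),     # 60% <= SFF < 90%
        (0.99, (3, 4, 4)),     # 90% <= SFF < 99%
        (INF, (3, 4, 4)),      # SFF >= 99%
    ],
    "B": [
        (0.60, (None, 1, 2)),  # SFF < 60%
        (0.90, (1, 2, 3)),     # 60% <= SFF < 90%
        (0.99, (2, 3, 4)),     # 90% <= SFF < 99%
        (INF, (3, 4, 4)),      # SFF >= 99%
    ],
}


def max_claimable_sil(sff, hft, component_type):
    """Return the maximum SIL for an SFF (0..1), a hardware fault
    tolerance (0, 1 or 2) and a component type ("A" or "B")."""
    if component_type not in CONSTRAINTS:
        raise ValueError("component_type must be 'A' or 'B'")
    if hft not in (0, 1, 2):
        raise ValueError("hft must be 0, 1 or 2")
    if not 0.0 <= sff <= 1.0:
        raise ValueError("sff must lie between 0 and 1")
    # Walk the bands in ascending order and stop at the first one
    # whose upper bound exceeds the SFF.
    for upper_bound, sils in CONSTRAINTS[component_type]:
        if sff < upper_bound:
            return sils[hft]


print(max_claimable_sil(0.94, hft=0, component_type="B"))  # 2
print(max_claimable_sil(0.94, hft=1, component_type="B"))  # 3
print(max_claimable_sil(0.50, hft=0, component_type="B"))  # None
```

In this sketch, a simplex (HFT 0) Type B component with an SFF of 94% can claim at most SIL 2, and adding one level of redundancy (HFT 1) raises the limit to SIL 3.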