Patterns for Fault Tolerant Software
Chapter 5. Detection Patterns
The first phase of fault tolerance is detection
The dimensions of detection
A Priori Detection
Use constraints that are known in advance
Constraints can cover system states, results, and side effects
If nothing is known about the valid range of results, this method will not work.
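A minimal sketch of an a priori check, assuming the valid range of a result is known ahead of time. The sensor reader and the limits of -40.0 and 125.0 are hypothetical values chosen only for illustration.

```python
def check_result(value, low, high):
    """Return True if value lies inside the range known in advance."""
    return low <= value <= high


def read_temperature_checked(read_sensor, low=-40.0, high=125.0):
    """Call a (hypothetical) sensor reader and reject out-of-range results.

    The limits are illustrative assumptions, not values from the slides.
    """
    value = read_sensor()
    if not check_result(value, low, high):
        # The constraint is violated, so an error has been detected.
        raise ValueError(f"reading {value} outside [{low}, {high}]")
    return value
```

If no such range can be stated for a result, there is nothing for `check_result` to test against, which is the limitation noted above.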
Comparing Redundant Results
Redundancy (3, chapter 4)
The value to be compared, or
the context that enables identification of the faulty component
To learn about correct system behavior
e.g., Bayesian learning techniques
Just determining that one result is incorrect is helpful,
but insufficient to fix the fault and prevent a failure from occurring
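A minimal sketch of why more than two redundant results are needed to identify the faulty component. It assumes simple majority voting over hashable results; the function name `vote` and the component names are hypothetical and do not come from the slides.

```python
from collections import Counter


def vote(results):
    """Compare redundant results by strict majority vote.

    `results` maps a (hypothetical) component name to the value it produced.
    Returns (majority_value, names_of_disagreeing_components).
    Raises RuntimeError when no strict majority exists, i.e. a mismatch
    is detected but the faulty component cannot be identified.
    """
    counts = Counter(results.values())
    value, count = counts.most_common(1)[0]
    if count <= len(results) // 2:
        raise RuntimeError("results disagree and no majority exists")
    faulty = sorted(name for name, r in results.items() if r != value)
    return value, faulty
```

With three components, `vote({"a": 42, "b": 42, "c": 41})` singles out `"c"`. With only two components that disagree, the mismatch is detected but the call raises, because nothing indicates which of the two is wrong.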
Detect an Error > Detect a Failure
The error is detected automatically and corrected
before it becomes a failure
‘Fail-silent’ and crash failure modes
An element stops without informing the rest of the system that it is stopping
Detecting that an element has stopped functioning
vs.
Determining whether an element has stopped operating correctly
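One common way to detect that a fail-silent element has stopped is to expect periodic heartbeat messages and flag elements that go quiet. The slide does not name this technique, so the sketch below is an assumption; `HeartbeatMonitor` and its methods are hypothetical names.

```python
import time


class HeartbeatMonitor:
    """Flags elements whose last heartbeat is older than `timeout` seconds."""

    def __init__(self, timeout, clock=time.monotonic):
        self.timeout = timeout
        self.clock = clock      # injectable so the monitor can be tested
        self.last_seen = {}     # element name -> time of last heartbeat

    def beat(self, name):
        """Record a heartbeat from the named element."""
        self.last_seen[name] = self.clock()

    def silent_elements(self):
        """Return the names of elements that have been quiet too long."""
        now = self.clock()
        return sorted(
            name for name, seen in self.last_seen.items()
            if now - seen > self.timeout
        )
```

This covers only the first case, an element that has stopped functioning. An element that keeps sending heartbeats while producing wrong results is not caught here and needs a check on the results themselves.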
Test function return codes
try/catch
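A short sketch of both detection styles in Python, where try/catch is spelled `try`/`except`. The status codes and the functions `store_record` and `parse_port` are hypothetical and exist only to illustrate the two checks.

```python
OK, ERR_EMPTY = 0, 1  # hypothetical status codes


def store_record(buffer, record):
    """Return-code style: report the error through the return value."""
    if not record:
        return ERR_EMPTY
    buffer.append(record)
    return OK


def parse_port(text):
    """Exception style: catch the error raised by the conversion."""
    try:
        port = int(text)
    except ValueError:
        return None  # detected: the text is not a number
    return port if 0 < port < 65536 else None
```

A return code only detects anything if the caller tests it, for example `if store_record(buf, rec) != OK:`. An uncaught exception, by contrast, cannot be silently ignored.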
Detecting Errors
Next - Fault Correlation
