Testing Safety Critical Systems
Theory and Experiences
J.vanEkris@Delta-Pi.nl
Jaap van Ekris
Agenda
• The challenge
• Process and Organization
• System design
• Verification Techniques
• Trends
• Reality
THE CHALLENGE
Why is testing safety critical systems so hard?
Some people live on the edge…
How would you feel if you were getting
ready to launch and knew you were
sitting on top of two million parts -- all
built by the lowest bidder on a
government contract.
John Glenn
Actually, we all do…
We even accept loss...
• Lost/misdirected luggage:
Chance of failure 10^-2 per
suitcase
• Airplane: Chance of crash
10^-8 per flight hour
• Storm Surge Barrier: Chance
of failure 10^-7 per usage
• Nuclear power plant: As
Low As Reasonably Practicable
(ALARP)
Are the software risks acceptable?
To put things in perspective…
• Getting killed in traffic: 10^-2 per year
• Having a drunk pilot: 10^-2 per flight
• Hurting yourself when using a chainsaw: 10^-3 per use
• Being considered possessed by Satan: 10^-4 per lifetime
• Dating a supermodel: 10^-5 per lifetime
• Drowning in a bathtub: 10^-7 per lifetime
• Being hit by falling airplane parts: 10^-8 per lifetime
• Being killed by lightning: 10^-9 per lifetime
• Your house being hit by a meteor: 10^-15 per lifetime
We might have become overprotective…
Nonetheless software is dangerous...
and the odds are against us…
• Capers Jones: at least 2 high severity
errors per 10 KLOC
• Industry consensus is that software
will never be more reliable than
– 10^-5 per usage
– 10^-9 per operating hour
The value of testing
Program testing can be used to show the
presence of bugs, but never to show
their absence!
Edsger W. Dijkstra
PROCESS AND ORGANIZATION
Who does what in safety critical software development?
IEC 61508: Safety Integrity Level and
acceptable risk
IEC61508: Risk distribution
IEC 61508: A process for safety critical functions
Process or personal commitment?
• Romans put the architect
under the arches when
removing the scaffolding
• Boeing and Airbus put all
lead-engineers on the first
test-flight
• Dijkstra put his
“rekenmeisjes” (human
computers, literally
“calculating girls”) on the
opposite dock when
launching ships
It is about keeping your back straight…
• Thomas Andrews, Jr.
• Naval architect in charge of RMS Titanic
• He recognized that regulations were
insufficient for a ship the size of the Titanic
• Decisions “forced upon him” by the client:
– Limit the range of double hulls
– Limit the number of lifeboats
• He was on the maiden voyage to spot
improvements
• He knowingly went down with the ship,
saving as many as he could
SYSTEM DESIGN
What do safety critical systems look like?
An introduction to storm surge barriers…
Design Principles
• Keep it simple...
• Risk analysis drives design (decisions)
• Safety first (production later)
• Fail-to-safe
• There shall be no single source of
(catastrophic) failure
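One common way to avoid a single source of failure is to read several redundant sensors and vote on the result. The sketch below is an illustration only: the slides do not say which voting scheme the barrier uses, and the function names, the three-sensor setup, the 3 meter threshold and the choice of "close" as the safe state are all assumptions.

```python
from typing import Optional, Sequence

def voted_water_level(readings: Sequence[Optional[float]]) -> Optional[float]:
    """Vote over redundant sensor readings (None marks a failed sensor).

    Returns the upper median of the valid readings, or None when fewer
    than two sensors deliver a value.
    """
    valid = sorted(r for r in readings if r is not None)
    if len(valid) < 2:
        return None  # a single sensor is not trusted on its own
    # Middle value for three readings, the higher value for two readings.
    return valid[len(valid) // 2]

def close_required(readings: Sequence[Optional[float]],
                   threshold_m: float = 3.0) -> bool:
    """Fail-to-safe decision: without a trustworthy level, assume the worst."""
    level = voted_water_level(readings)
    return level is None or level > threshold_m

# One sensor reporting nonsense does not trigger the barrier on its own.
print(close_required([2.0, 2.1, 99.0]))   # False
# Losing two of three sensors forces the (assumed) safe state.
print(close_required([None, None, 2.0]))  # True
```

Taking the higher of two remaining readings is a deliberately conservative choice; it trades some availability for safety, which is the tension the later slides on contradictory risks describe.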
A simple design of a storm surge barrier
Relay
(€10,00/piece)
Water detector
(€17,50)
Design documentation
(Sponsored by Heineken)
Risk analysis
Relay failure
Chance: Small
Cause: Aging
Effect: Catastrophic
Water detector failure
Chance: Huge
Causes: Rust, driftwood,
seagulls (eating, shitting)
Effect: Catastrophic
Measurement errors
Chance: Colossal
Causes: Waves, wind
Effect: False positive
Broken cable
Chance: Medium
Causes: Digging, seagulls
Effect: Catastrophic
System Architecture
Risk analysis
Typical risks identified
• Components making the wrong decisions
• Power failure
• Hardware failure of PLCs/servers
• Network failure
• Ship hitting water sensors
• Human maintenance error
Risk ≠ system crash
• Wrongful functional
behaviour
• Data accuracy
• Lack of response speed
• Understandability of
the GUI
• Tolerance towards
illogical inputs
Systems do misbehave...
Can be late…
Risks can be external as well
Eliminating risk isn’t the goal…
No matter how well the
environment analysis
has been:
• Some scenarios will be
missed
• Some scenarios are
too expensive to
prevent:
– Accept risk
– Communicate to stakeholders
Risks can be contradictory…
Availability of the service vs. safety of the installation
Risk reality does change over time...
9/11...
• Really tested our “test
abort” procedure
• Introduced a
fundamentally new risk
to ATC systems
• Changed the ATC
system dramatically
• Doubled our test cases
overnight
StuurX: Component architecture design
StuurX::Functionality, initial global design
Init
Start_D
“Start” signal to Diesels
Wacht (wait)
Water level < 3 meter
Water level > 3 meter
W_O_D
“Diesels ready”
Sluit_?
“Close Barrier”
Water level
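The state labels above can be read as a small state machine. The sketch below is one possible reading, not the actual StuurX design: the diagram's arrows did not survive in this transcript, so the transition order, the `step` function and its parameters are assumptions. The last state's name is incomplete in the source ("Sluit_?") and is represented here simply as `SLUIT`.

```python
from enum import Enum, auto

class State(Enum):
    INIT = auto()
    WACHT = auto()    # wait until the water level exceeds the threshold
    START_D = auto()  # send the "Start" signal to the diesels
    W_O_D = auto()    # wait for "Diesels ready"
    SLUIT = auto()    # issue "Close Barrier" (state name incomplete in source)

THRESHOLD_M = 3.0  # water level threshold taken from the slide

def step(state: State, water_level_m: float, diesels_ready: bool) -> State:
    """Return the next state for one evaluation cycle (assumed ordering)."""
    if state is State.INIT:
        return State.WACHT
    if state is State.WACHT:
        # The slide only names "< 3 meter" and "> 3 meter"; this sketch
        # treats exactly 3 meter as "keep waiting".
        return State.START_D if water_level_m > THRESHOLD_M else State.WACHT
    if state is State.START_D:
        return State.W_O_D
    if state is State.W_O_D:
        return State.SLUIT if diesels_ready else State.W_O_D
    return State.SLUIT

# Walk through a rising water level followed by the diesels coming online.
state = State.INIT
for level, ready in [(2.0, False), (2.5, False), (3.4, False),
                     (3.5, False), (3.6, False), (3.6, True)]:
    state = step(state, level, ready)
    print(level, ready, state.name)
```

Writing the transitions out like this makes gaps visible, for example what should happen at exactly 3 meter, or when the diesels never report ready.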
StuurX::Functionality, final global design
StuurX::Functionality,
Wait_For_Diesels, detailed design
VERIFICATION
What is getting tested, and how?
The end is nigh...
Challenge: time and resource limitations
• 64-bit input isn’t that
uncommon
• 2^64 is the global rice
production in 1000
years, measured in
individual grains
• Fully testing all binary
inputs on a 64-bit
stimulus response system
takes 2 centuries
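The two-century figure can be checked with a few lines of arithmetic. The slide does not state the test rate behind it; the sketch below assumes roughly 3 billion test inputs per second, which is the rate at which the number comes out near 200 years. The function name and the rate are assumptions.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def years_to_exhaust(input_bits: int, tests_per_second: float) -> float:
    """Years needed to apply every possible input of the given width once."""
    return 2 ** input_bits / tests_per_second / SECONDS_PER_YEAR

# Assumed rate: 3e9 inputs per second, one input per test.
print(round(years_to_exhaust(64, 3e9)))
# At a more realistic one million tests per second the answer is far worse.
print(round(years_to_exhaust(64, 1e6)))
```

Either way, exhaustive input testing is out of reach, which is why the following slides rely on equivalence classes, boundary values and probabilistic testing instead.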
Goals of testing safety critical systems
• Verify contractually agreed functionality
• Verify correct functional safety-behaviour
• Verify safety-behaviour during degraded and
failure conditions
An example of safety critical components
IEC 61508 SIL4: Required verification activities
Design Validation and Verification
• Peer reviews by
– System architect
– 2nd designer
– Programmers
– Test manager system testing
• Fault Tree Analysis / Failure Mode and Effect
Analysis
• Performance modeling
• Static Verification / Dynamic Simulation
(by Twente University)
Programming (in C/C++)
• Coding standard:
– Based on “Safer C”, by Les Hatton
– May only use safe subset of the compiler
– Verified by Lint and 5 other tools
• Code is peer reviewed by 2nd developer
• Certified and calibrated compiler
Unit tests
• Focus on conformance to specifications
• Required coverage: 100% with respect to:
– Code paths
– Input equivalence classes
• Boundary Value analysis
• Probabilistic testing
• Execution:
– Fully automated scripts, running 24x7
– Creates 100Mb/hour of logs and measurement data
• Upon bug detection
– 3 strikes is out → After 3 implementation errors it is rebuilt by another developer
– 2 strikes is out → Need for a 2nd rebuild implies a redesign by another designer
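Boundary value analysis can be illustrated with the 3 meter water level threshold from the design slides. This is a minimal sketch, not the project's test suite: the function names, the use of centimeters and the rule that exactly 3 meter does not close the barrier are assumptions.

```python
from typing import List

def boundary_values(threshold: int, step: int) -> List[int]:
    """Test inputs just below, exactly on, and just above a threshold."""
    return [threshold - step, threshold, threshold + step]

def close_required(water_level_cm: int, threshold_cm: int = 300) -> bool:
    """Hypothetical unit under test: close when the level exceeds 3 meter."""
    return water_level_cm > threshold_cm

# Integer centimeters keep the boundary exact (no floating-point rounding).
for level in boundary_values(300, 1):
    print(level, close_required(level))
```

The middle value is the interesting one: a specification that only names "< 3 meter" and "> 3 meter" leaves the behaviour at exactly 3 meter undefined, and a boundary test forces that decision into the open.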
Representative testing is difficult
Integration testing
• Focus on
– Functional behaviour of chain of components
– Failure scenarios based on risk analysis
• Required coverage
– 100% coverage on input classes
• Probabilistic testing
• Execution:
– Fully automated scripts, running 24x7, speed times 10
– Creates 250Mb/hour of logs and measurement data
• Upon detection
– Each bug → Root-cause analysis
Redundancy is a nasty beast
• You do get functional
behaviour of your
entire system
• It is nearly impossible
to see if all your
components are
working correctly
System testing
• Focus on
– Functional behaviour
– Failure scenarios based on risk analysis
• Required coverage
– 100% complete environment (simulation)
– 100% coverage on input classes
• Execution:
– Fully automated scripts, running 24x7, speed times 10
– Creates 250Mb/hour of logs and measurement data
• Upon detection
– Each bug → Root-cause analysis
Acceptance testing
• Acceptance testing
1. Functional acceptance
2. Failure behaviour, all top 50 (FMECA) risks tested
3. A year of operational verification
• Execution:
– Tests performed on a working storm surge barrier
– Creates 250Mb/hour of logs and measurement data
• Upon detection
– Each bug → Root-cause analysis
Endurance testing
• Look for the “one in a
million times” problem
• Challenge:
– Software is deterministic
– execution is not (timing,
system load, bit-errors)
• Have an automated
script run it over and
over again
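How often "over and over again" has to be can be estimated. The sketch below assumes that runs are independent and that the problem shows up with a fixed probability per run; neither assumption is stated on the slide, and real timing or load effects may well be correlated. The function name is hypothetical.

```python
import math

def runs_needed(failure_probability: float, confidence: float = 0.95) -> int:
    """Runs required to see at least one failure with the given confidence.

    Solves 1 - (1 - p)^n >= confidence for n, assuming independent runs.
    """
    return math.ceil(math.log(1.0 - confidence)
                     / math.log1p(-failure_probability))

# A "one in a million" problem needs about three million runs
# to be observed at least once with 95% confidence.
print(runs_needed(1e-6))
```

This matches the usual rule of thumb of roughly 3/p runs for 95% confidence, and explains why such tests only work as fully automated scripts.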
GUI Acceptance testing
• Looking for
– quality in use for interactive
systems
– Understandability of the
GUI
• Structural investigation of
the performance of the
system-human interactions
• Looking for “abuse” by the
users
• Looking at real-life handling
of emergency operations
Avalanche testing
• To test the capabilities of
alarming and control
• Usually starts with one
simple trigger
• Generally followed by
millions of alarms
• Generally brings your
network and systems
to the breaking point
Crash and recovery procedure testing
• Validation of system
behaviour after massive
crash and restart
• Usually identifies many
issues about emergency
procedures
• Sometimes identifies issues
around power supply
• Usually identifies some
(combination of) systems
incapable of unattended
recovery...
Testing safety critical functions is
dangerous...
A risk analysis of testing
• There should always be
a way out of a test
procedure
• Some things are too
dangerous to test
• Some tests introduce
more risks than they
try to mitigate
Root-cause analysis
• A painful process, by
design
• Is extremely thorough
• Assumes that the error
found is a symptom of an
underlying collection of
(process) flaws
• Searches for the underlying
causes for the error, and
looks for possible similar
errors that might have
followed a similar path
Failed gates of a potential deadlock
TRENDS
What is the newest and hottest?
Model Driven Design
A real-life example
A root-cause analysis of this flaw
REALITY
What are the real-life challenges of a test manager of safety critical systems?
Testing in reality
It requires a specific breed of people
The fates of developers and
testers are linked to safety
critical systems for
eternity
Conclusions
• Stop reading newspapers
• Safety Critical Testing is a
lot of work, making sure
nothing happens
• Technically it isn’t that
much different, we’re just
more rigorous and use a
specific breed of
people...

More Related Content

What's hot

Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment isc2-hellenic
 
Kirkpatrick.paul
Kirkpatrick.paulKirkpatrick.paul
Kirkpatrick.paulNASAPMC
 
Enabling and Supporting the Debugging of Software Failures (PhD Defense)
Enabling and Supporting the Debugging of Software Failures (PhD Defense)Enabling and Supporting the Debugging of Software Failures (PhD Defense)
Enabling and Supporting the Debugging of Software Failures (PhD Defense)James Clause
 
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...Priyanka Aash
 
Csw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCsw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCanSecWest
 
Testing: ¿what, how, why?
Testing: ¿what, how, why?Testing: ¿what, how, why?
Testing: ¿what, how, why?David Rodenas
 
50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi
50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi
50 Shades of Fuzzing by Peter Hlavaty & Marco GrassiShakacon
 

What's hot (7)

Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment Building next gen malware behavioural analysis environment
Building next gen malware behavioural analysis environment
 
Kirkpatrick.paul
Kirkpatrick.paulKirkpatrick.paul
Kirkpatrick.paul
 
Enabling and Supporting the Debugging of Software Failures (PhD Defense)
Enabling and Supporting the Debugging of Software Failures (PhD Defense)Enabling and Supporting the Debugging of Software Failures (PhD Defense)
Enabling and Supporting the Debugging of Software Failures (PhD Defense)
 
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...
Over-the-Air: How we Remotely Compromised the Gateway, BCM, and Autopilot ECU...
 
Csw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgenerationCsw2016 d antoine_automatic_exploitgeneration
Csw2016 d antoine_automatic_exploitgeneration
 
Testing: ¿what, how, why?
Testing: ¿what, how, why?Testing: ¿what, how, why?
Testing: ¿what, how, why?
 
50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi
50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi
50 Shades of Fuzzing by Peter Hlavaty & Marco Grassi
 

Viewers also liked

61508 Compliance of actuators and Life cycle considerations (Eng)
61508 Compliance of actuators and Life cycle considerations (Eng) 61508 Compliance of actuators and Life cycle considerations (Eng)
61508 Compliance of actuators and Life cycle considerations (Eng) ie-net ingenieursvereniging vzw
 
IEC 61508-3 SW Engineering
IEC 61508-3 SW EngineeringIEC 61508-3 SW Engineering
IEC 61508-3 SW EngineeringHongseok Lee
 
Functional Safety and Security: ICS Cyber Security is Part of Functional Safety
Functional Safety and Security: ICS Cyber Security is Part of Functional SafetyFunctional Safety and Security: ICS Cyber Security is Part of Functional Safety
Functional Safety and Security: ICS Cyber Security is Part of Functional SafetyWalt Boyes
 
ISO26262-6 Software development process (Ver 3.0)
ISO26262-6 Software development process (Ver 3.0)ISO26262-6 Software development process (Ver 3.0)
ISO26262-6 Software development process (Ver 3.0)Hongseok Lee
 

Viewers also liked (6)

61508 Compliance of actuators and Life cycle considerations (Eng)
61508 Compliance of actuators and Life cycle considerations (Eng) 61508 Compliance of actuators and Life cycle considerations (Eng)
61508 Compliance of actuators and Life cycle considerations (Eng)
 
IEC 61508-3 SW Engineering
IEC 61508-3 SW EngineeringIEC 61508-3 SW Engineering
IEC 61508-3 SW Engineering
 
Functional safety standards_for_machinery
Functional safety standards_for_machineryFunctional safety standards_for_machinery
Functional safety standards_for_machinery
 
Iec61508 guide
Iec61508 guideIec61508 guide
Iec61508 guide
 
Functional Safety and Security: ICS Cyber Security is Part of Functional Safety
Functional Safety and Security: ICS Cyber Security is Part of Functional SafetyFunctional Safety and Security: ICS Cyber Security is Part of Functional Safety
Functional Safety and Security: ICS Cyber Security is Part of Functional Safety
 
ISO26262-6 Software development process (Ver 3.0)
ISO26262-6 Software development process (Ver 3.0)ISO26262-6 Software development process (Ver 3.0)
ISO26262-6 Software development process (Ver 3.0)
 

Similar to Testing safety critical systems: Practice and Theory (14-05-2013, VU Amsterdam)

2016-04-28 - VU Amsterdam - testing safety critical systems
2016-04-28 - VU Amsterdam - testing safety critical systems2016-04-28 - VU Amsterdam - testing safety critical systems
2016-04-28 - VU Amsterdam - testing safety critical systemsJaap van Ekris
 
2017 03-10 - vu amsterdam - testing safety critical systems
2017 03-10 - vu amsterdam - testing safety critical systems2017 03-10 - vu amsterdam - testing safety critical systems
2017 03-10 - vu amsterdam - testing safety critical systemsJaap van Ekris
 
Understanding container security
Understanding container securityUnderstanding container security
Understanding container securityJohn Kinsella
 
Risk management and business protection with Coding Standardization & Static ...
Risk management and business protection with Coding Standardization & Static ...Risk management and business protection with Coding Standardization & Static ...
Risk management and business protection with Coding Standardization & Static ...Itris Automation Square
 
Safety and security in distributed systems
Safety and security in distributed systemsSafety and security in distributed systems
Safety and security in distributed systemsEinar Landre
 
Safety and security in distributed systems
Safety and security in distributed systems Safety and security in distributed systems
Safety and security in distributed systems Einar Landre
 
IANS information security forum 2019 summary
IANS information security forum 2019 summaryIANS information security forum 2019 summary
IANS information security forum 2019 summaryKarun Chennuri
 
An In-depth look at application containers
An In-depth look at application containersAn In-depth look at application containers
An In-depth look at application containersJohn Kinsella
 
how-to-bypass-AM-PPL
how-to-bypass-AM-PPLhow-to-bypass-AM-PPL
how-to-bypass-AM-PPLnitinscribd
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Securitysedukull
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical SystemsTest Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical SystemsLionel Briand
 
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015Adi Bolboaca
 
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly Davidoff
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly DavidoffDevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly Davidoff
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly DavidoffDevSecCon
 
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...Mozaic Works
 
Why Kubernetes Freedom Requires Chaos Engineering to Shine in Production
Why Kubernetes Freedom Requires Chaos Engineering to Shine in ProductionWhy Kubernetes Freedom Requires Chaos Engineering to Shine in Production
Why Kubernetes Freedom Requires Chaos Engineering to Shine in ProductionScyllaDB
 
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...Lionel Briand
 
Software engineering
Software engineeringSoftware engineering
Software engineeringRohan Bhatkar
 

Similar to Testing safety critical systems: Practice and Theory (14-05-2013, VU Amsterdam) (20)

2016-04-28 - VU Amsterdam - testing safety critical systems
2016-04-28 - VU Amsterdam - testing safety critical systems2016-04-28 - VU Amsterdam - testing safety critical systems
2016-04-28 - VU Amsterdam - testing safety critical systems
 
2017 03-10 - vu amsterdam - testing safety critical systems
2017 03-10 - vu amsterdam - testing safety critical systems2017 03-10 - vu amsterdam - testing safety critical systems
2017 03-10 - vu amsterdam - testing safety critical systems
 
Understanding container security
Understanding container securityUnderstanding container security
Understanding container security
 
Risk management and business protection with Coding Standardization & Static ...
Risk management and business protection with Coding Standardization & Static ...Risk management and business protection with Coding Standardization & Static ...
Risk management and business protection with Coding Standardization & Static ...
 
Safety and security in distributed systems
Safety and security in distributed systemsSafety and security in distributed systems
Safety and security in distributed systems
 
Safety and security in distributed systems
Safety and security in distributed systems Safety and security in distributed systems
Safety and security in distributed systems
 
FTA.pptx
FTA.pptxFTA.pptx
FTA.pptx
 
IANS information security forum 2019 summary
IANS information security forum 2019 summaryIANS information security forum 2019 summary
IANS information security forum 2019 summary
 
An In-depth look at application containers
An In-depth look at application containersAn In-depth look at application containers
An In-depth look at application containers
 
how-to-bypass-AM-PPL
how-to-bypass-AM-PPLhow-to-bypass-AM-PPL
how-to-bypass-AM-PPL
 
Design For Testability
Design For TestabilityDesign For Testability
Design For Testability
 
Code Quality - Security
Code Quality - SecurityCode Quality - Security
Code Quality - Security
 
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical SystemsTest Case Prioritization for Acceptance Testing of Cyber Physical Systems
Test Case Prioritization for Acceptance Testing of Cyber Physical Systems
 
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015
Architecture for Disaster Resistant Systems @I TAKE Unconference 29 05 2015
 
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly Davidoff
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly DavidoffDevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly Davidoff
DevSecCon Tel Aviv 2018 - End2End containers SSDLC by Vitaly Davidoff
 
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...
Adi Bolboacă: Architecture For Disaster Resistant Systems at I T.A.K.E. Unco...
 
Why Kubernetes Freedom Requires Chaos Engineering to Shine in Production
Why Kubernetes Freedom Requires Chaos Engineering to Shine in ProductionWhy Kubernetes Freedom Requires Chaos Engineering to Shine in Production
Why Kubernetes Freedom Requires Chaos Engineering to Shine in Production
 
Software Security and IDS.pptx
Software Security and IDS.pptxSoftware Security and IDS.pptx
Software Security and IDS.pptx
 
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
Testing Dynamic Behavior in Executable Software Models - Making Cyber-physica...
 
Software engineering
Software engineeringSoftware engineering
Software engineering
 

More from Jaap van Ekris

2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants
2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants
2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giantsJaap van Ekris
 
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebbenJaap van Ekris
 
2018-11-08 risk and reslience festival
2018-11-08 risk and reslience festival2018-11-08 risk and reslience festival
2018-11-08 risk and reslience festivalJaap van Ekris
 
2015 10-08 Uitwijken, het hoe, waarom en de consequenties
2015 10-08 Uitwijken, het hoe, waarom en de consequenties2015 10-08 Uitwijken, het hoe, waarom en de consequenties
2015 10-08 Uitwijken, het hoe, waarom en de consequentiesJaap van Ekris
 
2016 11-15 - nvrb - software betrouwbaarheid
2016 11-15 - nvrb - software betrouwbaarheid2016 11-15 - nvrb - software betrouwbaarheid
2016 11-15 - nvrb - software betrouwbaarheidJaap van Ekris
 
2016-05-30 risk driven design
2016-05-30 risk driven design2016-05-30 risk driven design
2016-05-30 risk driven designJaap van Ekris
 
2016 02-15 - IASTED Innsbruck 2016 - the role and decompesition of delivery ...
2016 02-15 -  IASTED Innsbruck 2016 - the role and decompesition of delivery ...2016 02-15 -  IASTED Innsbruck 2016 - the role and decompesition of delivery ...
2016 02-15 - IASTED Innsbruck 2016 - the role and decompesition of delivery ...Jaap van Ekris
 
TOPAAS Versie 2.0, een praktische inleiding
TOPAAS Versie 2.0, een praktische inleidingTOPAAS Versie 2.0, een praktische inleiding
TOPAAS Versie 2.0, een praktische inleidingJaap van Ekris
 
Cloud Security (11-09-2012, (ISC)2 Secure Amsterdam)
Cloud Security (11-09-2012, (ISC)2 Secure Amsterdam)Cloud Security (11-09-2012, (ISC)2 Secure Amsterdam)
Cloud Security (11-09-2012, (ISC)2 Secure Amsterdam)Jaap van Ekris
 
What the hack happened to digi notar (28-10-2011)
What the hack happened to digi notar (28-10-2011)What the hack happened to digi notar (28-10-2011)
What the hack happened to digi notar (28-10-2011)Jaap van Ekris
 
Windows Phone 7 and the cloud, the good, the bad and the ugly (17-06-2011, SDN)
Windows Phone 7 and the cloud, the good, the bad and the ugly (17-06-2011, SDN)Windows Phone 7 and the cloud, the good, the bad and the ugly (17-06-2011, SDN)
Windows Phone 7 and the cloud, the good, the bad and the ugly (17-06-2011, SDN)Jaap van Ekris
 
2011-04-29 - Risk management conference - Technische IT risico's in de praktijk
2011-04-29 - Risk management conference - Technische IT risico's in de praktijk2011-04-29 - Risk management conference - Technische IT risico's in de praktijk
2011-04-29 - Risk management conference - Technische IT risico's in de praktijkJaap van Ekris
 
2011-03-12 - PDAtotaal Usergroup meeting - Ervaringen met Windows Phone 7 in ...
2011-03-12 - PDAtotaal Usergroup meeting - Ervaringen met Windows Phone 7 in ...2011-03-12 - PDAtotaal Usergroup meeting - Ervaringen met Windows Phone 7 in ...
2011-03-12 - PDAtotaal Usergroup meeting - Ervaringen met Windows Phone 7 in ...Jaap van Ekris
 
2010-09-21 - (ISC)2 - Protecting patient privacy while enabling medical re…
2010-09-21 - (ISC)2 - Protecting patient privacy while enabling medical re…2010-09-21 - (ISC)2 - Protecting patient privacy while enabling medical re…
2010-09-21 - (ISC)2 - Protecting patient privacy while enabling medical re…Jaap van Ekris
 
2010-04-17 - PDAtotaal Usergroup meeting - Introductie in Windows Phone 7
2010-04-17 - PDAtotaal Usergroup meeting - Introductie in Windows Phone 72010-04-17 - PDAtotaal Usergroup meeting - Introductie in Windows Phone 7
2010-04-17 - PDAtotaal Usergroup meeting - Introductie in Windows Phone 7Jaap van Ekris
 
2009-07-09 - DNV - Risico en betrouwbaarheid van ICT systemen
2009-07-09 - DNV - Risico en betrouwbaarheid van ICT systemen2009-07-09 - DNV - Risico en betrouwbaarheid van ICT systemen
2009-07-09 - DNV - Risico en betrouwbaarheid van ICT systemenJaap van Ekris
 
2009-02-18 - IASTED Innsbruck 2009 - Factors in project management influencin...
2009-02-18 - IASTED Innsbruck 2009 - Factors in project management influencin...2009-02-18 - IASTED Innsbruck 2009 - Factors in project management influencin...
2009-02-18 - IASTED Innsbruck 2009 - Factors in project management influencin...Jaap van Ekris
 
2009-02-12 - VU Amsterdam - Customer Satisfaction and dealing with customers ...
2009-02-12 - VU Amsterdam - Customer Satisfaction and dealing with customers ...2009-02-12 - VU Amsterdam - Customer Satisfaction and dealing with customers ...
2009-02-12 - VU Amsterdam - Customer Satisfaction and dealing with customers ...Jaap van Ekris
 
2008-10-09 - Bits and Chips Conference - Embedded Systemen Architecture patterns
2008-10-09 - Bits and Chips Conference - Embedded Systemen Architecture patterns2008-10-09 - Bits and Chips Conference - Embedded Systemen Architecture patterns
2008-10-09 - Bits and Chips Conference - Embedded Systemen Architecture patternsJaap van Ekris
 
2008-07-15 - (ISC)2 - Mobile Phone Security, you have to let go in order t…
2008-07-15 - (ISC)2 - Mobile Phone Security, you have to let go in order t…2008-07-15 - (ISC)2 - Mobile Phone Security, you have to let go in order t…
2008-07-15 - (ISC)2 - Mobile Phone Security, you have to let go in order t…Jaap van Ekris
 

More from Jaap van Ekris (20)

2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants
2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants
2021 08-28, QONFEST 2021 - Reliability cenetered maintenance for sleeping giants
 
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben
2020 09-08 - sdn - waarom klanten een hekel aan software ontwikkelaars hebben
 
2018-11-08 risk and reslience festival
2018-11-08 risk and reslience festival2018-11-08 risk and reslience festival
2018-11-08 risk and reslience festival
 
2015 10-08 Uitwijken, het hoe, waarom en de consequenties
2015 10-08 Uitwijken, het hoe, waarom en de consequenties2015 10-08 Uitwijken, het hoe, waarom en de consequenties
2015 10-08 Uitwijken, het hoe, waarom en de consequenties
 
2016 11-15 - nvrb - software betrouwbaarheid
2016 11-15 - nvrb - software betrouwbaarheid2016 11-15 - nvrb - software betrouwbaarheid
2016 11-15 - nvrb - software betrouwbaarheid
 
2016-05-30 risk driven design
2016-05-30 risk driven design2016-05-30 risk driven design
2016-05-30 risk driven design
 
2016 02-15 - IASTED Innsbruck 2016 - the role and decompesition of delivery ...


Testing safety critical systems: Practice and Theory (14-05-2013, VU Amsterdam)

  • 1. Testing Safety Critical Systems Theory and Experiences J.vanEkris@Delta-Pi.nl
  • 3. Agenda • The challenge • Process and Organization • System design • Verification Techniques • Trends • Reality
  • 4. THE CHALLENGE Why is testing safety critical systems so hard?
  • 5. Some people live on the edge… How would you feel if you were getting ready to launch and knew you were sitting on top of two million parts -- all built by the lowest bidder on a government contract. John Glenn
  • 7. We even accept loss... • Lost/misdirected luggage: Chance of failure 10^-2 per suitcase • Airplane: Chance of crash 10^-8 per flight hour • Storm Surge Barrier: Chance of failure 10^-7 per usage • Nuclear power plant: As Low As Reasonably Practicable (ALARP)
  • 8. Are the software risks acceptable?
  • 9. To put things in perspective… • Getting killed in traffic: 10^-2 per year • Having a drunk pilot: 10^-2 per flight • Hurting yourself when using a chainsaw: 10^-3 per use • Being considered possessed by Satan: 10^-4 per lifetime • Dating a supermodel: 10^-5 per lifetime • Drowning in a bathtub: 10^-7 per lifetime • Being hit by falling airplane parts: 10^-8 per lifetime • Being killed by lightning: 10^-9 per lifetime • Your house being hit by a meteor: 10^-15 per lifetime
  • 10. We might have become overprotective…
  • 11. Nonetheless software is dangerous...
  • 12. and the odds are against us… • Capers Jones: at least 2 high-severity errors per 10 KLOC • Industry consensus is that software will never be more reliable than – 10^-5 per usage – 10^-9 per operating hour
  • 13. The value of testing Program testing can be used to show the presence of bugs, but never to show their absence! Edsger W. Dijkstra
  • 14. PROCESS AND ORGANIZATION Who does what in safety critical software development?
  • 15. IEC 61508: Safety Integrity Level and acceptable risk
  • 17. IEC 61508: A process for safety critical functions
  • 18. Process or personal commitment? • Romans put the architect under the arches when removing the scaffolding • Boeing and Airbus put all lead-engineers on the first test-flight • Dijkstra put his “rekenmeisjes” on the opposite dock when launching ships
  • 19. It is about keeping your back straight… • Thomas Andrews, Jr. • Naval architect in charge of RMS Titanic • He recognized that regulations were insufficient for a ship the size of the Titanic • Decisions “forced upon him” by the client: – Limit the range of double hulls – Limit the number of lifeboats • He was on the maiden voyage to spot improvements • He knowingly went down with the ship, saving as many as he could
  • 20. SYSTEM DESIGN What do safety critical systems look like?
  • 21. An introduction to storm surge barriers…
  • 22. Design Principles • Keep it simple... • Risk analysis drives design (decisions) • Safety first (production later) • Fail-to-safe • There shall be no single source of (catastrophic) failure
  • 23. A simple design of a storm surge barrier • Relay (€10.00/piece) • Water detector (€17.50) • Design documentation (sponsored by Heineken)
  • 24. Risk analysis • Relay failure: Chance: small; Cause: aging; Effect: catastrophic • Water detector fails: Chance: huge; Causes: rust, driftwood, seagulls (eating, shitting); Effect: catastrophic • Measurement errors: Chance: colossal; Causes: waves, wind; Effect: false positive • Broken cable: Chance: medium; Causes: digging, seagulls; Effect: catastrophic
  • 27. Typical risks identified • Components making the wrong decisions • Power failure • Hardware failure of PLCs/servers • Network failure • Ship hitting water sensors • Human maintenance error
  • 28. Risk ≠ system crash • Wrongful functional behaviour • Data accuracy • Lack of response speed • Understandability of the GUI • Tolerance towards illogical inputs
  • 31. Risks can be external as well
  • 32. Eliminating risk isn’t the goal… No matter how good the environment analysis has been: • Some scenarios will be missed • Some scenarios are too expensive to prevent: – Accept risk – Communicate to stakeholders
  • 33. Risks can be contradictory… Availability of the service vs. safety of the installation
  • 34. Risk reality does change over time...
  • 35. 9/11... • Really tested our “test abortion” procedure • Introduced a fundamentally new risk to ATC systems • Changed the ATC system dramatically • Doubled our test cases overnight
  • 37. Stuurx::Functionality, initial global design [State diagram. States: Init, Wacht (wait), Start_D, W_O_D, Sluit_? (close). Labels: “Start” signal to diesels; water level < 3 meter; water level > 3 meter; “Diesels ready”; “Close Barrier”.]
  • 40. VERIFICATION What is getting tested, and how?
  • 41. The end is nigh...
  • 42. Challenge: time and resource limitations • 64 bits of input isn’t that uncommon • 2^64 is the global rice production in 1000 years, measured in individual grains • Fully testing all binary inputs on a 64-bit stimulus-response system takes 2 centuries
  • 43. Goals of testing safety critical systems • Verify contractually agreed functionality • Verify correct functional safety-behaviour • Verify safety-behaviour during degraded and failure conditions
  • 44. An example of safety critical components
  • 45. IEC 61508 SIL4: Required verification activities
  • 46. Design Validation and Verification • Peer reviews by – System architect – 2nd designer – Programmers – Test manager system testing • Fault Tree Analysis / Failure Mode and Effect Analysis • Performance modeling • Static Verification / Dynamic Simulation (by Twente University)
  • 47. Programming (in C/C++) • Coding standard: – Based on “Safer C”, by Les Hatton – May only use safe subset of the compiler – Verified by Lint and 5 other tools • Code is peer reviewed by 2nd developer • Certified and calibrated compiler
  • 48. Unit tests • Focus on conformance to specifications • Required coverage: 100% with respect to: – Code paths – Input equivalence classes • Boundary value analysis • Probabilistic testing • Execution: – Fully automated scripts, running 24x7 – Creates 100Mb/hour of logs and measurement data • Upon bug detection – 3 strikes is out → After 3 implementation errors it is built by another developer – 2 strikes is out → Need for a 2nd rebuild implies a redesign by another designer
  • 50. Integration testing • Focus on – Functional behaviour of chain of components – Failure scenarios based on risk analysis • Required coverage – 100% coverage on input classes • Probabilistic testing • Execution: – Fully automated scripts, running 24x7, speed times 10 – Creates 250Mb/hour of logs and measurement data • Upon detection – Each bug → Root-cause analysis
  • 51. Redundancy is a nasty beast • You do get functional behaviour of your entire system • It is nearly impossible to see if all your components are working correctly
  • 52. System testing • Focus on – Functional behaviour – Failure scenarios based on risk analysis • Required coverage – 100% complete environment (simulation) – 100% coverage on input classes • Execution: – Fully automated scripts, running 24x7, speed times 10 – Creates 250Mb/hour of logs and measurement data • Upon detection – Each bug → Root-cause analysis
  • 53. Acceptance testing • Acceptance testing 1. Functional acceptance 2. Failure behaviour, all top 50 (FMECA) risks tested 3. A year of operational verification • Execution: – Tests performed on a working storm surge barrier – Creates 250Mb/hour of logs and measurement data • Upon detection – Each bug → Root-cause analysis
  • 54. Endurance testing • Look for the “one in a million times” problem • Challenge: – Software is deterministic – execution is not (timing, system load, bit-errors) • Have an automated script run it over and over again
  • 55. GUI Acceptance testing • Looking for – quality in use for interactive systems – Understandability of the GUI • Structural investigation of the performance of the system-human interactions • Looking for “abuse” by the users • Looking at real-life handling of emergency operations
  • 56. Avalanche testing • To test the capabilities of alarming and control • Usually starts with one simple trigger • Generally followed by millions of alarms • Generally brings your network and systems to the breaking point
  • 57. Crash and recovery procedure testing • Validation of system behaviour after massive crash and restart • Usually identifies many issues about emergency procedures • Sometimes identifies issues around power supply • Usually identifies some (combination of) systems incapable of unattended recovery...
  • 58. Testing safety critical functions is dangerous...
  • 59. A risk analysis to testing • There should always be a way out of a test procedure • Some things are too dangerous to test • Some tests introduce more risks than they try to mitigate
  • 60. Root-cause analysis • A painful process, by design • Is extremely thorough • Assumes that the error found is a symptom of an underlying collection of (process) flaws • Searches for the underlying causes for the error, and looks for possible similar errors that might have followed a similar path
  • 61. Failed gates of a potential deadlock
  • 62. TRENDS What is the newest and hottest?
  • 65. A root-cause analysis of this flaw
  • 66. REALITY What are the real-life challenges of a test manager of safety critical systems?
  • 68. It requires a specific breed of people: the fates of developers and testers are linked to safety critical systems into eternity
  • 69. Conclusions • Stop reading newspapers • Safety Critical Testing is a lot of work, making sure nothing happens • Technically it isn’t that much different, we’re just more rigorous and use a specific breed of people...
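
The estimate on slide 42 can be checked with a few lines of arithmetic. The sketch below is an illustration only: the throughput of 3 billion test vectors per second is an assumed figure (the slides do not state one), picked because it gives the same order of magnitude as the "2 centuries" on the slide. The function name is likewise hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600  # length of a Julian year in seconds


def exhaustive_test_years(input_bits: int, tests_per_second: float) -> float:
    """Years needed to apply every possible input pattern exactly once."""
    patterns = 2 ** input_bits  # number of distinct binary inputs
    return patterns / tests_per_second / SECONDS_PER_YEAR


# Assumed throughput: 3e9 tests per second (hypothetical, not from the slides).
years = exhaustive_test_years(64, 3e9)
print(f"{years:.0f} years")
```

At that assumed rate the result is a little under 200 years, and every additional input bit doubles it, which is why the slides fall back on equivalence classes, boundary values and probabilistic testing instead of exhaustive testing.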

Editor's Notes

  1. Copyright CIBIT Adviseurs|Opleiders 2005. Jaap van Ekris, safety-critical systems. Field of work: nuclear power plants, air traffic control, storm surge barriers. Errors cost many lives.
  2. Glenn's advantage was that it only had to work once... And those were the sixties (when that was still possible), and astronauts still had guts. Source: http://www.historicwings.com/features98/mercury/seven-left-bottom.html
  3. When I started my career, my mentor told me: “From now on, your goal is to stay off the front page of the newspapers.” I can tell you it is hard, but so far I’ve succeeded.
  4. Please note that these failure rates include electromechanical failure as well! Electrocution by a light switch: chance of 10^-5 per usage, which is exactly the chance of dating a supermodel as well. 25 April 2013
  5. Please note that pilots have a redundant counterpart…
  7. But we are still (unknowingly) living in that world...
  8. Glenn's advantage was that it only had to work once... Source: http://www.historicwings.com/features98/mercury/seven-left-bottom.html
  9. FTA and FMEA are opposites, and good cross-checks on each other (NASA). Although NASA does not have a flawless track record…
  10. Aqueduct of Segovia, Iberian Peninsula. Built by the Romans between AD 50 and AD 125. The architects over-engineered it so heavily that it is still standing 2000 years later. People have to realize that the commitment of people to get it right the first time is essential. At Eurocontrol, we mentioned a projected death toll on every bug.
  11. Goal: may occur only once every 10,000 years
  12. You start with your primary concern. The process is simple: you chop up your problem until you have those 2 million parts and you know what each component contributes. You take the most important 10, or 100, and take targeted measures.
  13. Tickles security: hard on the outside, soft as butter on the inside
  14. The perfect “single point of failure”
  15. Once we take deadlocks and redundancy into account, our picture looks like this: not really simple anymore…
  16. There is a bug in this one: this code is NOT fail-safe because it has a potential catastrophic deadlock (when the Diesels don’t report Ready)...
  17. Please be reminded: the presented code has a deadlock!
  18. FTA and FMEA are opposites, and good cross-checks on each other (NASA). Although NASA does not have a flawless track record…
  19. Do you know the difference between validation and verification? Validation = meets external expectations, does what it is supposed to do. Verification = meets internal expectations, conforming to specs.
  20. Funny example: printing screen....
  21. Most beautiful example: UPSes using too much power to charge, killing all fuses... Current example: found out that the identity management server was a single point of failure... Eurocontrol example: the control unit wasn’t ready for the CWPs, and after that got overloaded.
  22. FTA and FMEA are opposites, and good cross-checks on each other (NASA). Although NASA does not have a flawless track record…
  23. FTA and FMEA are opposites, and good cross-checks on each other (NASA). Although NASA does not have a flawless track record…
  24. This is functional nonsense: DirMsgResponse is sent to the output, no matter what.
  25. FTA and FMEA are opposites, and good cross-checks on each other (NASA). Although NASA does not have a flawless track record…
  26. Our successes are unknown, our failures make the headlines… When a system fails in production, it is actual blood on our hands. At Eurocontrol, each bug had a body count attached to it...
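
Notes 16 and 17 point out that the initial control design can deadlock when the diesels never report ready. The sketch below shows one common way to remove that kind of deadlock: a timeout on the waiting state. It is an illustration under assumptions, not the design from the slides. The mapping of state names to transitions is a reading of the slide 37 labels, and the `ESCALATE` state, the 60-second timeout and the `step` function are hypothetical.

```python
from enum import Enum, auto


class State(Enum):
    WAIT = auto()             # "Wacht": idle, monitoring the water level
    START_DIESELS = auto()    # "Start_D": send the "Start" signal to the diesels
    WAIT_ON_DIESELS = auto()  # "W_O_D": wait for "Diesels ready"
    CLOSE = auto()            # "Sluit": issue "Close Barrier"
    ESCALATE = auto()         # hypothetical extra state, not in the slides


CLOSE_LEVEL_M = 3.0      # water level threshold shown on slide 37
DIESEL_TIMEOUT_S = 60.0  # assumed value, for illustration only


def step(state, water_level_m, diesels_ready, seconds_in_state):
    """Return the next state for one control cycle."""
    if state is State.WAIT:
        if water_level_m > CLOSE_LEVEL_M:
            return State.START_DIESELS
        return State.WAIT
    if state is State.START_DIESELS:
        return State.WAIT_ON_DIESELS
    if state is State.WAIT_ON_DIESELS:
        if diesels_ready:
            return State.CLOSE
        # Without this branch the machine waits forever: the deadlock
        # described in note 16.
        if seconds_in_state >= DIESEL_TIMEOUT_S:
            return State.ESCALATE
        return State.WAIT_ON_DIESELS
    return state  # CLOSE and ESCALATE are terminal in this sketch
```

What `ESCALATE` should actually do (alarm an operator, switch to a backup power source, or something else) is a risk-analysis decision that the slides do not specify; the point of the sketch is only that every waiting state needs a guaranteed way out.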