Safety and security in distributed systems
Einar Landre
Statoil
Context
hazardous industry
Industries with the potential to injure or kill people, or to do serious damage to the environment
Require high-integrity systems and safety management processes to ensure safety
high integrity systems
Systems where failure could lead to an accident and for which high reliability is claimed
- Pressure boundaries
- Oil & Gas wells
- Boilers
- Instrumentation & Control Systems
- Emergency shutdown
- Fire and gas leak detection
- Life supporting devices
- Pacemakers
- Infusion pumps
system criticality
Non-Critical: Useful system
- Low dependability
- System does not need to be trusted
Business-Critical: High Availability
- Focus on costs of failure caused by system downtime, cost of spares, repair equipment and personnel, and warranty claims
Mission-Critical: High Reliability
- Increase the probability of failure-free system operation over a specified time, in a given environment, for a given purpose
Safety-Critical: High Safety & Integrity Level
- High reliability
- High availability
- High security
- Focus is not on cost, but on preserving life and nature
safeguarding integrity
Risk- and threat-based approach
Things
things anno 1995
Troll A, 472 meters: the largest man-made “thing” ever moved
Software was an alien concept
things anno 2015
Asgard subsea compression runs on software
Size = a football field
things anno 2025
The subsea factory will be an Internet of Everything
networked everythings
Fallacies of distributed computing:
1. The network is reliable
2. Latency is zero
3. Bandwidth is infinite
4. The network is secure
5. Topology doesn’t change
6. There is one administrator
7. Transport cost is zero
A distributed system is one in which the failure of a computer you didn’t even know existed can render your own computer unusable.
Leslie Lamport
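To make the first fallacy concrete, here is a minimal sketch in Python of a sensor read that does not assume the network is reliable: an explicit timeout, bounded retries, and a forced decision about the degraded case. The host name, port, and wire protocol are hypothetical placeholders.

import socket

def read_with_retry(host, port, retries=3, timeout_s=2.0):
    # Bounded retries with an explicit timeout: the network can and
    # will fail, so failure is a normal return value, not a surprise.
    for _ in range(retries):
        try:
            with socket.create_connection((host, port), timeout=timeout_s) as s:
                s.sendall(b"READ\n")
                return s.recv(1024)
        except OSError:
            continue  # transient fault: retry rather than assume delivery
    return None  # the caller must handle the degraded case explicitly

reading = read_with_retry("sensor-gateway.local", 5020)
if reading is None:
    print("sensor unavailable - fall back to a safe state")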
Software
software is ubiquitous
Defines the behaviour of
1. Mobile devices
2. Medical devices
3. Computer Networks
4. Industrial control systems
5. Supply chains and logistics
6. Robots, cars & aircraft
7. Human-Machine Interfaces
Institutionalizes our insights and knowledge
before software
Tangible control logic
• Design level
• Implementation level
• Verification & test level
No cyber threats
• Intrusion
• Viruses
• Theft
• Identity
two unique properties
Inspection & Test
• Software can’t be inspected and tested the way analog components can be
CPU – the single point of failure
• All signals are threaded through one single element
• Execution sequence is unknown
• The same defect is replicated systematically across multiple instances
Impacts how we must manage software for critical systems
some specific challenges
Common mode failure
Malware, Viruses and Hacking
Human Factors
Blurred boundaries
Identity management
common mode failure
“results from an event which, because of dependencies, causes a coincidence of failure states of components in two or more separate channels of a redundancy system, leading to the defined system failing to perform its intended function”.
Ariane 5 test launch, 1996
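The Ariane 5 loss is the canonical example: an unguarded conversion of a 64-bit floating-point value to a 16-bit signed integer overflowed, and because both redundant channels ran identical software, both failed the same way. A minimal sketch of the failure pattern, in Python rather than the original Ada (the 40000.0 is an illustrative value, not flight data):

INT16_MIN, INT16_MAX = -32768, 32767

def to_int16_unguarded(value):
    # Mimics an unprotected float-to-int16 conversion: out-of-range
    # input raises, which is what shut down both redundant channels.
    result = int(value)
    if not INT16_MIN <= result <= INT16_MAX:
        raise OverflowError("value does not fit in a signed 16-bit integer")
    return result

def to_int16_saturating(value):
    # A defensive alternative: clamp to the representable range.
    return max(INT16_MIN, min(INT16_MAX, int(value)))

# Ariane 5's higher horizontal velocity produced values the inherited
# Ariane 4 code had never seen in service.
print(to_int16_saturating(40000.0))  # 32767: degraded but still running
print(to_int16_unguarded(40000.0))   # raises, as both channels did

The point is not the conversion itself but the dependency: identical software in every redundant channel turns one latent defect into a common mode failure.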
malware, viruses and hacking
Motivated by financial, political, criminal or idealistic interests
Software created to cause harm
• Change of system behaviour
• Steal / destroy data or machines
Exploits weaknesses in
• Human character
• Technical designs
Horror stories:
• Stuxnet and the Iranian centrifuges (Siemens control system)
• Saudi Aramco hack of 35000 computers (Windows back office)
human factors
How to minimize the probability?
Mistakes occur everywhere
• Specification
• Design
• Implementation
• Deployment
• Operations
Humans make mistakes
• By commission
• By omission
• By carelessness
blurred boundaries
Conflicting interests and divergent situational understanding across disciplines and roles
Architects think and design in terms of hierarchy and layering
Programmers think and design in terms of threads of execution
Users need systems that work and solve real-world problems
Operations needs to get the job done
identity
How to ensure that a thing or person is the one they
claim to be?
What are the impacts on
- Security
- Safety
- Integrity
- Availability
- Reliability
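One building block for answering that question is challenge-response authentication: the verifier checks that a device holds a secret without the secret ever crossing the network. A minimal sketch using HMAC; the names and the shared-key scheme are illustrative, and a real deployment would typically use per-device credentials such as certificates with mutual TLS rather than one shared key.

import hmac, hashlib, os

SHARED_KEY = b"provisioned-at-manufacture"  # hypothetical device secret

def device_respond(challenge, key=SHARED_KEY):
    # The device proves it knows the key by computing a keyed MAC
    # over the verifier's challenge.
    return hmac.new(key, challenge, hashlib.sha256).digest()

def verifier_check(challenge, response, key=SHARED_KEY):
    expected = hmac.new(key, challenge, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)  # constant-time compare

challenge = os.urandom(32)  # a fresh random nonce defeats replayed responses
assert verifier_check(challenge, device_respond(challenge))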
Tools
systems engineering
Architecture centric
• Design
• Implementation
• Deployment
• Usage
Risk based
• Requirements
• Design
• Implementation
• Commissioning
• Usage
Holistic: remember higher-order effects
safety & security architecture
Separation and protection of critical functions
human factors
The human brain: the planet’s most sophisticated and most vulnerable decision maker
• Emotions trump facts (irrationality)
• Limited processing capacity
• Need to rest, easily bored
• Inconsistency across exemplars
• Creative, easily distracted
• Values (ethics and morals)
• Mental illness
Address our inherent weaknesses from day one
• I have to make frequent decisions, and many of them depend upon readings from sensors that can be correct, noisy, random, unavailable, or in some other state.
• The decisions I have to make often have safety consequences, they certainly have economic consequences, and some are irreversible.
• At any point in time there may be three or four actions I could take, based on my sense of what’s happening on the rig.
• I would like better support to determine how trustworthy my readings are, what the possible situations are, and the consequences of each action.
What is the best action to take?
enhance human decision making
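One established pattern for the "how trustworthy are my readings" problem is 2-out-of-3 (2oo3) voting over redundant sensor channels, widely used in safety instrumented systems. A minimal sketch; the channel values and tolerance are illustrative:

from statistics import median

def vote_2oo3(readings, tolerance):
    # Return the median of three redundant channels, plus a flag saying
    # whether at least two channels agree with it within the tolerance.
    assert len(readings) == 3
    voted = median(readings)
    agreeing = sum(1 for r in readings if abs(r - voted) <= tolerance)
    return voted, agreeing >= 2

# Hypothetical wellhead pressure channels (bar); channel B has drifted.
value, trustworthy = vote_2oo3([212.4, 230.1, 212.9], tolerance=1.0)
print(value, trustworthy)  # 212.9 True: the drifted channel is outvoted

Note that voting only defends against independent channel faults; a common mode failure defeats it, which is why diversity between channels matters.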
use and adhere to standards
IEC 61508 Functional safety of electrical/electronic/programmable electronic safety-related systems
IEC 61511 Functional safety: Safety instrumented systems for the process industry sector
DO-178C Software considerations in airborne systems and equipment certification
The good thing about standards is that there are so many to choose from
Andrew S. Tanenbaum
Not sufficient on their own
Represent insights
Must be tailored to be useful
build & use safety (security) cases
Thanks to Professor Tim Kelly @ University of York
Summary
summary
Heading toward a world of interconnected every-things
Some of these things support hazardous industries and critical functions
Exposed to the inherent vulnerabilities in computers and software
Hazardous industries need high-integrity systems
Non-critical software practice fails for critical systems
Rigorous Systems Engineering, Safety & Security Architecture and Standards
Human factors must be addressed from day one
Through engineering, operations, and use
Safety and security in distributed systems
Einar Landre
Leader
E-mail einla@statoil.com
Tel: +4741470537
www.statoil.com
Thank you

Editor's Notes

  • #5 Macondo: a difficult well and reservoir, the latest and greatest technology. Human operators did not understand system messages and alarms. The focus was on making things work, and there was no trust in the IT systems. Fifty minutes passed from the first anomaly to the blow-out. False positives are probably one of the most important threats to humans' trust in technical systems: in a system with a high frequency of false-positive alarms, the real alarms will not be detected. Cancelling out false positives before they reach the human operator is one of the most vital HSE measures in complex systems.
  • #8 Remove the figure. End with questions: How do we protect the integrity level? How do we understand the threat picture?
  • #13 Peter Deutsch phrased the fallacies of distributed computing when he was at Sun Microsystems, back in 1994.
  • #16 For those who have seen Apollo 13: that scene is an exercise in programming an analog computer, bringing electronic circuits alive with switches.