Alan Tatourian
Intel Automotive
Highly-Dependable Automotive Software
Auto-ISAC 2018
Progress in Technology has been Astonishing
Every generation of technology has enabled remarkable outcomes
Apollo 11
2048 words RAM (16-bit word) ~4KB
36,864 words ROM
Average Smartphone
256MB – 512MB Cache
2GB – 64GB RAM
Next 10 to 20 years
???
45 years
62M x RAM
Cognitive Systems
???
???
• Design Goals
• Security Goals
• Advanced Design
• Summary
Agenda
I always talk about this to folks at Microsoft, especially to developers. What’s the most important operating system you’ll write
applications for? Ain’t Windows, or the Macintosh, or Linux. It’s Homo Sapiens Version 1.0. It shipped about a hundred thousand
years ago. There’s no upgrade in sight. But it’s the one that runs everything.
– Bill Buxton from Microsoft Research
Economic Utility
There is a concept in economics called
economic utility: it says that a feature's value
tends toward zero over time. As soon as you put
a feature (product) on a shelf, it starts to
depreciate.
The goal of any well-defined process
including SDL is ‘continuous improvement’.
Architecture Goals
1. The most obvious approach might be to imagine the future you want and build it.
Unfortunately, that doesn’t work that well because technology co-evolves with people.
It’s a two-step: technology pushes people to move forward, and then people move past
technology and it has to catch up. The way we see the future is constantly evolving, and
the path you take to get there matters. In technical terms we can call this ‘continuous
improvement.’
2. Establish modular and composable design making it possible to (1) use your system in
different (standardized) configurations and applications and (2) evolve it as the
requirements and technologies change.
3. Control (or manage) and reduce complexity!
Civilization advances by extending the number of important operations we can perform without thinking about them.
– Alfred North Whitehead
Complexity, Safety, Security . . .
• 3.22 trillion miles driven (US, 2016)
• 40,200 fatalities (US, 2016) – roughly 100 people each day
• 1 fatality per 80 million miles
• 1 in 625 lifetime chance of dying in a car crash
• The human error rate is approximately 0.000001% – this is what AI needs to improve on!
Source of the images: Stanford and Wikipedia
1993 Accident to Airbus A320-211 Aircraft in Warsaw
Question: How safe do autonomous vehicles need to be?
• As safe as human-driven cars (7 deaths per 10⁹ miles)
• As safe as buses and trains (0.1–0.4 deaths per 10⁹ miles)
• As safe as airplanes (0.07 deaths per 10⁹ miles)
I. Savage, “Comparing the fatality risks in United States transportation across modes and over time”, Research in
Transportation Economics, 2013
As the complexity of a system increases, the accuracy of any single agent's own model of
that system decreases rapidly.
Technical debt is runaway complexity. For example, if it takes you enormous effort and
money to upgrade your system, you have accumulated huge technical debt. Remember
that the value of your system is inversely proportional to its maintenance cost.
Dark debt is a form of technical debt that is invisible until it causes failures.
Dark debt is found in complex systems and the anomalies it generates are complex
system failures. Dark debt is not recognizable at the time of creation. … It arises from the
unforeseen interactions of hardware or software with other parts of the framework. …
Unlike technical debt, which can be detected and, in principle at least, corrected by
refactoring, dark debt surfaces through anomalies.
Technical & Dark Debt
Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.
– Antoine de Saint-Exupery
New challenges brought by AI
A single bit-flip error leads to the misclassification of an image by a DNN
From research by Karthik Pattabiraman
University of British Columbia
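The hazard can be illustrated at the bit level. The sketch below is our own illustration (not code from the cited research): flipping a single bit in the exponent field of an IEEE-754 float turns an ordinary network weight into infinity, which then propagates through every downstream layer.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Flips one bit of an IEEE-754 single-precision value. */
float flip_bit(float weight, int bit)
{
    uint32_t bits;
    memcpy(&bits, &weight, sizeof bits); /* reinterpret without undefined behavior */
    bits ^= (uint32_t)1 << bit;          /* the single upset event */
    memcpy(&weight, &bits, sizeof bits);
    return weight;
}
```

Flipping bit 30 of the weight 1.0f lands in the exponent: 0x3F800000 becomes 0x7F800000, which is positive infinity.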
• Design Goals
• Security Goals
• Vehicle architectures in the future: Software Defined
• Security, Functional Safety, Reliability
• Summary
Agenda
Information Security Goals
1. Secure boot
2. Secure auditing and logging
3. Authentication and authorization
4. Session Management
5. Input validation and output encoding
6. Exception management
7. Key management, cryptography, integrity, and availability
8. Security of data at rest
9. Security of data in motion
10. Configuration management
11. Incident response and patching
Together, these formulate the end-to-end security architecture for the product and thus should be considered alongside
one another, not in isolation. Each of the categories has many sub-topics within it. For example, under
authentication and authorization there are aspects of discretionary access controls and mandatory access controls to
consider. Security policies for the product are an outcome of the implementation decisions made during development
across these eleven categories.
We already know that a “control” strategy fails
worse than a “resilience” strategy.
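As a concrete instance of goal 1 (secure boot), each boot stage can verify the next stage's image against an expected digest before handing over control. This is a toy sketch under stated assumptions: FNV-1a stands in for a real cryptographic hash, and all names are ours; a production chain would use SHA-2 anchored in a hardware root of trust.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Computes a 32-bit FNV-1a digest of a memory region (toy stand-in
 * for a cryptographic hash -- NOT collision resistant). */
static uint32_t fnv1a(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;
    for (size_t i = 0; i < len; i++) {
        h = (h ^ data[i]) * 16777619u;
    }
    return h;
}

/* A boot stage transfers control only if the next stage's image
 * matches the digest provisioned at manufacturing time. */
int verify_next_stage(const uint8_t *image, size_t len, uint32_t expected)
{
    return fnv1a(image, len) == expected;
}
```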
Cyberattacks to CPS Control Layers
Control layer: regulatory control vs. supervisory control
Deception attacks
• Regulatory control: spoofing, replay, measurement substitution
• Supervisory control: set-point change, controller substitution
DoS attacks
• Regulatory control: physical jamming, increase in latency
• Supervisory control: network flooding, operational disruption
Estimation of CPS risks by naively aggregating risks due to reliability and security
failures does not capture the externalities,
and can lead to grossly suboptimal responses to CPS risks.
To thwart the outcomes that follow sentient opponent actions,
diversity of mechanism is required.
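A minimal countermeasure against the replay attacks listed above is a freshness check: each sender carries a monotonic message counter, and the receiver rejects anything not strictly newer than the last accepted message. The sketch below is illustrative (names are ours); real vehicle networks such as SecOC additionally authenticate the counter.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t last_seen; /* highest counter accepted so far */
} replay_guard_t;

/* Returns 1 if the message is fresh, 0 if it is replayed or stale. */
int accept_message(replay_guard_t *g, uint32_t counter)
{
    if (counter <= g->last_seen) {
        return 0; /* replayed or reordered: drop and raise a diagnostic */
    }
    g->last_seen = counter;
    return 1;
}
```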
The Honeymoon Effect
Design specifications miss important security details that appear only in code.
For most programmers it's hard enough to get the code into a state where the compiler
accepts it and interprets it correctly; worrying about making code human-readable is a
luxury.
The software industry needs to change its outlook from trying to achieve code perfection
to recognizing that code will always have security bugs.
[Figure: vulnerabilities per month vs. months since release. Current software engineering
literature supports the Brooks life-cycle model. Image taken from “Post-release reliability
growth in software products”, ACM Trans. Softw. Eng. Methodol., 2008.]
Cryptography ≠ Security
Whoever thinks his problem can be solved using cryptography, doesn’t understand his problem and doesn’t understand cryptography.
– Attributed by Roger Needham and Butler Lampson to each other
Cryptography rots, just like food. Every key and every algorithm has a shelf life. Some have a very short shelf life.
• How long do you need your cryptographic keys or algorithms to be secure? – this is cryptography shelf life (x years)
• How long will it take to extract secrets out of your system? – this is the end of honeymoon (z years)
• What are your parameters to reduce the attack surface and to update keys or algorithms? – this is ξ (pronounced “xi”)
If z < x + ξ, improve your architecture and infrastructure!
Cryptographic Agility
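The slide's inequality can be written directly as a check, using the same parameter names (x = required key lifetime, z = time to extract secrets, ξ = rotation window); the function itself is our illustration.

```c
#include <assert.h>

/* Returns 1 when z < x + xi, i.e. secrets can be extracted before the
 * required lifetime plus the key/algorithm rotation window has elapsed,
 * so the architecture and infrastructure need improvement. */
int needs_improvement(double x_years, double z_years, double xi_years)
{
    return z_years < x_years + xi_years;
}
```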
Anti-Virus and other security SW
On a recent software vulnerability watch list, about one-third of the reported software
vulnerabilities were in the security software itself.
The average time it takes to identify a cybersecurity incident is 197 days.
From DARPA High-Assurance Cyber Military Systems (HACMS) Proposer’s Day Brief.
1. Restrict all code to very simple control flow constructs: do not use goto statements, setjmp or longjmp constructs, or direct or indirect recursion.
2. Give all loops a fixed upper bound. It must be trivially possible for a checking tool to prove statically that the loop cannot exceed a preset upper bound on
the number of iterations. If a tool cannot prove the loop bound statically, the rule is considered violated.
3. Do not use dynamic memory allocation after initialization.
4. No function should be longer than what can be printed on a single sheet of paper in a standard format with one line per statement and one line per
declaration. Typically, this means no more than about 60 lines of code per function.
5. The code’s assertion density should average at least two assertions per function. Assertions must be used to check for anomalous conditions that
should never happen in real-life executions. Assertions must be side effect-free and should be defined as Boolean tests. When an assertion fails, an
explicit recovery action must be taken, such as returning an error condition to the caller of the function that executes the failing assertion. Any assertion for
which a static checking tool can prove that it can never fail or never hold violates this rule.
6. Declare all data objects at the smallest possible level of scope.
7. Each calling function must check the return value of non-void functions, and each called function must check the validity of all parameters provided by the
caller.
8. The use of the preprocessor must be limited to the inclusion of header files and simple macro definitions. Token pasting, variable argument lists (ellipses),
and recursive macro calls are not allowed. All macros must expand into complete syntactic units. The use of conditional compilation directives must be
kept to a minimum.
9. The use of pointers must be restricted. Specifically, no more than one level of dereferencing should be used. Pointer dereference operations may not be
hidden in macro definitions or inside typedef declarations. Function pointers are not permitted.
10.All code must be compiled, from the first day of development, with all compiler warnings enabled at the most pedantic setting available. All code must
compile without warnings. All code must also be checked daily with at least one, but preferably more than one, strong static source code analyzer and
should pass all analyses with zero warnings.
NASA’s Ten Principles of Safety-Critical Code
Gerard J Holzmann. The power of 10: rules for developing safety-critical code. Computer, 39(6):95–99, 2006.
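A small fragment can show several rules working together. This is our own sketch, not code from the paper; the names and the nonnegative-sample assumption are illustrative.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLES 64 /* rule 2: a preset bound a static checker can prove */

/* Sums n nonnegative samples; returns -1 on any invalid input.
 * Rule 7: the callee validates every parameter provided by the caller,
 * and the explicit error return is the recovery action of rule 5. */
int sum_samples(const int *samples, int n)
{
    if (samples == NULL || n < 0 || n > MAX_SAMPLES) {
        return -1;
    }
    int sum = 0;
    for (int i = 0; i < n; i++) { /* rule 2: statically bounded loop */
        if (samples[i] < 0) {
            return -1; /* rule 5: anomalous condition, explicit recovery */
        }
        sum += samples[i];
    }
    return sum;
}
```

Per rule 7, a caller must in turn check the -1 return rather than use the sum blindly.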
No single point of failure—this means that no component should be exclusively dependent
on the operation of another component. Service-oriented architectures and middleware
architectures often do not have a single point of failure.
Diagnosing the problems—the diagnostics of the system should be able to detect
malfunctioning of the components, so mechanisms like heartbeat synchronization should be
implemented. Layered architectures support the diagnostics functionality, as they allow us
to build two separate hierarchies—one for handling functionality and one for monitoring it.
Timeouts instead of deadlocks—when waiting for data from another component, the
component under operation should be able to abort its operation after a period of time
(timeout) and signal to the diagnostics that there was a problem in the communication.
Service-oriented architectures have built-in mechanisms for monitoring timeouts.
Reliability and Fault Tolerance
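The "timeouts instead of deadlocks" pattern can be sketched as a bounded wait that reports a fault instead of blocking forever. All names here are ours, and the attempt budget stands in for a wall-clock timeout to keep the sketch deterministic.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*poll_fn)(void *ctx, int *out); /* nonzero when data is ready */

enum { RECV_OK = 0, RECV_TIMEOUT = -1 };

/* Polls a data source at most max_attempts times; on timeout the caller
 * signals diagnostics that communication failed, rather than deadlocking. */
int recv_with_timeout(poll_fn poll, void *ctx, int *out, int max_attempts)
{
    for (int i = 0; i < max_attempts; i++) { /* bounded wait */
        if (poll(ctx, out)) {
            return RECV_OK;
        }
    }
    return RECV_TIMEOUT;
}

/* Test double: delivers the value 42 on the third poll. */
int flaky_source(void *ctx, int *out)
{
    int *calls = (int *)ctx;
    if (++(*calls) >= 3) { *out = 42; return 1; }
    return 0;
}
```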
Example: Dynamic heap memory allocation shall not be used.
In practice, this rule prohibits dynamic memory allocation for variables. The rationale behind the
rule is that dynamic memory allocation can lead to memory leaks, overflow errors, and failures
that occur randomly.
Defects related to memory leaks alone can be very difficult to trace and thus very costly. If
left in the code, memory leaks can cause nondeterministic behavior and software crashes.
These crashes might require a restart of the node, which is impossible during the runtime of a
safety-critical system.
Following this rule, however, also means that there is a limit on the size of the data structures that can be
used and that the system's memory needs are predetermined at design time, thus making the use
of this software “safer”.
Programming of Safety-Critical Systems
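The usual replacement for heap allocation is a fixed-size pool sized at design time. The fragment below is an illustrative sketch (names and sizes are ours): exhaustion becomes a bounded, predictable failure rather than a random runtime crash.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_SIZE 4 /* memory need is fixed at design time */

typedef struct {
    int in_use;
    char payload[32];
} message_t;

static message_t pool[POOL_SIZE]; /* statically allocated, no malloc() */

/* Returns a free slot, or NULL when the pool is exhausted. */
message_t *msg_alloc(void)
{
    for (int i = 0; i < POOL_SIZE; i++) {
        if (!pool[i].in_use) {
            pool[i].in_use = 1;
            return &pool[i];
        }
    }
    return NULL; /* predictable failure the caller must handle */
}

void msg_free(message_t *m)
{
    if (m != NULL) {
        m->in_use = 0;
    }
}
```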
• Design Goals
• Security Goals
• Advanced Design
• Summary
Agenda
Three Pillars of Autonomous Systems
Autonomous vehicles are a key example where
designers are challenged with the simultaneous
integration of three critical areas:
1. supercomputing complexity,
2. hard real-time embedded performance, and
3. functional safety.
The Four Pillars of CPS
The four key pillars driving cyber-physical systems are:
1. Connectivity,
2. Monitoring,
3. Prediction, and
4. Self-Optimization.
While the first two have experienced recent technological enablement, prediction
and optimization are expected to radically change every aspect of our society.
• Components associated with physical control of the vehicle
• Components associated with safety
• Components associated with entertainment and convenience
Ultra-Reliable Systems
Air Force F-15 flying despite the absence of one of its wings.
The image demonstrates why self-repairing flight control systems play a vital role in aircraft
control.
From The Story of Self-repairing Flight Control
Systems by James E. Tomayko
NASA photo (EC 88203-6) shows an Air Force F-15
flying despite the absence of one of the wings.
3-Dimensional Structure of Digital Security
Defense in Depth
Defense in Diversity
The 4 I's: Isolation, Inoperability, Incompatibility, Independence
But eventually everything fails. You have to make it fail in a predictable way.
Redundancy mechanisms: temporal redundancy, information redundancy, majority voting, self-healing.
Hardware Root of Trust (top to bottom):
• Software and services
• Hardware security services
• Hardware security building blocks
• Security features in the silicon
• Analog security monitoring under the CPU
A two-tier architecture is required!
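Majority voting over triple-redundant channels can be sketched as a bitwise 2-of-3 voter: each output bit takes the value reported by at least two of the three replicas, so any single faulty channel is masked. The fragment is our illustration of the standard technique.

```c
#include <assert.h>

/* Bitwise 2-of-3 majority vote: (a&b) | (a&c) | (b&c). */
unsigned majority3(unsigned a, unsigned b, unsigned c)
{
    return (a & b) | (a & c) | (b & c);
}
```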
Self-* and High Dependability
Self-healing is the ability of the system to autonomously change its structure so that its
behavior stays the same.
Self-adaptation is increasingly used in safety-critical systems, as it allows us
to change the operation of a component in the presence of errors and failures.
[Diagram: self-adaptation loop linking Self-Monitor, Self-Diagnosis, Fault Identification,
Candidate Fix Generation, Self-Testing, and Deployment, triggered by an anomalous event.]
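The self-healing stages named on this slide can be sketched as a simple state machine. The stage names come from the slide; the transition order is our assumption, with an anomalous event kicking off diagnosis and a deployed fix returning the system to monitoring with unchanged behavior.

```c
#include <assert.h>

typedef enum {
    MONITORING,
    DIAGNOSING,
    IDENTIFYING_FAULT,
    GENERATING_FIX,
    TESTING_FIX,
    DEPLOYING
} heal_state_t;

/* Advances the self-healing loop by one step. */
heal_state_t next_state(heal_state_t s, int anomaly_detected)
{
    switch (s) {
    case MONITORING:        return anomaly_detected ? DIAGNOSING : MONITORING;
    case DIAGNOSING:        return IDENTIFYING_FAULT;
    case IDENTIFYING_FAULT: return GENERATING_FIX;
    case GENERATING_FIX:    return TESTING_FIX;
    case TESTING_FIX:       return DEPLOYING;
    case DEPLOYING:         return MONITORING; /* behavior stays the same */
    }
    return MONITORING;
}
```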
• Design Goals
• Security Goals
• Advanced Design
• Summary
Agenda
Summary
1. Absolutely secure systems are impossible; with enough money and
commitment, any system can be broken
2. Assume your system is compromised and build it so that it can recover
3. Strive for continuous incremental improvement, not perfection
4. We do not know how to build 100% reliable systems; we only know how to
manage risk. Your system will fail, and your design has to ensure that it
fails in a predictable way.
Thank you.
Legal Disclaimer
This presentation contains the general insights and opinions of Intel Corporation (“Intel”). The information in this presentation is provided for
information only and is not to be relied upon for any other purpose than educational. Use at your own risk! Intel makes no representations or
warranties regarding the accuracy or completeness of the information in this presentation. Intel accepts no duty to update this presentation based on
more current information. Intel is not liable for any damages, direct or indirect, consequential or otherwise, that may arise, directly or indirectly, from
the use or misuse of the information in this presentation.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and/or other countries.
* Other names and brands may be claimed as the property of others.
© 2018 Intel Corporation.


Editor's Notes

  • #3 Every 30 years there is a new wave of things that computers do. Around 1950 they began to model events in the world (simulation), and around 1980 to connect people (communication). Since 2010 they have begun to engage with the physical world in a non-trivial way (embodiment – giving them bodies).
  • #5 https://tatourian.blog/2014/03/06/interview-with-bill-buxton-from-microsoft-research/ Your system architecture has to be adaptable and evolvable. Requirements and technologies change. You have to design your system for that change!
  • #6 If you have to kiss a lot of frogs to find a prince, find more frogs and kiss them faster and faster.
  • #7 Miles driven annually: all drivers: 10,658 miles (29.2 miles per day); rural drivers: 12,264 miles; urban drivers: 9,709 miles.
Fatalities: fatal crashes: 29,989; all fatalities: 32,675; car occupants: 12,507; SUV occupants: 8,320; pedestrians: 4,884; motorcyclists: 4,295; bicyclists: 720; large trucks: 587.
Each day 29 people in the United States die in an alcohol-impaired driving crash; that is one person every 49 minutes. On average since 1982, one-third of all traffic fatalities have been alcohol-impaired driving fatalities, with more than 10,400 people killed in 2016. Almost 40 percent of alcohol-impaired driving fatalities are victims other than the drinking driver. 214 children aged 14 years or younger were killed in alcohol-impaired driving crashes in 2016. Rural areas are disproportionately affected by alcohol-impaired driving crashes and fatalities. The total economic cost of alcohol-impaired driving crashes was $121.5 billion in 2010 (including medical costs, earnings losses, productivity losses, legal costs, and vehicle damage).
The more complex the system, the more potential anomalies hidden in the corners. While these anomalies may be rare, there are more than a billion car trips per day in the United States, greater than ten thousand times the number of daily airline flights. Google engineers speak about the “lazy driver”: the 93 percent of car accidents estimated to derive from human error. (Of course, human-factors specialists have long understood that human errors often are the result of poor system design and poor work practices.)
Example of interacting requirements: the 1993 accident to an Airbus A320-211 aircraft in Warsaw (wet runway, crosswind). The aircraft banked into the crosswind, so the left wheels touched down 9 seconds after the right. The pilot applied reverse thrust and spoilers, but they were disabled until the left gear compressed. Why?
Reverse thrust and spoilers must be disabled in the air. Landing logic requires compression of both left and right gear. Spoilers activate above 72 kts wheel speed or if both landing gear struts are compressed.
http://www.rvs.uni-bielefeld.de/publications/Incidents/DOCS/ComAndRep/Warsaw/warsaw-report.html
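The interacting requirements from the Warsaw report reduce to two predicates, which can be written out directly. This is a sketch: the function names are mine; the 72 kts threshold and the both-struts condition are from the report.

```c
#include <stdbool.h>

/* Simplified A320 braking-logic predicates from the Warsaw report. */

/* Ground spoilers activate above 72 kts wheel speed, or when BOTH
   landing-gear struts are compressed. */
bool ground_spoilers_enabled(double wheel_speed_kts,
                             bool left_gear_compressed,
                             bool right_gear_compressed) {
    return wheel_speed_kts > 72.0 ||
           (left_gear_compressed && right_gear_compressed);
}

/* Reverse thrust must be disabled in the air: the landing logic
   requires compression of both main-gear struts. */
bool reverse_thrust_enabled(bool left_gear_compressed,
                            bool right_gear_compressed) {
    return left_gear_compressed && right_gear_compressed;
}
```

During the 9 seconds when only one strut was compressed and the wheels on the wet runway stayed below 72 kts, both predicates are false, so braking was withheld exactly as each individual requirement specified; the failure emerged from their interaction.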
  • #8 https://www.johndcook.com/blog/2018/03/01/dark-debt
  • #9 Resilience and Security in Cyber-Physical Systems: Self-Driving Cars and Smart Devices Karthik Pattabiraman University of British Columbia 2017 https://youtu.be/O6NKY2oE99M This is a joint Microsoft/Nvidia research. The first half of the talk is entirely on functional safety and resilience of DNNs, the second describes invariant-based Intrusion Detection System. The future will be defined by autonomous computer systems that are tightly integrated with the environment, also known as Cyber-Physical systems (CPS). Resilience and security become extremely important in these systems, as a single error or security attack can have catastrophic consequences. In this talk, I will consider the resilience and security challenges of CPS, and how to protect them at low costs. I will give examples of two recent projects from my group, one on improving the resilience of Deep Neural Network (DNN) accelerators deployed in self-driving cars, and the other on deploying host-based intrusion detection systems on smart embedded devices such as smart meters and smart medical devices. Finally, I will discuss some of our ongoing work in this area, and the challenges and opportunities. This is joint work with my students and industry collaborators.
  • #12 In cybernetics and control theory, a setpoint (also set point, set-point) is the desired or target value for an essential variable, or process value of a system.[1] Departure of such a variable from its setpoint is one basis for error-controlled regulation using negative feedback for automatic control. [2]. The set point is usually abbreviated to SP, and the process value is usually abbreviated to PV.[3] https://en.wikipedia.org/wiki/Setpoint_(control_system)
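The setpoint/process-value relationship above can be illustrated with a minimal proportional negative-feedback regulator. This is a sketch: the struct, the gain value, and the accumulator plant in the usage note are illustrative, not from any particular control library.

```c
/* Minimal negative-feedback regulator: the control effort is
   proportional to the error between the setpoint (SP) and the
   process value (PV). */
typedef struct {
    double sp;   /* setpoint: desired value of the essential variable */
    double kp;   /* proportional gain */
} p_controller_t;

/* One regulation step: departure of PV from SP is the error, and the
   returned effort opposes it (negative feedback). */
double control_step(const p_controller_t *c, double pv) {
    double error = c->sp - pv;
    return c->kp * error;
}
```

Simulating a plant that simply accumulates the control effort (`pv += control_step(&c, pv)`) drives the process value toward the setpoint, because the correction always opposes the error and shrinks as the error shrinks.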
  • #13 Familiarity breeds contempt: the honeymoon effect and the role of legacy code in zero-day vulnerabilities https://www.semanticscholar.org/paper/Familiarity-breeds-contempt%3A-the-honeymoon-effect-Clark-Frei/1148f37a8ca0a5ca0a26178c7d85a063bd539725
  • #15 DARPA High-Assurance Cyber Military Systems (HACMS) Proposer’s Day Brief. The average time to identify a cybersecurity incident is 197 days, according to the 2018 Cost of a Data Breach Study from the Ponemon Institute, sponsored by IBM. Companies that contain a breach within 30 days have an advantage over their less-responsive peers, saving an average of $1 million in containment costs.
  • #16 Gerard J Holzmann. The power of 10: rules for developing safety-critical code. Computer, 39(6):95–99, 2006.
  • #20 Developers need to efficiently produce systems that meet safety and other key system-level requirements. This approach facilitates flexible and efficient integration of internal, 3rd party, and/or customer IP subsystems to support late design changes and potentially customer-specific technology/IP requirements.
  • #22 I sum up this model as design for security, ship, analyze, self-heal or quarantine, and treat (if required). Hackers can generally pivot faster than product-makers, so our approach must be anticipatory, flexible, and resilient. “I can see a world where we will have put hackers out of business.” – Simon Segars, CEO, Arm. From The Story of Self-Repairing Flight Control Systems.
  • #23 But eventually everything fails; you have to make it fail in a predictable way. Here there are two strong links and one weak link. In case of failure, the weak link will disintegrate before the two strong links fail and detonate the warhead. The two strong links are made using different architectures (incompatibility). We already know that a “control” strategy fails worse than a “resilience” strategy.
Temporal redundancy: read commands multiple times; use median voting.
Information redundancy: process values multiple times; store several copies in memory; use majority voting to schedule control commands.
Independence: design of subsystems to prevent common-mode and common-cause failures, such that the failure of one subsystem does not affect the failure of another subsystem.
Incompatibility: the use of energy or information that will not be duplicated inadvertently.
Isolation: the predictable separation of weapon elements from compatible energy.
Inoperability: the predictable inability of weapon elements to function.
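The temporal- and information-redundancy voting schemes above can be sketched for triple redundancy. The helper names are illustrative, and the sketch assumes at most one of the three values is corrupted.

```c
/* Temporal redundancy: read a command three times and take the median,
   so a single corrupted read cannot select a bad value. */
int median3(int a, int b, int c) {
    if ((a <= b && b <= c) || (c <= b && b <= a)) return b;
    if ((b <= a && a <= c) || (c <= a && a <= b)) return a;
    return c;
}

/* Information redundancy: keep three copies of a stored value and
   majority-vote; a single flipped copy is outvoted and scrubbed.
   (With more than one corrupted copy, no 3-copy vote can recover.) */
int majority3(int copies[3]) {
    int v = (copies[0] == copies[1] || copies[0] == copies[2])
                ? copies[0]
                : copies[1];
    copies[0] = copies[1] = copies[2] = v;   /* repair the bad copy */
    return v;
}
```

Median voting suits sampled commands where a glitch produces an outlier; majority voting suits stored values where a bit flip makes one copy disagree exactly.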
  • #24 Despite considerable work in fault tolerance and reliability, software remains notoriously buggy and crash-prone. The current approach to ensuring the security and availability of software consists of a mix of different techniques:
Proactive techniques seek to make the code as dependable as possible, through a combination of safe languages (e.g., Java [5]), libraries [6] and compilers [7, 8], code analysis tools and formal methods [9, 10, 11], and development methodologies.
Debugging techniques aim to make post-fault analysis and recovery as easy as possible for the programmer responsible for producing a fix.
Runtime protection techniques try to detect the fault using some type of fault isolation, such as StackGuard [12] and FormatGuard [13], which address specific types of faults or security vulnerabilities.
Containment techniques seek to minimize the scope of a successful exploit by isolating the process from the rest of the system, e.g., through use of virtual machine monitors such as VMware or Xen, system call sandboxes such as Systrace [14], or operating system constructs such as Unix chroot(), FreeBSD’s jail facility, and others [15, 16].
Byzantine fault-tolerance and quorum techniques rely on redundancy and diversity to create reliable systems out of unreliable components [17, 1, 18].
These approaches offer a poor tradeoff between assurance, reliability in the face of faults, and performance impact of protection mechanisms. In particular, software availability has emerged as a concern of equal importance as integrity.
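As one concrete instance of the runtime-protection category, a StackGuard-style guard word can be sketched as follows. The struct layout stands in for the compiler-placed stack canary, and the names and constant are illustrative, not StackGuard's actual mechanism.

```c
#include <stdint.h>
#include <string.h>

#define CANARY 0xDEADBEEFu

/* A guard word placed directly after the buffer, in the spirit of
   StackGuard; a real compiler places the canary next to the saved
   return address on the stack. */
struct guarded {
    char     buf[16];
    uint32_t canary;
};

/* Performs a deliberately unchecked copy into buf, then checks the
   guard word. Returns 1 if the copy stayed in bounds, 0 if it ran
   over and clobbered the canary. len must not exceed sizeof *g. */
int guarded_write(struct guarded *g, const char *src, size_t len) {
    g->canary = CANARY;
    memcpy(g, src, len);
    return g->canary == CANARY;
}
```

A real StackGuard canary is checked in the function epilogue, and a failed check aborts the process rather than returning into attacker-controlled state; the point is that the overrun is detected before the corrupted data is used.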