This document provides an overview of key topics in automotive software and security:
1. Cars now contain over 1 gigabyte of software code due to increasing automation, connectivity and data analytics capabilities.
2. As vehicles become more connected and automated, software complexity and security risks will continue growing substantially over the next 10-20 years.
3. Developing highly reliable and secure automotive software requires addressing challenges across computing, embedded systems, and functional safety.
2. SSG SecCon 2018
Progress in Technology has been Astonishing
Every generation of technology has enabled remarkable outcomes
Apollo 11
2048 words RAM (16-bit word) ~4KB
36,864 words ROM
Average Smartphone
256MB – 512MB Cache
2GB – 64GB RAM
Apollo 11 to the average smartphone: 45 years, 62M x RAM
Next 10 to 20 years: Cognitive Systems, ???
3. SSG SecCon 2018
Cars are the most software-intensive systems
in the universe
Source of the image: BlackBerry
For example, the amount of software in the 1990s was a few megabytes of binary code (e.g. Volvo S80), whereas today it
exceeds one gigabyte, excluding maps and other user data (e.g. Volvo XC90 of 2016).
4. SSG SecCon 2018
Automation, Connectivity, Analytics
Current State (Low Complexity) versus Future State (High Complexity):
Vehicle Connectivity: Limited but Expanding (Telematics, Infotainment) -> Fully Connected Environment (V2V, V2I, V2X)
Vehicle Automation: Developing/Immature (Partial/Semi-Autonomous) -> Pervasive/Highly Developed
Data Analytics: Focus on Vehicle Performance/Location -> Focus on Consumer Experience/Personal Data
Risk is increasing and will continue to grow
Source of the image: Auto-ISAC
Toward 500 Million Lines of Code!!!
100 Million Lines of Code
is Low Complexity!
6. SSG SecCon 2018
Vehicle Security Reference Architecture
[Figure: vehicle security reference architecture]
External links and devices: 4G/5G, DSRC, Smart Charging, Laptop, Tablet, Smart Phone
In-vehicle components: TCU, Connectivity Gateway (OTA), Head Unit, Display, SDC ECU, Instrument Cluster Display, Central Gateway, ADAS/AD ECU, Powertrain DC, Body DC, Chassis DC, EDR
Security functions:
• Secure off-board communication
• Secure on-board communication
• Secure boot, storage, cryptographic services
• Firewall
• Download Manager (OTA)
• Intrusion Detection & Prevention System (IDPS)
• Secure Monitoring & Logging
• Secure Synchronized Time Manager
7. SSG SecCon 2018
How safe do autonomous vehicles need to be?
• 3.22 trillion miles (US, 2016)
• 40,200 fatalities (US, 2016) – roughly 100 people each day
• 1 fatality per 80 million miles
• 1 in 625 chance of dying in car crash (in your lifetime)
• Human error is approximately 0.000001%; this is what AI needs to improve on!
Source of the images: Stanford
1993 Accident to Airbus A320-211 Aircraft in Warsaw
Question: How safe do autonomous vehicles need to be?
• As safe as human-driven cars (7 deaths every 10^9 miles)
• As safe as buses and trains (0.1-0.4 deaths every 10^9 miles)
• As safe as airplanes (0.07 deaths every 10^9 miles)
I. Savage, “Comparing the fatality risks in United States transportation across modes and over time”, Research in
Transportation Economics, 2013
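The headline figures on this slide follow from simple arithmetic on the two US 2016 totals. The sketch below reproduces them; the function names are hypothetical, and the reading of the 0.000001% figure as a per-mile fatality probability is an assumption, not something the slide states.

```python
# Totals quoted on the slide (US, 2016).
MILES_DRIVEN = 3.22e12   # miles driven
FATALITIES = 40_200      # road fatalities
DAYS_IN_2016 = 366       # 2016 was a leap year


def miles_per_fatality(miles, fatalities):
    """Average distance driven per fatality."""
    return miles / fatalities


def fatalities_per_day(fatalities, days):
    """Average number of fatalities per calendar day."""
    return fatalities / days


def fatality_percent_per_mile(miles, fatalities):
    """Probability of a fatality per mile driven, as a percentage (assumed reading)."""
    return 100.0 * fatalities / miles


if __name__ == "__main__":
    print(f"{miles_per_fatality(MILES_DRIVEN, FATALITIES) / 1e6:.1f} million miles per fatality")
    print(f"{fatalities_per_day(FATALITIES, DAYS_IN_2016):.0f} fatalities per day")
    print(f"{fatality_percent_per_mile(MILES_DRIVEN, FATALITIES):.7f} % per mile")
```

The result is about 80 million miles per fatality and roughly 110 fatalities per day, consistent with the rounded figures on the slide.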
8. SSG SecCon 2018
Information Security Goals
1. Secure boot
2. Secure auditing and logging
3. Authentication and authorization
4. Session Management
5. Input validation and output encoding
6. Exception management
7. Key management, cryptography, integrity, and availability
8. Security of data at rest
9. Security of data in motion
10. Configuration management
11. Incident response and patching
Together, these formulate the end-to-end security architecture for the product and thus should be considered alongside
one another, not in isolation. Also, each of the categories has many sub-topics within it. For example, under
authentication and authorization there are aspects of discretionary access controls and mandatory access controls to
consider. Security policies for the product are an outcome of the implementation decisions made during development
across these categories.
We already know that a “control” strategy fails
worse than a “resilience” strategy.
9. SSG SecCon 2018
Cyberattacks to CPS Control Layers
Control layers: Regulatory Control and Supervisory Control
Deception attacks
  Regulatory control: spoofing, replay; measurement substitution
  Supervisory control: set-point change; controller substitution
DoS attacks
  Regulatory control: physical jamming; increase in latency
  Supervisory control: network flooding; operational disruption
Estimation of CPS risks by naively aggregating risks due to reliability and security
failures does not capture the externalities, and can lead to grossly suboptimal
responses to CPS risks.
To thwart the outcomes that follow sentient opponent
actions, diversity of mechanism is required.
10. SSG SecCon 2018
Security Axioms
Design specifications miss important security details that appear only in code.
For most programmers it's hard enough to get the code into a state where the compiler
reads it and correctly interprets it; worrying about making human-readable code is a
luxury.
The software industry needs to change its outlook from trying to achieve code perfection
to recognizing that code will always have security bugs.
11. SSG SecCon 2018
Cryptography ≠ Security
Whoever thinks his problem can be solved using cryptography, doesn’t
understand his problem and doesn’t understand cryptography.
– Attributed by Roger Needham and Butler Lampson to each other
Cryptography rots, just like food. Every key and every algorithm has a shelf life. Some have a very short shelf life.
• How long do you need your cryptographic keys or algorithms to be secure? This is the cryptography shelf life (x years).
• How long will it take to extract secrets out of your system? This is the end of the honeymoon (z years).
• What are your parameters to reduce the attack surface and to update keys or algorithms? This is ξ (pronounced Xi).
If z < x + ξ, improve your architecture and infrastructure!
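The inequality can be checked mechanically once the three quantities are estimated. The sketch below is only an illustration: the function name is hypothetical, and treating ξ as a duration in years (the time needed to reduce the attack surface and roll out new keys or algorithms) is an assumption, since the slide does not give its unit.

```python
def needs_redesign(shelf_life_years, honeymoon_years, xi_years):
    """Return True when z < x + xi, i.e. secrets are expected to be
    extracted before the required protection period plus the update
    time (xi, assumed to be in years) has elapsed."""
    return honeymoon_years < shelf_life_years + xi_years


if __name__ == "__main__":
    # Keys must stay secure for 10 years, updates take 2 years,
    # but secrets are expected to leak after 5 years.
    print(needs_redesign(shelf_life_years=10, honeymoon_years=5, xi_years=2))
    # Keys must stay secure for 3 years, updates take 2 years,
    # and secrets are expected to hold for 8 years.
    print(needs_redesign(shelf_life_years=3, honeymoon_years=8, xi_years=2))
```

The first case signals that the architecture needs work; the second does not.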
[Figure: The Honeymoon Effect. Vulnerabilities per month (failure rate, 0 to 0.09) plotted against months since release (1 to 11).]
Cryptographic Agility
12. SSG SecCon 2018
Example: Dynamic heap memory allocation shall not be used.
This rule in practice prohibits dynamic memory allocation for variables. The rationale behind this
rule is that dynamic memory allocation can lead to memory leaks, overflow errors and failures
that occur randomly.
Defects related to memory leaks alone can be very difficult to trace and thus very costly. If
left in the code, memory leaks can cause nondeterministic behavior and crashes of the software.
These crashes might require a restart of the node, which is impossible during the runtime of a
safety-critical system.
Following this rule, however, also means that there is a limit on the size of the data structures that can be
used, and that the memory needs of the system are predetermined at design time, thus making the use
of this software “safer”.
Programming of Safety-Critical Systems
13. SSG SecCon 2018
1. Restrict all code to very simple control flow constructs; do not use goto statements, setjmp or longjmp constructs, or direct or indirect recursion.
2. Give all loops a fixed upper bound. It must be trivially possible for a checking tool to prove statically that the loop cannot exceed a preset upper bound on
the number of iterations. If a tool cannot prove the loop bound statically, the rule is considered violated.
3. Do not use dynamic memory allocation after initialization.
4. No function should be longer than what can be printed on a single sheet of paper in a standard format with one line per statement and one line per
declaration. Typically, this means no more than about 60 lines of code per function.
5. The code’s assertion density should average to minimally two assertions per function. Assertions must be used to check for anomalous conditions that
should never happen in real-life executions. Assertions must be side effect-free and should be defined as Boolean tests. When an assertion fails, an
explicit recovery action must be taken, such as returning an error condition to the caller of the function that executes the failing assertion. Any assertion for
which a static checking tool can prove that it can never fail or never hold violates this rule.
6. Declare all data objects at the smallest possible level of scope.
7. Each calling function must check the return value of non-void functions, and each called function must check the validity of all parameters provided by the
caller.
8. The use of the preprocessor must be limited to the inclusion of header files and simple macro definitions. Token pasting, variable argument lists (ellipses),
and recursive macro calls are not allowed. All macros must expand into complete syntactic units. The use of conditional compilation directives must be
kept to a minimum.
9. The use of pointers must be restricted. Specifically, no more than one level of dereferencing should be used. Pointer dereference operations may not be
hidden in macro definitions or inside typedef declarations. Function pointers are not permitted.
10. All code must be compiled, from the first day of development, with all compiler warnings enabled at the most pedantic setting available. All code must
compile without warnings. All code must also be checked daily with at least one, but preferably more than one, strong static source code analyzer and
should pass all analyses with zero warnings.
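The rules above target C, but the intent of rules 2 and 7 can be sketched in any language. The example below is a loose Python illustration, not part of the original rule set: the function, the constant `MAX_SAMPLES`, and the status codes are all hypothetical names. The loop has a fixed upper bound, the called function validates its parameters, and the caller checks the returned status.

```python
MAX_SAMPLES = 64      # fixed upper bound on loop iterations (rule 2)

OK = 0
ERR_BAD_ARGS = -1
ERR_NOT_FOUND = -2


def find_first_above(samples, threshold):
    """Return (status, index) of the first sample above threshold."""
    # Rule 7: the called function checks the validity of its parameters.
    if samples is None or len(samples) > MAX_SAMPLES:
        return ERR_BAD_ARGS, -1
    # Rule 2: the loop can never run more than MAX_SAMPLES times.
    for i in range(MAX_SAMPLES):
        if i >= len(samples):
            break
        if samples[i] > threshold:
            return OK, i
    return ERR_NOT_FOUND, -1


if __name__ == "__main__":
    # Rule 7: the caller checks the return value before using the result.
    status, index = find_first_above([1, 5, 9], threshold=4)
    if status == OK:
        print("first sample above threshold at index", index)
    else:
        print("lookup failed with status", status)
```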
NASA’s Ten Principles of Safety-Critical Code
14. SSG SecCon 2018
No single point of failure—this means that no component should be exclusively dependent
on the operation of another component. Service-oriented architectures and middleware
architectures often do not have a single point of failure.
Diagnosing the problems—the diagnostics of the system should be able to detect
malfunctioning of the components, so mechanisms like heartbeat synchronization should be
implemented. The layered architectures support the diagnostics functionality as they allow us
to build two separate hierarchies—one for handling functionality and one for monitoring it.
Timeouts instead of deadlocks—when waiting for data from another component, the
component under operation should be able to abort its operation after a period of time
(timeout) and signal to the diagnostics that there was a problem in the communication.
Service-oriented architectures have built-in mechanisms for monitoring timeouts.
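The "timeouts instead of deadlocks" principle can be sketched with a blocking receive that gives up after a fixed period and reports the problem. This is a minimal illustration using Python's standard `queue` module; the function name and the use of a plain list as the diagnostics channel are assumptions made for the example.

```python
import queue


def receive_with_timeout(inbox, timeout_s, diagnostics):
    """Wait for data from another component, but never wait forever.

    Returns the received item, or None after timeout_s seconds, in which
    case a message is appended to the diagnostics log.
    """
    try:
        return inbox.get(timeout=timeout_s)
    except queue.Empty:
        diagnostics.append("timeout waiting for data")
        return None


if __name__ == "__main__":
    inbox = queue.Queue()
    diagnostics = []

    # Nothing has been sent yet: the call aborts after 50 ms.
    print(receive_with_timeout(inbox, 0.05, diagnostics), diagnostics)

    # Once data is available, it is returned immediately.
    inbox.put({"wheel_speed_kts": 70})
    print(receive_with_timeout(inbox, 0.05, diagnostics))
```

Returning `None` on timeout keeps the sketch short; a real component would more likely use a distinct status value so that `None` can still be a valid payload.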
Reliability and Fault Tolerance
15. SSG SecCon 2018
Ultra-Reliable Systems
Air Force F-15 flying despite the absence of one of its wings.
The image demonstrates why self-repairing flight control systems play a vital role in aircraft
control.
16. SSG SecCon 2018
The Four Pillars of CPS
The four key pillars driving cyber-physical systems are:
1. Connectivity,
2. Monitoring,
3. Prediction, and
4. Self-Optimization.
While the first two have experienced recent technological enablement, prediction
and optimization are expected to radically change every aspect of our society.
Components associated with physical
control of the vehicle
Components associated with safety
Components associated with
entertainment and convenience
17. SSG SecCon 2018
Three Pillars of Autonomous Systems
Autonomous vehicles are a key example where
designers are challenged with the simultaneous
integration of three critical areas:
1. supercomputing complexity,
2. hard real-time embedded performance, and
3. functional safety.
18. SSG SecCon 2018
Next Gen Autonomous SoC Architecture
Right-Sized Computing
• Single & multi-threaded compute engines
• Differing access patterns, spatial/temporal
locality and performance requirements
Seamless Cache Coherency
• Uniform shared view of system memory
• Inter-processor communications lead to
network and protocol level deadlocks
Robust Architecture
• Dynamically changing workloads
• Handle changing use cases/SW needs
Comprehensive Safety, Reliability, and Security
• Highest level of fault tolerance
• Configurable ASIL target
• Built-in security
[Figure: block diagram of a next-generation autonomous SoC]
ASIL targets: ASIL-D, ASIL-C, ASIL-B, No ASIL Target
CPU Subsystem: Performance CPU, Power-saving CPU, Real-Time CPU, L2 caches
Vision Subsystem: GPU, Video Encode, Video Decode
Imaging Subsystem: Camera, Image Cognition, ISP
Accelerators: CNN Engines, Deep Learning / Accelerators, DSP IP
Interconnect: Cache Coherent Interconnect, SoC Non-coherent Interconnect, Inter-Chip Links
System Control: Security, Safety, Sensors, M/L BIST, Trace, Power, DMA, Timers
Wireless Subsystem: WiFi, GSM, LTE, 5G
Interfaces: PCIe, CAN-FD, GMAC, AVB, Ethernet, Trace, JTAG, Flash, DDR, Zipwire, UART, USB, CSI, I2C
19. SSG SecCon 2018
3-Dimensional Structure of Digital Security
Defense in Depth
Defense in Diversity
4 I's
Isolation
Inoperability
Incompatibility
Independence
But eventually everything fails. You have to make it fail in a predictable way.
Temporal Redundancy
Information Redundancy
Majority voting
Software and Services
Hardware security services
Hardware security building blocks
Security features in the silicon
Analog security monitoring under the CPU
Hardware Root of Trust
Self-Healing
20. SSG SecCon 2018
Self-* and High Dependability
Self-healing is the ability of the system to autonomously change its structure so that its
behavior stays the same.
Self-adaptation is increasingly used in safety-critical systems as it allows us
to change the operation of a component in the presence of errors and failures.
[Figure: self-healing cycle with the stages Self-Monitor, Self-Diagnosis, Self-Adaptation and Self-Testing, and the transitions Anomalous Event, Fault Identification, Candidate Fix Generation and Deployment]
Every 30 years there is a new wave of things that computers do. Around 1950 they began to model events in the world (simulation), and around 1980 to connect people (communication). Since 2010 they have begun to engage with the physical world in a non-trivial way (embodiment – giving them bodies).
http://blogs.blackberry.com/2016/12/ces-2017-holistic-security-for-the-software-defined-car/
Growing amount of software in contemporary cars: as innovation is driven by software, the amount of software and its complexity grow exponentially. For example, the amount of software in the 1990s was a few megabytes of binary code (e.g. Volvo S80), whereas today it exceeds one gigabyte, excluding maps and other user data (e.g. Volvo XC90 of 2016).
Miles
All drivers: 10,658 miles (29.2 miles per day)
Rural drivers: 12,264 miles
Urban drivers: 9,709 miles
Fatalities:
Fatal crashes: 29,989
All fatalities: 32,675
Car occupants: 12,507
SUV occupants: 8,320
Pedestrians: 4,884
Motorcycle: 4,295
Bicyclists: 720
Large trucks: 587
Each day 29 people in the United States die in an alcohol-impaired driving crash; that is one person every 49 minutes.
On average since 1982, one-third of all traffic fatalities were alcohol-impaired driving fatalities with more than 10,400 people killed in 2016.
Almost 40 percent of alcohol-impaired driving fatalities are victims other than the drinking driver.
214 children aged 14 years or younger were killed in alcohol-impaired driving crashes in 2016.
Rural areas are disproportionately affected by alcohol-impaired driving crashes and fatalities.
The total economic cost of alcohol-impaired driving crashes was $121.5 billion in 2010 (including medical costs, earnings losses, productivity losses, legal costs, and vehicle damage).
The more complex the system, the more potential anomalies hidden in the corners. While these anomalies may be rare, there are more than a billion car trips per day in the United States, greater than ten thousand times the number of daily airline flights.
Google engineers speak about the “lazy driver,” the 93 percent of car accidents estimated to derive from human error. (Of course, human-factors specialists have long understood that human errors often are the result of poor system design and poor work practices.)
Example of Interacting Requirements: 1993 Accident to Airbus A320-211 Aircraft in Warsaw
Wet runway, crosswind
Aircraft banked into crosswind
Left wheels touched down 9 seconds after right
Pilot applied reverse thrust and spoilers, but they were disabled until the left gear compressed
Why?
Reverse thrust and spoilers must be disabled in the air
Landing logic requires compression of both left and right gear
Spoilers activate above 72 kts wheel speed or if both landing gear struts are compressed
http://www.rvs.uni-bielefeld.de/publications/Incidents/DOCS/ComAndRep/Warsaw/warsaw-report.html
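The interaction between these requirements can be made concrete with a small sketch. It encodes only the conditions listed in the notes above; the function and parameter names are hypothetical, the real A320 logic has more inputs than shown here, and the wheel speed used in the scenario is an assumed value chosen to stay below the 72 kts threshold.

```python
SPOILER_WHEEL_SPEED_KTS = 72  # threshold quoted in the notes


def reverse_thrust_enabled(left_gear_compressed, right_gear_compressed):
    """Landing logic requires compression of both left and right gear."""
    return left_gear_compressed and right_gear_compressed


def spoilers_enabled(left_gear_compressed, right_gear_compressed, wheel_speed_kts):
    """Spoilers activate above 72 kts wheel speed or if both struts are compressed."""
    both_compressed = left_gear_compressed and right_gear_compressed
    return wheel_speed_kts > SPOILER_WHEEL_SPEED_KTS or both_compressed


if __name__ == "__main__":
    # Banked touchdown: right gear on the runway, left gear still unloaded,
    # wheel speed assumed to be below the threshold.
    print(reverse_thrust_enabled(False, True))                # False
    print(spoilers_enabled(False, True, wheel_speed_kts=60))  # False

    # Once the left gear compresses, both systems become available.
    print(reverse_thrust_enabled(True, True))                 # True
    print(spoilers_enabled(True, True, wheel_speed_kts=60))   # True
```

Each requirement is reasonable on its own (no reverse thrust or spoilers in the air), yet together they withhold braking aids in a one-wheel touchdown, which is the point of the example.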
In cybernetics and control theory, a setpoint (also set point, set-point) is the desired or target value for an essential variable, or process value of a system. Departure of such a variable from its setpoint is one basis for error-controlled regulation using negative feedback for automatic control. The set point is usually abbreviated to SP, and the process value is usually abbreviated to PV.
https://en.wikipedia.org/wiki/Setpoint_(control_system)
Gerard J Holzmann. The power of 10: rules for developing safety-critical code. Computer, 39(6):95–99, 2006.
I sum up this model as design for security, ship, analyze, self-heal or quarantine, and treat (if required).
Hackers, too, can generally pivot faster than product-makers, so our approach must be anticipatory, flexible and resilient.
I can see a world where we will have put hackers out of business.
– Simon Segars, CEO, Arm
Developers need to efficiently produce systems that meet safety and other key system-level requirements.
This approach facilitates flexible and efficient integration of internal, 3rd party, and/or customer IP subsystems to support late design changes and potentially customer-specific technology/IP requirements.
The speed of innovation is outpacing silicon design cycles, and what do you do about this challenge? The need is for adaptable chips.
But eventually everything fails. You have to make it fail in a predictable way. Here, there are two strong links and one weak link. In case of failure, the weak link will disintegrate before the two strong links fail and detonate the warhead. The two strong links are built using different architectures (incompatibility).
We already know that a “control” strategy fails worse than a “resilience” strategy.
Temporal Redundancy: Read commands multiple times, Use median voting
Information Redundancy: Process values multiple times, Store several copies in memory
Use majority voting to schedule control commands
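The two voting schemes mentioned above can be sketched in a few lines. This is an illustrative example only: the function names and sample values are hypothetical, and a strict majority (more than half of the votes) is assumed as the acceptance criterion.

```python
from collections import Counter
from statistics import median


def median_vote(readings):
    """Temporal redundancy: read a value several times and take the median,
    so that a single corrupted reading is discarded."""
    return median(readings)


def majority_vote(values):
    """Information redundancy: return the value held by a strict majority
    of the copies, or None when no value has a majority."""
    if not values:
        return None
    value, count = Counter(values).most_common(1)[0]
    return value if count * 2 > len(values) else None


if __name__ == "__main__":
    # One of three sensor readings is an outlier; the median ignores it.
    print(median_vote([10.1, 10.0, 55.0]))
    # Two of three stored copies of a command agree.
    print(majority_vote(["brake", "brake", "accelerate"]))
    # No agreement: the caller must fall back to a safe action.
    print(majority_vote(["brake", "coast", "accelerate"]))
```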
Independence – the design of subsystems to prevent common-mode and common-cause failures, such that the failure of one subsystem does not affect the failure of another subsystem
Incompatibility – the use of energy or information that will not be duplicated inadvertently
Isolation – the predictable separation of weapon elements from compatible energy
Inoperability – the predictable inability of weapon elements to function
Despite considerable work in fault tolerance and reliability, software remains notoriously buggy and crash-prone. The current approach to ensuring the security and availability of software consists of a mix of different techniques:
Proactive techniques seek to make the code as dependable as possible, through a combination of safe languages (e.g., Java [5]), libraries [6] and compilers [7, 8], code analysis tools and formal methods [9,10,11], and development methodologies.
Debugging techniques aim to make post-fault analysis and recovery as easy as possible for the programmer that is responsible for producing a fix.
Runtime protection techniques try to detect the fault using some type of fault isolation such as StackGuard [12] and FormatGuard [13], which address specific types of faults or security vulnerabilities.
Containment techniques seek to minimize the scope of a successful exploit by isolating the process from the rest of the system, e.g., through use of virtual machine monitors such as VMWare or Xen, system call sandboxes such as Systrace [14], or operating system constructs such as Unix chroot(), FreeBSD’s jail facility, and others [15, 16].
Byzantine fault-tolerance and quorum techniques rely on redundancy and diversity to create reliable systems out of unreliable components [17, 1, 18].
These approaches offer a poor tradeoff between assurance, reliability in the face of faults, and the performance impact of protection mechanisms. In particular, software availability has emerged as a concern of equal importance to integrity.