Software Verification & Validation (SVV)
Testing Autonomous Cars for Feature Interaction Failures
using Many-Objective Search
Raja Ben Abdessalem1, Annibale Panichella1,2, Shiva Nejati1, Lionel Briand1
1 University of Luxembourg
2 TU Delft
and Thomas Stifter, IEE Luxembourg
Autonomous Car Features
2
Automated Emergency Braking (AEB) Traffic Sign Recognition (TSR)
3
[Diagram] Sensor/Camera Data → Autonomous Feature → Actuator Command
(steering, acceleration, braking)
Undesired Feature Interactions (the Feature Interaction Problem)
4
[Diagram] Several autonomous features run in parallel, each mapping
Sensor/Camera Data to an Actuator Command (steering, acceleration,
braking). Their actuator commands conflict.
Problem Statement
5
How to automatically detect undesired feature
interactions in self-driving systems at early stages
of development?
(Early) Function Modeling
6
[Diagram] Several autonomous features, captured as executable function
models in Matlab/Simulink, each map Sensor/Camera Data to an Actuator
Command. An Integration Component combines their commands into a single
Actuator Command. Together, the features and the integration component
form the Software Under Test (SUT).
Testing Function Models
7
[Diagram] The SUT (features plus integration component) runs in a closed
loop with physics-based simulators modeling the (ego) car, other cars,
pedestrians, actuators, and the environment (road, weather, etc.).
Simulated sensors/cameras provide the sensor/camera data; the SUT's
actuator commands drive the simulators. Inputs and outputs are
time-stamped vectors.
A Search-based Testing
Approach to Detect Undesired
Feature Interactions among
Function Models of Self-Driving
Features
Our Test Objectives
• A combination of three test objectives:
• Coverage-based
• Failure-based
• Unsafe Overriding
9
Coverage-based Test Objective
10
[Diagram] SUT: features F1, F2, …, Fn feed the Integration component,
whose logic branches over conditions, e.g., if (condition) then apply F1.
Goal: Exercising as many branches of the integration component as possible
Failure-based Test Objective
11
Goal: Revealing violations of system-level requirements
Example:
- Req: No collision between pedestrians and cars
- Generating test cases that minimize the distance between the car and
the pedestrian
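The failure-based objective for the no-collision requirement can be sketched as a fitness function over the simulated trace (a minimal sketch; the trace format and function name are illustrative, not from the paper): the search minimizes the smallest car-to-pedestrian distance observed during the simulation.

```python
import math

def failure_fitness(car_trace, ped_trace):
    """Failure-based fitness for 'no collision between pedestrians and cars':
    the minimum car-pedestrian distance over the simulation.
    Lower values mean the test case is closer to revealing a violation."""
    return min(
        math.hypot(cx - px, cy - py)
        for (cx, cy), (px, py) in zip(car_trace, ped_trace)
    )

# A test case that brings the car closer to the pedestrian scores lower
# (better, from the search's point of view).
far = failure_fitness([(0, 0), (1, 0)], [(10, 0), (10, 0)])    # 9.0
near = failure_fitness([(0, 0), (8, 0)], [(10, 0), (10, 0)])   # 2.0
assert near < far
```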
Unsafe Overriding Test Objective
12
Goal: Finding failures that are more likely to be due to faults in the
integration component rather than faults in the features
Example (braking commands over seven time steps):

  Braking (F1):       0   .3   .3   .6   .8   1    1
  Braking (final):    0   .3   .2   .2   .3   .3   1
  Selected feature:   F1  F1   F2   F2   F3   F3   F1

Reward failures that could have been avoided if another feature had been
prioritized by the integration logic
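The avoidable-failure intuition behind the braking traces above can be sketched as follows (a hypothetical scoring function, not the paper's exact formula): while F1 is overridden, measure how much harder F1 wanted to brake than the final command actually did.

```python
def unsafe_override_gap(requested, applied, selected, feature="F1"):
    """Largest gap between the braking a feature requested and the braking
    actually applied, over the steps where the integration logic selected
    another feature. A large gap suggests the failure was avoidable."""
    return max(
        req - app
        for req, app, sel in zip(requested, applied, selected)
        if sel != feature
    )

# Braking traces from the slide (seven time steps).
f1_braking    = [0.0, 0.3, 0.3, 0.6, 0.8, 1.0, 1.0]
final_braking = [0.0, 0.3, 0.2, 0.2, 0.3, 0.3, 1.0]
selected      = ["F1", "F1", "F2", "F2", "F3", "F3", "F1"]

gap = unsafe_override_gap(f1_braking, final_braking, selected)
# gap is about 0.7: at the sixth step, F1 requested full braking while
# F3's 0.3 command was applied instead.
```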
Our Hybrid Test Objectives
13
One hybrid test objective Ω_{j,l} for every branch j (guarding a feature F,
e.g., if (cnd) then F) and every requirement l:

Ω_{j,l}(tc) > 2        tc does not cover branch j
2 ≥ Ω_{j,l}(tc) > 1    tc covers branch j, but F is not unsafely overridden
1 ≥ Ω_{j,l}(tc) > 0    tc covers branch j and F is unsafely overridden,
                       but requirement l is not violated
Ω_{j,l}(tc) = 0        a feature interaction failure is likely detected
Search Algorithm
• Goal: Computing a test suite that covers all the test objectives
• Challenges:
• The number of test objectives is large:
# of requirements × # of branches
• Computing test objectives is computationally expensive
• Not a Pareto front optimization problem
• Objectives compete with each other; e.g., one test case cannot have the
car violating the speed limit after hitting the leading car
14
MOSA: Many-Objective Search-based Test Generation
15–16
[Plot] Objective 1 vs. Objective 2: not all (non-dominated) solutions are
optimal for the purpose of testing; the solutions closest to satisfying an
individual objective are better than the others.
Panichella et al. [ICST 2015]
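The preference idea behind MOSA can be illustrated with a small sketch (simplified and illustrative; see Panichella et al. [ICST 2015] for the actual algorithm): rather than treating every non-dominated candidate as equally good, the search prefers, for each still-uncovered objective, the candidate closest to satisfying it.

```python
def preferred_tests(population, objectives):
    """Simplified MOSA-style preference criterion: for each uncovered
    objective, keep the candidate with the lowest fitness (closest to
    satisfying that objective)."""
    return {min(population, key=obj) for obj in objectives}

# Three candidates scored by two minimization objectives (lower is better).
# (2, 2) is non-dominated, but it is best for neither objective, so the
# preference criterion does not select it.
population = [(3, 1), (1, 3), (2, 2)]
best = preferred_tests(population, [lambda t: t[0], lambda t: t[1]])
assert best == {(1, 3), (3, 1)}
```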
Tailoring MOSA to Our Context
• The time required to compute fitness functions is large
• Adaptive population size:
• At each iteration, we select the minimum number of test
cases closest to satisfying test objectives
17
Empirical Evaluation
Case Study
• Two case study systems from IEE, our partner company
• Both systems consist of four self-driving features
• Adaptive Cruise Control (ACC)
• Automated Emergency Braking (AEB)
• Traffic Sign Recognition (TSR)
• Pedestrian Protection (PP)
• But, they use different rules to integrate feature actuator commands
19
RQ: Do our hybrid test objectives reveal more feature interaction
failures than the baseline test objectives (coverage-based and
failure-based)?
20
21
Hybrid test objectives reveal significantly more feature interaction
failures (more than twice as many) compared to the baseline alternatives.
[Plot] Number of feature interaction failures (0–10) vs. time (0–12 h)
for (a) SafeDrive1 and (b) SafeDrive2, with mean curves for the Hybrid,
Failure-based (Fail), and Coverage-based (Cov) test objectives.
Feedback from Domain Experts
• The failures we found were due to undesired feature
interactions
• The failures were not previously known to them
• We identified ways to improve the feature integration logic to
avoid failures
22
Example Feature Interaction Failure
Summary
• Problem: How to automatically detect undesired feature interactions in self-
driving systems at early stages of development?
• Context: executable function models and a simulated environment
• Approach: A search-based testing approach
• Hybrid test objectives (coverage-based, failure-based, unsafe overriding)
• A tailored many-objective search
• We have evaluated and validated our approach using industrial systems
23
Combining Test Objectives
• Goal: Execute every branch of the integration component such that
while executing that branch, the component unsafely overrides every
feature f and its outputs violate every safety requirement related to f
24
Ω_{j,l} = min{ Ω_{j,l}(i) : 0 ≤ i ≤ T }

For every time step i, every branch j, and every requirement l:
  If branch j is not covered:
    Ω_{j,l}(i) = Branch_j(i) + max(Overriding) + max(Failure)
  Else if feature f is not unsafely overridden:
    Ω_{j,l}(i) = Overriding_f(i) + max(Failure)
  Else:
    Ω_{j,l}(i) = Failure_l(i)
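The per-time-step case split above can be written down directly (a sketch; it assumes all distance functions are normalized to [0, 1], with 0 meaning "achieved", so that max(Overriding) = max(Failure) = 1):

```python
def hybrid_step(branch_dist, override_dist, failure_dist):
    """One time step i of the hybrid objective for branch j, feature f,
    and requirement l. Distances are assumed normalized to [0, 1], with 0
    meaning the corresponding condition is achieved."""
    if branch_dist > 0:                 # branch j not covered
        return branch_dist + 1.0 + 1.0  # + max(Overriding) + max(Failure)
    if override_dist > 0:               # f not unsafely overridden
        return override_dist + 1.0      # + max(Failure)
    return failure_dist                 # only requirement l's violation remains

# The case split reproduces the value bands of the hybrid objectives:
assert hybrid_step(0.5, 0.9, 0.9) > 2          # branch not covered
assert 2 >= hybrid_step(0.0, 0.5, 0.9) > 1     # covered, no unsafe override
assert 1 >= hybrid_step(0.0, 0.0, 0.4) > 0     # overridden, requirement holds
assert hybrid_step(0.0, 0.0, 0.0) == 0         # failure likely detected
```

The value of Ω_{j,l} for a whole test case is then the minimum of `hybrid_step` over all time steps of the trace, matching Ω_{j,l} = min{ Ω_{j,l}(i) }.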
Hybrid Test Objectives
• Our hybrid test objectives are able to guide the search
25

Ω_{j,l}(tc) > 2        tc has not covered branch j
2 ≥ Ω_{j,l}(tc) > 1    branch j covered, but tc did not cause an unsafe
                       override of f
1 ≥ Ω_{j,l}(tc) > 0    branch j covered and f is unsafely overridden,
                       but requirement l is not violated
Ω_{j,l}(tc) = 0        a feature interaction failure is likely detected by tc
