Automated Testing of Autonomous
Driving Assistance Systems
Lionel Briand
TrustMeIA, 2019
Acknowledgments
2
Raja Ben Abdessalem
Shiva Nejati
Annibale Panichella
Introduction
3
Cyber-Physical Systems
• Increasingly autonomous
• ML-based, e.g., DNN in
perception layer
• Not just recommendations, act
on the physical environment
• Safety implications
4
• A system of collaborating computational elements controlling
physical entities
CPS Development Process
5
Functional modeling:
• Controllers
• Plant
• Decision
Continuous and discrete
Simulink models
Model simulation and
testing
Architecture modelling
• Structure
• Behavior
• Traceability
System engineering modeling
(SysML)
Analysis:
• Model execution and
testing
• Model-based testing
• Traceability and
change impact
analysis
• ...
(partial) Code generation
Deployed executables on
target platform
Hardware (Sensors ...)
Analog simulators
Testing (expensive)
Hardware-in-the-Loop
Stage
Software-in-the-Loop
Stage
Model-in-the-Loop Stage
Formal Verification
• Significant research on formal
verification: exhaustive, guarantees
(deterministic or statistical) [Seshia
et al. 2017]
• Models of system, environment, and
properties in formalisms for which
there are efficient decision
procedures.
• Challenges: Undecidability,
assumptions, scalability.
6
Seshia et al., Towards Verified AI
Limitations of Formal Verification
• Complexity and heterogeneity of CPS
• Non-Boolean properties
• Models capture continuous dynamics
• Not always easy or feasible to translate into low-level logic-
based languages
• Models contain constructs that are not easy to handle such as
non-algebraic arithmetics, third-party code (S-Functions).
7
Testing
• Focus on falsification (testing) and explanation.
• Automated testing with a focus on safety violations
• Explanation of safety violations → Risk assessment
• Testing takes place at different phases of development in CPS: MiL,
SiL, HiL.
• Testing does not prove satisfiability: it finds safety violations or
provides assurance cases (evidence)
8
Testing Cyber-Physical Systems
• MiL and SiL testing: Computationally expensive (simulation of physical
models)
• HiL: Human effort involved in setting up the hardware and analog
simulators
• Number of test executions tends to be limited compared to other types of
systems
• Test input space is often extremely large, i.e., determined by the
complexity of the physical environment
• Traceability between system testing and requirements is mandated by
standards
9
Testing Advanced Driving
Assistance Systems
10
Problem Definition
and State of Practice
11
Advanced Driver Assistance
Systems (ADAS)
12
Automated Emergency Braking (AEB)
Pedestrian Protection (PP)
Lane Departure Warning (LDW)
Traffic Sign Recognition (TSR)
Advanced Driver Assistance
Systems (ADAS)
Decisions are made over time based on sensor data
13
Sensors
Controller
Actuators Decision
Sensors
/Camera
Environment
ADAS
Automotive Environment
• Highly varied environments, e.g., road topology, weather, buildings and
pedestrians …
• Huge number of possible scenarios, e.g., determined by trajectories of
pedestrians and cars
• ADAS play an increasingly critical role in modern vehicles
• Systems must comply with functional safety standards, e.g., ISO 26262
• A challenge for testing
14
ISO 26262
• Defines vehicle safety as the absence of unreasonable
risks that arise from malfunctions of the system.
• Requires safety goals, necessary to mitigate the risks.
• Provides requirements and recommendations to avoid and
control random hardware failures and systematic system
failures that could violate safety goals.
15
SOTIF
• ISO/PAS standard: Safety of the intended functionality (SOTIF).
• Autonomy: Huge increase in functionalities relying on advanced sensing,
algorithms (ML), and actuation.
• SOTIF accounts for limitations and risks related to nominal performance
of sensors and software:
• The inability of the function to correctly comprehend the situation and
operate safely; this also includes functions that use machine learning
algorithms;
• Insufficient robustness of the function with respect to sensor input
variations or diverse environmental conditions.
16
SOTIF: Scenes and Scenarios
17
Domain Conceptual Model (Terminology) [9]
Temporal view of scenes, events,
actions and situations in a scenario
18
Verification in SOTIF
• Decision algorithm verification: “… ability of the decision-
algorithm to react when required and its ability to avoid
unwanted action.”
• Integrated system verification: “Methods to verify the
robustness and the controllability of the system integrated
into the vehicle …”
19
Decision Algorithm Verification
20
Table 6: Decision Algorithm verification
Methods
A Verification of robustness to interference from other sources, e.g. white noise,
audio frequencies, Signal-to-Noise Ratio degradation (e.g. by noise injection
testing)
B Requirement-based test (e.g. classification, sensor data fusion, situation
analysis, function)
C Verification of the architectural properties including independence, if
applicable
D In the loop testing (e.g. SIL / HIL / MIL) on selected SOTIF relevant use cases
and scenarios
E Vehicle level testing on selected SOTIF relevant use cases and scenarios
F Inject inputs into the system that trigger potentially hazardous behaviour
NOTE For test case derivation the method of combinatorial testing can be used
[5]
Integrated System Verification
21
Table 8: Integrated system verification
Methods
A Verification of robustness to Signal-to-Noise Ratio degradation (e.g. by noise
injection testing)
B Requirement-based Test when integrated within the vehicle environment (e.g.
range, precision, resolution, timing constraints, bandwidth)
C In the loop testing (e.g. SIL / HIL / MIL) on selected SOTIF relevant use cases
and scenarios
D System test under different environmental conditions (e.g. cold, damp, light,
visibility conditions)
E Verification of system ageing effects (e.g. accelerated life testing)
F Randomized input tests a)
G Vehicle level testing on selected SOTIF relevant use cases and scenarios
H Controllability tests (including reasonably foreseeable misuse)
a) Randomized input tests can include erroneous patterns e.g. in the case of image
sensors adding flipped images or altered image patches; or in the case of radar
sensors adding ghost targets to simulate multi-path returns.
Annex D provides examples for the verification of perception systems.
How to perform such testing
efficiently and effectively?
22
Research is needed
23
MiL Testing of ADAS
Recent Project
• Developing an automated testing technique
for ADAS
25
• To help engineers efficiently and
effectively explore the complex scenario
space of ADAS
• To identify critical, failure-revealing test
scenarios
• Characterization of input conditions that
lead to most critical scenarios, e.g.,
safety violations
26
Automated Emergency Braking
System (AEB)
26
“Brake-request”
when braking is needed
to avoid collisions
Decision making
Vision
(Camera)
Sensor
Brake
Controller
Objects’
position/speed
Example Critical Situation
• “AEB properly detects a pedestrian in front of the car with a
high degree of certainty and applies braking, but an accident
still happens where the car hits the pedestrian with a
relatively high speed”
27
Testing ADAS
28
A simulator based on
physical/mathematical models
Time-consuming
Expensive
On-road testing
Simulation-based (model) testing
Unsafe
Testing via Physics-based
Simulation
29
ADAS
(SUT)
Simulator (Matlab/Simulink)
Model
(Matlab/Simulink)
▪ Physical plant (vehicle / sensors / actuators)
▪ Other cars
▪ Pedestrians
▪ Environment (weather / roads / traffic signs)
Test input
Test output
time-stamped output
AEB Domain Model
30
[Figure: UML class diagram of the AEB domain model. Recoverable elements:]
Test Scenario: simulationTime: Real, timeStep: Real
SceneLight: intensity: Real
Weather: weatherType: Condition
«enumeration» Condition: fog, rain, snow, normal
Road: roadType: RT
«enumeration» RT: curved, straight, ramped
RoadSide Object; Parked Cars; Trees
Camera Sensor: field of view: Real
Dynamic Object
Vehicle: v0: Real
Pedestrian: x0: Real, y0: Real, θ: Real, v0: Real
Position: x: Real, y: Real
AEB
Detection: certainty: Real
Collision: state: Boolean
Output Trajectory (attribute label "AWA" in the extracted text)
Association stereotypes: «positioned», «uses»
Legend: Environment inputs, Mobile object inputs, Outputs
ADAS Testing Challenges
• Test input space is multidimensional, large, and complex
• Explaining failures and fault localization are difficult
• Execution of physics-based simulation models is
computationally expensive
31
Our Approach
• We use decision tree classification models
• We use a multi-objective search algorithm (NSGAII)
• Objective functions:
1. Minimum distance between the pedestrian and the
field of view
2. The car speed at the time of collision
3. The probability that the object detected is a pedestrian
• Each search iteration calls simulation to compute objective
functions
32
Multiple Objectives: Pareto Front
33
Individual A Pareto
dominates individual B if
A is at least as good as B
in every objective
and better than B in at
least one objective.
Dominated by x
F1
F2
Pareto front
x
• A multi-objective optimization algorithm (e.g., NSGA II) must:
• Guide the search towards the global Pareto-Optimal front.
• Maintain solution diversity in the Pareto-Optimal front.
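The dominance definition above can be sketched in a few lines of Python. This is a minimal illustration, assuming every objective is minimized and each individual is a tuple of objective values. The names `dominates` and `pareto_front` are hypothetical and are not taken from any NSGA-II implementation.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a Pareto-dominates b (assumption: all objectives are minimized)."""
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]
```

A full multi-objective algorithm additionally ranks dominated individuals and preserves diversity along the front; the sketch only covers the dominance test itself.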
Search-based Testing Process
34
Test input generation (NSGA II)
Evaluating test inputs
- Select best tests
- Generate new tests
(candidate)
test inputs
- Simulate every (candidate) test
- Compute fitness functions
Fitness
values
Test cases revealing worst case system behaviors
Input data ranges/dependencies + Simulator + Fitness functions
Search: Genetic Evolution
35
Initial input
Fitness
computation
Selection
Breeding
Better Guidance
• Fitness computations rely on simulations and are very
expensive
• Search needs better guidance
36
Decision Trees
37
Partition the input space into homogeneous regions
[Figure: decision tree; each node lists its count and class proportions]
All points: Count 1200, “non-critical” 79%, “critical” 21%
  $v_0^p$ >= 7.2 km/h: Count 564, “non-critical” 59%, “critical” 41%
    $\theta_0^p$ < 218.6: Count 412, “non-critical” 49%, “critical” 51%
      RoadTopology(CR = 5, Straight, RH = [4 12](m)): Count 230, “non-critical” 31%, “critical” 69%
      RoadTopology(CR = [10 40](m)): Count 182, “non-critical” 72%, “critical” 28%
    $\theta_0^p$ >= 218.6: Count 152, “non-critical” 84%, “critical” 16%
  $v_0^p$ < 7.2 km/h: Count 636, “non-critical” 98%, “critical” 2%
Genetic Evolution Guided by
Classification
38
Initial input
Fitness
computation
Classification
Selection
Breeding
Search Guided by Classification
39
Test input generation (NSGA II)
Evaluating test inputs
Build a classification tree
Select/generate tests in the fittest regions
Apply genetic operators
Input data ranges/dependencies + Simulator + Fitness functions
(candidate)
test inputs
- Simulate every (candidate) test
- Compute fitness functions
Fitness
values
Test cases revealing worst case system behaviors +
A characterization of critical input regions
NSGAII-DT vs. NSGAII
40
NSGAII-DT outperforms NSGAII
[Figure: quality indicators HV, GD, and SP plotted against time (h) for NSGAII-DT and NSGAII]
Generated Decision Trees
41
[Figure: (a) RegionSize, (b) GoodnessOfFit, and (c) GoodnessOfFit-crt over tree generations 1 to 7]
The generated critical regions consistently become smaller, more
homogeneous and more precise over successive tree generations of
NSGAII-DT
Usefulness
• The characterizations of the different critical regions can
help with:
(1) Debugging the system model
(2) Identifying possible hardware changes to increase
ADAS safety
(3) Providing proper warnings to drivers
42
Testing for Feature
Interactions
Integration of Autonomous
Features in ADAS
44
Automated Emergency Braking (AEB)
Pedestrian Protection (PP)
Lane Departure Warning (LDW)
Traffic Sign Recognition (TSR)
45
Undesired Feature Interactions
[Diagram: several autonomous features, each turning sensor/camera data into an actuator command (steering, acceleration, braking); the commands can conflict]
(Early) Function Modeling
46
[Diagram: Software Under Test (SUT), given as executable function models (Matlab/Simulink). Each autonomous feature maps sensor/camera data to an actuator command, and an integration component combines these into the final actuator command]
Example of Rules
47
Conditions Features
Main challenge
• Resolving conflicts:
• Difficult task
• Requires extensive domain expertise and a thorough
analysis of system requirements
• Problems:
• There might be mistakes, inaccuracies and bugs in the
developed rules
48
Problem Statement
49
How to automatically detect undesired feature
interactions in self-driving systems at early
stages of development?
Testing Function Models
50
[Diagram: closed loop between the SUT and physics-based simulators. The simulators model sensors/cameras, the (ego) car, other cars, actuators, pedestrians, and the environment (road, weather, etc.). They send sensor/camera data to the SUT, which returns actuator commands; both are time-stamped vectors. Inside the SUT, each autonomous feature maps sensor/camera data to an actuator command, and the integration component merges these commands]
Search-Based Solution
• We do not know a priori which safety requirements may be
violated. Neither do we know in which branches of IntC the
violations may be detected. Therefore, we search for any violation
of system safety requirements that may arise when exercising any
branch of IntC.
• A combination of three test objectives:
• Coverage-based
• Failure-based
• Unsafe Overriding
51
Coverage-based Test Objective
52
F1
F2
Fn
…
Integration
component
SUT
Goal: Exercising as many branches of the integration component as
possible
if (condition)
F1
Branch Distance
• Many decision branches in IntC
• Branch coverage of IntC
• Fitness: Approach level and branch
distance d (standard for code
coverage)
• d(b,tc) = 0 when tc covers b
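As a rough illustration of the branch distance d, the sketch below uses the usual search-based testing formulation for a single relational predicate. It assumes the branch condition has the form `lhs < rhs`, a constant offset `k = 1`, and the common normalization d / (d + 1). The approach level is left out, and all function names are hypothetical.

```python
from typing import Iterable


def branch_distance_lt(lhs: float, rhs: float, k: float = 1.0) -> float:
    """Raw distance to making `lhs < rhs` true; 0 when the branch is taken."""
    return 0.0 if lhs < rhs else (lhs - rhs) + k


def normalize(d: float) -> float:
    """Map a non-negative distance into [0, 1)."""
    return d / (d + 1.0)


def min_branch_distance(lhs_trace: Iterable[float], rhs: float) -> float:
    """Smallest normalized distance over the simulation steps of one test case."""
    return min(normalize(branch_distance_lt(v, rhs)) for v in lhs_trace)
```

Taking the minimum over time steps reflects that a test case covers the branch as soon as the condition holds at any simulation step.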
53
Failure-based Test Objective
54
F1
F2
Fn
…
Integration
component
SUT
Example:
- Requirement: No collision between
pedestrians and cars
- Generating test cases that minimize
the distance between the car and the
pedestrian
Goal: Revealing violations of system-level requirements
Fitness: Failure Distance
• Fitness functions based on the trajectory vectors for the ego
car, the leading car and the pedestrian, generated by the
simulator
• PP fitness: Minimum distance between the car and the
pedestrian during the simulation time.
• AEB fitness: Minimum distance between the car and the
leading car during the simulation time.
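A minimal sketch of such a failure distance, assuming each trajectory is a list of (x, y) positions sampled at the same time steps and that objects are treated as points (a real fitness function would account for vehicle geometry). The name `min_distance` and the sample trajectories are hypothetical.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def min_distance(traj_a: List[Point], traj_b: List[Point]) -> float:
    """Minimum Euclidean distance between two objects over aligned time steps."""
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


# Hypothetical trajectories: the car drives along the x axis,
# the pedestrian walks down towards the road.
car = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
pedestrian_hit = [(2.0, 3.0), (2.0, 2.0), (2.0, 0.0)]
pedestrian_near_miss = [(2.0, 3.0), (2.0, 2.0), (2.0, 1.0)]
```

A value of zero means the two objects occupy the same position at some step, which corresponds to a collision under the point-object assumption.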
55
Distance Functions
56
When any of the functions yields zero,
a safety failure corresponding to
that function is detected.
Unsafe Overriding Test Objective
57
F1
F2
Fn
…
Integration
component
SUT
Goal: Finding failures that are more likely to be due to faults in
the integration component rather than faults in the features
Braking - F1:      0   .3  .3  .6  .8  1   1
Feature selected:  F1  F1  F2  F2  F3  F3  F1
Braking - Final:   0   .3  .2  .2  .3  .3  1
Reward failures that could have been avoided if another feature
had been prioritized by the integration logic
Combining Test Objectives
• Goal: Execute every branch of the integration component such that,
while executing that branch, the component unsafely overrides every
feature f and its outputs violate every safety requirement related to f
For every time step i, every branch j and every requirement l:
If branch j is not covered:
$\Omega_{j,l}(i) = \mathrm{Branch}_j(i) + \max(\mathrm{Overriding}) + \max(\mathrm{Failure})$
Else if feature f is not unsafely overridden:
$\Omega_{j,l}(i) = \mathrm{Overriding}_f(i) + \max(\mathrm{Failure})$
Else:
$\Omega_{j,l}(i) = \mathrm{Failure}_l(i)$
The objective for a test case is the minimum over the simulation: $\Omega_{j,l} = \min_{0 \le i \le T} \Omega_{j,l}(i)$
58
Our Hybrid Test Objectives
59
One hybrid test objective $\Omega_{j,l}$ for every branch j and every requirement l:
$\Omega_{j,l}(tc) > 2$: tc does not cover branch j
$2 \ge \Omega_{j,l}(tc) > 1$: tc covers branch j but F is not unsafely overridden
$1 \ge \Omega_{j,l}(tc) > 0$: tc covers branch j and F is unsafely overridden, but requirement l is not violated
$\Omega_{j,l}(tc) = 0$: a feature interaction failure is likely detected
Here F is the feature selected by branch j: if (cnd) F
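The hybrid test objective defined on the preceding slides can be sketched as follows. The sketch assumes the branch, overriding, and failure distances are each normalized to [0, 1], so that max(Overriding) = max(Failure) = 1, which is consistent with the thresholds 2, 1, and 0 on this slide. The tuple layout and function names are hypothetical.

```python
from typing import Iterable, Tuple

# One simulation step: (branch distance, overriding distance, failure distance).
Step = Tuple[float, float, float]

MAX_OVERRIDING = 1.0  # assumption: distances are normalized to [0, 1]
MAX_FAILURE = 1.0     # assumption: distances are normalized to [0, 1]


def hybrid_at_step(branch: float, overriding: float, failure: float) -> float:
    """Hybrid objective for one branch j and one requirement l at one time step."""
    if branch > 0.0:       # branch j is not covered at this step
        return branch + MAX_OVERRIDING + MAX_FAILURE
    if overriding > 0.0:   # covered, but the feature is not unsafely overridden
        return overriding + MAX_FAILURE
    return failure         # covered and overridden: distance to violating l


def hybrid_objective(steps: Iterable[Step]) -> float:
    """Minimum of the per-step values over the whole simulation."""
    return min(hybrid_at_step(*s) for s in steps)
```

Stacking the three distances this way gives the search a gradient at every stage: first towards covering the branch, then towards an unsafe override, and finally towards a requirement violation.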
Search Algorithm
• Optimal test suite covers all search objectives, i.e., for all IntC branches
and all safety requirements
• The number of test objectives is large:
# of requirements × # of branches
• Computing test objectives is computationally expensive
• Not a Pareto front optimization problem
• Objectives compete with each other, e.g., cannot have the car
violating the speed limit after hitting the leading car in one test case
60
MOSA: Many-Objective Search-
based Test Generation
61
Objective 1
Objective 2
Not all (non-dominated) solutions
are optimal for the purpose of testing
These points are
better than others
Panichella et al.
[ICST 2015]
Case Study
• Two Case Study systems from IEE
• Both systems consist of four self-driving features
• Adaptive Cruise Control (ACC)
• Automated Emergency Braking (AEB)
• Traffic Sign Recognition (TSR)
• Pedestrian Protection (PP)
• But, they use different rules to integrate feature actuator commands
• RQ: Do our hybrid test objectives reveal more feature interaction failures
than baseline test objectives (coverage-based and failure-based)?
62
63
Hybrid test objectives
reveal significantly more
feature interaction failures
(more than twice)
compared to the baseline
alternatives.
[Figure: number of feature interaction failures over time (h) for (a) SafeDrive1 and (b) SafeDrive2, comparing Hybrid, Coverage-based (Cov), and Failure-based (Fail) test objectives, with means]
Feedback from Domain Experts
• The failures we found were due to undesired feature
interactions
• The failures were not previously known to them
• We identified ways to improve the feature integration logic to
avoid failures
64
Summary
• Problem: How to automatically detect undesired feature interactions
in self-driving systems at early stages of development?
• Context: Executable Function models and Simulated Environment
• Approach: A search-based testing approach
• Hybrid test objectives (coverage-based, failure-based, unsafe
overriding)
• A tailored many-objective search algorithm
• We have evaluated and validated our approach: Promising results
65
Repairing Feature
Interaction Failures
Problem Statement
•How to automatically repair (some of the errors in)
the feature integration logic?
67
Types of errors in the developed
Rules
• The conjunctive conditions on the left side of the rules may be wrong
• Issues in determining the order of conflict resolution rules
• There is a partial order over rules
• Different rule orderings lead to different system behaviors and it is not
clear what order should be used
68
Our Solution
•A Search-based Fault Localization and Repairing
Approach for the Decision Rules of Self-Driving
Systems
•Many-objective search: Minimize severity of
failures until all test cases pass
69
Program Repair
• Program repair process:
• Step 1- Fault Localization: identifies the locations of
faults
• Step 2 - Patch generation: modifies the software in the
faulty code locations
• Step 3 – Program validation: checks if the synthesized
patch has actually repaired the software
70
Faulty Program Repair Tests
Fault Localization
Program Modification
(Patch Generation)
Program Validation
Repaired Program
Valid
Step 1
Step 2
Step 3
No
Yes
Input
Output
Program Repair Process
• Repair Tests: Failing test cases (demonstrate a defect) +
Passing test cases (satisfy system behavior)
Goal: automatically repair software systems
State of the Art: Challenges
•Our errors span several lines of code
•Each test case covers most paths over
simulation steps
•High suspiciousness for most statements
71
Our Approach
• Goal: Identify errors in the decision rules for self-driving
systems and automatically repair these errors
72
Fault Localization (FL) Patch Generation
Focus on faulty statements
that are related to the
most severe failures
Define mutation operators
to modify the decision rules
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
FL - Illustrations
73
0 1 2 … t … T
s1
s2
s3
s4
s5
s6
s7
…
sn
Simulation time
Statements
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
Test Suite
tc1
tc2
tc3
x : if s is executed
Faulty Program
Decision
ruleConjunctive
conditions
Features
FL - Illustrations
74
Simulation time
Statements
Test Suite
tc2
tc3
Status
tc1 tc2 tc3
✗
tc1
time of failure
Rule executed at
the time of failure: $\pi_{tc_1}$
Faulty Program
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
0 1 2 … tj … T
s1 x
s2 x
s3 x
s4 x
s5 x
s6 x
s7 x
…
sn x x
Status
tc1 tc2 tc3
FL - Illustrations
75
Simulation time
Statements
Test Suite
tc3
x : if s is executed
✗
tc2
Faulty Program
0 1 2 … t … T
s1 x
s2 x
s3 x
s4 x
s5 x
s6 x
s7 x
…
sn x x
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
Status
tc1 tc2 tc3
FL - Illustrations
76
Simulation time
Statements
Test Suite
✗
tc3
✗
time of failure
Rule executed at
the time of failure: $\pi_{tc_3}$
Faulty Program
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
0 1 2 … tk … T
s1 x x
s2 x x
s3 x x
s4 x x
s5 x x
s6 x
s7 x
…
sn
Ranking
• We define the suspiciousness of each statement s
by considering both failure severity and faulty rules
77
$\mathrm{Score}(s) = \sum_{tc_j \in TS} I_{tc_j}(s)$
$I_{tc_j}(s) = 0$ if $s \notin$ faulty rule $\pi_{tc_j}$, and $I_{tc_j}(s) = \omega_{tc_j}$ otherwise
$\omega_{tc_j}$: severity of failure exposed by $tc_j$
Rules  Score
s1     0.9
s2     0.9
s3     0.9
s6     0.6
s7     0.6
s8     0.6
s4     0
s5     0
…
• We select a statement among the most suspicious
ones in a randomized way
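The suspiciousness score can be sketched as below. Each failing test case contributes its severity to every statement of the rule that was executed at the time of failure. The severities (0.9 and 0.6) are chosen only to reproduce the example table, and the roulette-wheel selection is one possible reading of "in a randomized way"; both are assumptions, as are the function names.

```python
import random
from typing import Dict, List, Set, Tuple

# One failing test case: (statements of the rule executed at failure time, severity).
Failure = Tuple[Set[str], float]


def scores(statements: List[str], failures: List[Failure]) -> Dict[str, float]:
    """Sum, per statement, the severities of the failures whose faulty rule contains it."""
    return {s: sum(sev for rule, sev in failures if s in rule) for s in statements}


def pick_statement(score: Dict[str, float], rng: random.Random) -> str:
    """Roulette-wheel choice: probability proportional to suspiciousness."""
    names = list(score)
    return rng.choices(names, weights=[score[n] for n in names], k=1)[0]


statements = ["s1", "s2", "s3", "s4", "s5", "s6", "s7", "s8"]
failures = [({"s1", "s2", "s3"}, 0.9), ({"s6", "s7", "s8"}, 0.6)]  # hypothetical severities
ranking = scores(statements, failures)
chosen = pick_statement(ranking, random.Random(0))
```

Statements with a score of zero are never selected, while the remaining ones are all reachable, which keeps the repair search from fixating on a single location.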
Probabilistic Fault Localization
•For each failure, select in a randomized manner a
statement among most suspicious ones
•Select different statements in a probabilistic
manner (suspiciousness) across search
iterations
•Generate and apply a patch
78
Generate a Patch
• Mutation operators
1. Modify operator
2. Shift operator
• Apply a sequence of randomly selected operators
• Evaluate the patch (run test simulations)
79
Modify
80
1: if(TTC < TTCth && Dist(P/C) < Dist(P/C)th && object detected is pedestrian)
2: Apply PP (braking force = 100%)
3: else if(speed of the car > speed limit)
4: Apply TSR (braking force = 40%)
5: else if(TTC < TTCth && object detected is not pedestrian)
6: Apply AEB (braking force = 60%)
7: else if …
……
n: ……
Example
1: if(TTC < TTCth && Dist(P/C) < Dist(P/C)th + 2 && object detected is pedestrian)
2: Apply PP (braking force = 100%)
3: else if(speed of the car > speed limit)
4: Apply TSR (braking force = 40%)
5: else if(TTC < TTCth && object detected is not pedestrian)
6: Apply AEB (braking force = 60%)
7: else if …
……
n: ……
• Modify operator: modify the condition of the faulty statement by
changing a threshold value, altering the direction of a relational
operator (e.g., > to <) or swapping arithmetic operations (e.g., + to - )
Shift
• Shift operator: Change the order of rules
81
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition2(s4)
statement condition2(s5)
else if condition4(s9)
statement condition4(s10)
else if condition5(s11)
statement condition5(s12)
else if condition6(s13)
statement condition6(s14)
:
:
if condition1(s1)
if condition11 (s2)
statement condition11(s3)
end
else if condition2(s4)
statement condition2(s5)
else if condition5(s11)
statement condition5(s12)
else if condition3(s6)
if condition11 (s7)
statement condition31(s8)
end
else if condition4(s9)
statement condition4(s10)
else if condition6(s13)
statement condition6(s14)
:
:
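The two mutation operators can be illustrated on an ordered list of rules. This is a minimal sketch that assumes each rule carries a single numeric threshold, and it covers only the threshold variant of the Modify operator (not relational or arithmetic operator changes). The `Rule` class, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Rule:
    feature: str      # feature applied when the rule fires, e.g. "PP"
    threshold: float  # hypothetical single numeric threshold in the condition


def modify(rules: List[Rule], index: int, delta: float) -> List[Rule]:
    """Modify operator (threshold variant): change one rule's threshold by delta."""
    patched = list(rules)
    patched[index] = replace(rules[index], threshold=rules[index].threshold + delta)
    return patched


def shift(rules: List[Rule], src: int, dst: int) -> List[Rule]:
    """Shift operator: move the rule at position src to position dst."""
    patched = list(rules)
    patched.insert(dst, patched.pop(src))
    return patched
```

Both operators return a new list and leave the input untouched, so a candidate patch can be evaluated by simulation and discarded without affecting the program it was derived from.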
Our Search-based Repairing
Approach
82
Evaluate the Patch
Faulty
Program
Repaired
Program
Test Cases
Localize
Fault
Archive
Generate
a Patch
Faulty
Program
good?
yes
update
the
archive
Evaluate the Patch
Faulty
Program
Repaired
Program
Test Cases
Localize
Fault
Archive
Generate
a Patch
good?
yes
update
the
archive
Best Patched
Programs select a patch
from the archive
Preliminary evaluation
• RQ: How effective and efficient is our approach in localizing
and repairing faults?
83
Results
• Able to repair the decision rules (change the order of the rules
and change some threshold values)
• Able to find different correct versions of the decision rules
84
Time needed to repair errors
Our approach takes on average 5 hours
to repair the decision rules of
self-driving systems
Summary
• We proposed an automated approach to repair errors in the
decision rules for self-driving systems
• A many-objective search-based testing approach
• Our approach effectively and efficiently localizes and repairs
errors in decision rules
• Limitations: Cannot repair all kinds of errors, e.g., adding or
removing rules
85
Conclusions
86
Other CPS Domains
• Similar ISO standards
• With the rise of DNNs, they will develop SOTIF-like standards
• Most CPS are safety critical
• Most CPS follow similar development phases
• Different domain models, e.g., notion of scenes and scenarios
• Different, dedicated simulators
87
Reflections
• Search-based solutions are versatile
• Help relax assumptions compared to exact approaches
• Help decrease modeling requirements
• Scalability, e.g., easy to parallelize
• Require massive empirical studies for validation
• Search is rarely sufficient by itself
88
Reflections
• Combine with machine learning to make the search more
effective and efficient
• Combine search with solvers, e.g., SMT, model checking,
Constraint programming, on model fragments.
• Grey box approaches to search, i.e., use information from
the Simulink models to better guide search.
• Impact of deep learning components in autonomous
systems’ testing and verification
89
The Impact of DNNs
• Increasingly, it is easier to learn behavior from data using
machine learning than to specify and code it
• Some ADAS components rely on deep learning …
• Thousands of weights learned (Deep Neural Networks)
• No explicit code, no specifications
• Verification, testing?
• State of the art includes adequacy coverage criteria and mutation
testing for DNNs
90
Selected References
• Briand et al., "Testing the untestable: Model testing of complex
software-intensive systems", IEEE/ACM ICSE 2016, V2025
• Ben Abdessalem et al., "Testing Advanced Driver Assistance
Systems using Multi-objective Search and Neural Networks", ASE
2016
• Ben Abdessalem et al., "Testing Vision-Based Control Systems
Using Learnable Evolutionary Algorithms", ICSE 2018
• Ben Abdessalem et al., "Testing Autonomous Cars for Feature
Interaction Failures using Many-Objective Search", ASE 2018
91
Automated Testing of Autonomous
Driving Assistance Systems
Lionel Briand
TrustMeIA, 2019

More Related Content

What's hot

JOSHUA SEMINAR FINAL (ADAS) (4) (2).pptx
JOSHUA SEMINAR FINAL (ADAS) (4) (2).pptxJOSHUA SEMINAR FINAL (ADAS) (4) (2).pptx
JOSHUA SEMINAR FINAL (ADAS) (4) (2).pptx
FREDYJoy2
 
MISRA Safety Case Guidelines -
MISRA Safety Case Guidelines - MISRA Safety Case Guidelines -
MISRA Safety Case Guidelines -
Automotive IQ
 

What's hot (20)

autonomous car
autonomous carautonomous car
autonomous car
 
Autonomous or self driving cars
Autonomous or self driving carsAutonomous or self driving cars
Autonomous or self driving cars
 
Self driving car
Self driving carSelf driving car
Self driving car
 
Software defined vehicles,automotive standards (safety, security), agile cont...
Automated Testing of Autonomous Driving Assistance Systems

  • 1. Automated Testing of Autonomous Driving Assistance Systems Lionel Briand TrustMeIA, 2019
  • 4. Cyber-Physical Systems • A system of collaborating computational elements controlling physical entities • Increasingly autonomous • ML-based, e.g., DNN in perception layer • Not just recommendations: they act on the physical environment • Safety implications
  • 5. CPS Development Process • Functional modeling: controllers, plant, decision • Continuous and discrete Simulink models • Model simulation and testing • Architecture modeling: structure, behavior, traceability • System engineering modeling (SysML) • Analysis: model execution and testing, model-based testing, traceability and change impact analysis, ... • (Partial) code generation • Deployed executables on target platform • Hardware (sensors ...) • Analog simulators • Testing (expensive) • Stages: Model-in-the-Loop, Software-in-the-Loop, Hardware-in-the-Loop
  • 6. Formal Verification • Significant research on formal verification: exhaustive, with guarantees (deterministic or statistical) [Seshia et al. 2017] • Models of system, environment, and properties in formalisms for which there are efficient decision procedures • Challenges: undecidability, assumptions, scalability • Seshia et al., Towards Verified AI
  • 7. Limitations of Formal Verification • Complexity and heterogeneity of CPS • Non-Boolean properties • Models capture continuous dynamics • Not always easy or feasible to translate into low-level logic-based languages • Models contain constructs that are not easy to handle, such as non-algebraic arithmetic and third-party code (S-Functions)
  • 8. Testing • Focus on falsification (testing) and explanation • Automated testing with a focus on safety violations • Explanation of safety violations → risk assessment • Testing takes place at different phases of CPS development: MiL, SiL, HiL • Testing does not prove satisfiability: it finds safety violations or provides assurance cases (evidence)
  • 9. Testing Cyber-Physical Systems • MiL and SiL testing: computationally expensive (simulation of physical models) • HiL: human effort involved in setting up the hardware and analog simulators • Number of test executions tends to be limited compared to other types of systems • Test input space is often extremely large, i.e., determined by the complexity of the physical environment • Traceability between system testing and requirements is mandated by standards
  • 12. Advanced Driver Assistance Systems (ADAS) • Automated Emergency Braking (AEB) • Pedestrian Protection (PP) • Lane Departure Warning (LDW) • Traffic Sign Recognition (TSR)
  • 13. Advanced Driver Assistance Systems (ADAS) • Decisions are made over time based on sensor data • [Diagram labels: Sensors, Controller, Actuators, Decision, Sensors/Camera, Environment, ADAS]
  • 14. Automotive Environment • Highly varied environments, e.g., road topology, weather, buildings, and pedestrians • Huge number of possible scenarios, e.g., determined by trajectories of pedestrians and cars • ADAS play an increasingly critical role in modern vehicles • Systems must comply with functional safety standards, e.g., ISO 26262 • A challenge for testing
  • 15. ISO 26262 • Defines vehicle safety as the absence of unreasonable risks that arise from malfunctions of the system • Requires safety goals that are necessary to mitigate the risks • Provides requirements and recommendations to avoid and control random hardware failures and systematic system failures that could violate safety goals
  • 16. SOTIF • ISO/PAS standard: Safety of the intended functionality (SOTIF) • Autonomy: huge increase in functionalities relying on advanced sensing, algorithms (ML), and actuation • SOTIF accounts for limitations and risks related to nominal performance of sensors and software: (1) the inability of the function to correctly comprehend the situation and operate safely, which also includes functions that use machine learning algorithms; (2) insufficient robustness of the function with respect to sensor input variations or diverse environmental conditions
  • 17. SOTIF: Scenes and Scenarios • Domain Conceptual Model (Terminology) [9]
  • 18. Temporal view of scenes, events, actions and situations in a scenario
  • 19. Verification in SOTIF • Decision algorithm verification: “… ability of the decision-algorithm to react when required and its ability to avoid unwanted action.” • Integrated system verification: “Methods to verify the robustness and the controllability of the system integrated into the vehicle …”
  • 20. Decision Algorithm Verification • Table 6: Decision algorithm verification methods • A: Verification of robustness to interference from other sources, e.g. white noise, audio frequencies, Signal-to-Noise Ratio degradation (e.g. by noise injection testing) • B: Requirement-based test (e.g. classification, sensor data fusion, situation analysis, function) • C: Verification of the architectural properties including independence, if applicable • D: In the loop testing (e.g. SIL / HIL / MIL) on selected SOTIF relevant use cases and scenarios • E: Vehicle level testing on selected SOTIF relevant use cases and scenarios • F: Inject inputs into the system that trigger potentially hazardous behaviour • NOTE: For test case derivation, the method of combinatorial testing can be used [5]
  • 21. Integrated System Verification • Table 8: Integrated system verification methods • A: Verification of robustness to Signal-to-Noise Ratio degradation (e.g. by noise injection testing) • B: Requirement-based test when integrated within the vehicle environment (e.g. range, precision, resolution, timing constraints, bandwidth) • C: In the loop testing (e.g. SIL / HIL / MIL) on selected SOTIF relevant use cases and scenarios • D: System test under different environmental conditions (e.g. cold, damp, light, visibility conditions) • E: Verification of system ageing effects (e.g. accelerated life testing) • F: Randomized input tests (a) • G: Vehicle level testing on selected SOTIF relevant use cases and scenarios • H: Controllability tests (including reasonably foreseeable misuse) • (a) Randomized input tests can include erroneous patterns, e.g. in the case of image sensors, adding flipped images or altered image patches; or in the case of radar sensors, adding ghost targets to simulate multi-path returns. Annex D provides examples for the verification of perception systems.
  • 22. How to perform such testing efficiently and effectively?
  • 25. Recent Project • Developing an automated testing technique for ADAS • To help engineers efficiently and effectively explore the complex scenario space of ADAS • To identify critical, failure-revealing test scenarios • Characterization of input conditions that lead to the most critical scenarios, e.g., safety violations
  • 26. Automated Emergency Braking System (AEB) • “Brake-request” when braking is needed to avoid collisions • [Diagram labels: Vision (Camera) Sensor, Objects’ position/speed, Decision making, Brake Controller]
  • 27. Example Critical Situation • “AEB properly detects a pedestrian in front of the car with a high degree of certainty and applies braking, but an accident still happens where the car hits the pedestrian with a relatively high speed”
  • 28. Testing ADAS • On-road testing: time-consuming, expensive, unsafe • Simulation-based (model) testing: a simulator based on physical/mathematical models
  • 29. Testing via Physics-based Simulation • ADAS (SUT) • Simulator (Matlab/Simulink) • Model (Matlab/Simulink): physical plant (vehicle / sensors / actuators), other cars, pedestrians, environment (weather / roads / traffic signs) • Test input • Test output: time-stamped output
  • 30. AEB Domain Model • [Figure: UML class diagram, grouped into environment inputs, mobile object inputs, and outputs] • Test Scenario (simulationTime: Real, timeStep: Real) • SceneLight (intensity: Real) • Weather (weatherType: Condition) • «enumeration» Condition: fog, rain, snow, normal • Road (roadType: RT) • «enumeration» RT: curved, straight, ramped • RoadSide Object: Parked Cars, Trees • Camera Sensor (field of view: Real) • Dynamic Object • Vehicle (v0: Real) • Pedestrian (x0: Real, y0: Real, θ: Real, v0: Real) • Position (x: Real, y: Real) • Collision (state: Boolean) • Detection (certainty: Real) • AEB Output, AWA, Trajectory
  • 31. ADAS Testing Challenges • Test input space is multidimensional, large, and complex • Explaining failures and fault localization are difficult • Execution of physics-based simulation models is computationally expensive
  • 32. Our Approach • We use decision tree classification models • We use a multi-objective search algorithm (NSGA-II) • Objective functions: 1. Minimum distance between the pedestrian and the field of view 2. The car speed at the time of collision 3. The probability that the object detected is a pedestrian • Each search iteration calls simulation to compute objective functions
  • 33. Multiple Objectives: Pareto Front • Individual A Pareto dominates individual B if A is at least as good as B in every objective and better than B in at least one objective. • [Figure: Pareto front in the F1 vs. F2 objective space, with the region dominated by a point x] • A multi-objective optimization algorithm (e.g., NSGA II) must: • Guide the search towards the global Pareto-Optimal front. • Maintain solution diversity in the Pareto-Optimal front.
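The dominance definition on this slide can be written down directly. The following is a minimal sketch, assuming every objective is minimized and each individual is represented only by its tuple of objective values; the names `dominates` and `pareto_front` are illustrative and do not come from the slides.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a Pareto-dominates b (all objectives minimized).

    a must be at least as good as b everywhere and strictly better somewhere.
    """
    at_least_as_good = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return at_least_as_good and strictly_better


def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points
            if not any(dominates(q, p) for q in points if q is not p)]


if __name__ == "__main__":
    population = [(1, 3), (2, 2), (3, 1), (3, 3)]
    print(pareto_front(population))
```

This quadratic filter is only meant to illustrate the definition; NSGA II itself uses non-dominated sorting and a crowding measure to also keep the front diverse.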
  • 34. Search-based Testing Process 34 Test input generation (NSGA II) Evaluating test inputs - Select best tests - Generate new tests (candidate) test inputs - Simulate every (candidate) test - Compute fitness functions Fitness values Test cases revealing worst case system behaviors Input data ranges/dependencies + Simulator + Fitness functions
  • 35. Search: Genetic Evolution 35 Initial input Fitness computation Selection Breeding
  • 36. Better Guidance • Fitness computations rely on simulations and are very expensive • Search needs better guidance 36
  • 37. Decision Trees • Partition the input space into homogeneous regions • [Figure: decision tree over all 1200 points (“non-critical” 79%, “critical” 21%); split conditions shown: RoadTopology (CR = 5, Straight, RH = [4 12](m)) vs. RoadTopology (CR = [10 40](m)), $\theta^p_0 < 218.6$ vs. $\theta^p_0 \ge 218.6$, and $v^p_0 < 7.2$ km/h vs. $v^p_0 \ge 7.2$ km/h; node counts 564, 636, 412, 152, 230 and 182; the most critical leaf is 69% “critical”]
  • 38. Genetic Evolution Guided by Classification 38 Initial input Fitness computation Classification Selection Breeding
  • 39. Search Guided by Classification 39 Test input generation (NSGA II) Evaluating test inputs Build a classification tree Select/generate tests in the fittest regions Apply genetic operators Input data ranges/dependencies + Simulator + Fitness functions (candidate) test inputs - Simulate every (candidate) test - Compute fitness functions Fitness values Test cases revealing worst case system behaviors + A characterization of critical input regions
  • 40. NSGAII-DT vs. NSGAII • NSGAII-DT outperforms NSGAII • [Figure: HV, GD and SP quality indicators over time (h) for NSGAII-DT and NSGAII]
  • 41. Generated Decision Trees • The generated critical regions consistently become smaller, more homogeneous and more precise over successive tree generations of NSGAII-DT • [Figure: RegionSize, GoodnessOfFit and GoodnessOfFit-crt plotted over tree generations 1 to 7, panels (a) to (c)]
  • 42. Usefulness • The characterizations of the different critical regions can help with: (1) Debugging the system model (2) Identifying possible hardware changes to increase ADAS safety (3) Providing proper warnings to drivers 42
  • 44. Integration of Autonomous Features in ADAS 44 Automated Emergency Braking (AEB) Pedestrian Protection (PP) Lane Departure Warning (LDW) Traffic Sign Recognition (TSR)
  • 45. Undesired Feature Interactions • [Figure: several autonomous features, each turning sensor/camera data into an actuator command; the actuator commands (steering, acceleration, braking) can conflict]
  • 46. (Early) Function Modeling • [Figure: Software Under Test (SUT) as executable function models (Matlab/Simulink): several autonomous features, each turning sensor/camera data into an actuator command, and an integration component that produces the actuator command]
  • 48. Main challenge • Resolving conflicts: • Difficult task • Requires extensive domain expertise and a thorough analysis of system requirements • Problems: • There might be mistakes, inaccuracies and bugs in the developed rules 48
  • 49. Problem Statement 49 How to automatically detect undesired feature interactions in self-driving systems at early stages of development?
  • 50. Testing Function Models • [Figure: the SUT (autonomous features and integration component) connected to physics-based simulators of sensors/cameras, the (ego) car, other cars, actuators, pedestrians and the environment (road, weather, etc.); sensor/camera data and actuator commands are exchanged as time-stamped vectors]
  • 51. Search-Based Solution • We do not know a priori which safety requirements may be violated. Neither do we know in which branches of IntC the violations may be detected. Therefore, we search for any violation of system safety requirements that may arise when exercising any branch of IntC. • A combination of three test objectives: • Coverage-based • Failure-based • Unsafe Overriding 51
  • 52. Coverage-based Test Objective 52 F1 F2 Fn … Integration component SUT Goal: Exercising as many branches of the integration component as possible if (condition) F1
  • 53. Branch Distance • Many decision branches in IntC • Branch coverage of IntC • Fitness: Approach level and branch distance d (standard for code coverage) • d(b,tc) = 0 when tc covers b 53
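The coverage fitness named on this slide (approach level plus branch distance) is the standard one from search-based code coverage. Below is a minimal sketch for a single predicate of the form `a < b`; the offset `k` and the normalization `d / (d + 1)` are common choices assumed here for illustration, not details confirmed by the slides.

```python
def branch_distance_lt(a: float, b: float, k: float = 1.0) -> float:
    """Distance to making the predicate `a < b` true.

    Returns 0 when the branch is already taken, matching d(b, tc) = 0
    for a covered branch; otherwise grows with how far `a` exceeds `b`.
    """
    return 0.0 if a < b else (a - b) + k


def normalize(distance: float) -> float:
    """Map a non-negative distance into [0, 1) (assumed normalization)."""
    return distance / (distance + 1.0)


def coverage_fitness(approach_level: int, distance: float) -> float:
    """Approach level (number of missed control dependencies) plus the
    normalized branch distance at the point where execution diverged."""
    return approach_level + normalize(distance)


if __name__ == "__main__":
    # Hypothetical example: time-to-collision 5.0 against a threshold of 3.0.
    d = branch_distance_lt(5.0, 3.0)
    print(d, coverage_fitness(2, d))
```

Lower values are better, so a search minimizing this fitness is pulled first towards the right region of the control flow and then towards satisfying the branch condition.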
  • 54. Failure-based Test Objective 54 F1 F2 Fn … Integration component SUT Example: - Requirement: No collision between pedestrians and cars - Generating test cases that minimize the distance between the car and the pedestrian Goal: Revealing violations of system-level requirements
  • 55. Fitness: Failure Distance • Fitness functions based on the trajectory vectors for the ego car, the leading car and the pedestrian, generated by the simulator • PP fitness: Minimum distance between the car and the pedestrian during the simulation time. • AEB fitness: Minimum distance between the car and the leading car during the simulation time. 55
  • 56. Distance Functions 56 When any of the functions yields zero, a safety failure corresponding to that function is detected.
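As a rough illustration of the failure distances above, the sketch below computes the minimum distance between two simulated trajectories. It assumes each trajectory is a list of 2D positions sampled at the same time steps and uses the Euclidean distance between positions; the function name and the sample data are hypothetical, and a real simulator would also account for the extent of the objects.

```python
import math
from typing import List, Tuple

Position = Tuple[float, float]


def min_distance(traj_a: List[Position], traj_b: List[Position]) -> float:
    """Smallest distance between two objects over the simulation time.

    A value of zero means the two objects met at some time step, i.e. the
    corresponding safety requirement (no collision) was violated.
    """
    return min(math.dist(p, q) for p, q in zip(traj_a, traj_b))


if __name__ == "__main__":
    car = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    pedestrian = [(0.0, 5.0), (1.0, 4.0), (2.0, 3.0)]
    print(min_distance(car, pedestrian))
```

The same helper would serve as the PP fitness (car vs. pedestrian) and the AEB fitness (car vs. leading car), only the second trajectory changes.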
  • 57. Unsafe Overriding Test Objective • Goal: Finding failures that are more likely to be due to faults in the integration component rather than faults in the features • Reward failures that could have been avoided if another feature had been prioritized by the integration logic • Example over seven time steps: Braking-F1 = 0, .3, .3, .6, .8, 1, 1; Braking-Final = 0, .3, .2, .2, .3, .3, 1, with selected feature F1, F1, F2, F2, F3, F3, F1
  • 58. Combining Test Objectives • Goal: Execute every branch of the integration component such that, while executing that branch, the component unsafely overrides every feature f and its outputs violate every safety requirement related to f • For every time step i, every branch j and every requirement l: $\Omega_{j,l}(i) = Branch_j(i) + \max(Overriding) + \max(Failure)$ if branch j is not covered; else $\Omega_{j,l}(i) = Overriding_f(i) + \max(Failure)$ if feature f is not unsafely overridden; else $\Omega_{j,l}(i) = Failure_l(i)$ • $\Omega_{j,l} = \min\{\Omega_{j,l}(i)\}_{0 \le i \le T}$
  • 59. Our Hybrid Test Objectives • One hybrid test objective $\Omega_{j,l}$ for every branch j and every requirement l • $\Omega_{j,l}(tc) > 2$: tc does not cover branch j • $2 \ge \Omega_{j,l}(tc) > 1$: tc covers branch j but F is not unsafely overridden • $1 \ge \Omega_{j,l}(tc) > 0$: tc covers branch j and F is unsafely overridden but requirement l is not violated • $\Omega_{j,l}(tc) = 0$: a feature interaction failure is likely detected
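The layered definition of the hybrid objective can be sketched in a few lines. This is an illustration under stated assumptions: the branch, overriding and failure distances are taken as already computed and normalized to [0, 1], so that max(Overriding) = max(Failure) = 1, which is consistent with the thresholds 2, 1 and 0 on the slide. The function names are hypothetical.

```python
from typing import Iterable, Tuple

# Assumed upper bounds of the normalized overriding and failure distances.
MAX_OVERRIDING = 1.0
MAX_FAILURE = 1.0


def hybrid_step(branch: float, overriding: float, failure: float) -> float:
    """Hybrid objective at one time step for one branch and one requirement.

    Each argument is a distance in [0, 1]; zero means the corresponding
    goal (branch covered, feature unsafely overridden, requirement
    violated) is reached at this time step.
    """
    if branch > 0:          # branch j not covered
        return branch + MAX_OVERRIDING + MAX_FAILURE
    if overriding > 0:      # covered, but f is not unsafely overridden
        return overriding + MAX_FAILURE
    return failure          # covered and unsafely overridden


def hybrid_objective(steps: Iterable[Tuple[float, float, float]]) -> float:
    """Minimum of the per-step values over the whole simulation."""
    return min(hybrid_step(b, o, f) for b, o, f in steps)


if __name__ == "__main__":
    simulation = [(0.5, 1.0, 0.75), (0.0, 0.5, 0.5), (0.0, 0.0, 0.25)]
    print(hybrid_objective(simulation))
```

Because each layer adds the maximum of the layers below it, any test case that covers the branch always scores better than one that does not, whatever its failure distance.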
  • 60. Search Algorithm • Optimal test suite covers all search objectives, i.e., for all IntC branches and all safety requirements • The number of test objectives is large: # of requirements × # of branches • Computing test objectives is computationally expensive • Not a Pareto front optimization problem • Objectives compete with each other, e.g., cannot have the car violating the speed limit after hitting the leading car in one test case
  • 61. MOSA: Many-Objective Search-based Test Generation • Not all (non-dominated) solutions are optimal for the purpose of testing • [Figure: solutions in the Objective 1 vs. Objective 2 plane, with some points marked as better than others] • Panichella et al. [ICST 2015]
  • 62. Case Study • Two case study systems from IEE • Both systems consist of four self-driving features • Adaptive Cruise Control (ACC) • Automated Emergency Braking (AEB) • Traffic Sign Recognition (TSR) • Pedestrian Protection (PP) • But they use different rules to integrate feature actuator commands • RQ: Do our hybrid test objectives reveal more feature interaction failures than the baseline test objectives (coverage-based and failure-based)?
  • 63. Hybrid test objectives reveal significantly more feature interaction failures (more than twice as many) compared to the baseline alternatives. • [Figure: number of feature interaction failures over time (h) for (a) SafeDrive1 and (b) SafeDrive2, comparing Hybrid, Coverage-based and Failure-based test objectives, with means]
  • 64. Feedback from Domain Experts • The failures we found were due to undesired feature interactions • The failures were not previously known to them • We identified ways to improve the feature integration logic to avoid failures 64
  • 65. Summary • Problem: How to automatically detect undesired feature interactions in self-driving systems at early stages of development? • Context: Executable Function models and Simulated Environment • Approach: A search-based testing approach • Hybrid test objectives (coverage-based, failure-based, unsafe overriding) • A tailored many-objective search algorithm • We have evaluated and validated our approach: Promising results 65
  • 67. Problem Statement • How to automatically repair (some of the errors in) the feature integration logic?
  • 68. Types of errors in the developed Rules • The conjunctive conditions on the left side of the rules may be wrong • Issues in determining the order of conflict resolution rules • There is a partial order over rules • Different rule orderings lead to different system behaviors and it is not clear what order should be used 68
  • 69. Our Solution • A Search-based Fault Localization and Repairing Approach for the Decision Rules of Self-Driving Systems • Many-objective search: Minimize severity of failures until all test cases pass
  • 70. Program Repair • Goal: automatically repair software systems • Program repair process: • Step 1 - Fault Localization: identifies the locations of faults • Step 2 - Patch generation: modifies the software in the faulty code locations • Step 3 - Program validation: checks if the synthesized patch has actually repaired the software • Repair Tests: Failing test cases (demonstrate a defect) + Passing test cases (satisfy system behavior) • [Figure: program repair process. Input: faulty program and repair tests; Step 1 fault localization, Step 2 program modification (patch generation), Step 3 program validation; if the patch is valid, the output is the repaired program, otherwise the process repeats]
  • 71. State of the Art: Challenges • Our errors span several lines of code • Each test case covers most paths over simulation steps • High suspiciousness for most statements
  • 72. Our Approach • Goal: Identify errors in the decision rules for self-driving systems and automatically repair these errors • Fault Localization (FL): focus on faulty statements that are related to the most severe failures • Patch Generation: define mutation operators to modify the decision rules
  • 73. FL - Illustrations • Faulty program: if condition1(s1) if condition11 (s2) statement condition11(s3) end else if condition2(s4) statement condition2(s5) else if condition3(s6) if condition11 (s7) statement condition31(s8) end else if condition4(s9) statement condition4(s10) else if condition5(s11) statement condition5(s12) else if condition6(s13) statement condition6(s14) : : • [Figure annotations on the program: Decision rule, Conjunctive conditions, Features] • [Figure: for each test case of the test suite (tc1, tc2, tc3), a matrix of statements s1 … sn against simulation time 0 … T; x: if s is executed]
  • 74. FL - Illustrations • [Figure: execution matrix of tc1 over the faulty program of slide 73, statements s1 … sn against simulation time 0 … tj … T; status of tc1: ✗; the time of failure is marked, and the rule executed at the time of failure is $\pi_{tc1}$]
  • 75. FL - Illustrations • [Figure: execution matrix of tc2 over the faulty program of slide 73, statements s1 … sn against simulation time 0 … T; x: if s is executed; the status column shows one ✗ so far]
  • 76. FL - Illustrations • [Figure: execution matrix of tc3 over the faulty program of slide 73, statements s1 … sn against simulation time 0 … tk … T; the status column shows two ✗; the time of failure is marked, and the rule executed at the time of failure is $\pi_{tc3}$]
  • 77. Ranking • We define the suspiciousness of each statement s by considering both failure severity and faulty rules: $Score(s) = \sum_{tc_j \in TS} I_{tc_j}(s)$, where $I_{tc_j}(s) = 0$ if $s \notin$ faulty rule $\pi_{tc_j}$, and $I_{tc_j}(s) = \omega_{tc_j}$ otherwise • $\omega_{tc_j}$: severity of failure exposed by $tc_j$ • Example scores: s1 0.9, s2 0.9, s3 0.9, s6 0.6, s7 0.6, s8 0.6, s4 0, s5 0, … • We select a statement among the most suspicious ones in a randomized way
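The scoring formula can be sketched as follows. This is an illustration only: each failing test case is represented by the set of statements in the rule executed at its time of failure together with its failure severity, and the two failing tests with severities 0.9 and 0.6 are hypothetical values chosen to be consistent with the example scores on the slide.

```python
from typing import Dict, Iterable, List, Set, Tuple

# (statements of the faulty rule pi_tc, failure severity omega_tc)
FailingTest = Tuple[Set[str], float]


def suspiciousness(statements: Iterable[str],
                   failing_tests: List[FailingTest]) -> Dict[str, float]:
    """Score(s): sum of the severities of the failing tests whose faulty
    rule contains statement s; statements outside every faulty rule get 0."""
    scores = {s: 0.0 for s in statements}
    for faulty_rule, severity in failing_tests:
        for s in scores:
            if s in faulty_rule:
                scores[s] += severity
    return scores


if __name__ == "__main__":
    stmts = [f"s{i}" for i in range(1, 9)]
    tests = [({"s1", "s2", "s3"}, 0.9),   # hypothetical failing test
             ({"s6", "s7", "s8"}, 0.6)]   # hypothetical failing test
    for stmt, score in sorted(suspiciousness(stmts, tests).items(),
                              key=lambda kv: -kv[1]):
        print(stmt, score)
```

Unlike spectrum-based formulas that count every executed statement, only the rule active at the time of failure contributes, which addresses the problem that most statements are covered by most test cases.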
  • 78. Probabilistic Fault Localization • For each failure, select in a randomized manner a statement among the most suspicious ones • Select different statements in a probabilistic manner (suspiciousness) across search iterations • Generate and apply a patch
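One simple way to realize a randomized, suspiciousness-driven choice is roulette-wheel selection, where a statement is drawn with probability proportional to its score. The slides do not state the exact selection scheme, so the sketch below is an assumption for illustration, and `pick_statement` is a hypothetical name.

```python
import random
from typing import Dict


def pick_statement(scores: Dict[str, float], rng: random.Random) -> str:
    """Draw one statement with probability proportional to its score.

    Statements with a score of zero are never selected.
    """
    candidates = [s for s, score in scores.items() if score > 0]
    if not candidates:
        raise ValueError("no suspicious statement to select")
    weights = [scores[s] for s in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(42)  # fixed seed so runs are repeatable
    scores = {"s1": 0.9, "s4": 0.0, "s6": 0.6}
    print([pick_statement(scores, rng) for _ in range(5)])
```

Drawing a fresh statement in every search iteration lets the repair explore less suspicious locations occasionally instead of always patching the top-ranked one.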
  • 79. Generate a Patch • Mutation operators 1. Modify operator 2. Shift operator • Apply a sequence of randomly selected operators • Evaluate the patch (run test simulations) 79
  • 80. Modify • Modify operator: modify the condition of the faulty statement by changing a threshold value, altering the direction of a relational operator (e.g., > to <) or swapping arithmetic operations (e.g., + to -) • Example, original rules: if(TTC < TTCth && Dist(P/C) < Dist(P/C)th && object detected is pedestrian) Apply PP (braking force = 100%) else if(speed of the car > speed limit) Apply TSR (braking force = 40%) else if(TTC < TTCth && object detected is not pedestrian) Apply AEB (braking force = 60%) else if … • Example, after Modify (threshold of the first condition changed): if(TTC < TTCth && Dist(P/C) < Dist(P/C)th + 2 && object detected is pedestrian) Apply PP (braking force = 100%) else if(speed of the car > speed limit) Apply TSR (braking force = 40%) else if(TTC < TTCth && object detected is not pedestrian) Apply AEB (braking force = 60%) else if …
  • 81. Shift • Shift operator: Change the order of rules • Original rule order: condition1 (s1-s3), condition2 (s4-s5), condition3 (s6-s8), condition4 (s9-s10), condition5 (s11-s12), condition6 (s13-s14) • Shifted example 1: condition1, condition3, condition2, condition4, condition5, condition6 • Shifted example 2: condition1, condition2, condition5, condition3, condition4, condition6
  • 82. Our Search-based Repairing Approach • [Figure: repair loop. Inputs: faulty program and test cases; localize fault, generate a patch, evaluate the patch; if the patch is good, update the archive of best patched programs; select a patch from the archive for the next iteration; output: repaired program]
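The Modify and Shift operators can be illustrated on a simplified rule representation. In this sketch, which is an assumption and not the representation used by the authors, each decision rule is reduced to a feature name and a single numeric threshold, and the rule list is ordered by priority. Only the threshold variant of Modify is shown; changing relational or arithmetic operators is left out.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Rule:
    """Simplified decision rule: apply `feature` when its condition,
    parameterized by `threshold`, holds (hypothetical representation)."""
    feature: str
    threshold: float


def modify_threshold(rules: List[Rule], index: int, delta: float) -> List[Rule]:
    """Modify operator: return a patched copy where one rule's threshold
    is shifted by `delta`; the input list is left untouched."""
    patched = list(rules)
    patched[index] = replace(rules[index],
                             threshold=rules[index].threshold + delta)
    return patched


def shift_rule(rules: List[Rule], src: int, dst: int) -> List[Rule]:
    """Shift operator: return a patched copy where the rule at position
    `src` is moved to position `dst`, changing the rule priority."""
    patched = list(rules)
    patched.insert(dst, patched.pop(src))
    return patched


if __name__ == "__main__":
    rules = [Rule("PP", 10.0), Rule("TSR", 50.0), Rule("AEB", 2.0)]
    print(modify_threshold(rules, 0, 2.0))
    print([r.feature for r in shift_rule(rules, 2, 1)])
```

Returning new lists rather than mutating in place makes it straightforward to keep several candidate patches side by side, for example in an archive of the best patched programs.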
  • 83. Preliminary evaluation • RQ: How effective and efficient is our approach in localizing and repairing faults? 83
  • 84. Results • Able to repair the decision rules (change the order of the rules and change some threshold values) • Able to find different correct versions of the decision rules • Time needed to repair errors: our approach takes on average 5 hours to repair the decision rules of self-driving systems
  • 85. Summary • We proposed an automated approach to repair errors in the decision rules for self-driving systems • A many-objective search-based testing approach • Our approach effectively and efficiently localizes and repairs errors in decision rules • Limitations: Cannot repair all kinds of errors, e.g., adding or removing rules 85
  • 87. Other CPS Domains • Similar ISO standards • With the rise of DNNs, they will develop SOTIF-like standards • Most CPS are safety critical • Most CPS follow similar development phases • Different domain models, e.g., notion of scenes and scenarios • Different, dedicated simulators 87
  • 88. Reflections • Search-based solutions are versatile • Help relax assumptions compared to exact approaches • Help decrease modeling requirements • Scalability, e.g., easy to parallelize • Require massive empirical studies for validation • Search is rarely sufficient by itself 88
  • 89. Reflections • Combine with machine learning to make the search more effective and efficient • Combine search with solvers, e.g., SMT, model checking, Constraint programming, on model fragments. • Grey box approaches to search, i.e., use information from the Simulink models to better guide search. • Impact of deep learning components in autonomous systems’ testing and verification 89
  • 90. The Impact of DNNs • Increasingly, it is easier to learn behavior from data using machine learning than to specify and code it • Some ADAS components rely on deep learning … • Thousands of weights learned (Deep Neural Networks) • No explicit code, no specifications • Verification, testing? • State of the art includes adequacy coverage criteria and mutation testing for DNNs
  • 91. Selected References • Briand et al., “Testing the untestable: Model testing of complex software-intensive systems”, IEEE/ACM ICSE 2016, V2025 • Ben Abdessalem et al., “Testing Advanced Driver Assistance Systems using Multi-objective Search and Neural Networks”, ASE 2016 • Ben Abdessalem et al., “Testing Vision-Based Control Systems Using Learnable Evolutionary Algorithms”, ICSE 2018 • Ben Abdessalem et al., “Testing Autonomous Cars for Feature Interaction Failures using Many-Objective Search”, ASE 2018
  • 92. Automated Testing of Autonomous Driving Assistance Systems Lionel Briand TrustMeIA, 2019