Simulator-based Explanation and Debugging of Hazard-triggering Events in DNN-based Safety-critical Systems

1
Simulator-based Explanation and Debugging of
Hazard-triggering Events
in DNN-based Safety-critical Systems
https://dl.acm.org/doi/10.1145/3569935
ACM Transactions on Software Engineering & Methodology (TOSEM)
Hazem Fahmy 1, Fabrizio Pastore 1, Lionel Briand 1,2, Thomas Stifter 3
1 University of Luxembourg, 2 University of Ottawa, 3 IEE S.A.

2
DNN-based Safety-critical Systems
Autonomous
Drones
Self-driving
Cars
Child
Detection

3
Simulator-based Training of DNNs
Training set
(simulator images)
DNN
Training
Fine-tuning
DNN
Testing
Trained
DNN
Fine-Tuned
DNN
Failures
Training set
(real-world images)
Test set
(real-world images)
Engineers need to characterize failure-inducing images,
either to improve the DNN (e.g., by selecting images for retraining) or to identify
countermeasures (e.g., using two cameras).
Manual inspection of images is expensive and error-prone.

4
Can we automatically generate
expressions
constraining simulator parameters
to explain DNN failures
observed with real-world data?

5
-28.2 < Head_Vert < -15.7
(Top)
-23.9 < Head_Hor < 0.7
(Left)
-10.8 < Head_Vert < 22.8
(Middle – Top)
7.3 < Head_Hor < 32.9
(Center – Right)

7
Simulator-based Explanations for DNN Errors (SEDE)
Real-world error-inducing images
HUDD
Step 1. Identify root-cause clusters (RCCs)
Root
Cause
Clusters
(RCCs)
Simulator-based Explanation for DNN Errors (SEDE)

8
Real-world
Evolutionary
Algorithm
Simulator
Simulator
images
Configuration
Parameters
RCC Representative
Images
Step 2. Generate images associated with RCCs
RCCs
Step 2.1. Identify RCC Representative Images
PaiR

9
(PaiR)
Generates a set of images that belong to the RCC and are diverse
Offspring
RCC medoid
Parents
O2. Diversity
O1. Cluster Membership
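
The two objectives named above can be sketched as fitness functions. This is only an illustration under stated assumptions: images are represented by hypothetical feature vectors, Euclidean distance stands in for whatever image distance SEDE actually uses, and the function names are invented for this example.

```python
import math


def cluster_membership(individual, medoid):
    """O1 (to minimize): distance between an individual and the RCC medoid."""
    return math.dist(individual, medoid)


def diversity(individual, population):
    """O2 (to maximize): distance to the closest other individual in the population."""
    others = [p for p in population if p is not individual]
    return min(math.dist(individual, p) for p in others)


# Toy feature vectors (assumed, not real image features).
medoid = (0.0, 0.0)
population = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]

for individual in population:
    print(cluster_membership(individual, medoid), diversity(individual, population))
```

An evolutionary search would then favor offspring that stay close to the medoid while remaining far from the individuals already selected.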

11
Real-world
Evolutionary
Algorithms
Simulator
Simulator
images
Configuration
Parameters
RCCs
PaiR
Unsafe images in the RCC
+ simulator parameters
RCC Representative
Images
Step 2.2. Generate a set of failing images belonging to the cluster

12
Step 2.2. Generate a set of failing images
• Objective: Generate a failing image that is similar to one reference image in P1
• to characterize the unsafe space of a cluster
• while leveraging the diversity in the population P1
P1 images
Failing images
RCC
Real unsafe
space

13
Real-world
Evolutionary
Algorithms
Simulator
Simulator
images
Configuration
Parameters
RCCs
Step 2.3. Generate one non-failing image close to each failing image
PaiR
Non-failing images similar to failing
images + simulator parameters
Failing images in the RCC
RCC Representative
Images
Step 2.2. Generate a set of failing images belonging to the cluster

15
Real-world
Evolutionary
Algorithms
Simulator
Simulator
images
Configuration
Parameters
Safe images similar to unsafe images
Rule Extraction Algorithm (PART)
IF-THEN
Rules
Expressions
Generator
Explanation
Expression
RCCs
Step 2.2. Generate a set of failing images belonging to the cluster
Step 2.3. Generate one non-failing image close to each unsafe image
Step 3. Generate expressions that characterize unsafe images
PaiR
RCC Representative Images
Step 2.1. Identify RCC Representative Images
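
Step 3 turns IF-THEN rules into range expressions. The sketch below illustrates that conversion under assumptions: a rule is given as a hypothetical list of (parameter, operator, threshold) conditions, and the functions rule_to_expression and format_expression are invented for this example. It does not reproduce PART or the actual SEDE expressions generator.

```python
import math


def rule_to_expression(conditions):
    """Merge the conditions of one rule into the tightest (low, high) bounds per parameter."""
    bounds = {}
    for name, op, value in conditions:
        low, high = bounds.get(name, (-math.inf, math.inf))
        if op == ">":
            low = max(low, value)
        elif op == "<":
            high = min(high, value)
        else:
            raise ValueError(f"unsupported operator: {op}")
        bounds[name] = (low, high)
    return bounds


def format_expression(bounds):
    """Render the bounds in the 'low < parameter < high' style used on the slides."""
    parts = []
    for name, (low, high) in bounds.items():
        if low > -math.inf and high < math.inf:
            parts.append(f"{low} < {name} < {high}")
        elif low > -math.inf:
            parts.append(f"{name} > {low}")
        else:
            parts.append(f"{name} < {high}")
    return " AND ".join(parts)


# Hypothetical rule predicting "unsafe"; thresholds are illustrative only.
rule = [("Head_Hor", ">", -4.25), ("Head_Hor", "<", 36.5),
        ("Head_Vert", ">", 6.4), ("Head_Vert", "<", 21.8)]
print(format_expression(rule_to_expression(rule)))
# -4.25 < Head_Hor < 36.5 AND 6.4 < Head_Vert < 21.8
```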

16
Example Output
-4.25 < Head_Hor < 36.5
(Center – Right)
6.4 < Head_Vert < 21.8
(Bottom)
HUDD-RCC
SEDE Failing Images
SEDE Expressions
SEDE Passing Images

18
Real-world
IF-THEN
Rules
Expressions
Generator
Explanation
Expression
Configuration
Parameters
Unsafe
Improvement Set
Retrained
DNN
Inputs
Selection
Simulator
Retraining
Best DNN
Step 4. Retrain the DNN (executed 10 times)
RCCs
Evolutionary
Algorithms
Simulator
Simulator
images
Configuration
Parameters
Step 2.2. Generate a set of unsafe images belonging to the cluster
Step 2.3. Generate one safe image for each unsafe image
PaiR
RCC Representative

20
Research Questions
• RQ1: How does PaiR fare, compared to alternative approaches, for the
generation of diverse images belonging to RCCs?
• RQ2: Does SEDE generate images that are close to the center of each
RCC?
• RQ3: Does SEDE generate, for each RCC, a set of images sharing similar
characteristics?
• Necessary to generate meaningful explanations
• RQ4: Do the RCC expressions identified by SEDE delimit an unsafe space?
• RQ5: How does SEDE compare to traditional DNN accuracy improvement
practices?
• Necessary to evaluate if the images generated according to our expressions help
improve the accuracy of a DNN when it processes real-world images

21
Two open-source simulators from IEE
IEE Face-Simulator
(13 parameters)
IEE Human-Simulator
(21 parameters)

22
Subjects of the study
• Two head pose detection DNNs
• One trained with IEE Face-Simulator
• One trained with IEE Human-Simulator
• Both fine-tuned with IEE real-world dataset
• One face-landmarks detection DNN
• Trained with IEE Face-Simulator
• Fine-tuned with IEE real-world dataset
IEE-Faces
Generated Image
IEE-Humans
Simulator
IEE Real-world
Dataset
BIWI-Kinect
Dataset

33
RQ4. Do the expressions identified by SEDE delimit an
unsafe space?
• We aim to demonstrate that images matching our expression
lead to low accuracy
• Experiment design:
• Generate 500 images for each RCC, according to SEDE
expressions
• Compute the percentage of correctly classified images
• Answer RQ4 positively if, for a large number of clusters,
the generated images have an accuracy that is significantly lower than the
accuracy observed with the test set
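
The experiment design above can be sketched as follows. This is a simplified illustration, not the study's code: uniform sampling inside the expression is an assumption, and is_correct is a hypothetical callback that, in a real setup, would render the image with the simulator and check the DNN output.

```python
import random


def sample_config(expression, rng):
    """Draw one simulator configuration uniformly inside the expression's ranges (assumed strategy)."""
    return {name: rng.uniform(low, high) for name, (low, high) in expression.items()}


def accuracy_in_region(expression, is_correct, n_images=500, seed=0):
    """Fraction of sampled configurations for which the DNN prediction is correct."""
    rng = random.Random(seed)
    configs = [sample_config(expression, rng) for _ in range(n_images)]
    return sum(1 for c in configs if is_correct(c)) / n_images


# Illustrative expression and a toy stand-in for "simulate image, run DNN, compare to ground truth".
expression = {"Head_Hor": (-4.25, 36.5), "Head_Vert": (6.4, 21.8)}


def toy_dnn_is_correct(config):
    return config["Head_Hor"] < 0.0


print(accuracy_in_region(expression, toy_dnn_is_correct))
```

The resulting accuracy would then be compared with the accuracy measured on the test set; a clearly lower value suggests that the expression delimits an unsafe region.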

34
RQ4. Summary Results
Do the RCC expressions identified by SEDE delimit an unsafe space?
-37%
-36%
-17%

35
RQ5. Is it possible to improve the DNN by leveraging the
unsafe expressions identified by SEDE?
• We aim to determine if images matching our expressions
may improve the accuracy of the DNN
• Experiment design:
• Retrain the DNN using 500 generated images per cluster
matching our expressions
• Measure the overall improvement of the DNN’s accuracy on the
test set
• Compare with HUDD and a random baseline
• Repeat the experiment 10 times

37
RQ5. Results
How does SEDE compare to traditional DNN accuracy improvement practices?
DNN     Original   Accuracy after retraining   SEDE gain over   Stat. Sign.
        accuracy   SEDE     HUDD     RBL       best baseline    p-value   A12
FLD     80.06%     86.14%   79.94%   77.41%    +6.19%           1e-4      1.0
HPD-F   51.65%     56.15%   45.80%   44.33%    +10.35%          4e-4      0.94
HPD-H   51.03%     69.68%   60.65%   55.57%    +9.03%           1e-4      1.0
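
A short sketch of how the last columns can be read, assuming that A12 denotes the Vargha-Delaney effect size and that the gain is the SEDE accuracy minus the best baseline accuracy. The accuracies are copied from the table; the per-run values passed to a12 at the end are invented toy data, since the slides do not report individual runs.

```python
def a12(xs, ys):
    """Vargha-Delaney A12: probability that a value from xs exceeds one from ys (ties count half)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    equal = sum(1 for x in xs for y in ys if x == y)
    return (greater + 0.5 * equal) / (len(xs) * len(ys))


def gain_over_best_baseline(row):
    """SEDE accuracy minus the better of the two baselines, in percentage points."""
    return row["SEDE"] - max(row["HUDD"], row["RBL"])


# Accuracies after retraining, as reported in the table.
results = {
    "FLD":   {"SEDE": 86.14, "HUDD": 79.94, "RBL": 77.41},
    "HPD-F": {"SEDE": 56.15, "HUDD": 45.80, "RBL": 44.33},
    "HPD-H": {"SEDE": 69.68, "HUDD": 60.65, "RBL": 55.57},
}

for dnn, row in results.items():
    print(dnn, round(gain_over_best_baseline(row), 2))

# Toy per-run accuracies (assumed): every SEDE run beats every baseline run.
print(a12([86.0, 86.3, 85.9], [79.8, 80.1, 79.9]))  # 1.0
```

Recomputing the gains from the rounded table values gives 6.20 for FLD rather than the reported 6.19, which is consistent with rounding of the underlying accuracies. An A12 of 1.0 means that every retraining run with SEDE outperformed every run of the compared baseline.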

38
Can we automatically generate
explanations for DNN failures
as expressions
constraining simulator parameters?
Research Questions
• RQ1: How does PaiR fare, compared to alternative approaches, for the
generation of diverse images belonging to RCCs?
• RQ2: Does SEDE generate images that are close to the center of each
RCC?
• Necessary to generate explanations that are related to each RCC
• RQ3: Does SEDE generate, for each RCC, a set of images sharing similar
characteristics?
• Necessary to generate meaningful explanations
• RQ4: Do the RCC expressions identified by SEDE delimit an unsafe space?
• Necessary to evaluate the effectiveness of our explanations
• RQ5: How does SEDE compare to traditional DNN accuracy improvement
practices?
• Necessary to evaluate whether the images generated according to our expressions can address
problems concerning real-world scenarios, compared to the state of the art
SEDE. RQs
https://github.com/SNTSVV/SEDE
https://doi.org/10.6084/m9.figshare.19467401

