How Do Deep Learning Faults Affect
AI-Enabled Cyber-Physical Systems in
Operation? A Preliminary Study Based
on DeepCrime Mutation Operators
Aitor Arrieta, Pablo Valle, Asier Iriarte, Miren Illarramendi
Cyber-Physical Systems
AI-enabled Cyber-Physical Systems
Deep Learning techniques allow for higher
autonomy in Cyber-Physical Systems and
enable the control of complex tasks
But!
Deep Learning systems are also
prone to faults!
Off-line vs On-line CPS testing
[Figure: Off-line Testing: Test Data → Predictions. On-line Testing: Test Scenarios → CPS with Physical Plant → Behavior over time.]
Key Research Question
How Do Deep Learning Faults Affect AI-Enabled
Cyber-Physical Systems in Operation?
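Mutation Testing of DL systems
• Based on real faults (e.g., low-quality data, faults in the DNN architecture)
• Considers the randomized nature of DL systems
\[
\mathit{isKilled}(P, M, \mathit{TestD}) =
\begin{cases}
\mathit{true} & \text{if } \mathit{effectSize} \geq \beta \text{ and } \mathit{p\_value} < \alpha \\
\mathit{false} & \text{otherwise}
\end{cases}
\]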
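The kill criterion above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: P and M are read as the original and mutated programs, each represented by a list of per-run scores on TestD (for example, one MSE value per trained model). The slide does not say which statistical test or effect-size measure is used, so a seeded permutation test and Cohen's d are swapped in here, and the thresholds `alpha=0.05` and `beta=0.5` are placeholder values.

```python
import random
from statistics import mean, stdev


def cohens_d(a, b):
    """Absolute Cohen's d using a pooled standard deviation (assumed effect-size measure)."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    if pooled_var == 0:
        # No spread at all: the effect is either absent or unbounded.
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / pooled_var ** 0.5


def permutation_p_value(a, b, n_resamples=2000, seed=0):
    """Two-sided permutation test on the difference of means (assumed test, not from the slides)."""
    rng = random.Random(seed)
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_resamples):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_resamples + 1)


def is_killed(original_scores, mutant_scores, alpha=0.05, beta=0.5):
    """True when the effect size reaches beta AND the p-value is below alpha."""
    return (cohens_d(original_scores, mutant_scores) >= beta
            and permutation_p_value(original_scores, mutant_scores) < alpha)


# Hypothetical per-run scores for an original model and one mutant (10 runs each).
original = [0.010, 0.011, 0.012, 0.010, 0.011, 0.012, 0.010, 0.011, 0.012, 0.011]
mutant = [0.050, 0.052, 0.049, 0.051, 0.050, 0.053, 0.048, 0.050, 0.052, 0.051]
print(is_killed(original, mutant))
```

Comparing distributions over several runs, rather than a single pair of models, is what lets the criterion account for the randomized nature of DL training.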
Research Questions
RQ1 – How do DL faults affect AI-enabled CPSs in operation?
RQ2 – How do DL faults differ when deployed in an AI-enabled CPS as compared to when executed in an off-line fashion?
RQ3 – Are there differences in terms of killability between the types of DL faults when deployed in operation?
Experimental setup
• Case study system and circuit used
Experimental setup
• Deep learning faults
  • 4 DL mutation operators selected from DeepCrime
    • New Learning Rates (HLR)
    • New Number of Epochs (HNE)
    • Add Noise to Training Data (TAN)
    • Change Labels of Training Data (TCL)
  • 5 configurations each
  • 10 runs to account for stochasticity
200 DNN models in total (4 operators × 5 configurations × 10 runs) + 10 DNN models for the original study
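The model count follows directly from the setup and can be checked by enumerating the training plan. The sketch below is illustrative only: the operator acronyms come from the slide, while the configurations are represented by plain indices because their actual values are not given here.

```python
from itertools import product

OPERATORS = ["HLR", "HNE", "TAN", "TCL"]  # acronyms taken from the slide
N_CONFIGURATIONS = 5                      # configurations per operator
N_RUNS = 10                               # repeated trainings per configuration
N_ORIGINAL_MODELS = 10                    # models reported for the original study


def training_plan():
    """Return one (operator, configuration index, run index) tuple per mutant model."""
    return list(product(OPERATORS, range(N_CONFIGURATIONS), range(N_RUNS)))


plan = training_plan()
print(len(plan))                      # 200 mutant models
print(len(plan) + N_ORIGINAL_MODELS)  # 210 models trained overall
```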
Experimental setup
• Evaluation metrics
  • Mean Squared Error (MSE) for the off-line testing
  • For operational testing
    • Time required by the robot to complete two entire laps
    • Whether the robot went off the circuit or not
• Other considerations
  • Controlled lighting in the environment
  • Time was measured manually, recording it twice
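For reference, the off-line metric is the standard mean squared error between the DNN predictions and the expected outputs. A minimal sketch follows; the sample values are hypothetical and not taken from the study.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between expected and predicted values."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical expected outputs vs. model predictions.
expected = [0.0, 1.0]
predicted = [0.0, 3.0]
print(mean_squared_error(expected, predicted))  # 2.0
```

A lower MSE means the predictions are closer to the expected outputs on the test data. By itself it says nothing about lap time or whether the robot stays on the circuit, which is why the operational metrics are measured separately.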
Results
RQ1 – Faults affecting physical rover
35% of the mutants were detected in the circuit
RQ2 – Off-line vs Physical
95% of the mutants were detected off-line
These results contrast with other studies
RQ3 – Type of mutation operator
All mutants from TCL were detected
Two mutants from HLR were detected
TAN and HNE were not detected
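The per-operator results are consistent with the 35% figure reported for the circuit. The check below assumes that each of the 5 configurations per operator counts as one mutant (20 mutants in total) and that "all mutants from TCL" therefore means 5. Both are readings of the slides, not figures stated on them.

```python
CONFIGS_PER_OPERATOR = 5  # assumption: one mutant per configuration

# Mutants detected in operation, per operator, as reported for RQ3.
killed_in_operation = {"TCL": 5, "HLR": 2, "TAN": 0, "HNE": 0}


def detection_rate_pct(killed_by_operator, configs_per_operator):
    """Percentage of mutants detected across all operators."""
    total = configs_per_operator * len(killed_by_operator)
    killed = sum(killed_by_operator.values())
    return 100 * killed // total


print(detection_rate_pct(killed_in_operation, CONFIGS_PER_OPERATOR))  # 35
```

Under the same assumption, the 95% off-line detection rate corresponds to 19 of the 20 mutants.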
Conclusion
• Faults do not manifest that easily during operation
• Off-line testing seems to find more faults than physical testing
• Preliminary study: more faults and other case studies are required to generalize our findings
Future work
• More CPS case study systems
• More faults
• Other angles of research
  • What about simulation?
  • What about systems with multiple DNNs?
  • Control levels: low-level controlling functions vs high-level controlling functions
Thank you!
Aitor Arrieta
aarrieta@mondragon.edu
