Finding Bugs in Deep Learning Programs
SoftWare Analytics and Technologies Lab (S.W.A.T)
Foutse Khomh, PhD, Ing.
foutse.khomh@polymtl.ca
@SWATLab
Canada CIFAR AI Chair on Trustworthy Machine Learning Systems
Trustworthy Engineering of AI Software
Quality Assurance of ML-enabled systems
System evolution & continuous delivery
[Figure: a reliable system over time. Labels: Data Health, Model Health, Code Health. Threats: Distribution shift, Data imbalance, Coding errors, Noisy labels, Underspecification, Bias, Vulnerabilities.]
Deep Learning is being rapidly adopted in industry
DL development phases produce a lot of code!
Phases: Preliminary preparation, Data Collection, Data Preprocessing, Model Implementation, Model Training, Model Evaluation, Model Tuning, Data postprocessing, Model Prediction
Source: Han et al., What do Programmers Discuss about Deep Learning Frameworks
DL programs can be faulty!
Multi-dimensional space of DL bugs
[Figure: two axes. Model Issues: correct, not enough learning capacity, non-optimal regularization, … Implementation Issues: correct, incorrect feature extraction, incorrect gradient computation, …]
NeuraLint: A linter for DL programs (TOSEM’21)
• Captures defects early, which saves rework cost.
• Less expensive, because it doesn’t require execution.
• Finds defects in seconds.
• …
NeuraLint is fast and effective!
• It achieves an accuracy of 91.7%.
• It correctly reported 18 additional bugs that were not found by developers.
• The average execution times of NeuraLint for the studied TensorFlow and Keras based programs are 2.892 and 3.197 seconds, respectively.
Try it out!
NeuraLint has two pillars…
• A meta-model of DL programs
• Taxonomy of common DL faults
Gunel Jahangirova, Nargiz Humbatova, Gabriele Bavota, Vincenzo Riccio, Andrea Stocco, and Paolo Tonella. 2019. Taxonomy of Real Faults in Deep Learning Systems. arXiv preprint arXiv:1910.11015
NeuraLint: Execution Flow
[Figure: execution flow. Labels: Original program, Model Extraction, Model, Graph transformation Rules (potential issues), Run, List of detected Issues.]
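The flow above (extract a model from the program, then run rules over it without executing the program) can be sketched in plain Python. This is only an illustration under loud assumptions: NeuraLint relies on a meta-model and graph transformation rules, whereas the `Layer` and `Model` classes, the hand-built model, and the two rules below are hypothetical and are not taken from the tool.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Layer:
    """Hypothetical, simplified stand-in for a node of a DL program model."""
    kind: str              # e.g. "dense"
    units: int
    activation: str = ""   # empty string means no activation


@dataclass
class Model:
    """Hypothetical extracted model: an ordered list of layers plus a loss."""
    layers: List[Layer] = field(default_factory=list)
    loss: str = ""


def rule_last_activation_matches_loss(model: Model) -> List[str]:
    """Illustrative rule: a multi-class cross-entropy loss paired with a non-softmax output."""
    issues = []
    if model.layers and model.loss == "categorical_crossentropy":
        last = model.layers[-1]
        if last.activation != "softmax":
            issues.append(
                f"last layer uses activation '{last.activation or 'none'}' "
                f"but loss is '{model.loss}'"
            )
    return issues


def rule_hidden_layers_have_activation(model: Model) -> List[str]:
    """Illustrative rule: flag hidden dense layers that have no non-linearity."""
    issues = []
    for index, layer in enumerate(model.layers[:-1]):
        if layer.kind == "dense" and not layer.activation:
            issues.append(f"hidden layer {index} has no activation")
    return issues


RULES: List[Callable[[Model], List[str]]] = [
    rule_last_activation_matches_loss,
    rule_hidden_layers_have_activation,
]


def lint(model: Model) -> List[str]:
    """Run every rule on the model, without executing it, and collect issues."""
    detected = []
    for rule in RULES:
        detected.extend(rule(model))
    return detected


# Hand-built faulty model (in a real linter this would come from model extraction).
faulty = Model(
    layers=[Layer("dense", 64), Layer("dense", 10, "sigmoid")],
    loss="categorical_crossentropy",
)
for issue in lint(faulty):
    print(issue)
```

Keeping each rule as an independent function makes the rule set easy to extend, which mirrors the idea of a rule base applied to an extracted model.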
TheDeepChecker: Dynamic testing of DL programs (TOSEM’22)
• Captures defects during the training process.
• Less expensive than testing the resulting model.
• Adds some overhead to the training process.
• …
TheDeepChecker outperforms AWS SMD
• DL coding bugs and misconfigurations are detected with (precision, recall) equal to (90%, 96.4%) and (77%, 83.3%), respectively.
• Finds 30% more defects than AWS SageMaker.
Try it out!
TheDeepChecker verification rules…
Parameters-related Issues
• Untrained Parameters
• Poor Weight Initialization
• Parameters’ Values Divergence
• Parameters Unstable Learning
Activation-related Issues
• Activations out of Range
• Neuron Saturation
• Dead ReLU
Optimization-related Issues
• Unable to fit a small sample
• Zero Loss
• Diverging Loss
• Slow or Non decreasing Loss
• Loss Fluctuations
• Unstable Gradient: Exploding
• Unstable Gradient: Vanishing
TheDeepChecker verification rules…
Parameters-related Issues: Untrained Parameters, Poor Weight Initialization, Parameters’ Values Divergence, Parameters Unstable Learning
Untrained Parameters
Issue: given a layer $i$ and $N$ iterations,
$$W_i^{0} = W_i^{1},\ b_i^{0} = b_i^{1}; \quad W_i^{1} = W_i^{2},\ b_i^{1} = b_i^{2}; \quad \dots; \quad W_i^{N-1} = W_i^{N},\ b_i^{N-1} = b_i^{N}$$
Verification Routine: given a layer $i$ and an iteration $j$,
$$W_i^{j} \neq W_i^{j+1}, \quad b_i^{j} \neq b_i^{j+1} \quad \forall j \in [0, N-1]$$
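The untrained-parameters rule can be sketched with NumPy as follows. This is a minimal sketch, not TheDeepChecker's implementation: the `snapshots` structure (one dict per iteration mapping a layer name to its weights and biases) and the `atol` tolerance are assumptions introduced for illustration.

```python
import numpy as np


def untrained_parameter_layers(snapshots, atol=0.0):
    """Return names of layers whose parameters never changed across iterations.

    `snapshots` is a hypothetical structure: a list with one entry per
    iteration (0..N), each a dict mapping layer name -> (W, b) NumPy arrays.
    """
    suspicious = []
    for name in snapshots[0]:
        changed = False
        for before, after in zip(snapshots[:-1], snapshots[1:]):
            w_prev, b_prev = before[name]
            w_next, b_next = after[name]
            # The issue holds only if W and b are identical at every
            # consecutive pair of iterations; any change clears the layer.
            same_w = np.allclose(w_prev, w_next, rtol=0.0, atol=atol)
            same_b = np.allclose(b_prev, b_next, rtol=0.0, atol=atol)
            if not (same_w and same_b):
                changed = True
                break
        if not changed:
            suspicious.append(name)
    return suspicious


# Toy snapshots: "dense_1" changes at every step, "dense_0" is frozen.
rng = np.random.default_rng(0)
frozen_w, frozen_b = rng.normal(size=(4, 3)), np.zeros(3)
snapshots = []
for step in range(4):
    snapshots.append({
        "dense_1": (rng.normal(size=(3, 2)), np.full(2, float(step))),
        "dense_0": (frozen_w.copy(), frozen_b.copy()),
    })
print(untrained_parameter_layers(snapshots))
```

A frozen layer like `dense_0` typically points to a layer that is disconnected from the loss or excluded from the optimizer's variable list.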
Specification of verification rules
Activation-related Issues: Activations out of Range, Neuron Saturation, Dead ReLU
Activations out of Range
Issue: given a layer $i$, $A_i \notin [\min, \max]$
Verification Routine: given a layer $i$, $\min \le A_i \le \max$
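The range check above translates directly into a few lines of NumPy. The sketch below also includes a dead-ReLU measure, but note that the slides do not specify that rule: the definition used here (a unit that outputs zero for every sample in a batch) is a common convention and an assumption, as are both function names.

```python
import numpy as np


def activations_in_range(activations, low, high):
    """Check the routine low <= A_i <= high for every value of a layer's output."""
    activations = np.asarray(activations)
    return bool(np.all((activations >= low) & (activations <= high)))


def dead_relu_fraction(activations):
    """Fraction of units whose ReLU output is zero for every sample in the batch.

    Assumes `activations` has shape (batch, units); this notion of "dead" is
    a common convention, not necessarily the tool's criterion.
    """
    activations = np.asarray(activations)
    dead_units = np.all(activations == 0, axis=0)
    return float(np.mean(dead_units))


# Sigmoid outputs must lie in [0, 1].
sigmoid_out = 1.0 / (1.0 + np.exp(-np.array([[-2.0, 0.0, 3.0]])))
print(activations_in_range(sigmoid_out, 0.0, 1.0))
print(activations_in_range(np.array([0.2, 1.7]), 0.0, 1.0))

# First unit is zero for both samples, second unit is active.
relu_out = np.maximum(0.0, np.array([[-1.0, 2.0], [-3.0, 0.5]]))
print(dead_relu_fraction(relu_out))
```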
TheDeepChecker verification rules…
Optimization-related Issues: Unable to fit a small sample, Zero Loss, Diverging Loss, Slow or Non decreasing Loss, Loss Fluctuations, Unstable Gradient: Exploding, Unstable Gradient: Vanishing
Unable to fit a small sample
Issue: The DNN could not properly minimize the loss.
Verification Routine: The DNN (with regularization off) should overfit a tiny sample of data. Given $N$ iterations, $\mathit{loss}_N = 0$.
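The "fit a small sample" routine can be illustrated with a toy example. In this sketch a tiny linear model trained by plain gradient descent stands in for the DNN, and the exact condition "loss equals zero" is relaxed to a numerical tolerance; the function names, the tolerance `tol`, and the data are all assumptions made for illustration.

```python
import numpy as np


def final_loss_on_small_sample(x, y, learning_rate, iterations):
    """Train a linear model with plain gradient descent (no regularization)
    on a tiny sample and return the mean squared error after the last step."""
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(iterations):
        residual = x @ weights + bias - y
        grad_w = 2.0 * (x.T @ residual) / len(y)
        grad_b = 2.0 * residual.mean()
        weights -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return float(np.mean((x @ weights + bias - y) ** 2))


def can_fit_small_sample(x, y, learning_rate=0.1, iterations=2000, tol=1e-6):
    """Verification routine sketch: the final loss should be numerically zero."""
    return final_loss_on_small_sample(x, y, learning_rate, iterations) <= tol


# Four samples generated by an exact linear rule, so zero loss is reachable.
x = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = 2.0 * x[:, 0] - 3.0 * x[:, 1] + 1.0

print(can_fit_small_sample(x, y))                     # healthy training setup
print(can_fit_small_sample(x, y, learning_rate=0.0))  # updates never applied
```

The second call mimics a faulty program in which parameter updates have no effect, so the loss stays at its initial value and the check fails.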
TheDeepChecker: Execution Flow
[Figure: execution flow. Labels: Original program, Monitoring, Monitored Program, Run, Pre-processing, Verification Routines (potential issues), Post-processing, Sanity Check of Program.]
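The idea of running the original program under monitoring and applying verification routines during training can be sketched as below. Everything here is hypothetical: the `LossMonitor` class, the `run_monitored` helper, and the divergence threshold are illustrative assumptions and do not reflect TheDeepChecker's actual API or thresholds.

```python
import math


class LossMonitor:
    """Hypothetical monitor: records the loss at each step and runs simple checks."""

    def __init__(self, divergence_factor=10.0):
        self.divergence_factor = divergence_factor  # assumed threshold
        self.losses = []
        self.issues = []

    def observe(self, step, loss):
        """Call once per training iteration with the current loss value."""
        self.losses.append(loss)
        if math.isnan(loss) or math.isinf(loss):
            self.issues.append(f"step {step}: loss is not finite")
        elif loss > self.divergence_factor * min(self.losses):
            self.issues.append(f"step {step}: loss diverging ({loss:.3g})")


def run_monitored(train_step, steps, monitor):
    """Run the original training step under monitoring and return found issues."""
    for step in range(steps):
        monitor.observe(step, train_step(step))
    return monitor.issues


# Toy "program": gradient descent on f(w) = w^2 with a step size that is too large.
state = {"w": 1.0}


def train_step(_step, learning_rate=1.5):
    state["w"] -= learning_rate * 2.0 * state["w"]  # w <- w - lr * f'(w)
    return state["w"] ** 2


issues = run_monitored(train_step, steps=5, monitor=LossMonitor())
for issue in issues:
    print(issue)
```

Because the checks run while training is in progress, a problem such as a diverging loss surfaces after a few iterations instead of after a full training run.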
Foutse Khomh, PhD, Ing.
foutse.khomh@polymtl.ca
@SWATLab
Canada CIFAR AI Chair
Emmanuel Thepie Fapi
Try it out!
