SlideShare a Scribd company logo
1 of 21
Download to read offline
1
Does Neuron Coverage Matter for Deep
Reinforcement Learning?
A Preliminary Study
Miller Trujillo, Mario Linares-Vásquez, Camilo Escobar-Velásquez, Ivana Dusparic, Nicolás Cardozo
2
Reinforcement
learning
• Learn optimal actions for specific environment
conditions by trial-and-error.
• Pick an action for a given state to maximize reward.
• Q-learning, state-action pairs.
Policy π(St, at)
3
Deep
Reinforcement
Learning (DRL)
• A neural network can be used to approximate
a policy function.
• Useful when the state space or action space
are too large to be known.
4
Is this the correct/expected behavior?
out
Testing DRL Behavior
In
• Reward-based
• Test case Generation
5
DNN Testing
• Proposes neuron coverage as test
adequacy metric for DNNs.
Neuron Coverage is an adaptation of statement coverage.
• Empirically demonstrates that changes
in neuron coverage are statistically
correlated with changes in the actions
of self-driving cars.
Objectives
Does traditional DNN testing techniques apply to DRL?
RQ0: Is there any difference in
the evolution patterns of neuron
coverage for the different layers
in a DeepRL system?
RQ1: Is there any correlation
between neuron coverage and
cumulative reward in a
DeepRL system?
6
7
Mountain Car initial state
The Mountain Car problem
Empirical Study Design
8
The Mountain Car problem
Goal:
Climb the mountain in maximum 200
steps.
Environment state:
Position ( ) and velocity ( ).
Actions:
Accelerate towards left, neutral/no
action, Accelerate towards right.
x v
Bounds:
The position is bounded by
The velocity is bounded by
Reward:
[−1.2,0.6]
[−0.07,0.07]
R(x, v) =
{
10, if x ≥ 0.5
−1, otherwise.
Empirical Study Design
We used two different models chosen randomly from GitHub.
Model A Model B
https://github.com/pylSERhttps://github.com/branavg
Empirical Study Design
Models
9
10
Empirical Study Design
Experiment description
11
Empirical Study Design
Models parameters
Batch size:
(learning rate)
(discount factor)
-greedy exploration strategy
starts at and decays after each episode.
32
α = 0.001
γ = 0.99
ϵ
ϵ 1 0.05
12
Empirical Study Design
Analysis Methods
• Neuron Coverage (NC)
• Cumulative Neuron Coverage
(CNC)
• Neuron Layered Coverage (NLC)
• Cumulative Neuron Layered
Coverage (CNLC)
Empirical Study Design
Analysis Methods Example
First iteration of first episode Second iteration of first episode Third iteration of first episode
L1L1 L1L2 L2 L2L3 L3 L3
Cumulative
13
/ This is the place

for subtitle /
Results
14
Results
CNC and CNLC distribution for training and
testing
Model A
15
Results
Model A
NC, NLC, and Reward per episode for training
R(x, v) =
{
10, if x ≥ 0.5
−1, otherwise.
/ This is the place

for subtitle /
Results
16
CNC and CNLC distribution for training and
testing
Results
Model B
17
Results
NC, NLC, and Reward per episode for training
Model B
R(x, v) =
{
10, if x ≥ 0.5
−1, otherwise.
Discussion
Better coverage in training does not
necessarily mean better coverage in
testing.
Neuron coverage is not sufficient to
reach substantial conclusions about
the design or structure of DeepRL
networks.
Training and testing evolution
patterns are not necessarily the
same.
19
Future work
• The evaluation of neuron coverage as
adequacy metric should be extended
to other DeepRL scenarios to
consolidate the validity of our results,
in exploring for a metric that can
correlate coverage and maximize
reward.
• A similar approach could be used to
evaluate other DNN-based learning
models, as means to evaluate the
applicability of coverage testing to
ML.
20
SUMMARY
Thank you!
Miller Trujillo
ma.trujillo10@uniandes.edu.co
Mario Linares-Vásquez
m.linaresv@uniandes.edu.co
Camilo Escobar-Velásquez
ca.escobar2434@uniandes.edu.co
Nicolás Cardozo
n.cardozo@uniandes.edu.co
Ivana Dusparic
duspari@scss.tcd.ie

More Related Content

What's hot

PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...Jinwon Lee
 
End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersSeunghyun Hwang
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkNAVER Engineering
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionKishor Datta Gupta
 
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...gabrielesisinna
 
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...Deep Learning JP
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationYogendra Tamang
 
The evaluation for the defense of adversarial attacks
The evaluation for the defense of adversarial attacksThe evaluation for the defense of adversarial attacks
The evaluation for the defense of adversarial attacksSimossyi Funabashi
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detectionDaeHeeKim31
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningMLAI2
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningﺁﺻﻒ ﻋﻠﯽ ﻣﯿﺮ
 
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper reviewtaeseon ryu
 
Download It
Download ItDownload It
Download Itbutest
 
Incremental Discretization for Naive Bayes Learning using FIFFD
Incremental Discretization for Naive Bayes Learning using FIFFDIncremental Discretization for Naive Bayes Learning using FIFFD
Incremental Discretization for Naive Bayes Learning using FIFFDijsrd.com
 
Mobilenetv1 v2 slide
Mobilenetv1 v2 slideMobilenetv1 v2 slide
Mobilenetv1 v2 slide威智 黃
 
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlowReproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlowUniversitat Politècnica de Catalunya
 
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...taeseon ryu
 
Restricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for AttributionRestricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for Attributiontaeseon ryu
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNNShuai Zhang
 

What's hot (20)

PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
PR-330: How To Train Your ViT? Data, Augmentation, and Regularization in Visi...
 
End-to-End Object Detection with Transformers
End-to-End Object Detection with TransformersEnd-to-End Object Detection with Transformers
End-to-End Object Detection with Transformers
 
Research of adversarial example on a deep neural network
Research of adversarial example on a deep neural networkResearch of adversarial example on a deep neural network
Research of adversarial example on a deep neural network
 
Policy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detectionPolicy Based reinforcement Learning for time series Anomaly detection
Policy Based reinforcement Learning for time series Anomaly detection
 
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
Deep Learning in Robotics: Robot gains Social Intelligence through Multimodal...
 
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...
【DL輪読会】Spectral Normalisation for Deep Reinforcement Learning: An Optimisatio...
 
Efficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image ClassficationEfficient Neural Network Architecture for Image Classfication
Efficient Neural Network Architecture for Image Classfication
 
The evaluation for the defense of adversarial attacks
The evaluation for the defense of adversarial attacksThe evaluation for the defense of adversarial attacks
The evaluation for the defense of adversarial attacks
 
Focal loss for dense object detection
Focal loss for dense object detectionFocal loss for dense object detection
Focal loss for dense object detection
 
Task Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive LearningTask Adaptive Neural Network Search with Meta-Contrastive Learning
Task Adaptive Neural Network Search with Meta-Contrastive Learning
 
Few shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learningFew shot learning/ one shot learning/ machine learning
Few shot learning/ one shot learning/ machine learning
 
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
[딥논읽] Meta-Transfer Learning for Zero-Shot Super-Resolution paper review
 
Download It
Download ItDownload It
Download It
 
Incremental Discretization for Naive Bayes Learning using FIFFD
Incremental Discretization for Naive Bayes Learning using FIFFDIncremental Discretization for Naive Bayes Learning using FIFFD
Incremental Discretization for Naive Bayes Learning using FIFFD
 
Mobilenetv1 v2 slide
Mobilenetv1 v2 slideMobilenetv1 v2 slide
Mobilenetv1 v2 slide
 
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlowReproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
Reproducing and Analyzing Adaptive Computation Time in PyTorch and TensorFlow
 
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
Switch Transformers: Scaling to Trillion Parameter Models with Simple and Eff...
 
Restricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for AttributionRestricting the Flow: Information Bottlenecks for Attribution
Restricting the Flow: Information Bottlenecks for Attribution
 
AlexNet
AlexNetAlexNet
AlexNet
 
Introduction to CNN
Introduction to CNNIntroduction to CNN
Introduction to CNN
 

Similar to Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study

AI Class Topic 6: Easy Way to Learn Deep Learning AI Technologies
AI Class Topic 6: Easy Way to Learn Deep Learning AI TechnologiesAI Class Topic 6: Easy Way to Learn Deep Learning AI Technologies
AI Class Topic 6: Easy Way to Learn Deep Learning AI TechnologiesValue Amplify Consulting
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsValery Tkachenko
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Dalei Li
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.pptyang947066
 
Deep learning: Modeling high-level face features through deep networks
Deep learning: Modeling high-level face features through deep networksDeep learning: Modeling high-level face features through deep networks
Deep learning: Modeling high-level face features through deep networksNelson Forte
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learningAkshay Kanchan
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierNeha Kulkarni
 
Knowledge distillation deeplab
Knowledge distillation deeplabKnowledge distillation deeplab
Knowledge distillation deeplabFrozen Paradise
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautzbutest
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautzbutest
 
EssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdfEssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdfAnkita Tiwari
 
[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalizationJaeJun Yoo
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESVikash Kumar
 
Neural Networks in Data Mining - “An Overview”
Neural Networks  in Data Mining -   “An Overview”Neural Networks  in Data Mining -   “An Overview”
Neural Networks in Data Mining - “An Overview”Dr.(Mrs).Gethsiyal Augasta
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningMLAI2
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...Big Data Spain
 
An Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsAn Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsSajib Sen
 

Similar to Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study (20)

AI Class Topic 6: Easy Way to Learn Deep Learning AI Technologies
AI Class Topic 6: Easy Way to Learn Deep Learning AI TechnologiesAI Class Topic 6: Easy Way to Learn Deep Learning AI Technologies
AI Class Topic 6: Easy Way to Learn Deep Learning AI Technologies
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
 
Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...Two strategies for large-scale multi-label classification on the YouTube-8M d...
Two strategies for large-scale multi-label classification on the YouTube-8M d...
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Deep learning: Modeling high-level face features through deep networks
Deep learning: Modeling high-level face features through deep networksDeep learning: Modeling high-level face features through deep networks
Deep learning: Modeling high-level face features through deep networks
 
Intro to machine learning
Intro to machine learningIntro to machine learning
Intro to machine learning
 
K-Nearest Neighbor Classifier
K-Nearest Neighbor ClassifierK-Nearest Neighbor Classifier
K-Nearest Neighbor Classifier
 
DNN Model Interpretability
DNN Model InterpretabilityDNN Model Interpretability
DNN Model Interpretability
 
slides.pdf
slides.pdfslides.pdf
slides.pdf
 
Mnist soln
Mnist solnMnist soln
Mnist soln
 
Knowledge distillation deeplab
Knowledge distillation deeplabKnowledge distillation deeplab
Knowledge distillation deeplab
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautz
 
Learning to Search Henry Kautz
Learning to Search Henry KautzLearning to Search Henry Kautz
Learning to Search Henry Kautz
 
EssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdfEssentialsOfMachineLearning.pdf
EssentialsOfMachineLearning.pdf
 
[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization[PR12] understanding deep learning requires rethinking generalization
[PR12] understanding deep learning requires rethinking generalization
 
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHESIMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
IMAGE CLASSIFICATION USING DIFFERENT CLASSICAL APPROACHES
 
Neural Networks in Data Mining - “An Overview”
Neural Networks  in Data Mining -   “An Overview”Neural Networks  in Data Mining -   “An Overview”
Neural Networks in Data Mining - “An Overview”
 
Online Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual LearningOnline Coreset Selection for Rehearsal-based Continual Learning
Online Coreset Selection for Rehearsal-based Continual Learning
 
State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...State of the art time-series analysis with deep learning by Javier Ordóñez at...
State of the art time-series analysis with deep learning by Javier Ordóñez at...
 
An Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their ApplicationsAn Updated Survey on Niching Methods and Their Applications
An Updated Survey on Niching Methods and Their Applications
 

More from Universidad de los Andes

An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...Universidad de los Andes
 
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript ProgramsUniversidad de los Andes
 
[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...Universidad de los Andes
 
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning ProjectsUniversidad de los Andes
 
[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile AppsUniversidad de los Andes
 
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...Universidad de los Andes
 
[CCC'21] Evaluation of Work Stealing Algorithms
[CCC'21] Evaluation of Work Stealing Algorithms[CCC'21] Evaluation of Work Stealing Algorithms
[CCC'21] Evaluation of Work Stealing AlgorithmsUniversidad de los Andes
 
Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...Universidad de los Andes
 
Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...Universidad de los Andes
 
Learning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptationsLearning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptationsUniversidad de los Andes
 
CQL: declarative language for context activation
CQL: declarative language for context activationCQL: declarative language for context activation
CQL: declarative language for context activationUniversidad de los Andes
 
Generating software adaptations using machine learning
Generating software adaptations using machine learningGenerating software adaptations using machine learning
Generating software adaptations using machine learningUniversidad de los Andes
 
[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finales[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finalesUniversidad de los Andes
 
Programming language techniques for adaptive software
Programming language techniques for adaptive softwareProgramming language techniques for adaptive software
Programming language techniques for adaptive softwareUniversidad de los Andes
 
Peace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contextsPeace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contextsUniversidad de los Andes
 

More from Universidad de los Andes (18)

An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...An expressive and modular layer activation mechanism for Context-Oriented Pro...
An expressive and modular layer activation mechanism for Context-Oriented Pro...
 
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
[FTfJP23] Points-to Analysis for Context-oriented Javascript Programs
 
[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...[JIST] Programming language implementations for context-oriented self-adaptiv...
[JIST] Programming language implementations for context-oriented self-adaptiv...
 
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
[CAIN'23] Prevalence of Code Smells in Reinforcement Learning Projects
 
[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps[CIbSE2023] Cross-language clone detection for Mobile Apps
[CIbSE2023] Cross-language clone detection for Mobile Apps
 
Keeping Up! with LaTeX
Keeping Up! with LaTeXKeeping Up! with LaTeX
Keeping Up! with LaTeX
 
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
[JPDC,JCC@LMN22] Ad hoc systems Management and specification with distributed...
 
[CCC'21] Evaluation of Work Stealing Algorithms
[CCC'21] Evaluation of Work Stealing Algorithms[CCC'21] Evaluation of Work Stealing Algorithms
[CCC'21] Evaluation of Work Stealing Algorithms
 
Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...Generating Adaptations from the System Execution using Reinforcement Learning...
Generating Adaptations from the System Execution using Reinforcement Learning...
 
Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...Language Abstractions and Techniques for Developing Collective Adaptive Syste...
Language Abstractions and Techniques for Developing Collective Adaptive Syste...
 
Learning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptationsLearning run-time composition of interacting adaptations
Learning run-time composition of interacting adaptations
 
Distributed context Petri nets
Distributed context Petri netsDistributed context Petri nets
Distributed context Petri nets
 
CQL: declarative language for context activation
CQL: declarative language for context activationCQL: declarative language for context activation
CQL: declarative language for context activation
 
Generating software adaptations using machine learning
Generating software adaptations using machine learningGenerating software adaptations using machine learning
Generating software adaptations using machine learning
 
[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finales[Bachelor_project] Asignación de exámenes finales
[Bachelor_project] Asignación de exámenes finales
 
Programming language techniques for adaptive software
Programming language techniques for adaptive softwareProgramming language techniques for adaptive software
Programming language techniques for adaptive software
 
Peace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contextsPeace COrP: Learning to solve conflicts between contexts
Peace COrP: Learning to solve conflicts between contexts
 
Emergent Software Services
Emergent Software ServicesEmergent Software Services
Emergent Software Services
 

Recently uploaded

The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13Steve Thomason
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991RKavithamani
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docxPoojaSen20
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphThiyagu K
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationnomboosow
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3JemimahLaneBuaron
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingTechSoup
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introductionMaksud Ahmed
 

Recently uploaded (20)

The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13The Most Excellent Way | 1 Corinthians 13
The Most Excellent Way | 1 Corinthians 13
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
Industrial Policy - 1948, 1956, 1973, 1977, 1980, 1991
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
mini mental status format.docx
mini    mental       status     format.docxmini    mental       status     format.docx
mini mental status format.docx
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Z Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot GraphZ Score,T Score, Percential Rank and Box Plot Graph
Z Score,T Score, Percential Rank and Box Plot Graph
 
Interactive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communicationInteractive Powerpoint_How to Master effective communication
Interactive Powerpoint_How to Master effective communication
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3Q4-W6-Restating Informational Text Grade 3
Q4-W6-Restating Informational Text Grade 3
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Grant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy ConsultingGrant Readiness 101 TechSoup and Remy Consulting
Grant Readiness 101 TechSoup and Remy Consulting
 
microwave assisted reaction. General introduction
microwave assisted reaction. General introductionmicrowave assisted reaction. General introduction
microwave assisted reaction. General introduction
 
Staff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSDStaff of Color (SOC) Retention Efforts DDSD
Staff of Color (SOC) Retention Efforts DDSD
 

Does Neuron Coverage Matter for Deep Reinforcement Learning? A preliminary study

  • 1. 1 Does Neuron Coverage Matter for Deep Reinforcement Learning? A Preliminary Study Miller Trujillo, Mario Linares-Vásquez, Camilo Escobar-Velásquez, Ivana Dusparic, Nicolás Cardozo
  • 2. 2 Reinforcement learning • Learn optimal actions for specific environment conditions by trial-and-error. • Pick an action for a given state to maximize reward. • Q-learning, state-action pairs. Policy π(St, at)
  • 3. 3 Deep Reinforcement Learning (DRL) • A neural network can be used to approximate a policy function. • Useful when the state space or action space are too large to be known.
  • 4. 4 Is this the correct/expected behavior? out Testing DRL Behavior In • Reward-based • Test case Generation
  • 5. 5 DNN Testing • Proposes neuron coverage as test adequacy metric for DNNs. Neuron Coverage is an adaptation of statement coverage. • Empirically demonstrates that changes in neuron coverage are statistically correlated with changes in the actions of self-driving cars.
  • 6. Objectives Does traditional DNN testing techniques apply to DRL? RQ0: Is there any difference in the evolution patterns of neuron coverage for the different layers in a DeepRL system? RQ1: Is there any correlation between neuron coverage and cumulative reward in a DeepRL system? 6
  • 7. 7 Mountain Car initial state The Mountain Car problem Empirical Study Design
  • 8. 8 The Mountain Car problem Goal: Climb the mountain in maximum 200 steps. Environment state: Position ( ) and velocity ( ). Actions: Accelerate towards left, neutral/no action, Accelerate towards right. x v Bounds: The position is bounded by The velocity is bounded by Reward: [−1.2,0.6] [−0.07,0.07] R(x, v) = { 10, if x ≥ 0.5 −1, otherwise. Empirical Study Design
  • 9. We used two different models chosen randomly from GitHub. Model A Model B https://github.com/pylSERhttps://github.com/branavg Empirical Study Design Models 9
  • 11. 11 Empirical Study Design Models parameters Batch size: (learning rate) (discount factor) -greedy exploration strategy starts at and decays after each episode. 32 α = 0.001 γ = 0.99 ϵ ϵ 1 0.05
  • 12. 12 Empirical Study Design Analysis Methods • Neuron Coverage (NC) • Cumulative Neuron Coverage (CNC) • Neuron Layered Coverage (NLC) • Cumulative Neuron Layered Coverage (CNLC)
  • 13. Empirical Study Design Analysis Methods Example First iteration of first episode Second iteration of first episode Third iteration of first episode L1L1 L1L2 L2 L2L3 L3 L3 Cumulative 13
  • 14. / This is the place
 for subtitle / Results 14 Results CNC and CNLC distribution for training and testing Model A
  • 15. 15 Results Model A NC, NLC, and Reward per episode for training R(x, v) = { 10, if x ≥ 0.5 −1, otherwise.
  • 16. / This is the place
 for subtitle / Results 16 CNC and CNLC distribution for training and testing Results Model B
  • 17. 17 Results NC, NLC, and Reward per episode for training Model B R(x, v) = { 10, if x ≥ 0.5 −1, otherwise.
  • 18. Discussion Better coverage in training does not necessarily mean better coverage in testing. Neuron coverage is not sufficient to reach substantial conclusions about the design or structure of DeepRL networks. Training and testing evolution patterns are not necessarily the same.
  • 19. 19 Future work • The evaluation of neuron coverage as adequacy metric should be extended to other DeepRL scenarios to consolidate the validity of our results, in exploring for a metric that can correlate coverage and maximize reward. • A similar approach could be used to evaluate other DNN-based learning models, as means to evaluate the applicability of coverage testing to ML.
  • 21. Thank you! Miller Trujillo ma.trujillo10@uniandes.edu.co Mario Linares-Vásquez m.linaresv@uniandes.edu.co Camilo Escobar-Velásquez ca.escobar2434@uniandes.edu.co Nicolás Cardozo n.cardozo@uniandes.edu.co Ivana Dusparic duspari@scss.tcd.ie