ETeMoX: explaining reinforcement learning
J. M. Parra-Ullauri1, A. García-Domínguez2, N. Bencomo3, C. Zheng4, C. Zhen5, J. Boubeta-Puig6, G. Ortiz6, S. Yang7
1: Aston University, 2: University of York, 3: Durham University
4: University of Oxford, 5: University of Science and Technology of China
6: University of Cadiz, 7: Edinburgh Napier University
MODELS 2022 - Thursday October 27th, 2022
October 27th, 2022 Event-driven temporal models for explanations - ETeMoX: explaining reinforcement learning
Rising need for explanations in self-adaptation / AI
● Software is being written to deal with more and more complex environments,
where it needs to reconfigure itself and learn from experience
● If we are not careful, these systems can be “black boxes” where we can only take
their decisions at face value - it will be hard to calibrate our trust in them
● We want to be able to ask things like:
○ Why did they take that action?
○ Why did they not take that *other* action?
○ How do you (roughly) work?
● The “right to explanation” is being enshrined in the GDPR and in the IEEE P7001
standard for transparency of autonomous systems
● There is an entire field devoted to eXplainable AI (XAI)
Types of explanations and stages involved
● Explanations can be broadly classified by scope into:
○ Local - for a specific decision
○ Global - for the overall behaviour of the system (usually, a simplified behavioural model)
● Adadi et al. identified four uses for explanations:
○ To justify decisions impacting people
○ To control systems, keeping them within an envelope of good behaviour ⬅
○ To discover knowledge from the system behaviour
○ To improve the system by highlighting flaws ⬅
● Neerincx considered three stages for producing these explanations:
○ Generation - obtain necessary data and reason about it ⬅
○ Communication - show it to consumer (human / system) ⬅
○ Reception - was it effective and efficient?
How can MDE help XAI?
● In Model-Driven Engineering (MDE), we already have significant experience
abstracting away unnecessary complexity
● At design time, we raise the level of abstraction so developers of a system
can think in terms of their domain concepts
● We can also do this while the system is running - we can build a model of
what the system is perceiving, thinking, and doing (a runtime model)
● If we decide on a common trace metamodel for this, we can reuse efforts on
introducing explainability across systems
MDE: Reusable Trace Metamodel - common half
MDE: Reusable Trace Metamodel - specific half
● First half of the metamodel is
reusable across systems
making their own decisions
● Second half of the metamodel is
specific - this one is for systems
using Q-Learning (a type of
Reinforcement Learning)
● A decision takes into account
the Q-values of each Action
● Observations have rewards
associated with them, and map to
a state in the Q-table
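As a rough illustration of the Q-Learning-specific half, the concepts named above could be written down as plain Python data classes. This is only a sketch: the class and field names (`Action`, `Observation`, `Decision`, `q_values`, `chosen`) are hypothetical and are not taken from the actual metamodel shown on the slide.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Action:
    name: str


@dataclass
class Observation:
    state: int     # index of the matching state (row) in the Q-table
    reward: float  # reward associated with this observation


@dataclass
class Decision:
    observation: Observation
    q_values: Dict[str, float]  # Q-value of each candidate Action, keyed by name
    chosen: Action              # the Action that was actually taken

    def best_action(self) -> str:
        """Name of the Action with the highest Q-value."""
        return max(self.q_values, key=self.q_values.get)


# Illustrative values only, not from the case study
obs = Observation(state=3, reward=-1.0)
decision = Decision(obs, {"left": 0.2, "right": 0.7}, Action("right"))
```

Recording the Q-values next to the chosen action is what later makes it possible to ask why an action was (or was not) taken.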
MDE: Indexing Models into Temporal Graph DBs
● At each system timeslice, the
runtime model is indexed into a
temporal graph
● Efficient representation of a
graph’s full history, using
copy-on-write state chunks
● Implemented by Greycat (from
Hartmann et al.), and used by
Eclipse Hawk for automated
model indexing
● More details here:
https://www.eclipse.org/hawk/advanced-use/temporal-queries/
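The copy-on-write idea can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions (a single node, writes arriving in increasing time order); the `TemporalNode` class is hypothetical and does not reflect how Greycat or Hawk are implemented.

```python
import bisect


class TemporalNode:
    """Stores one state chunk per change, rather than one per timeslice."""

    def __init__(self):
        self._times = []   # timepoints at which the state changed (sorted)
        self._chunks = []  # state chunk valid from the matching timepoint

    def write(self, time, **changes):
        # Assumes writes arrive in increasing time order
        current = self._chunks[-1] if self._chunks else {}
        updated = {**current, **changes}
        if self._chunks and updated == current:
            return  # nothing changed: keep reusing the existing chunk
        self._times.append(time)
        self._chunks.append(updated)

    def read(self, time):
        """State at the given time, or None if the node did not exist yet."""
        i = bisect.bisect_right(self._times, time) - 1
        return self._chunks[i] if i >= 0 else None

    def chunk_count(self):
        return len(self._chunks)


node = TemporalNode()
node.write(0, state=1, reward=0.0)
node.write(1, state=1, reward=0.0)  # unchanged timeslice: no new chunk
node.write(2, reward=0.5)           # only the reward changes
```

Here three timeslices are indexed but only two chunks are stored, and any past timepoint can still be read back.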
MDE: History-Aware Model Querying
● For explanation generation, we
can query the temporal graph
● We created a Hawk-specific
dialect of EOL with time-aware
predicates and properties
● More details in our MODELS
2019 paper
AGD, NB, JMPU and LGP, ‘Querying and annotating model histories with time-aware patterns’,
http://dx.doi.org/10.1109/MODELS.2019.000-2
Version traversal: x.versions, x.next, x.prev, x.time, x.earliest, x.latest…
Temporal assertions: x.always(version | p), x.never(v | p), x.eventually(v | p)...
Predicate-based scoping: x.since(v | p), x.until(v | p)...
Context-based scoping: x.sinceThen, x.untilThen…
Unscoping: x.unscoped
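To give a feel for what the temporal assertions and scoping primitives compute, here is a minimal Python analogue over a plain list of versions. It is not the Hawk EOL dialect: the function names simply mirror the primitives listed above, and the exact semantics chosen for `since` (start at the first matching version, inclusive) is an assumption made for this sketch.

```python
def always(versions, p):
    """True if p holds in every version."""
    return all(p(v) for v in versions)


def never(versions, p):
    """True if p holds in no version."""
    return not any(p(v) for v in versions)


def eventually(versions, p):
    """True if p holds in at least one version."""
    return any(p(v) for v in versions)


def since(versions, p):
    """Versions from the first one matching p onwards (empty if none match)."""
    for i, v in enumerate(versions):
        if p(v):
            return versions[i:]
    return []


# Illustrative reward history of one model element, oldest version first
rewards = [-1.0, 0.0, 0.5, 0.2]
```

For example, `eventually(rewards, lambda r: r > 0.4)` asks whether the reward ever exceeded 0.4, and `since(rewards, lambda r: r > 0.4)` narrows the history to start from that moment.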
Scaling up temporal graphs to large event volumes
● We first applied history-based explanations to Bayesian Learning-based systems
○ Partially Observable Markov Decision Processes (POMDP)
○ Had a case study on data mirroring over the network (Remote Data Mirroring)
○ Wasn’t too resource-intensive (we could simply record all versions)
● Then we tried applying it to a Reinforcement Learning system
○ Tens of training epochs, each with thousands of episodes
○ Original RL system had per-timeslice MongoDB with GBs of records to be indexed
○ RL system changed to send updates directly to Hawk - copy-on-write (CoW) reduced storage needs
○ Still a lot of history to go through - queries could take a long time!
● Do we really need all this history?
○ Answer: No.
○ How do we select the “right” moments, without imposing too much load?
Event-Driven Monitoring: Complex Event Processing
Event-driven Temporal Models for eXplanations (ETeMoX)
Case study: airborne base stations
Experiment 1: Evolution of metrics (optionally sampled)
● We evaluated the impact of
sampling at different rates on
the accuracy of a query
providing the historic reward
values during the RL training
● We set up the CEP engine with
Esper EPL rules as in the top
right
● We observed linear decreases
in storage required depending
on sampling rate
● 10% sampling is safe; beyond
that, it depended on the RL
algorithm (DQN is sensitive!)
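The storage/accuracy trade-off of sampling can be illustrated with a small Python sketch. The experiment itself used Esper EPL rules in the CEP engine; the periodic "keep one in n" rule below, the `keep_one_in` helper and the reward values are all made up for illustration.

```python
from statistics import mean


def keep_one_in(events, n):
    """Periodic sampling: keep the 1st, (n+1)th, (2n+1)th... event."""
    return events[::n]


# Hypothetical per-episode rewards, not data from the experiment
rewards = [float(i % 7) for i in range(1000)]

for n in (1, 2, 10):
    sampled = keep_one_in(rewards, n)
    error = abs(mean(sampled) - mean(rewards))
    print(f"1 in {n}: {len(sampled)} events stored, mean reward off by {error:.3f}")
```

The number of stored events shrinks in proportion to the sampling step, while the error of a query over the sampled history depends on how quickly the underlying metric varies.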
[Figure panels: Q-Learning, DQN]
Experiment 2: Exploration vs Exploitation (1/2)
● RL systems don’t always pick the best
option (exploitation): they try things
sometimes to learn more (exploration)
● How often does this happen?
● We compared two approaches to track this:
a. CEP pattern to detect exploration/exploitation and
only index episodes with exploration
b. EOL query on full history, to check CEP pattern
correctness
● Q-Learning explored 1.41% of the time,
SARSA 7.99%, DQN 7.82%
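One way to picture the exploration check is the following Python sketch, which flags a decision as exploration when the chosen action does not have the highest Q-value. This rule, and the `is_exploration` and `exploration_rate` helpers, are assumptions for illustration; the actual CEP pattern and EOL query are not shown in the slides.

```python
def is_exploration(q_values, chosen):
    """True if the chosen action does not have the highest Q-value."""
    return q_values[chosen] < max(q_values.values())


def exploration_rate(decisions):
    """Fraction of decisions flagged as exploration."""
    explored = sum(1 for q_values, chosen in decisions
                   if is_exploration(q_values, chosen))
    return explored / len(decisions)


# Made-up decisions: (Q-values per action, chosen action)
decisions = [
    ({"stay": 0.9, "move": 0.1}, "stay"),  # exploitation
    ({"stay": 0.9, "move": 0.1}, "move"),  # exploration
    ({"stay": 0.2, "move": 0.4}, "move"),  # exploitation
    ({"stay": 0.5, "move": 0.5}, "stay"),  # tie: counted as exploitation
]
```

Note that a random exploratory pick that happens to land on the best action cannot be told apart from exploitation using Q-values alone, so a rule like this gives a lower bound.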
Experiment 2: Exploration vs Exploitation (2/2)
● We tried using the exploration CEP rule as a filter for metric evolution, too
Experiment 3: user handovers between stations
● To provide continuous service, a user may be handed over to another station
● We wrote an EOL query to detect handovers in the system history
○ Handover: signal-to-noise ratio changes significantly across stations between timepoints
○ Found 1784 handovers in Q-Learning, 590 in SARSA, 82176 in DQN
● These queries required many checks:
○ 10 episodes, 2000 time steps
○ 2 stations, 1050 users
○ All together: 42M combinations to check!
● Required times:
○ 917s for Q-Learning
○ 1,132s for SARSA
○ 7,914s for DQN
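The size of the search space follows directly from the figures above, and the kind of check involved can be sketched in Python. The slides only say that the signal-to-noise ratio "changes significantly across stations"; the concrete rule below (the strongest station changes, and the new one leads by at least `threshold`) and the `count_handovers` helper are assumptions made for this sketch, not the EOL query used in the experiment.

```python
# Figures from the slide: every (episode, time step, station, user) is checked
episodes, time_steps, stations, users = 10, 2000, 2, 1050
combinations = episodes * time_steps * stations * users  # 42,000,000


def count_handovers(snr_by_time, threshold):
    """Count handovers of one user; snr_by_time[t][s] is the SNR to station s at time t."""
    handovers = 0
    for prev, curr in zip(snr_by_time, snr_by_time[1:]):
        best_prev = max(range(len(prev)), key=prev.__getitem__)
        best_curr = max(range(len(curr)), key=curr.__getitem__)
        # Assumed rule: strongest station changed, by a significant margin
        if best_prev != best_curr and curr[best_curr] - curr[best_prev] >= threshold:
            handovers += 1
    return handovers


# Made-up SNR readings for one user and two stations
snr = [[10.0, 2.0], [9.0, 3.0], [1.0, 8.0], [1.5, 8.5]]
```

Because each check compares consecutive timepoints for every user and station, the query time grows with the full product of these dimensions.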
What’s next?
● Optimise queries via Hawk timeline annotations and CEP time windows
● Explanations for other uses:
○ Human-in-the-loop (SAM 2022 presentation on Monday showed early work)
○ Hyper-parameter optimisation (external system requests explanations and drives change)
○ Global explanations of the system behaviour (event graphs)
● Studying explanation reception:
○ Effectiveness of explanation formats: plots, results, generated text, diagrams…
○ Focused on system developers so far: look into less technical audiences
○ Consider existing models for evaluation
■ Technology Acceptance Model (Davis)
■ XAI metrics (Rosenfeld)
Thank you!
Antonio García-Domínguez
@antoniogado - a.garcia-dominguez@york.ac.uk
Juan Marcelo Parra-Ullauri
j.parra-ullauri@aston.ac.uk
 

MODELS 2022 Journal-First presentation: ETeMoX - explaining reinforcement learning

  • 1. ETeMoX: explaining reinforcement learning
    J. M. Parra-Ullauri (1), A. García-Domínguez (2), N. Bencomo (3), C. Zheng (4), C. Zhen (5), J. Boubeta-Puig (6), G. Ortiz (6), S. Yang (7)
    1: Aston University, 2: University of York, 3: Durham University, 4: University of Oxford, 5: University of Science and Technology of China, 6: University of Cadiz, 7: Edinburgh Napier University
    MODELS 2022 - Thursday October 27th, 2022
  • 2. Rising need for explanations in self-adaptation / AI
    ● Software is being written to deal with more and more complex environments, where it needs to reconfigure itself and learn from experience
    ● If we are not careful, these systems can be “black boxes” whose decisions we can only take at face value - it will be hard to calibrate our trust in them
    ● We want to be able to ask things like:
      ○ Why did they take that action?
      ○ Why did they not take that *other* action?
      ○ How do you (roughly) work?
    ● The “right to explanation” is being enshrined in the GDPR and in the IEEE P7001 standard for transparency of autonomous systems
    ● There is an entire field on eXplainable AI (XAI)
  • 3. Types of explanations and stages involved
    ● Explanations can be broadly classified by scope into:
      ○ Local - for a specific decision
      ○ Global - for the overall behaviour of the system (usually, a simplified behavioural model)
    ● Adadi et al. identified four uses for explanations:
      ○ To justify decisions impacting people
      ○ To control systems into an envelope of good behaviour ⬅
      ○ To discover knowledge from the system behaviour
      ○ To improve the system by highlighting flaws ⬅
    ● Neerincx considered three stages for producing these explanations:
      ○ Generation - obtain necessary data and reason about it ⬅
      ○ Communication - show it to consumer (human / system) ⬅
      ○ Reception - was it effective and efficient?
  • 4. How can MDE help XAI?
    ● In Model-Driven Engineering (MDE), we already have significant experience abstracting away unnecessary complexity
    ● At design time, we raise the level of abstraction so developers of a system can think in terms of their domain concepts
    ● We can also do this while the system is running - we can build a model of what the system is perceiving, thinking, and doing (a runtime model)
    ● If we decide on a common trace metamodel for this, we can reuse efforts on introducing explainability across systems
  • 5. MDE: Reusable Trace Metamodel - common half
  • 6. MDE: Reusable Trace Metamodel - specific half
    ● First half of the metamodel is reusable across systems making their own decisions
    ● Second half of the metamodel is specific - this one is for systems using Q-Learning (a type of Reinforcement Learning)
    ● A decision takes into account the Q-values of each Action
    ● Observations have rewards associated with them, and map to a state in the Q-table
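As a rough illustration of what the Q-Learning-specific half records, the sketch below pairs the standard tabular Q-learning update with a hypothetical `Decision` trace record. The class and field names are assumptions made for this example and are not taken from the ETeMoX metamodel.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    """Hypothetical trace record: one decision and the Q-values it considered."""
    timeslice: int
    state: int
    q_values: dict  # action name -> Q-value at decision time
    chosen: str
    reward: float


def q_update(q_table, state, action, reward, next_state, alpha=0.5, gamma=0.9):
    """Standard tabular Q-learning update; returns the new Q-value."""
    best_next = max(q_table[next_state].values())
    old = q_table[state][action]
    q_table[state][action] = old + alpha * (reward + gamma * best_next - old)
    return q_table[state][action]


# Toy Q-table with two states and two actions (values are made up).
q_table = {0: {"left": 0.0, "right": 0.0}, 1: {"left": 0.0, "right": 1.0}}

# Record the decision (with a snapshot of the Q-values) before learning from it.
trace = [Decision(timeslice=0, state=0, q_values=dict(q_table[0]),
                  chosen="right", reward=1.0)]
new_q = q_update(q_table, state=0, action="right", reward=1.0, next_state=1)
```

Snapshotting the Q-values at decision time is what later lets a query ask why one action was preferred over another.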
  • 7. MDE: Indexing Models into Temporal Graph DBs
    ● At each system timeslice, the runtime model is indexed into a temporal graph
    ● Efficient representation of a graph’s full history, using copy-on-write state chunks
    ● Implemented by Greycat (from Hartmann et al.), and used by Eclipse Hawk for automated model indexing
    ● More details here: https://www.eclipse.org/hawk/advanced-use/temporal-queries/
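The idea of keeping a full history without copying unchanged state can be sketched in a few lines. This is a simplified illustration only: `TemporalNode` is a hypothetical class, not Greycat's or Hawk's actual data structure, and it assumes timepoints arrive in increasing order.

```python
import bisect


class TemporalNode:
    """Hypothetical node that stores a new state chunk only when something changes."""

    def __init__(self):
        self._times = []   # timepoints at which a new chunk was written
        self._states = []  # state chunk valid from the matching timepoint

    def set(self, time, **changes):
        if self._times and time <= self._times[-1]:
            raise ValueError("timepoints must be strictly increasing")
        current = self._states[-1] if self._states else {}
        new_state = {**current, **changes}
        if self._states and new_state == current:
            return  # unchanged: reuse the previous chunk, write nothing
        self._times.append(time)
        self._states.append(new_state)

    def at(self, time):
        """State as of the given timepoint, or None before the first version."""
        idx = bisect.bisect_right(self._times, time) - 1
        return self._states[idx] if idx >= 0 else None

    def versions(self):
        return list(zip(self._times, self._states))


node = TemporalNode()
node.set(0, reward=0.0)
node.set(1, reward=0.0)  # no change, so no new chunk is stored
node.set(2, reward=1.5)
```

Here three timeslices produce only two stored chunks, while `node.at(1)` still answers with the state that was valid at that time.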
  • 8. MDE: History-Aware Model Querying
    ● For explanation generation, we can query the temporal graph
    ● We created a Hawk-specific dialect of EOL with time-aware predicates and properties:
      ○ Version traversal: x.versions, x.next, x.prev, x.time, x.earliest, x.latest…
      ○ Temporal assertions: x.always(version | p), x.never(v | p), x.eventually(v | p)...
      ○ Predicate-based scoping: x.since(v | p), x.until(v | p)...
      ○ Context-based scoping: x.sinceThen, x.untilThen…
      ○ Unscoping: x.unscoped
    ● More details in our MODELS 2019 paper: AGD, NB, JMPU and LGP, ‘Querying and annotating model histories with time-aware patterns’, http://dx.doi.org/10.1109/MODELS.2019.000-2
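To give a feel for the temporal assertions and predicate-based scoping listed above, here is a loose Python analogue over a plain list of versions. It is not Hawk's EOL dialect, the function signatures are assumptions for this sketch, and the exact semantics of the real primitives may differ.

```python
def always(versions, p):
    """True if the predicate holds in every version."""
    return all(p(v) for v in versions)


def never(versions, p):
    """True if the predicate holds in no version."""
    return not any(p(v) for v in versions)


def eventually(versions, p):
    """True if the predicate holds in at least one version."""
    return any(p(v) for v in versions)


def since(versions, p):
    """Versions from the first one matching the predicate onwards (empty if none)."""
    for i, v in enumerate(versions):
        if p(v):
            return versions[i:]
    return []


# Made-up history of one element: each version is a dict of its properties.
history = [{"reward": -1.0}, {"reward": 0.5}, {"reward": 2.0}, {"reward": 1.0}]

improved = eventually(history, lambda v: v["reward"] > 1.5)
stable_after = always(since(history, lambda v: v["reward"] > 0),
                      lambda v: v["reward"] > 0)
```

Composing a scope (`since`) with an assertion (`always`) is the kind of pattern the time-aware primitives make concise.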
  • 9. Scaling up temporal graphs to large event volumes
    ● We first applied history-based explanations to Bayesian Learning-based systems
      ○ Partially Observable Markov Decision Processes (POMDP)
      ○ Had a case study on data mirroring over the network (Remote Data Mirroring)
      ○ This wasn’t too resource-intensive (we could simply record all versions)
    ● Then we tried applying it to a Reinforcement Learning system
      ○ Tens of training epochs, each with thousands of episodes
      ○ Original RL system had per-timeslice MongoDB with GBs of records to be indexed
      ○ RL system changed to send updates directly to Hawk - copy-on-write (CoW) reduced storage needs
      ○ Still a lot of history to go through - queries could take a long time!
    ● Do we really need all this history?
      ○ Answer: No.
      ○ How do we select the “right” moments, without imposing too much load?
  • 10. Event-Driven Monitoring: Complex Event Processing
  • 11. Event-driven Temporal Models for eXplanations (ETeMoX)
  • 12. Case study: airborne base stations
  • 13. Experiment 1: Evolution of metrics (optionally sampled)
    ● We evaluated the impact of sampling at different rates on the accuracy of a query providing the historic reward values during the RL training
    ● We set up the CEP engine with Esper EPL rules (shown in the top right of the slide)
    ● We observed linear decreases in the storage required, depending on the sampling rate
    ● 10% sampling was safe; going beyond that depended on the RL algorithm (DQN is sensitive!)
    (Plots: Q-Learning, DQN)
  • 14. Experiment 2: Exploration vs Exploitation (1/2)
    ● RL systems don’t always pick the best option (exploitation): they try things sometimes to learn more (exploration)
    ● How often does this happen?
    ● We compared two approaches to track this:
      a. CEP pattern to detect exploration/exploitation and only index episodes with exploration
      b. EOL query on full history, to check CEP pattern correctness
    ● Q-Learning explored 1.41% of the time, SARSA 7.99%, DQN 7.82%
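One plausible way to tell exploration from exploitation after the fact is to compare the chosen action against the Q-values recorded with the decision. The sketch below assumes an epsilon-greedy policy and a trace of (Q-values, chosen action) pairs; both are assumptions for illustration, not the CEP pattern or EOL query used in the experiments.

```python
import random


def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon pick a random action, otherwise the best one."""
    if rng.random() < epsilon:
        return rng.choice(sorted(q_values))
    return max(q_values, key=q_values.get)


def is_exploration(q_values, chosen):
    """A decision counts as exploration if a strictly better action was available."""
    return q_values[chosen] < max(q_values.values())


def exploration_rate(decisions):
    """Fraction of (q_values, chosen) decisions that were exploratory."""
    if not decisions:
        return 0.0
    explored = sum(1 for q, chosen in decisions if is_exploration(q, chosen))
    return explored / len(decisions)


# Made-up trace: two greedy choices and two exploratory ones.
decisions = [
    ({"a": 1.0, "b": 0.5}, "a"),
    ({"a": 1.0, "b": 0.5}, "b"),
    ({"a": 0.2, "b": 0.9}, "b"),
    ({"a": 0.2, "b": 0.9}, "a"),
]
rate = exploration_rate(decisions)
greedy_pick = epsilon_greedy({"a": 1.0, "b": 0.5}, epsilon=0.0, rng=random.Random(0))
```

Note that a random pick that happens to land on the best action is indistinguishable from exploitation under this definition, so a rate measured this way is a lower bound on how often the policy took the random branch.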
  • 15. Experiment 2: Exploration vs Exploitation (2/2)
    ● We tried using the exploration CEP rule as a filter for metric evolution, too
  • 16. Experiment 3: user handovers between stations
    ● To provide continuous service, a user may be handed over to another station
    ● We wrote an EOL query to detect handovers in the system history
      ○ Handover: signal-to-noise ratio changes significantly across stations between timepoints
      ○ Found 1784 handovers in Q-Learning, 590 in SARSA, 82176 in DQN
    ● These queries required many checks:
      ○ 10 episodes, 2000 time steps
      ○ 2 stations, 1050 users
      ○ All together: 42M combinations to check!
    ● Required times:
      ○ 917s for Q-Learning
      ○ 1,132s for SARSA
      ○ 7,914s for DQN
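The 42M figure follows from multiplying the four dimensions: 10 × 2000 × 2 × 1050 = 42,000,000. The sketch below reproduces that arithmetic and shows one possible reading of the handover condition for a single user: the best station by signal-to-noise ratio changes, and it beats the current one by more than a margin. The `margin` parameter and the data layout are assumptions for illustration, not the EOL query from the experiment.

```python
def count_checks(episodes, steps, stations, users):
    """Number of (episode, time step, station, user) combinations to inspect."""
    return episodes * steps * stations * users


def detect_handovers(snr_history, margin=0.0):
    """Return (timepoint, from_station, to_station) tuples for one user.

    snr_history is a list with one dict per timepoint, mapping station -> SNR.
    """
    handovers = []
    serving = None
    for t, snr in enumerate(snr_history):
        best = max(snr, key=snr.get)
        if serving is None:
            serving = best  # the first timepoint only fixes the initial station
        elif best != serving and snr[best] - snr[serving] > margin:
            handovers.append((t, serving, best))
            serving = best
    return handovers


total = count_checks(episodes=10, steps=2000, stations=2, users=1050)

# Made-up SNR readings for one user across four timepoints.
history = [
    {"A": 10.0, "B": 5.0},
    {"A": 9.0, "B": 6.0},
    {"A": 4.0, "B": 8.0},
    {"A": 5.0, "B": 8.0},
]
found = detect_handovers(history, margin=3.0)
```

Because every user and station must be compared across every pair of consecutive timepoints, the cost grows with the product of all four dimensions, which is consistent with the long query times reported above.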
  • 17. What’s next?
    ● Optimise queries via Hawk timeline annotations and CEP time windows
    ● Explanations for other uses:
      ○ Human-in-the-loop (SAM 2022 presentation on Monday showed early work)
      ○ Hyper-parameter optimisation (external system requests explanations and drives change)
      ○ Global explanations of the system behaviour (event graphs)
    ● Studying explanation reception:
      ○ Effectiveness of explanation formats: plots, results, generated text, diagrams…
      ○ Focused on system developers so far: look into less technical audiences
      ○ Consider existing models for evaluation
        ■ Technology Acceptance Model (Davis)
        ■ XAI metrics (Rosenfeld)
  • 18. Thank you!
    Antonio García-Domínguez - @antoniogado - a.garcia-dominguez@york.ac.uk
    Juan Marcelo Parra-Ullauri - j.parra-ullauri@aston.ac.uk