Hi
Deep Reinforcement Learning
Azzeddine CHENINE
AI Research Engineer @instadeepai
An introduction workshop
What? How? What’s hot about it 🔥?
But what if...
...your task is a series of decisions aimed at reaching or maintaining optimal performance?
Reinforcement Learning
…What?
• Building agents that are able to learn an optimal policy to perform a task within a
Markovian environment
• In a Markovian environment, the next state depends only on the current state and the
action performed by the agent
• The task can be episodic or continuing
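The Markov property on this slide can be illustrated with a small sketch. The two-state battery environment, its state and action names, and the probabilities below are all hypothetical and chosen only for illustration. The point is that `step` looks at nothing but the current state and the chosen action.

```python
import random

# Hypothetical two-state environment: TRANSITIONS[(state, action)] lists
# (next_state, probability) pairs. Nothing else influences the next state.
TRANSITIONS = {
    ("low", "wait"): [("low", 1.0)],
    ("low", "charge"): [("high", 0.9), ("low", 0.1)],
    ("high", "wait"): [("high", 0.8), ("low", 0.2)],
    ("high", "charge"): [("high", 1.0)],
}


def step(state, action, rng):
    """Sample the next state from P(s' | s, a); no history is consulted."""
    next_states, probs = zip(*TRANSITIONS[(state, action)])
    return rng.choices(next_states, weights=probs, k=1)[0]


def rollout(start, actions, seed=0):
    """Apply a fixed list of actions and return the visited states."""
    rng = random.Random(seed)
    state, visited = start, [start]
    for action in actions:
        state = step(state, action, rng)
        visited.append(state)
    return visited
```

Calling `rollout("low", ["charge", "wait", "wait"])` returns one sampled trajectory. A different seed may give a different one, but each transition is always drawn from the same table entry for the current (state, action) pair.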
Reinforcement Learning
…How?
Diagram: Agent → Action → Environment → Reward, New State → Agent
• Reach an optimal policy π
• π can be deterministic or stochastic
• A deterministic version of π can be derived from the action-value function Q(s, a)
by picking the highest-valued action in each state
• You are free to choose your policy type
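As a minimal sketch of the two policy types mentioned on this slide, the snippet below derives a deterministic (greedy) policy from a table of action values, and a stochastic (epsilon-greedy) policy from the same table. The states, actions, and values in `Q` are made up for illustration, and epsilon-greedy is only one common way to get a stochastic policy.

```python
import random

# Hypothetical action values Q[state][action] for a two-state problem.
Q = {
    "s0": {"left": 0.1, "right": 0.7},
    "s1": {"left": 0.4, "right": 0.2},
}


def greedy_policy(q_table, state):
    """Deterministic policy: always pick the highest-valued action."""
    actions = q_table[state]
    return max(actions, key=actions.get)


def epsilon_greedy_policy(q_table, state, epsilon, rng):
    """Stochastic policy: explore with probability epsilon, else act greedily."""
    if rng.random() < epsilon:
        return rng.choice(list(q_table[state]))
    return greedy_policy(q_table, state)
```

With `epsilon=0` the stochastic policy collapses to the greedy one. Larger values trade exploitation for exploration.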
What’s hot 🔥 about DeepRL
• Reinforcement Learning has existed since the early 80s
• Before the hype of Deep Learning, Reinforcement Learning relied on dynamic
programming and tabular algorithms
• Monte Carlo, Sarsa (not salsa 💃), Q-learning, Expected Sarsa, etc.
• These methods use data structures (tables) to hold the action values of each state
Example domains: Bio, Stocks, Games, Robots
• Modern environments present complex action and state spaces
• Deep Neural Networks are able to extract features from different state types
• Deep Neural Networks are able to approximate functions that map an observation to
a desired output space
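To make the "table of action values" idea concrete, here is a sketch of tabular Q-learning on a tiny chain environment. The environment (`env_step`), the hyperparameter values, and the random tie-breaking rule are assumptions made for this example, not something taken from the slides.

```python
import random
from collections import defaultdict

ACTIONS = ("left", "right")
N_STATES = 5  # states 0..4; reaching state 4 ends the episode with reward 1
ALPHA, GAMMA, EPSILON = 0.5, 0.9, 0.1  # assumed hyperparameters


def env_step(state, action):
    """Hypothetical chain environment: returns (next_state, reward, done)."""
    if action == "right":
        next_state = min(state + 1, N_STATES - 1)
    else:
        next_state = max(state - 1, 0)
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def choose_action(q, state, rng):
    """Epsilon-greedy choice with random tie-breaking among the best actions."""
    if rng.random() < EPSILON:
        return rng.choice(ACTIONS)
    best = max(q[(state, a)] for a in ACTIONS)
    return rng.choice([a for a in ACTIONS if q[(state, a)] == best])


def train(episodes=500, seed=0):
    rng = random.Random(seed)
    q = defaultdict(float)  # the table: (state, action) -> estimated value
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = choose_action(q, state, rng)
            next_state, reward, done = env_step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Q-learning update: move Q(s, a) toward r + gamma * max_a' Q(s', a')
            q[(state, action)] += ALPHA * (reward + GAMMA * best_next - q[(state, action)])
            state = next_state
    return q
```

The table has one entry per (state, action) pair, which works for five states but does not scale to images or continuous observations. That limitation is what motivates replacing the table with a neural network.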
DeepRL workshop
• Inspecting a dynamic programming version of Q-learning
• Inspecting its limitations and the use case for Deep Neural Networks
• Implementing Deep Q-learning with the TensorFlow Keras API and PyTorch
• Getting introduced to OpenAI Gym for reinforcement learning environments
• Visualizing the training and inference of a DQN agent
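The workshop code itself is not part of these slides. As a framework-free sketch of two pieces that Deep Q-learning implementations typically contain, the snippet below shows an experience replay buffer and the target value a Q-network is trained to predict. The class name, function name, and `GAMMA` value are hypothetical. In a real DQN, `next_q_values` would come from a neural network (often a separate target network) rather than a plain list.

```python
import random
from collections import deque

GAMMA = 0.99  # discount factor (assumed value)


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) tuples."""

    def __init__(self, capacity):
        self.memory = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, transition):
        self.memory.append(transition)

    def sample(self, batch_size, rng=random):
        """Draw a random mini-batch, which decorrelates consecutive steps."""
        return rng.sample(list(self.memory), batch_size)


def td_target(reward, next_q_values, done, gamma=GAMMA):
    """Regression target for Q(s, a): r + gamma * max_a' Q(s', a'), or r at episode end."""
    return reward if done else reward + gamma * max(next_q_values)
```

During training, each sampled transition yields one target, and the network's prediction for the taken action is regressed toward it.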
Other hot topics
• Multi-agent reinforcement learning
• Imitation learning and behaviour cloning
• The problem of generalization in Deep RL
• Policy-based methods: PPO, A2C, A3C…
• DeepRL frameworks: RLlib, TF-Agents…
Resources
• Berkeley DeepRL Bootcamp on YouTube
• Reinforcement Learning: An Introduction
• Udacity DeepRL Nanodegree, if possible
• RL course by David Silver on YouTube
• OpenAI Gym documentation

Introduction to Deep Reinforcement Learning workshop at School of Ai: AI Day
