16th of May 2019
Nicolas Kuhaupt
Research Data Scientist
Nicolaskuhaupt@aol.com
Getting Started With Reinforcement Learning
J on the Beach / Getting Started With Reinforcement Learning /
Nicolas Kuhaupt
1. Introduction to Reinforcement Learning
2. OpenAI Gym
3. Ray RLlib
4. Further Resources
Reinforcement Learning
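[Diagram: the agent-environment loop. The agent sends action a_t to the environment; the environment returns state s_t and reward r_t to the agent.]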
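The loop in the diagram can be sketched in a few lines of Python. This is a minimal illustration, not code from the talk: `CoinFlipEnv`, its parameters and `run_episode` are hypothetical names, and the `reset`/`step` interface only loosely mirrors the convention used by OpenAI Gym.

```python
import random


class CoinFlipEnv:
    """Hypothetical toy environment: the agent guesses a biased coin flip."""

    def __init__(self, p_heads=0.8, episode_length=10, seed=0):
        self.p_heads = p_heads
        self.episode_length = episode_length
        self.rng = random.Random(seed)
        self.t = 0

    def reset(self):
        """Start a new episode and return the initial state."""
        self.t = 0
        return 0

    def step(self, action):
        """Apply an action; return (next state, reward, done)."""
        outcome = 1 if self.rng.random() < self.p_heads else 0
        reward = 1.0 if action == outcome else 0.0
        self.t += 1
        done = self.t >= self.episode_length
        # The next state is simply the last coin outcome.
        return outcome, reward, done


def run_episode(env, policy):
    """Run one episode: the agent acts, the environment answers."""
    state = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(state)                  # agent chooses a_t from s_t
        state, reward, done = env.step(action)  # environment returns s_t+1, r_t
        total_reward += reward
    return total_reward


def random_policy(state):
    """An untrained agent: ignores the state and guesses at random."""
    return random.choice([0, 1])
```

Calling `run_episode(CoinFlipEnv(), random_policy)` returns the total reward collected in one episode; a learning agent would use the rewards to improve its policy over many such episodes.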
Atari Games
Deep Reinforcement Learning
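[Diagram: the agent-environment loop. The environment sends state s_t and reward r_t to the agent; the agent sends action a_t to the environment.]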
Deep Reinforcement Learning with Deep Q Networks
Regression: Input Data -> Prediction of continuous variable
Classification: Input Data -> Score for Input being class A, Score for Input being class B
Reinforcement Learning: State -> Estimated reward for action 1, Estimated reward for action 2
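The comparison above can be made concrete with a small sketch: a network that takes a state and returns one estimated reward per action, from which the agent picks the highest. This is an assumption-laden illustration rather than the talk's code; `make_q_network` and `greedy_action` are hypothetical names, and the weights are random and untrained, so the estimates carry no meaning beyond showing the input and output shapes.

```python
import random


def make_q_network(n_inputs, n_hidden, n_actions, seed=0):
    """Build a tiny two-layer network with random, untrained weights."""
    rng = random.Random(seed)
    w1 = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    w2 = [[rng.uniform(-1, 1) for _ in range(n_hidden)] for _ in range(n_actions)]

    def q_values(state):
        # Hidden layer with ReLU activation.
        hidden = [max(0.0, sum(w * x for w, x in zip(row, state))) for row in w1]
        # Output layer: one estimated reward per action.
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    return q_values


def greedy_action(q_values):
    """Pick the action with the highest estimated reward."""
    return max(range(len(q_values)), key=lambda a: q_values[a])


# Usage: a 4-dimensional state and 2 possible actions.
q_network = make_q_network(n_inputs=4, n_hidden=8, n_actions=2)
estimates = q_network([0.1, -0.2, 0.3, 0.0])
action = greedy_action(estimates)
```

Training a Deep Q Network amounts to adjusting the weights so that these estimates approach the rewards the agent actually observes.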
Deep Q Networks
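[Diagram: a Markov decision process with states S0, S1 and S2 and actions a0, a1 and a2. Edges carry transition probabilities (0.1, 0.2, 0.3, 0.7, 0.8, 1.0) and the reward values 40, 10 and 50.]
Adapted from "Hands-On Machine Learning with Scikit-Learn & TensorFlow" by Aurélien Géron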
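One way to see what a Q-value means on a small Markov decision process is Q-value iteration, which repeatedly applies Q(s, a) = sum over s' of T(s, a, s') * (R(s, a, s') + gamma * max over a' of Q(s', a')). The sketch below is a minimal illustration under stated assumptions: the two-state MDP in `TRANSITIONS` is hypothetical and is not the one in the diagram, and `q_value_iteration` and the discount factor of 0.9 are choices made for this example.

```python
# Hypothetical MDP: state -> action -> list of (probability, next state, reward).
TRANSITIONS = {
    0: {
        0: [(1.0, 0, 0.0)],                 # stay in state 0, no reward
        1: [(0.8, 1, 0.0), (0.2, 0, 0.0)],  # try to move to state 1
    },
    1: {
        0: [(1.0, 0, 10.0)],                # return to state 0, reward 10
    },
}


def q_value_iteration(transitions, gamma=0.9, n_iterations=500):
    """Estimate Q(s, a) for a fully known MDP by repeated Bellman updates."""
    q = {s: {a: 0.0 for a in actions} for s, actions in transitions.items()}
    for _ in range(n_iterations):
        new_q = {}
        for s, actions in transitions.items():
            new_q[s] = {}
            for a, outcomes in actions.items():
                # Expected immediate reward plus discounted best future value.
                new_q[s][a] = sum(
                    p * (r + gamma * max(q[s2].values()))
                    for p, s2, r in outcomes
                )
        q = new_q
    return q


q_table = q_value_iteration(TRANSITIONS)
best_action_in_state_0 = max(q_table[0], key=q_table[0].get)
```

Here the transition probabilities and rewards are known in advance. A Deep Q Network is used when they are not, and when there are too many states to keep in a table, so the network has to estimate the Q-values from experience.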
OpenAI Gym
https://gym.openai.com/envs/#mujoco
OpenAI Gym
https://gym.openai.com/envs/#robotics
Let's Code
Further Resources
• Spinning up Reinforcement Learning from OpenAI
(https://spinningup.openai.com/)
• University Lectures
(http://www0.cs.ucl.ac.uk/staff/d.silver/web/Teaching.html)
• Deep RL Bootcamp
(https://sites.google.com/view/deep-rl-bootcamp/lectures)
• DeepMind Blog
(https://deepmind.com/blog/)
Thank you for your attention!
