Introduction to Reinforcement Learning - Code Heroku

We will begin at 4:30 PM IST

Mihir Thakkar
Founder and Instructor
hello@codeheroku.com
Introduction to Reinforcement Learning

SESSION OBJECTIVES
• Introduction to RL
• Use Cases
• Formalization
• Multi-Armed Bandit in Python

www.codeheroku.com Introduction to Machine Learning – Reinforcement Learning

Quiz
What is the best next move for our robot?

Why Reinforcement Learning?
• Closer to the Real World
• Deals with the Stochastic Nature of the Environment
• Understand Intelligence
• Replicate Intelligence

https://drive.google.com/file/d/1gOrJu2svliyEnlIIeItHeBEd1rT3Vhob/view?usp=sharing

RL Model

Quiz:
Define States, Actions and Rewards

Markov Decision Process (MDP)

Our Goal

Multi-Armed Bandit
• Unknown Reward Distribution
• Deterministic Actions
• Objective: Find the sequence of actions that maximizes total reward
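
A minimal sketch of the setting described above, assuming Gaussian reward noise. The class name GaussianBandit, the arm means, and the noise level are hypothetical choices for illustration and are not taken from the session code. Choosing an arm is deterministic (the chosen arm is always the one pulled), while the reward is drawn from a distribution the agent cannot see.

```python
import random


class GaussianBandit:
    """Hypothetical k-armed bandit: each arm pays a noisy reward around a hidden mean."""

    def __init__(self, means, noise=1.0, seed=0):
        self.means = list(means)      # hidden from the agent
        self.noise = noise            # assumed standard deviation of the reward
        self.rng = random.Random(seed)

    def pull(self, arm):
        # The action is deterministic; only the reward is random.
        return self.rng.gauss(self.means[arm], self.noise)


bandit = GaussianBandit([0.2, 0.5, 0.9])
# Pull the last arm many times; the average reward approaches its hidden mean.
average = sum(bandit.pull(2) for _ in range(1000)) / 1000
```

The agent only ever observes the values returned by pull, so it has to estimate each arm's mean from samples.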

Iterative Averaging
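
Only the slide title survives here, so the following is an assumption about what it covered: the usual incremental form of the sample average, new_estimate = old_estimate + (reward - old_estimate) / n. The function name update_mean and the reward list are hypothetical. The sketch shows that the running estimate can be updated one reward at a time without storing past rewards.

```python
def update_mean(old_mean, count, new_value):
    """Return the mean of `count` values, given the mean of the first count - 1."""
    return old_mean + (new_value - old_mean) / count


rewards = [1.0, 0.0, 2.0, 1.0]   # made-up rewards from a single arm
estimate = 0.0
for n, reward in enumerate(rewards, start=1):
    estimate = update_mean(estimate, n, reward)

# The incremental estimate matches the ordinary batch average.
batch_average = sum(rewards) / len(rewards)
```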

Exploration vs. Exploitation
To approximate the value of each action, the agent must at first choose
actions that may turn out to be non-optimal (exploration).
Once the agent has approximated the values, it can greedily pick the
highest-value action (exploitation).
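
One common way to balance the two is an epsilon-greedy rule, sketched below. This is an illustrative technique, not necessarily the one used in the session notebook. The function name epsilon_greedy, the arm means, epsilon = 0.1, and the Gaussian noise are all assumptions.

```python
import random


def epsilon_greedy(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run a hypothetical epsilon-greedy agent on a simulated Gaussian bandit."""
    rng = random.Random(seed)
    k = len(true_means)
    values = [0.0] * k   # estimated value of each arm
    counts = [0] * k     # number of times each arm was pulled
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)                        # explore: random arm
        else:
            arm = max(range(k), key=lambda a: values[a])  # exploit: best estimate
        reward = rng.gauss(true_means[arm], 1.0)          # assumed reward noise
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # running average
        total += reward
    return values, counts, total


values, counts, total = epsilon_greedy([0.1, 0.5, 1.5])
```

With a small epsilon the agent spends most pulls on the arm it currently believes is best, while still sampling the others often enough to correct a poor early estimate.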

Reinforcement Learning Challenges
• Access to the environment
• Delayed reward (temporal credit assignment)
• High-cost actions
• The distribution of data changes with the actions you choose
• Efficient state representations: what constitutes a good state?
• What makes a good reward function?

Multi-Armed Bandit
https://drive.google.com/file/d/1gql13NuNpRyEnpJJOUnIAspm2k4krQ9b/view?usp=sharing
https://github.com/codeheroku/Introduction-to-Machine-Learning/tree/master/Reinforcement%20Learning/RL1%20Multiarm%20Bandit
