Introduction to reinforcement learning


Pramod Ramachandra

Presented at a meetup on 29 Jun 2019 at Harman, Bangalore

Introduction to
Reinforcement Learning
Pramod R,
Senior Lead Data Scientist
Fidelity

Types of Machine Learning
Supervised Learning
Learns from labelled data to predict the right label. E.g. classifying
fraudulent transactions, or estimating the probability that a customer
purchases a product after seeing an online ad.
Unsupervised Learning
No labelled data. Instead, it relies on the underlying patterns in the
data to find relationships between the data elements. E.g. segmenting
customers for marketing based on their demographic attributes, or
finding product associations.

What is Reinforcement Learning
Modelled on how humans learn: we take an action, receive a reward for
that action, and use it to decide which action to take next. E.g. a baby
learning to walk.
There is no labelled data, and we are not looking for relationships
between data points. The agent observes the reward at every step and
chooses its next action based on the rewards it has received.
The data arrives as a time sequence following this pattern:
State→Action→Reward→State→Action
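
The State→Action→Reward loop can be sketched in a few lines of Python. This is only an illustration: the `Corridor` environment, its reward of 1.0 at the right-hand end, and the `always_right` policy are made up for the example and are not from the talk.

```python
class Corridor:
    """Hypothetical toy environment: positions 0..length-1, goal at the right end."""

    def __init__(self, length=5):
        self.length = length
        self.state = 0  # the agent starts at the left end

    def step(self, action):
        # action is -1 (left) or +1 (right); the position is clamped to the corridor
        self.state = min(max(self.state + action, 0), self.length - 1)
        done = self.state == self.length - 1
        reward = 1.0 if done else 0.0
        return self.state, reward, done


def run_episode(env, policy, max_steps=20):
    """Record (state, action, reward) triples until the goal or max_steps."""
    trajectory = []
    state = env.state
    for _ in range(max_steps):
        action = policy(state)                       # State -> Action
        next_state, reward, done = env.step(action)  # Action -> Reward, next State
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory


def always_right(state):
    return +1


print(run_episode(Corridor(), always_right))
# [(0, 1, 0.0), (1, 1, 0.0), (2, 1, 0.0), (3, 1, 1.0)]
```

Each tuple is one pass through the loop: the state the agent saw, the action it took, and the reward it got back.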

Applications of RL
Self-Driving Cars
Online Ad Recommendations
Robotics
Chatbots
Medication for Patients
Stock Trading
Online Education

Components of Reinforcement Learning - Markov Decision Process
● State $S_t$: The condition of the environment at time t
● Agent: The model or robot that learns about
the environment and decides which action to take
● Action $A_t$: The agent’s action, chosen based on
the current state
● Policy π: Mapping from State → Action
● Reward $R_t$: Feedback received for the action
The central idea of reinforcement learning is to
maximize the expected cumulative reward
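
As a rough illustration of these components, the sketch below writes a policy as a plain dictionary from state to action and compares two policies by the cumulative reward they collect. The two-state "battery" MDP, its rewards, and all names are hypothetical. Transitions are deterministic here, so the expected cumulative reward is simply the sum.

```python
# Hypothetical two-state MDP with deterministic transitions:
# (state, action) -> (next_state, reward)
TRANSITIONS = {
    ("high", "search"): ("low", 3.0),
    ("high", "wait"): ("high", 1.0),
    ("low", "search"): ("low", -2.0),
    ("low", "recharge"): ("high", 0.0),
}

# A policy maps each state to an action.
policy_a = {"high": "search", "low": "recharge"}
policy_b = {"high": "wait", "low": "recharge"}


def cumulative_reward(policy, start="high", n_steps=6):
    """Sum of the rewards collected by following `policy` for n_steps."""
    state, total = start, 0.0
    for _ in range(n_steps):
        action = policy[state]
        state, reward = TRANSITIONS[(state, action)]
        total += reward
    return total


print(cumulative_reward(policy_a))  # 9.0
print(cumulative_reward(policy_b))  # 6.0
```

Under these made-up numbers, `policy_a` collects more reward over six steps, so it is the better of the two policies.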

Markov Property
For a sequence $\{q_1, q_2, q_3, \dots, q_n\}$:
$P(q_n \mid q_{n-1}, q_{n-2}, \dots, q_1) = P(q_n \mid q_{n-1})$
Example: under this assumption, India’s chance of winning tomorrow’s match depends only on
the last match that India played
“The future is independent of the past given the present”
- Markov
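
A small sketch of what the Markov property means in code, using the match example. The win/loss transition probabilities are invented for illustration. The point is that the prediction function reads only the most recent result, whatever came before it.

```python
# Hypothetical one-step transition probabilities P(next | current).
TRANSITION = {
    "win": {"win": 0.7, "loss": 0.3},
    "loss": {"win": 0.4, "loss": 0.6},
}


def next_result_distribution(history):
    """Distribution of the next result; only history[-1] is used."""
    return TRANSITION[history[-1]]


def sequence_probability(sequence):
    """P(q_2, ..., q_n | q_1) as a product of one-step transitions."""
    prob = 1.0
    for current, nxt in zip(sequence, sequence[1:]):
        prob *= TRANSITION[current][nxt]
    return prob


# Two different pasts with the same present give the same prediction.
same = (next_result_distribution(["loss", "loss", "win"])
        == next_result_distribution(["win", "win", "win"]))
print(same)  # True
```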

Basic working of Reinforcement Learning
● Action Space: Left, Right, Jump
● State: Position of Mario, positions of the
enemies, locations of the rewards, etc.
● Reward: Coins
● Discounted cumulative expected reward: $\mathbb{E}[G_t] = \mathbb{E}[R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \dots]$
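
For a single finite run of the game, the discounted cumulative reward can be computed as below. This is a minimal sketch: the coin rewards and the discount factor of 0.5 are arbitrary example values, and the function handles one observed sequence of rewards, not the expectation over many runs.

```python
def discounted_return(rewards, gamma=0.9):
    """Discounted sum: rewards[0] + gamma*rewards[1] + gamma^2*rewards[2] + ..."""
    g = 0.0
    # Work backwards so each step is just: reward + gamma * (return from the next step)
    for reward in reversed(rewards):
        g = reward + gamma * g
    return g


coins = [1.0, 0.0, 1.0]  # hypothetical: a coin now, nothing, then another coin
print(discounted_return(coins, gamma=0.5))  # 1.25
```

A coin collected two steps later counts for only 0.5² = 0.25 of a coin collected now, which is why the total is 1.25 and not 2.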

Types of Reinforcement Learning Algorithms
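Multi-Armed Bandits:
● Used in A/B testing of marketing
ads, actual drug vs. placebo comparisons
in clinical trials, etc.
● Explore-exploit dilemma: try new options to learn more, or stick with the best option found so far
● Epsilon-greedy: explore a random option with probability ε, otherwise exploit the best-known option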
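
One common way to implement epsilon-greedy on a bandit is sketched below. The Bernoulli (0 or 1) rewards, the function name, and the default parameter values are assumptions made for the example. Think of each arm as an ad variant with an unknown click-through rate.

```python
import random


def epsilon_greedy_bandit(true_means, epsilon=0.1, n_steps=5000, seed=0):
    """Simulate epsilon-greedy on Bernoulli arms; return pull counts and estimates."""
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms
    estimates = [0.0] * n_arms  # running mean reward per arm
    for _ in range(n_steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        # Incremental mean: new = old + (reward - old) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates


counts, estimates = epsilon_greedy_bandit([0.2, 0.5, 0.8])
print(counts, [round(e, 2) for e in estimates])
```

With enough steps, most pulls should go to the arm with the highest true mean. The remaining exploratory pulls keep the estimates of the other arms from going stale.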

Types of Reinforcement Learning Algorithms
Temporal Difference (TD) Learning
Value of a state $V(S)$: Tells us how
good it is to be in a state at time t
Cumulative Discounted Reward:
$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \gamma^3 R_{t+4} + \dots$
TD(1):
$V(S_t) \leftarrow V(S_t) + \alpha \, (G_t - V(S_t))$
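
The TD(1) update uses the full return G_t as its target, so it can only be applied once an episode has finished. A minimal sketch follows, assuming an episode is stored as a list of (state, reward received after leaving that state) pairs. The state names, step size, and discount factor are illustrative.

```python
def td1_update(values, episode, alpha=0.5, gamma=0.9):
    """Apply V(S_t) <- V(S_t) + alpha * (G_t - V(S_t)) for every step of an episode.

    episode: list of (state, reward received after leaving that state).
    """
    g = 0.0
    # Walk backwards so G_t = R_{t+1} + gamma * G_{t+1} is available at each step.
    for state, reward in reversed(episode):
        g = reward + gamma * g
        v = values.get(state, 0.0)  # unseen states start at 0
        values[state] = v + alpha * (g - v)
    return values


values = td1_update({}, [("A", 0.0), ("B", 0.0), ("C", 1.0)])
print({s: round(v, 3) for s, v in values.items()})
# {'C': 0.5, 'B': 0.45, 'A': 0.405}
```

States closer to the reward end up with higher values, and each value moves only a fraction α of the way toward its observed return.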

Learning Resources
● David Silver Reinforcement Learning Videos
● Sutton and Barto - Reinforcement Learning: An Introduction
● Prof. Balaraman Ravindran Videos on Reinforcement Learning
● GitHub repo: Awesome RL
● Deep RL Bootcamp Lectures
