Introduction to
Reinforcement Learning
Pramod R,
Senior Lead Data Scientist
Fidelity
Types of Machine Learning
Supervised Learning
Learn from labelled data to predict the
right label. E.g., fraudulent transaction
classification, or the probability that a
customer purchases a product after seeing
an online ad.
Unsupervised Learning
No labelled data. Instead, it relies on the
underlying patterns in the data to find the
relationships between the data elements. E.g.,
marketing segmentation of customers based
on their demographic attributes, finding
product associations, etc.
What is Reinforcement Learning
Modelled on how humans learn: we take an action, receive a reward
for that action, and use it to decide which action to take next. E.g., a baby
learning to walk
There is no labelled data, nor do we look for relationships between the data
points. We simply collect the reward from every step and choose the next
action based on the reward we get
The data is arranged as a time sequence that follows this paradigm:
State→Action→Reward→State→Action
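The State→Action→Reward sequence above can be sketched as a simple interaction loop. This is only an illustration: the toy environment (`step`, a walker on positions 0 to 4), the `random_policy` function, and the reward of 1 at position 4 are all hypothetical and not taken from the slides.

```python
import random

def step(state, action):
    """Hypothetical toy environment: positions 0..4, reaching 4 pays a reward of 1."""
    next_state = max(0, min(4, state + action))  # stay inside the track
    reward = 1.0 if next_state == 4 else 0.0
    done = next_state == 4
    return next_state, reward, done

def random_policy(state, rng):
    """Placeholder policy: ignores the state and moves left (-1) or right (+1)."""
    return rng.choice([-1, 1])

def run_episode(seed=0, max_steps=1000):
    rng = random.Random(seed)
    state = 2  # assumed start position
    trajectory = []
    for _ in range(max_steps):
        action = random_policy(state, rng)              # State -> Action
        next_state, reward, done = step(state, action)  # Action -> Reward, next State
        trajectory.append((state, action, reward))
        state = next_state
        if done:
            break
    return trajectory

trajectory = run_episode()
for state, action, reward in trajectory[:5]:
    print(state, action, reward)
```

Each tuple in `trajectory` is one State→Action→Reward step; a learning agent would replace `random_policy` with a policy that improves from the observed rewards.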
Applications of RL
Self Driving Cars
Online Ad Recommendations
Robotics
Chatbot
Medication on patients
Stock Trading
Online Education
Components of Reinforcement Learning - Markov Decision Process
● State $S_t$: The condition of the environment at time $t$
● Agent: The model/robot which learns about
the environment and decides the action
● Action $A_t$: The action the agent takes in
the current state
● Policy $\pi$: Mapping from State → Action
● Reward $R_t$: Feedback received for the action
The central idea of reinforcement learning is to
maximize the expected cumulative reward
Markov Property
For a sequence $\{q_1, q_2, q_3, q_4, \ldots, q_n\}$:
$P(q_n \mid q_{n-1}, q_{n-2}, \ldots, q_1) = P(q_n \mid q_{n-1})$
Example: Under this assumption, India’s chance of winning tomorrow’s match depends only on
the last match that India played
“The future is independent of the past given the present”
- Markov
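As a rough illustration of the Markov property, the sketch below simulates a two-state chain in which the next outcome is sampled using only the current one. The transition probabilities in `P` and the "win"/"loss" labels are made-up values for illustration, not real statistics.

```python
import random

# Hypothetical transition probabilities: P[current][next]
P = {
    "win":  {"win": 0.7, "loss": 0.3},
    "loss": {"win": 0.4, "loss": 0.6},
}

def next_outcome(current, rng):
    """Sample the next outcome from the current one only (no earlier history)."""
    outcomes = list(P[current])
    weights = [P[current][o] for o in outcomes]
    return rng.choices(outcomes, weights=weights, k=1)[0]

def simulate(start, n, seed=0):
    rng = random.Random(seed)
    chain = [start]
    for _ in range(n):
        chain.append(next_outcome(chain[-1], rng))  # depends only on chain[-1]
    return chain

chain = simulate("win", 10)
print(chain)
```

Note that `next_outcome` never receives the earlier part of `chain`, which is exactly the statement $P(q_n \mid q_{n-1}, \ldots, q_1) = P(q_n \mid q_{n-1})$.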
Basic working of Reinforcement Learning
● Action Space: Left, Right, Jump
● State: Position of Mario, position of the
enemy, places where the reward is, etc.
● Reward: Coins
● Discounted cumulative expected reward: $\mathbb{E}[G_t]$, where $G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \ldots$
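To make the discounted cumulative reward concrete, here is a minimal sketch that computes it for one sequence of coin rewards, assuming the usual form $G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \ldots$ used on the Temporal Difference slide. The reward list `coins` and the discount `gamma=0.9` are illustrative assumptions.

```python
def discounted_return(rewards, gamma):
    """Return R_1 + gamma*R_2 + gamma^2*R_3 + ... for a finite reward list."""
    g = 0.0
    # Work backwards so each step is g = r + gamma * (return of the rest).
    for r in reversed(rewards):
        g = r + gamma * g
    return g

coins = [1, 0, 0, 1]  # hypothetical rewards: a coin now and another three steps later
g = discounted_return(coins, gamma=0.9)
print(round(g, 3))  # 1 + 0.9**3 = 1.729
```

A smaller `gamma` makes the later coin count for less; `gamma=1` simply sums the rewards.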
Types of Reinforcement Learning Algorithms
Multi-Armed Bandits:
● Used in A/B testing of marketing
ads, drug vs. placebo comparisons
in clinical trials, etc.
● Explore-Exploit Dilemma
● Epsilon Greedy
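One common way to handle the explore-exploit dilemma is epsilon-greedy: with probability epsilon pick a random arm, otherwise pick the arm with the best estimate so far. The sketch below assumes Bernoulli rewards (e.g., ad clicked or not); the click rates in `true_rates`, the value `epsilon=0.1`, and the function name are hypothetical choices for illustration.

```python
import random

def epsilon_greedy_bandit(true_rates, epsilon=0.1, steps=5000, seed=0):
    """Simulate an epsilon-greedy agent on a Bernoulli multi-armed bandit."""
    rng = random.Random(seed)
    n_arms = len(true_rates)
    counts = [0] * n_arms       # number of pulls per arm
    estimates = [0.0] * n_arms  # running mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: random arm
        else:
            arm = max(range(n_arms), key=lambda a: estimates[a])  # exploit: best so far
        reward = 1.0 if rng.random() < true_rates[arm] else 0.0
        counts[arm] += 1
        # Incremental mean: new = old + (reward - old) / n
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
    return counts, estimates

counts, estimates = epsilon_greedy_bandit([0.05, 0.10, 0.30])
print(counts)
print([round(e, 3) for e in estimates])
```

With enough steps the arm with the highest true rate typically receives most of the pulls, while the remaining pulls keep the other estimates from going stale.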
Types of Reinforcement Learning Algorithms
Temporal Difference (TD) Learning
Value of a state $V(S_t)$: Tells us how
good it is to be in state $S_t$ at time $t$
Cumulative Discounted Reward:
$G_t = R_{t+1} + \gamma R_{t+2} + \gamma^2 R_{t+3} + \gamma^3 R_{t+4} + \ldots$
TD(1):
$V(S_t) \leftarrow V(S_t) + \alpha \, (G_t - V(S_t))$
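The TD(1) update can be sketched as follows: after an episode finishes, compute $G_t$ for every visited state and move $V(S_t)$ a fraction $\alpha$ towards it. The state names, the reward sequence, and the values `alpha=0.1` and `gamma=0.9` are illustrative assumptions, and the helper names are hypothetical.

```python
def returns_from(rewards, gamma):
    """G_t for every step t, where rewards[t] plays the role of R_{t+1}."""
    g = 0.0
    out = []
    for r in reversed(rewards):
        g = r + gamma * g
        out.append(g)
    return out[::-1]

def td1_update(V, states, rewards, alpha=0.1, gamma=0.9):
    """Apply V(S_t) <- V(S_t) + alpha * (G_t - V(S_t)) for one finished episode."""
    for s, g in zip(states, returns_from(rewards, gamma)):
        v = V.get(s, 0.0)  # unseen states start at 0 (an assumption)
        V[s] = v + alpha * (g - v)
    return V

# Hypothetical episode: A -> B -> C, with a reward of 1 only on the last step.
V = td1_update({}, states=["A", "B", "C"], rewards=[0.0, 0.0, 1.0])
print({s: round(v, 3) for s, v in V.items()})  # {'A': 0.081, 'B': 0.09, 'C': 0.1}
```

Because the target is the full return $G_t$, this update can only be applied once the episode has ended.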
Learning Resources
● David Silver Reinforcement Learning Videos
● Sutton and Barto - Reinforcement Learning
● Prof. Balaraman Ravindran Videos on Reinforcement Learning
● GitHub repo: Awesome RL
● Deep RL Bootcamp Lectures
