Basics of Reinforcement Learning
Spotle.ai Study Material
Spotle.ai/Learn
Let’s play chess!
I don't just make any possible move without thinking about how my opponent might counter it.
I consider all the possible moves that are safe, and then choose the one that I judge to be the best.
Machines can learn this way too. This kind of learning is called reinforcement learning.
What is reinforcement learning?
Reinforcement learning applies to a particular situation in which a learner has to reach a goal.
You start at a point and go through several steps to reach a level.
Along the way you earn a reward point for every correct step and lose a reward point for every wrong step.
Finally, you choose the path with the highest total reward in that situation.
[Diagram: the agent sends an action to the environment; the environment returns a state and a reward to the agent.]
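The reward scheme described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the three paths and their correct/wrong steps are made up, and the function names `path_reward` and `best_path` are hypothetical, not taken from the slides.

```python
# Hypothetical example: each path is a list of steps,
# marked True for a correct step and False for a wrong step.
def path_reward(steps):
    """Earn +1 for every correct step and lose 1 for every wrong step."""
    return sum(1 if correct else -1 for correct in steps)

def best_path(paths):
    """Return the name of the path with the highest total reward."""
    return max(paths, key=lambda name: path_reward(paths[name]))

paths = {
    "path_1": [True, False, True],        # total reward: +1
    "path_2": [True, True, True, False],  # total reward: +2
    "path_3": [False, False, True],       # total reward: -1
}
print(best_path(paths))  # path_2
```

In a real problem the learner does not know in advance which steps are correct; it has to discover the rewards by trying the steps.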
Terminology
Agent: The learner and decision maker.
Environment: The world the agent interacts with, in which it learns and decides which actions to perform.
Action: One of the set of moves the agent can perform.
State: The current situation of the agent in the environment.
Reward: The feedback the environment provides for each action the agent selects, usually a scalar value.
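To see how these terms fit together, here is a minimal sketch of the agent and environment exchanging actions, states, and rewards. Everything in it is an assumption made for illustration: the `LineWorld` environment, the `RandomAgent`, the reward values, and the `run_episode` helper do not come from the slides, and the agent does not learn anything yet.

```python
import random

class LineWorld:
    """Hypothetical environment: positions 0..4 on a line, goal at position 4."""
    def __init__(self):
        self.state = 0

    def step(self, action):
        """Apply an action (-1 = left, +1 = right); return (state, reward, done)."""
        self.state = min(4, max(0, self.state + action))
        done = self.state == 4
        reward = 1.0 if done else -0.1   # scalar reward chosen for illustration
        return self.state, reward, done

class RandomAgent:
    """Hypothetical agent that picks actions at random (no learning yet)."""
    def __init__(self, seed=0):
        self.rng = random.Random(seed)

    def act(self, state):
        return self.rng.choice([-1, 1])

def run_episode(env, agent, max_steps=1000):
    """Let the agent act until the goal is reached or max_steps is used up."""
    state, total = env.state, 0.0
    for _ in range(max_steps):
        action = agent.act(state)               # agent chooses an action
        state, reward, done = env.step(action)  # environment answers
        total += reward
        if done:
            break
    return state, total

final_state, total_reward = run_episode(LineWorld(), RandomAgent())
print(final_state, total_reward)
```

Each pass through the loop is one round of the cycle: the agent acts, and the environment responds with a new state and a reward.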
Supervised learning? No
In supervised learning the training data contains the output, that is, the correct answer,
and the model is trained against it. In reinforcement learning no correct answer is given.
Instead, the agent chooses its actions so as to maximize the reward it receives.
There is no labeled training data in reinforcement learning. The machine learns from its own experience.
Reinforcing your learning
Which one to choose: Topic A, Topic B, or Topic C?
Try each possible option step by step and record the reward it earns.
Then choose the one with the maximum reward.
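The idea of trying every option and keeping the one with the maximum reward can be sketched as below. The reward numbers for Topic A, B, and C are invented for the example, and the helpers `try_option` and `choose_best` are hypothetical names. The learner only sees noisy rewards, never the true averages.

```python
import random

# Hypothetical average rewards; the learner never reads these directly.
TRUE_REWARD = {"Topic A": 0.2, "Topic B": 0.5, "Topic C": 0.8}

def try_option(option, rng):
    """Return a noisy scalar reward for choosing one option."""
    return TRUE_REWARD[option] + rng.uniform(-0.1, 0.1)

def choose_best(options, trials=20, seed=0):
    """Try every option `trials` times, then pick the highest average reward."""
    rng = random.Random(seed)
    totals = {option: 0.0 for option in options}
    for _ in range(trials):
        for option in options:
            totals[option] += try_option(option, rng)
    averages = {option: totals[option] / trials for option in options}
    return max(averages, key=averages.get), averages

best, averages = choose_best(list(TRUE_REWARD))
print(best)  # Topic C
```

Because the noise here is small compared with the gaps between the options, Topic C always comes out on top. With noisier rewards, more trials would be needed before the choice becomes reliable.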
Pavlov Experiment
TRIAL 1
In the first trial Pavlov gives meat to his dog and the dog starts salivating.
TRIAL 2
In the second trial Pavlov does not give meat to his dog but rings a bell.
Without seeing the meat, the dog does not salivate.
TRIAL 3
In trial 3 Pavlov rings the bell and gives meat to his dog.
On seeing the meat, the dog starts salivating.
TRIAL 4
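In trial 4 Pavlov rings the bell, and at the sound alone his dog starts salivating,
anticipating that meat will follow. This is learning by reinforcement:
in the earlier trial the dog was rewarded with meat after the ringing of the bell.
Strictly speaking, Pavlov's experiment is an example of classical conditioning;
it serves here as an analogy for learning from reward.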
Summarizing
❖ The input is an initial state from which the machine starts learning.
❖ There is more than one possible output for a particular problem.
❖ Each output is given a reward or a punishment.
❖ The output with the maximum reward is selected to be performed.
❖ The reinforcement learning process is continuous: the machine keeps learning from new experience.
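The points above can be put together in one small program. The sketch below uses tabular Q-learning, which is one common reinforcement learning algorithm; the slides do not name any algorithm, so this choice is an assumption. The five-state corridor, the reward values, and the settings `alpha` and `gamma` are also made up for illustration, and the agent explores by acting at random to keep the example simple.

```python
import random

N_STATES = 5            # hypothetical corridor: states 0..4, goal at state 4
ACTIONS = (-1, +1)      # move left or move right
GOAL = N_STATES - 1

def step(state, action):
    """Return (next_state, reward, done) for one move in the corridor."""
    next_state = min(GOAL, max(0, state + action))
    done = next_state == GOAL
    return next_state, (1.0 if done else -0.1), done

def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0, max_steps=10_000):
    """Learn a value for every (state, action) pair from experience."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        state = 0                                 # the initial state
        for _ in range(max_steps):
            action = rng.choice(ACTIONS)          # explore by acting at random
            next_state, reward, done = step(state, action)
            best_next = 0.0 if done else max(q[(next_state, a)] for a in ACTIONS)
            # Move the estimate toward reward + discounted best future value.
            q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
            state = next_state
            if done:
                break
    return q

def greedy_policy(q):
    """For each non-goal state, select the action with the highest learned value."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}

q = q_learning()
print(greedy_policy(q))
```

After enough episodes the learned values should favor moving right (+1) in every state, since that is the shortest way to the rewarded goal. Running more episodes keeps refining the values, which matches the idea that the process is continuous.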
#HappyLearning
#BeCareerReady
That’s all for today.
