SlideShare a Scribd company logo
1 of 21
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Rania Ramadan Mohamed
Department of Computer Science
Faculty of Computers and Artificial intelligence
Sohag University
rania_abdrabou@science.Sohag.edu.eg
2/21/2024
AI 421, Spring 2024
Introduction to Reinforcement Learning
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
What is Reinforcement Learning?
• Reinforcement Learning is a feedback-based Machine learning technique in
which an agent learns to behave in an environment by performing the
actions and seeing the results of actions. For each good action, the agent
gets positive feedback, and for each bad action, the agent gets negative
feedback or a penalty.
• In Reinforcement Learning, the agent learns automatically using feedback
without any labeled data, unlike supervised learning. Since there is no
labeled data, so the agent is bound to learn by its experience only.
the agent is bound to learn only by experience.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
What is Reinforcement Learning?
• RL solves a specific type of problem where decision-making is sequential,
and the goal is long-term, such as game-playing, robotics, etc.
• The agent interacts with the environment and explores it by itself. The
primary goal of an agent in reinforcement learning is to improve performance
by getting the maximum positive rewards.
• The agent learns with the process of error and trial, and based on the
experience, it learns to perform the task in a better way. Hence, we can say
that "Reinforcement learning is a type of machine learning method where an
intelligent agent (computer program) interacts with the environment and
learns to act within that.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
What is Reinforcement Learning?
• How a car in the race learns the movement is an example of Reinforcement
learning.
• It is a core part of Artificial intelligence, and all AI agent works on the concept
of reinforcement learning. Here we do not need to pre-program the agent, as
it learns from its own experience without any human intervention.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
What is Reinforcement Learning?
• Example: Suppose there is an AI agent present within a maze environment,
and his goal is to find the diamond. The agent interacts with the environment
by performing some actions, and based on those actions, the state of the
agent gets changed, and it also receives a reward or penalty as feedback.
•
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
What is Reinforcement Learning?
• The agent continues doing these three things (take action, change
state/remain in the same state, and get feedback), and by doing these
actions, he learns and explores the environment.
• The agent learns that what actions lead to positive feedback or rewards and
what actions lead to negative feedback penalty. As a positive reward, the
agent gets a positive point, and as a penalty, it gets a negative point.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Terms used in Reinforcement Learning
• Agent(): An entity that can perceive/explore the
environment and act upon it.
• Environment(): A situation in which an agent is
present or surrounded. In RL, we assume the
stochastic environment, which means it is random
in nature.
• Action(): Actions are the moves taken by an agent
within the environment.
• State(): State is a situation returned by the
environment after each action taken by the agent.
7
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Terms used in Reinforcement Learning
• Reward(): A feedback returned to the agent from the
environment to evaluate the action of the agent.
• Policy(): Policy is a strategy applied by the agent for
the next action based on the current state.
• Value(): It is expected long-term return with the
discount factor and opposite to the short-term reward.
• Q-value(): It is mostly similar to the value, but it takes
one additional parameter as a current action (a).
8
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Key Features of Reinforcement Learning
• In RL, the agent is not instructed about the environment and what actions
need to be taken.
• It is based on the errors and trial process.
• The agent takes the next action and changes states according to the
feedback of the previous action.
• The agent may get a delayed reward.
• The environment is stochastic, and the agent needs to explore it to reach to
get the maximum positive rewards.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• To understand the working process of the RL, we need to consider two main
things:
• Environment: It can be anything such as a room, maze, football ground,
etc.
• Agent: An intelligent agent such as AI robot.
• Let's take an example of a maze environment that the agent needs to
explore. Consider the below image:
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• In the opposite image, the agent is at the very
first block of the maze. The maze is consisting of
an S6 block, which is a wall, S8 a fire pit, and S4
a diamond block.
• The agent cannot cross the S6 block, as it is a
solid wall. If the agent reaches the S4 block, then
get the +1 reward; if it reaches the fire pit, then
gets -1 reward point. It can take four actions:
move up, move down, move left, and move right.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• he agent can take any path to reach to the final
point, but he needs to make it in possible fewer
steps. Suppose the agent considers the path S9-
S5-S1-S2-S3, so he will get the +1-reward point.
• The agent will try to remember the preceding
steps that it has taken to reach the final step. To
memorize the steps, it assigns 1 value to each
previous step. Consider the below step:
Now, the agent has successfully stored the previous steps assigning the 1 value
to each previous block.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• To understand reinforcement learning better, consider a dog that we have to
house-train. Here, the dog is the agent, and the house, is the environment.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• We can get the dog to perform various actions by offering incentives such as
dog biscuits as a reward.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• The dog will follow a policy to maximize its reward and hence will follow
every command and might even learn a new action, like begging, all by itself.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
How does Reinforcement Learning Work?
• The dog will also want to run around and play and explore its environment.
This quality of a model is called Exploration. The tendency of the dog to
maximize rewards is called Exploitation. There is always a tradeoff between
exploration and exploitation, as exploration actions may lead to lesser
rewards..
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Supervised vs Unsupervised vs Reinforcement Learning
• The below table shows the differences between the three main sub-
branches of machine learning.
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Supervised vs Unsupervised vs Reinforcement Learning
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Supervised vs Unsupervised vs Reinforcement Learning
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024
Supervised vs Unsupervised vs Reinforcement Learning
2024, Rania Ramadan (Sohag University)
AI 421, Spring 2024

More Related Content

Similar to lecture in Reinforcement learning in Ai

RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptx
RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptxRL_Dr.SNR Final ppt for Presentation 28.05.2021.pptx
RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptxdeeplearning6
 
Introduction to Reinforcement Learning.pptx
Introduction to Reinforcement Learning.pptxIntroduction to Reinforcement Learning.pptx
Introduction to Reinforcement Learning.pptxHarsha Patel
 
Rl chapter 1 introduction
Rl chapter 1 introductionRl chapter 1 introduction
Rl chapter 1 introductionConnorShorten2
 
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdf
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdfleewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdf
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdfKristiLBurns
 
I-lluminating the path to authentic assessment: generative AI as a catalyst f...
I-lluminating the path to authentic assessment: generative AI as a catalyst f...I-lluminating the path to authentic assessment: generative AI as a catalyst f...
I-lluminating the path to authentic assessment: generative AI as a catalyst f...SEDA
 
Making smart decisions in real-time with Reinforcement Learning
Making smart decisions in real-time with Reinforcement LearningMaking smart decisions in real-time with Reinforcement Learning
Making smart decisions in real-time with Reinforcement LearningRuth Yakubu
 
Student Learnng Outcomes - Gandhian Way
Student Learnng Outcomes - Gandhian WayStudent Learnng Outcomes - Gandhian Way
Student Learnng Outcomes - Gandhian Waydrmandi
 
REINFORCEMENT LEARNING (reinforced through trial and error).pptx
REINFORCEMENT LEARNING (reinforced through trial and error).pptxREINFORCEMENT LEARNING (reinforced through trial and error).pptx
REINFORCEMENT LEARNING (reinforced through trial and error).pptxarchayacb21
 
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]Shakeeb Ahmad Mohammad Mukhtar
 
5 learning edited 2012.ppt
5 learning edited 2012.ppt5 learning edited 2012.ppt
5 learning edited 2012.pptHenokGetachew15
 
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdf
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdfUNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdf
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdfJenishaR1
 
Briefly About Reinforcement Learning which we are using in our Esports project?
Briefly About Reinforcement Learning which we are using in our Esports project?Briefly About Reinforcement Learning which we are using in our Esports project?
Briefly About Reinforcement Learning which we are using in our Esports project?Neeraj Bedi
 
Reinforcement learning Markov principle
Reinforcement learning  Markov principleReinforcement learning  Markov principle
Reinforcement learning Markov principleAmitSinghYadav21
 
Introduction to Reinforcement Learning
Introduction to Reinforcement LearningIntroduction to Reinforcement Learning
Introduction to Reinforcement LearningMihir Thakkar
 
Introduction to Reinforcement Learning - Code Heroku
Introduction to Reinforcement Learning - Code HerokuIntroduction to Reinforcement Learning - Code Heroku
Introduction to Reinforcement Learning - Code Herokucodeheroku
 
Measuring Impact For Social Businesses And NGOs
Measuring Impact For Social Businesses And NGOsMeasuring Impact For Social Businesses And NGOs
Measuring Impact For Social Businesses And NGOsSarthak Satapathy
 
UNIT-I Part2 Heuristic Search Techniques.pptx
UNIT-I Part2 Heuristic Search Techniques.pptxUNIT-I Part2 Heuristic Search Techniques.pptx
UNIT-I Part2 Heuristic Search Techniques.pptxVijayaAhire2
 

Similar to lecture in Reinforcement learning in Ai (20)

RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptx
RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptxRL_Dr.SNR Final ppt for Presentation 28.05.2021.pptx
RL_Dr.SNR Final ppt for Presentation 28.05.2021.pptx
 
Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learning
 
Introduction to Reinforcement Learning.pptx
Introduction to Reinforcement Learning.pptxIntroduction to Reinforcement Learning.pptx
Introduction to Reinforcement Learning.pptx
 
Rl chapter 1 introduction
Rl chapter 1 introductionRl chapter 1 introduction
Rl chapter 1 introduction
 
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdf
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdfleewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdf
leewayhertz.com-Reinforcement Learning from Human Feedback RLHF.pdf
 
I-lluminating the path to authentic assessment: generative AI as a catalyst f...
I-lluminating the path to authentic assessment: generative AI as a catalyst f...I-lluminating the path to authentic assessment: generative AI as a catalyst f...
I-lluminating the path to authentic assessment: generative AI as a catalyst f...
 
Making smart decisions in real-time with Reinforcement Learning
Making smart decisions in real-time with Reinforcement LearningMaking smart decisions in real-time with Reinforcement Learning
Making smart decisions in real-time with Reinforcement Learning
 
Student Learnng Outcomes - Gandhian Way
Student Learnng Outcomes - Gandhian WayStudent Learnng Outcomes - Gandhian Way
Student Learnng Outcomes - Gandhian Way
 
REINFORCEMENT LEARNING (reinforced through trial and error).pptx
REINFORCEMENT LEARNING (reinforced through trial and error).pptxREINFORCEMENT LEARNING (reinforced through trial and error).pptx
REINFORCEMENT LEARNING (reinforced through trial and error).pptx
 
2 semai.pptx
2 semai.pptx2 semai.pptx
2 semai.pptx
 
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]
Q-Learning Algorithm: A Concise Introduction [Shakeeb A.]
 
5 learning edited 2012.ppt
5 learning edited 2012.ppt5 learning edited 2012.ppt
5 learning edited 2012.ppt
 
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdf
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdfUNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdf
UNIT - I PROBLEM SOLVING AGENTS and EXAMPLES.pptx.pdf
 
Briefly About Reinforcement Learning which we are using in our Esports project?
Briefly About Reinforcement Learning which we are using in our Esports project?Briefly About Reinforcement Learning which we are using in our Esports project?
Briefly About Reinforcement Learning which we are using in our Esports project?
 
Reinforcement learning Markov principle
Reinforcement learning  Markov principleReinforcement learning  Markov principle
Reinforcement learning Markov principle
 
Introduction to Reinforcement Learning
Introduction to Reinforcement LearningIntroduction to Reinforcement Learning
Introduction to Reinforcement Learning
 
Introduction to Reinforcement Learning - Code Heroku
Introduction to Reinforcement Learning - Code HerokuIntroduction to Reinforcement Learning - Code Heroku
Introduction to Reinforcement Learning - Code Heroku
 
Active Object Localization with Deep Reinforcement Learning
Active Object Localization with Deep Reinforcement LearningActive Object Localization with Deep Reinforcement Learning
Active Object Localization with Deep Reinforcement Learning
 
Measuring Impact For Social Businesses And NGOs
Measuring Impact For Social Businesses And NGOsMeasuring Impact For Social Businesses And NGOs
Measuring Impact For Social Businesses And NGOs
 
UNIT-I Part2 Heuristic Search Techniques.pptx
UNIT-I Part2 Heuristic Search Techniques.pptxUNIT-I Part2 Heuristic Search Techniques.pptx
UNIT-I Part2 Heuristic Search Techniques.pptx
 

Recently uploaded

CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalSwarnaSLcse
 
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024EMMANUELLEFRANCEHELI
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfJNTUA
 
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...Nitin Sonavane
 
Online crime reporting system project.pdf
Online crime reporting system project.pdfOnline crime reporting system project.pdf
Online crime reporting system project.pdfKamal Acharya
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailingAshishSingh1301
 
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...Amil baba
 
21P35A0312 Internship eccccccReport.docx
21P35A0312 Internship eccccccReport.docx21P35A0312 Internship eccccccReport.docx
21P35A0312 Internship eccccccReport.docxrahulmanepalli02
 
Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..MaherOthman7
 
Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...IJECEIAES
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...drjose256
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsMathias Magdowski
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxMustafa Ahmed
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxKarpagam Institute of Teechnology
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...IJECEIAES
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Studentskannan348865
 
Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxMustafa Ahmed
 
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdflitvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdfAlexander Litvinenko
 
Raashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashidFaiyazSheikh
 
Dynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxDynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxMustafa Ahmed
 

Recently uploaded (20)

CLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference ModalCLOUD COMPUTING SERVICES - Cloud Reference Modal
CLOUD COMPUTING SERVICES - Cloud Reference Modal
 
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
NEWLETTER FRANCE HELICES/ SDS SURFACE DRIVES - MAY 2024
 
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdfInvolute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
Involute of a circle,Square, pentagon,HexagonInvolute_Engineering Drawing.pdf
 
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
Module-III Varried Flow.pptx GVF Definition, Water Surface Profile Dynamic Eq...
 
Online crime reporting system project.pdf
Online crime reporting system project.pdfOnline crime reporting system project.pdf
Online crime reporting system project.pdf
 
handbook on reinforce concrete and detailing
handbook on reinforce concrete and detailinghandbook on reinforce concrete and detailing
handbook on reinforce concrete and detailing
 
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
NO1 Best Powerful Vashikaran Specialist Baba Vashikaran Specialist For Love V...
 
21P35A0312 Internship eccccccReport.docx
21P35A0312 Internship eccccccReport.docx21P35A0312 Internship eccccccReport.docx
21P35A0312 Internship eccccccReport.docx
 
Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..Maher Othman Interior Design Portfolio..
Maher Othman Interior Design Portfolio..
 
Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...Performance enhancement of machine learning algorithm for breast cancer diagn...
Performance enhancement of machine learning algorithm for breast cancer diagn...
 
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
Tembisa Central Terminating Pills +27838792658 PHOMOLONG Top Abortion Pills F...
 
Filters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility ApplicationsFilters for Electromagnetic Compatibility Applications
Filters for Electromagnetic Compatibility Applications
 
Worksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptxWorksharing and 3D Modeling with Revit.pptx
Worksharing and 3D Modeling with Revit.pptx
 
analog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptxanalog-vs-digital-communication (concept of analog and digital).pptx
analog-vs-digital-communication (concept of analog and digital).pptx
 
Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...Fuzzy logic method-based stress detector with blood pressure and body tempera...
Fuzzy logic method-based stress detector with blood pressure and body tempera...
 
Basics of Relay for Engineering Students
Basics of Relay for Engineering StudentsBasics of Relay for Engineering Students
Basics of Relay for Engineering Students
 
Augmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptxAugmented Reality (AR) with Augin Software.pptx
Augmented Reality (AR) with Augin Software.pptx
 
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdflitvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
litvinenko_Henry_Intrusion_Hong-Kong_2024.pdf
 
Raashid final report on Embedded Systems
Raashid final report on Embedded SystemsRaashid final report on Embedded Systems
Raashid final report on Embedded Systems
 
Dynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptxDynamo Scripts for Task IDs and Space Naming.pptx
Dynamo Scripts for Task IDs and Space Naming.pptx
 

lecture in Reinforcement learning in Ai

  • 1. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Rania Ramadan Mohamed Department of Computer Science Faculty of Computers and Artificial intelligence Sohag University rania_abdrabou@science.Sohag.edu.eg 2/21/2024 AI 421, Spring 2024 Introduction to Reinforcement Learning
  • 2. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 What is Reinforcement Learning? • Reinforcement Learning is a feedback-based Machine learning technique in which an agent learns to behave in an environment by performing the actions and seeing the results of actions. For each good action, the agent gets positive feedback, and for each bad action, the agent gets negative feedback or a penalty. • In Reinforcement Learning, the agent learns automatically using feedback without any labeled data, unlike supervised learning. Since there is no labeled data, so the agent is bound to learn by its experience only. the agent is bound to learn only by experience.
  • 3. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 What is Reinforcement Learning? • RL solves a specific type of problem where decision-making is sequential, and the goal is long-term, such as game-playing, robotics, etc. • The agent interacts with the environment and explores it by itself. The primary goal of an agent in reinforcement learning is to improve performance by getting the maximum positive rewards. • The agent learns with the process of error and trial, and based on the experience, it learns to perform the task in a better way. Hence, we can say that "Reinforcement learning is a type of machine learning method where an intelligent agent (computer program) interacts with the environment and learns to act within that.
  • 4. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 What is Reinforcement Learning? • How a car in the race learns the movement is an example of Reinforcement learning. • It is a core part of Artificial intelligence, and all AI agent works on the concept of reinforcement learning. Here we do not need to pre-program the agent, as it learns from its own experience without any human intervention.
  • 5. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 What is Reinforcement Learning? • Example: Suppose there is an AI agent present within a maze environment, and his goal is to find the diamond. The agent interacts with the environment by performing some actions, and based on those actions, the state of the agent gets changed, and it also receives a reward or penalty as feedback. •
  • 6. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 What is Reinforcement Learning? • The agent continues doing these three things (take action, change state/remain in the same state, and get feedback), and by doing these actions, he learns and explores the environment. • The agent learns that what actions lead to positive feedback or rewards and what actions lead to negative feedback penalty. As a positive reward, the agent gets a positive point, and as a penalty, it gets a negative point.
  • 7. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Terms used in Reinforcement Learning • Agent(): An entity that can perceive/explore the environment and act upon it. • Environment(): A situation in which an agent is present or surrounded. In RL, we assume the stochastic environment, which means it is random in nature. • Action(): Actions are the moves taken by an agent within the environment. • State(): State is a situation returned by the environment after each action taken by the agent. 7
  • 8. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Terms used in Reinforcement Learning • Reward(): A feedback returned to the agent from the environment to evaluate the action of the agent. • Policy(): Policy is a strategy applied by the agent for the next action based on the current state. • Value(): It is expected long-term return with the discount factor and opposite to the short-term reward. • Q-value(): It is mostly similar to the value, but it takes one additional parameter as a current action (a). 8
  • 9. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Key Features of Reinforcement Learning • In RL, the agent is not instructed about the environment and what actions need to be taken. • It is based on the errors and trial process. • The agent takes the next action and changes states according to the feedback of the previous action. • The agent may get a delayed reward. • The environment is stochastic, and the agent needs to explore it to reach to get the maximum positive rewards.
  • 10. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • To understand the working process of the RL, we need to consider two main things: • Environment: It can be anything such as a room, maze, football ground, etc. • Agent: An intelligent agent such as AI robot. • Let's take an example of a maze environment that the agent needs to explore. Consider the below image:
  • 11. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • In the opposite image, the agent is at the very first block of the maze. The maze is consisting of an S6 block, which is a wall, S8 a fire pit, and S4 a diamond block. • The agent cannot cross the S6 block, as it is a solid wall. If the agent reaches the S4 block, then get the +1 reward; if it reaches the fire pit, then gets -1 reward point. It can take four actions: move up, move down, move left, and move right.
  • 12. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • he agent can take any path to reach to the final point, but he needs to make it in possible fewer steps. Suppose the agent considers the path S9- S5-S1-S2-S3, so he will get the +1-reward point. • The agent will try to remember the preceding steps that it has taken to reach the final step. To memorize the steps, it assigns 1 value to each previous step. Consider the below step: Now, the agent has successfully stored the previous steps assigning the 1 value to each previous block.
  • 13. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • To understand reinforcement learning better, consider a dog that we have to house-train. Here, the dog is the agent, and the house, is the environment.
  • 14. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • We can get the dog to perform various actions by offering incentives such as dog biscuits as a reward.
  • 15. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • The dog will follow a policy to maximize its reward and hence will follow every command and might even learn a new action, like begging, all by itself.
  • 16. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 How does Reinforcement Learning Work? • The dog will also want to run around and play and explore its environment. This quality of a model is called Exploration. The tendency of the dog to maximize rewards is called Exploitation. There is always a tradeoff between exploration and exploitation, as exploration actions may lead to lesser rewards..
  • 17. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Supervised vs Unsupervised vs Reinforcement Learning • The below table shows the differences between the three main sub- branches of machine learning.
  • 18. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Supervised vs Unsupervised vs Reinforcement Learning
  • 19. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Supervised vs Unsupervised vs Reinforcement Learning
  • 20. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024 Supervised vs Unsupervised vs Reinforcement Learning
  • 21. 2024, Rania Ramadan (Sohag University) AI 421, Spring 2024