SlideShare a Scribd company logo
1 of 7
Artificial Intelligence
Reinforcement Learning
Introduction
Portland Data Science Group
Created by Andrew Ferlitsch
Community Outreach Officer
July, 2017
Introduction
• A machine learning method for adaptation to an
environment.
• An agent (e.g., robot) interacts with a dynamic
environment.
• An agent learns from interacting with the environment
the best actions to take.
• Concepts include:
• Markov Principles
• Exploration / Exploitation
• Dynamic Programming
Actions / Environment
Reflex Agent
Environment
Action Observation
Continuous Cycle:
Observe Environment,
Take Action,
Observe Environment,
Take Action
Actions are determined
based on predefined
rules.
Preprogrammed
Intelligent Agent
Wikipedia: An Intelligent Agent is an autonomous entity which observes through sensors
and acts upon an environment using actuators (i.e. it is an agent) and directs its activity
towards achieving goals (i.e. it is "rational", as defined in economics).
Intelligent Agent
Sensors
Actuators
Environment
Senses the environment (e.g., camera,
audio, LIDAR, GPS, ultrasonic)
Room, Street,
Warehouse, etc.
Modifies the environment
(e.g., walk, pickup, drive)
State
Actions
Current State of the Agent
relative to the environment.
Possible Actions the Agent
can take.
Policy
A set of rules that
map a state to an
action.
State / Reward
Intelligent Agent
Environment
Action Observation
State Reward
How the action
effected the agent
and environment.
How positive or
negative is the
new state.
LEARN
What was
learned from
the reward.
Policy
Learned set of
rules of:
States -> Actions
Example Positive Reward:
Robot Stands Up,
Closer to Destination
Example Negative Reward:
Robot Falls Down,
Further from Destination
Reinforcement Learning
Learn by Trial and Error
Un-programmed
Predefined set
of Actions
Reward
Function
Delivered From Factory
Trial and Error
Programmed
Predefined set
of Actions
Reward
Function
Reinforcement Learning
Policy
States ->
Actions
Simple Reinforcement Learning Example
Actions:
Stand
Walk
Run
Rewards:
Stand -> 0
Walk -> 1
Run -> 2
Fall -> -2
Un-programmed
Initial State -> Fall State->Action
Actions: Stand -> 0 Fall -> Stand
Walk -> Robot Falls Down -> -2
Run -> Robot Falls Down -> -2
State->Stand
Stand -> 0
Walk -> 1 Stand->Walk
Run -> Robot Falls Down -> -2
State->Walk
Stand -> 0
Walk -> 1
Run -> 2 Walk->Run
Learned PolicyTrials
Highest Reward

More Related Content

What's hot

Artificial intelligence(03)
Artificial intelligence(03)Artificial intelligence(03)
Artificial intelligence(03)Nazir Ahmed
 
Intelligent Agent Perception
Intelligent Agent PerceptionIntelligent Agent Perception
Intelligent Agent PerceptionMolly Maymar
 
Chapter 2 intelligent agents
Chapter 2 intelligent agentsChapter 2 intelligent agents
Chapter 2 intelligent agentsLukasJohnny
 
An Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAn Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAndrea Monacchi
 

What's hot (6)

Artificial intelligence(03)
Artificial intelligence(03)Artificial intelligence(03)
Artificial intelligence(03)
 
Lecture 02-agents
Lecture 02-agentsLecture 02-agents
Lecture 02-agents
 
AI
AIAI
AI
 
Intelligent Agent Perception
Intelligent Agent PerceptionIntelligent Agent Perception
Intelligent Agent Perception
 
Chapter 2 intelligent agents
Chapter 2 intelligent agentsChapter 2 intelligent agents
Chapter 2 intelligent agents
 
An Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted LivingAn Early Warning System for Ambient Assisted Living
An Early Warning System for Ambient Assisted Living
 

Similar to AI - Introduction to Reinforcement Learning

introduction to inteligent IntelligentAgent.ppt
introduction to inteligent IntelligentAgent.pptintroduction to inteligent IntelligentAgent.ppt
introduction to inteligent IntelligentAgent.pptdejene3
 
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsJarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsPalGov
 
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsJarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsPalGov
 
A.i lecture 04
A.i lecture 04A.i lecture 04
A.i lecture 04yarafghani
 
Introduction To Artificial Intelligence
Introduction To Artificial IntelligenceIntroduction To Artificial Intelligence
Introduction To Artificial IntelligenceNeHal VeRma
 
Introduction To Artificial Intelligence
Introduction To Artificial IntelligenceIntroduction To Artificial Intelligence
Introduction To Artificial IntelligenceNeHal VeRma
 

Similar to AI - Introduction to Reinforcement Learning (20)

Lecture 2
Lecture 2Lecture 2
Lecture 2
 
m2-agents.pptx
m2-agents.pptxm2-agents.pptx
m2-agents.pptx
 
AI Basic.pptx
AI Basic.pptxAI Basic.pptx
AI Basic.pptx
 
Lecture 2 Agents.pptx
Lecture 2 Agents.pptxLecture 2 Agents.pptx
Lecture 2 Agents.pptx
 
Unit-1.pptx
Unit-1.pptxUnit-1.pptx
Unit-1.pptx
 
introduction to inteligent IntelligentAgent.ppt
introduction to inteligent IntelligentAgent.pptintroduction to inteligent IntelligentAgent.ppt
introduction to inteligent IntelligentAgent.ppt
 
Intelligent agents
Intelligent agentsIntelligent agents
Intelligent agents
 
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsJarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
 
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagentsJarrar.lecture notes.aai.2011s.ch2.intelligentagents
Jarrar.lecture notes.aai.2011s.ch2.intelligentagents
 
Agents1
Agents1Agents1
Agents1
 
Lec 2-agents
Lec 2-agentsLec 2-agents
Lec 2-agents
 
A.i lecture 04
A.i lecture 04A.i lecture 04
A.i lecture 04
 
Ai u1
Ai u1Ai u1
Ai u1
 
agents in ai ppt
agents in ai pptagents in ai ppt
agents in ai ppt
 
Introduction To Artificial Intelligence
Introduction To Artificial IntelligenceIntroduction To Artificial Intelligence
Introduction To Artificial Intelligence
 
Introduction To Artificial Intelligence
Introduction To Artificial IntelligenceIntroduction To Artificial Intelligence
Introduction To Artificial Intelligence
 
Artificial Intelligence
Artificial IntelligenceArtificial Intelligence
Artificial Intelligence
 
Lec 2 agents
Lec 2 agentsLec 2 agents
Lec 2 agents
 
Unit 1.ppt
Unit 1.pptUnit 1.ppt
Unit 1.ppt
 
M2 agents
M2 agentsM2 agents
M2 agents
 

More from Andrew Ferlitsch

Pareto Principle Applied to QA
Pareto Principle Applied to QAPareto Principle Applied to QA
Pareto Principle Applied to QAAndrew Ferlitsch
 
Whiteboarding Coding Challenges in Python
Whiteboarding Coding Challenges in PythonWhiteboarding Coding Challenges in Python
Whiteboarding Coding Challenges in PythonAndrew Ferlitsch
 
Object Oriented Programming Principles
Object Oriented Programming PrinciplesObject Oriented Programming Principles
Object Oriented Programming PrinciplesAndrew Ferlitsch
 
Python - Installing and Using Python and Jupyter Notepad
Python - Installing and Using Python and Jupyter NotepadPython - Installing and Using Python and Jupyter Notepad
Python - Installing and Using Python and Jupyter NotepadAndrew Ferlitsch
 
Natural Language Processing - Groupings (Associations) Generation
Natural Language Processing - Groupings (Associations) GenerationNatural Language Processing - Groupings (Associations) Generation
Natural Language Processing - Groupings (Associations) GenerationAndrew Ferlitsch
 
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...Andrew Ferlitsch
 
Machine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural NetworksMachine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural NetworksAndrew Ferlitsch
 
Machine Learning - Introduction to Convolutional Neural Networks
Machine Learning - Introduction to Convolutional Neural NetworksMachine Learning - Introduction to Convolutional Neural Networks
Machine Learning - Introduction to Convolutional Neural NetworksAndrew Ferlitsch
 
Machine Learning - Introduction to Neural Networks
Machine Learning - Introduction to Neural NetworksMachine Learning - Introduction to Neural Networks
Machine Learning - Introduction to Neural NetworksAndrew Ferlitsch
 
Python - Numpy/Pandas/Matplot Machine Learning Libraries
Python - Numpy/Pandas/Matplot Machine Learning LibrariesPython - Numpy/Pandas/Matplot Machine Learning Libraries
Python - Numpy/Pandas/Matplot Machine Learning LibrariesAndrew Ferlitsch
 
Machine Learning - Accuracy and Confusion Matrix
Machine Learning - Accuracy and Confusion MatrixMachine Learning - Accuracy and Confusion Matrix
Machine Learning - Accuracy and Confusion MatrixAndrew Ferlitsch
 
Machine Learning - Ensemble Methods
Machine Learning - Ensemble MethodsMachine Learning - Ensemble Methods
Machine Learning - Ensemble MethodsAndrew Ferlitsch
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear RegressionAndrew Ferlitsch
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear RegressionAndrew Ferlitsch
 
Machine Learning - Dummy Variable Conversion
Machine Learning - Dummy Variable ConversionMachine Learning - Dummy Variable Conversion
Machine Learning - Dummy Variable ConversionAndrew Ferlitsch
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsAndrew Ferlitsch
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset PreparationAndrew Ferlitsch
 
Machine Learning - Introduction to Tensorflow
Machine Learning - Introduction to TensorflowMachine Learning - Introduction to Tensorflow
Machine Learning - Introduction to TensorflowAndrew Ferlitsch
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningAndrew Ferlitsch
 

More from Andrew Ferlitsch (20)

Pareto Principle Applied to QA
Pareto Principle Applied to QAPareto Principle Applied to QA
Pareto Principle Applied to QA
 
Whiteboarding Coding Challenges in Python
Whiteboarding Coding Challenges in PythonWhiteboarding Coding Challenges in Python
Whiteboarding Coding Challenges in Python
 
Object Oriented Programming Principles
Object Oriented Programming PrinciplesObject Oriented Programming Principles
Object Oriented Programming Principles
 
Python - OOP Programming
Python - OOP ProgrammingPython - OOP Programming
Python - OOP Programming
 
Python - Installing and Using Python and Jupyter Notepad
Python - Installing and Using Python and Jupyter NotepadPython - Installing and Using Python and Jupyter Notepad
Python - Installing and Using Python and Jupyter Notepad
 
Natural Language Processing - Groupings (Associations) Generation
Natural Language Processing - Groupings (Associations) GenerationNatural Language Processing - Groupings (Associations) Generation
Natural Language Processing - Groupings (Associations) Generation
 
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...
Natural Language Provessing - Handling Narrarive Fields in Datasets for Class...
 
Machine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural NetworksMachine Learning - Introduction to Recurrent Neural Networks
Machine Learning - Introduction to Recurrent Neural Networks
 
Machine Learning - Introduction to Convolutional Neural Networks
Machine Learning - Introduction to Convolutional Neural NetworksMachine Learning - Introduction to Convolutional Neural Networks
Machine Learning - Introduction to Convolutional Neural Networks
 
Machine Learning - Introduction to Neural Networks
Machine Learning - Introduction to Neural NetworksMachine Learning - Introduction to Neural Networks
Machine Learning - Introduction to Neural Networks
 
Python - Numpy/Pandas/Matplot Machine Learning Libraries
Python - Numpy/Pandas/Matplot Machine Learning LibrariesPython - Numpy/Pandas/Matplot Machine Learning Libraries
Python - Numpy/Pandas/Matplot Machine Learning Libraries
 
Machine Learning - Accuracy and Confusion Matrix
Machine Learning - Accuracy and Confusion MatrixMachine Learning - Accuracy and Confusion Matrix
Machine Learning - Accuracy and Confusion Matrix
 
Machine Learning - Ensemble Methods
Machine Learning - Ensemble MethodsMachine Learning - Ensemble Methods
Machine Learning - Ensemble Methods
 
ML - Multiple Linear Regression
ML - Multiple Linear RegressionML - Multiple Linear Regression
ML - Multiple Linear Regression
 
ML - Simple Linear Regression
ML - Simple Linear RegressionML - Simple Linear Regression
ML - Simple Linear Regression
 
Machine Learning - Dummy Variable Conversion
Machine Learning - Dummy Variable ConversionMachine Learning - Dummy Variable Conversion
Machine Learning - Dummy Variable Conversion
 
Machine Learning - Splitting Datasets
Machine Learning - Splitting DatasetsMachine Learning - Splitting Datasets
Machine Learning - Splitting Datasets
 
Machine Learning - Dataset Preparation
Machine Learning - Dataset PreparationMachine Learning - Dataset Preparation
Machine Learning - Dataset Preparation
 
Machine Learning - Introduction to Tensorflow
Machine Learning - Introduction to TensorflowMachine Learning - Introduction to Tensorflow
Machine Learning - Introduction to Tensorflow
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 

Recently uploaded

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024BookNet Canada
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024Scott Keck-Warren
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024BookNet Canada
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsAndrey Dotsenko
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsHyundai Motor Group
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr LapshynFwdays
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 

Recently uploaded (20)

Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
New from BookNet Canada for 2024: BNC BiblioShare - Tech Forum 2024
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024SQL Database Design For Developers at php[tek] 2024
SQL Database Design For Developers at php[tek] 2024
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
#StandardsGoals for 2024: What’s new for BISAC - Tech Forum 2024
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter RoadsSnow Chain-Integrated Tire for a Safe Drive on Winter Roads
Snow Chain-Integrated Tire for a Safe Drive on Winter Roads
 
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
"Federated learning: out of reach no matter how close",Oleksandr Lapshyn
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 

AI - Introduction to Reinforcement Learning

  • 1. Artificial Intelligence Reinforcement Learning Introduction Portland Data Science Group Created by Andrew Ferlitsch Community Outreach Officer July, 2017
  • 2. Introduction • A machine learning method for adaptation to an environment. • An agent (e.g., robot) interacts with a dynamic environment. • An agent learns from interacting with the environment the best actions to take. • Concepts include: • Markov Principles • Exploration / Exploitation • Dynamic Programming
  • 3. Actions / Environment Reflex Agent Environment Action Observation Continuous Cycle: Observe Environment, Take Action, Observe Environment, Take Action Actions are determined based on predefined rules. Preprogrammed
  • 4. Intelligent Agent Wikipedia: An Intelligent Agent is an autonomous entity which observes through sensors and acts upon an environment using actuators (i.e. it is an agent) and directs its activity towards achieving goals (i.e. it is "rational", as defined in economics). Intelligent Agent Sensors Actuators Environment Senses the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic) Room, Street, Warehouse, etc. Modifies the environment (e.g., walk, pickup, drive) State Actions Current State of the Agent relative to the environment. Possible Actions the Agent can take. Policy A set of rules that map a state to an action.
  • 5. State / Reward Intelligent Agent Environment Action Observation State Reward How the action effected the agent and environment. How positive or negative is the new state. LEARN What was learned from the reward. Policy Learned set of rules of: States -> Actions Example Positive Reward: Robot Stands Up, Closer to Destination Example Negative Reward: Robot Falls Down, Further from Destination Reinforcement Learning
  • 6. Learn by Trial and Error Un-programmed Predefined set of Actions Reward Function Delivered From Factory Trial and Error Programmed Predefined set of Actions Reward Function Reinforcement Learning Policy States -> Actions
  • 7. Simple Reinforcement Learning Example Actions: Stand Walk Run Rewards: Stand -> 0 Walk -> 1 Run -> 2 Fall -> -2 Un-programmed Initial State -> Fall State->Action Actions: Stand -> 0 Fall -> Stand Walk -> Robot Falls Down -> -2 Run -> Robot Falls Down -> -2 State->Stand Stand -> 0 Walk -> 1 Stand->Walk Run -> Robot Falls Down -> -2 State->Walk Stand -> 0 Walk -> 1 Run -> 2 Walk->Run Learned PolicyTrials Highest Reward