Artificial Intelligence
Learning Agents
Introduction
Artificial Intelligence Training
Created by Andrew Ferlitsch
Instructor
October, 2017
Introduction
• An agent (e.g., a robot) interacts with a dynamic
environment.
• An agent learns the best actions to take by interacting
with the environment.
• Four Types of Agents (in increasing capability):
• Simple Reflex agents
• Model-based agents
• Goal-based agents
• Utility-based agents
Simple Reflex Agent
• Sensors: sense the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic).
• Actuators: modify the environment (e.g., walk, pick up, drive).
• Environment: room, street, warehouse, etc.
• State: the current state of the agent relative to the environment.
• Actions: the actions the agent can take.
• Rules: a set of predefined rules that map a state to an action.
A simple reflex agent always executes the same action for the same observation.
Works in environments that are fully observable.
Actions / Environment (Simple Reflex)
The reflex agent and the environment form a continuous cycle: observe the environment, take an action, observe the environment, take an action.
• Observation: how the action affected the agent and environment.
• State: the current state, taken directly from the observation.
• Rules (preprogrammed): predefined rules select the action based on the current state.
Actions are determined entirely by the predefined rules, as sketched below.
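To make the rule lookup concrete, here is a minimal Python sketch of the observe/act cycle. The rule table, state strings, and environment interface are illustrative assumptions, not taken from the slides.

```python
# Minimal simple reflex agent. All names (RULES, the state strings,
# and the env interface) are hypothetical illustrations.

RULES = {
    "obstacle_ahead": "turn_left",    # hypothetical state -> action pairs
    "clear_path": "move_forward",
}

def simple_reflex_agent(observation):
    """Predefined rules map the observed state directly to an action,
    so the same observation always produces the same action."""
    return RULES.get(observation, "wait")  # default action when no rule matches

def run(env, steps=100):
    """Continuous cycle: observe the environment, take an action, repeat."""
    for _ in range(steps):
        obs = env.observe()                  # assumed environment interface
        env.apply(simple_reflex_agent(obs))  # assumed environment interface
```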
Model-Based (Reflex) Agent
• Sensors: sense the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic).
• Actuators: modify the environment (e.g., walk, pick up, drive).
• Environment: room, street, warehouse, etc.
• State: the presumed state of the agent relative to the environment.
• Actions: the actions the agent can take.
• Rules: a set of predefined rules that map a state to an action.
• Past: short-term memory of past observations.
• Model: a model of how the environment responds, used to predict unobserved changes to the environment.
A model-based agent uses a model to predict the unobserved portion of the environment.
Works in environments that are only partially observable.
Actions / Environment (Model-Based)
The model-based reflex agent and the environment form the same continuous cycle of observation and action.
• Observation: how the action affected the agent and environment.
• Past: the history of past observations.
• Model (preprogrammed): a predefined model of how the environment behaves; it combines the past and present observations to predict the state of the environment.
• Rules: predefined rules select the action based on the predicted state.
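A short sketch of the same cycle with a model added. The model.predict call and the rule table are hypothetical interfaces, not from the slides.

```python
# Model-based reflex agent sketch (hypothetical interfaces).

class ModelBasedReflexAgent:
    def __init__(self, model, rules):
        self.model = model   # predefined model of how the environment behaves
        self.rules = rules   # predefined state -> action rules
        self.past = []       # short-term memory of past observations

    def act(self, observation):
        # The model combines past and present observations to predict
        # the state of the partially observable environment.
        state = self.model.predict(self.past, observation)
        self.past.append(observation)
        # Predefined rules select the action based on the predicted state.
        return self.rules[state]
```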
Goal-Based Agent
• Sensors: sense the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic).
• Actuators: modify the environment (e.g., walk, pick up, drive).
• Environment: room, street, warehouse, etc.
• State: the (presumed or known) state of the agent relative to the environment.
• Actions: the actions the agent can take.
• Rules: a set of predefined rules that map a state to an action.
• Past: short-term memory of past observations.
• Model (optional): a model of how the environment responds, used to predict unobserved changes to the environment.
• Goal: one or more goals to achieve, used when evaluating the next action (i.e., how much closer it comes to achieving the goal).
A goal-based agent uses one or more goals to evaluate how close the next possible action comes to achieving the goal.
Works in environments in which the agent needs to predict the future.
Actions / Environment (Goal-Based)
The goal-based agent and the environment form the same continuous cycle of observation and action.
• Observation: how the action affected the agent and environment.
• Past: the history of past observations.
• Model: a predefined model of how the environment behaves.
• Goal(s): one or more goals for evaluating how close an action/state comes to the goal.
• Rules: predefined rules select the action based on how close the predicted state is to the goal, as sketched below.
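The evaluation step can be sketched as a one-step lookahead, assuming a hypothetical predict(state, action) model and a problem-specific distance measure: the agent picks the action whose predicted state is closest to the goal.

```python
# Goal-based agent sketch (hypothetical names): for each possible action,
# the predefined model predicts the resulting state, and the agent selects
# the action whose predicted state is closest to the goal.

class GoalBasedAgent:
    def __init__(self, model, actions, goal, distance):
        self.model = model        # predicts the next state for (state, action)
        self.actions = actions    # actions the agent can take
        self.goal = goal          # the goal state to achieve
        self.distance = distance  # problem-specific closeness measure

    def act(self, state):
        # Choose the action minimizing distance from predicted state to goal.
        return min(self.actions,
                   key=lambda a: self.distance(self.model.predict(state, a),
                                               self.goal))
```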
Utility-Based (“Rational”) Agent
• Sensors: sense the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic).
• Actuators: modify the environment (e.g., walk, pick up, drive).
• Environment: room, street, warehouse, etc.
• State: the (presumed or known) state of the agent relative to the environment.
• Actions: the actions the agent can take.
• Rules: a set of predefined rules that map a state to an action.
• Past: short-term memory of past observations.
• Model (optional): a model of how the environment responds, used to predict unobserved changes to the environment.
• Goal: one or more goals to achieve, used when evaluating the next action (i.e., how much closer it comes to achieving the goal).
• Utility: a measurement of the value of an action towards the goal.
A utility-based agent uses a utility to measure the value of the next possible action towards achieving the goal.
Works in environments in which achieving the goal must be optimized.
Actions / Environment (Utility-Based)
The utility-based agent and the environment form the same continuous cycle of observation and action.
• Observation: how the action affected the agent and environment.
• Past: the history of past observations.
• Model: a predefined model of how the environment behaves.
• Goal(s): one or more goals for evaluating how close an action/state comes to the goal.
• Utility U(S, A): a utility for measuring the value of a state/action pair towards achieving a goal.
• Rules: predefined rules select the action based on the value of the predicted state towards achieving the goal.
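Replacing goal distance with a utility gives the utility-based variant. The sketch below assumes a hypothetical utility function U(state, action) that scores state/action pairs; the agent maximizes it over the possible actions.

```python
# Utility-based agent sketch (hypothetical names): instead of asking only
# "is the predicted state closer to the goal?", a utility U(S, A) measures
# the value of each state/action pair, and the agent picks the best one.

class UtilityBasedAgent:
    def __init__(self, model, actions, utility):
        self.model = model      # predicts the next state for (state, action)
        self.actions = actions  # actions the agent can take
        self.utility = utility  # U(state, action): value towards the goal

    def act(self, state):
        # Choose the action maximizing the utility of the predicted state.
        return max(self.actions,
                   key=lambda a: self.utility(self.model.predict(state, a), a))
```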
What’s Missing?
• There is no learning!
• Learn the Model (learn to model the environment)
• Learn the Utility (learn to measure the value of a state)
Learning (“Intelligent”) Agent
• Sensors: sense the environment (e.g., camera, audio, LIDAR, GPS, ultrasonic).
• Actuators: modify the environment (e.g., walk, pick up, drive).
• Environment: room, street, warehouse, etc.
• State: the (presumed or known) state of the agent relative to the environment.
• Actions: the actions the agent can take.
• Policy: a learned model of the environment and learned state/action rules.
• Goal: one or more goals to achieve, used when evaluating the next action (i.e., how much closer it comes to achieving the goal).
• Utility: a measurement of the value of an action towards the goal.
• Critic: a measurement of how good an action actually was.
A learning agent dynamically learns a policy to model the environment and build state/action rules.
Works in environments that are dynamically changing (stochastic).
State / Reward
The intelligent agent and the environment form the same continuous cycle, now with a reward signal.
• State: how the action affected the agent and environment.
• Reward: how positive or negative the new state is.
• Learn: what was learned from the reward.
• Policy: the learned set of rules mapping states -> actions.
Example positive reward: the robot stands up, or moves closer to the destination.
Example negative reward: the robot falls down, or moves further from the destination.
This cycle is Reinforcement Learning; a minimal sketch follows.
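The deck names reinforcement learning without fixing an algorithm; tabular Q-learning is one standard way to learn such a policy from state/reward feedback. The env interface (reset, step, actions) below is an assumption for illustration, not from the slides.

```python
import random
from collections import defaultdict

def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1):
    """Learn a policy (states -> actions) from state/reward feedback."""
    Q = defaultdict(float)   # learned value estimate for each (state, action)
    for _ in range(episodes):
        state, done = env.reset(), False
        while not done:
            # Explore occasionally; otherwise exploit current estimates.
            if random.random() < epsilon:
                action = random.choice(env.actions)
            else:
                action = max(env.actions, key=lambda a: Q[(state, a)])
            # Reward: how positive or negative the new state is.
            next_state, reward, done = env.step(action)
            best_next = max(Q[(next_state, a)] for a in env.actions)
            # Critic/learn step: update the value estimate from the reward.
            Q[(state, action)] += alpha * (reward + gamma * best_next
                                           - Q[(state, action)])
            state = next_state
    # Policy: the learned set of rules mapping states -> actions.
    return {s: max(env.actions, key=lambda a: Q[(s, a)])
            for s in {s for (s, _) in Q}}
```

Here the Q-values play the role of the utility, the reward update plays the role of the critic, and the greedy rule derived from Q is the learned policy.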
