Introduction to AI - Eighth Lecture


  1. Introduction to AI - 8th Lecture
     1990's - Agents
     Wouter Beek, me@wouterbeek.com
     10 November 2010
  2. Origin
     • The rational-agent concept originates in economics.
     • Utility theory: the theory of preferred outcomes.
     • Decision theory: the dynamics of utility maximization in an unpredictable environment.
     • Game theory: the dynamics of utility maximization when participants affect each other's utility in a predictable way.
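Decision theory's core rule, choosing the action with the highest expected utility, can be sketched in a few lines. The actions, probabilities, and utility values below are illustrative assumptions, not from the lecture:

```python
# A minimal decision-theoretic choice: weigh each outcome's utility
# by its probability and pick the action with the highest sum.
# Scenario and numbers are made up for illustration.

def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs."""
    return sum(p * u for p, u in outcomes)

actions = {
    # outcomes under rain (p=0.5) and no rain (p=0.5)
    "take_umbrella": [(0.5, 8), (0.5, 6)],
    "leave_umbrella": [(0.5, 0), (0.5, 10)],
}

best = max(actions, key=lambda a: expected_utility(actions[a]))
print(best)  # take_umbrella (7.0 vs 5.0 expected utility)
```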
  3. Agent
     • An agent:
       • perceives its environment through sensors;
       • acts on its environment through actuators.
     • The environment can be non-physical.
     • Percept: the set of perceptions at some point in time.
     • Percept sequence: the sequence of perception-time pairs, i.e. the complete percept history.
     • Agent function: a mapping from percept sequences to actions.
     • Agent program: an implementation of an agent function.
     • Agent architecture: the machinery on which the agent program runs.
  4. Rationality
     • A rational being considers all the consequences of all possible actions, and makes these consequences part of the decision process for performing each of those actions.
     • Given an environment and a percept sequence, what is the 'best' thing to do?
     • Performance measure: an objective assessment of the success of an arbitrary sequence of environment states.
  5. Rational agent
     What rational action depends on:
     1. The agent's prior knowledge.
     2. The performance measure over environment state sequences.
     3. The actions the agent can perform.
     4. The agent's percept sequence.
     • Information gathering: performing (3) in order to enrich (4) and thereby increase (1).
     • Learning: increasing (1) through (4).
     • Autonomy: all of (1) ultimately derives from (4).
  6. Task environment
     • Fully / partially observable
     • Single-agent / multiagent (competitive / cooperative)
     • Deterministic / stochastic
     • Episodic / sequential
     • Static / dynamic / semidynamic
     • Discrete / continuous
     • Known / unknown
     • Blocks world: a fully observable, single-agent, deterministic, episodic, static, known environment.
     • 1990's agents: partially observable, multiagent, stochastic, sequential, dynamic, continuous, unknown environments.
  7. Example: the vacuum world
     • Percepts: location (A, B) and contents (dirty, clean).
     • Actions: left, right, suck, idle.
  8. Table-driven agent
     • Low intelligence
     • High complexity
     • The task of AI is to improve on this complexity metric.
  9. Simple reflex agent
     • No memory
     • Low complexity: the number of percepts for which a reaction is defined.
     • Condition-action rules
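The condition-action rules of a simple reflex agent can be written directly as code. This sketch handles the vacuum world; the complexity is now linear in the number of distinct percepts rather than percept sequences:

```python
# Simple reflex agent for the vacuum world: no memory, just
# condition-action rules applied to the current percept.

def reflex_vacuum_agent(percept):
    location, status = percept
    if status == "dirty":   # rule: dirty square -> clean it
        return "suck"
    if location == "A":     # rule: clean A -> move to B
        return "right"
    return "left"           # rule: clean B -> move to A

print(reflex_vacuum_agent(("A", "dirty")))  # suck
print(reflex_vacuum_agent(("B", "clean")))  # left
```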
  10. Model-based agent (diagram)
  11. Model-based agent
      Inputs to deliberation:
      • the current percepts;
      • the state: a model or internal representation;
      • condition-action rules;
      • recent actions.
      • The state is updated based on the previous state, the most recent action, and the percept.
      • The action is chosen based on the state and the rules.
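A minimal sketch of a model-based vacuum agent, following the update scheme on the slide. The class and its state representation are my own illustration; in this tiny world the percept already identifies the visited square, so the last action plays no role in the update, though it would in a richer model:

```python
# Model-based reflex agent sketch: keeps an internal state (a world
# model), updates it on every percept, and chooses an action from
# state plus condition-action rules.

class ModelBasedVacuumAgent:
    def __init__(self):
        # internal model: last known status of each square
        self.state = {"A": "unknown", "B": "unknown"}
        self.last_action = None

    def update_state(self, percept):
        # In general: new state = f(previous state, last action, percept).
        location, status = percept
        self.state[location] = status

    def act(self, percept):
        self.update_state(percept)
        location, status = percept
        if status == "dirty":
            action = "suck"
        elif all(s == "clean" for s in self.state.values()):
            action = "idle"  # the model says everything is clean
        else:
            action = "right" if location == "A" else "left"
        self.last_action = action
        return action
```

Unlike the simple reflex agent, this agent can decide to idle once its model records both squares as clean, something no memoryless rule can do.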
  12. Goal-based agent (diagram)
  13. Utility-based agent (diagram)
  14. Utility-based agent
      • Utility function: an internalization of the performance measure.
      • The action is chosen based on state, goal, and cost.
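The "state, goal, and cost" choice can be sketched as utility of the predicted resulting state minus the action's cost. The utility function, transition model, and cost numbers below are illustrative assumptions:

```python
# Utility-based choice sketch: the utility function internalizes the
# performance measure (here: +1 per clean square); the agent picks
# the action maximizing predicted utility minus action cost.

def utility(state):
    return sum(1 for status in state.values() if status == "clean")

def result(state, location, action):
    # trivial transition model: only "suck" changes the world
    new = dict(state)
    if action == "suck":
        new[location] = "clean"
    return new

def choose(state, location, actions, cost):
    return max(actions,
               key=lambda a: utility(result(state, location, a)) - cost[a])

cost = {"suck": 0.5, "right": 0.2, "idle": 0.0}
state = {"A": "dirty", "B": "clean"}
print(choose(state, "A", ["suck", "right", "idle"], cost))  # suck
```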
  15. Learning agent (diagram)
  16. Multiagent systems
      • Cooperation
      • Competition
      • Swarm intelligence: the performance measure is applied to collective behavior.
        • Decentralized representation
        • Emergent behavior:
          • weak emergence: the qualities of the system are reducible to the system's constituent parts;
          • strong emergence: qualities that are not so reducible, e.g. qualia.
      • The concepts of utility and rationality change!
  17. Prisoner's dilemma

                            B stays silent     B betrays
      A stays silent        A: 0.5, B: 0.5     A: 10, B: 0
      A betrays             A: 0, B: 10        A: 5, B: 5

      Two suspects are arrested. If one testifies against the other (betrays) while the other remains silent, the betrayer goes free and the silent accomplice receives the full 10-year sentence. If both remain silent, both prisoners are sentenced to only six months on a minor charge. If each betrays the other, each receives a 5-year sentence. How should the prisoners act?
      • No matter what the other player does, a player always obtains a shorter sentence by betraying.
      • Since in every situation betraying is more beneficial than remaining silent, all rational players will betray.
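The dominance argument on this slide can be checked mechanically: for each of B's choices, compare A's sentence under "betray" and "silent" (payoffs are years in prison, so lower is better):

```python
# Payoff table from the slide: (A_action, B_action) -> (A_years, B_years).
sentence = {
    ("silent", "silent"): (0.5, 0.5),
    ("silent", "betray"): (10, 0),
    ("betray", "silent"): (0, 10),
    ("betray", "betray"): (5, 5),
}

# Whatever B does, A's sentence is strictly shorter by betraying,
# so "betray" strictly dominates "silent" for A (and by symmetry for B).
for b in ("silent", "betray"):
    assert sentence[("betray", b)][0] < sentence[("silent", b)][0]
print("betray strictly dominates silent")
```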
