SlideShare a Scribd company logo
1 of 81
Download to read offline
1	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
This	
  work	
  is	
  licensed	
  under	
  a	
  CreaCve	
  Commons	
  AFribuCon-­‐NonCommercial-­‐ShareAlike	
  4.0	
  InternaConal	
  License.	
  
Chapter	
  2	
  	
  
Delibera.on	
  with	
  Determinis.c	
  Models	
  
Dana S. Nau and Vikas Shivashankar
University of Maryland
2	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Purpose	
  of	
  this	
  Chapter	
  
●  Last time, Vikas mentioned conventional AI planning
Ø  Given
•  a domain model (descriptions of the states and actions)
•  initial state s0, and goal g
Ø  Find a plan or a policy that
•  is executable starting in s0
•  produces a state that satisfies g
●  This chapter discusses some techniques for doing that
Ø  Also, how to use those techniques in acting systems
●  Outline:
Ø  2a: Representing planning domains
Ø  2b: Planning algorithms
Ø  2c: Acting and planning
3	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
This	
  work	
  is	
  licensed	
  under	
  a	
  CreaCve	
  Commons	
  AFribuCon-­‐NonCommercial-­‐ShareAlike	
  4.0	
  InternaConal	
  License.	
  
Chapter	
  2	
  	
  
Delibera.on	
  with	
  Determinis.c	
  Models	
  
	
  
2a:	
  Represen.ng	
  Planning	
  Domains	
  
Dana S. Nau and Vikas Shivashankar
University of Maryland
4	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Domain	
  Model	
  
●  Planning domain: an abstract model of the environment
Ø  Many different kinds of environments, various ways to model them
●  In this chapter, the model is a deterministic state-transition system
Ø  Σ = (S,A,γ)
•  S is a finite set of states
▸  States of the world
•  A is a finite set of actions
▸  Things an actor can do
•  γ: S × A → S is a prediction function (or state-transition function)
▸  Given a state s and action a, γ(s,a) is another state
•  Prediction of what state will be produced by executing a in s
•  γ is partial: γ(s,a) is undefined if a is inapplicable in s
▸  Dom(a) = {s ∈ S | γ(s,a) is defined} = {states where a is applicable}
▸  Range(a) = {γ(s,a) | s ∈ Dom(a)}
5	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Implicit	
  Assump.ons	
  
●  The state-transition model incorporates the following assumptions
Ø  Static world
•  Changes occur only in response to the actor’s actions
Ø  Perfect information
•  Actor always has all the information it needs
Ø  Instantaneous actions
•  Each action causes an instantaneous transition from one state to the next
Ø  Determinism
•  Actions are deterministic
Ø  Correct prediction function
•  Outcome of action a in state s is always γ(s,a)
Ø  Flat search space
•  Only one level of abstraction; ignore how to refine actions at a lower level
6	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
How	
  to	
  Represent	
  Σ?	
  
●  If the domain is small enough
Ø  Give each state and action a
unique name
Ø  For each s and a, store γ(s,a)
in a lookup table
loc1
loc3
loc2
loc6
loc5
loc4
loc7
loc8
loc0
x
y
4
3
2
1
1 2 3 4 5 60
loc9
●  If a domain is larger, don’t represent all states explicitly
Ø  Have a formalism for describing states by describing their properties
Ø  Represent each action by describing how it changes those properties
Ø  Start with initial state, use actions to produce other states
7	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Determinis.c	
  Operator	
  (General	
  Form)	
  
●  Domain-specific format for representing states
Ø  Invent your own format
●  General form of a deterministic operator:
Ø  o = (head(o), pre(o), eff(o), cost(o))
•  head(o): name and parameter list
•  pre(o): preconditions
▸  Computational tests to predict whether an action can be performed in
a state s
▸  In principle, should be necessary/sufficient for the action to run
without error
•  eff(o): effects
▸  Procedures that assign new values to some of the state variables
•  cost(o): procedure that returns a number
▸  Can be omitted, in which case cost(o) = 1
▸  Could represent monetary cost, time required, something else
8	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Example	
  
●  Suppose we want to plan how to create a metal hole in the workpiece
●  a state s includes
Ø  geometric model of the workpiece, variables describing its location,
orientation, and other status information,
Ø  capabilities and status of drilling machine and drill
●  Several actions (getting the workpiece onto the machine, clamping it, loading a
drill bit, etc.)
Ø  Next slide: the drilling operation itself
9	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Drilling	
  Ac.on	
  
●  head(o) = drill-­‐hole(m, w, b, x1, …, xn)
Ø  m, w, b are the names or ID numbers of the drilling machine, the workpiece,
and the drill bit
Ø  x1, …, xn are specifications of the hole’s location, orientation, depth, and
required machining tolerances
●  pre(o): computational tests
Ø  Can the drilling machine and drill bit produce a hole having the desired
geometry and machining tolerances?
Ø  Is the drill loaded into the drilling machine? Is the workpiece is properly
clamped onto the drilling platform? Etc.
●  eff(o): geometric computation
Ø  Modify the geometric model of the workpiece to include a hole having the
desired specifications.
●  cost(o)
Ø  could be an estimate of the time required for the drilling operation
Ø  could be a cost estimate based on time + other criteria
10	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
State-­‐Variable	
  Representa.on	
  
●  Represent each state as a collection of properties of various objects
Ø  Represent each action by describing how it changes those properties
●  Let B = {all objects that matter for planning}
Ø  B may be classified into subsets: various kinds of objects
●  Example
Ø  B = Robots ∪ Containers ∪ Locs ∪ Booleans
•  Robots = {r1,	
  r2}
•  Containers = {c1,	
  c2}
•  Locs = {d1,	
  d2,	
  d3}
•  Booleans = {T,	
  F}	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
11	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Proper.es	
  of	
  Objects	
  
●  Define ways to represent properties of objects
Ø  Two kinds of properties: rigid and varying
●  A property is rigid if it stays the same in every state
Ø  Represent as a mathematical relation
●  Example:
Ø  adjacent = {(d1,d2), (d2,d1)}
Ø  Can also write as
•  adjacent(d1,d2)	
  
•  adjacent(d2,d1)	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
12	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Varying	
  Proper.es	
  
●  A property is varying if it may differ in different states
Ø  Represent using a state variable that we can assign a value to
Ø  Each state variable x has a range (set of possible values), Range(x)
Ø  For each state s, s(x) ∈ Range(x) is x’s value in state s
●  Example:
Ø  what we want to represent
Ø  Each robot can hold at most one container	
  
Ø  Each robot is at a one of the locations
Ø  Each container is on a robot or at one of the locations
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
13	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
A	
  Simple	
  Example	
  
●  Set of all state variables
Ø  X = {loc(r1), loaded(r1), loc(c1), loc(r2), loaded(r2), loc(c2)}
●  Range(loc(r1)) = Locs
●  Range(loc(r2)) = Locs
●  Range(loaded(r1)) = Booleans
●  Range(loaded(r2)) = Booleans
●  Range(loc(c1)) = Robots ∪ Locs
●  Range(loc(c2)) = Robots ∪ Locs
Why have both
loc and loaded?
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
14	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
States	
  as	
  Func.ons	
  
●  Write each state as a function s that maps each x ∈ X to a value in Range(x)
s1(loc(r1))	
  =	
  d1,	
  	
  	
  	
  	
  s1(loaded(r1))	
  =	
  F,	
  	
  	
  	
  	
  s1(loc(c1))	
  =	
  d1,	
  
	
  	
  	
  	
  	
  	
  	
  s1(loc(r2))	
  =	
  d2,	
  	
  	
  	
  	
  s1(loaded(r2))	
  =	
  F,	
  	
  	
  	
  	
  s1(loc(c2))	
  =	
  d2
Ø  Let S = the set of all such functions
●  Functions are sets of ordered pairs
s1 = {(loc(r1),	
  d1),	
  	
  (loaded(r1),	
  F),	
  	
  (loc(c1),	
  d1), …}
●  Rewrite as
s1 = {loc(r1)	
  =	
  d1,	
  	
  	
  loaded(r1)	
  =	
  F,	
  	
  	
  loc(c1)	
  =	
  d1,	
  	
  	
  
	
  loc(r2)	
  =	
  d2,	
  	
  	
  loaded(r2)	
  =	
  F,	
  	
  	
  loc(c2)	
  =	
  d2}
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
15	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
●  Just because a function maps each state variable x into a value in Range(x),
this doesn’t make it a state
Ø  Some sets of state-variable values may not make any sense as states
s = {loaded(r1) = F, loc(c1) = r1, …}
s = {loc(c1) = d1, loc(c2) = r1, …}
●  Need restrictions on what sets of state-variable values constitute real states
Ø  In most of the book, we won’t represent such restrictions explicitly
Ø  Just write the action models in such a way that none of them will ever
produce such a set of assignments
Discussion	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
16	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Descrip.ve	
  Ac.on	
  Models	
  
●  Actions often fall into closely related classes
●  For each class, write a parameterized schema called a planning operator to
describe all the actions in that class
action move(r, l, m)
Pre: loc(r) = l, adjacent(l, m)
Eff: loc(r) ← m
action take(r, l, c)
Pre: loaded(r) = F,	
  loc(r) = l, loc(c) = l, 	
  
	
  Eff: loaded(r) ← T, loc(c) ← r
action put(r, l, c)
Pre: loc(r) = l, loc(c) = r
Eff: loaded(r) ← F, loc(c) ← l
Each parameter has a range
of possible values, e.g.,
Range(r) = Robots
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
17	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
CSV	
  Operator	
  
●  Classical State Variable (CSV) Operator:
•  the kind of operator shown on the previous page
●  General form
Ø  o = (head(o), pre(o), eff(o), cost(o))
•  each precondition in pre(o) must have one of these forms:
▸  relname(t1,…,tk) varname(t1,…,tk) = t0
¬relname(t1,…,tk) varname(t1,…,tk) ≠ t0
•  relname = name of a rigid relation
•  varname = name of a state variable
•  each ti must be a constant (i.e., a member of B) or a parameter
•  each effect in eff(o) must have the form varname(t1,…,tk) ← t0
•  if cost(o) isn’t omitted, it must be a nonnegative number (not a formula)
●  Limited representational capability, but easy to compute, easy to reason about
Ø  Many algorithms have been written to use this kind of operator
18	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
CSV	
  Ac.ons	
  
●  A CSV action is an instance
of a planning operator
Ø  assign values to parameters
action move(r1,d1,d2)	
  
	
  	
  	
  	
  Pre: loc(r1)	
  =	
  d1, adjacent(d1,d2)
Eff: loc(r1) ← d2	
  
action take(r2,d2,c2)	
  
	
  	
  	
  	
  Pre: loaded(r2)	
  =	
  F,	
  loc(r2)	
  =	
  d2,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c2)	
  =	
  d2,	
  	
  
	
  	
  	
  	
  Eff: loaded(r2)	
  ← T,	
  loc(c1)	
  ← r2	
  
	
  
action put(r1,d1,c1)	
  
	
  	
  	
  	
  Pre: loc(r1)	
  =	
  d1,	
  loc(c1)	
  =	
  r1	
  
	
  	
  	
  	
  Eff: loaded(r1)	
  ← F,	
  loc(c1)	
  ← d1
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
19	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Compu.ng	
  γ	
  
●  s1 = {loaded(r1)	
  =	
  F,	
  loaded(r2)	
  =	
  F,	
  
	
  loc(r1)	
  =	
  d1,	
  loc(r2)	
  =	
  d2,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1,	
  	
  loc(c2)	
  =	
  d2}
●  action take(r2,d2,c2)	
  
	
  	
  	
  	
  Pre: loaded(r2)	
  =	
  F,	
  loc(r2)	
  =	
  d2,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c2)	
  =	
  d2,	
  	
  
	
  	
  	
  	
  Eff: loaded(r2)	
  ← T,	
  loc(c1)	
  ← r2	
  
	
  
	
  
●  γ(s1, take(r2,d2,c2)) =
{loaded(r1)	
  =	
  F,	
  loaded(r2)	
  =	
  T,	
  	
  
	
  loc(r1)	
  =	
  d1,	
  loc(r2)	
  =	
  d2,	
  
	
  loc(c1)	
  =	
  d1,	
  loc(c2)	
  =	
  r2}
	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
   c2	
  
20	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Plans	
  
●  Plan: a sequence of actions
π = 〈a1,a2,…,an〉
Ø  π is applicable in s0 if the actions
can be applied in the order given
•  γ (s0,a1) = s1, γ (s1,a2) = s2, …, γ (sn–1,an) = sn
●  If π is applicable, define γ (s0,π) = sn
●  Example
π = 〈take(r2,d2,c2),	
  	
  move(r2,d2,d1),	
  	
  put(r2,d1,c2)〉
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
c1	
  	
  
r2	
  	
  
c2	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
c1	
  	
  
c2	
  	
  
s0:
γ (s0,π):
r2	
  	
  
r1	
  	
  
r1	
  	
  
21	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Planning	
  Problems,	
  Solu.ons	
  
●  Planning problem: P=(Σ,s0,g)
Ø  Σ = planning domain
Ø  s0 = initial state
Ø  g = {g1,…,gk} is a set of constraints called the goal
●  π is a solution for P if γ (s0,π) satisfies g
Ø  π is a shortest solution if no shorter plan is also a solution
Ø  π is a minimal solution if no proper subsequence of π is also a solution
●  CSV planning problem:
Ø  Σ is a CSV planning domain,
Ø  g has the same form as a CSV operator’s preconditions
●  Example:
Ø  s0 and π as on previous page
Ø  g = {loc(r2)=d1,	
  loc(c2)=d1}
22	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Classical	
  and	
  State-­‐Variable	
  Representa.ons	
  
●  Motivation
Ø  The field of AI planning started out as automated theorem proving
Ø  It still uses a lot of that notation
●  Classical representation
Ø  Equivalent to CSV representation
Ø  Represents both rigid and varying properties using logical predicates
•  adjacent(l,m) - location l is adjacent to location m
•  loc(r,l) - robot r is at location l
•  loc(c,l), loc(c,r) - container c is at location l or on robot r
•  loaded(r) - there is a container on robot r
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
23	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
States	
  
●  Use ground atoms to represent both rigid and varying properties of Σ
●  To represent a state
Ø  s = {all ground atoms that are true in ŝ}
Ø  e.g., s1 = {loc(c1,d1),	
  loc(c2,d2),	
  
	
  loc(r1,d1),	
  loc(r2,d2),	
   	
  	
  
	
  adjacent(d1,d2),	
  adjacent(d2,d1),	
  	
  
	
  adjacent(d1,d3),	
  adjacent(d3,d1)}
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
24	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Classical	
  Planning	
  Operators	
  
●  Classical planning operator:
Ø  o = (head(o), pre(o), eff(o))
●  Each precondition and effect
must have one of these forms:
pred(t1,…,tk) ¬pred(t1,…,tk)
Ø  pred = predicate name
Ø  each ti must be a constant
(i.e., member of B) or
a parameter
●  Classical action: a ground
instance of a classical operator
move(r,l,m)	
  
Precond: loc(r,l),	
  	
  
Effects: ¬loc(r,l),	
  loc(r,m)	
  
	
  
take(r,l,c)	
  
Precond: loc(r,l),	
  loc(c,l),	
  ¬loaded(r)	
  
Effects: loc(c,r),	
  ¬loc(c,l),	
  loaded(r)	
  
put(r,l,c)	
  
Precond: loc(r,l),	
  loc(c,r)	
  
Effects: loc(c,l),	
  ¬loc(c,r),	
  ¬loaded(r)
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
r2	
  
c2	
  
25	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Discussion	
  
●  Classical representation is equivalent to state-variable representation in
expressive power
Ø  Each can be converted to the other in linear time and space
●  Classical representation is more natural for logicians
●  CSV is more natural for engineers and most computer scientists
Ø  When changing a value, you don’t have to explicitly delete the old one
●  Historically, classical representation has been more widely used
Ø  Many of the algorithms in the book were originally written to use
classical representation
Ø  That’s starting to change
Classical
rep.
CSV
rep.
P(b1,…,bk) becomes xP(b1,…,bk)=1
x(b1,…,bn)=b0
becomes
Px(b1,…,bn,b0)
26	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
This	
  work	
  is	
  licensed	
  under	
  a	
  CreaCve	
  Commons	
  AFribuCon-­‐NonCommercial-­‐ShareAlike	
  4.0	
  InternaConal	
  License.	
  
Chapter	
  2	
  	
  
Delibera.on	
  with	
  Determinis.c	
  Models	
  
	
  
2b:	
  Planning	
  Algorithms	
  
Dana S. Nau and Vikas Shivashankar
University of Maryland
27	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Mo.va.on	
  
●  Nearly all planning procedures are search procedures
●  Different planning procedures have different search spaces
Ø  This section: state-space planning
•  Each node represents a state of the world
▸  A plan is a path through the space
28	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Forward	
  Search	
  
Which forward-search algorithm
depends on how you implement
the nondeterministic choice
I’ll discuss several such algorithms
29	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Depth-­‐First	
  Search	
  
●  At each state, select the action that has the lowest heuristic-functionvalue
●  Visited is for cycle-checking
Ø  If you come to a state you’ve already seen on the current path, then backtrack
Ø  In a finite domain, this guarantees termination
●  Guaranteed to find a solution if one exists
30	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Greedy	
  Search	
  
●  Like DFFS, but never backtracks
Ø  Not guaranteed to find a solution
31	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
●  Requires a heuristic function in line 3
Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s)
Ø  Expands the node (computes nodes for all applicable actions)
Ø  Prunes nodes that can be shown to be no better than nodes already expanded
A*	
  
32	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
●  Requires a heuristic function in line 3
Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s)
Ø  Expands the node (computes nodes for all applicable actions)
Ø  Prunes nodes that can be shown to be no better than nodes already expanded
A*	
  
33	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Arad
366=0+366
34	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Zerind
Arad
Sibiu Timisoara
447=118+329 449=75+374393=140+253
35	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Zerind
Arad
Sibiu
Arad
Timisoara
Rimnicu VilceaFagaras Oradea
447=118+329 449=75+374
646=280+366 413=220+193415=239+176 671=291+380
36	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Zerind
Arad
Sibiu
Arad
Timisoara
Fagaras Oradea
447=118+329 449=75+374
646=280+366 415=239+176
Rimnicu Vilcea
Craiova Pitesti Sibiu
526=366+160 553=300+253417=317+100
671=291+380
X	
  
X	
  
37	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Zerind
Arad
Sibiu
Arad
Timisoara
Sibiu Bucharest
Rimnicu VilceaFagaras Oradea
Craiova Pitesti Sibiu
447=118+329 449=75+374
646=280+366
591=338+253 450=450+0 526=366+160 553=300+253417=317+100
671=291+380
X	
  X	
  
X	
  
38	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Bucharest
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu
Fagaras
Pitesti
Rimnicu Vilcea
Vaslui
Iasi
Straight−line distance
to Bucharest
0
160
242
161
77
151
241
366
193
178
253
329
80
199
244
380
226
234
374
98
Giurgiu
Urziceni
Hirsova
Eforie
Neamt
Oradea
Zerind
Arad
Timisoara
Lugoj
Mehadia
Dobreta
Craiova
Sibiu Fagaras
Pitesti
Vaslui
Iasi
Rimnicu Vilcea
Bucharest
71
75
118
111
70
75
120
151
140
99
80
97
101
211
138
146 85
90
98
142
92
87
86
Zerind
Arad
Sibiu
Arad
Timisoara
Sibiu Bucharest
Rimnicu VilceaFagaras Oradea
Craiova Pitesti Sibiu
Bucharest Craiova Rimnicu Vilcea
418=418+0
447=118+329 449=75+374
646=280+366
591=338+253 450=450+0 526=366+160 553=300+253
615=455+160 607=414+193
671=291+380
X	
  X	
  
X	
  
X	
  
39	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
●  Requires a heuristic function in line 3
Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s)
Ø  Expands the node (computes nodes for all applicable actions)
Ø  Prunes nodes that can be shown to be no better than nodes already expanded
A*	
  
40	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
●  h is admissible if for every s, 0 ≤ h(s) ≤ h*(s)
Ø  where h*(s) = least cost of getting from s to a state that satisfies g
●  If h is admissible then A* is guaranteed to return an optimal solution
●  Inadmissible heuristics might get you to a solution faster, but the solution won’t
necessarily be optimal
Proper.es	
  of	
  A*	
  
41	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Depth-­‐First	
  Branch	
  and	
  Bound	
  
●  Depth-first search with heuristic function and pruning
Ø  π* and c*: least-cost solution found so far, and the cost of that solution
Ø  Any time you find a solution with lower cost, update π* and c*
Ø  Prune any plan π such that cost(π) + h(s) ≥ c*
42	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Discussion	
  
●  If the state space is not too large, then A* or DFBB may be preferable
Ø  They are guaranteed to return optimal solutions
●  DFFS returns the first solution it finds
Ø  can be arbitrarily far from optimal
●  Greedy isn’t guaranteed to return a solution at all
●  If S is very large, A* may require excessive memory, and both it and DFBB may
require excessive running time
Ø  In these cases DFFS and Greedy may be preferable
43	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Example	
  
●  One rigid relation, adjacent
●  State variables to represent each
location’s x and y coordinates:
Ø  x = {(loc0, 2), (loc1, 0),
(loc2, 4), (loc3, 0), . . .}
Ø  y = {(loc0, 4), (loc1, 3),
(loc2, 4), (loc3, 2), . . .}
●  One planning operator:
action move(r, l, m)
Ø  pre: adjacent(l, m), loc(r) = l
Ø  eff: loc(r) ← m
●  cost(loci, move(r,locj)) = ︎(xi −xj)2 +(yi − yj)2
= ︎(x(loci) − x(locj))2 + (y(loci) − y(locj))2
●  hsld(si) = ︎(x(loci) − x(locg))2 + (y(loci) − y(locg))2
Ø  Optimal solution to a relaxed problem in which the locations don’t
need to be adjacent
loc1
loc3
loc2
loc6
loc5
loc4
loc7
loc8
loc0
x
y
4
3
2
1
1 2 3 4 5 60
loc9
Not CSV
44	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
CSV	
  Example	
  
●  s0 = {loc(r1)	
  =	
  d3,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1};
●  g = {loc(r1)	
  =	
  d3,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  loc(c1)	
  =	
  r1}
●  action move(r, d, e)
Ø  pre: loc(r) = d
Ø  eff: loc(r) ← e
●  action load(r, c, l)
Ø  pre: loaded(r) = F, loc(c) = l, loc(r) = l
Ø  eff: loaded(r) ← T, loc(c) ← r
●  action unload(r, c, l)
Ø  pre: loc(c) = r, loc(r) = l
Ø  eff: loaded(r) ← F, loc(c) ← l
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
45	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
CSV	
  Example	
  
●  Two applicable actions
●  a1 = move(r1,	
  d3,	
  d1)	
  
s1 = {loc(r1)	
  =	
  d1,	
  loaded(r1)	
  =	
  F,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1}
●  a2 = move(r1,	
  d3,	
  d2)	
  
s2 = {loc(r1)	
  =	
  d2,	
  loaded(r1)	
  =	
  F,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1}
●  In line 4, compute
Ø  min(h(s1), h(s2))
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
46	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Addi.ve-­‐Cost	
  and	
  Max-­‐Cost	
  Heuris.cs	
  
●  hadd(s) = Δadd(s,g), where
●  Minimum cost of a “plan tree” to achieve g from s
Ø  Pretend each element of g needs a completely separate plan
•  If an action achieves i elements of g, it’s included i times
Ø  Pretend each of an action’s preconditions needs a completely separate plan
Ø  Thus, get a “plan” that’s a tree of actions
•  Total cost can sometimes be much higher than h*(s)
●  Can implement as a tree search, going backward from g
47	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
true in s2
loc(r1)=d3 loc(c1)=r1
g = {loc(r1)=d3,,loc(c1)=r1}
move(r1,d1,d3) move(r1,d2,d3)
pre:
loc(r1)=d1
pre:
loc(r1)=d2
true in s1 … 0
0 > 0
min(1,(>1)) = 1
pre:
loaded(r1)=nil
true in s1
loc(r1)=d1 loc(c1)=d1
true in s2
0 0
load(r1,c1,d1)
0+1 = 10+1 = 1 (>0)+1 > 1
load(r1,c2,d2)
load(r1,c3,d3)
…
…
> 0
> 0
(>0) +1 > 1
(>0)+1 > 1
sum
min(1,(>1),(>1)) = 1
sum(0,0,0) = 0
su m
alternatives
alternatives
hadd(s1) = Δadd(s1,g) = sum(1,1) = 2
d3
r1	
   c1	
  
s1:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
Addi.ve-­‐Cost	
  Heuris.c	
  
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
48	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
hadd(s2) = Δadd(s2,g) = sum(1,2) = 3
move(r1,d2,d1)
loc(r1)=d3 loc(c1)=r1
g = {loc(r1)=d3,0loc(c1)=r1}
move(r1,d1,d3) move(r1,d2,d3)
pre:
loc(r1)=d1
pre:
loc(r1)=d2
… true in s2
0
> 0 0
min(1,(>1)) = 1
pre:
loaded(r1)=nil
true in s2
loc(r1)=d1 loc(c1)=d1
true in s2
0
0+1 = 1
load(r1,c1,d1)
1+1 = 2(>0)+1 > 1 0+1 = 1
load(r1,c2,d2)
load(r1,c3,d3)
…
…
> 1
> 1
(>1) +1 > 2
(>1)+1 > 2
sum
true in s2
0
min(2,(>2),(>2)) = 2
sum(0,1,0) = 1
su m
alternatives
alternatives
Addi.ve-­‐Cost	
  Heuris.c	
  
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
d3
r1	
   c1	
  s2:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
49	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Addi.ve-­‐Cost	
  and	
  Max-­‐Cost	
  Heuris.cs	
  
●  hmax(s) = Δmax(s,g), where
●  Like hadd(s), but doesn’t add all of the costs in the plan tree
Ø  Just the most costly path in the plan tree
●  Guaranteed to be a lower bound on h*(s)
●  Same tree search as for hadd(s)
50	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
true in s2
loc(r1)=d3 loc(c1)=r1
g = {loc(r1)=d3,,loc(c1)=r1}
move(r1,d1,d3) move(r1,d2,d3)
pre:
loc(r1)=d1
pre:
loc(r1)=d2
true in s1 … 0
0 > 0
min(1,(>1)) = 1
pre:
loaded(r1)=nil
true in s1
loc(r1)=d1 loc(c1)=d1
true in s2
0 0
load(r1,c1,d1)
0+1 = 10+1 = 1 (>0)+1 > 1
load(r1,c2,d2)
load(r1,c3,d3)
…
…
> 0
> 0
(>0) +1 > 1
(>0)+1 > 1
max
min(1,(>1),(>1)) = 1
max(0,0,0) = 0
m ax
alternatives
alternatives
hmax(s1) = Δmax(s1,g) = max(1,1) = 1
s1:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
Max-­‐Cost	
  Heuris.c	
  
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
d3
r1	
   c1	
  
51	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
move(r1,d2,d1)
loc(r1)=d3 loc(c1)=r1
g = {loc(r1)=d3,0loc(c1)=r1}
move(r1,d1,d3) move(r1,d2,d3)
pre:
loc(r1)=d1
pre:
loc(r1)=d2
… true in s2
0
> 0 0
min(1,(>1)) = 1
hmax(s2) = Δmax(s2,g) = max(1,2) = 2
pre:
loaded(r1)=nil
true in s2
loc(r1)=d1 loc(c1)=d1
true in s2
0
0+1 = 1
load(r1,c1,d1)
1+1 = 2(>0)+1 > 1 0+1 = 1
load(r1,c2,d2)
load(r1,c3,d3)
…
…
> 1
> 1
(>1) +1 > 2
(>1)+1 > 2
max
true in s2
0
min(2,(>2),(>2)) = 2
max(0,1,0) = 1
m ax
alternatives
alternatives
Max-­‐Cost	
  Heuris.c	
  
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
d3
r1	
   c1	
  s2:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
  
52	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Delete-­‐Relaxa.on	
  Heuris.cs	
  
●  Suppose a state s includes an assignment x = v
●  Suppose an action a has an effect x ← w
●  Then γ+(s, a) includes both x = v and x = w
●  Relaxed state (or r-state)
Ø  any set ŝ of state-variable values
Ø  may include more than one value for each state variable
●  ŝ r-satisfies a goal g if ŝ contains a subset that satisfies g
●  An action a is r-applicable in ŝ if ŝ r-satisfies a’s preconditions
Ø  In this case, γ+(ŝ,a) = ŝ ∪ γ(s,a)
●  π = ⟨a1, …, an⟩ is r-applicable in ŝ0 if there are r-states ŝ1, ŝ2, …, ŝn such that
•  a1 is r-applicable in ŝ0 and γ+(ŝ0,a1) = ŝ1
•  a2 is r-applicable in ŝ1 and γ+(ŝ1,a2) = ŝ2
•  …
Ø  In this case, γ+(ŝ,a) = ŝn
●  π is a relaxed solution for P = (Σ, s0, g) if γ+(s0, π) r-satisfies g
Name is from classical planning
“don’t do the deletion”	
  
move(r,l,m)	
  
Precond: loc(r,l),	
  	
  
Effects: ¬loc(r,l),	
  loc(r,m)	
  
53	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
d3
r1	
   c1	
  ŝ1:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
r1	
  
c1	
  
Op.mal	
  Relaxed	
  Solu.on	
  Heuris.c	
  
●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g
●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g)
●  Example:
Ø  ŝ1 = γ+(s0, move(r1,d3,d1))
= {loc(r1)	
  =	
  d1,	
  loaded(r1)	
  =	
  F,	
  loc(c1)	
  =	
  d1,	
  loc(r1)	
  =	
  d3}
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
54	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
d3
r1	
   c1	
  ŝ2:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
r1	
  
c1	
  
c1	
   r1	
  
r1	
  
c1	
  
Op.mal	
  Relaxed	
  Solu.on	
  Heuris.c	
  
●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g
●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g)
●  Example:
Ø  ŝ1 = γ+(s0, move(r1,d3,d1))
= {loc(r1)	
  =	
  d1,	
  loaded(r1)	
  =	
  F,	
  loc(c1)	
  =	
  d1,	
  loc(r1)	
  =	
  d3}
Ø  ŝ2 = γ+(s1, load(r1,c1,d1))
= {loc(r1)	
  =	
  d1,	
  loaded(r1)	
  =	
  F,	
  loc(c1)	
  =	
  d1,	
  loc(r1)	
  =	
  d3,	
  	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  loaded(r1)	
  =	
  T,	
  loc(c1)	
  =	
  r1}.
Ø  ŝ2 r-satisfies g, so ⟨move(r1,d3,d1), load(r1,c1,d1)⟩ is a relaxed solution
Ø  It’s optimal, so h+(s0) = 2
g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
55	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Op.mal	
  Relaxed	
  Solu.on	
  Heuris.c	
  
●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g
●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g)
●  Every solution is also a relaxed solution
Ø  Thus h+ is admissible
Ø  Problem: computing it is NP-hard
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
56	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG	
  Heuris.c	
  
●  Relaxed planning graph (RPG) heuristic
Ø  An approximation of h+ that’s easier to compute
●  Based on the fact that
γ+ doesn’t depend on
the order in which
actions are applied
●  If a1 and a2 are both
applicable in s0 then
γ+(s0, ⟨a1, a2⟩)
= γ+(s0, ⟨a2, a1⟩)
= s0 ∪ eff(a1) ∪ eff(a2)
57	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG	
  Heuris.c	
  
●  Computation of RPG(s1,g)
Ø  First line of RPG assigns ŝ0 = s1 , A0 = ∅
ŝ0:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
ŝ1:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  T	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  r1	
  	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
A1:
	
  	
  	
  	
  	
  move(r1,d1,d3)	
  
	
  	
  	
  	
  	
  move(r1,d1,d2)	
  
	
  	
  	
  	
  	
  load(r1,c1,d1)
From ŝ0
d3
r1	
   c1	
  s1:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
58	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG	
  Heuris.c	
  
●  Computation of RPG(s1,g)
Ø  First line of RPG assigns ŝ0 = s1 , A0 = ∅
●  RPG(s1,g) returns 2
ŝ0:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
ŝ1:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  T	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  r1	
  	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
A1:
	
  	
  	
  	
  	
  move(r1,d1,d3)	
  
	
  	
  	
  	
  	
  move(r1,d1,d2)	
  
	
  	
  	
  	
  	
  load(r1,c1,d1)
From ŝ0
d3
r1	
   c1	
  s1:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
59	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG	
  Heuris.c	
  
ŝ0:
	
  loaded(r1)	
  =	
  F	
  
	
  loc(c1)	
  =	
  d1	
  
	
  loc(r1)	
  =	
  d2	
  
ŝ1:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2
A1:
	
  	
  	
  	
  	
  move(r1,d2,d3)	
  
	
  	
  	
  	
  	
  move(r1,d2,d1)	
  
From ŝ0
A2:
	
  	
  	
  	
  	
  move(r1,d3,d1)	
  
	
  	
  	
  	
  	
  move(r1,d3,d2)	
  
	
  	
  	
  	
  	
  move(r1,d1,d2)	
  
	
  	
  	
  	
  	
  move(r1,d1,d3)	
  
	
  	
  	
  	
  	
  	
  	
  	
  load(r1,c1,d1)	
  
	
  	
  	
  	
  	
  move(r1,d2,d3)	
  
	
  	
  	
  	
  	
  move(r1,d2,d1)	
  
	
  	
  	
  	
  	
  	
  ŝ2:
	
  	
  loc(c1)	
  =	
  r1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  T	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
From ŝ0
d3
r1	
   c1	
  s2:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
●  Computation of RPG(s2,g)
Ø  First line of RPG assigns ŝ0 = s2 , A0 = ∅
60	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG	
  Heuris.c	
  
ŝ0:
	
  loaded(r1)	
  =	
  F	
  
	
  loc(c1)	
  =	
  d1	
  
	
  loc(r1)	
  =	
  d2	
  
ŝ1:
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2
A1:
	
  	
  	
  	
  	
  move(r1,d2,d3)	
  
	
  	
  	
  	
  	
  move(r1,d2,d1)	
  
From ŝ0
A2:
	
  	
  	
  	
  	
  move(r1,d3,d1)	
  
	
  	
  	
  	
  	
  move(r1,d3,d2)	
  
	
  	
  	
  	
  	
  move(r1,d1,d2)	
  
	
  	
  	
  	
  	
  move(r1,d1,d3)	
  
	
  	
  	
  	
  	
  	
  	
  	
  load(r1,c1,d1)	
  
	
  	
  	
  	
  	
  move(r1,d2,d3)	
  
	
  	
  	
  	
  	
  move(r1,d2,d1)	
  
	
  	
  	
  	
  	
  	
  ŝ2:
	
  	
  loc(c1)	
  =	
  r1	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  T	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  F
From ŝ0
d3
r1	
   c1	
  s2:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
●  Computation of RPG(s2,g)
Ø  First line of RPG assigns ŝ0 = s2 , A0 = ∅
●  RPG(s2,g) returns 3
61	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Landmark	
  Heuris.cs	
  
●  P = (Σ,s0,g) be a planning problem
●  Let φ = φ1 ∨ ... ∨ φm be a disjunction of state-variable assignments
●  Definition: φ is a landmark for P if φ is true at some point in every solution plan
of P
●  Example Landmarks
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
r1	
  
A complete state s0
A single state-variable
Assignment (loc(r1)=d1)
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
r1	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  
r1	
  
Can be a disjunction of
state-variable assignments
loc(r1)=d1 ∨	
 loc(r1)=d2
62	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Why	
  are	
  Landmarks	
  Useful?	
  
●  Help in breaking down the given problem into smaller subproblems
gs0
63	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Why	
  are	
  Landmarks	
  Useful?	
  
●  Help in breaking down the given problem into smaller subproblems
●  Every solution to P has to achieve these landmarks
●  Possible strategy:
Ø  find a plan that takes us from s0 to any state s1 that satisfies lm1
Ø  find a plan that takes us from s1 to any state s2 that satisfies lm2
Ø  …
lm1
gs0
lm2
lm3
P1
P2 P3
P4
64	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Compu.ng	
  Landmarks	
  
●  Question: How do we compute landmarks for a problem P?
●  Not easy:
Ø  Deciding whether a state-variable assignment φ is a landmark is in the worst-
case PSPACE-hard
Ø  To put it in perspective: as hard as solving the planning problem itself!
●  However, all is not lost:
Ø  There are often useful landmarks that can be found more easily
Ø  There are polynomial-time procedures that can compute these landmarks
Ø  Going to see one such procedure based on Relaxed Planning Graphs
●  Why Relaxed Planning Graphs?
Ø  Solving relaxed planning problems easier
•  Computing landmarks for relaxed planning problems easier
Ø  A landmark for a relaxed planning problem is a landmark for the original
planning problem as well
65	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
●  Main intuition: if a state-variable assignment φ is a landmark, then preconditions
of actions that achieve φ is also a landmark
Ø  In other words: we’re going to start from known landmarks and discover new
ones by looking at preconditions of actions that achieve these landmarks
●  Example:
Ø  g is the goal
Ø  Actions a1 and a2 can achieve g
Ø  Therefore, either p1∧q or p2∧q
must be true to be able to achieve g
Ø  In other words: (p1∧q)∨(p2∧q) is
a landmark
Ø  By rearranging terms: we get
(p1∧q)∨(p2∧q) ≅ q∧(p1∨p2)
Ø  Since q∧(p1∨p2) is a landmark, both q and (p1∨p2) are landmarks
●  In practice, we try to rearrange assignments in a similar manner to group terms
with the same state variable together
g
a2
a1
p1
q
p2
66	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
●  Question: What landmarks can we start with?
Ø  Every goal is trivially a landmark; can start from there
●  E.g. loc(r1)	
  =	
  d3	
  is a landmark
Ø  Two actions achieve this landmark: move (r1, d3, d1)
and move (r1, d2, d1)
Ø  Can infer a new landmark that is the disjunction of the
preconditions of these two actions:
φ‘ = loc(r1)	
  =	
  d3∨loc(r1)	
  =	
  d2
action move(r, d, e)
pre: loc(r) = d
eff: loc(r) ← e
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
67	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3
68	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3
69	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3
70	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3
71	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3
72	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-
produced starting from s0 without using
Ai, thus generating the RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-
applicable in the r-state resulting from
foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0
and add to φ
Ø  The resulting disjunction φ is a
landmark; add it to queue
gi
a1
p1
q1
a2
p2
q2
a3
p3
q3 p3∨q1: a new landmark
73	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the
r-state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
74	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1
True in
current state, always going
to be true in RPG;
No use expanding
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the
r-state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
75	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1
3 actions achieve this: load	
  
(r1,c1,d1),	
  load	
  (r1,c1,d2),	
  	
  
load	
  (r1,c1,d3)	
  	
  	
  
action load(r, c, l)
pre: loaded(r) = nil, loc(c) =
l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the r-
state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
76	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the r-
state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1
3 actions achieve this: load	
  
(r1,c1,d1),	
  load	
  (r1,c1,d2),	
  	
  
load	
  (r1,c1,d3)	
  	
  	
  
action load(r, c, l)
pre: loaded(r) = nil, loc(c) =
l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r	
  
77	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
ŝ0 = {
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  nil	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  }
A1 = {
	
  	
  	
  	
  	
  move(r1,d3,d1)	
  
	
  	
  	
  	
  	
  move(r1,d3,d2)	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  }
ŝ1 = {
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d2	
  
	
  	
  	
  	
  	
  loc(c1)	
  =	
  d1	
  
	
  	
  	
  	
  	
  loc(r1)	
  =	
  d3	
  
	
  	
  	
  	
  	
  loaded(r1)	
  =	
  nil	
  
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  }
From ŝ0
	
  load(r1,c1,d1)	
  
	
  load(r1,c1,d2)	
  
	
  load(r1,c1,d3)	
  
Relaxed Planning Graph foo	
  
Only first load action
applicable in r-state
at the end of foo	
  
78	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1
3 actions achieve this: load	
  
(r1,c1,d1),	
  load	
  (r1,c1,d2),	
  	
  
load	
  (r1,c1,d3)	
  	
  	
  
action load(r, c, l)
pre: loaded(r) = nil, loc(c) =
l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r	
  
Only load	
  (r1,c1,d1)	
  is applicable in
final level of RPG; these preconds
used to generate new landmarks
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the r-
state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
79	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
RPG-­‐based	
  Landmark	
  Computa.on	
  
d3
r1	
   c1	
  s0:
g:
	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d2	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  	
  d1	
  
d3
r1	
  
c1	
   g = {loc(r1)	
  =	
  d3,	
  loc(c1)	
  =	
  r1}
loc(r1)	
  =	
  d3 loc(c1)	
  =	
  r1
loaded(r1) = nil
loc(c1) = d1	
 
loc(r1) = d1
These new landmarks
are added
to the queue
ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk)
●  queue = {g1, g2,…, gk}
●  While queue is not empty:
Ø  Remove a gi from queue
Ø  Ai = {all actions that can achieve gi}
Ø  Compute all assignments that can be r-produced
starting from s0 without using Ai, thus generating the
RPG foo	
  
Ø  act(gi) = {all actions in Ai that are r-applicable in the r-
state resulting from foo}
Ø  For each action in act(gi):
•  Pick a precondition not satisfied in s0 and add to φ
Ø  The resulting disjunction φ is a landmark; add it to
queue
80	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Landmark	
  Heuris.c	
  
●  Every solution to the problem needs to achieve all the computed landmarks
●  One possible heuristic to estimate distance of state s from g:
Ø  Number of landmarks required to be accomplished from s
Ø  Planner biased towards actions that achieve landmarks
●  Is this heuristic admissible?
81	
  Dana	
  Nau	
  and	
  Vikas	
  Shivashankar:	
  Lecture	
  slides	
  for	
  Automated	
  Planning	
  and	
  Ac0ng	
   Updated	
  3/24/15	
  
Landmark	
  Heuris.c	
  
●  Every solution to the problem needs to achieve all the computed landmarks
●  One possible heuristic to estimate distance of state s from g:
Ø  Number of landmarks required to be accomplished from s
Ø  Planner biased towards actions that achieve landmarks
●  Is this heuristic admissible?
●  A number of more advanced landmark-based heuristics developed (including
admissible ones)
Ø  Check textbook for references
g1
a
s0
g2
Number of landmarks: | {g1, g2} | = 2
Optimal Plan Length = |<a>| = 1
g = g1∧g2

More Related Content

What's hot

Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement LearningPlaying Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning郁凱 黃
 
Solving connectivity problems via basic Linear Algebra
Solving connectivity problems via basic Linear AlgebraSolving connectivity problems via basic Linear Algebra
Solving connectivity problems via basic Linear Algebracseiitgn
 
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...Md. Al-Amin Khandaker Nipu
 
ICPC Asia::Tokyo 2014 Problem J – Exhibition
ICPC Asia::Tokyo 2014 Problem J – ExhibitionICPC Asia::Tokyo 2014 Problem J – Exhibition
ICPC Asia::Tokyo 2014 Problem J – Exhibitionirrrrr
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentaryirrrrr
 
Test s velocity_15_5_4
Test s velocity_15_5_4Test s velocity_15_5_4
Test s velocity_15_5_4Kunihiko Saito
 
Programming workshop
Programming workshopProgramming workshop
Programming workshopSandeep Joshi
 
Some contributions to PCA for time series
Some contributions to PCA for time seriesSome contributions to PCA for time series
Some contributions to PCA for time seriesIsabel Magalhães
 
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...Yandex
 
Tensorizing Neural Network
Tensorizing Neural NetworkTensorizing Neural Network
Tensorizing Neural NetworkRuochun Tzeng
 
Theory of Automata and formal languages unit 2
Theory of Automata and formal languages unit 2Theory of Automata and formal languages unit 2
Theory of Automata and formal languages unit 2Abhimanyu Mishra
 
Introduction to Polyhedral Compilation
Introduction to Polyhedral CompilationIntroduction to Polyhedral Compilation
Introduction to Polyhedral CompilationAkihiro Hayashi
 
Daa divide-n-conquer
Daa divide-n-conquerDaa divide-n-conquer
Daa divide-n-conquersankara_rao
 
Sep logic slide
Sep logic slideSep logic slide
Sep logic sliderainoftime
 
2-Approximation Vertex Cover
2-Approximation Vertex Cover2-Approximation Vertex Cover
2-Approximation Vertex CoverKowshik Roy
 

What's hot (20)

Playing Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement LearningPlaying Atari with Deep Reinforcement Learning
Playing Atari with Deep Reinforcement Learning
 
Solving connectivity problems via basic Linear Algebra
Solving connectivity problems via basic Linear AlgebraSolving connectivity problems via basic Linear Algebra
Solving connectivity problems via basic Linear Algebra
 
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...
Efficient Scalar Multiplication for Ate Based Pairing over KSS Curve of Embed...
 
ICPC Asia::Tokyo 2014 Problem J – Exhibition
ICPC Asia::Tokyo 2014 Problem J – ExhibitionICPC Asia::Tokyo 2014 Problem J – Exhibition
ICPC Asia::Tokyo 2014 Problem J – Exhibition
 
ICPC 2015, Tsukuba : Unofficial Commentary
ICPC 2015, Tsukuba: Unofficial CommentaryICPC 2015, Tsukuba: Unofficial Commentary
ICPC 2015, Tsukuba : Unofficial Commentary
 
Test s velocity_15_5_4
Test s velocity_15_5_4Test s velocity_15_5_4
Test s velocity_15_5_4
 
qlp
qlpqlp
qlp
 
Programming workshop
Programming workshopProgramming workshop
Programming workshop
 
Some contributions to PCA for time series
Some contributions to PCA for time seriesSome contributions to PCA for time series
Some contributions to PCA for time series
 
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...
Ilya Shkredov – Subsets of Z/pZ with small Wiener norm and arithmetic progres...
 
Test (S) on R
Test (S) on RTest (S) on R
Test (S) on R
 
Tensorizing Neural Network
Tensorizing Neural NetworkTensorizing Neural Network
Tensorizing Neural Network
 
Theory of Automata and formal languages unit 2
Theory of Automata and formal languages unit 2Theory of Automata and formal languages unit 2
Theory of Automata and formal languages unit 2
 
Block Diagram Algebra
Block Diagram AlgebraBlock Diagram Algebra
Block Diagram Algebra
 
Introduction to Polyhedral Compilation
Introduction to Polyhedral CompilationIntroduction to Polyhedral Compilation
Introduction to Polyhedral Compilation
 
Daa divide-n-conquer
Daa divide-n-conquerDaa divide-n-conquer
Daa divide-n-conquer
 
Sep logic slide
Sep logic slideSep logic slide
Sep logic slide
 
5 csp
5 csp5 csp
5 csp
 
2-Approximation Vertex Cover
2-Approximation Vertex Cover2-Approximation Vertex Cover
2-Approximation Vertex Cover
 
Pda
PdaPda
Pda
 

Similar to Automated Planning Lecture Slides

Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit
 
Introduction to Machine Learning with Spark
Introduction to Machine Learning with SparkIntroduction to Machine Learning with Spark
Introduction to Machine Learning with Sparkdatamantra
 
From Declarative to Imperative Operation Specifications (ER 2007)
From Declarative to Imperative Operation Specifications (ER 2007)From Declarative to Imperative Operation Specifications (ER 2007)
From Declarative to Imperative Operation Specifications (ER 2007)Jordi Cabot
 
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...Jitendra Bafna
 
Artificial Intelligence 1 Planning In The Real World
Artificial Intelligence 1 Planning In The Real WorldArtificial Intelligence 1 Planning In The Real World
Artificial Intelligence 1 Planning In The Real Worldahmad bassiouny
 
CS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and CullingCS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and CullingMark Kilgard
 
Goal stack planning.ppt
Goal stack planning.pptGoal stack planning.ppt
Goal stack planning.pptSadagopanS
 
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2Build Your Own VR Display Course - SIGGRAPH 2017: Part 2
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2StanfordComputationalImaging
 
Constructors and Destructors
Constructors and DestructorsConstructors and Destructors
Constructors and DestructorsKeyur Vadodariya
 
Class[2][29th may] [javascript]
Class[2][29th may] [javascript]Class[2][29th may] [javascript]
Class[2][29th may] [javascript]Saajid Akram
 
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfCD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfRajJain516913
 

Similar to Automated Planning Lecture Slides (20)

Spark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van HovellSpark Summit EU talk by Herman van Hovell
Spark Summit EU talk by Herman van Hovell
 
AI_Planning.pdf
AI_Planning.pdfAI_Planning.pdf
AI_Planning.pdf
 
Introduction to Machine Learning with Spark
Introduction to Machine Learning with SparkIntroduction to Machine Learning with Spark
Introduction to Machine Learning with Spark
 
From Declarative to Imperative Operation Specifications (ER 2007)
From Declarative to Imperative Operation Specifications (ER 2007)From Declarative to Imperative Operation Specifications (ER 2007)
From Declarative to Imperative Operation Specifications (ER 2007)
 
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...
Engineering Student MuleSoft Meetup#6 - Basic Understanding of DataWeave With...
 
Artificial Intelligence 1 Planning In The Real World
Artificial Intelligence 1 Planning In The Real WorldArtificial Intelligence 1 Planning In The Real World
Artificial Intelligence 1 Planning In The Real World
 
Basic concept of c++
Basic concept of c++Basic concept of c++
Basic concept of c++
 
CS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and CullingCS 354 Transformation, Clipping, and Culling
CS 354 Transformation, Clipping, and Culling
 
Goal stack planning.ppt
Goal stack planning.pptGoal stack planning.ppt
Goal stack planning.ppt
 
Planning
Planning Planning
Planning
 
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2Build Your Own VR Display Course - SIGGRAPH 2017: Part 2
Build Your Own VR Display Course - SIGGRAPH 2017: Part 2
 
MapReduce Algorithm Design
MapReduce Algorithm DesignMapReduce Algorithm Design
MapReduce Algorithm Design
 
00-review.ppt
00-review.ppt00-review.ppt
00-review.ppt
 
Constructors and Destructors
Constructors and DestructorsConstructors and Destructors
Constructors and Destructors
 
Class[2][29th may] [javascript]
Class[2][29th may] [javascript]Class[2][29th may] [javascript]
Class[2][29th may] [javascript]
 
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdfCD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
CD504 CGM_Lab Manual_004e08d3838702ed11fc6d03cc82f7be.pdf
 
Planing presentation
Planing presentationPlaning presentation
Planing presentation
 
Emm3104 chapter 1 part2
Emm3104 chapter 1 part2Emm3104 chapter 1 part2
Emm3104 chapter 1 part2
 
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
CLIM Fall 2017 Course: Statistics for Climate Research, Spatial Data: Models ...
 
Software Developer Training
Software Developer TrainingSoftware Developer Training
Software Developer Training
 

More from Tianlu Wang

14 pro resolution
14 pro resolution14 pro resolution
14 pro resolutionTianlu Wang
 
13 propositional calculus
13 propositional calculus13 propositional calculus
13 propositional calculusTianlu Wang
 
12 adversal search
12 adversal search12 adversal search
12 adversal searchTianlu Wang
 
11 alternative search
11 alternative search11 alternative search
11 alternative searchTianlu Wang
 
21 situation calculus
21 situation calculus21 situation calculus
21 situation calculusTianlu Wang
 
20 bayes learning
20 bayes learning20 bayes learning
20 bayes learningTianlu Wang
 
19 uncertain evidence
19 uncertain evidence19 uncertain evidence
19 uncertain evidenceTianlu Wang
 
18 common knowledge
18 common knowledge18 common knowledge
18 common knowledgeTianlu Wang
 
17 2 expert systems
17 2 expert systems17 2 expert systems
17 2 expert systemsTianlu Wang
 
17 1 knowledge-based system
17 1 knowledge-based system17 1 knowledge-based system
17 1 knowledge-based systemTianlu Wang
 
16 2 predicate resolution
16 2 predicate resolution16 2 predicate resolution
16 2 predicate resolutionTianlu Wang
 
16 1 predicate resolution
16 1 predicate resolution16 1 predicate resolution
16 1 predicate resolutionTianlu Wang
 
09 heuristic search
09 heuristic search09 heuristic search
09 heuristic searchTianlu Wang
 
08 uninformed search
08 uninformed search08 uninformed search
08 uninformed searchTianlu Wang
 

More from Tianlu Wang (20)

L7 er2
L7 er2L7 er2
L7 er2
 
L8 design1
L8 design1L8 design1
L8 design1
 
L9 design2
L9 design2L9 design2
L9 design2
 
14 pro resolution
14 pro resolution14 pro resolution
14 pro resolution
 
13 propositional calculus
13 propositional calculus13 propositional calculus
13 propositional calculus
 
12 adversal search
12 adversal search12 adversal search
12 adversal search
 
11 alternative search
11 alternative search11 alternative search
11 alternative search
 
10 2 sum
10 2 sum10 2 sum
10 2 sum
 
22 planning
22 planning22 planning
22 planning
 
21 situation calculus
21 situation calculus21 situation calculus
21 situation calculus
 
20 bayes learning
20 bayes learning20 bayes learning
20 bayes learning
 
19 uncertain evidence
19 uncertain evidence19 uncertain evidence
19 uncertain evidence
 
18 common knowledge
18 common knowledge18 common knowledge
18 common knowledge
 
17 2 expert systems
17 2 expert systems17 2 expert systems
17 2 expert systems
 
17 1 knowledge-based system
17 1 knowledge-based system17 1 knowledge-based system
17 1 knowledge-based system
 
16 2 predicate resolution
16 2 predicate resolution16 2 predicate resolution
16 2 predicate resolution
 
16 1 predicate resolution
16 1 predicate resolution16 1 predicate resolution
16 1 predicate resolution
 
15 predicate
15 predicate15 predicate
15 predicate
 
09 heuristic search
09 heuristic search09 heuristic search
09 heuristic search
 
08 uninformed search
08 uninformed search08 uninformed search
08 uninformed search
 

Automated Planning Lecture Slides

  • 1. 1  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   This  work  is  licensed  under  a  CreaCve  Commons  AFribuCon-­‐NonCommercial-­‐ShareAlike  4.0  InternaConal  License.   Chapter  2     Delibera.on  with  Determinis.c  Models   Dana S. Nau and Vikas Shivashankar University of Maryland
  • 2. 2  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Purpose  of  this  Chapter   ●  Last time, Vikas mentioned conventional AI planning Ø  Given •  a domain model (descriptions of the states and actions) •  initial state s0, and goal g Ø  Find a plan or a policy that •  is executable starting in s0 •  produces a state that satisfies g ●  This chapter discusses some techniques for doing that Ø  Also, how to use those techniques in acting systems ●  Outline: Ø  2a: Representing planning domains Ø  2b: Planning algorithms Ø  2c: Acting and planning
  • 3. 3  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   This  work  is  licensed  under  a  CreaCve  Commons  AFribuCon-­‐NonCommercial-­‐ShareAlike  4.0  InternaConal  License.   Chapter  2     Delibera.on  with  Determinis.c  Models     2a:  Represen.ng  Planning  Domains   Dana S. Nau and Vikas Shivashankar University of Maryland
  • 4. 4  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Domain  Model   ●  Planning domain: an abstract model of the environment Ø  Many different kinds of environments, various ways to model them ●  In this chapter, the model is a deterministic state-transition system Ø  Σ = (S,A,γ) •  S is a finite set of states ▸  States of the world •  A is a finite set of actions ▸  Things an actor can do •  γ: S × A → S is a prediction function (or state-transition function) ▸  Given a state s and action a, γ(s,a) is another state •  Prediction of what state will be produced by executing a in s •  γ is partial: γ(s,a) is undefined if a is inapplicable in s ▸  Dom(a) = {s ∈ S | γ(s,a) is defined} = {states where a is applicable} ▸  Range(a) = {γ(s,a) | s ∈ Dom(a)}
  • 5. 5  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Implicit  Assump.ons   ●  The state-transition model incorporates the following assumptions Ø  Static world •  Changes occur only in response to the actor’s actions Ø  Perfect information •  Actor always has all the information it needs Ø  Instantaneous actions •  Each action causes an instantaneous transition from one state to the next Ø  Determinism •  Actions are deterministic Ø  Correct prediction function •  Outcome of action a in state s is always γ(s,a) Ø  Flat search space •  Only one level of abstraction; ignore how to refine actions at a lower level
  • 6. 6  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   How  to  Represent  Σ?   ●  If the domain is small enough Ø  Give each state and action a unique name Ø  For each s and a, store γ(s,a) in a lookup table loc1 loc3 loc2 loc6 loc5 loc4 loc7 loc8 loc0 x y 4 3 2 1 1 2 3 4 5 60 loc9 ●  If a domain is larger, don’t represent all states explicitly Ø  Have a formalism for describing states by describing their properties Ø  Represent each action by describing how it changes those properties Ø  Start with initial state, use actions to produce other states
  • 7. 7  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Determinis.c  Operator  (General  Form)   ●  Domain-specific format for representing states Ø  Invent your own format ●  General form of a deterministic operator: Ø  o = (head(o), pre(o), eff(o), cost(o)) •  head(o): name and parameter list •  pre(o): preconditions ▸  Computational tests to predict whether an action can be performed in a state s ▸  In principle, should be necessary/sufficient for the action to run without error •  eff(o): effects ▸  Procedures that assign new values to some of the state variables •  cost(o): procedure that returns a number ▸  Can be omitted, in which case cost(o) = 1 ▸  Could represent monetary cost, time required, something else
  • 8. 8  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Example   ●  Suppose we want to plan how to create a metal hole in the workpiece ●  a state s includes Ø  geometric model of the workpiece, variables describing its location, orientation, and other status information, Ø  capabilities and status of drilling machine and drill ●  Several actions (getting the workpiece onto the machine, clamping it, loading a drill bit, etc.) Ø  Next slide: the drilling operation itself
  • 9. 9  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Drilling  Ac.on   ●  head(o) = drill-­‐hole(m, w, b, x1, …, xn) Ø  m, w, b are the names or ID numbers of the drilling machine, the workpiece, and the drill bit Ø  x1, …, xn are specifications of the hole’s location, orientation, depth, and required machining tolerances ●  pre(o): computational tests Ø  Can the drilling machine and drill bit produce a hole having the desired geometry and machining tolerances? Ø  Is the drill loaded into the drilling machine? Is the workpiece is properly clamped onto the drilling platform? Etc. ●  eff(o): geometric computation Ø  Modify the geometric model of the workpiece to include a hole having the desired specifications. ●  cost(o) Ø  could be an estimate of the time required for the drilling operation Ø  could be a cost estimate based on time + other criteria
  • 10. 10  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   State-­‐Variable  Representa.on   ●  Represent each state as a collection of properties of various objects Ø  Represent each action by describing how it changes those properties ●  Let B = {all objects that matter for planning} Ø  B may be classified into subsets: various kinds of objects ●  Example Ø  B = Robots ∪ Containers ∪ Locs ∪ Booleans •  Robots = {r1,  r2} •  Containers = {c1,  c2} •  Locs = {d1,  d2,  d3} •  Booleans = {T,  F}                            d2                            d1   d3 r1   c1   r2   c2  
  • 11. 11  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Proper.es  of  Objects   ●  Define ways to represent properties of objects Ø  Two kinds of properties: rigid and varying ●  A property is rigid if it stays the same in every state Ø  Represent as a mathematical relation ●  Example: Ø  adjacent = {(d1,d2), (d2,d1)} Ø  Can also write as •  adjacent(d1,d2)   •  adjacent(d2,d1)                              d2                            d1   d3 r1   c1   r2   c2  
  • 12. 12  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Varying  Proper.es   ●  A property is varying if it may differ in different states Ø  Represent using a state variable that we can assign a value to Ø  Each state variable x has a range (set of possible values), Range(x) Ø  For each state s, s(x) ∈ Range(x) is x’s value in state s ●  Example: Ø  what we want to represent Ø  Each robot can hold at most one container   Ø  Each robot is at a one of the locations Ø  Each container is on a robot or at one of the locations                          d2                            d1   d3 r1   c1   r2   c2  
  • 13. 13  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   A  Simple  Example   ●  Set of all state variables Ø  X = {loc(r1), loaded(r1), loc(c1), loc(r2), loaded(r2), loc(c2)} ●  Range(loc(r1)) = Locs ●  Range(loc(r2)) = Locs ●  Range(loaded(r1)) = Booleans ●  Range(loaded(r2)) = Booleans ●  Range(loc(c1)) = Robots ∪ Locs ●  Range(loc(c2)) = Robots ∪ Locs Why have both loc and loaded?                          d2                            d1   d3 r1   c1   r2   c2  
  • 14. 14  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   States  as  Func.ons   ●  Write each state as a function s that maps each x ∈ X to a value in Range(x) s1(loc(r1))  =  d1,          s1(loaded(r1))  =  F,          s1(loc(c1))  =  d1,                s1(loc(r2))  =  d2,          s1(loaded(r2))  =  F,          s1(loc(c2))  =  d2 Ø  Let S = the set of all such functions ●  Functions are sets of ordered pairs s1 = {(loc(r1),  d1),    (loaded(r1),  F),    (loc(c1),  d1), …} ●  Rewrite as s1 = {loc(r1)  =  d1,      loaded(r1)  =  F,      loc(c1)  =  d1,        loc(r2)  =  d2,      loaded(r2)  =  F,      loc(c2)  =  d2}                          d2                            d1   d3 r1   c1   r2   c2  
  • 15. 15  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ●  Just because a function maps each state variable x into a value in Range(x), this doesn’t make it a state Ø  Some sets of state-variable values may not make any sense as states s = {loaded(r1) = F, loc(c1) = r1, …} s = {loc(c1) = d1, loc(c2) = r1, …} ●  Need restrictions on what sets of state-variable values constitute real states Ø  In most of the book, we won’t represent such restrictions explicitly Ø  Just write the action models in such a way that none of them will ever produce such a set of assignments Discussion                            d2                            d1   d3 r1   c1   r2   c2  
  • 16. 16  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Descrip.ve  Ac.on  Models   ●  Actions often fall into closely related classes ●  For each class, write a parameterized schema called a planning operator to describe all the actions in that class action move(r, l, m) Pre: loc(r) = l, adjacent(l, m) Eff: loc(r) ← m action take(r, l, c) Pre: loaded(r) = F,  loc(r) = l, loc(c) = l,    Eff: loaded(r) ← T, loc(c) ← r action put(r, l, c) Pre: loc(r) = l, loc(c) = r Eff: loaded(r) ← F, loc(c) ← l Each parameter has a range of possible values, e.g., Range(r) = Robots                          d2                            d1   d3 r1   c1   r2   c2  
  • 17. 17  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   CSV  Operator   ●  Classical State Variable (CSV) Operator: •  the kind of operator shown on the previous page ●  General form Ø  o = (head(o), pre(o), eff(o), cost(o)) •  each precondition in pre(o) must have one of these forms: ▸  relname(t1,…,tk) varname(t1,…,tk) = t0 ¬relname(t1,…,tk) varname(t1,…,tk) ≠ t0 •  relname = name of a rigid relation •  varname = name of a state variable •  each ti must be a constant (i.e., a member of B) or a parameter •  each effect in eff(o) must have the form varname(t1,…,tk) ← t0 •  if cost(o) isn’t omitted, it must be a nonnegative number (not a formula) ●  Limited representational capability, but easy to compute, easy to reason about Ø  Many algorithms have been written to use this kind of operator
  • 18. 18  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   CSV  Ac.ons   ●  A CSV action is an instance of a planning operator Ø  assign values to parameters action move(r1,d1,d2)          Pre: loc(r1)  =  d1, adjacent(d1,d2) Eff: loc(r1) ← d2   action take(r2,d2,c2)          Pre: loaded(r2)  =  F,  loc(r2)  =  d2,                              loc(c2)  =  d2,            Eff: loaded(r2)  ← T,  loc(c1)  ← r2     action put(r1,d1,c1)          Pre: loc(r1)  =  d1,  loc(c1)  =  r1          Eff: loaded(r1)  ← F,  loc(c1)  ← d1                          d2                            d1   d3 r1   c1   r2   c2  
  • 19. 19  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Compu.ng  γ   ●  s1 = {loaded(r1)  =  F,  loaded(r2)  =  F,    loc(r1)  =  d1,  loc(r2)  =  d2,                        loc(c1)  =  d1,    loc(c2)  =  d2} ●  action take(r2,d2,c2)          Pre: loaded(r2)  =  F,  loc(r2)  =  d2,                              loc(c2)  =  d2,            Eff: loaded(r2)  ← T,  loc(c1)  ← r2       ●  γ(s1, take(r2,d2,c2)) = {loaded(r1)  =  F,  loaded(r2)  =  T,      loc(r1)  =  d1,  loc(r2)  =  d2,    loc(c1)  =  d1,  loc(c2)  =  r2}                            d2                            d1   d3 r1   c1   r2   c2                            d2                            d1   d3 r1   c1   r2   c2  
  • 20. 20  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Plans   ●  Plan: a sequence of actions π = 〈a1,a2,…,an〉 Ø  π is applicable in s0 if the actions can be applied in the order given •  γ (s0,a1) = s1, γ (s1,a2) = s2, …, γ (sn–1,an) = sn ●  If π is applicable, define γ (s0,π) = sn ●  Example π = 〈take(r2,d2,c2),    move(r2,d2,d1),    put(r2,d1,c2)〉                          d2                            d1   d3 c1     r2     c2                              d2                            d1   d3 c1     c2     s0: γ (s0,π): r2     r1     r1    
  • 21. 21  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Planning  Problems,  Solu.ons   ●  Planning problem: P=(Σ,s0,g) Ø  Σ = planning domain Ø  s0 = initial state Ø  g = {g1,…,gk} is a set of constraints called the goal ●  π is a solution for P if γ (s0,π) satisfies g Ø  π is a shortest solution if no shorter plan is also a solution Ø  π is a minimal solution if no proper subsequence of π is also a solution ●  CSV planning problem: Ø  Σ is a CSV planning domain, Ø  g has the same form as a CSV operator’s preconditions ●  Example: Ø  s0 and π as on previous page Ø  g = {loc(r2)=d1,  loc(c2)=d1}
  • 22. 22  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Classical  and  State-­‐Variable  Representa.ons   ●  Motivation Ø  The field of AI planning started out as automated theorem proving Ø  It still uses a lot of that notation ●  Classical representation Ø  Equivalent to CSV representation Ø  Represents both rigid and varying properties using logical predicates •  adjacent(l,m) - location l is adjacent to location m •  loc(r,l) - robot r is at location l •  loc(c,l), loc(c,r) - container c is at location l or on robot r •  loaded(r) - there is a container on robot r                          d2                            d1   d3 r1   c1   r2   c2  
  • 23. 23  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   States   ●  Use ground atoms to represent both rigid and varying properties of Σ ●  To represent a state Ø  s = {all ground atoms that are true in ŝ} Ø  e.g., s1 = {loc(c1,d1),  loc(c2,d2),    loc(r1,d1),  loc(r2,d2),        adjacent(d1,d2),  adjacent(d2,d1),      adjacent(d1,d3),  adjacent(d3,d1)}                          d2                            d1   d3 r1   c1   r2   c2  
  • 24. 24  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Classical  Planning  Operators   ●  Classical planning operator: Ø  o = (head(o), pre(o), eff(o)) ●  Each precondition and effect must have one of these forms: pred(t1,…,tk) ¬pred(t1,…,tk) Ø  pred = predicate name Ø  each ti must be a constant (i.e., member of B) or a parameter ●  Classical action: a ground instance of a classical operator move(r,l,m)   Precond: loc(r,l),     Effects: ¬loc(r,l),  loc(r,m)     take(r,l,c)   Precond: loc(r,l),  loc(c,l),  ¬loaded(r)   Effects: loc(c,r),  ¬loc(c,l),  loaded(r)   put(r,l,c)   Precond: loc(r,l),  loc(c,r)   Effects: loc(c,l),  ¬loc(c,r),  ¬loaded(r)                          d2                            d1   d3 r1   c1   r2   c2  
  • 25. 25  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Discussion   ●  Classical representation is equivalent to state-variable representation in expressive power Ø  Each can be converted to the other in linear time and space ●  Classical representation is more natural for logicians ●  CSV is more natural for engineers and most computer scientists Ø  When changing a value, you don’t have to explicitly delete the old one ●  Historically, classical representation has been more widely used Ø  Many of the algorithms in the book were originally written to use classical representation Ø  That’s starting to change Classical rep. CSV rep. P(b1,…,bk) becomes xP(b1,…,bk)=1 x(b1,…,bn)=b0 becomes Px(b1,…,bn,b0)
  • 26. 26  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   This  work  is  licensed  under  a  CreaCve  Commons  AFribuCon-­‐NonCommercial-­‐ShareAlike  4.0  InternaConal  License.   Chapter  2     Delibera.on  with  Determinis.c  Models     2b:  Planning  Algorithms   Dana S. Nau and Vikas Shivashankar University of Maryland
  • 27. 27  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Mo.va.on   ●  Nearly all planning procedures are search procedures ●  Different planning procedures have different search spaces Ø  This section: state-space planning •  Each node represents a state of the world ▸  A plan is a path through the space
  • 28. 28  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Forward  Search   Which forward-search algorithm depends on how you implement the nondeterministic choice I’ll discuss several such algorithms
  • 29. 29  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Depth-­‐First  Search   ●  At each state, select the action that has the lowest heuristic-functionvalue ●  Visited is for cycle-checking Ø  If you come to a state you’ve already seen on the current path, then backtrack Ø  In a finite domain, this guarantees termination ●  Guaranteed to find a solution if one exists
  • 30. 30  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Greedy  Search   ●  Like DFFS, but never backtracks Ø  Not guaranteed to find a solution
  • 31. 31  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ●  Requires a heuristic function in line 3 Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s) Ø  Expands the node (computes nodes for all applicable actions) Ø  Prunes nodes that can be shown to be no better than nodes already expanded A*  
  • 32. 32  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ●  Requires a heuristic function in line 3 Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s) Ø  Expands the node (computes nodes for all applicable actions) Ø  Prunes nodes that can be shown to be no better than nodes already expanded A*  
  • 33. 33  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Arad 366=0+366
  • 34. 34  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Zerind Arad Sibiu Timisoara 447=118+329 449=75+374393=140+253
  • 35. 35  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Zerind Arad Sibiu Arad Timisoara Rimnicu VilceaFagaras Oradea 447=118+329 449=75+374 646=280+366 413=220+193415=239+176 671=291+380
  • 36. 36  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Zerind Arad Sibiu Arad Timisoara Fagaras Oradea 447=118+329 449=75+374 646=280+366 415=239+176 Rimnicu Vilcea Craiova Pitesti Sibiu 526=366+160 553=300+253417=317+100 671=291+380 X   X  
  • 37. 37  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Zerind Arad Sibiu Arad Timisoara Sibiu Bucharest Rimnicu VilceaFagaras Oradea Craiova Pitesti Sibiu 447=118+329 449=75+374 646=280+366 591=338+253 450=450+0 526=366+160 553=300+253417=317+100 671=291+380 X  X   X  
  • 38. 38  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Bucharest Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Rimnicu Vilcea Vaslui Iasi Straight−line distance to Bucharest 0 160 242 161 77 151 241 366 193 178 253 329 80 199 244 380 226 234 374 98 Giurgiu Urziceni Hirsova Eforie Neamt Oradea Zerind Arad Timisoara Lugoj Mehadia Dobreta Craiova Sibiu Fagaras Pitesti Vaslui Iasi Rimnicu Vilcea Bucharest 71 75 118 111 70 75 120 151 140 99 80 97 101 211 138 146 85 90 98 142 92 87 86 Zerind Arad Sibiu Arad Timisoara Sibiu Bucharest Rimnicu VilceaFagaras Oradea Craiova Pitesti Sibiu Bucharest Craiova Rimnicu Vilcea 418=418+0 447=118+329 449=75+374 646=280+366 591=338+253 450=450+0 526=366+160 553=300+253 615=455+160 607=414+193 671=291+380 X  X   X   X  
  • 39. 39  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ●  Requires a heuristic function in line 3 Ø  Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s) Ø  Expands the node (computes nodes for all applicable actions) Ø  Prunes nodes that can be shown to be no better than nodes already expanded A*  
  • 40. 40  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ●  h is admissible if for every s, 0 ≤ h(s) ≤ h*(s) Ø  where h*(s) = least cost of getting from s to a state that satisfies g ●  If h is admissible then A* is guaranteed to return an optimal solution ●  Inadmissible heuristics might get you to a solution faster, but the solution won’t necessarily be optimal Proper.es  of  A*  
  • 41. 41  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Depth-­‐First  Branch  and  Bound   ●  Depth-first search with heuristic function and pruning Ø  π* and c*: least-cost solution found so far, and the cost of that solution Ø  Any time you find a solution with lower cost, update π* and c* Ø  Prune any plan π such that cost(π) + h(s) ≥ c*
  • 42. 42  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Discussion   ●  If the state space is not too large, then A* or DFBB may be preferable Ø  They are guaranteed to return optimal solutions ●  DFFS returns the first solution it finds Ø  can be arbitrarily far from optimal ●  Greedy isn’t guaranteed to return a solution at all ●  If S is very large, A* may require excessive memory, and both it and DFBB may require excessive running time Ø  In these cases DFFS and Greedy may be preferable
  • 43. 43  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Example   ●  One rigid relation, adjacent ●  State variables to represent each location’s x and y coordinates: Ø  x = {(loc0, 2), (loc1, 0), (loc2, 4), (loc3, 0), . . .} Ø  y = {(loc0, 4), (loc1, 3), (loc2, 4), (loc3, 2), . . .} ●  One planning operator: action move(r, l, m) Ø  pre: adjacent(l, m), loc(r) = l Ø  eff: loc(r) ← m ●  cost(loci, move(r,locj)) = ︎(xi −xj)2 +(yi − yj)2 = ︎(x(loci) − x(locj))2 + (y(loci) − y(locj))2 ●  hsld(si) = ︎(x(loci) − x(locg))2 + (y(loci) − y(locg))2 Ø  Optimal solution to a relaxed problem in which the locations don’t need to be adjacent loc1 loc3 loc2 loc6 loc5 loc4 loc7 loc8 loc0 x y 4 3 2 1 1 2 3 4 5 60 loc9 Not CSV
  • 44. 44  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   CSV  Example   ●  s0 = {loc(r1)  =  d3,                      loaded(r1)  =  F,                      loc(c1)  =  d1}; ●  g = {loc(r1)  =  d3,                    loc(c1)  =  r1} ●  action move(r, d, e) Ø  pre: loc(r) = d Ø  eff: loc(r) ← e ●  action load(r, c, l) Ø  pre: loaded(r) = F, loc(c) = l, loc(r) = l Ø  eff: loaded(r) ← T, loc(c) ← r ●  action unload(r, c, l) Ø  pre: loc(c) = r, loc(r) = l Ø  eff: loaded(r) ← F, loc(c) ← l d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 45. 45  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   CSV  Example   ●  Two applicable actions ●  a1 = move(r1,  d3,  d1)   s1 = {loc(r1)  =  d1,  loaded(r1)  =  F,                        loc(c1)  =  d1} ●  a2 = move(r1,  d3,  d2)   s2 = {loc(r1)  =  d2,  loaded(r1)  =  F,                        loc(c1)  =  d1} ●  In line 4, compute Ø  min(h(s1), h(s2)) g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 46. 46  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Addi.ve-­‐Cost  and  Max-­‐Cost  Heuris.cs   ●  hadd(s) = Δadd(s,g), where ●  Minimum cost of a “plan tree” to achieve g from s Ø  Pretend each element of g needs a completely separate plan •  If an action achieves i elements of g, it’s included i times Ø  Pretend each of an action’s preconditions needs a completely separate plan Ø  Thus, get a “plan” that’s a tree of actions •  Total cost can sometimes be much higher than h*(s) ●  Can implement as a tree search, going backward from g
  • 47. 47  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   true in s2 loc(r1)=d3 loc(c1)=r1 g = {loc(r1)=d3,,loc(c1)=r1} move(r1,d1,d3) move(r1,d2,d3) pre: loc(r1)=d1 pre: loc(r1)=d2 true in s1 … 0 0 > 0 min(1,(>1)) = 1 pre: loaded(r1)=nil true in s1 loc(r1)=d1 loc(c1)=d1 true in s2 0 0 load(r1,c1,d1) 0+1 = 10+1 = 1 (>0)+1 > 1 load(r1,c2,d2) load(r1,c3,d3) … … > 0 > 0 (>0) +1 > 1 (>0)+1 > 1 sum min(1,(>1),(>1)) = 1 sum(0,0,0) = 0 su m alternatives alternatives hadd(s1) = Δadd(s1,g) = sum(1,1) = 2 d3 r1   c1   s1:                          d2                            d1   d3 r1   c1   Addi.ve-­‐Cost  Heuris.c   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 48. 48  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   hadd(s2) = Δadd(s2,g) = sum(1,2) = 3 move(r1,d2,d1) loc(r1)=d3 loc(c1)=r1 g = {loc(r1)=d3,0loc(c1)=r1} move(r1,d1,d3) move(r1,d2,d3) pre: loc(r1)=d1 pre: loc(r1)=d2 … true in s2 0 > 0 0 min(1,(>1)) = 1 pre: loaded(r1)=nil true in s2 loc(r1)=d1 loc(c1)=d1 true in s2 0 0+1 = 1 load(r1,c1,d1) 1+1 = 2(>0)+1 > 1 0+1 = 1 load(r1,c2,d2) load(r1,c3,d3) … … > 1 > 1 (>1) +1 > 2 (>1)+1 > 2 sum true in s2 0 min(2,(>2),(>2)) = 2 sum(0,1,0) = 1 su m alternatives alternatives Addi.ve-­‐Cost  Heuris.c   g = {loc(r1)  =  d3,  loc(c1)  =  r1} d3 r1   c1  s2:                          d2                            d1   d3 r1   c1  
  • 49. 49  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Addi.ve-­‐Cost  and  Max-­‐Cost  Heuris.cs   ●  hmax(s) = Δmax(s,g), where ●  Like hadd(s), but doesn’t add all of the costs in the plan tree Ø  Just the most costly path in the plan tree ●  Guaranteed to be a lower bound on h*(s) ●  Same tree search as for hadd(s)
  • 50. 50  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   true in s2 loc(r1)=d3 loc(c1)=r1 g = {loc(r1)=d3,,loc(c1)=r1} move(r1,d1,d3) move(r1,d2,d3) pre: loc(r1)=d1 pre: loc(r1)=d2 true in s1 … 0 0 > 0 min(1,(>1)) = 1 pre: loaded(r1)=nil true in s1 loc(r1)=d1 loc(c1)=d1 true in s2 0 0 load(r1,c1,d1) 0+1 = 10+1 = 1 (>0)+1 > 1 load(r1,c2,d2) load(r1,c3,d3) … … > 0 > 0 (>0) +1 > 1 (>0)+1 > 1 max min(1,(>1),(>1)) = 1 max(0,0,0) = 0 m ax alternatives alternatives hmax(s1) = Δmax(s1,g) = max(1,1) = 1 s1:                          d2                            d1   d3 r1   c1   Max-­‐Cost  Heuris.c   g = {loc(r1)  =  d3,  loc(c1)  =  r1} d3 r1   c1  
  • 51. 51  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   move(r1,d2,d1) loc(r1)=d3 loc(c1)=r1 g = {loc(r1)=d3,0loc(c1)=r1} move(r1,d1,d3) move(r1,d2,d3) pre: loc(r1)=d1 pre: loc(r1)=d2 … true in s2 0 > 0 0 min(1,(>1)) = 1 hmax(s2) = Δmax(s2,g) = max(1,2) = 2 pre: loaded(r1)=nil true in s2 loc(r1)=d1 loc(c1)=d1 true in s2 0 0+1 = 1 load(r1,c1,d1) 1+1 = 2(>0)+1 > 1 0+1 = 1 load(r1,c2,d2) load(r1,c3,d3) … … > 1 > 1 (>1) +1 > 2 (>1)+1 > 2 max true in s2 0 min(2,(>2),(>2)) = 2 max(0,1,0) = 1 m ax alternatives alternatives Max-­‐Cost  Heuris.c   g = {loc(r1)  =  d3,  loc(c1)  =  r1} d3 r1   c1  s2:                          d2                            d1   d3 r1   c1  
  • 52. 52  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Delete-­‐Relaxa.on  Heuris.cs   ●  Suppose a state s includes an assignment x = v ●  Suppose an action a has an effect x ← w ●  Then γ+(s, a) includes both x = v and x = w ●  Relaxed state (or r-state) Ø  any set ŝ of state-variable values Ø  may include more than one value for each state variable ●  ŝ r-satisfies a goal g if ŝ contains a subset that satisfies g ●  An action a is r-applicable in ŝ if ŝ r-satisfies a’s preconditions Ø  In this case, γ+(ŝ,a) = ŝ ∪ γ(s,a) ●  π = ⟨a1, …, an⟩ is r-applicable in ŝ0 if there are r-states ŝ1, ŝ2, …, ŝn such that •  a1 is r-applicable in ŝ0 and γ+(ŝ0,a1) = ŝ1 •  a2 is r-applicable in ŝ1 and γ+(ŝ1,a2) = ŝ2 •  … Ø  In this case, γ+(ŝ,a) = ŝn ●  π is a relaxed solution for P = (Σ, s0, g) if γ+(s0, π) r-satisfies g Name is from classical planning “don’t do the deletion”   move(r,l,m)   Precond: loc(r,l),     Effects: ¬loc(r,l),  loc(r,m)  
  • 53. 53  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   d3 r1   c1  ŝ1: g:                          d2                            d1   d3 r1   r1   c1   Op.mal  Relaxed  Solu.on  Heuris.c   ●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g ●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g) ●  Example: Ø  ŝ1 = γ+(s0, move(r1,d3,d1)) = {loc(r1)  =  d1,  loaded(r1)  =  F,  loc(c1)  =  d1,  loc(r1)  =  d3} g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 54. 54  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   d3 r1   c1  ŝ2: g:                          d2                            d1   d3 r1   r1   c1   c1   r1   r1   c1   Op.mal  Relaxed  Solu.on  Heuris.c   ●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g ●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g) ●  Example: Ø  ŝ1 = γ+(s0, move(r1,d3,d1)) = {loc(r1)  =  d1,  loaded(r1)  =  F,  loc(c1)  =  d1,  loc(r1)  =  d3} Ø  ŝ2 = γ+(s1, load(r1,c1,d1)) = {loc(r1)  =  d1,  loaded(r1)  =  F,  loc(c1)  =  d1,  loc(r1)  =  d3,                                                                                  loaded(r1)  =  T,  loc(c1)  =  r1}. Ø  ŝ2 r-satisfies g, so ⟨move(r1,d3,d1), load(r1,c1,d1)⟩ is a relaxed solution Ø  It’s optimal, so h+(s0) = 2 g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 55. 55  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Op.mal  Relaxed  Solu.on  Heuris.c   ●  ∆+(ŝ, g) = minimum cost of all plans π such that γ+(s0, π) r-satisfies g ●  Optimal relaxed solution heuristic: h+(s) = ∆+(s, g) ●  Every solution is also a relaxed solution Ø  Thus h+ is admissible Ø  Problem: computing it is NP-hard d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 56. 56  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG  Heuris.c   ●  Relaxed planning graph (RPG) heuristic Ø  An approximation of h+ that’s easier to compute ●  Based on the fact that γ+ doesn’t depend on the order in which actions are applied ●  If a1 and a2 are both applicable in s0 then γ+(s0, ⟨a1, a2⟩) = γ+(s0, ⟨a2, a1⟩) = s0 ∪ eff(a1) ∪ eff(a2)
  • 57. 57  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG  Heuris.c   ●  Computation of RPG(s1,g) Ø  First line of RPG assigns ŝ0 = s1 , A0 = ∅ ŝ0:          loc(r1)  =  d1            loc(c1)  =  d1            loaded(r1)  =  F ŝ1:          loc(r1)  =  d3            loc(r1)  =  d2            loaded(r1)  =  T            loc(c1)  =  r1              loc(c1)  =  d1            loc(r1)  =  d1            loaded(r1)  =  F A1:          move(r1,d1,d3)            move(r1,d1,d2)            load(r1,c1,d1) From ŝ0 d3 r1   c1  s1: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 58. 58  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG  Heuris.c   ●  Computation of RPG(s1,g) Ø  First line of RPG assigns ŝ0 = s1 , A0 = ∅ ●  RPG(s1,g) returns 2 ŝ0:          loc(r1)  =  d1            loc(c1)  =  d1            loaded(r1)  =  F ŝ1:          loc(r1)  =  d3            loc(r1)  =  d2            loaded(r1)  =  T            loc(c1)  =  r1              loc(c1)  =  d1            loc(r1)  =  d1            loaded(r1)  =  F A1:          move(r1,d1,d3)            move(r1,d1,d2)            load(r1,c1,d1) From ŝ0 d3 r1   c1  s1: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 59. 59  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG  Heuris.c   ŝ0:  loaded(r1)  =  F    loc(c1)  =  d1    loc(r1)  =  d2   ŝ1:          loc(r1)  =  d3            loc(r1)  =  d1              loaded(r1)  =  F            loc(c1)  =  d1            loc(r1)  =  d2 A1:          move(r1,d2,d3)            move(r1,d2,d1)   From ŝ0 A2:          move(r1,d3,d1)            move(r1,d3,d2)            move(r1,d1,d2)            move(r1,d1,d3)                  load(r1,c1,d1)            move(r1,d2,d3)            move(r1,d2,d1)              ŝ2:    loc(c1)  =  r1            loaded(r1)  =  T            loc(r1)  =  d3            loc(r1)  =  d1              loc(c1)  =  d1            loc(r1)  =  d2            loaded(r1)  =  F From ŝ0 d3 r1   c1  s2: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} ●  Computation of RPG(s2,g) Ø  First line of RPG assigns ŝ0 = s2 , A0 = ∅
  • 60. 60  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG  Heuris.c   ŝ0:  loaded(r1)  =  F    loc(c1)  =  d1    loc(r1)  =  d2   ŝ1:          loc(r1)  =  d3            loc(r1)  =  d1              loaded(r1)  =  F            loc(c1)  =  d1            loc(r1)  =  d2 A1:          move(r1,d2,d3)            move(r1,d2,d1)   From ŝ0 A2:          move(r1,d3,d1)            move(r1,d3,d2)            move(r1,d1,d2)            move(r1,d1,d3)                  load(r1,c1,d1)            move(r1,d2,d3)            move(r1,d2,d1)              ŝ2:    loc(c1)  =  r1            loaded(r1)  =  T            loc(r1)  =  d3            loc(r1)  =  d1              loc(c1)  =  d1            loc(r1)  =  d2            loaded(r1)  =  F From ŝ0 d3 r1   c1  s2: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} ●  Computation of RPG(s2,g) Ø  First line of RPG assigns ŝ0 = s2 , A0 = ∅ ●  RPG(s2,g) returns 3
  • 61. 61  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Landmark  Heuris.cs   ●  P = (Σ,s0,g) be a planning problem ●  Let φ = φ1 ∨ ... ∨ φm be a disjunction of state-variable assignments ●  Definition: φ is a landmark for P if φ is true at some point in every solution plan of P ●  Example Landmarks d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}                          d1   r1   A complete state s0 A single state-variable Assignment (loc(r1)=d1)                          d1   r1                            d2   r1   Can be a disjunction of state-variable assignments loc(r1)=d1 ∨ loc(r1)=d2
  • 62. 62  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Why  are  Landmarks  Useful?   ●  Help in breaking down the given problem into smaller subproblems gs0
  • 63. 63  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Why  are  Landmarks  Useful?   ●  Help in breaking down the given problem into smaller subproblems ●  Every solution to P has to achieve these landmarks ●  Possible strategy: Ø  find a plan that takes us from s0 to any state s1 that satisfies lm1 Ø  find a plan that takes us from s1 to any state s2 that satisfies lm2 Ø  … lm1 gs0 lm2 lm3 P1 P2 P3 P4
  • 64. 64  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Compu.ng  Landmarks   ●  Question: How do we compute landmarks for a problem P? ●  Not easy: Ø  Deciding whether a state-variable assignment φ is a landmark is in the worst- case PSPACE-hard Ø  To put it in perspective: as hard as solving the planning problem itself! ●  However, all is not lost: Ø  There are often useful landmarks that can be found more easily Ø  There are polynomial-time procedures that can compute these landmarks Ø  Going to see one such procedure based on Relaxed Planning Graphs ●  Why Relaxed Planning Graphs? Ø  Solving relaxed planning problems easier •  Computing landmarks for relaxed planning problems easier Ø  A landmark for a relaxed planning problem is a landmark for the original planning problem as well
  • 65. 65  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ●  Main intuition: if a state-variable assignment φ is a landmark, then preconditions of actions that achieve φ is also a landmark Ø  In other words: we’re going to start from known landmarks and discover new ones by looking at preconditions of actions that achieve these landmarks ●  Example: Ø  g is the goal Ø  Actions a1 and a2 can achieve g Ø  Therefore, either p1∧q or p2∧q must be true to be able to achieve g Ø  In other words: (p1∧q)∨(p2∧q) is a landmark Ø  By rearranging terms: we get (p1∧q)∨(p2∧q) ≅ q∧(p1∨p2) Ø  Since q∧(p1∨p2) is a landmark, both q and (p1∨p2) are landmarks ●  In practice, we try to rearrange assignments in a similar manner to group terms with the same state variable together g a2 a1 p1 q p2
  • 66. 66  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ●  Question: What landmarks can we start with? Ø  Every goal is trivially a landmark; can start from there ●  E.g. loc(r1)  =  d3  is a landmark Ø  Two actions achieve this landmark: move (r1, d3, d1) and move (r1, d2, d1) Ø  Can infer a new landmark that is the disjunction of the preconditions of these two actions: φ‘ = loc(r1)  =  d3∨loc(r1)  =  d2 action move(r, d, e) pre: loc(r) = d eff: loc(r) ← e d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1}
  • 67. 67  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3
  • 68. 68  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3
  • 69. 69  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3
  • 70. 70  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3
  • 71. 71  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3
  • 72. 72  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r- produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r- applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue gi a1 p1 q1 a2 p2 q2 a3 p3 q3 p3∨q1: a new landmark
  • 73. 73  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue
  • 74. 74  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1 True in current state, always going to be true in RPG; No use expanding ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r-state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue
  • 75. 75  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1 3 actions achieve this: load   (r1,c1,d1),  load  (r1,c1,d2),     load  (r1,c1,d3)       action load(r, c, l) pre: loaded(r) = nil, loc(c) = l, loc(r) = l eff: loaded(r) ← c, loc(c) ← r   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r- state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue
  • 76. 76  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r- state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1 3 actions achieve this: load   (r1,c1,d1),  load  (r1,c1,d2),     load  (r1,c1,d3)       action load(r, c, l) pre: loaded(r) = nil, loc(c) = l, loc(r) = l eff: loaded(r) ← c, loc(c) ← r  
  • 77. 77  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} ŝ0 = {          loc(c1)  =  d1            loc(r1)  =  d3            loaded(r1)  =  nil                      } A1 = {          move(r1,d3,d1)            move(r1,d3,d2)                      } ŝ1 = {          loc(r1)  =  d1            loc(r1)  =  d2            loc(c1)  =  d1            loc(r1)  =  d3            loaded(r1)  =  nil                        } From ŝ0  load(r1,c1,d1)    load(r1,c1,d2)    load(r1,c1,d3)   Relaxed Planning Graph foo   Only first load action applicable in r-state at the end of foo  
  • 78. 78  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1 3 actions achieve this: load   (r1,c1,d1),  load  (r1,c1,d2),     load  (r1,c1,d3)       action load(r, c, l) pre: loaded(r) = nil, loc(c) = l, loc(r) = l eff: loaded(r) ← c, loc(c) ← r   Only load  (r1,c1,d1)  is applicable in final level of RPG; these preconds used to generate new landmarks ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r- state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue
  • 79. 79  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   RPG-­‐based  Landmark  Computa.on   d3 r1   c1  s0: g:                          d2                            d1   d3 r1   c1   g = {loc(r1)  =  d3,  loc(c1)  =  r1} loc(r1)  =  d3 loc(c1)  =  r1 loaded(r1) = nil loc(c1) = d1 loc(r1) = d1 These new landmarks are added to the queue ComputeLandmark (s0, g = g1 ∧ g2 ∧ … gk) ●  queue = {g1, g2,…, gk} ●  While queue is not empty: Ø  Remove a gi from queue Ø  Ai = {all actions that can achieve gi} Ø  Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo   Ø  act(gi) = {all actions in Ai that are r-applicable in the r- state resulting from foo} Ø  For each action in act(gi): •  Pick a precondition not satisfied in s0 and add to φ Ø  The resulting disjunction φ is a landmark; add it to queue
  • 80. 80  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Landmark  Heuris.c   ●  Every solution to the problem needs to achieve all the computed landmarks ●  One possible heuristic to estimate distance of state s from g: Ø  Number of landmarks required to be accomplished from s Ø  Planner biased towards actions that achieve landmarks ●  Is this heuristic admissible?
  • 81. 81  Dana  Nau  and  Vikas  Shivashankar:  Lecture  slides  for  Automated  Planning  and  Ac0ng   Updated  3/24/15   Landmark  Heuris.c   ●  Every solution to the problem needs to achieve all the computed landmarks ●  One possible heuristic to estimate distance of state s from g: Ø  Number of landmarks required to be accomplished from s Ø  Planner biased towards actions that achieve landmarks ●  Is this heuristic admissible? ●  A number of more advanced landmark-based heuristics developed (including admissible ones) Ø  Check textbook for references g1 a s0 g2 Number of landmarks: | {g1, g2} | = 2 Optimal Plan Length = |<a>| = 1 g = g1∧g2