Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting
Updated 3/24/15
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Chapter 2
Deliberation with Deterministic Models
Dana S. Nau and Vikas Shivashankar
University of Maryland
Purpose of this Chapter
● Last time, Vikas mentioned conventional AI planning
Ø Given
• a domain model (descriptions of the states and actions)
• initial state s0, and goal g
Ø Find a plan or a policy that
• is executable starting in s0
• produces a state that satisfies g
● This chapter discusses some techniques for doing that
Ø Also, how to use those techniques in acting systems
● Outline:
Ø 2a: Representing planning domains
Ø 2b: Planning algorithms
Ø 2c: Acting and planning
2a: Representing Planning Domains
Domain Model
● Planning domain: an abstract model of the environment
Ø Many different kinds of environments, various ways to model them
● In this chapter, the model is a deterministic state-transition system
Ø Σ = (S,A,γ)
• S is a finite set of states
▸ States of the world
• A is a finite set of actions
▸ Things an actor can do
• γ: S × A → S is a prediction function (or state-transition function)
▸ Given a state s and action a, γ(s,a) is another state
• Prediction of what state will be produced by executing a in s
• γ is partial: γ(s,a) is undefined if a is inapplicable in s
▸ Dom(a) = {s ∈ S | γ(s,a) is defined} = {states where a is applicable}
▸ Range(a) = {γ(s,a) | s ∈ Dom(a)}
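As a concrete illustration (a sketch, not the book's code, with made-up state and action names), a small Σ can be stored as a lookup table, with γ partial so that missing entries mean "action inapplicable":

```python
# A deterministic state-transition system Sigma = (S, A, gamma),
# with the prediction function gamma stored as a lookup table.
S = {"s0", "s1", "s2"}
A = {"a1", "a2"}
gamma = {("s0", "a1"): "s1", ("s1", "a2"): "s2", ("s0", "a2"): "s0"}

def applicable(s, a):
    # gamma is partial: a is applicable in s iff gamma(s, a) is defined
    return (s, a) in gamma

def dom(a):
    # Dom(a) = {states where a is applicable}
    return {s for (s, b) in gamma if b == a}

def range_(a):
    # Range(a) = {gamma(s, a) | s in Dom(a)}
    return {gamma[(s, a)] for (s, b) in gamma if b == a}
```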
Implicit Assumptions
● The state-transition model incorporates the following assumptions
Ø Static world
• Changes occur only in response to the actor’s actions
Ø Perfect information
• Actor always has all the information it needs
Ø Instantaneous actions
• Each action causes an instantaneous transition from one state to the next
Ø Determinism
• Actions are deterministic
Ø Correct prediction function
• Outcome of action a in state s is always γ(s,a)
Ø Flat search space
• Only one level of abstraction; ignore how to refine actions at a lower level
How to Represent Σ?
● If the domain is small enough
Ø Give each state and action a
unique name
Ø For each s and a, store γ(s,a)
in a lookup table
[Figure: a grid map with locations loc0 through loc9 at various (x, y) coordinates]
● If a domain is larger, don’t represent all states explicitly
Ø Have a formalism for describing states by describing their properties
Ø Represent each action by describing how it changes those properties
Ø Start with initial state, use actions to produce other states
Deterministic Operator (General Form)
● Domain-specific format for representing states
Ø Invent your own format
● General form of a deterministic operator:
Ø o = (head(o), pre(o), eff(o), cost(o))
• head(o): name and parameter list
• pre(o): preconditions
▸ Computational tests to predict whether an action can be performed in
a state s
▸ In principle, should be necessary/sufficient for the action to run
without error
• eff(o): effects
▸ Procedures that assign new values to some of the state variables
• cost(o): procedure that returns a number
▸ Can be omitted, in which case cost(o) = 1
▸ Could represent monetary cost, time required, something else
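A minimal sketch of this general operator form, assuming states are Python dicts and using a simplified drill-hole-like action; the helper names, state variables, and tests are illustrative, not the book's:

```python
# General operator form o = (head(o), pre(o), eff(o), cost(o)),
# where pre/eff/cost are arbitrary procedures over states.
def make_action(head, pre, eff, cost=lambda s: 1):
    # cost may be omitted, in which case cost(o) = 1
    return {"head": head, "pre": pre, "eff": eff, "cost": cost}

def gamma(s, a):
    """Prediction function: returns None if a is inapplicable in s."""
    if not a["pre"](s):
        return None
    s2 = dict(s)
    a["eff"](s2)          # effects assign new values to state variables
    return s2

# A much-simplified drilling action (hypothetical state variables)
drill = make_action(
    head="drill-hole",
    pre=lambda s: s["clamped"] and s["bit_loaded"],
    eff=lambda s: s.update({"has_hole": True}),
    cost=lambda s: 5.0,   # e.g., an estimate of machining time
)
```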
Example
● Suppose we want to plan how to create a metal hole in the workpiece
● a state s includes
Ø geometric model of the workpiece, variables describing its location,
orientation, and other status information,
Ø capabilities and status of drilling machine and drill
● Several actions (getting the workpiece onto the machine, clamping it, loading a
drill bit, etc.)
Ø Next slide: the drilling operation itself
Drilling Action
● head(o) = drill-hole(m, w, b, x1, …, xn)
Ø m, w, b are the names or ID numbers of the drilling machine, the workpiece,
and the drill bit
Ø x1, …, xn are specifications of the hole’s location, orientation, depth, and
required machining tolerances
● pre(o): computational tests
Ø Can the drilling machine and drill bit produce a hole having the desired
geometry and machining tolerances?
Ø Is the drill loaded into the drilling machine? Is the workpiece properly
clamped onto the drilling platform? Etc.
● eff(o): geometric computation
Ø Modify the geometric model of the workpiece to include a hole having the
desired specifications.
● cost(o)
Ø could be an estimate of the time required for the drilling operation
Ø could be a cost estimate based on time + other criteria
State-Variable Representation
● Represent each state as a collection of properties of various objects
Ø Represent each action by describing how it changes those properties
● Let B = {all objects that matter for planning}
Ø B may be classified into subsets: various kinds of objects
● Example
Ø B = Robots ∪ Containers ∪ Locs ∪ Booleans
• Robots = {r1, r2}
• Containers = {c1, c2}
• Locs = {d1, d2, d3}
• Booleans = {T, F}

[Figure: robots r1 and r2, containers c1 and c2, and loading docks d1, d2, d3]
Properties of Objects
● Define ways to represent properties of objects
Ø Two kinds of properties: rigid and varying
● A property is rigid if it stays the same in every state
Ø Represent as a mathematical relation
● Example:
Ø adjacent = {(d1,d2), (d2,d1)}
Ø Can also write as
• adjacent(d1,d2)
• adjacent(d2,d1)
Varying Properties
● A property is varying if it may differ in different states
Ø Represent using a state variable that we can assign a value to
Ø Each state variable x has a range (set of possible values), Range(x)
Ø For each state s, s(x) ∈ Range(x) is x’s value in state s
● Example:
Ø What we want to represent:
• Each robot can hold at most one container
• Each robot is at one of the locations
• Each container is on a robot or at one of the locations
A Simple Example
● Set of all state variables
Ø X = {loc(r1), loaded(r1), loc(c1), loc(r2), loaded(r2), loc(c2)}
● Range(loc(r1)) = Locs
● Range(loc(r2)) = Locs
● Range(loaded(r1)) = Booleans
● Range(loaded(r2)) = Booleans
● Range(loc(c1)) = Robots ∪ Locs
● Range(loc(c2)) = Robots ∪ Locs
Why have both loc and loaded?
States as Functions
● Write each state as a function s that maps each x ∈ X to a value in Range(x)
s1(loc(r1)) = d1, s1(loaded(r1)) = F, s1(loc(c1)) = d1,
s1(loc(r2)) = d2, s1(loaded(r2)) = F, s1(loc(c2)) = d2
Ø Let S = the set of all such functions
● Functions are sets of ordered pairs
s1 = {(loc(r1), d1), (loaded(r1), F), (loc(c1), d1), …}
● Rewrite as
s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1,
loc(r2) = d2, loaded(r2) = F, loc(c2) = d2}
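For instance, the state s1 can be written as a Python dict, i.e., a set of ordered pairs (an illustrative sketch; variable names are encoded as strings, and F as False):

```python
# A state as a function from state variables to values,
# implemented as a dict (equivalently, a set of ordered pairs).
s1 = {"loc(r1)": "d1", "loaded(r1)": False, "loc(c1)": "d1",
      "loc(r2)": "d2", "loaded(r2)": False, "loc(c2)": "d2"}

# Range(x) for a couple of the variables (hypothetical encoding)
RANGE = {"loc(r1)": {"d1", "d2", "d3"}, "loaded(r1)": {True, False}}

def is_well_typed(s, ranges):
    # every listed variable's value lies in Range(x)
    return all(s[x] in ranges[x] for x in ranges)
```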
Discussion
● Just because a function maps each state variable x into a value in Range(x),
this doesn't make it a state
Ø Some sets of state-variable values may not make any sense as states
s = {loaded(r1) = F, loc(c1) = r1, …}
s = {loc(c1) = d1, loc(c2) = r1, …}
● Need restrictions on what sets of state-variable values constitute real states
Ø In most of the book, we won’t represent such restrictions explicitly
Ø Just write the action models in such a way that none of them will ever
produce such a set of assignments
Descriptive Action Models
● Actions often fall into closely related classes
● For each class, write a parameterized schema called a planning operator to
describe all the actions in that class
action move(r, l, m)
Pre: loc(r) = l, adjacent(l, m)
Eff: loc(r) ← m
action take(r, l, c)
Pre: loaded(r) = F, loc(r) = l, loc(c) = l
Eff: loaded(r) ← T, loc(c) ← r
action put(r, l, c)
Pre: loc(r) = l, loc(c) = r
Eff: loaded(r) ← F, loc(c) ← l
Each parameter has a range of possible values, e.g., Range(r) = Robots
CSV Operator
● Classical State Variable (CSV) Operator:
• the kind of operator shown on the previous page
● General form
Ø o = (head(o), pre(o), eff(o), cost(o))
• each precondition in pre(o) must have one of these forms:
▸ relname(t1,…,tk) or ¬relname(t1,…,tk)
▸ varname(t1,…,tk) = t0 or varname(t1,…,tk) ≠ t0
• relname = name of a rigid relation
• varname = name of a state variable
• each ti must be a constant (i.e., a member of B) or a parameter
• each effect in eff(o) must have the form varname(t1,…,tk) ← t0
• if cost(o) isn’t omitted, it must be a nonnegative number (not a formula)
● Limited representational capability, but easy to compute, easy to reason about
Ø Many algorithms have been written to use this kind of operator
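The operator-schema idea of the last two slides can be sketched as follows, assuming states are dicts of state-variable assignments; the move operator and the adjacent relation are from the slides, but the helper names and the grounding strategy are illustrative:

```python
from itertools import product

# Ground CSV actions: instances of the move(r, l, m) schema,
# generated by assigning values to parameters. 'adjacent' is a
# rigid relation, so it is checked at grounding time.
ROBOTS = ["r1", "r2"]
LOCS = ["d1", "d2", "d3"]
ADJACENT = {("d1", "d2"), ("d2", "d1"), ("d1", "d3"), ("d3", "d1")}

def move_instances():
    """Yield (name, preconditions, effects) for each ground action."""
    for r, l, m in product(ROBOTS, LOCS, LOCS):
        if (l, m) in ADJACENT:
            yield (f"move({r},{l},{m})",
                   {f"loc({r})": l},     # pre: loc(r) = l
                   {f"loc({r})": m})     # eff: loc(r) <- m

def applicable(s, pre):
    return all(s.get(x) == v for x, v in pre.items())

def apply_effects(s, eff):
    s2 = dict(s)
    s2.update(eff)        # each effect assigns a new value
    return s2
```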
CSV Actions
● A CSV action is an instance
of a planning operator
Ø assign values to parameters
action move(r1,d1,d2)
Pre: loc(r1) = d1, adjacent(d1,d2)
Eff: loc(r1) ← d2

action take(r2,d2,c2)
Pre: loaded(r2) = F, loc(r2) = d2, loc(c2) = d2
Eff: loaded(r2) ← T, loc(c2) ← r2

action put(r1,d1,c1)
Pre: loc(r1) = d1, loc(c1) = r1
Eff: loaded(r1) ← F, loc(c1) ← d1
Plans
● Plan: a sequence of actions
π = 〈a1,a2,…,an〉
Ø π is applicable in s0 if the actions
can be applied in the order given
• γ (s0,a1) = s1, γ (s1,a2) = s2, …, γ (sn–1,an) = sn
● If π is applicable, define γ (s0,π) = sn
● Example
π = 〈take(r2,d2,c2),
move(r2,d2,d1),
put(r2,d1,c2)〉
[Figures: s0, with r2 and c2 at d2; and γ(s0,π), with r2 and c2 at d1]
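A sketch of plan application, under the assumption that actions are (preconditions, effects) pairs of state-variable assignments; it reproduces the example plan above:

```python
# gamma chained over a plan pi = <a1,...,an>; returns None if
# some action along the way is inapplicable.
def gamma(s, action):
    pre, eff = action
    if any(s.get(x) != v for x, v in pre.items()):
        return None
    s2 = dict(s)
    s2.update(eff)
    return s2

def gamma_plan(s, plan):
    for a in plan:
        s = gamma(s, a)
        if s is None:
            return None    # pi is not applicable in the initial state
    return s

# The example plan's three actions, ground as on the slide
take = ({"loc(r2)": "d2", "loc(c2)": "d2", "loaded(r2)": False},
        {"loaded(r2)": True, "loc(c2)": "r2"})
move = ({"loc(r2)": "d2"}, {"loc(r2)": "d1"})
put  = ({"loc(r2)": "d1", "loc(c2)": "r2"},
        {"loaded(r2)": False, "loc(c2)": "d1"})

s0 = {"loc(r2)": "d2", "loc(c2)": "d2", "loaded(r2)": False}
```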
Planning Problems, Solutions
● Planning problem: P=(Σ,s0,g)
Ø Σ = planning domain
Ø s0 = initial state
Ø g = {g1,…,gk} is a set of constraints called the goal
● π is a solution for P if γ (s0,π) satisfies g
Ø π is a shortest solution if no shorter plan is also a solution
Ø π is a minimal solution if no proper subsequence of π is also a solution
● CSV planning problem:
Ø Σ is a CSV planning domain,
Ø g has the same form as a CSV operator’s preconditions
● Example:
Ø s0 and π as on previous page
Ø g = {loc(r2)=d1, loc(c2)=d1}
Classical and State-Variable Representations
● Motivation
Ø The field of AI planning started out as automated theorem proving
Ø It still uses a lot of that notation
● Classical representation
Ø Equivalent to CSV representation
Ø Represents both rigid and varying properties using logical predicates
• adjacent(l,m) - location l is adjacent to location m
• loc(r,l) - robot r is at location l
• loc(c,l), loc(c,r) - container c is at location l or on robot r
• loaded(r) - there is a container on robot r
States
● Use ground atoms to represent both rigid and varying properties of Σ
● To represent a state
Ø s = {all ground atoms that are true in the state}
Ø e.g., s1 = {loc(c1,d1), loc(c2,d2), loc(r1,d1), loc(r2,d2),
adjacent(d1,d2), adjacent(d2,d1), adjacent(d1,d3), adjacent(d3,d1)}
Classical Planning Operators
● Classical planning operator:
Ø o = (head(o), pre(o), eff(o))
● Each precondition and effect
must have one of these forms:
pred(t1,…,tk) ¬pred(t1,…,tk)
Ø pred = predicate name
Ø each ti must be a constant
(i.e., member of B) or
a parameter
● Classical action: a ground
instance of a classical operator
move(r,l,m)
Precond: loc(r,l)
Effects: ¬loc(r,l), loc(r,m)

take(r,l,c)
Precond: loc(r,l), loc(c,l), ¬loaded(r)
Effects: loc(c,r), ¬loc(c,l), loaded(r)

put(r,l,c)
Precond: loc(r,l), loc(c,r)
Effects: loc(c,l), ¬loc(c,r), ¬loaded(r)
Discussion
● Classical representation is equivalent to state-variable representation in
expressive power
Ø Each can be converted to the other in linear time and space
● Classical representation is more natural for logicians
● CSV is more natural for engineers and most computer scientists
Ø When changing a value, you don’t have to explicitly delete the old one
● Historically, classical representation has been more widely used
Ø Many of the algorithms in the book were originally written to use
classical representation
Ø That’s starting to change
Classical rep. to CSV rep.: P(b1,…,bk) becomes xP(b1,…,bk) = 1
CSV rep. to classical rep.: x(b1,…,bn) = b0 becomes Px(b1,…,bn,b0)
2b: Planning Algorithms
Motivation
● Nearly all planning procedures are search procedures
● Different planning procedures have different search spaces
Ø This section: state-space planning
• Each node represents a state of the world
▸ A plan is a path through the space
Forward Search
● Which forward-search algorithm you get depends on how you implement the nondeterministic choice
● I'll discuss several such algorithms
Depth-First Search
● At each state, select the action that has the lowest heuristic-function value
● Visited is for cycle-checking
Ø If you come to a state you’ve already seen on the current path, then backtrack
Ø In a finite domain, this guarantees termination
● Guaranteed to find a solution if one exists
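A sketch of this depth-first variant, assuming hashable states and a heuristic h; the visited argument plays the role of Visited, and actions are tried in order of increasing heuristic value of the resulting state:

```python
# Depth-first forward search with cycle-checking: backtrack if a
# state recurs on the current path; in a finite domain this
# guarantees termination.
def dfs(s, goal, successors, h, visited=frozenset()):
    if goal(s):
        return []
    if s in visited:
        return None                 # cycle on this path: backtrack
    for a, s2 in sorted(successors(s), key=lambda p: h(p[1])):
        rest = dfs(s2, goal, successors, h, visited | {s})
        if rest is not None:
            return [a] + rest
    return None                     # all choices failed: backtrack
```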
Greedy Search
● Like DFFS, but never backtracks
Ø Not guaranteed to find a solution
A*
● Requires a heuristic function in line 3
Ø Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s)
Ø Expands the node (computes nodes for all applicable actions)
Ø Prunes nodes that can be shown to be no better than nodes already expanded
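A sketch of A* along these lines; the pruning here keeps the best known cost per state, a common simplification of the pruning rule the slide describes, and successors is assumed to return (action, next state, cost) triples:

```python
import heapq
from itertools import count

# A*: nodes are (plan, state); the fringe is a priority queue
# ordered by f = cost(plan) + h(state).
def astar(s0, goal, successors, h):
    tie = count()                   # tie-breaker so the heap never compares states
    fringe = [(h(s0), 0, next(tie), [], s0)]   # (f, cost, _, plan, state)
    best_cost = {}
    while fringe:
        f, g, _, plan, s = heapq.heappop(fringe)
        if goal(s):
            return plan
        if best_cost.get(s, float("inf")) <= g:
            continue                # pruned: already reached at least as cheaply
        best_cost[s] = g
        for a, s2, c in successors(s):
            heapq.heappush(fringe,
                           (g + c + h(s2), g + c, next(tie), plan + [a], s2))
    return None
```

With an admissible h this returns an optimal solution, matching the property stated on the next slide.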
Properties of A*
● h is admissible if for every s, 0 ≤ h(s) ≤ h*(s)
Ø where h*(s) = least cost of getting from s to a state that satisfies g
● If h is admissible then A* is guaranteed to return an optimal solution
● Inadmissible heuristics might get you to a solution faster, but the solution won't necessarily be optimal
Depth-First Branch and Bound
● Depth-first search with heuristic function and pruning
Ø π* and c*: least-cost solution found so far, and the cost of that solution
Ø Any time you find a solution with lower cost, update π* and c*
Ø Prune any plan π such that cost(π) + h(s) ≥ c*
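A sketch of DFBB under the same assumptions as the A* sketch above (successors returns (action, next state, cost) triples; names are illustrative):

```python
# Depth-first branch and bound: keep the best solution found so
# far (pi*, c*), and prune any node with cost + h(s) >= c*.
def dfbb(s0, goal, successors, h):
    best = {"plan": None, "cost": float("inf")}
    def search(s, plan, cost, visited):
        if cost + h(s) >= best["cost"]:
            return                            # pruned
        if goal(s):
            best["plan"], best["cost"] = plan, cost   # update pi*, c*
            return
        for a, s2, c in successors(s):
            if s2 not in visited:             # cycle-checking
                search(s2, plan + [a], cost + c, visited | {s2})
    search(s0, [], 0, {s0})
    return best["plan"], best["cost"]
```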
Discussion
● If the state space is not too large, then A* or DFBB may be preferable
Ø They are guaranteed to return optimal solutions
● DFFS returns the first solution it finds
Ø can be arbitrarily far from optimal
● Greedy isn’t guaranteed to return a solution at all
● If S is very large, A* may require excessive memory, and both it and DFBB may
require excessive running time
Ø In these cases DFFS and Greedy may be preferable
Example
● One rigid relation, adjacent
● State variables to represent each
location’s x and y coordinates:
Ø x = {(loc0, 2), (loc1, 0),
(loc2, 4), (loc3, 0), . . .}
Ø y = {(loc0, 4), (loc1, 3),
(loc2, 4), (loc3, 2), . . .}
● One planning operator:
action move(r, l, m)
Ø pre: adjacent(l, m), loc(r) = l
Ø eff: loc(r) ← m
● cost(move(r, loci, locj)) = √((xi − xj)² + (yi − yj)²)
= √((x(loci) − x(locj))² + (y(loci) − y(locj))²)
● hsld(si) = √((x(loci) − x(locg))² + (y(loci) − y(locg))²)
Ø Optimal solution to a relaxed problem in which the locations don’t
need to be adjacent
● Not CSV (the cost is given by a formula, not a constant)
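A sketch of this cost function and heuristic, using a few of the coordinates readable from the grid figure (treat the coordinate values as illustrative):

```python
from math import sqrt, isclose

# Coordinates of some locations, as state variables x and y
x = {"loc0": 2, "loc1": 0, "loc2": 4, "loc3": 0}
y = {"loc0": 4, "loc1": 3, "loc2": 4, "loc3": 2}

def cost(li, lj):
    # Euclidean distance between the two locations
    return sqrt((x[li] - x[lj])**2 + (y[li] - y[lj])**2)

def h_sld(li, lg):
    # straight-line distance: optimal solution to a relaxed
    # problem in which locations don't need to be adjacent
    return cost(li, lg)
```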
CSV Example
● s0 = {loc(r1) = d3, loaded(r1) = F, loc(c1) = d1}
● g = {loc(r1) = d3, loc(c1) = r1}
● action move(r, d, e)
Ø pre: loc(r) = d
Ø eff: loc(r) ← e
● action load(r, c, l)
Ø pre: loaded(r) = F, loc(c) = l, loc(r) = l
Ø eff: loaded(r) ← T, loc(c) ← r
● action unload(r, c, l)
Ø pre: loc(c) = r, loc(r) = l
Ø eff: loaded(r) ← F, loc(c) ← l
CSV Example
● Two applicable actions, with g = {loc(r1) = d3, loc(c1) = r1}:
● a1 = move(r1,d3,d1), giving s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1}
● a2 = move(r1,d3,d2), giving s2 = {loc(r1) = d2, loaded(r1) = F, loc(c1) = d1}
● In line 4, compute min(h(s1), h(s2))
Additive-Cost and Max-Cost Heuristics
● hadd(s) = Δadd(s,g), where
Ø Δadd(s,g) = Σ{Δadd(s,gi) | gi ∈ g}
Ø Δadd(s,gi) = 0 if s satisfies gi; otherwise min{cost(a) + Δadd(s, pre(a)) | a is an action whose effects include gi}
● Minimum cost of a “plan tree” to achieve g from s
Ø Pretend each element of g needs a completely separate plan
• If an action achieves i elements of g, it’s included i times
Ø Pretend each of an action’s preconditions needs a completely separate plan
Ø Thus, get a “plan” that’s a tree of actions
• Total cost can sometimes be much higher than h*(s)
● Can implement as a tree search, going backward from g
Additive-Cost Heuristic
● [Diagram: the plan tree for Δadd(s1,g), where s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1} and g = {loc(r1) = d3, loc(c1) = r1}]
● hadd(s1) = Δadd(s1,g) = sum(1,1) = 2
Additive-Cost Heuristic
● [Diagram: the plan tree for Δadd(s2,g), where s2 = {loc(r1) = d2, loaded(r1) = F, loc(c1) = d1} and g = {loc(r1) = d3, loc(c1) = r1}]
● hadd(s2) = Δadd(s2,g) = sum(1,2) = 3
Additive-Cost and Max-Cost Heuristics
● hmax(s) = Δmax(s,g), where
Ø Δmax(s,g) = max{Δmax(s,gi) | gi ∈ g}
Ø Δmax(s,gi) = 0 if s satisfies gi; otherwise min{cost(a) + Δmax(s, pre(a)) | a is an action whose effects include gi}
● Like hadd(s), but doesn’t add all of the costs in the plan tree
Ø Just the most costly path in the plan tree
● Guaranteed to be a lower bound on h*(s)
● Same tree search as for hadd(s)
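Both heuristics can be sketched via the recursive equations above, assuming unit action costs, atoms represented as strings, and a depth bound in place of proper memoization (all simplifications, not the book's implementation):

```python
# Delta_add / Delta_max: the cost of a goal set aggregates (sum
# or max) over its atoms; the cost of an atom is the cheapest
# achieving action plus the cost of that action's preconditions.
def delta(s, goals, actions, agg, depth=10):
    def goal_cost(gs, d):
        return agg([atom_cost(g, d) for g in gs]) if gs else 0
    def atom_cost(g, d):
        if g in s:
            return 0
        if d == 0:
            return float("inf")     # depth bound guards against cycles
        costs = [1 + goal_cost(pre, d - 1)
                 for pre, eff in actions if g in eff]
        return min(costs, default=float("inf"))
    return goal_cost(goals, depth)

def h_add(s, g, actions):
    return delta(s, g, actions, sum)

def h_max(s, g, actions):
    return delta(s, g, actions, max)
```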
Max-Cost Heuristic
● [Diagram: the plan tree for Δmax(s1,g), with g = {loc(r1) = d3, loc(c1) = r1}]
● hmax(s1) = Δmax(s1,g) = max(1,1) = 1
Max-Cost Heuristic
● [Diagram: the plan tree for Δmax(s2,g), with g = {loc(r1) = d3, loc(c1) = r1}]
● hmax(s2) = Δmax(s2,g) = max(1,2) = 2
Delete-Relaxation Heuristics
● Suppose a state s includes an assignment x = v
● Suppose an action a has an effect x ← w
● Then γ+(s, a) includes both x = v and x = w
● Relaxed state (or r-state)
Ø any set ŝ of state-variable values
Ø may include more than one value for each state variable
● ŝ r-satisfies a goal g if ŝ contains a subset that satisfies g
● An action a is r-applicable in ŝ if ŝ r-satisfies a’s preconditions
Ø In this case, γ+(ŝ,a) = ŝ ∪ γ(s,a)
● π = ⟨a1, …, an⟩ is r-applicable in ŝ0 if there are r-states ŝ1, ŝ2, …, ŝn such that
• a1 is r-applicable in ŝ0 and γ+(ŝ0,a1) = ŝ1
• a2 is r-applicable in ŝ1 and γ+(ŝ1,a2) = ŝ2
• …
Ø In this case, γ+(ŝ0,π) = ŝn
● π is a relaxed solution for P = (Σ, s0, g) if γ+(s0, π) r-satisfies g
● The name comes from classical planning ("don't do the deletion"): ignore negative effects such as ¬loc(r,l) in
move(r,l,m)
Precond: loc(r,l)
Effects: ¬loc(r,l), loc(r,m)
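A sketch of γ+ with r-states as sets of (variable, value) pairs, so an assignment keeps the old value alongside the new one; the move and load actions mirror the running example, the helper names are not the book's:

```python
# Delete relaxation: an r-state may hold several values per
# state variable, since effects add values without removing any.
def r_satisfies(rstate, assignments):
    return all((x, v) in rstate for x, v in assignments.items())

def gamma_plus(rstate, action):
    pre, eff = action
    assert r_satisfies(rstate, pre), "not r-applicable"
    return rstate | {(x, v) for x, v in eff.items()}

s0 = {("loc(r1)", "d3"), ("loaded(r1)", "F"), ("loc(c1)", "d1")}
move = ({"loc(r1)": "d3"}, {"loc(r1)": "d1"})
load = ({"loaded(r1)": "F", "loc(c1)": "d1", "loc(r1)": "d1"},
        {"loaded(r1)": "T", "loc(c1)": "r1"})
```

Applying move then load r-satisfies g = {loc(r1)=d3, loc(c1)=r1}, which is the two-step relaxed solution worked out on the next slides.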
Optimal Relaxed Solution Heuristic
● ∆+(ŝ, g) = minimum cost of any plan π such that γ+(ŝ, π) r-satisfies g
● Optimal relaxed solution heuristic: h+(s) = ∆+(s, g)
● Example, with g = {loc(r1) = d3, loc(c1) = r1}:
Ø ŝ1 = γ+(s0, move(r1,d3,d1)) = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1, loc(r1) = d3}
Ø ŝ2 = γ+(ŝ1, load(r1,c1,d1)) = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1, loc(r1) = d3, loaded(r1) = T, loc(c1) = r1}
Ø ŝ2 r-satisfies g, so ⟨move(r1,d3,d1), load(r1,c1,d1)⟩ is a relaxed solution
Ø It's optimal, so h+(s0) = 2
● Every solution is also a relaxed solution
Ø Thus h+ is admissible
Ø Problem: computing it is NP-hard
RPG Heuristic
● Relaxed planning graph (RPG) heuristic
Ø An approximation of h+ that’s easier to compute
● Based on the fact that γ+ doesn't depend on the order in which actions are applied
● If a1 and a2 are both applicable in s0 then γ+(s0, ⟨a1, a2⟩) = γ+(s0, ⟨a2, a1⟩) = s0 ∪ eff(a1) ∪ eff(a2)
RPG Heuristic
● Computation of RPG(s1,g), where s1 = {loc(r1) = d1, loc(c1) = d1, loaded(r1) = F} and g = {loc(r1) = d3, loc(c1) = r1}
Ø First line of RPG assigns ŝ0 = s1, A0 = ∅
Ø A1 = {move(r1,d1,d3), move(r1,d1,d2), load(r1,c1,d1)}
Ø ŝ1 = ŝ0 ∪ {loc(r1) = d3, loc(r1) = d2, loaded(r1) = T, loc(c1) = r1}
● RPG(s1,g) returns 2
RPG Heuristic
● Computation of RPG(s2,g), where s2 = {loc(r1) = d2, loc(c1) = d1, loaded(r1) = F} and g = {loc(r1) = d3, loc(c1) = r1}
Ø First line of RPG assigns ŝ0 = s2, A0 = ∅
Ø A1 = {move(r1,d2,d3), move(r1,d2,d1)}
Ø ŝ1 = ŝ0 ∪ {loc(r1) = d3, loc(r1) = d1}
Ø A2 = {move(r1,d3,d1), move(r1,d3,d2), move(r1,d1,d2), move(r1,d1,d3), load(r1,c1,d1), move(r1,d2,d3), move(r1,d2,d1)}
Ø ŝ2 = ŝ1 ∪ {loc(c1) = r1, loaded(r1) = T}
● RPG(s2,g) returns 3
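An RPG-style computation can be sketched as follows (a simplified version, not the book's RPG pseudocode): a forward phase grows the r-state layer by layer, then a backward phase extracts a relaxed plan and returns its size, which reproduces the values 2 and 3 computed above:

```python
# Atoms are strings such as "loc(r1)=d1"; actions are
# (name, preconditions, effects) with pre/eff as frozensets.
def rpg_heuristic(s, goal, actions):
    reached = set(s)                 # the r-state s-hat_0
    achiever = {}                    # atom -> (action name, its preconditions)
    while not goal <= reached:
        new = {}
        for name, pre, eff in actions:
            if pre <= reached:                       # r-applicable
                for e in eff:
                    if e not in reached and e not in new:
                        new[e] = (name, pre)         # first achiever wins
        if not new:
            return float("inf")      # g is never r-satisfied
        achiever.update(new)
        reached |= set(new)
    # backward phase: back-chain from g, collecting needed actions
    plan, stack = set(), list(goal)
    while stack:
        atom = stack.pop()
        if atom in achiever:         # atoms true in s need no achiever
            name, pre = achiever[atom]
            if name not in plan:
                plan.add(name)
                stack.extend(pre)
    return len(plan)
```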
Landmark Heuristics
● Let P = (Σ,s0,g) be a planning problem
● Let φ = φ1 ∨ ... ∨ φm be a disjunction of state-variable assignments
● Definition: φ is a landmark for P if φ is true at some point in every solution plan
of P
● Example landmarks, for g = {loc(r1) = d3, loc(c1) = r1}:
Ø a complete state, e.g., s0
Ø a single state-variable assignment, e.g., loc(r1) = d1
Ø a disjunction of state-variable assignments, e.g., loc(r1) = d1 ∨ loc(r1) = d2
Why are Landmarks Useful?
● Help in breaking down the given problem into smaller subproblems
● Every solution to P has to achieve these landmarks
● Possible strategy:
Ø find a plan that takes us from s0 to any state s1 that satisfies lm1
Ø find a plan that takes us from s1 to any state s2 that satisfies lm2
Ø …
[Figure: a solution path from s0 to g passing through landmarks lm1, lm2, lm3, decomposing P into subproblems P1, P2, P3, P4]
Computing Landmarks
● Question: How do we compute landmarks for a problem P?
● Not easy:
Ø Deciding whether a state-variable assignment φ is a landmark is PSPACE-hard in the worst case
Ø To put that in perspective: it's as hard as solving the planning problem itself!
● However, all is not lost:
Ø There are often useful landmarks that can be found more easily
Ø There are polynomial-time procedures that can compute these landmarks
Ø Going to see one such procedure based on Relaxed Planning Graphs
● Why Relaxed Planning Graphs?
Ø Relaxed planning problems are easier to solve, so computing landmarks for them is easier
Ø A landmark for a relaxed planning problem is also a landmark for the original planning problem
RPG-based Landmark Computation
● Main intuition: if a state-variable assignment φ is a landmark, then the disjunction of the preconditions of the actions that achieve φ is also a landmark
Ø In other words: we start from known landmarks and discover new ones by looking at the preconditions of the actions that achieve them
● Example:
Ø g is the goal
Ø Actions a1 and a2 can achieve g
Ø Therefore, either p1∧q or p2∧q
must be true to be able to achieve g
Ø In other words: (p1∧q)∨(p2∧q) is
a landmark
Ø By rearranging terms: we get
(p1∧q)∨(p2∧q) ≅ q∧(p1∨p2)
Ø Since q∧(p1∨p2) is a landmark, both q and (p1∨p2) are landmarks
● In practice, we try to rearrange assignments in a similar manner to group terms
with the same state variable together
[Figure: actions a1 (preconditions p1 and q) and a2 (preconditions p2 and q) both achieve g]
RPG-based Landmark Computation
● Question: What landmarks can we start with?
Ø Every goal is trivially a landmark; can start from there
● E.g., loc(r1) = d3 is a landmark
Ø Two actions achieve this landmark: move(r1,d1,d3) and move(r1,d2,d3)
Ø Can infer a new landmark that is the disjunction of the preconditions of these two actions:
φ′ = (loc(r1) = d1) ∨ (loc(r1) = d2)
action move(r, d, e)
pre: loc(r) = d
eff: loc(r) ← e
ComputeLandmark(s0, g = g1 ∧ g2 ∧ … ∧ gk)
● queue = {g1, g2, …, gk}
● While queue is not empty:
Ø Remove a gi from queue
Ø Ai = {all actions that can achieve gi}
Ø Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo
Ø act(gi) = {all actions in Ai that are r-applicable in the r-state resulting from foo}
Ø φ = ∅; for each action in act(gi):
• Pick a precondition not satisfied in s0 and add it to φ
Ø The resulting disjunction φ is a landmark; add it to queue
[Figure: actions a1, a2, a3 achieve gi, with preconditions p1∧q1, p2∧q2, p3∧q3 respectively]
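The ComputeLandmark procedure above can be sketched as runnable code. This is my own illustrative implementation, not the authors': facts are (variable, value) pairs, a landmark is a frozenset of facts read as a disjunction, and the helper names (`r_reachable`, `compute_landmarks`) are assumptions. It reproduces the slides' dock example.

```python
# Sketch of the slides' RPG-based landmark procedure (illustrative, simplified).
from collections import namedtuple

Action = namedtuple("Action", "name pre eff")   # pre, eff: frozensets of facts

def r_reachable(s0, actions, excluded):
    """Final r-state of the RPG grown from s0 without the excluded actions
    (delete-relaxed reachability: effects accumulate, nothing is deleted)."""
    facts = set(s0)
    changed = True
    while changed:
        changed = False
        for a in actions:
            if a.name not in excluded and a.pre <= facts and not a.eff <= facts:
                facts |= a.eff
                changed = True
    return facts

def compute_landmarks(s0, goal, actions):
    queue = [frozenset([g]) for g in goal]      # every goal is trivially a landmark
    landmarks, seen = list(queue), set(queue)
    while queue:
        lm = queue.pop()
        if any(f in s0 for f in lm):
            continue                            # already true in s0: no use expanding
        achievers = [a for a in actions if a.eff & lm]
        r_state = r_reachable(s0, actions, {a.name for a in achievers})
        act = [a for a in achievers if a.pre <= r_state]   # r-applicable achievers
        phi = set()
        for a in act:
            unsat = [p for p in a.pre if p not in s0]
            if not unsat:                       # an achiever is applicable in s0,
                phi = None                      # so the disjunction is trivially true
                break
            phi.add(unsat[0])                   # pick one unsatisfied precondition
        if phi:
            phi = frozenset(phi)
            if phi not in seen:
                seen.add(phi)
                landmarks.append(phi)
                queue.append(phi)
    return landmarks

# The slides' example: robot r1 at dock d3, container c1 at dock d1.
docks = ["d1", "d2", "d3"]
moves = [Action(f"move({d},{e})", frozenset({("loc(r1)", d)}),
                frozenset({("loc(r1)", e)})) for d in docks for e in docks if d != e]
loads = [Action(f"load({l})",
                frozenset({("loaded(r1)", "nil"), ("loc(c1)", l), ("loc(r1)", l)}),
                frozenset({("loaded(r1)", "c1"), ("loc(c1)", "r1")})) for l in docks]
s0 = {("loc(r1)", "d3"), ("loc(c1)", "d1"), ("loaded(r1)", "nil")}
goal = [("loc(c1)", "r1"), ("loc(r1)", "d3")]

lms = compute_landmarks(s0, goal, moves + loads)
print(lms)   # the two goals, plus the discovered landmark loc(r1) = d1
```

On this example the procedure finds the same new landmark the later slides derive: loc(r1) = d1, the only precondition of load(r1,c1,d1) not already satisfied in s0.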
72. 72
RPG-based Landmark Computation
[Figure: actions a1, a2, a3 achieve gi with preconditions p1∧q1, p2∧q2, p3∧q3; the procedure produces p3∨q1: a new landmark]
73. 73
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
[Figure: s0 — r1 at d3, c1 at d1; g — c1 loaded on r1 at d3. The goal landmarks loc(r1) = d3 and loc(c1) = r1 initialize the queue]
74. 74
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø loc(r1) = d3 is true in the current state, so it is always going to be true in the RPG; no use expanding it
75. 75
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø 3 actions achieve loc(c1) = r1: load(r1,c1,d1), load(r1,c1,d2), load(r1,c1,d3)
action load(r, c, l)
pre: loaded(r) = nil, loc(c) = l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r
77. 77
RPG-based Landmark Computation
Relaxed Planning Graph foo, built from ŝ0 without the load actions:
ŝ0 = { loc(c1) = d1, loc(r1) = d3, loaded(r1) = nil }
A1 = { move(r1,d3,d1), move(r1,d3,d2) }
ŝ1 = { loc(r1) = d1, loc(r1) = d2, loc(c1) = d1, loc(r1) = d3, loaded(r1) = nil }
Ø Of load(r1,c1,d1), load(r1,c1,d2), load(r1,c1,d3), only the first load action is r-applicable in the r-state at the end of foo
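The level-by-level RPG construction on this slide can be sketched directly. This is an illustrative fragment of my own (variable names are assumptions): it grows the relaxed levels from ŝ0 using only the move actions, excluding the loads (the achievers of loc(c1) = r1), then checks which load action is r-applicable in the final r-state.

```python
# Sketch of the slide's RPG computation, excluding the achievers (the loads).
s_hat0 = {("loc(r1)", "d3"), ("loc(c1)", "d1"), ("loaded(r1)", "nil")}

docks = ["d1", "d2", "d3"]
# move(r1, d, e): pre loc(r1)=d, eff loc(r1)<-e   (the only actions in the RPG)
moves = [({("loc(r1)", d)}, {("loc(r1)", e)}) for d in docks for e in docks if d != e]
# load(r1, c1, l): pre loaded(r1)=nil, loc(c1)=l, loc(r1)=l   (excluded from the RPG)
load_pre = {l: {("loaded(r1)", "nil"), ("loc(c1)", l), ("loc(r1)", l)} for l in docks}

levels = [set(s_hat0)]
while True:
    nxt = set(levels[-1])
    for pre, eff in moves:
        if pre <= levels[-1]:
            nxt |= eff            # relaxed semantics: effects accumulate, no deletes
    if nxt == levels[-1]:         # fixpoint reached: the RPG is complete
        break
    levels.append(nxt)

final = levels[-1]                # the r-state at the end of the RPG
applicable = [l for l, pre in load_pre.items() if pre <= final]
print(applicable)                 # only load(r1,c1,d1) is r-applicable
```

The fixpoint is reached after one expansion (ŝ1 adds loc(r1) = d1 and loc(r1) = d2), and only load(r1,c1,d1) passes the applicability check, matching the slide.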
78. 78
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø Only load(r1,c1,d1) is applicable in the final level of the RPG; its preconditions are used to generate new landmarks
action load(r, c, l)
pre: loaded(r) = nil, loc(c) = l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r
79. 79
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø New landmarks from the preconditions of load(r1,c1,d1): loaded(r1) = nil, loc(c1) = d1, loc(r1) = d1
Ø These new landmarks are added to the queue
80. 80
Landmark Heuristic
● Every solution to the problem needs to achieve all the computed landmarks
● One possible heuristic to estimate the distance of state s from g:
Ø the number of landmarks still to be achieved from s
Ø biases the planner towards actions that achieve landmarks
● Is this heuristic admissible?
81. 81
Landmark Heuristic
● Every solution to the problem needs to achieve all the computed landmarks
● One possible heuristic to estimate the distance of state s from g:
Ø the number of landmarks still to be achieved from s
Ø biases the planner towards actions that achieve landmarks
● Is this heuristic admissible? No:
[Figure: a single action a applied in s0 achieves both g1 and g2, where g = g1∧g2]
Number of landmarks: |{g1, g2}| = 2
Optimal plan length: |⟨a⟩| = 1
● A number of more advanced landmark-based heuristics have been developed (including admissible ones)
Ø Check textbook for references
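The landmark-count heuristic and the inadmissibility example above can be sketched in a few lines. This is my own minimal illustration (names are assumptions): it counts the landmarks not yet achieved, and shows that when one action achieves both goal landmarks at once, the count overestimates the optimal plan length.

```python
# Sketch of the landmark-count heuristic and the slide's inadmissibility example.

def landmark_count(state, landmarks):
    """Number of landmarks not yet achieved in `state`. Each landmark is a
    set of facts read as a disjunction. Ignoring interactions between
    landmarks means the count can overestimate: it is NOT admissible."""
    return sum(1 for lm in landmarks if not (lm & state))

s0 = frozenset()                                   # neither goal holds yet
landmarks = [frozenset({"g1"}), frozenset({"g2"})]  # goals are trivial landmarks

h = landmark_count(s0, landmarks)   # counts 2 remaining landmarks

# A single action a achieves g1 and g2 together, so the optimal plan
# length is 1, but h(s0) = 2 > 1: the heuristic overestimates.
s1 = s0 | {"g1", "g2"}
print(h, landmark_count(s1, landmarks))
```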