Dana Nau and Vikas Shivashankar: Lecture slides for Automated Planning and Acting
Updated 3/24/15
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Chapter 2
Deliberation with Deterministic Models
Dana S. Nau and Vikas Shivashankar
University of Maryland
Purpose of this Chapter
● Last time, Vikas mentioned conventional AI planning
Ø Given
• a domain model (descriptions of the states and actions)
• initial state s0, and goal g
Ø Find a plan or a policy that
• is executable starting in s0
• produces a state that satisfies g
● This chapter discusses some techniques for doing that
Ø Also, how to use those techniques in acting systems
● Outline:
Ø 2a: Representing planning domains
Ø 2b: Planning algorithms
Ø 2c: Acting and planning
2a: Representing Planning Domains
Domain Model
● Planning domain: an abstract model of the environment
Ø Many different kinds of environments, various ways to model them
● In this chapter, the model is a deterministic state-transition system
Ø Σ = (S,A,γ)
• S is a finite set of states
▸ States of the world
• A is a finite set of actions
▸ Things an actor can do
• γ: S × A → S is a prediction function (or state-transition function)
▸ Given a state s and action a, γ(s,a) is another state
• Prediction of what state will be produced by executing a in s
• γ is partial: γ(s,a) is undefined if a is inapplicable in s
▸ Dom(a) = {s ∈ S | γ(s,a) is defined} = {states where a is applicable}
▸ Range(a) = {γ(s,a) | s ∈ Dom(a)}
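As a concrete illustration (a sketch, not the book's code, with made-up state and action names), a small Σ can be stored as a lookup table, with γ partial so that missing entries mean "action inapplicable":

```python
# A deterministic state-transition system Sigma = (S, A, gamma),
# with the prediction function gamma stored as a lookup table.
S = {"s0", "s1", "s2"}
A = {"a1", "a2"}
gamma = {("s0", "a1"): "s1", ("s1", "a2"): "s2", ("s0", "a2"): "s0"}

def applicable(s, a):
    # gamma is partial: a is applicable in s iff gamma(s, a) is defined
    return (s, a) in gamma

def dom(a):
    # Dom(a) = {states where a is applicable}
    return {s for (s, b) in gamma if b == a}

def range_(a):
    # Range(a) = {gamma(s, a) | s in Dom(a)}
    return {gamma[(s, a)] for (s, b) in gamma if b == a}
```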
Implicit Assumptions
● The state-transition model incorporates the following assumptions
Ø Static world
• Changes occur only in response to the actor’s actions
Ø Perfect information
• Actor always has all the information it needs
Ø Instantaneous actions
• Each action causes an instantaneous transition from one state to the next
Ø Determinism
• Actions are deterministic
Ø Correct prediction function
• Outcome of action a in state s is always γ(s,a)
Ø Flat search space
• Only one level of abstraction; ignore how to refine actions at a lower level
How to Represent Σ?
● If the domain is small enough
Ø Give each state and action a
unique name
Ø For each s and a, store γ(s,a)
in a lookup table
[Figure: a grid map with locations loc0 through loc9 at various (x, y) coordinates]
● If a domain is larger, don’t represent all states explicitly
Ø Have a formalism for describing states by describing their properties
Ø Represent each action by describing how it changes those properties
Ø Start with initial state, use actions to produce other states
Deterministic Operator (General Form)
● Domain-specific format for representing states
Ø Invent your own format
● General form of a deterministic operator:
Ø o = (head(o), pre(o), eff(o), cost(o))
• head(o): name and parameter list
• pre(o): preconditions
▸ Computational tests to predict whether an action can be performed in
a state s
▸ In principle, should be necessary/sufficient for the action to run
without error
• eff(o): effects
▸ Procedures that assign new values to some of the state variables
• cost(o): procedure that returns a number
▸ Can be omitted, in which case cost(o) = 1
▸ Could represent monetary cost, time required, something else
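A minimal sketch of this general operator form, assuming states are Python dicts and using a simplified drill-hole-like action; the helper names, state variables, and tests are illustrative, not the book's:

```python
# General operator form o = (head(o), pre(o), eff(o), cost(o)),
# where pre/eff/cost are arbitrary procedures over states.
def make_action(head, pre, eff, cost=lambda s: 1):
    # cost may be omitted, in which case cost(o) = 1
    return {"head": head, "pre": pre, "eff": eff, "cost": cost}

def gamma(s, a):
    """Prediction function: returns None if a is inapplicable in s."""
    if not a["pre"](s):
        return None
    s2 = dict(s)
    a["eff"](s2)          # effects assign new values to state variables
    return s2

# A much-simplified drilling action (hypothetical state variables)
drill = make_action(
    head="drill-hole",
    pre=lambda s: s["clamped"] and s["bit_loaded"],
    eff=lambda s: s.update({"has_hole": True}),
    cost=lambda s: 5.0,   # e.g., an estimate of machining time
)
```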
Example
● Suppose we want to plan how to create a metal hole in the workpiece
● a state s includes
Ø geometric model of the workpiece, variables describing its location,
orientation, and other status information,
Ø capabilities and status of drilling machine and drill
● Several actions (getting the workpiece onto the machine, clamping it, loading a
drill bit, etc.)
Ø Next slide: the drilling operation itself
Drilling Action
● head(o) = drill-hole(m, w, b, x1, …, xn)
Ø m, w, b are the names or ID numbers of the drilling machine, the workpiece,
and the drill bit
Ø x1, …, xn are specifications of the hole’s location, orientation, depth, and
required machining tolerances
● pre(o): computational tests
Ø Can the drilling machine and drill bit produce a hole having the desired
geometry and machining tolerances?
Ø Is the drill loaded into the drilling machine? Is the workpiece properly
clamped onto the drilling platform? Etc.
● eff(o): geometric computation
Ø Modify the geometric model of the workpiece to include a hole having the
desired specifications.
● cost(o)
Ø could be an estimate of the time required for the drilling operation
Ø could be a cost estimate based on time + other criteria
State-Variable Representation
● Represent each state as a collection of properties of various objects
Ø Represent each action by describing how it changes those properties
● Let B = {all objects that matter for planning}
Ø B may be classified into subsets: various kinds of objects
● Example
Ø B = Robots ∪ Containers ∪ Locs ∪ Booleans
• Robots = {r1, r2}
• Containers = {c1, c2}
• Locs = {d1, d2, d3}
• Booleans = {T, F}

[Figure: robots r1 and r2, containers c1 and c2, and loading docks d1, d2, d3]
Properties of Objects
● Define ways to represent properties of objects
Ø Two kinds of properties: rigid and varying
● A property is rigid if it stays the same in every state
Ø Represent as a mathematical relation
● Example:
Ø adjacent = {(d1,d2), (d2,d1)}
Ø Can also write as
• adjacent(d1,d2)
• adjacent(d2,d1)
Varying Properties
● A property is varying if it may differ in different states
Ø Represent using a state variable that we can assign a value to
Ø Each state variable x has a range (set of possible values), Range(x)
Ø For each state s, s(x) ∈ Range(x) is x’s value in state s
● Example:
Ø What we want to represent:
• Each robot can hold at most one container
• Each robot is at one of the locations
• Each container is on a robot or at one of the locations
A Simple Example
● Set of all state variables
Ø X = {loc(r1), loaded(r1), loc(c1), loc(r2), loaded(r2), loc(c2)}
● Range(loc(r1)) = Locs
● Range(loc(r2)) = Locs
● Range(loaded(r1)) = Booleans
● Range(loaded(r2)) = Booleans
● Range(loc(c1)) = Robots ∪ Locs
● Range(loc(c2)) = Robots ∪ Locs
Why have both loc and loaded?
States as Functions
● Write each state as a function s that maps each x ∈ X to a value in Range(x)
s1(loc(r1)) = d1, s1(loaded(r1)) = F, s1(loc(c1)) = d1,
s1(loc(r2)) = d2, s1(loaded(r2)) = F, s1(loc(c2)) = d2
Ø Let S = the set of all such functions
● Functions are sets of ordered pairs
s1 = {(loc(r1), d1), (loaded(r1), F), (loc(c1), d1), …}
● Rewrite as
s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1,
loc(r2) = d2, loaded(r2) = F, loc(c2) = d2}
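For instance, the state s1 can be written as a Python dict, i.e., a set of ordered pairs (an illustrative sketch; variable names are encoded as strings, and F as False):

```python
# A state as a function from state variables to values,
# implemented as a dict (equivalently, a set of ordered pairs).
s1 = {"loc(r1)": "d1", "loaded(r1)": False, "loc(c1)": "d1",
      "loc(r2)": "d2", "loaded(r2)": False, "loc(c2)": "d2"}

# Range(x) for a couple of the variables (hypothetical encoding)
RANGE = {"loc(r1)": {"d1", "d2", "d3"}, "loaded(r1)": {True, False}}

def is_well_typed(s, ranges):
    # every listed variable's value lies in Range(x)
    return all(s[x] in ranges[x] for x in ranges)
```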
Discussion
● Just because a function maps each state variable x into a value in Range(x),
this doesn't make it a state
Ø Some sets of state-variable values may not make any sense as states
s = {loaded(r1) = F, loc(c1) = r1, …}
s = {loc(c1) = d1, loc(c2) = r1, …}
● Need restrictions on what sets of state-variable values constitute real states
Ø In most of the book, we won’t represent such restrictions explicitly
Ø Just write the action models in such a way that none of them will ever
produce such a set of assignments
Descriptive Action Models
● Actions often fall into closely related classes
● For each class, write a parameterized schema called a planning operator to
describe all the actions in that class
action move(r, l, m)
Pre: loc(r) = l, adjacent(l, m)
Eff: loc(r) ← m
action take(r, l, c)
Pre: loaded(r) = F, loc(r) = l, loc(c) = l
Eff: loaded(r) ← T, loc(c) ← r
action put(r, l, c)
Pre: loc(r) = l, loc(c) = r
Eff: loaded(r) ← F, loc(c) ← l
Each parameter has a range of possible values, e.g., Range(r) = Robots
CSV Operator
● Classical State Variable (CSV) Operator:
• the kind of operator shown on the previous page
● General form
Ø o = (head(o), pre(o), eff(o), cost(o))
• each precondition in pre(o) must have one of these forms:
▸ relname(t1,…,tk) or ¬relname(t1,…,tk)
▸ varname(t1,…,tk) = t0 or varname(t1,…,tk) ≠ t0
• relname = name of a rigid relation
• varname = name of a state variable
• each ti must be a constant (i.e., a member of B) or a parameter
• each effect in eff(o) must have the form varname(t1,…,tk) ← t0
• if cost(o) isn’t omitted, it must be a nonnegative number (not a formula)
● Limited representational capability, but easy to compute, easy to reason about
Ø Many algorithms have been written to use this kind of operator
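The operator-schema idea of the last two slides can be sketched as follows, assuming states are dicts of state-variable assignments; the move operator and the adjacent relation are from the slides, but the helper names and the grounding strategy are illustrative:

```python
from itertools import product

# Ground CSV actions: instances of the move(r, l, m) schema,
# generated by assigning values to parameters. 'adjacent' is a
# rigid relation, so it is checked at grounding time.
ROBOTS = ["r1", "r2"]
LOCS = ["d1", "d2", "d3"]
ADJACENT = {("d1", "d2"), ("d2", "d1"), ("d1", "d3"), ("d3", "d1")}

def move_instances():
    """Yield (name, preconditions, effects) for each ground action."""
    for r, l, m in product(ROBOTS, LOCS, LOCS):
        if (l, m) in ADJACENT:
            yield (f"move({r},{l},{m})",
                   {f"loc({r})": l},     # pre: loc(r) = l
                   {f"loc({r})": m})     # eff: loc(r) <- m

def applicable(s, pre):
    return all(s.get(x) == v for x, v in pre.items())

def apply_effects(s, eff):
    s2 = dict(s)
    s2.update(eff)        # each effect assigns a new value
    return s2
```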
CSV Actions
● A CSV action is an instance
of a planning operator
Ø assign values to parameters
action move(r1,d1,d2)
Pre: loc(r1) = d1, adjacent(d1,d2)
Eff: loc(r1) ← d2

action take(r2,d2,c2)
Pre: loaded(r2) = F, loc(r2) = d2, loc(c2) = d2
Eff: loaded(r2) ← T, loc(c2) ← r2

action put(r1,d1,c1)
Pre: loc(r1) = d1, loc(c1) = r1
Eff: loaded(r1) ← F, loc(c1) ← d1
Plans
● Plan: a sequence of actions
π = 〈a1,a2,…,an〉
Ø π is applicable in s0 if the actions
can be applied in the order given
• γ (s0,a1) = s1, γ (s1,a2) = s2, …, γ (sn–1,an) = sn
● If π is applicable, define γ (s0,π) = sn
● Example
π = 〈take(r2,d2,c2),
move(r2,d2,d1),
put(r2,d1,c2)〉
[Figures: s0, with r2 and c2 at d2; and γ(s0,π), with r2 and c2 at d1]
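A sketch of plan application, under the assumption that actions are (preconditions, effects) pairs of state-variable assignments; it reproduces the example plan above:

```python
# gamma chained over a plan pi = <a1,...,an>; returns None if
# some action along the way is inapplicable.
def gamma(s, action):
    pre, eff = action
    if any(s.get(x) != v for x, v in pre.items()):
        return None
    s2 = dict(s)
    s2.update(eff)
    return s2

def gamma_plan(s, plan):
    for a in plan:
        s = gamma(s, a)
        if s is None:
            return None    # pi is not applicable in the initial state
    return s

# The example plan's three actions, ground as on the slide
take = ({"loc(r2)": "d2", "loc(c2)": "d2", "loaded(r2)": False},
        {"loaded(r2)": True, "loc(c2)": "r2"})
move = ({"loc(r2)": "d2"}, {"loc(r2)": "d1"})
put  = ({"loc(r2)": "d1", "loc(c2)": "r2"},
        {"loaded(r2)": False, "loc(c2)": "d1"})

s0 = {"loc(r2)": "d2", "loc(c2)": "d2", "loaded(r2)": False}
```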
Planning Problems, Solutions
● Planning problem: P=(Σ,s0,g)
Ø Σ = planning domain
Ø s0 = initial state
Ø g = {g1,…,gk} is a set of constraints called the goal
● π is a solution for P if γ (s0,π) satisfies g
Ø π is a shortest solution if no shorter plan is also a solution
Ø π is a minimal solution if no proper subsequence of π is also a solution
● CSV planning problem:
Ø Σ is a CSV planning domain,
Ø g has the same form as a CSV operator’s preconditions
● Example:
Ø s0 and π as on previous page
Ø g = {loc(r2)=d1, loc(c2)=d1}
Classical and State-Variable Representations
● Motivation
Ø The field of AI planning started out as automated theorem proving
Ø It still uses a lot of that notation
● Classical representation
Ø Equivalent to CSV representation
Ø Represents both rigid and varying properties using logical predicates
• adjacent(l,m) - location l is adjacent to location m
• loc(r,l) - robot r is at location l
• loc(c,l), loc(c,r) - container c is at location l or on robot r
• loaded(r) - there is a container on robot r
States
● Use ground atoms to represent both rigid and varying properties of Σ
● To represent a state
Ø s = {all ground atoms that are true in the state}
Ø e.g., s1 = {loc(c1,d1), loc(c2,d2), loc(r1,d1), loc(r2,d2),
adjacent(d1,d2), adjacent(d2,d1), adjacent(d1,d3), adjacent(d3,d1)}
Classical Planning Operators
● Classical planning operator:
Ø o = (head(o), pre(o), eff(o))
● Each precondition and effect
must have one of these forms:
pred(t1,…,tk) ¬pred(t1,…,tk)
Ø pred = predicate name
Ø each ti must be a constant
(i.e., member of B) or
a parameter
● Classical action: a ground
instance of a classical operator
move(r,l,m)
Precond: loc(r,l)
Effects: ¬loc(r,l), loc(r,m)

take(r,l,c)
Precond: loc(r,l), loc(c,l), ¬loaded(r)
Effects: loc(c,r), ¬loc(c,l), loaded(r)

put(r,l,c)
Precond: loc(r,l), loc(c,r)
Effects: loc(c,l), ¬loc(c,r), ¬loaded(r)
Discussion
● Classical representation is equivalent to state-variable representation in
expressive power
Ø Each can be converted to the other in linear time and space
● Classical representation is more natural for logicians
● CSV is more natural for engineers and most computer scientists
Ø When changing a value, you don’t have to explicitly delete the old one
● Historically, classical representation has been more widely used
Ø Many of the algorithms in the book were originally written to use
classical representation
Ø That’s starting to change
Classical rep. to CSV rep.: P(b1,…,bk) becomes xP(b1,…,bk) = 1
CSV rep. to classical rep.: x(b1,…,bn) = b0 becomes Px(b1,…,bn,b0)
2b: Planning Algorithms
Motivation
● Nearly all planning procedures are search procedures
● Different planning procedures have different search spaces
Ø This section: state-space planning
• Each node represents a state of the world
▸ A plan is a path through the space
Forward Search
● Which forward-search algorithm you get depends on how you implement the nondeterministic choice
● I'll discuss several such algorithms
Depth-First Search
● At each state, select the action that has the lowest heuristic-function value
● Visited is for cycle-checking
Ø If you come to a state you’ve already seen on the current path, then backtrack
Ø In a finite domain, this guarantees termination
● Guaranteed to find a solution if one exists
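A sketch of this depth-first variant, assuming hashable states and a heuristic h; the visited argument plays the role of Visited, and actions are tried in order of increasing heuristic value of the resulting state:

```python
# Depth-first forward search with cycle-checking: backtrack if a
# state recurs on the current path; in a finite domain this
# guarantees termination.
def dfs(s, goal, successors, h, visited=frozenset()):
    if goal(s):
        return []
    if s in visited:
        return None                 # cycle on this path: backtrack
    for a, s2 in sorted(successors(s), key=lambda p: h(p[1])):
        rest = dfs(s2, goal, successors, h, visited | {s})
        if rest is not None:
            return [a] + rest
    return None                     # all choices failed: backtrack
```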
Greedy Search
● Like DFFS, but never backtracks
Ø Not guaranteed to find a solution
A*
● Requires a heuristic function in line 3
Ø Chooses a node (π,s) in Fringe having the smallest value for cost(π) + h(s)
Ø Expands the node (computes nodes for all applicable actions)
Ø Prunes nodes that can be shown to be no better than nodes already expanded
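A sketch of A* along these lines; the pruning here keeps the best known cost per state, a common simplification of the pruning rule the slide describes, and successors is assumed to return (action, next state, cost) triples:

```python
import heapq
from itertools import count

# A*: nodes are (plan, state); the fringe is a priority queue
# ordered by f = cost(plan) + h(state).
def astar(s0, goal, successors, h):
    tie = count()                   # tie-breaker so the heap never compares states
    fringe = [(h(s0), 0, next(tie), [], s0)]   # (f, cost, _, plan, state)
    best_cost = {}
    while fringe:
        f, g, _, plan, s = heapq.heappop(fringe)
        if goal(s):
            return plan
        if best_cost.get(s, float("inf")) <= g:
            continue                # pruned: already reached at least as cheaply
        best_cost[s] = g
        for a, s2, c in successors(s):
            heapq.heappush(fringe,
                           (g + c + h(s2), g + c, next(tie), plan + [a], s2))
    return None
```

With an admissible h this returns an optimal solution, matching the property stated on the next slide.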
Properties of A*
● h is admissible if for every s, 0 ≤ h(s) ≤ h*(s)
Ø where h*(s) = least cost of getting from s to a state that satisfies g
● If h is admissible then A* is guaranteed to return an optimal solution
● Inadmissible heuristics might get you to a solution faster, but the solution won't necessarily be optimal
Depth-First Branch and Bound
● Depth-first search with heuristic function and pruning
Ø π* and c*: least-cost solution found so far, and the cost of that solution
Ø Any time you find a solution with lower cost, update π* and c*
Ø Prune any plan π such that cost(π) + h(s) ≥ c*
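A sketch of DFBB under the same assumptions as the A* sketch above (successors returns (action, next state, cost) triples; names are illustrative):

```python
# Depth-first branch and bound: keep the best solution found so
# far (pi*, c*), and prune any node with cost + h(s) >= c*.
def dfbb(s0, goal, successors, h):
    best = {"plan": None, "cost": float("inf")}
    def search(s, plan, cost, visited):
        if cost + h(s) >= best["cost"]:
            return                            # pruned
        if goal(s):
            best["plan"], best["cost"] = plan, cost   # update pi*, c*
            return
        for a, s2, c in successors(s):
            if s2 not in visited:             # cycle-checking
                search(s2, plan + [a], cost + c, visited | {s2})
    search(s0, [], 0, {s0})
    return best["plan"], best["cost"]
```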
Discussion
● If the state space is not too large, then A* or DFBB may be preferable
Ø They are guaranteed to return optimal solutions
● DFFS returns the first solution it finds
Ø can be arbitrarily far from optimal
● Greedy isn’t guaranteed to return a solution at all
● If S is very large, A* may require excessive memory, and both it and DFBB may
require excessive running time
Ø In these cases DFFS and Greedy may be preferable
Example
● One rigid relation, adjacent
● State variables to represent each
location’s x and y coordinates:
Ø x = {(loc0, 2), (loc1, 0),
(loc2, 4), (loc3, 0), . . .}
Ø y = {(loc0, 4), (loc1, 3),
(loc2, 4), (loc3, 2), . . .}
● One planning operator:
action move(r, l, m)
Ø pre: adjacent(l, m), loc(r) = l
Ø eff: loc(r) ← m
● cost(move(r, loci, locj)) = √((xi − xj)² + (yi − yj)²)
= √((x(loci) − x(locj))² + (y(loci) − y(locj))²)
● hsld(si) = √((x(loci) − x(locg))² + (y(loci) − y(locg))²)
Ø Optimal solution to a relaxed problem in which the locations don’t
need to be adjacent
● Not CSV (the cost is given by a formula, not a constant)
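A sketch of this cost function and heuristic, using a few of the coordinates readable from the grid figure (treat the coordinate values as illustrative):

```python
from math import sqrt, isclose

# Coordinates of some locations, as state variables x and y
x = {"loc0": 2, "loc1": 0, "loc2": 4, "loc3": 0}
y = {"loc0": 4, "loc1": 3, "loc2": 4, "loc3": 2}

def cost(li, lj):
    # Euclidean distance between the two locations
    return sqrt((x[li] - x[lj])**2 + (y[li] - y[lj])**2)

def h_sld(li, lg):
    # straight-line distance: optimal solution to a relaxed
    # problem in which locations don't need to be adjacent
    return cost(li, lg)
```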
CSV Example
● s0 = {loc(r1) = d3, loaded(r1) = F, loc(c1) = d1}
● g = {loc(r1) = d3, loc(c1) = r1}
● action move(r, d, e)
Ø pre: loc(r) = d
Ø eff: loc(r) ← e
● action load(r, c, l)
Ø pre: loaded(r) = F, loc(c) = l, loc(r) = l
Ø eff: loaded(r) ← T, loc(c) ← r
● action unload(r, c, l)
Ø pre: loc(c) = r, loc(r) = l
Ø eff: loaded(r) ← F, loc(c) ← l
CSV Example
● Two applicable actions, with g = {loc(r1) = d3, loc(c1) = r1}:
● a1 = move(r1,d3,d1), giving s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1}
● a2 = move(r1,d3,d2), giving s2 = {loc(r1) = d2, loaded(r1) = F, loc(c1) = d1}
● In line 4, compute min(h(s1), h(s2))
Additive-Cost and Max-Cost Heuristics
● hadd(s) = Δadd(s,g), where
Ø Δadd(s,g) = Σ{Δadd(s,gi) | gi ∈ g}
Ø Δadd(s,gi) = 0 if s satisfies gi; otherwise min{cost(a) + Δadd(s, pre(a)) | a is an action whose effects include gi}
● Minimum cost of a “plan tree” to achieve g from s
Ø Pretend each element of g needs a completely separate plan
• If an action achieves i elements of g, it’s included i times
Ø Pretend each of an action’s preconditions needs a completely separate plan
Ø Thus, get a “plan” that’s a tree of actions
• Total cost can sometimes be much higher than h*(s)
● Can implement as a tree search, going backward from g
Additive-Cost Heuristic
● [Diagram: the plan tree for Δadd(s1,g), where s1 = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1} and g = {loc(r1) = d3, loc(c1) = r1}]
● hadd(s1) = Δadd(s1,g) = sum(1,1) = 2
Additive-Cost Heuristic
● [Diagram: the plan tree for Δadd(s2,g), where s2 = {loc(r1) = d2, loaded(r1) = F, loc(c1) = d1} and g = {loc(r1) = d3, loc(c1) = r1}]
● hadd(s2) = Δadd(s2,g) = sum(1,2) = 3
Additive-Cost and Max-Cost Heuristics
● hmax(s) = Δmax(s,g), where
Ø Δmax(s,g) = max{Δmax(s,gi) | gi ∈ g}
Ø Δmax(s,gi) = 0 if s satisfies gi; otherwise min{cost(a) + Δmax(s, pre(a)) | a is an action whose effects include gi}
● Like hadd(s), but doesn’t add all of the costs in the plan tree
Ø Just the most costly path in the plan tree
● Guaranteed to be a lower bound on h*(s)
● Same tree search as for hadd(s)
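Both heuristics can be sketched via the recursive equations above, assuming unit action costs, atoms represented as strings, and a depth bound in place of proper memoization (all simplifications, not the book's implementation):

```python
# Delta_add / Delta_max: the cost of a goal set aggregates (sum
# or max) over its atoms; the cost of an atom is the cheapest
# achieving action plus the cost of that action's preconditions.
def delta(s, goals, actions, agg, depth=10):
    def goal_cost(gs, d):
        return agg([atom_cost(g, d) for g in gs]) if gs else 0
    def atom_cost(g, d):
        if g in s:
            return 0
        if d == 0:
            return float("inf")     # depth bound guards against cycles
        costs = [1 + goal_cost(pre, d - 1)
                 for pre, eff in actions if g in eff]
        return min(costs, default=float("inf"))
    return goal_cost(goals, depth)

def h_add(s, g, actions):
    return delta(s, g, actions, sum)

def h_max(s, g, actions):
    return delta(s, g, actions, max)
```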
Max-Cost Heuristic
● [Diagram: the plan tree for Δmax(s1,g), with g = {loc(r1) = d3, loc(c1) = r1}]
● hmax(s1) = Δmax(s1,g) = max(1,1) = 1
Max-Cost Heuristic
● [Diagram: the plan tree for Δmax(s2,g), with g = {loc(r1) = d3, loc(c1) = r1}]
● hmax(s2) = Δmax(s2,g) = max(1,2) = 2
Delete-Relaxation Heuristics
● Suppose a state s includes an assignment x = v
● Suppose an action a has an effect x ← w
● Then γ+(s, a) includes both x = v and x = w
● Relaxed state (or r-state)
Ø any set ŝ of state-variable values
Ø may include more than one value for each state variable
● ŝ r-satisfies a goal g if ŝ contains a subset that satisfies g
● An action a is r-applicable in ŝ if ŝ r-satisfies a’s preconditions
Ø In this case, γ+(ŝ,a) = ŝ ∪ γ(s,a)
● π = ⟨a1, …, an⟩ is r-applicable in ŝ0 if there are r-states ŝ1, ŝ2, …, ŝn such that
• a1 is r-applicable in ŝ0 and γ+(ŝ0,a1) = ŝ1
• a2 is r-applicable in ŝ1 and γ+(ŝ1,a2) = ŝ2
• …
Ø In this case, γ+(ŝ0,π) = ŝn
● π is a relaxed solution for P = (Σ, s0, g) if γ+(s0, π) r-satisfies g
● The name comes from classical planning ("don't do the deletion"): ignore negative effects such as ¬loc(r,l) in
move(r,l,m)
Precond: loc(r,l)
Effects: ¬loc(r,l), loc(r,m)
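A sketch of γ+ with r-states as sets of (variable, value) pairs, so an assignment keeps the old value alongside the new one; the move and load actions mirror the running example, the helper names are not the book's:

```python
# Delete relaxation: an r-state may hold several values per
# state variable, since effects add values without removing any.
def r_satisfies(rstate, assignments):
    return all((x, v) in rstate for x, v in assignments.items())

def gamma_plus(rstate, action):
    pre, eff = action
    assert r_satisfies(rstate, pre), "not r-applicable"
    return rstate | {(x, v) for x, v in eff.items()}

s0 = {("loc(r1)", "d3"), ("loaded(r1)", "F"), ("loc(c1)", "d1")}
move = ({"loc(r1)": "d3"}, {"loc(r1)": "d1"})
load = ({"loaded(r1)": "F", "loc(c1)": "d1", "loc(r1)": "d1"},
        {"loaded(r1)": "T", "loc(c1)": "r1"})
```

Applying move then load r-satisfies g = {loc(r1)=d3, loc(c1)=r1}, which is the two-step relaxed solution worked out on the next slides.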
Optimal Relaxed Solution Heuristic
● ∆+(ŝ, g) = minimum cost of any plan π such that γ+(ŝ, π) r-satisfies g
● Optimal relaxed solution heuristic: h+(s) = ∆+(s, g)
● Example, with g = {loc(r1) = d3, loc(c1) = r1}:
Ø ŝ1 = γ+(s0, move(r1,d3,d1)) = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1, loc(r1) = d3}
Ø ŝ2 = γ+(ŝ1, load(r1,c1,d1)) = {loc(r1) = d1, loaded(r1) = F, loc(c1) = d1, loc(r1) = d3, loaded(r1) = T, loc(c1) = r1}
Ø ŝ2 r-satisfies g, so ⟨move(r1,d3,d1), load(r1,c1,d1)⟩ is a relaxed solution
Ø It's optimal, so h+(s0) = 2
● Every solution is also a relaxed solution
Ø Thus h+ is admissible
Ø Problem: computing it is NP-hard
RPG Heuristic
● Relaxed planning graph (RPG) heuristic
Ø An approximation of h+ that’s easier to compute
● Based on the fact that γ+ doesn't depend on the order in which actions are applied
● If a1 and a2 are both applicable in s0 then γ+(s0, ⟨a1, a2⟩) = γ+(s0, ⟨a2, a1⟩) = s0 ∪ eff(a1) ∪ eff(a2)
RPG Heuristic
● Computation of RPG(s1,g), where s1 = {loc(r1) = d1, loc(c1) = d1, loaded(r1) = F} and g = {loc(r1) = d3, loc(c1) = r1}
Ø First line of RPG assigns ŝ0 = s1, A0 = ∅
Ø A1 = {move(r1,d1,d3), move(r1,d1,d2), load(r1,c1,d1)}
Ø ŝ1 = ŝ0 ∪ {loc(r1) = d3, loc(r1) = d2, loaded(r1) = T, loc(c1) = r1}
● RPG(s1,g) returns 2
RPG Heuristic
● Computation of RPG(s2,g), where s2 = {loc(r1) = d2, loc(c1) = d1, loaded(r1) = F} and g = {loc(r1) = d3, loc(c1) = r1}
Ø First line of RPG assigns ŝ0 = s2, A0 = ∅
Ø A1 = {move(r1,d2,d3), move(r1,d2,d1)}
Ø ŝ1 = ŝ0 ∪ {loc(r1) = d3, loc(r1) = d1}
Ø A2 = {move(r1,d3,d1), move(r1,d3,d2), move(r1,d1,d2), move(r1,d1,d3), load(r1,c1,d1), move(r1,d2,d3), move(r1,d2,d1)}
Ø ŝ2 = ŝ1 ∪ {loc(c1) = r1, loaded(r1) = T}
● RPG(s2,g) returns 3
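An RPG-style computation can be sketched as follows (a simplified version, not the book's RPG pseudocode): a forward phase grows the r-state layer by layer, then a backward phase extracts a relaxed plan and returns its size, which reproduces the values 2 and 3 computed above:

```python
# Atoms are strings such as "loc(r1)=d1"; actions are
# (name, preconditions, effects) with pre/eff as frozensets.
def rpg_heuristic(s, goal, actions):
    reached = set(s)                 # the r-state s-hat_0
    achiever = {}                    # atom -> (action name, its preconditions)
    while not goal <= reached:
        new = {}
        for name, pre, eff in actions:
            if pre <= reached:                       # r-applicable
                for e in eff:
                    if e not in reached and e not in new:
                        new[e] = (name, pre)         # first achiever wins
        if not new:
            return float("inf")      # g is never r-satisfied
        achiever.update(new)
        reached |= set(new)
    # backward phase: back-chain from g, collecting needed actions
    plan, stack = set(), list(goal)
    while stack:
        atom = stack.pop()
        if atom in achiever:         # atoms true in s need no achiever
            name, pre = achiever[atom]
            if name not in plan:
                plan.add(name)
                stack.extend(pre)
    return len(plan)
```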
Landmark Heuristics
● Let P = (Σ,s0,g) be a planning problem
● Let φ = φ1 ∨ ... ∨ φm be a disjunction of state-variable assignments
● Definition: φ is a landmark for P if φ is true at some point in every solution plan
of P
● Example landmarks, for g = {loc(r1) = d3, loc(c1) = r1}:
Ø a complete state, e.g., s0
Ø a single state-variable assignment, e.g., loc(r1) = d1
Ø a disjunction of state-variable assignments, e.g., loc(r1) = d1 ∨ loc(r1) = d2
Why are Landmarks Useful?
● Help in breaking down the given problem into smaller subproblems
● Every solution to P has to achieve these landmarks
● Possible strategy:
Ø find a plan that takes us from s0 to any state s1 that satisfies lm1
Ø find a plan that takes us from s1 to any state s2 that satisfies lm2
Ø …
[Figure: a solution path from s0 to g passing through landmarks lm1, lm2, lm3, decomposing P into subproblems P1, P2, P3, P4]
Computing Landmarks
● Question: How do we compute landmarks for a problem P?
● Not easy:
Ø Deciding whether a state-variable assignment φ is a landmark is PSPACE-hard in the worst case
Ø To put that in perspective: it's as hard as solving the planning problem itself!
● However, all is not lost:
Ø There are often useful landmarks that can be found more easily
Ø There are polynomial-time procedures that can compute these landmarks
Ø Going to see one such procedure based on Relaxed Planning Graphs
● Why Relaxed Planning Graphs?
Ø Relaxed planning problems are easier to solve, so computing landmarks for them is easier
Ø A landmark for a relaxed planning problem is also a landmark for the original planning problem
RPG-based Landmark Computation
● Main intuition: if a state-variable assignment φ is a landmark, then the disjunction of the preconditions of the actions that achieve φ is also a landmark
Ø In other words: we start from known landmarks and discover new ones by looking at the preconditions of the actions that achieve them
● Example:
Ø g is the goal
Ø Actions a1 and a2 can achieve g
Ø Therefore, either p1∧q or p2∧q
must be true to be able to achieve g
Ø In other words: (p1∧q)∨(p2∧q) is
a landmark
Ø By rearranging terms: we get
(p1∧q)∨(p2∧q) ≅ q∧(p1∨p2)
Ø Since q∧(p1∨p2) is a landmark, both q and (p1∨p2) are landmarks
● In practice, we try to rearrange assignments in a similar manner to group terms
with the same state variable together
[Figure: actions a1 (preconditions p1 and q) and a2 (preconditions p2 and q) both achieve g]
RPG-based Landmark Computation
● Question: What landmarks can we start with?
Ø Every goal is trivially a landmark; can start from there
● E.g., loc(r1) = d3 is a landmark
Ø Two actions achieve this landmark: move(r1,d1,d3) and move(r1,d2,d3)
Ø Can infer a new landmark that is the disjunction of the preconditions of these two actions:
φ′ = (loc(r1) = d1) ∨ (loc(r1) = d2)
action move(r, d, e)
pre: loc(r) = d
eff: loc(r) ← e
ComputeLandmark(s0, g = g1 ∧ g2 ∧ … ∧ gk)
● queue = {g1, g2, …, gk}
● While queue is not empty:
Ø Remove a gi from queue
Ø Ai = {all actions that can achieve gi}
Ø Compute all assignments that can be r-produced starting from s0 without using Ai, thus generating the RPG foo
Ø act(gi) = {all actions in Ai that are r-applicable in the r-state resulting from foo}
Ø φ = ∅; for each action in act(gi):
• Pick a precondition not satisfied in s0 and add it to φ
Ø The resulting disjunction φ is a landmark; add it to queue
[Figure: actions a1, a2, a3 achieve gi, with preconditions p1∧q1, p2∧q2, p3∧q3 respectively]
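The ComputeLandmark procedure above can be sketched as runnable code. This is my own illustrative implementation, not the authors': facts are (variable, value) pairs, a landmark is a frozenset of facts read as a disjunction, and the helper names (`r_reachable`, `compute_landmarks`) are assumptions. It reproduces the slides' dock example.

```python
# Sketch of the slides' RPG-based landmark procedure (illustrative, simplified).
from collections import namedtuple

Action = namedtuple("Action", "name pre eff")   # pre, eff: frozensets of facts

def r_reachable(s0, actions, excluded):
    """Final r-state of the RPG grown from s0 without the excluded actions
    (delete-relaxed reachability: effects accumulate, nothing is deleted)."""
    facts = set(s0)
    changed = True
    while changed:
        changed = False
        for a in actions:
            if a.name not in excluded and a.pre <= facts and not a.eff <= facts:
                facts |= a.eff
                changed = True
    return facts

def compute_landmarks(s0, goal, actions):
    queue = [frozenset([g]) for g in goal]      # every goal is trivially a landmark
    landmarks, seen = list(queue), set(queue)
    while queue:
        lm = queue.pop()
        if any(f in s0 for f in lm):
            continue                            # already true in s0: no use expanding
        achievers = [a for a in actions if a.eff & lm]
        r_state = r_reachable(s0, actions, {a.name for a in achievers})
        act = [a for a in achievers if a.pre <= r_state]   # r-applicable achievers
        phi = set()
        for a in act:
            unsat = [p for p in a.pre if p not in s0]
            if not unsat:                       # an achiever is applicable in s0,
                phi = None                      # so the disjunction is trivially true
                break
            phi.add(unsat[0])                   # pick one unsatisfied precondition
        if phi:
            phi = frozenset(phi)
            if phi not in seen:
                seen.add(phi)
                landmarks.append(phi)
                queue.append(phi)
    return landmarks

# The slides' example: robot r1 at dock d3, container c1 at dock d1.
docks = ["d1", "d2", "d3"]
moves = [Action(f"move({d},{e})", frozenset({("loc(r1)", d)}),
                frozenset({("loc(r1)", e)})) for d in docks for e in docks if d != e]
loads = [Action(f"load({l})",
                frozenset({("loaded(r1)", "nil"), ("loc(c1)", l), ("loc(r1)", l)}),
                frozenset({("loaded(r1)", "c1"), ("loc(c1)", "r1")})) for l in docks]
s0 = {("loc(r1)", "d3"), ("loc(c1)", "d1"), ("loaded(r1)", "nil")}
goal = [("loc(c1)", "r1"), ("loc(r1)", "d3")]

lms = compute_landmarks(s0, goal, moves + loads)
print(lms)   # the two goals, plus the discovered landmark loc(r1) = d1
```

On this example the procedure finds the same new landmark the later slides derive: loc(r1) = d1, the only precondition of load(r1,c1,d1) not already satisfied in s0.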
72. 72
RPG-based Landmark Computation
[Figure: actions a1, a2, a3 achieve gi with preconditions p1∧q1, p2∧q2, p3∧q3; the procedure produces p3∨q1: a new landmark]
73. 73
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
[Figure: s0 — r1 at d3, c1 at d1; g — c1 loaded on r1 at d3. The goal landmarks loc(r1) = d3 and loc(c1) = r1 initialize the queue]
74. 74
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø loc(r1) = d3 is true in the current state, so it is always going to be true in the RPG; no use expanding it
75. 75
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø 3 actions achieve loc(c1) = r1: load(r1,c1,d1), load(r1,c1,d2), load(r1,c1,d3)
action load(r, c, l)
pre: loaded(r) = nil, loc(c) = l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r
77. 77
RPG-based Landmark Computation
Relaxed Planning Graph foo, built from ŝ0 without the load actions:
ŝ0 = { loc(c1) = d1, loc(r1) = d3, loaded(r1) = nil }
A1 = { move(r1,d3,d1), move(r1,d3,d2) }
ŝ1 = { loc(r1) = d1, loc(r1) = d2, loc(c1) = d1, loc(r1) = d3, loaded(r1) = nil }
Ø Of load(r1,c1,d1), load(r1,c1,d2), load(r1,c1,d3), only the first load action is r-applicable in the r-state at the end of foo
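The level-by-level RPG construction on this slide can be sketched directly. This is an illustrative fragment of my own (variable names are assumptions): it grows the relaxed levels from ŝ0 using only the move actions, excluding the loads (the achievers of loc(c1) = r1), then checks which load action is r-applicable in the final r-state.

```python
# Sketch of the slide's RPG computation, excluding the achievers (the loads).
s_hat0 = {("loc(r1)", "d3"), ("loc(c1)", "d1"), ("loaded(r1)", "nil")}

docks = ["d1", "d2", "d3"]
# move(r1, d, e): pre loc(r1)=d, eff loc(r1)<-e   (the only actions in the RPG)
moves = [({("loc(r1)", d)}, {("loc(r1)", e)}) for d in docks for e in docks if d != e]
# load(r1, c1, l): pre loaded(r1)=nil, loc(c1)=l, loc(r1)=l   (excluded from the RPG)
load_pre = {l: {("loaded(r1)", "nil"), ("loc(c1)", l), ("loc(r1)", l)} for l in docks}

levels = [set(s_hat0)]
while True:
    nxt = set(levels[-1])
    for pre, eff in moves:
        if pre <= levels[-1]:
            nxt |= eff            # relaxed semantics: effects accumulate, no deletes
    if nxt == levels[-1]:         # fixpoint reached: the RPG is complete
        break
    levels.append(nxt)

final = levels[-1]                # the r-state at the end of the RPG
applicable = [l for l, pre in load_pre.items() if pre <= final]
print(applicable)                 # only load(r1,c1,d1) is r-applicable
```

The fixpoint is reached after one expansion (ŝ1 adds loc(r1) = d1 and loc(r1) = d2), and only load(r1,c1,d1) passes the applicability check, matching the slide.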
78. 78
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø Only load(r1,c1,d1) is applicable in the final level of the RPG; its preconditions are used to generate new landmarks
action load(r, c, l)
pre: loaded(r) = nil, loc(c) = l, loc(r) = l
eff: loaded(r) ← c, loc(c) ← r
79. 79
RPG-based Landmark Computation
g = {loc(r1) = d3, loc(c1) = r1}
Ø New landmarks from the preconditions of load(r1,c1,d1): loaded(r1) = nil, loc(c1) = d1, loc(r1) = d1
Ø These new landmarks are added to the queue
80. 80
Landmark Heuristic
● Every solution to the problem needs to achieve all the computed landmarks
● One possible heuristic to estimate the distance of state s from g:
Ø the number of landmarks still to be achieved from s
Ø biases the planner towards actions that achieve landmarks
● Is this heuristic admissible?
81. 81
Landmark Heuristic
● Every solution to the problem needs to achieve all the computed landmarks
● One possible heuristic to estimate the distance of state s from g:
Ø the number of landmarks still to be achieved from s
Ø biases the planner towards actions that achieve landmarks
● Is this heuristic admissible? No:
[Figure: a single action a applied in s0 achieves both g1 and g2, where g = g1∧g2]
Number of landmarks: |{g1, g2}| = 2
Optimal plan length: |⟨a⟩| = 1
● A number of more advanced landmark-based heuristics have been developed (including admissible ones)
Ø Check textbook for references
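The landmark-count heuristic and the inadmissibility example above can be sketched in a few lines. This is my own minimal illustration (names are assumptions): it counts the landmarks not yet achieved, and shows that when one action achieves both goal landmarks at once, the count overestimates the optimal plan length.

```python
# Sketch of the landmark-count heuristic and the slide's inadmissibility example.

def landmark_count(state, landmarks):
    """Number of landmarks not yet achieved in `state`. Each landmark is a
    set of facts read as a disjunction. Ignoring interactions between
    landmarks means the count can overestimate: it is NOT admissible."""
    return sum(1 for lm in landmarks if not (lm & state))

s0 = frozenset()                                   # neither goal holds yet
landmarks = [frozenset({"g1"}), frozenset({"g2"})]  # goals are trivial landmarks

h = landmark_count(s0, landmarks)   # counts 2 remaining landmarks

# A single action a achieves g1 and g2 together, so the optimal plan
# length is 1, but h(s0) = 2 > 1: the heuristic overestimates.
s1 = s0 | {"g1", "g2"}
print(h, landmark_count(s1, landmarks))
```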