Sequential Monte Carlo in Probabilistic Planning Reachability Heuristics

About This Presentation

Slides: 19

Provided by: DB27

Transcript and Presenter's Notes

1
Sequential Monte Carlo in Probabilistic Planning
Reachability Heuristics

Daniel Bryce
Subbarao Kambhampati
David E. Smith
Supported By NSF, ONR, NASA, IBM, ARCS

2
Scalability of Planning
Problem is Search Control!!!

Before, planning algorithms could synthesize
plans of about 6-10 actions in minutes
Significant scale-up in the last 6-7 years
Now, we can synthesize 100-action plans in
seconds.

The primary revolution in planning in recent
years has been domain-independent heuristics to
scale up plan synthesis
3
Rover Domain
4
Planning Graph and Search Tree

Envelope of Progression Tree (Relaxed
Progression)
Proposition lists: union of states at the kth level
Lower-bound reachability information
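
The relaxed progression above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the `Action` type, the function names, and the rover facts (locations named alpha and beta) are assumptions made for the example. Each proposition layer is the union of everything reachable so far, so the first layer containing the goal gives a lower bound on plan length.

```python
from typing import FrozenSet, Iterable, List, NamedTuple, Optional, Set


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]  # preconditions
    add: FrozenSet[str]  # add effects; delete effects are dropped in the relaxation


def proposition_layers(init: Iterable[str], actions: List[Action],
                       goal: FrozenSet[str], max_levels: int = 50) -> List[Set[str]]:
    """Build proposition layers P0..Pk until the goal appears or nothing changes."""
    layers = [set(init)]
    while not goal <= layers[-1] and len(layers) <= max_levels:
        nxt = set(layers[-1])  # persistence: every proposition carries over
        for a in actions:
            if a.pre <= layers[-1]:
                nxt |= a.add
        if nxt == layers[-1]:  # fixed point: the goal is unreachable
            break
        layers.append(nxt)
    return layers


def first_goal_level(layers: List[Set[str]], goal: FrozenSet[str]) -> Optional[int]:
    """Index of the first layer containing the goal (a lower bound), or None."""
    for k, layer in enumerate(layers):
        if goal <= layer:
            return k
    return None


# Hypothetical rover instance (names are illustrative only).
rover_actions = [
    Action("sample(soil, alpha)",
           frozenset({"at(alpha)", "avail(soil, alpha)"}), frozenset({"have(soil)"})),
    Action("drive(alpha, beta)", frozenset({"at(alpha)"}), frozenset({"at(beta)"})),
    Action("commun(soil)", frozenset({"have(soil)"}), frozenset({"comm(soil)"})),
]
rover_goal = frozenset({"comm(soil)"})
rover_layers = proposition_layers({"at(alpha)", "avail(soil, alpha)"},
                                  rover_actions, rover_goal)
print(first_goal_level(rover_layers, rover_goal))
```

Note that layer 1 contains both at(alpha) and at(beta): the layer is a union of states, not a single state, which is why the level is only a lower bound.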

5
Stochastic Rover Example
ICAPS 2006
6
Search in Probabilistic Belief State Space
[Figure: search tree over probabilistic belief states for the rover example. Actions: sample(soil) and drive. Propositions: at, avail(soil), have(soil). Probabilities shown include 0.04, 0.05, 0.1, 0.36, 0.4, 0.45 and 0.5.]
7
Relaxed Conformant GraphPlan (CGP)
Generate a proposition layer for each
joint outcome of actions
Initial proposition layer for each possible state
[Figure: relaxed CGP planning graph for the rover example, with proposition layers P0-P2 and action layers A0-A1. Propositions: at, avail(soil), have(soil), comm(soil). Actions: drive, sample(soil), commun(soil).]
8
CGP-style planning graph
Planning graph is a tree of deterministic
planning graph branches
[Figure: tree of planning graph branches; probabilities shown: 0.4, 0.5, 0.1.]
9
Monte Carlo CGP
P(G) = 5/16 = 0.3125
P(G) = 8/16 = 0.5
P(G) = 13/16 = 0.8125
[Figure: sampled planning graphs; probabilities shown: 0.4, 0.5, 0.1.]

Problem
Have multiple planning graphs,
which can still be costly
Solution
Use a labeled planning graph
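
The P(G) values on this slide are consistent with a simple counting estimate: sample a number of deterministic planning graphs and report the fraction in which the goal is reached. The sketch below reproduces that arithmetic under the assumption that 16 graphs were sampled and that 5, 8 and 13 of them reach the goal at successive levels; the function name is hypothetical.

```python
from fractions import Fraction
from typing import Sequence


def estimate_goal_probability(goal_reached: Sequence[bool]) -> Fraction:
    """Fraction of sampled planning graphs in which the goal is reached."""
    if not goal_reached:
        raise ValueError("need at least one sampled planning graph")
    return Fraction(sum(goal_reached), len(goal_reached))


# Assumed reading of the slide: 16 sampled graphs, of which 5, 8 and 13
# reach the goal as the graphs are extended.
for hits in (5, 8, 13):
    flags = [True] * hits + [False] * (16 - hits)
    estimate = estimate_goal_probability(flags)
    print(estimate, float(estimate))
```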
10
Monte Carlo LUG (McLUG) -- Initial Layer
N = 4
Sample a state for each particle
Form initial layer
[Figure: initial belief with state probabilities 0.4, 0.5 and 0.1 over at and avail(soil) propositions; the sampled particles label the propositions of the initial layer.]
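
A minimal sketch of this step, assuming the belief is a list of (state, probability) pairs and that each proposition is labeled with the set of particle indices whose sampled state contains it. The state contents and location names (alpha, beta, gamma) are assumptions for illustration; only the probabilities 0.4, 0.5, 0.1 and N = 4 come from the slide.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple

State = FrozenSet[str]


def sample_initial_layer(belief: List[Tuple[State, float]], n_particles: int,
                         rng: random.Random) -> Tuple[List[State], Dict[str, Set[int]]]:
    """Sample one state per particle and label each proposition with its particles."""
    states = [s for s, _ in belief]
    weights = [p for _, p in belief]
    particles = rng.choices(states, weights=weights, k=n_particles)
    labels: Dict[str, Set[int]] = {}
    for i, state in enumerate(particles):
        for prop in state:
            labels.setdefault(prop, set()).add(i)
    return particles, labels


# Hypothetical initial belief: the rover is at alpha, soil is in one of three places.
belief = [
    (frozenset({"at(alpha)", "avail(soil, alpha)"}), 0.4),
    (frozenset({"at(alpha)", "avail(soil, beta)"}), 0.5),
    (frozenset({"at(alpha)", "avail(soil, gamma)"}), 0.1),
]
particles, labels = sample_initial_layer(belief, 4, random.Random(0))
for prop, label in sorted(labels.items()):
    print(prop, sorted(label))
```

Propositions that hold in every state of the belief, such as at(alpha) here, end up labeled with all N particles regardless of which states are drawn.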
11
Monte Carlo LUG (McLUG)
Sample states for initial layer
Particles in an action's label must sample an action
outcome
¼ of particles support goal, need at least ½
¾ of particles support goal, okay to stop
[Figure: McLUG with proposition layers P0-P3 and action layers A0-A2. Propositions: avail(soil), at, have(soil), comm(soil). Actions: sample(soil), drive, commun(soil).]
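
The expansion described on this slide can be sketched as follows. This is a simplified illustration, not the POND implementation: labels are plain Python sets of particle indices (the evaluation slide mentions BDDs via CUDD), actions have no conditional effects, and all type and function names are hypothetical. An action's label is the set of particles supporting all of its preconditions; each such particle samples one outcome; expansion stops once the fraction of particles supporting the goal meets the threshold.

```python
import random
from typing import Dict, FrozenSet, List, NamedTuple, Optional, Set, Tuple

Labels = Dict[str, Set[int]]


class ProbAction(NamedTuple):
    name: str
    pre: FrozenSet[str]
    outcomes: List[Tuple[float, FrozenSet[str]]]  # (probability, add effects)


def joint_label(props: FrozenSet[str], labels: Labels, n: int) -> Set[int]:
    """Particles in which every proposition in props is supported."""
    label = set(range(n))
    for p in props:
        label &= labels.get(p, set())
    return label


def expand_layer(labels: Labels, actions: List[ProbAction], n: int,
                 rng: random.Random) -> Labels:
    nxt = {p: set(l) for p, l in labels.items()}  # persistence
    for a in actions:
        weights = [pr for pr, _ in a.outcomes]
        for particle in joint_label(a.pre, labels, n):
            # Each particle in the action's label samples its own outcome.
            _, adds = rng.choices(a.outcomes, weights=weights, k=1)[0]
            for p in adds:
                nxt.setdefault(p, set()).add(particle)
    return nxt


def mclug_level(init_labels: Labels, actions: List[ProbAction], goal: FrozenSet[str],
                n: int, threshold: float, rng: random.Random,
                max_levels: int = 20) -> Optional[int]:
    """First level at which at least `threshold` of the particles support the goal."""
    labels = init_labels
    for k in range(max_levels + 1):
        if len(joint_label(goal, labels, n)) / n >= threshold:
            return k
        labels = expand_layer(labels, actions, n, rng)
    return None


# Hypothetical example with deterministic outcomes so the result is reproducible:
# soil is available at alpha in particles 0 and 1 only.
demo_init = {"at(alpha)": {0, 1, 2, 3}, "avail(soil, alpha)": {0, 1}}
demo_actions = [
    ProbAction("sample(soil, alpha)", frozenset({"at(alpha)", "avail(soil, alpha)"}),
               [(1.0, frozenset({"have(soil)"}))]),
    ProbAction("commun(soil)", frozenset({"have(soil)"}),
               [(1.0, frozenset({"comm(soil)"}))]),
]
demo_goal = frozenset({"comm(soil)"})
print(mclug_level(demo_init, demo_actions, demo_goal, 4, 0.5, random.Random(0)))
```

With a threshold of ½ the demo stops as soon as two of the four particles support comm(soil); with a threshold of ¾ it never stops, because only two particles can ever obtain the soil sample.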
12
Monte Carlo LUG (McLUG)
Must support goal in all particles
Pick persistence by default, commun covers other
particles
Support preconditions for particles the action
supports
[Figure: the McLUG from the previous slide (layers P0-P3, A0-A2), annotated with the notes above.]
13
Related CPP planners

Finite Horizon (maximize probability of k-step
plan)
POMDP VI
MAXPLAN (Littman, 1998)
CPplan (Hyafil and Bacchus, 2003, 2004)
COMPLAN (Huang, 2006)
Indefinite Horizon (minimize plan length to
exceed goal probability threshold)
Buridan (Kushmerick et al., 1995)
Probapop (Onder et al., 2006)
PFF (Domshlak and Hoffmann, 2006)
POND w/ McLUG (Bryce et al., 2006)

Indefinite-horizon problems can be solved by iterating a finite-horizon planner over planning horizons
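
The last point can be sketched as a loop around any finite-horizon solver. The solver interface here (a callable that returns the best k-step plan and its goal probability) is an assumption for illustration and does not correspond to the API of any of the planners listed; the toy solver and its probabilities are made up.

```python
from typing import Callable, List, Optional, Tuple

Plan = List[str]


def solve_by_iterating_horizons(best_plan_for_horizon: Callable[[int], Tuple[Plan, float]],
                                threshold: float,
                                max_horizon: int) -> Optional[Tuple[int, Plan, float]]:
    """Return the shortest horizon whose best plan meets the goal probability threshold."""
    for k in range(max_horizon + 1):
        plan, prob = best_plan_for_horizon(k)
        if prob >= threshold:
            return k, plan, prob
    return None


# Toy stand-in for a finite-horizon planner (values are made up).
TOY_PROBS = {0: 0.0, 1: 0.3, 2: 0.45, 3: 0.8}


def toy_solver(k: int) -> Tuple[Plan, float]:
    return [f"a{i}" for i in range(k)], TOY_PROBS.get(k, 0.8)


print(solve_by_iterating_horizons(toy_solver, 0.5, 10))
```

Because horizons are tried in increasing order, the first plan that meets the threshold is also a shortest one, at the cost of re-solving from scratch at every horizon.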
14
Evaluation

POND w/ McLUG
Forward-chaining A* search
C/C++, IPP, CUDD (BDDs)
Adjust particles
Average over 5 runs
CPplan (increment plan lengths)
Gave POND several goal probability thresholds, matching plans found by
CPplan
2.66 GHz P4, 1 GB RAM, 20-minute timeout
Domains: Logistics, 10x10 Grid, Slippery Gripper,
Sand-Castle-67

15
Logistics (P<packages>-<cities>-<locs per city>)
Significantly more scalable, w/ comparable quality
[Plots: time (s) and plan length for P2-2-2, P4-2-2 and P2-2-4.]
16
Grid
Again, good scalability and quality!
Need more particles for broad beliefs
[Plots: time (s) and plan length for Grid(0.8) and Grid(0.5).]
17
Slippery Gripper SandCastle-67
[Plots: time and plan length.]
18
Conclusion

Monte Carlo is effective in relaxing relaxed
reachability analysis
The number of particles trades off heuristic
computation time against informedness
Approximating plan suffixes preferable in bigger
problems where CPplan exhausts memory
Should be useful in non-probabilistic planning
when planning graph computations are costly
Future Work
How many particles?