Title: Sequential Monte Carlo in Probabilistic Planning Reachability Heuristics
1. Sequential Monte Carlo in Probabilistic Planning Reachability Heuristics
- Daniel Bryce
- Subbarao Kambhampati
- David E. Smith
- Supported By NSF, ONR, NASA, IBM, ARCS
2. Scalability of Planning
Problem is search control!
- Before, planning algorithms could synthesize plans of about 6-10 actions in minutes
- Significant scale-up in the last 6-7 years
- Now, we can synthesize 100-action plans in seconds
The primary revolution in planning in recent years has been domain-independent heuristics that scale up plan synthesis.
3. Rover Domain
4. Planning Graph and Search Tree
- Envelope of progression tree (relaxed progression)
- Proposition lists: union of states at the kth level
- Lower-bound reachability information
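The relaxed progression above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the `Action` tuple, the `relaxed_layers` function, and the two-location rover instance are all hypothetical names chosen for the example. Each proposition layer is the union of everything reachable so far (delete effects are ignored), and the first level at which the goal appears is a lower bound on plan length.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Action(NamedTuple):
    name: str
    pre: FrozenSet[str]   # preconditions
    add: FrozenSet[str]   # add effects (delete effects are ignored in the relaxation)


def relaxed_layers(init: Set[str], actions: List[Action], goal: Set[str]) -> List[Set[str]]:
    """Grow proposition layers until the goal appears or nothing new is reachable."""
    layers = [set(init)]
    while not goal <= layers[-1]:
        current = layers[-1]
        nxt = set(current)              # persistence carries every proposition forward
        for a in actions:
            if a.pre <= current:        # applicable in the relaxed sense
                nxt |= a.add
        if nxt == current:              # fixpoint: goal unreachable even when relaxed
            break
        layers.append(nxt)
    return layers


# Hypothetical two-location rover instance (assumed for illustration only).
actions = [
    Action("drive(a,b)", frozenset({"at(a)"}), frozenset({"at(b)"})),
    Action("sample(soil,b)", frozenset({"at(b)", "avail(soil,b)"}), frozenset({"have(soil)"})),
    Action("commun(soil)", frozenset({"have(soil)"}), frozenset({"comm(soil)"})),
]
layers = relaxed_layers({"at(a)", "avail(soil,b)"}, actions, {"comm(soil)"})
goal_level = len(layers) - 1   # first level containing the goal
```

In this toy instance the goal first appears at level 3, which matches the three-action plan drive, sample, commun.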
5. Stochastic Rover Example
ICAPS 2006
6. Search in Probabilistic Belief State Space
[Figure: search tree over probabilistic belief states. Nodes list weighted states (weights shown: 0.04, 0.05, 0.1, 0.36, 0.4, 0.45, 0.5) over the propositions at(?), at(b), avail(soil, ?), avail(soil, b), and have(soil); edges are the actions sample(soil, ?), drive(?, ?), and drive(?, g).]
7. Relaxed Conformant GraphPlan (CGP)
Initial proposition layer for each possible state.
Generate a proposition layer for each joint outcome of actions.
[Figure: CGP planning graphs with layers P0, A0, P1, A1, P2. Proposition layers contain at(?), avail(soil, ?), avail(soil, b), avail(soil, g), have(soil), and comm(soil); action layers contain drive(?, ?), sample(soil, ?), and commun(soil).]
8. CGP-style planning graph
The planning graph is a tree of deterministic planning graph branches.
[Figure: tree of planning graph branches weighted 0.4, 0.5, and 0.1.]
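9. Monte Carlo CGP
[Figure: sampled planning graph branches (weights 0.4, 0.5, 0.1) with goal probability estimates P(G) = 5/16 = 0.3125, P(G) = 8/16 = 0.5, and P(G) = 13/16 = 0.8125.]
- Problem
  - Have multiple planning graphs,
  - which can still be costly
- Solution
  - Use a labeled planning graph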
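A minimal sketch of the Monte Carlo CGP idea, under stated assumptions: each sample is one deterministic relaxed planning graph, and P(G) at a level is estimated as the fraction of samples whose graph contains the goal at that level. The location names, the initial belief, the 0.9 success probability for sampling soil, and all function names are illustrative assumptions, not values taken from the slides.

```python
import random
from typing import List, Set

# Hypothetical toy rover model (assumptions for illustration only).
LOCS = ["a", "b", "g"]
SOIL_BELIEF = [(0.4, "a"), (0.5, "b"), (0.1, "g")]   # where the soil is; rover starts at "a"
P_SAMPLE_OK = 0.9                                     # assumed success chance of sample(soil, x)


def sample_initial_state(rng: random.Random) -> Set[str]:
    loc = rng.choices([l for _, l in SOIL_BELIEF], weights=[p for p, _ in SOIL_BELIEF])[0]
    return {"at(a)", f"avail(soil,{loc})"}


def expand(props: Set[str], rng: random.Random) -> Set[str]:
    """One relaxed level: apply every applicable action with one sampled outcome, no deletes."""
    nxt = set(props)
    for x in LOCS:
        if f"at({x})" in props:
            nxt |= {f"at({y})" for y in LOCS}          # drive(x, y), deterministic here
            if f"avail(soil,{x})" in props and rng.random() < P_SAMPLE_OK:
                nxt.add("have(soil)")                  # sample(soil, x) succeeded
    if "have(soil)" in props:
        nxt.add("comm(soil)")                          # commun(soil)
    return nxt


def monte_carlo_cgp(n_samples: int, n_levels: int, goal: str, seed: int = 0) -> List[float]:
    """Estimate P(goal) at each level as the fraction of sampled graphs that contain it."""
    rng = random.Random(seed)
    reached = [0] * (n_levels + 1)
    for _ in range(n_samples):
        props = sample_initial_state(rng)
        for k in range(n_levels + 1):
            if goal in props:
                reached[k] += 1
            if k < n_levels:
                props = expand(props, rng)
    return [c / n_samples for c in reached]


estimates = monte_carlo_cgp(n_samples=16, n_levels=4, goal="comm(soil)")
```

Every sample builds its own graph, which is the cost the slide points to: the work grows linearly with the number of samples even though the graphs share most of their structure.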
10. Monte Carlo LUG (McLUG) -- Initial Layer
N = 4
Sample a state for each particle.
Form the initial layer.
[Figure: initial belief with state weights 0.4, 0.5, and 0.1 over the propositions at(?), avail(soil, ?), and avail(soil, b), and the initial proposition layer formed from the sampled states.]
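The two steps above (sample a state per particle, then form the initial layer) might look like the sketch below. The representation is an assumption made for illustration: each proposition is labeled with the set of particle indices whose sampled state contains it. The belief, the location names, and `sample_initial_layer` are hypothetical.

```python
import random
from typing import Dict, FrozenSet, List, Set, Tuple

Belief = List[Tuple[float, FrozenSet[str]]]   # (probability, state) pairs


def sample_initial_layer(belief: Belief, n_particles: int, seed: int = 0) -> Dict[str, Set[int]]:
    """Sample one state per particle and label each proposition with its particles."""
    rng = random.Random(seed)
    states = rng.choices([s for _, s in belief],
                         weights=[p for p, _ in belief],
                         k=n_particles)
    labels: Dict[str, Set[int]] = {}
    for particle, state in enumerate(states):
        for prop in state:
            labels.setdefault(prop, set()).add(particle)
    return labels


# Hypothetical three-state belief with weights 0.4, 0.5, 0.1 (locations are assumed names).
belief = [
    (0.4, frozenset({"at(a)", "avail(soil,a)"})),
    (0.5, frozenset({"at(a)", "avail(soil,b)"})),
    (0.1, frozenset({"at(a)", "avail(soil,g)"})),
]
layer0 = sample_initial_layer(belief, n_particles=4)
```

A single labeled layer replaces N separate initial layers: `layer0["at(a)"]` holds all four particles, while the `avail(...)` propositions split the particles among themselves.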
11. Monte Carlo LUG (McLUG)
Sample states for the initial layer.
Particles in an action's label must sample an action outcome.
¼ of particles support the goal; need at least ½.
¾ of particles support the goal; okay to stop.
[Figure: McLUG with layers P0, A0, P1, A1, P2, A2, P3. Proposition layers contain avail(soil, ?), avail(soil, b), at(?), have(soil), and comm(soil); action layers contain sample(soil, ?), sample(soil, b), commun(soil), drive(?, ?), drive(b, a), drive(b, ?), drive(g, a), and drive(g, b).]
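The expansion described on this slide can be sketched as below, assuming labels are sets of particle indices. An action's label is the intersection of its precondition labels, every particle in that label samples one outcome, and expansion stops once the fraction of particles supporting the goal reaches the threshold. The tuple encodings, function names, and the 0.9/0.1 outcome probabilities are hypothetical.

```python
import random
from typing import Dict, Iterable, List, Set, Tuple

Labels = Dict[str, Set[int]]
Outcome = Tuple[float, List[str]]                # (probability, add effects)
Action = Tuple[str, List[str], List[Outcome]]    # (name, preconditions, outcomes)


def label_of(props: Iterable[str], labels: Labels, n: int) -> Set[int]:
    """Particles in which every proposition in `props` holds."""
    result = set(range(n))
    for p in props:
        result &= labels.get(p, set())
    return result


def expand(labels: Labels, actions: List[Action], n: int, rng: random.Random) -> Labels:
    nxt = {p: set(lab) for p, lab in labels.items()}      # persistence
    for _name, pre, outcomes in actions:
        weights = [prob for prob, _ in outcomes]
        for particle in sorted(label_of(pre, labels, n)):
            # Each particle in the action's label samples one outcome.
            _, adds = rng.choices(outcomes, weights=weights)[0]
            for prop in adds:
                nxt.setdefault(prop, set()).add(particle)
    return nxt


def build_mclug(layer0: Labels, actions: List[Action], goal: List[str], n: int,
                threshold: float, max_levels: int = 20, seed: int = 0) -> List[Labels]:
    """Add layers until enough particles support the goal or max_levels is reached."""
    rng = random.Random(seed)
    layers = [layer0]
    while len(label_of(goal, layers[-1], n)) / n < threshold and len(layers) <= max_levels:
        layers.append(expand(layers[-1], actions, n, rng))
    return layers


# Hypothetical instance with N = 4 particles (names and probabilities are assumptions).
layer0 = {"at(a)": {0, 1, 2, 3}, "avail(soil,a)": {0}, "avail(soil,b)": {1, 2, 3}}
actions = [
    ("drive(a,b)", ["at(a)"], [(1.0, ["at(b)"])]),
    ("sample(soil,a)", ["at(a)", "avail(soil,a)"], [(0.9, ["have(soil)"]), (0.1, [])]),
    ("sample(soil,b)", ["at(b)", "avail(soil,b)"], [(0.9, ["have(soil)"]), (0.1, [])]),
    ("commun(soil)", ["have(soil)"], [(1.0, ["comm(soil)"])]),
]
layers = build_mclug(layer0, actions, ["comm(soil)"], n=4, threshold=0.5)
```

Because labels only grow from one layer to the next, the fraction of particles supporting the goal never decreases as layers are added.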
12. Monte Carlo LUG (McLUG)
Must support the goal in all particles.
Pick persistence by default; commun covers the other particles.
Support preconditions for the particles the action supports.
[Figure: McLUG with layers P0, A0, P1, A1, P2, A2, P3. Proposition layers contain avail(soil, ?), avail(soil, b), at(?), have(soil), and comm(soil); action layers contain sample(soil, ?), sample(soil, b), commun(soil), drive(?, ?), drive(b, a), drive(b, ?), drive(g, a), and drive(g, b).]
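The three annotations on this slide suggest a backward pass over the labeled graph. The sketch below is one possible reading, not the authors' algorithm: it uses a greedy cover, and the data layout (`props`, `supporters`) and the hand-built four-particle graph are hypothetical. At each level, persistence is preferred for the particles it covers, actions are chosen to cover the rest, and each chosen action's preconditions become obligations for exactly the particles it covered.

```python
from typing import Dict, List, Set, Tuple

Labels = Dict[str, Set[int]]
Supporter = Tuple[str, List[str], Set[int]]   # (action, preconditions, particles it supports)


def extract_plan(props: List[Labels],
                 supporters: List[Dict[str, List[Supporter]]],
                 goal: List[str],
                 particles: Set[int]) -> List[Set[str]]:
    """Return the actions chosen in each action layer A0..A(top-1)."""
    top = len(props) - 1
    plan: List[Set[str]] = [set() for _ in range(top)]
    need: Labels = {g: set(particles) for g in goal}       # obligations at the current level
    for k in range(top, 0, -1):
        below: Labels = {}
        for prop, wanted in need.items():
            remaining = set(wanted)
            kept = remaining & props[k - 1].get(prop, set())   # persistence by default
            if kept:
                below.setdefault(prop, set()).update(kept)
                remaining -= kept
            while remaining:                                   # actions cover the rest
                options = supporters[k].get(prop, [])
                name, pre, covers = max(options, key=lambda s: len(s[2] & remaining),
                                        default=("", [], set()))
                covered = covers & remaining
                if not covered:
                    raise ValueError(f"{prop} unsupported at level {k}")
                plan[k - 1].add(name)
                for q in pre:                                  # support preconditions
                    below.setdefault(q, set()).update(covered)
                remaining -= covered
        need = below
    return plan


# Hand-built hypothetical graph with 4 particles; supporters[k] explains layer k.
ALL = {0, 1, 2, 3}
p0 = {"at(a)": ALL, "avail(soil,a)": {0}, "avail(soil,b)": {1, 2, 3}}
p1 = {**p0, "at(b)": ALL, "have(soil)": {0}}
p2 = {**p1, "have(soil)": ALL, "comm(soil)": {0}}
p3 = {**p2, "comm(soil)": ALL}
sample_a = ("sample(soil,a)", ["at(a)", "avail(soil,a)"], {0})
sample_b = ("sample(soil,b)", ["at(b)", "avail(soil,b)"], {1, 2, 3})
supporters = [
    {},
    {"have(soil)": [sample_a], "at(b)": [("drive(a,b)", ["at(a)"], ALL)]},
    {"have(soil)": [sample_a, sample_b], "comm(soil)": [("commun(soil)", ["have(soil)"], {0})]},
    {"comm(soil)": [("commun(soil)", ["have(soil)"], ALL)]},
]
plan = extract_plan([p0, p1, p2, p3], supporters, ["comm(soil)"], ALL)
heuristic = sum(len(layer) for layer in plan)   # number of actions in the extracted plan
```

In this example comm(soil) persists for particle 0 at the top level and commun covers particles 1 to 3, giving five actions in total across the three action layers.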
13. Related CPP planners
- Finite horizon (maximize probability of a k-step plan)
  - POMDP VI
  - MAXPLAN (Littman, 1998)
  - CPplan (Hyafil and Bacchus, 2003, 2004)
  - COMPLAN (Huang, 2006)
- Indefinite horizon (minimize plan length to exceed goal probability threshold)
  - Buridan (Kushmerick et al., 1995)
  - Probapop (Onder et al., 2006)
  - PFF (Domshlak and Hoffmann, 2006)
  - POND w/ McLUG (Bryce et al., 2006)
Finite-horizon planners can solve the indefinite-horizon problem by iterating over planning horizons.
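Iterating over planning horizons can be sketched as a simple loop around any finite-horizon solver. The solver interface (`solve_k` returning a plan and its goal probability) and the toy solver are hypothetical stand-ins, not the API of any planner listed above.

```python
from typing import Callable, List, Optional, Tuple

# Assumed interface: given horizon k, return (best k-step plan, its goal probability).
KStepSolver = Callable[[int], Tuple[List[str], float]]


def shortest_plan_meeting(solve_k: KStepSolver, threshold: float,
                          max_horizon: int) -> Optional[Tuple[int, List[str], float]]:
    """Increase the horizon until the best k-step plan reaches the threshold."""
    for k in range(1, max_horizon + 1):
        plan, prob = solve_k(k)
        if prob >= threshold:
            return k, plan, prob
    return None   # no plan within max_horizon meets the threshold


def toy_solver(k: int) -> Tuple[List[str], float]:
    # Stand-in: repeat a hypothetical action that succeeds with probability 0.5.
    return ["try"] * k, 1.0 - 0.5 ** k


result = shortest_plan_meeting(toy_solver, threshold=0.8, max_horizon=10)
```

With the toy solver the probabilities are 0.5, 0.75, and 0.875 for horizons 1 to 3, so the loop stops at k = 3.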
14. Evaluation
- POND w/ McLUG
  - Forward-chaining A* search
  - C/C++, IPP, CUDD (BDDs)
  - Adjust particles
  - Average over 5 runs
- CPplan (increment plan lengths)
- Gave POND several goal probability thresholds, matching plans found by CPplan
- 2.66 GHz P4, 1 GB RAM, 20-minute timeout
- Domains: Logistics, 10x10 Grid, Slippery Gripper, Sand-Castle-67
15. Logistics (instances named P packages-cities-locs per city)
[Figure: plots of time (s) and plan length for P2-2-2, P2-2-4, and P4-2-2.]
Significantly more scalable, with comparable quality.
16. Grid
[Figure: plots of time (s) and plan length for Grid(0.8) and Grid(0.5).]
Again, good scalability and quality!
Need more particles for broad beliefs.
17. Slippery Gripper and SandCastle-67
[Figure: plots of time and plan length.]
18. Conclusion
- Monte Carlo is effective in relaxing relaxed reachability analysis
- Increasing the number of particles trades heuristic computation time for informedness
- Approximating plan suffixes is preferable in bigger problems, where CPplan exhausts memory
- Should be useful in non-probabilistic planning when planning graph computations are costly
- Future work
  - How many particles?
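One rough way to reason about the particle trade-off is the standard error of a sampled proportion. This is a back-of-the-envelope sketch, not a result from the talk: it assumes particles behave like independent samples, so the fraction supporting the goal has standard error sqrt(p(1 - p)/N). The function names are hypothetical.

```python
import math


def std_error(p: float, n_particles: int) -> float:
    """Standard error of a proportion estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n_particles)


def particles_for(p: float, target_se: float) -> int:
    """Smallest sample count whose standard error is at most target_se."""
    return math.ceil(p * (1.0 - p) / target_se ** 2)


se_4 = std_error(0.5, 4)      # worst case p = 0.5 with N = 4 particles
se_16 = std_error(0.5, 16)    # four times as many particles
```

Under this assumption, quadrupling the number of particles halves the error of the goal probability estimate, while heuristic computation cost grows with N.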