1
Heuristic Search Planning: Progression and
Regression
Alan Fern
  • A heuristic for STRIPS problems
  • Forward search (HSP, HSP2.0)
  • Regression
  • Regression search (HSPr)
  • Based in part on slides by Daniel Weld and Dana
    Nau

2
Planning as heuristic search
  • Use standard search techniques, e.g. A*,
    best-first, hill-climbing, etc.
  • Attempt to extract a heuristic state evaluator
    automatically from the STRIPS encoding of the
    domain
  • Here, the heuristic is based on a relaxed problem
    that assumes action preconditions are independent
    and ignores delete effects

3
Review: Heuristic Search
  • A* search is a best-first search using the node
    evaluation
  • f(s) = g(s) + h(s)
  • where
  • g(s) = accumulated cost / number of actions
  • h(s) = estimate of future cost
  • h(s) is admissible if it does not overestimate
    the cost to the goal
  • For admissible h(s), A* returns optimal solutions
    (a generic sketch follows)
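To make the review concrete, here is a minimal Python sketch of A*-style best-first search. The `successors`, `is_goal`, and `h` callbacks are assumptions for illustration (the slides do not fix an interface), and states are assumed hashable.

```python
import heapq
import itertools

def astar(start, successors, is_goal, h):
    """Best-first search on f(s) = g(s) + h(s).
    successors(s) yields (action, next_state, cost) triples."""
    tie = itertools.count()                    # break ties in the heap
    frontier = [(h(start), next(tie), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        f, _, g, state, plan = heapq.heappop(frontier)
        if is_goal(state):
            return plan                        # list of actions
        for action, nxt, cost in successors(state):
            g2 = g + cost
            if g2 < best_g.get(nxt, float('inf')):
                best_g[nxt] = g2
                heapq.heappush(
                    frontier,
                    (g2 + h(nxt), next(tie), g2, nxt, plan + [action]))
    return None                                # no plan exists
```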

4
Heuristic from Relaxed Problem
  • The relaxed problem ignores delete lists on
    actions
  • The length of an optimal solution for the relaxed
    problem is an admissible heuristic for the original
    problem. Why? (Every real plan also solves the
    relaxed problem, so the optimal relaxed plan can be
    no longer.)
  • BUT finding an optimal relaxed solution is still
    NP-hard
  • So we will approximate it
  • One way is to explicitly search for a relaxed
    plan (see the action-relaxation sketch below)
  • Finding a relaxed plan can be done in polynomial
    time
  • Take the relaxed-plan length to be the heuristic
    value
  • FF (for FastForward) is one such well-known
    planner
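As a tiny illustration of the relaxation, a STRIPS action can be stored as a (name, PRE, ADD, DEL) tuple of frozensets and relaxed by emptying its delete list; `move_a_b` below is a hypothetical blocksworld action, not one from the slides.

```python
# A STRIPS action as a (name, PRE, ADD, DEL) tuple of frozensets.
move_a_b = ('move(A,B)',
            frozenset({'clear(A)', 'clear(B)', 'on(A,table)'}),  # PRE
            frozenset({'on(A,B)'}),                              # ADD
            frozenset({'clear(B)', 'on(A,table)'}))              # DEL

def relax(action):
    """Relaxed version of a STRIPS action: the delete list is
    dropped; preconditions and add effects are unchanged."""
    name, pre, add, _ = action
    return (name, pre, add, frozenset())
```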

5
FF Planner: finding relaxed plans
  • Consider running Graphplan while ignoring the
    delete lists
  • No mutexes
  • Implies no backtracking during the solution
    extraction search!
  • So we can find relaxed solutions efficiently
  • After running the no-delete-list Graphplan, take
    the number of actions in the layered plan to be the
    heuristic (a sketch follows)
  • Different choices in solution extraction can lead
    to different heuristic values
  • The planner FastForward (FF) uses this heuristic
    in forward state-space best-first search
  • It actually uses several improvements over this
  • Took first place in the AIPS-2000 planning
    competition
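A sketch of that procedure, reusing the (name, PRE, ADD, DEL) action tuples from the slide-4 snippet (delete lists are simply ignored). This illustrates the idea; it is not FF's actual implementation.

```python
def ff_heuristic(state, actions, goal):
    """Relaxed-plan heuristic: Graphplan-style expansion with delete
    lists ignored, then backward extraction of a layered plan whose
    action count is the heuristic value."""
    # Forward phase: first level at which each proposition/action appears.
    prop_level = {p: 0 for p in state}
    act_level = {}
    facts, level = set(state), 0
    while not goal <= facts:
        level += 1
        new_facts = set(facts)
        for name, pre, add, _ in actions:      # delete lists ignored
            if pre <= facts:
                act_level.setdefault(name, level)
                new_facts |= add
        if new_facts == facts:
            return float('inf')                # goal unreachable even relaxed
        for p in new_facts - facts:
            prop_level[p] = level
        facts = new_facts
    # Backward phase: no deletes means no mutexes, so each subgoal can
    # be satisfied by any achiever without backtracking.
    plan, pending = set(), {p: prop_level[p] for p in goal}
    for lvl in range(max(pending.values(), default=0), 0, -1):
        for p in [q for q, l in pending.items() if l == lvl]:
            name, pre, add, _ = next(
                a for a in actions
                if p in a[2] and act_level.get(a[0], lvl + 1) <= lvl)
            plan.add((name, lvl))              # count action once per level
            for q in pre:                      # its preconditions become
                pending.setdefault(q, prop_level[q])   # earlier subgoals
    return len(plan)
```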

6
Example: Finding Relaxed Plans
[Figure: relaxed plan graph (no mutexes)]
The value returned depends on the particular choices
made in the backward extraction
7
HSP: Indirect Relaxed Plan Length
  • HSP preceded FF and was one of the first
    successful state-space heuristic search planners
  • HSP does not compute a relaxed plan explicitly
  • It uses recursive equations to compute bounds on
    the relaxed plan length
  • Δ(s,p) = minimum distance from state s to a state
    containing proposition p
  • Δ(s,g) = minimum distance from state s to a state
    containing every proposition p in goal set g
  • Since these are NP-hard to compute, we will
    instead compute Δ0(s,p) and Δ0(s,g),
  • estimates of Δ(s,p) and Δ(s,g)

8
Heuristic Functions for Planning
  • Δ0(s,p) and Δ0(s,g) are estimates of Δ(s,p) and
    Δ(s,g), defined by
  • Δ0(s,p) = 0, if p ∈ s
  • Δ0(s,p) = min { 1 + Σ q ∈ PRE(a) Δ0(s,q) :
    p ∈ ADD(a) }, if p ∉ s
  • Δ0(s,g) = Σ p ∈ g Δ0(s,p)
  • h(s) = Δ0(s,g), where g is the goal

9
Admissibility
  • Is h admissible?
  • No. It assumes subgoals are independent, but
    they may not be.
  • I.e. it assumes that the cost of achieving a set
    of subgoals is the sum of costs of achieving them
    independently
  • In reality, achieving one subgoal can help
    achieve another subgoal
  • Consider an alternative heuristic hmax(s) =
    Δ0(s,g), where we redefine Δ0(s,g) to be
  • Δ0(s,g) = max p ∈ g Δ0(s,p)
  • Is hmax admissible?
  • Yes.
  • Why would we use h instead of hmax then?
  • h is more informative, which tends to lead us to
    the goal more quickly (both aggregations are
    written out below)
  • But the solutions found may not be optimal
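The two aggregations side by side, assuming a dict `delta0` that maps each proposition p to Δ0(s,p) (computed as on the next slide):

```python
def h_add(delta0, goal):
    """Sum aggregation: more informative, but can overestimate
    (not admissible)."""
    return sum(delta0.get(p, float('inf')) for p in goal)

def h_max(delta0, goal):
    """Max aggregation: admissible, but far less informative."""
    return max((delta0.get(p, float('inf')) for p in goal), default=0)
```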

10
Computing the Heuristic
  • Given current state s, we can compute Δ0(s,p) for
    every proposition p in polynomial time (a direct
    transcription into code follows)
  • 1) Set Δ0(s,p) = 0 if p ∈ s, otherwise Δ0(s,p) = ∞
  • 2) R = s, the reachable set of propositions
  • 3) repeat until no change to any Δ0(s,p):
  •      for each action a such that PRE(a) ⊆ R do
  •        for each p ∈ ADD(a) do
  •          add p to R
  •          Δ0(s,p) = min { Δ0(s,p),
             1 + Σ q ∈ PRE(a) Δ0(s,q) }
  • From this, compute h(s) = Δ0(s,g) = Σ p ∈ g
    Δ0(s,p)
  • Can be viewed as a plan graph expansion that
    ignores delete effects (R stores the propositions
    in the most recent level)
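A direct transcription of steps 1)-3) into Python, again over (name, PRE, ADD, DEL) tuples; the inner sum renames its bound variable to q to avoid shadowing p.

```python
def delta0(state, actions):
    """Delta0(s,p) for every reachable proposition p, via the fixpoint
    loop above. Unreached propositions keep the implicit value infinity
    (they simply never enter the dict)."""
    d = {p: 0 for p in state}          # step 1: Delta0(s,p) = 0 for p in s
    reached = set(state)               # step 2: R = s
    changed = True
    while changed:                     # step 3: repeat until no change
        changed = False
        for name, pre, add, _ in actions:      # delete lists ignored
            if pre <= reached:
                cost = 1 + sum(d[q] for q in pre)
                for p in add:
                    reached.add(p)
                    if cost < d.get(p, float('inf')):
                        d[p] = cost
                        changed = True
    return d

# h(s) is then the sum over the goal g:
# h = sum(delta0(s, actions).get(p, float('inf')) for p in g)
```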

11
HSP algorithm overview
  • Hill-climbing search based on h(s) (a rough
    sketch follows this list)
  • randomly breaks ties
  • restarts if no progress is made for a given
    number of steps
  • Some ad hoc choices for the planning competition
  • Hill-climbing search is not complete and is not
    guaranteed optimal
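A rough sketch of that strategy. `max_plateau` and `max_restarts` are made-up stand-ins for HSP's unspecified parameters, and `successors(s)` is assumed to yield (action, next_state) pairs.

```python
import random

def hill_climb(s0, successors, is_goal, h, max_plateau=50, max_restarts=20):
    """Move to a best-h successor, breaking ties at random; restart
    from s0 when h has not improved for max_plateau steps."""
    for _ in range(max_restarts):
        state, plan = s0, []
        best_h, plateau = h(s0), 0
        while not is_goal(state):
            succs = list(successors(state))
            if not succs or plateau > max_plateau:
                break                          # dead end or stuck: restart
            lo = min(h(n) for _, n in succs)
            action, state = random.choice(
                [(a, n) for a, n in succs if h(n) == lo])
            plan.append(action)
            if lo < best_h:
                best_h, plateau = lo, 0
            else:
                plateau += 1
        if is_goal(state):
            return plan
    return None                                # incomplete: may fail
```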

12
HSP2 overview
  • Based on weighted A* (WA*) search
  • f(n) = g(n) + W · h(n) (see the snippet below)
  • If W = 1, it's A* (with admissible h).
  • If W > 1, it's a little greedy: generally finds
    solutions faster, but not optimal (within a factor
    of W of optimal).
  • In HSP2, W = 5
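Relative to the A* sketch on slide 3, only the node evaluation changes:

```python
def weighted_f(g, h_val, W=5):
    """Weighted A* evaluation: f(n) = g(n) + W * h(n).
    W = 1 recovers A*; HSP2 fixes W = 5."""
    return g + W * h_val
```

Pushing frontier nodes with this priority instead of g + h turns the earlier best-first sketch into WA*.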

13
Experiments
  • Does OK compared with IPP (a Graphplan
    derivative) and Blackbox.

14
Regression search
  • Motivation for HSPr
  • HSP and HSP2 spend up to 80% of their time
    computing the evaluation function.
  • Search backwards from the goal. This will allow
    reuse of the heuristic computation

[Figure: regression searches back from the goal (a
partial state) toward the initial state]
Problem: Many possible goal states are equally
acceptable, since the goal is only a partial
specification. From which one should we search?
15
Regression
  • Let G be a goal (set of facts)
  • The regression of a goal G through an action A,
    REG(G,A), yields
  • the weakest precondition G′ (the least
    constraining G′)
  • such that if G′ is true before A is executed,
    then G is guaranteed to be true afterwards

[Figure: G′ = REG(G,A) holds before action A (whose
precondition and effect are shown); G holds after.
Both G and G′ represent sets of world states.]
16
Regressing STRIPS Actions
  • An action A is relevant for G if
  • G ∩ ADD(A) ≠ ∅
  • G ∩ DEL(A) = ∅
  • The result of regressing G through A is
  • REG(G,A) = (G − ADD(A)) ∪ PRE(A)
    (a sketch follows the example on the next slide)

17
Regression Example
  • Action pickup(C): PRE = {clear(C), ontable(C),
    handempty}, ADD = {holding(C)}, DEL = {clear(C),
    handempty, ontable(C)}
  • Goal G = {holding(C), on(A,B)}
  • G′ = REG(G, pickup(C)) = {clear(C), ontable(C),
    handempty, on(A,B)}
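A sketch of relevance and regression over the same action tuples as before, replayed on this slide's pickup(C) example:

```python
def relevant(goal, action):
    """A is relevant for G iff it adds part of G and deletes none of G."""
    _, _, add, dele = action
    return bool(goal & add) and not (goal & dele)

def regress(goal, action):
    """REG(G,A) = (G - ADD(A)) | PRE(A); only defined for relevant A."""
    _, pre, add, _ = action
    assert relevant(goal, action)
    return (goal - add) | pre

pickup_c = ('pickup(C)',
            frozenset({'clear(C)', 'ontable(C)', 'handempty'}),   # PRE
            frozenset({'holding(C)'}),                            # ADD
            frozenset({'clear(C)', 'handempty', 'ontable(C)'}))   # DEL
g = frozenset({'holding(C)', 'on(A,B)'})
print(sorted(regress(g, pickup_c)))
# -> ['clear(C)', 'handempty', 'on(A,B)', 'ontable(C)']
```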
18
HSPr search space
  • Search nodes are sets of atoms (corresponding to
    sets of states in the original space)
  • The initial search node n0 is the goal G
  • Goal nodes are those that are true in the initial
    state s0
  • The heuristic value for a search node g is
    h(g) = Δ0(s0,g) = Σ p ∈ g Δ0(s0,p)
  • Note that we can compute Δ0(s0,p) before search
    begins and reuse the values during search, avoiding
    significant computation (see the snippet below)!
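In code, the reuse amounts to one Δ0 pass up front followed by cheap per-node sums. This assumes the `delta0` function from the slide-10 sketch; `s0` and `relaxed_actions` are placeholders for the problem's initial state and action set.

```python
def make_h_regression(s0, relaxed_actions):
    """Build the HSPr heuristic: one Delta0 pass over the initial
    state s0 (delta0 is the slide-10 sketch), reused for every node."""
    d0 = delta0(s0, relaxed_actions)   # computed once, before search
    def h(g):
        # h(g) = sum over p in g of Delta0(s0, p)
        return sum(d0.get(p, float('inf')) for p in g)
    return h
```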

19
Mutexes in HSPr
  • Problem: many of the regressed goal states are
    impossible; prune them with mutexes
  • E.g., in blocksworld, (on(c,d), on(a,d), ...) is
    unreachable.
  • How can we detect and prune this set during
    regression?
  • Compute a set of mutex propositions and only
    consider regression results that do not include
    mutexed pairs.

20
Mutexes in HSPr
  • First definition:
  • A set M of pairs R = {p, q} is a mutex set if
  • (1) R is not true in s0, and
  • (2) for every action A that adds p, A deletes q
    (and vice versa, swapping the roles of p
    and q)
  • Sound, but too weak: it will not recognize many
    mutexed propositions

21
Mutexes in HSPr, take 2
  • Better definition:
  • A set M of pairs R = {p, q} is a mutex set if
  • (1) R is not true in s0, and
  • (2) for every action A that adds p,
  •      either A deletes q,
  •      or A does not add q, and for some
         precondition r of A,
  •      {r, q} is in M (and vice versa,
         swapping the roles of p and q)
  • The recursive definition allows for some
    interaction between the operators

22
Computing mutex sets
  • Start with some set of potential mutex pairs
  • Delete any that don't satisfy (1) and (2) above
  • Keep going until you don't delete any more (a
    fixpoint; sketched below)
  • Initial set? Could be all pairs (usually too
    expensive)
  • The paper gives one suggestion for a smaller
    initial set.
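A sketch of that fixpoint under the take-2 definition. `candidates` is whatever initial set of pairs (of distinct atoms) one chooses, `actions` are (name, PRE, ADD, DEL) tuples as in the earlier sketches, and `s0` is the set of atoms true initially.

```python
def mutex_fixpoint(candidates, actions, s0):
    """Repeatedly delete candidate pairs that fail condition (1) or (2)
    of the take-2 definition, until nothing changes."""
    M = {frozenset(pair) for pair in candidates}

    def violates(p, q, M):
        # Some action that adds p neither deletes q nor satisfies the
        # "does not add q and has a precondition mutex with q" escape.
        for _, pre, add, dele in actions:
            if p in add and q not in dele:
                if q in add or not any(frozenset({r, q}) in M for r in pre):
                    return True
        return False

    changed = True
    while changed:
        changed = False
        for R in list(M):
            p, q = tuple(R)
            if (R <= s0                   # fails (1): R is true in s0
                    or violates(p, q, M)  # fails (2) for p's adders
                    or violates(q, p, M)):  # fails (2) for q's adders
                M.discard(R)
                changed = True
    return M
```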

23
HSPr algorithm
  • Compute the mutex set M
  • Compute the heuristic value for each proposition
  • WA* search using h(g) as the heuristic, pruning
    states that violate M
  • W = 5 as before

24
Experiments comparing HSP2 and HSPr
  • Sometimes HSPr does better, sometimes HSP2 does
    better. Why?
  • Two reasons (per Bonet & Geffner):
  • HSPr saves significant time per search node
  • But regression still yields spurious states
  • Also, since HSP2 recomputes the estimate in each
    state, it actually has more information