1
Lookahead pathology in real-time pathfinding
  • Mitja Luštrek
  • Jožef Stefan Institute, Department of Intelligent
    Systems
  • Vadim Bulitko
  • University of Alberta, Department of Computer
    Science

2
  • Introduction
  • Problem
  • Explanation
  • Remedy

3
Real-time single-agent heuristic search
  • Task
  • find a path from a start state to a goal state
  • Complete search
  • plan the whole path to the goal state
  • execute the plan
  • example: A* [Hart et al. 68]
  • good: given an admissible heuristic, the path is optimal
  • bad: the delay before the first move can be large

4
Real-time single-agent heuristic search
  • Incomplete search
  • plan a part of the path to the goal
  • execute the plan
  • repeat
  • example: LRTA* [Korf 90], LRTS [Bulitko & Lee 06]
  • good: the delay before the first move is small, the amount of planning per move is bounded
  • bad: the path is typically not optimal

5
Why do we need it?
  • Picture a real-time strategy game
  • The user commands dozens of units to move towards
    a distant goal
  • Complete search would have to compute the whole
    paths for all of them
  • Incomplete search computes just the first couple
    of steps

6
Heuristic lookahead search
[Figure: lookahead area of depth d around the current state; the goal state lies outside it]
7
Heuristic lookahead search
  • f = g + h for each frontier state of the lookahead area
  • g: true shortest distance from the current state to the frontier state
  • h: estimated shortest distance from the frontier state to the goal
8
Heuristic lookahead search
  • The frontier state with the lowest f is selected; its f value is denoted f_opt
9
Heuristic lookahead search
10
Heuristic lookahead search
  • The heuristic value of the current state is updated: h ← f_opt
11
Heuristic lookahead search
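A minimal, self-contained sketch of this planning step follows (an illustration only, not the authors' HOG implementation). The toy grid, the 4-connected unit-cost moves, the Manhattan heuristic, the max-based update and re-planning after every move are simplifying assumptions made here; LRTS itself executes several moves per search (see slide 41).

    from collections import deque

    GRID = ["........",
            "..####..",
            "........"]                       # '#' marks an obstacle; each move costs 1

    def neighbors(state):
        x, y = state
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= ny < len(GRID) and 0 <= nx < len(GRID[0]) and GRID[ny][nx] != '#':
                yield (nx, ny)

    def lookahead_step(current, goal, d, h):
        """Search the lookahead area of depth d, pick the frontier state with the
        lowest f = g + h, and return (first move towards it, f_opt)."""
        g = {current: 0}                      # true shortest distances inside the area
        first_move = {}                       # first move on the path to each state
        frontier = []
        queue = deque([current])
        while queue:
            s = queue.popleft()
            if g[s] == d or s == goal:        # rim of the lookahead area (or the goal)
                frontier.append(s)
                continue
            for n in neighbors(s):
                if n not in g:
                    g[n] = g[s] + 1
                    first_move[n] = first_move.get(s, n)
                    queue.append(n)
        best = min(frontier, key=lambda s: g[s] + h(s))
        return first_move.get(best, best), g[best] + h(best)

    def solve(start, goal, d):
        """Plan, update the heuristic of the current state, move; repeat."""
        learned = {}                          # heuristic values raised by learning
        h = lambda s: learned.get(s, abs(s[0] - goal[0]) + abs(s[1] - goal[1]))
        path, current = [start], start
        while current != goal:
            nxt, f_opt = lookahead_step(current, goal, d, h)
            learned[current] = max(h(current), f_opt)   # h(current) <- f_opt
            current = nxt
            path.append(current)
        return path

    print(len(solve((0, 0), (7, 2), d=3)) - 1, "moves with lookahead depth 3")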
12
Lookahead pathology
  • Generally believed that larger lookahead depths
    produce better solutions
  • Solution-length pathology: larger lookahead depths produce worse solutions

Lookahead depth   Solution length
       1                11
       2                10
       3                 8
       4                10
       5                 7
       6                 8
       7                 7
Degree of pathology: 2
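Consistent with this table (and with the error table on the next slide), the degree of pathology can be read as the number of times the measure gets worse when the depth increases by one. A small sketch, with an optional tolerance factor t anticipating slide 22:

    def degree_of_pathology(values_by_depth, t=1.0):
        """Count the depth increments at which the measure (solution length or
        error) gets worse; an increase only counts if the deeper value exceeds
        t times the shallower one (t = 1.0 counts every increase)."""
        return sum(1 for shallow, deep in zip(values_by_depth, values_by_depth[1:])
                   if deep > t * shallow)

    # Solution lengths from the table above, for depths 1..7:
    print(degree_of_pathology([11, 10, 8, 10, 7, 8, 7]))   # -> 2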
13
Lookahead pathology
  • Pathology on states that do not form a path
  • Error pathology: larger lookahead depths produce more suboptimal decisions

Multiple states:
  Depth   Error
    1     0.31
    2     0.25
    3     0.21
    4     0.24
    5     0.18
    6     0.23
    7     0.12
  Degree of pathology: 2

One state:
  Depth   Decision
    1     suboptimal
    2     suboptimal
    3     optimal
    4     optimal
    5     optimal
    6     suboptimal
    7     suboptimal
  There is pathology
14
Related minimax pathology
  • Minimax backs up heuristic values from the leaves
    of the game tree to the root
  • Attempts to explain why backed-up heuristic
    values are better than static values
  • Theoretical analyses show that they are worse: the pathology [Nau 79, Beal 80]
  • Explanations
  • similarity of nearby positions in real games
  • realistic modeling of error
  • ...
  • Focus on why the pathology does not appear in
    practice

15
Related pathology in single-agent search
  • Discovered on synthetic search trees [Bulitko et al. 03]
  • Observed in the eight puzzle [Bulitko 03]
  • appears with different evaluation functions
  • shown that the benefit from knowing the optimal lookahead depth is large
  • Explained on synthetic search trees [Luštrek 05]
  • caused by certain properties of trees
  • caused by inconsistent and inadmissible heuristics
  • Unexplored in pathfinding

16
  • Introduction
  • Problem
  • Explanation
  • Remedy

17
Our setting
  • HOG (Hierarchical Open Graph) [Sturtevant et al.]
  • Maps from commercial computer games (Baldur's Gate, Warcraft III)
  • Initial heuristic: octile distance (true distance assuming an empty map)
  • 1,000 problems (map, start state, goal state)
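For reference, the octile distance is the true distance on an empty 8-connected map with straight moves of cost 1 and diagonal moves of cost √2; a small sketch:

    import math

    def octile_distance(a, b):
        """True shortest distance between two cells on an empty grid with
        unit-cost straight moves and sqrt(2)-cost diagonal moves."""
        dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
        return max(dx, dy) + (math.sqrt(2) - 1) * min(dx, dy)

    print(octile_distance((0, 0), (3, 5)))   # 3 diagonal + 2 straight steps, about 6.24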





18
On-policy experiments
  • The agent follows a path from the start state to
    the goal state, updating the heuristic along the
    way
  • Solution length and error over the whole path are computed for each lookahead depth → pathology

[Figure: example paths followed at lookahead depths d = 1, 2, 3]
19
Off-policy experiments
  • The agent spawns in a number of states
  • It takes one move towards the goal state
  • Heuristic not updated
  • Error is computed from these first moves → pathology

[Figure: spawn states and the first moves chosen at lookahead depths d = 1, 2, 3]
20
Basic on-policy experiment
Degree of pathology         0      1      2      3      4      5
Length (% of problems)   38.1   12.8   18.2   16.1    9.5    5.3
Error (% of problems)    38.5   15.1   20.3   17.0    7.6    1.5
  • A lot of pathology: over 60%!
  • First explanation: a lot of states are intrinsically pathological (off-policy mode)
  • Not true: only 3.9% are
  • If the topology of the maps is not at fault,
    perhaps the algorithm is to blame?

21
Off-policy experiment on 188 states
  • Comparison not fair
  • on-policy: pathology from error over a number of states
  • off-policy: pathologicalness of single states
  • Fair: off-policy error over the same number of states as on-policy, 188 (chosen randomly)
  • Only error can be used: there is no solution length off-policy

Degree of pathology      0      1      2      3      4
Problems (%)          57.8   31.4    9.4    1.4    0.0
  • Not much less pathology than on-policy: 42.2% vs. 61.5%

22
Tolerance
  • The first off-policy experiment showed little
    pathology, the second one quite a lot
  • Perhaps off-policy pathology is caused by minor differences in error (noise)
  • Introduce tolerance t
  • an increase in error counts towards the pathology only if error(d1) > t · error(d2), where d1 is the larger depth
  • set t so that the pathology in the off-policy experiment on 188 states is < 5%: t = 1.09
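Reusing the degree_of_pathology sketch from slide 12, the tolerance simply scales the comparison; with the error values from slide 13:

    errors = [0.31, 0.25, 0.21, 0.24, 0.18, 0.23, 0.12]    # depths 1..7 (slide 13)
    print(degree_of_pathology(errors))            # -> 2 with t = 1.0
    print(degree_of_pathology(errors, t=1.09))    # -> 2: both increases exceed the tolerance here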

23
Experiments with t = 1.09
Degree of pathology            0      1      2      3      4      5
On-policy (% of problems)   42.3   19.7   21.2   12.9    3.6    0.3
Off-policy (% of problems)  95.7    3.7    0.6    0.0    0.0    0.0
  • On-policy changes little vs. t = 1: 57.7% vs. 61.9%
  • Apparently on-policy pathology is more severe than off-policy
  • Investigate why!
  • The above two experiments are referred to as the basic on-policy experiment and the basic off-policy experiment

24
  • Introduction
  • Problem
  • Explanation
  • Remedy

25
Hypothesis 1
  • LRTS tends to visit pathological states with an
    above-average frequency
  • Test: compute the pathology from the states visited on-policy instead of from 188 random states

Degree of pathology      0      1      2      3      4
Problems (%)          93.6    5.3    0.9    0.2    0.0
  • More pathology than in random states: 6.3% vs. 4.3%
  • Much less pathology than basic on-policy: 6.3% vs. 57.7%
  • Hypothesis 1 is correct, but it is not the main reason for on-policy pathology

26
Is learning the culprit?
  • There is learning (updating the heuristic)
    on-policy, but not off-policy
  • Learning is necessary on-policy, otherwise the agent gets caught in infinite loops
  • Test: traverse paths in the normal on-policy manner, but measure the error without learning

Degree of pathology      0      1      2      3      4      5
Problems (%)          79.8   14.2    4.5    1.2    0.3    0.0
  • Less pathology than basic on-policy: 20.2% vs. 57.7%
  • Still more pathology than basic off-policy: 20.2% vs. 4.3%
  • Learning is a reason, although not the only one

27
Hypothesis 2
  • Larger fraction of updated states at smaller
    depths

[Figure: the current lookahead area and the previously updated states it overlaps]
28
Hypothesis 2
  • Smaller lookahead depths benefit more from
    learning
  • This makes their decisions better than the mere
    depth suggests
  • Thus they are closer to larger depths
  • If they are closer to larger depths, cases where
    a larger depth happens to be worse than a smaller
    depth are more common
  • Test: equalize the depths by learning as much as possible in the whole lookahead area (uniform learning)

29
Uniform learning
30
Uniform learning
Search
31
Uniform learning
Update
32
Uniform learning
Search
33
Uniform learning
Update
34
Uniform learning
35
Uniform learning
36
Uniform learning
37
Uniform learning
38
Pathology with uniform learning
Degree of pathology      0      1      2      3      4      5
Problems (%)          40.9   20.2   22.1   12.3    4.2    0.3
  • Even more pathology than basic on-policy: 59.1% vs. 57.7%
  • Is Hypothesis 2 wrong?
  • Let us look at the volume of heuristic updates
    encountered per state generated during search
  • This seems to be the best measure of the benefit
    of learning

39
Volume of updates encountered
  • Hypothesis 2 is correct after all

40
Consistency
  • Initial heuristic is consistent
  • the difference in heuristic value between two
    states does not exceed the actual shortest
    distance between them
  • Updates make it inconsistent
  • Research on synthetic trees showed that inconsistency causes pathology [Luštrek 05]
  • Uniform learning preserves consistency
  • It is more pathological than regular learning
  • Consistency is not a problem in our case

41
Hypothesis 3
  • On-policy: one search every d moves, so fewer searches at larger depths
  • Off-policy: one search every move

42
Hypothesis 3
  • The difference between depths in the amount of
    search is smaller on-policy than off-policy
  • This makes the depths closer on-policy
  • If they are closer, cases where a larger depth
    happens to be worse than a smaller depth are more
    common
  • Test: search every move on-policy

43
Pathology when searching every move
Degree of pathology      0      1      2      3      4      5
Problems (%)          86.9    9.0    3.3    0.6    0.2    0.0
  • Less pathology than basic on-policy: 13.1% vs. 57.7%
  • Still more pathology than basic off-policy: 13.1% vs. 4.3%
  • Hypothesis 3 is correct; the remaining pathology is due to Hypotheses 1 and 2
  • Further test: the number of states generated per move

44
States generated / move
  • Hypothesis 3 confirmed again

45
Summary of explanation
  • On-policy pathology is caused by different lookahead depths being closer to each other in terms of the quality of decisions than the mere depths would suggest
  • due to the volume of heuristic updates encountered per state generated
  • due to the number of states generated per move
  • LRTS tends to visit pathological states with an
    above-average frequency

46
  • Introduction
  • Problem
  • Explanation
  • Remedy

47
Is a remedy worth looking for?
Averaged over 1,000 problems:
Depth   Length   States generated/move
  1      175.4       7.8
  2      226.4      29.0
  3      226.6      50.4
  4      225.3      69.7
  5      227.4      87.0
  6      221.0     102.2
  7      209.3     115.0
  8      199.6     126.4
  9      200.4     137.2
 10      187.0     146.3
  • Optimal lookahead depth selected for each problem
  • Solution length: 107.9
  • States generated/move: 73.6
  • The answer is yes: solution length improved by 38.5%

48
What can we do?
  • The house & garden map
  • Precompute the optimal depth for every start state

49
Optimal depth per start state
Averaged for the house & garden map:
Depth   Length   States generated/move
  1      253.2       7.8
  2      346.3      29.4
  3      329.1      50.4
  4      337.0      69.3
  5      358.9      85.7
  6      318.8     101.2
  7      283.6     116.2
  8      261.5     126.7
  9      282.6     133.2
 10      261.1     142.7
  • Optimal lookahead depth selected for each start state
  • Solution length: 132.4
  • States generated/move: 59.3
  • Similar to the 1,000 problems: the map is representative

50
Optimal depth per start state
51
Optimal depth per move
  • In the current state s, we can select the lookahead depth that would be optimal if we were starting in s
  • It might not be optimal because of the learning prior to reaching s, which would not have happened if we had started in s
  • House & garden map:
  • solution length even smaller than when adapting per start state: 113.3 vs. 132.4
  • fewer states generated/move: 34.0 vs. 59.3

52
Precomputation too expensive
  • The house & garden map has 8,743 states
  • That means 7.6 × 10^7 directed pairs of states
  • It could take months
  • If we were to go that far, we should just store
    the optimal paths instead, at least in a static
    environment

53
State abstraction
  • Clique abstraction [Sturtevant, Bulitko et al. 05]
  • Compute the optimal lookahead depth for the
    central ground-level state under each abstract
    state
  • Use that depth in all ground-level states under the abstract state (sketched below)
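A minimal sketch of the resulting lookup (not the authors' HOG code). Since the deck counts directed pairs of states, the table is keyed by (current, goal) abstract pairs; all names and numbers below are illustrative assumptions.

    # Clique abstraction: ground-level state -> abstract state (illustrative).
    abstract_of = {"s1": "A", "s2": "A", "s3": "B", "g1": "C"}
    # Precomputed offline for the central ground-level state of each abstract pair.
    optimal_depth = {("A", "C"): 3, ("B", "C"): 7}

    def depth_for(current, goal):
        """Lookahead depth to use in state `current` when heading for `goal`."""
        return optimal_depth[(abstract_of[current], abstract_of[goal])]

    print(depth_for("s2", "g1"))   # -> 3: every state under abstract state A shares it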

54
House & garden with abstraction
Abs. level   Abs. states   Length   States/move
    0           8,743       113.3      34.0      (no abstraction)
    1           2,463       124.6      38.3
    2             783       129.2      39.2
    3             296       133.4      40.9
    4             129       154.0      51.2
    5              58       169.3      50.5
    6              26       189.2      45.1
    7              12       235.7      55.5
    8               4       253.2       7.8
    9               1       253.2       7.8      (equivalent to a fixed depth of 1)
55
Abstraction level 5
  • 3,306 directed pairs of abstract states: 0.004% of ground-level pairs
  • Precomputed in a few hours, maybe even less
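These counts follow directly from the numbers above (58 abstract states at level 5, 8,743 ground-level states); a quick check:

    58 × 57 = 3,306 directed abstract pairs
    3,306 / (8,743 × 8,742) ≈ 4.3 × 10^-5 ≈ 0.004% of ground-level pairs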

56
Future work
  • Search on abstract states: even faster
  • Problem: correlation between the optimal lookahead depth at the abstract levels and at the ground level
  • Smarter selection of ground-level states to merge into abstract states
  • Problem: how does the topology of the maps affect the pathology

57
Thank you.
  • Questions?