Classical Situation - PowerPoint PPT Presentation
Title: Classical Situation
Slides: 27
Provided by: Sebasti47
Learn more at: https://ww2.odu.edu
Transcript and Presenter's Notes

Title: Classical Situation


1
Classical Situation
[Diagram: outcomes labeled "heaven" and "hell"]
  • World deterministic
  • State observable

2
MDP-Style Planning
[Diagram: outcomes labeled "heaven" and "hell"]
  • World stochastic
  • State observable

3
Stochastic, Partially Observable
[Diagram: uncertain outcomes "heaven?" and "hell?", with a "sign"]
Sondik 72, Littman/Cassandra/Kaelbling 97
4
Stochastic, Partially Observable
[Diagram: two configurations with "heaven" and "hell" swapped, each marked by a "sign"]
5
Stochastic, Partially Observable
[Diagram: outcomes "heaven", "hell", and "?" in several configurations, each marked by a "sign"]
6
Robot Planning Frameworks
7
MDP-Style Planning
[Diagram: outcomes labeled "heaven" and "hell"]
  • World stochastic
  • State observable

8
Markov Decision Process (discrete)
[State-transition diagram: states s1-s5 with rewards r = 1, 20, 0, 0, -10 and transition probabilities between 0.1 and 0.99]
Bellman 57, Howard 60, Sutton/Barto 98
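The diagram on this slide shows a discrete MDP. In standard notation (a reconstruction; the exact topology of the diagram is not recoverable from the transcript), it is a tuple of states, actions, transition probabilities, and rewards:

```latex
\text{MDP} = (S, A, p, r), \qquad
S = \{s_1, \dots, s_5\}, \qquad
p(s' \mid s, a) \in [0, 1], \qquad
r \in \{1,\; 20,\; 0,\; 0,\; -10\}.
```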
9
Value Iteration
  • Value function of policy π
  • Bellman equation for optimal value function
  • Value iteration recursively estimating value
    function
  • Greedy policy

Bellman 57, Howard 60, Sutton/Barto 98
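The equations behind these bullets were images in the original deck. As a hedged reconstruction, the standard forms they refer to are:

```latex
% Value function of policy \pi
V^{\pi}(s) = E_{\pi}\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t} r_{t} \,\Big|\, s_{0}=s\Big]

% Bellman equation for the optimal value function
V^{*}(s) = \max_{a}\Big[r(s,a) + \gamma \textstyle\sum_{s'} p(s' \mid s,a)\, V^{*}(s')\Big]

% Value iteration: recursively estimate the value function
V_{k+1}(s) \leftarrow \max_{a}\Big[r(s,a) + \gamma \textstyle\sum_{s'} p(s' \mid s,a)\, V_{k}(s')\Big]

% Greedy policy
\pi(s) = \operatorname*{argmax}_{a}\Big[r(s,a) + \gamma \textstyle\sum_{s'} p(s' \mid s,a)\, V(s')\Big]
```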
10
Value Iteration for Motion Planning (assumes
knowledge of the robot's location)
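As a concrete illustration, here is a minimal value-iteration sketch for grid-based motion planning with a known robot location. The 4x4 grid, goal cell, step cost of -1, and discount of 0.9 are hypothetical illustration values, not taken from the slides.

```python
# Value iteration on a small grid world (known robot location = MDP setting).
# Grid size, goal, step cost, and discount are illustrative assumptions.
GAMMA = 0.9
N = 4
GOAL = (3, 3)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

V = [[0.0] * N for _ in range(N)]
for _ in range(50):  # enough sweeps to reach the fixed point on a 4x4 grid
    V_new = [[0.0] * N for _ in range(N)]
    for i in range(N):
        for j in range(N):
            if (i, j) == GOAL:
                continue  # absorbing goal keeps value 0
            # Back up: step cost -1 plus discounted value of the best
            # successor; moves off the grid leave the robot in place.
            V_new[i][j] = max(
                -1.0 + GAMMA * V[min(max(i + di, 0), N - 1)][min(max(j + dj, 0), N - 1)]
                for di, dj in ACTIONS
            )
    V = V_new

def greedy_action(i, j):
    """Greedy policy: head for the successor cell of highest value."""
    return max(
        ACTIONS,
        key=lambda a: V[min(max(i + a[0], 0), N - 1)][min(max(j + a[1], 0), N - 1)],
    )
```

The greedy policy extracted from the converged values steers the robot along a shortest path to the goal.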
11
Continuous Environments
From A. Moore, C. G. Atkeson, "The Parti-Game Algorithm for Variable Resolution Reinforcement Learning in Continuous State Spaces", Machine Learning, 1995
12
Approximate Cell Decomposition Latombe 91
13
Parti-Game Moore 96
14
Robot Planning Frameworks
15
Stochastic, Partially Observable
16
A Quiz
actions / states / sensors: what is the size of the belief space?
  • 3: s1, s2, s3
  • 3: s1, s2, s3
  • 2^3 - 1 = 7: s1, s2, s3, s12, s13, s23, s123
  • 2-dim continuous: p(S = s1), p(S = s2)
  • ?-dim continuous
  • ?-dim continuous
  • aargh!
17
Introduction to POMDPs (1 of 3)
[Plot: value function over belief p(s1)]
Sondik 72, Littman, Kaelbling, Cassandra 97
18
Introduction to POMDPs (2 of 3)
[Plot over belief p(s1): linear value functions for actions a and b between states s1 and s2, with payoffs 100, -100, -40, 80 and crossing value 0]
Sondik 72, Littman, Kaelbling, Cassandra 97
19
Introduction to POMDPs (3 of 3)
[Plot over belief p(s1): linear value functions for actions a and b (payoffs 100, -100, -40, 80, 0) and a third action c with value 80]
Sondik 72, Littman, Kaelbling, Cassandra 97
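The plots on these slides show the key POMDP fact: each action's value is linear in the belief p(s1), and the optimal value is the upper envelope (pointwise max) of these lines. A minimal sketch follows; the payoff numbers 100/-100 (action a) and -40/80 (action b) come from the slides, but their pairing to actions and states is a reconstruction.

```python
# Each action's value is a linear function of the belief ("alpha vector");
# the optimal value is the pointwise max over actions. The pairing of the
# slide's payoffs to actions/states below is an assumption.
ALPHA = {
    "a": (100.0, -100.0),  # (payoff in s1, payoff in s2)
    "b": (-40.0, 80.0),
}

def action_value(action, p_s1):
    """Expected payoff of an action at belief p(s1): linear in p(s1)."""
    v1, v2 = ALPHA[action]
    return v1 * p_s1 + v2 * (1.0 - p_s1)

def value(p_s1):
    """Optimal value = upper envelope (pointwise max) of the lines."""
    return max(action_value(a, p_s1) for a in ALPHA)

def best_action(p_s1):
    return max(ALPHA, key=lambda a: action_value(a, p_s1))
```

Near p(s1) = 1 the aggressive action a dominates; near p(s1) = 0 the safe action b does, which is exactly the piecewise-linear shape in the figure.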
20
Value Iteration in POMDPs
Substitute b for s
  • Value function of policy π
  • Bellman equation for optimal value function
  • Value iteration recursively estimating value
    function
  • Greedy policy
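Substituting the belief b for the state s, the equations read as follows (a reconstruction in standard POMDP notation, matching the MDP forms term by term):

```latex
% Bellman equation over beliefs
V^{*}(b) = \max_{a}\Big[r(b,a) + \gamma \int p(b' \mid a, b)\, V^{*}(b')\, db'\Big]

% Greedy policy over beliefs
\pi(b) = \operatorname*{argmax}_{a}\Big[r(b,a) + \gamma \int p(b' \mid a, b)\, V(b')\, db'\Big]
```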

21
Missing Terms: Belief Space
  • Expected reward
  • Next state density

Bayes filters! (Dirac distribution)
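The two missing terms can be written out as follows (a standard reconstruction). The Dirac distribution appears because, given action a and observation o, the Bayes filter maps the belief b to exactly one next belief:

```latex
% Expected reward under the belief
r(b,a) = \int r(s,a)\, b(s)\, ds

% Next-belief density via the Bayes filter update B(b,a,o)
p(b' \mid a, b) = \sum_{o} p(o \mid a, b)\; \delta\big(b' - B(b,a,o)\big)
```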
22
Value Iteration in Belief Space
23
Why is This So Complex?
State Space Planning (no state uncertainty)
Belief Space Planning (full state uncertainties)
24
Augmented MDPs
[Diagram: conventional state space augmented with an uncertainty (entropy) dimension]
Roy et al, 98/99
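A minimal sketch of the augmented-state idea, assuming a discrete belief: instead of planning over the full belief space, the conventional state is extended with one scalar uncertainty summary, the entropy of the belief. Function names are illustrative, not from Roy et al.

```python
import math

def entropy(belief):
    """Shannon entropy (in bits) of a discrete belief distribution."""
    return -sum(p * math.log2(p) for p in belief if p > 0.0)

def augmented_state(state, belief):
    """Augmented-MDP state: conventional state plus belief uncertainty."""
    return (state, entropy(belief))
```

A uniform belief over four states has entropy 2 bits; a fully certain belief has entropy 0, so the augmented dimension distinguishes a well-localized robot from a lost one.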
25
Path Planning with Augmented MDPs
Conventional planner
Probabilistic Planner
Roy et al, 98/99
26
Robot Planning Frameworks