Structured Models for Decision Making

Structured Models for Decision Making. Daphne Koller, Stanford University, koller_at_cs.Stanford.edu. MURI Program on Decision Making under Uncertainty, July 18, 2000.

Transcript and Presenter's Notes



1
Structured Models for Decision Making
  • Daphne Koller
  • Stanford University
  • koller_at_cs.Stanford.edu

MURI Program on Decision Making under
Uncertainty, July 18, 2000
2
Roadmap
  • Static: Bayes Nets, PRMs (Encapsulation, Reuse)
  • Dynamic: DBNs, Dynamic PRMs (Encapsulation, Approximation)
  • Decision Problem: Factored MDPs, Relational MDPs (Factored Policy
    Iteration, Efficient PRM inference)
3
Outline
  • Probabilistic Relational Models
  • Representing complex domains
  • Structural uncertainty
  • Temporal models
  • Decision making

4
Basic units of knowledge
entities, properties, relations
attributes
5
So what?
  • Set of entities and relations between them is
    determined at BN design time
  • structure must be known in advance
  • hard to adapt to changes
  • BNs for complex domains are large and unstructured
  • ⇒ very hard to build
  • No ability to generalize
  • across similar individuals
  • across related situations

6
Probabilistic Relational Models
  • Combine advantages of predicate logic and BNs
  • natural domain modeling: objects, properties,
    relations
  • generalization over a variety of situations
  • compact, natural probability models.
  • Integrate uncertainty with relational model
  • properties of domain entities can depend on
    properties of related entities
  • uncertainty over relational structure of domain.

7
Real-World Case Study
Battlefield situation assessment for missile units
  • several locations
  • many units
  • each has detailed model
  • Example object classes
  • Battalion
  • Battery
  • Vehicle
  • Location
  • Weather.
  • Example relations
  • At-Location
  • Has-Weather
  • Sub-battery/In-battalion
  • Sub-vehicle/In-battery

8
Scud Battery: Simplified PRM
[Figure: simplified PRM with nodes Under Fire, Launcher, and Next Mission; annotation (Launcher.status ok)]
9
SCUD Battery Model
10
Cargo Vehicle Group
11
Original BN: SCUD Battery
  • Disadvantages
  • A lot more complex
  • must include relevant attributes of related
    objects
  • Hard to transfer information between different BN
    models

Built by IET, Inc.
12
Situation Models
  • Complex situations can be described compactly by
    specifying objects and relations between them
  • Class model is instantiated for each object, with
    probabilistic dependencies induced by relations

13
Example reasoning pattern
[Figure: network for Scud-Battalion-Charlie, Battery1, and Group-TLs (Loc, TL1, TL2); attributes shown include under_fire, hit, damaged, hide-support, rep_damaged, and reported_damaged; values shown include heavy, good, and none; probabilities shown are 0.06, 0.44, 0.28, and 0.33]
14
Inference in PRMs

PRM + situation description induces a BN over attributes
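As a rough illustration of how a class-level model plus a situation description can induce a ground BN, here is a minimal Python sketch. The `ClassModel` structure, the `ground` function, the relation name `in_battalion`, and the particular dependencies are all assumptions made for this example; only the attribute names `under_fire` and `hit` come from the slides.

```python
from dataclasses import dataclass


@dataclass
class ClassModel:
    """Hypothetical class-level model.

    attributes maps an attribute name to its parents. Each parent is a
    (relation, attribute) pair; relation None means "same object".
    """
    name: str
    attributes: dict


def ground(classes, objects, relations):
    """Instantiate each class model once per object; return BN nodes and edges.

    objects:   object name -> class name
    relations: (object name, relation name) -> related object name
    """
    nodes, edges = [], []
    for obj, cls in objects.items():
        for attr, parents in classes[cls].attributes.items():
            child = f"{obj}.{attr}"
            nodes.append(child)
            for rel, parent_attr in parents:
                # Follow the relation to find the object that owns the parent.
                source = obj if rel is None else relations[(obj, rel)]
                edges.append((f"{source}.{parent_attr}", child))
    return nodes, edges


# Assumed class models: a battery is more likely under fire if its battalion is.
classes = {
    "Battalion": ClassModel("Battalion", {"under_fire": []}),
    "Battery": ClassModel("Battery", {
        "under_fire": [("in_battalion", "under_fire")],
        "hit": [(None, "under_fire")],
    }),
}

# Situation description: which objects exist and how they are related.
objects = {"Charlie": "Battalion", "Battery1": "Battery", "Battery2": "Battery"}
relations = {
    ("Battery1", "in_battalion"): "Charlie",
    ("Battery2", "in_battalion"): "Charlie",
}

nodes, edges = ground(classes, objects, relations)
for parent, child in edges:
    print(parent, "->", child)
```

Both batteries reuse the same `Battery` class model, so adding a third battery only adds one entry to `objects` and one to `relations`.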
15
Exploit Structure for Inference
  • Encapsulation: objects interact in limited ways
  • Inference can be encapsulated within objects,
    with communication limited to interfaces
  • Reuse: objects from the same class have the same model
  • Inference from one can be reused for others

16
Effects of exploiting structure
[Plot: running time in seconds (0 to 6000) versus vehicles of each type per battery (1 to 10); series: flat BN, no reuse, with reuse]
17
Extension: Structural Uncertainty
  • Uncertainty about model structure
  • Set of objects: is that radar signal from a tank?
  • Relations between objects: location of
    SCUD-Battalion-C
  • Task 1: Seamless integration with probabilistic
    model
  • structural variables can depend on other
    variables.
  • Task 2: Efficient inference
  • Use approximate inference to simplify model
  • variational methods to summarize multiple
    potential influences
  • MCMC for traversing possible relationships
  • Use structured inference (encapsulation/reuse) on
    simplified model

18
Outline
  • Probabilistic Relational Models
  • Temporal models
  • Structured belief-state tracking
  • Dynamic PRMs: time, events and actions
  • Decision making

19
Dynamic Bayesian Nets
[Figure: DBN unrolled over time slices t, t+1, t+2 with nodes Action, Velocity, Position, and Observed_pos]
  • Compact representation of system dynamics
  • discrete, continuous, hybrid
  • Generalization of Kalman filters
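
To make the Kalman filter connection concrete, here is a minimal one-dimensional sketch of a single predict/update step for a linear-Gaussian position model. The function name, the parameter names, and the additive control model are assumptions for illustration and do not come from the slides.

```python
def kalman_step(mean, var, control, process_var, measurement, measurement_var):
    """One predict/update step of a scalar Kalman filter.

    Assumed model: position(t+1) = position(t) + control + process noise,
                   observed_pos  = position + measurement noise.
    """
    # Predict: push the Gaussian belief through the transition model.
    pred_mean = mean + control
    pred_var = var + process_var

    # Update: condition the predicted belief on the noisy observation.
    gain = pred_var / (pred_var + measurement_var)
    new_mean = pred_mean + gain * (measurement - pred_mean)
    new_var = (1.0 - gain) * pred_var
    return new_mean, new_var


new_mean, new_var = kalman_step(
    mean=0.0, var=1.0, control=1.0, process_var=1.0,
    measurement=2.0, measurement_var=2.0,
)
print(new_mean, new_var)
```

A DBN keeps the same predict-then-condition pattern but allows discrete and hybrid variables and sparse dependencies between them.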

20
Tracking System State
Task: Maintain belief state, the distribution over
current state given evidence so far
[Figure: DBN unrolled over time slices t, t+1, t+2 with nodes Action, Velocity, and Position]
  • In discrete/hybrid systems, belief state
    representation is exponential in the number of state
    variables
  • In hybrid systems, the number of distinct hypotheses grows
    exponentially over time
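
The tracking task can be sketched for a small discrete system as exact forward filtering over an explicit state table. This is a minimal illustration with made-up numbers; `filter_step` and its argument names are hypothetical. The last lines show why the explicit table becomes impractical as the number of state variables grows.

```python
def filter_step(belief, transition, obs_likelihood):
    """One exact belief-state update over explicit states.

    belief[i]         = P(state = i | evidence so far)
    transition[i][j]  = P(next state = j | current state = i)
    obs_likelihood[j] = P(new observation | next state = j)
    """
    n = len(belief)
    # Predict: propagate the belief through the transition model.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Condition on the new observation and renormalize.
    unnormalized = [predicted[j] * obs_likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


belief = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
obs_likelihood = [0.8, 0.3]
new_belief = filter_step(belief, transition, obs_likelihood)
print(new_belief)

# With n binary state variables the explicit belief table has 2**n entries.
table_sizes = {n: 2 ** n for n in (4, 8, 16)}
print(table_sizes)
```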

21
Approximate Tracking
  • Decompose belief state along subsystem lines
  • Maintain belief state as product of marginals
  • In hybrid systems, keep mixture of hypotheses for
    every subsystem
  • Merge hypotheses associated with similar density
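
The "product of marginals" idea can be sketched as a projection step: compute each subsystem's marginal from a joint belief, then represent the belief by those marginals alone. The sketch below is a simplified illustration in which every subsystem is a single variable; the function names and the example distribution are assumptions.

```python
from itertools import product


def marginals(joint, num_vars):
    """Marginal distribution of each variable in a joint given as {state tuple: prob}."""
    margs = [dict() for _ in range(num_vars)]
    for state, p in joint.items():
        for i, value in enumerate(state):
            margs[i][value] = margs[i].get(value, 0.0) + p
    return margs


def product_of_marginals(margs):
    """Rebuild an approximate joint that treats the variables as independent."""
    approx = {}
    for state in product(*(m.keys() for m in margs)):
        p = 1.0
        for m, value in zip(margs, state):
            p *= m[value]
        approx[state] = p
    return approx


# A correlated joint belief over two binary subsystems.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}
margs = marginals(joint, 2)
approx = product_of_marginals(margs)
print(margs)
print(approx)
```

The projection discards the correlation between the two subsystems, which is the price paid for storing one small table per subsystem instead of one table over all joint states.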

22
Case Study: Diagnosis and Tracking for Five-Tank
System
[Figure: five-tank system; labels shown include F1o, F5o, F23, and observables]
  • State space per time slice
  • eleven-dimensional continuous space
  • 227 discrete failure modes

23
The doomsday scenario
24
Algorithm Performance
Omniscient Kalman Filter
25
Dynamic PRMs
  • Goal: Model complex structured systems
  • that evolve over time
  • where agents take compound structured actions
  • construct effective scalable inference
    algorithm
  • Easy part: Add time relation to PRMs
  • Allows notion of current and previous state
  • Maintains notions of structured objects and
    relations
  • Challenges
  • Appropriate representation for actions, events
  • Modeling changes in domain structure (objects,
    relations)
  • Effective inference that exploits structure

26
Dynamic PRMs: Event Models
Events: discrete points where the system
undergoes a discontinuous change
  • Events can be triggered by external events
  • an agent's action
  • or by system dynamics
  • e.g., a unit reaches its destination
  • Events can influence the system structure
  • discrete change in continuous dynamics
  • truck velocity goes to 0 when destination is
    reached
  • modification of relational structure
  • aircraft taking off is no longer on aircraft
    carrier
  • creation / deletion of objects
  • units entering/leaving battlespace

27
Dynamic PRMs: Adding Actions
  • Use relational / hierarchical action
    representation
  • class hierarchy for Move action
  • an instantiation of a particular action is
    related to object moving, road taken, origin,
    destination
  • Actions can depend on and influence attributes of
    related objects
  • duration of Move action may depend on road
    condition, influence status of moving objects
  • Actions are like events, can change domain
    structure
  • Complex actions can be composed of simpler ones
  • Effects of complex action derived from that of
    subactions

28
Inference in Dynamic Systems
  • Main tasks
  • situation monitoring
  • prediction
  • Goal: Exploit structure as we did in PRMs
  • First step: Encapsulation
  • Exploit structure of weakly interacting
    subsystems
  • Applied successfully to Dynamic Bayesian Nets

29
Tracking in Dynamic PRMs
  • Use relational structure to guide belief state
    approximation
  • direct dependencies only between related objects
  • Deal with dynamic structure
  • relations and even domain objects change over
    time
  • want to adjust our approximation to context
  • structural uncertainty critical
  • Event-driven tracking
  • no reason to use fine-grained model of boring
    bits
  • but fast forward requires ability to propagate
    dynamics over variable-length segments

30
Outline
  • Probabilistic Relational Models
  • Temporal models
  • Decision making
  • Planning in factored MDPs
  • Planning in relational MDPs

31
What is a Markov Decision Process?
  • An MDP is a controlled dynamic process
  • Stochastic transition between states
  • Actions affect system dynamics
  • Rewards or costs are associated with states
  • Objective: Drive process to regions of high
    reward
  • MDP solutions are policies
  • Policies assign an action to every state

32
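MDP Policies and Value Functions
Suppose an expert told you the value of each
state
V(s1) = 10
V(s2) = 5
[Figure: Action 1 and Action 2 each lead to s1 or s2; one action reaches s1 with probability 0.7 and s2 with probability 0.3, the other reaches s1 and s2 with probability 0.5 each]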
33
Greedy Policy Construction
Pick action with highest expected future value
Expectation over next-state values
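
As a worked example of greedy selection, the sketch below uses the numbers from the value-function slide: V(s1) = 10, V(s2) = 5, and next-state probabilities 0.7/0.3 and 0.5/0.5. The transcript does not preserve which distribution belongs to which action, so the labels `action_a` and `action_b` are placeholders.

```python
V = {"s1": 10.0, "s2": 5.0}

# Next-state distributions; the action labels are placeholders (see above).
transitions = {
    "action_a": {"s1": 0.7, "s2": 0.3},
    "action_b": {"s1": 0.5, "s2": 0.5},
}


def expected_value(dist, values):
    """Expectation of the next-state value under one action."""
    return sum(p * values[s] for s, p in dist.items())


def greedy_action(transitions, values):
    """Pick the action with the highest expected next-state value."""
    return max(transitions, key=lambda a: expected_value(transitions[a], values))


for action, dist in transitions.items():
    print(action, expected_value(dist, V))
print("greedy choice:", greedy_action(transitions, V))
```

The expected values are 0.7 × 10 + 0.3 × 5 = 8.5 and 0.5 × 10 + 0.5 × 5 = 7.5, so the greedy choice is the action with the 0.7/0.3 distribution.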
34
Bootstrapping Policy Iteration
Idea: Greedy selection is useful even with
suboptimal V
Guess V
Repeat until policy doesn't change:
  π ← greedy(V)
  V ← value of acting on π
Guaranteed to find globally optimal policy if V
is defined over explicit states, i.e., if V is
exponential in the number of state variables
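
The loop above can be sketched for a small MDP with explicit states. This is a minimal illustration, not the factored algorithm: the two-state MDP, the action names, the discount factor `gamma`, and the use of repeated sweeps for policy evaluation are all assumptions made for this example.

```python
def evaluate(policy, P, R, gamma, sweeps=500):
    """Approximate the value of acting on `policy` by repeated backups."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        V = {s: R[s] + gamma * sum(p * V[t] for t, p in P[s][policy[s]].items())
             for s in P}
    return V


def greedy(V, P):
    """For every state, pick the action with the highest expected next-state value."""
    return {s: max(P[s], key=lambda a: sum(p * V[t] for t, p in P[s][a].items()))
            for s in P}


def policy_iteration(P, R, gamma=0.9):
    V = {s: 0.0 for s in P}          # guess V
    policy = None
    while True:
        new_policy = greedy(V, P)    # policy <- greedy(V)
        if new_policy == policy:     # stop when the policy no longer changes
            return policy, V
        policy = new_policy
        V = evaluate(policy, P, R, gamma)  # V <- value of acting on policy


# P[state][action] = {next_state: probability}; the reward depends on the state only.
P = {
    "s1": {"stay": {"s1": 0.9, "s2": 0.1}, "move": {"s1": 0.2, "s2": 0.8}},
    "s2": {"stay": {"s1": 0.1, "s2": 0.9}, "move": {"s1": 0.8, "s2": 0.2}},
}
R = {"s1": 1.0, "s2": 0.0}

policy, V = policy_iteration(P, R)
print(policy)
```

Every dictionary here is indexed by explicit state, so the cost grows with the number of states, which is itself exponential in the number of state variables.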
Exploit Structure with Factored Policy Iteration
35
Factored MDPs: DBNs + Rewards
[Figure: DBN over variables X, Y, Z across time slices t and t+1]
Rewards have small sets of parent variables too
Total reward adds sub-rewards: $R = R_1 + R_2$
36
Linearly Decomposable Value Functions
Note: Overlapping is allowed!
Approximate high-dimensional value function with
combination of lower-dimensional functions
Motivation: Multi-attribute utility theory
(Keeney & Raiffa)
37
Decomposable Value Functions
Linear combination of restricted domain functions
  • Each basis function $h_i$ is the status of some
    small part(s) of a complex system
  • status of a machine
  • inventory of a store
  • status of a subgoal
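
In symbols, the linear combination of restricted-domain functions might be written as follows. The weights $w_i$, the scopes $C_i$, and the variable names $X_1, \dots, X_n$ are notation assumed here for illustration; $k$ denotes the number of basis functions.

```latex
V(\mathbf{x}) \;\approx\; \sum_{i=1}^{k} w_i \, h_i(\mathbf{x}_{C_i}),
\qquad C_i \subseteq \{X_1, \dots, X_n\}, \quad |C_i| \ll n
```

Each $h_i$ reads only the variables in its scope $C_i$, and the scopes may overlap.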

38
Exploiting Structure
Key operation: backprojection of a basis
function through a DBN transition
[Figure: DBN over variables X, Y, Z]
Structure allows us to consider operations
over small subsets of variables, not the entire
state space.
39
Policy Format
Factored value functions ⇒ compact
action effect descriptions
[Figure: effects of Action 1 and Action 2]
Sorted result values form a decision list:
If ... then action 1, else if ... then action
2, else if ... then action 1
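
A decision-list policy of this shape can be sketched as an ordered list of (condition, action) rules in which the first matching rule wins. The conditions are not preserved in the transcript, so the ones below, written over variables X, Y, Z, are purely hypothetical, as is the `decision_list_policy` helper.

```python
def decision_list_policy(rules, default_action):
    """Build a policy from ordered (condition, action) rules.

    A condition is a partial assignment {variable: value}. The first rule
    whose condition is consistent with the state determines the action.
    """
    def policy(state):
        for condition, action in rules:
            if all(state.get(var) == val for var, val in condition.items()):
                return action
        return default_action
    return policy


# Hypothetical rules; each one tests only a small subset of the state variables.
rules = [
    ({"X": 1, "Y": 0}, "action 1"),
    ({"Z": 1}, "action 2"),
]
policy = decision_list_policy(rules, default_action="action 1")

print(policy({"X": 1, "Y": 0, "Z": 1}))
print(policy({"X": 0, "Y": 0, "Z": 1}))
print(policy({"X": 0, "Y": 1, "Z": 0}))
```

Because each condition mentions only a few variables, the list can stay short even when the full state space is very large.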
40
Factored Policy Iteration Summary
Structure induces decision-list policy
Guess V
π ← greedy(V)
V ← value of acting on π
Key operations isomorphic to BN inference
  • Time per iteration reduced from $O((2^n)^3)$ to
    $O(C_b k^3)$
  • $C_b$: cost of Bayes net inference (function of
    structure)
  • $k$: number of basis functions ($k \ll 2^n$)

41
Run Times
[Plot: CPU seconds and number of states (axis 0 to 70000) versus number of state variables (4 to 16); series labeled States, Seconds, and 3n3]
Note: Nearly optimal policy found in all cases (?
6).
42
Planning in Relational MDPs
  • Replace DBN transition model with dynamic PRM
  • Generalize factored policy iteration
  • Define basis functions via relational formulas
  • Replace BN inference with PRM inference as key
    step
  • Exploit hierarchical structure of complex actions
    by encapsulating decision making along hierarchy
  • Potential benefits
  • Tractable approximate planning in relational
    domains
  • Unification of classical and stochastic planning

43
Conclusions: Past and Present
  • PRMs compactly represent complex systems with
    multiple interacting objects
  • coherent (probabilistic) semantics
  • structured representation: modularity and reuse.
  • Scalable inference that exploits structure
  • Tracking algorithms for DBNs that exploit system
    decomposition
  • Planning algorithms in MDPs that exploit
    structure of system and of value functions

Theme: Representation and inference scale up,
if we exploit structure
44
Conclusions: Future
  • Better inference for densely connected PRMs
  • Extending PRMs with time, events, actions
  • Exploit structure for inference in dynamic PRMs
  • system decomposition into subsystems
  • relational context
  • varying time granularity
  • Planning in dynamic PRMs
  • extend factored policy iteration to PRMs
  • exploit hierarchical action decomposition

45
Acknowledgements
  • Students and postdocs
  • Nir Friedman (→ Hebrew U.)
  • Dirk Ormoneit
  • Ron Parr (→ Duke)
  • Xavier Boyen
  • Urszula Chajewska
  • Lise Getoor
  • Carlos Guestrin
  • Uri Lerner
  • Uri Nodelman
  • Avi Pfeffer (→ Harvard)
  • Eran Segal
  • Benjamin Taskar
  • Simon Tong
  • Brian Milch (→ Berkeley)
  • Ken Takusagawa (→ MIT)
  • Support
  • PECASE Award via ONR YIP
  • DARPA's HPKB Program
  • MURI Program: Integrated Approach to Intelligent
    Systems
  • Sloan Faculty Fellowship
  • DARPA's IA Program under subcontract to SRI
    International
  • DARPA's DMIF Program under subcontract to IET
    Inc.
  • ONR grant

Postdocs
PhD students
Ugrad
http://robotics.stanford.edu/koller/