Structured Models for Decision Making presentation

1
Structured Models for Decision Making

Daphne Koller
Stanford University
koller_at_cs.Stanford.edu

MURI Program on Decision Making under Uncertainty, July 18, 2000
2
Roadmap
Static: Bayes Nets, PRMs (Encapsulation, Reuse)
Dynamic: DBNs, Dynamic PRMs (Encapsulation, Approximation)
Decision Problem: Factored MDPs, Relational MDPs (Factored Policy Iteration, Efficient PRM inference)
3
Outline

Probabilistic Relational Models
Representing complex domains
Structural uncertainty
Temporal models
Decision making

4
Basic units of knowledge
entities, properties, relations
attributes
5
So what?

The set of entities and the relations between them are determined at BN design time
structure must be known in advance
hard to adapt to changes
BNs for complex domains are large and unstructured, hence very hard to build
No ability to generalize
across similar individuals
across related situations

6
Probabilistic Relational Models

Combine advantages of predicate logic and BNs
natural domain modeling: objects, properties, relations
generalization over a variety of situations
compact, natural probability models
Integrate uncertainty with relational model
properties of domain entities can depend on properties of related entities
uncertainty over relational structure of domain

7
Real-World Case Study
Battlefield situation assessment for missile units

several locations
many units
each has detailed model

Example object classes
Battalion
Battery
Vehicle
Location
Weather.

Example relations
At-Location
Has-Weather
Sub-battery/In-battalion
Sub-vehicle/In-battery

8
Scud Battery: Simplified PRM
[Figure: simplified PRM for a Scud battery. Labels shown: Under Fire, Launcher, (Launcher.status ok), Next Mission]
9
SCUD Battery Model
10
Cargo Vehicle Group
11
Original BN: SCUD Battery

Disadvantages
A lot more complex
must include relevant attributes of related
objects
Hard to transfer information between different BN
models

Built by IET, Inc.
12
Situation Models

Complex situations can be described compactly by
specifying objects and relations between them
Class model is instantiated for each object, with
probabilistic dependencies induced by relations
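
As a rough illustration of how a class model might be instantiated for each object, the sketch below grounds a toy class-level model into a table of ground variables and their parents. The class names Battalion and Battery come from the case study; the attribute names, the in_battalion relation, and the helper ground are assumptions made for this example and are not taken from the presentation's actual models.

```python
from dataclasses import dataclass, field


@dataclass
class ClassModel:
    """Class-level template: each attribute lists its parents.

    A parent is a pair (relation, attribute); relation None means
    "an attribute of the same object".
    """
    name: str
    parents: dict = field(default_factory=dict)


def ground(classes, objects, relations):
    """Instantiate the class model once per object.

    objects:   {object_name: class_name}
    relations: {(object_name, relation): related_object_name}
    Returns {ground_variable: [ground_parent_variables]}.
    """
    bn = {}
    for obj, cls in objects.items():
        for attr, pars in classes[cls].parents.items():
            ground_parents = []
            for rel, par_attr in pars:
                # Follow the relation to find which object supplies the parent.
                source = obj if rel is None else relations[(obj, rel)]
                ground_parents.append(f"{source}.{par_attr}")
            bn[f"{obj}.{attr}"] = ground_parents
    return bn


# Hypothetical class models (attribute names are illustrative only).
classes = {
    "Battalion": ClassModel("Battalion", {"under_fire": []}),
    "Battery": ClassModel("Battery", {
        "under_fire": [("in_battalion", "under_fire")],
        "hit": [(None, "under_fire")],
    }),
}
# Hypothetical situation description: objects and the relations between them.
objects = {"Charlie": "Battalion", "Battery1": "Battery", "Battery2": "Battery"}
relations = {("Battery1", "in_battalion"): "Charlie",
             ("Battery2", "in_battalion"): "Charlie"}

ground_bn = ground(classes, objects, relations)
for var, pars in sorted(ground_bn.items()):
    print(var, "<-", pars)
```

Both batteries share one class model, so adding a third battery only requires a new entry in the situation description, not a new model.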

13
Example reasoning pattern
[Figure: instantiated model for Scud-Battalion-Charlie. Labels shown: under_fire, heavy, Battery1, hit, Group-TLs, Loc, TL1, TL2, damaged, good, hide-support, rep_damaged, reported_damaged, none. Probabilities shown: 0.06, 0.44, 0.28, 0.33]
14
Inference in PRMs

A PRM plus a situation description induces a BN over attributes
15
Exploit Structure for Inference

Encapsulation: objects interact in limited ways
Inference can be encapsulated within objects, with communication limited to interfaces
Reuse: objects from the same class have the same model
Inference from one can be reused for others

16
Effects of exploiting structure
[Plot: running time in seconds (0 to 6000) versus vehicles of each type per battery (1 to 10), for three methods: flat BN, no reuse, with reuse]
17
Extension: Structural Uncertainty

Uncertainty about model structure
Set of objects: is that radar signal from a tank?
Relations between objects: location of SCUD-Battalion-C
Task 1: Seamless integration with probabilistic model
structural variables can depend on other variables
Task 2: Efficient inference
Use approximate inference to simplify model
variational methods to summarize multiple potential influences
MCMC for traversing possible relationships
Use structured inference (encapsulation/reuse) on simplified model

18
Outline

Probabilistic Relational Models
Temporal models
Structured belief-state tracking
Dynamic PRMs: time, events and actions
Decision making

19
Dynamic Bayesian Nets
[Figure: DBN unrolled over time slices t, t+1, t+2, ..., with nodes Action, Velocity, Position and Observed_pos in each slice]

Compact representation of system dynamics
discrete, continuous, hybrid
Generalization of Kalman filters

20
Tracking System State
Task: Maintain the belief state, i.e., the distribution over the current state given the evidence so far
[Figure: DBN unrolled over time slices t, t+1, t+2, ..., with nodes Action, Velocity and Position in each slice]

In discrete/hybrid systems, the belief state representation is exponential in the number of state variables
In hybrid systems, the number of distinct hypotheses grows exponentially over time
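
To make the tracking task concrete, here is a minimal sketch of one exact belief-state update for a small discrete system: propagate the current belief through the dynamics, then condition on the new evidence. The function name filter_step and all numbers are assumptions chosen for illustration.

```python
def filter_step(belief, transition, obs_likelihood):
    """One step of exact belief-state tracking for a discrete system.

    belief[i]         = P(state_t = i | evidence so far)
    transition[i][j]  = P(state_{t+1} = j | state_t = i)
    obs_likelihood[j] = P(new observation | state_{t+1} = j)
    """
    n = len(belief)
    # Propagate the belief through the system dynamics.
    predicted = [sum(belief[i] * transition[i][j] for i in range(n))
                 for j in range(n)]
    # Condition on the new evidence and renormalize.
    unnormalized = [predicted[j] * obs_likelihood[j] for j in range(n)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical two-state example (all numbers are illustrative).
belief = [0.5, 0.5]
transition = [[0.9, 0.1],
              [0.2, 0.8]]
obs_likelihood = [0.7, 0.1]

new_belief = filter_step(belief, transition, obs_likelihood)
print(new_belief)
```

The belief vector has one entry per joint state, so with k binary state variables it has 2^k entries, which is the blow-up the slide refers to.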

21
Approximate Tracking

Decompose belief state along subsystem lines
Maintain belief state as product of marginals

In hybrid systems, keep mixture of hypotheses for
every subsystem
Merge hypotheses associated with similar density
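
One way to read "product of marginals" is as a projection step: compute the marginal of each subsystem from the joint belief and keep only those marginals. The sketch below does this for the simplest case, where every subsystem is a single discrete variable; that simplification, the helper names, and the numbers are assumptions made for illustration.

```python
from itertools import product


def marginal(joint, index):
    """Marginal distribution of the variable at position `index`."""
    out = {}
    for state, p in joint.items():
        out[state[index]] = out.get(state[index], 0.0) + p
    return out


def project_to_product(joint, n_vars):
    """Approximate a joint belief by the product of its marginals."""
    marginals = [marginal(joint, i) for i in range(n_vars)]
    approx = {}
    for state in product(*(sorted(m) for m in marginals)):
        p = 1.0
        for i, value in enumerate(state):
            p *= marginals[i][value]
        approx[state] = p
    return marginals, approx


# Hypothetical joint belief over two correlated binary subsystems.
joint = {(0, 0): 0.40, (0, 1): 0.10, (1, 0): 0.15, (1, 1): 0.35}
marginals, approx = project_to_product(joint, 2)
print(marginals)
print(approx)
```

Only the marginals need to be stored between time steps. The product table is built here just to show what the approximation discards, namely the correlation between the two subsystems.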

22
Case Study: Diagnosis and Tracking for Five-Tank System
[Figure: five-tank system. Labels shown: F1o, F5o, F23, observables]

State space per time slice
eleven-dimensional continuous space
227 discrete failure modes

23
The doomsday scenario
24
Algorithm Performance
Omniscient Kalman Filter
25
Dynamic PRMs

Goal: Model complex structured systems
that evolve over time
where agents take compound structured actions
construct effective scalable inference algorithm
Easy part: Add time relation to PRMs
Allows notion of current and previous state
Maintains notions of structured objects and relations
Challenges:
Appropriate representation for actions, events
Modeling changes in domain structure (objects, relations)
Effective inference that exploits structure

26
Dynamic PRMs: Event Models
Events: Discrete points where the system undergoes a discontinuous change

Events can be triggered by external events
an agent's action
or by system dynamics
e.g., a unit reaches its destination
Events can influence the system structure
discrete change in continuous dynamics
truck velocity goes to 0 when destination is reached
modification of relational structure
aircraft taking off is no longer on aircraft carrier
creation / deletion of objects
units entering/leaving battlespace

27
Dynamic PRMs: Adding Actions

Use relational / hierarchical action
representation
class hierarchy for Move action
an instantiation of a particular action is
related to object moving, road taken, origin,
destination
Actions can depend on and influence attributes of
related objects
duration of Move action may depend on road
condition, influence status of moving objects
Actions are like events, can change domain
structure
Complex actions can be composed of simpler ones
Effects of a complex action are derived from those of its subactions

28
Inference in Dynamic Systems

Main tasks
situation monitoring
prediction
Goal: Exploit structure as we did in PRMs
First step: Encapsulation
Exploit structure of weakly interacting
subsystems
Applied successfully to Dynamic Bayesian Nets

29
Tracking in Dynamic PRMs

Use relational structure to guide belief state
approximation
direct dependencies only between related objects
Deal with dynamic structure
relations and even domain objects change over
time
want to adjust our approximation to context
structural uncertainty critical
Event-driven tracking
no reason to use fine-grained model of boring
bits
but fast forward requires ability to propagate
dynamics over variable-length segments

30
Outline

Probabilistic Relational Models
Temporal models
Decision making
Planning in factored MDPs
Planning in relational MDPs

31
What is a Markov Decision Process?

An MDP is a controlled dynamic process
Stochastic transition between states
Actions affect system dynamics
Rewards or costs are associated with states
Objective: Drive process to regions of high reward
MDP solutions are policies
Policies assign an action to every state

32
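MDP Policies and Value Functions
Suppose an expert told you the value of each state
V(s1) = 10
V(s2) = 5
[Figure: Action 1 and Action 2 each lead to s1 or s2; one action reaches s1 with probability 0.7 and s2 with probability 0.3, the other reaches each with probability 0.5]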
33
Greedy Policy Construction
Pick action with highest expected future value
Expectation over next-state values
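
A minimal sketch of greedy action selection, using the values V(s1) = 10 and V(s2) = 5 and the transition probabilities from the previous slide. Which action carries which probabilities is not recoverable from the transcript, so the assignment below is an assumption, as are the current-state name s0 and the helper greedy_action.

```python
def greedy_action(state, actions, transition, value):
    """Pick the action with the highest expected next-state value.

    transition[(state, action)] maps next_state -> probability.
    Returns the best action and the expected value of every action.
    """
    def expected_value(action):
        return sum(p * value[s2]
                   for s2, p in transition[(state, action)].items())

    expected = {a: expected_value(a) for a in actions}
    return max(actions, key=expected.get), expected


value = {"s1": 10.0, "s2": 5.0}
# Assumed assignment of the slide's probabilities to the two actions.
transition = {
    ("s0", "action1"): {"s1": 0.5, "s2": 0.5},
    ("s0", "action2"): {"s1": 0.7, "s2": 0.3},
}

best, expected = greedy_action("s0", ["action1", "action2"], transition, value)
print(best, expected)
```

Under this assignment the expected values are 0.5 * 10 + 0.5 * 5 = 7.5 for action1 and 0.7 * 10 + 0.3 * 5 = 8.5 for action2, so the greedy choice is action2.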
34
Bootstrapping: Policy Iteration
Idea: Greedy selection is useful even with suboptimal V
Guess V
Repeat until policy doesn't change
π ← greedy(V)
V ← value of acting on π
Guaranteed to find globally optimal policy if V is defined over explicit states, i.e., if V is exponentially large
Exploit Structure with Factored Policy Iteration
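
The loop above can be sketched for a tiny MDP with explicit states. This is the standard flat version, not the factored algorithm; the discount factor gamma, the use of repeated backups for the evaluation step, and the two-state example are all assumptions made for illustration.

```python
def expected_next_value(s, a, P, V):
    """Expected value of the next state after taking action a in state s."""
    return sum(p * V[s2] for s2, p in P[(s, a)].items())


def greedy(V, states, actions, P):
    """Greedy policy: in each state, pick the action with best expected value."""
    return {s: max(actions, key=lambda a: expected_next_value(s, a, P, V))
            for s in states}


def evaluate(policy, states, P, R, gamma, sweeps=500):
    """Approximate the value of acting on `policy` by repeated backups."""
    V = {s: 0.0 for s in states}
    for _ in range(sweeps):
        V = {s: R[s] + gamma * expected_next_value(s, policy[s], P, V)
             for s in states}
    return V


def policy_iteration(states, actions, P, R, gamma=0.9):
    V = {s: 0.0 for s in states}               # guess V
    policy = greedy(V, states, actions, P)
    while True:
        V = evaluate(policy, states, P, R, gamma)
        new_policy = greedy(V, states, actions, P)
        if new_policy == policy:               # policy doesn't change: stop
            return policy, V
        policy = new_policy


# Hypothetical two-state MDP: reward 1 in s1, reward 0 in s2.
states = ["s1", "s2"]
actions = ["stay", "move"]
R = {"s1": 1.0, "s2": 0.0}
P = {
    ("s1", "stay"): {"s1": 0.9, "s2": 0.1},
    ("s1", "move"): {"s1": 0.2, "s2": 0.8},
    ("s2", "stay"): {"s1": 0.1, "s2": 0.9},
    ("s2", "move"): {"s1": 0.8, "s2": 0.2},
}

policy, V = policy_iteration(states, actions, P, R)
print(policy, V)
```

Both evaluate and greedy loop over every explicit state, which is the cost that becomes prohibitive when the state space is exponential in the number of state variables.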
35
Factored MDPs: DBNs + Rewards
[Figure: DBN over variables X, Y, Z at time slices t and t+1]
Rewards have small sets of parent variables too
Total reward adds sub-rewards: R = R1 + R2
36
Linearly Decomposable Value Functions
Note: Overlapping is allowed!
Approximate high-dimensional value function with combination of lower-dimensional functions
Motivation: Multi-attribute utility theory (Keeney & Raiffa)
37
Decomposable Value Functions
Linear combination of restricted domain functions

Each basis function h_i is the status of some small part(s) of a complex system
status of a machine
inventory of a store
status of a subgoal
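
A minimal sketch of a value function built as a linear combination of restricted-domain basis functions. The variable names, the constant basis function, and the weights are assumptions chosen to echo the slide's examples; they are not values from the presentation.

```python
def make_indicator(var, wanted):
    """Basis function that inspects a single state variable."""
    return lambda state: 1.0 if state[var] == wanted else 0.0


# Each basis function looks at one small part of the system.
basis = [
    lambda state: 1.0,                       # constant offset (assumed)
    make_indicator("machine", "working"),    # status of a machine
    make_indicator("inventory", "stocked"),  # inventory of a store
]
weights = [2.0, 5.0, 3.0]                    # illustrative weights


def value(state):
    """V(state) = sum_i w_i * h_i(state)."""
    return sum(w * h(state) for w, h in zip(weights, basis))


state = {"machine": "working", "inventory": "empty"}
v = value(state)
print(v)
```

The representation stores one weight per basis function rather than one value per state, which is what keeps it compact.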

38
Exploiting Structure
[Figure: DBN over variables X, Y, Z]
Key operation: backprojection of a basis function through a DBN transition
Structure allows us to consider operations over small subsets of variables, not the entire state space.
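
A minimal sketch of backprojection under stated assumptions: the basis function h depends only on Y at the next time step, and Y's parents in the DBN are X and Y at the current step. The conditional probabilities and the helper backproject are hypothetical.

```python
from itertools import product


def backproject(h, cpd, n_parents, domain=(0, 1)):
    """Backproject a basis function h(Y') through one DBN transition.

    cpd[parent_values] maps y_next -> P(Y' = y_next | parents).
    The result is a table over the parents of Y' only.
    """
    table = {}
    for parent_values in product(domain, repeat=n_parents):
        dist = cpd[parent_values]
        # Expected value of h at the next step, given the parent values.
        table[parent_values] = sum(p * h(y_next) for y_next, p in dist.items())
    return table


# Hypothetical CPD: Y at time t+1 depends only on (X, Y) at time t.
cpd_y = {
    (0, 0): {0: 0.9, 1: 0.1},
    (0, 1): {0: 0.3, 1: 0.7},
    (1, 0): {0: 0.5, 1: 0.5},
    (1, 1): {0: 0.1, 1: 0.9},
}


def h(y):
    """Basis function: 1 if Y is on, 0 otherwise."""
    return float(y)


g = backproject(h, cpd_y, n_parents=2)
print(g)
```

With a third binary variable Z the full state space has 8 states, but the backprojected function is a table with only 4 entries because Z is not a parent of Y.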
39
Policy Format
Factored value functions lead to compact action effect descriptions
Action 1
Action 2
Sorted result values form a decision list
If ... then action 1, else if ... then action 2, else if ... then action 1
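
The conditions of the slide's decision list did not survive in the transcript. The sketch below only shows the general shape of a decision-list policy, with made-up conditions over variables X, Y, Z and an assumed default action.

```python
def decision_list_policy(rules, default):
    """Build a policy from an ordered list of (condition, action) rules.

    A condition is a partial assignment {variable: value}; the first rule
    whose condition matches the state determines the action.
    """
    def policy(state):
        for condition, action in rules:
            if all(state.get(var) == val for var, val in condition.items()):
                return action
        return default
    return policy


# Hypothetical rules (the slide's actual conditions are not recoverable).
rules = [
    ({"X": 1, "Y": 0}, "action1"),
    ({"Z": 1}, "action2"),
    ({"X": 1}, "action1"),
]
policy = decision_list_policy(rules, default="action2")

print(policy({"X": 1, "Y": 0, "Z": 1}))
print(policy({"X": 0, "Y": 0, "Z": 1}))
```

Each condition mentions only a few variables, so the list can stay short even when the full state space is large.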
40
Factored Policy Iteration: Summary
Structure induces decision-list policy
Guess V
π ← greedy(V)
V ← value of acting on π
Key operations isomorphic to BN inference

Time per iteration reduced from O((2^n)^3) to O(C_b k^3)
C_b: cost of Bayes net inference (function of structure)
k: number of basis functions (k << 2^n)

41
Run Times
[Plot: CPU seconds and number of states (vertical axis, 0 to 70000) versus number of state variables (horizontal axis, 4 to 16); legend entries: States, Seconds, 3n3]
Note: Nearly optimal policy found in all cases (? 6).
42
Planning in Relational MDPs

Replace DBN transition model with dynamic PRM
Generalize factored policy iteration
Define basis functions via relational formulas
Replace BN inference with PRM inference as key
step
Exploit hierarchical structure of complex actions
by encapsulating decision making along hierarchy
Potential benefits
Tractable approximate planning in relational
domains
Unification of classical and stochastic planning

43
Conclusions: Past and Present

PRMs compactly represent complex systems with multiple interacting objects
coherent (probabilistic) semantics
structured representation: modularity and reuse
Scalable inference that exploits structure
Tracking algorithms for DBNs that exploit system decomposition
Planning algorithms in MDPs that exploit structure of system and of value functions

Theme: Representation and inference scale up, if we exploit structure
44
Conclusions: Future

Better inference for densely connected PRMs
Extending PRMs with time, events, actions
Exploit structure for inference in dynamic PRMs
system decomposition into subsystems
relational context
varying time granularity
Planning in dynamic PRMs
extend factored policy iteration to PRMs
exploit hierarchical action decomposition

45
Acknowledgements

Students and postdocs
Nir Friedman (→ Hebrew U.)
Dirk Ormoneit
Ron Parr (→ Duke)
Xavier Boyen
Urszula Chajewska
Lise Getoor
Carlos Guestrin
Uri Lerner
Uri Nodelman
Avi Pfeffer (→ Harvard)
Eran Segal
Benjamin Taskar
Simon Tong
Brian Milch (→ Berkeley)
Ken Takusagawa (→ MIT)

Support
PECASE Award via ONR YIP
DARPA's HPKB Program
MURI Program: Integrated Approach to Intelligent Systems
Sloan Faculty Fellowship
DARPA's IA Program under subcontract to SRI International
DARPA's DMIF Program under subcontract to IET Inc.
ONR grant

Postdocs
PhD students
Ugrad
http://robotics.stanford.edu/koller/
