# Structured Models for Decision Making - PowerPoint PPT Presentation


## Structured Models for Decision Making

Description:

### Structured Models for Decision Making Daphne Koller Stanford University koller_at_cs.Stanford.edu MURI Program on Decision Making under Uncertainty July 18, 2000 – PowerPoint PPT presentation

Transcript and Presenter's Notes


1
Structured Models for Decision Making
• Daphne Koller
• Stanford University
• koller_at_cs.Stanford.edu

MURI Program on Decision Making under
Uncertainty July 18, 2000
2
[Diagram: roadmap of the talk]
• Static: Bayes Nets, PRMs (Encapsulation, Reuse)
• Dynamic: DBNs, Dynamic PRMs (Encapsulation, Approximation)
• Decision Problem: Factored MDPs, Relational MDPs (Factored Policy Iteration, Efficient PRM inference)
3
Outline
• Probabilistic Relational Models
• Representing complex domains
• Structural uncertainty
• Temporal models
• Decision making

4
Basic units of knowledge
• entities, properties, relations
• attributes
5
So what?
• Set of entities and relations between them is
determined at BN design time
• structure must be known in advance
• hard to adapt to changes
• BNs for complex domains are large and unstructured
• ⇒ very hard to build
• No ability to generalize
• across similar individuals
• across related situations

6
Probabilistic Relational Models
• Combine advantages of predicate logic and BNs
• natural domain modeling: objects, properties,
relations
• generalization over a variety of situations
• compact, natural probability models.
• Integrate uncertainty with relational model
• properties of domain entities can depend on
properties of related entities
• uncertainty over relational structure of domain.

7
Real-World Case Study
Battlefield situation assessment for missile units
• several locations
• many units
• each has detailed model
• Example object classes
• Battalion
• Battery
• Vehicle
• Location
• Weather.
• Example relations
• At-Location
• Has-Weather
• Sub-battery/In-battalion
• Sub-vehicle/In-battery

8
Scud Battery: Simplified PRM
[Diagram: simplified PRM with nodes Under Fire, Launcher, (Launcher.status ok), and Next Mission]
9
SCUD Battery Model
10
Cargo Vehicle Group
11
Original BN: SCUD Battery
• A lot more complex
• must include relevant attributes of related
objects
• Hard to transfer information between different BN
models

Built by IET, Inc.
12
Situation Models
• Complex situations can be described compactly by
specifying objects and relations between them
• Class model is instantiated for each object, with
probabilistic dependencies induced by relations

13
Example reasoning pattern
[Diagram: induced network for the example situation. Objects shown: Scud-Battalion-Charlie, Battery1, Group-TLs, Loc, TL1, TL2. Attributes shown: under_fire, hit, damaged, hide-support, rep_damaged, reported_damaged. Values shown: heavy, good, none. Probabilities shown: 0.06, 0.44, 0.28, 0.33.]
14
Inference in PRMs

• A PRM together with a situation description induces a BN over attributes
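The induction step above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `ClassModel` type, the `ground` function, and the specific dependencies (a battery's `under_fire` depending on its battalion's `under_fire`, and `hit` depending on `under_fire`) are hypothetical and are not taken from the actual SCUD model. Only the class, attribute, and relation names echo the slides.

```python
from dataclasses import dataclass


@dataclass
class ClassModel:
    """Class-level template (hypothetical): attribute -> list of parents.

    Each parent is a pair (relation, attribute). A relation of None means
    the parent attribute belongs to the same object.
    """
    name: str
    parents: dict


def ground(classes, objects, relations):
    """Instantiate the class model for every object in the situation.

    objects:   object name -> class name
    relations: (object name, relation name) -> related object name
    Returns the induced BN structure as node -> sorted list of parent nodes,
    where a node is the string "object.attribute".
    """
    bn = {}
    for obj, cls in objects.items():
        for attr, parent_specs in classes[cls].parents.items():
            node_parents = []
            for rel, parent_attr in parent_specs:
                # Follow the relation to find the object that owns the parent.
                source = obj if rel is None else relations[(obj, rel)]
                node_parents.append(f"{source}.{parent_attr}")
            bn[f"{obj}.{attr}"] = sorted(node_parents)
    return bn


# ASSUMPTION: illustrative dependencies, not the real battery model.
classes = {
    "Battalion": ClassModel("Battalion", {"under_fire": []}),
    "Battery": ClassModel("Battery", {
        "under_fire": [("in_battalion", "under_fire")],
        "hit": [(None, "under_fire")],
    }),
}
objects = {"Charlie": "Battalion", "Battery1": "Battery", "Battery2": "Battery"}
relations = {
    ("Battery1", "in_battalion"): "Charlie",
    ("Battery2", "in_battalion"): "Charlie",
}

bn = ground(classes, objects, relations)
for node in sorted(bn):
    print(node, "<-", bn[node])
```

Adding a third battery only requires one more entry in `objects` and `relations`; the class model is written once and reused for every instance.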
15
Exploit Structure for Inference
• Encapsulation: objects interact in limited ways
• Inference can be encapsulated within objects,
with communication limited to interfaces
• Reuse: objects from same class have same model
• Inference from one can be reused for others

16
Effects of exploiting structure
[Plot: running time in seconds (0 to 6000) against vehicles of each type / battery (1 to 10), with curves for flat BN, no reuse, and with reuse]
17
Extension: Structural Uncertainty
• Set of objects: is that radar signal from a tank?
• Relations between objects: location of
SCUD-Battalion-C
• Task 1: Seamless integration with probabilistic
model
• structural variables can depend on other
variables.
• Use approximate inference to simplify model
• variational methods to summarize multiple
potential influences
• MCMC for traversing possible relationships
• Use structured inference (encapsulation/reuse) on
simplified model

18
Outline
• Probabilistic Relational Models
• Temporal models
• Structured belief-state tracking
• Dynamic PRMs: time, events and actions
• Decision making

19
Dynamic Bayesian Nets
[Diagram: DBN unrolled over time slices t, t+1, t+2, with nodes Action, Velocity, Position, and Observed_pos in each slice]
• Compact representation of system dynamics
• discrete, continuous, hybrid
• Generalization of Kalman filters

20
Tracking System State
Task: Maintain belief state, the distribution over
current state given evidence so far
[Diagram: DBN over time slices t, t+1, t+2, with nodes Action, Velocity, and Position]
• In discrete/hybrid systems, belief state
representation is exponential in # of state
variables
• In hybrid systems, # of distinct hypotheses grows
exponentially over time

21
Approximate Tracking
• Decompose belief state along subsystem lines
• Maintain belief state as product of marginals
• In hybrid systems, keep mixture of hypotheses for
every subsystem
• Merge hypotheses associated with similar density
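
A minimal sketch of the "product of marginals" idea for a purely discrete system with two subsystems. Everything here is assumed for illustration: the two binary subsystems, the failure probabilities in `make_trans`, and the function names. Conditioning on observations is left out to keep the sketch short, so it shows only the propagate-then-project step.

```python
from itertools import product

STATES = ["ok", "failed"]


def make_trans():
    """Hypothetical transition model for two weakly interacting subsystems.

    Failures persist. A fails with probability 0.1 per step; B fails with
    probability 0.05 while A is ok and 0.2 once A has failed.
    """
    trans = {}
    for a, b in product(STATES, repeat=2):
        p_a_fail = 1.0 if a == "failed" else 0.1
        if b == "failed":
            p_b_fail = 1.0
        else:
            p_b_fail = 0.2 if a == "failed" else 0.05
        dist = {}
        for a2, b2 in product(STATES, repeat=2):
            pa2 = p_a_fail if a2 == "failed" else 1.0 - p_a_fail
            pb2 = p_b_fail if b2 == "failed" else 1.0 - p_b_fail
            dist[(a2, b2)] = pa2 * pb2
        trans[(a, b)] = dist
    return trans


def marginals(joint):
    """Project a joint over (a, b) onto its two single-variable marginals."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return pa, pb


def step(pa, pb, trans):
    """One approximate tracking step on a factored belief state.

    The product pa * pb is pushed through the transition model exactly,
    then the resulting joint is projected back onto a product of marginals.
    """
    joint_next = {}
    for a, b in product(pa, pb):
        weight = pa[a] * pb[b]
        for nxt, p in trans[(a, b)].items():
            joint_next[nxt] = joint_next.get(nxt, 0.0) + weight * p
    return marginals(joint_next)


trans = make_trans()
pa = {"ok": 1.0, "failed": 0.0}
pb = {"ok": 1.0, "failed": 0.0}
pa, pb = step(pa, pb, trans)
print(pa, pb)
```

The belief state stored between steps is two small tables rather than one table over all joint states, which is the saving the decomposition is after; the cost is that correlations between subsystems are discarded at each projection.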

22
Case Study: Diagnosis and Tracking for Five-Tank
System
[Diagram: five-tank system; labels F1o, F5o, F23 marked as observables]
• State space per time slice
• eleven-dimensional continuous space
• 227 discrete failure modes

23
The doomsday scenario
24
Algorithm Performance
Omniscient Kalman Filter
25
Dynamic PRMs
• Goal: Model complex structured systems
• that evolve over time
• where agents take compound structured actions
• construct effective scalable inference
algorithm
• Easy part: Add time relation to PRMs
• Allows notion of current and previous state
• Maintains notions of structured objects and
relations
• Challenges
• Appropriate representation for actions, events
• Modeling changes in domain structure (objects,
relations)
• Effective inference that exploits structure

26
Dynamic PRMs: Event Models
Events: Discrete points where the system
undergoes a discontinuous change
• Events can be triggered by external events
• an agent's action
• or by system dynamics
• e.g., a unit reaches its destination
• Events can influence the system structure
• discrete change in continuous dynamics
• truck velocity goes to 0 when destination is
reached
• modification of relational structure
• aircraft taking off is no longer on aircraft
carrier
• creation / deletion of objects
• units entering/leaving battlespace

27
• Use relational / hierarchical action
representation
• class hierarchy for Move action
• an instantiation of a particular action is
related to object moving, road taken, origin,
destination
• Actions can depend on and influence attributes of
related objects
• duration of Move action may depend on road
condition, influence status of moving objects
• Actions are like events, can change domain
structure
• Complex actions can be composed of simpler ones
• Effects of complex action derived from that of
subactions

28
Inference in Dynamic Systems
• situation monitoring
• prediction
• Goal: Exploit structure as we did in PRMs
• First step: Encapsulation
• Exploit structure of weakly interacting
subsystems
• Applied successfully to Dynamic Bayesian Nets

29
Tracking in Dynamic PRMs
• Use relational structure to guide belief state
approximation
• direct dependencies only between related objects
• Deal with dynamic structure
• relations and even domain objects change over
time
• want to adjust our approximation to context
• structural uncertainty critical
• Event-driven tracking
• no reason to use fine-grained model of boring
bits
• but fast forward requires ability to propagate
dynamics over variable-length segments

30
Outline
• Probabilistic Relational Models
• Temporal models
• Decision making
• Planning in factored MDPs
• Planning in relational MDPs

31
What is a Markov Decision Process?
• An MDP is a controlled dynamic process
• Stochastic transition between states
• Actions affect system dynamics
• Rewards or costs are associated with states
• Objective: Drive process to regions of high
reward
• MDP solutions are policies
• Policies assign an action to every state

32
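MDP Policies and Value Functions
Suppose an expert told you the value of each
state
• V(s1) = 10
• V(s2) = 5
[Diagram: Action 1 and Action 2 each lead to next states s1 and s2; the transition probabilities shown are 0.7 and 0.3 for one action and 0.5 and 0.5 for the other]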
33
Greedy Policy Construction
Pick action with highest expected future value
Expectation over next-state values
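
A small worked sketch of greedy action selection, using the state values V(s1) = 10 and V(s2) = 5 from the previous slide. The transcript does not make clear which transition probabilities belong to which action, so the assignment below is an assumption, as are the function names. Immediate rewards and discounting are omitted because the slide compares actions only by expected next-state value.

```python
def expected_value(next_state_probs, V):
    """Expectation of V over the next-state distribution of one action."""
    return sum(p * V[s] for s, p in next_state_probs.items())


def greedy_action(actions, V):
    """Pick the action with the highest expected next-state value."""
    return max(actions, key=lambda a: expected_value(actions[a], V))


V = {"s1": 10.0, "s2": 5.0}

# ASSUMPTION: the pairing of distributions with actions is a guess.
actions = {
    "Action 1": {"s1": 0.5, "s2": 0.5},
    "Action 2": {"s1": 0.7, "s2": 0.3},
}

for name, dist in actions.items():
    print(name, expected_value(dist, V))
best = greedy_action(actions, V)
print("greedy choice:", best)
```

Under this assumed pairing the expected values are 7.5 and 8.5, so the greedy choice is the action that puts probability 0.7 on s1.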
34
Bootstrapping: Policy Iteration
Idea: Greedy selection is useful even with
suboptimal V
Guess V
Repeat until policy doesn't change:
π = greedy(V)
V = value of acting on π
Guaranteed to find globally optimal policy if V
is defined over explicit states, i.e., if V is
exponential in the number of state variables
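
The loop above, written out for an explicit-state MDP, might look like the following sketch. The two-state MDP, the discount factor, and all names are assumptions made for illustration. Policy evaluation is done by repeated sweeps rather than by solving the linear system exactly, which is a simplification chosen to avoid external libraries.

```python
def evaluate(policy, P, R, gamma, sweeps=500):
    """Approximate the value of acting on a fixed policy by repeated sweeps."""
    V = {s: 0.0 for s in P}
    for _ in range(sweeps):
        V = {
            s: R[s] + gamma * sum(p * V[s2] for s2, p in P[s][policy[s]].items())
            for s in P
        }
    return V


def greedy(V, P):
    """For each state, pick the action with highest expected next-state value."""
    return {
        s: max(P[s], key=lambda a: sum(p * V[s2] for s2, p in P[s][a].items()))
        for s in P
    }


def policy_iteration(P, R, gamma=0.9):
    V = {s: 0.0 for s in P}                 # Guess V
    policy = None
    while True:
        new_policy = greedy(V, P)           # pi = greedy(V)
        if new_policy == policy:            # stop when the policy is unchanged
            return policy, V
        policy = new_policy
        V = evaluate(policy, P, R, gamma)   # V = value of acting on pi


# ASSUMPTION: a hypothetical two-state MDP with reward only in s1.
P = {
    "s1": {"stay": {"s1": 0.9, "s2": 0.1}, "move": {"s1": 0.2, "s2": 0.8}},
    "s2": {"stay": {"s1": 0.1, "s2": 0.9}, "move": {"s1": 0.8, "s2": 0.2}},
}
R = {"s1": 1.0, "s2": 0.0}

policy, V = policy_iteration(P, R)
print(policy)
```

Both `V` and `policy` here are tables with one entry per explicit state, which is exactly the representation that becomes exponential once states are assignments to many variables.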
Exploit Structure with Factored Policy Iteration
35
Factored MDPs = DBNs + Rewards
[Diagram: DBN over variables X, Y, Z from time t to t+1]
Rewards have small sets of parent variables too
36
Linearly Decomposable Value Functions
Note: Overlapping is allowed!
Approximate high-dimensional value function with
combination of lower-dimensional functions
Motivation: Multi-attribute utility theory
(Keeney & Raiffa)
37
Decomposable Value Functions
Linear combination of restricted domain functions
• Each basis function h_i is the status of some
small part(s) of a complex system
• status of a machine
• inventory of a store
• status of a subgoal
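
Written as a formula, the linear combination described above could take the following form. The weights w_i and the scope notation C_i are assumed notation introduced here for illustration; k is the number of basis functions.

```latex
V(\mathbf{x}) \;\approx\; \sum_{i=1}^{k} w_i \, h_i(\mathbf{x}[C_i]),
\qquad C_i \subseteq \{X_1, \dots, X_n\}, \quad |C_i| \ll n
```

Each h_i reads only the few state variables in its scope C_i, and different scopes may share variables, so the approximation needs k weights rather than one value per state.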

38
Exploiting Structure
Key operation: backprojection of a basis
function through a DBN transition
[Diagram: DBN over variables X, Y, Z]
Structure allows us to consider operations
over small subsets of variables, not the entire
state space.
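
To make the backprojection step concrete, here is a minimal sketch for discrete variables. It computes, for each assignment to the current-slice parents, the expected value of a basis function one step later. The data layout (`cpds`, `domains`), the function name, and the probability table are all assumptions chosen for illustration.

```python
from itertools import product


def backproject(h, scope, cpds, domains):
    """Backproject basis function h through a DBN transition.

    h:       maps a tuple of next-slice values for `scope` to a number.
    cpds:    cpds[var] = (parents, table), where table maps a tuple of
             current-slice parent values to {next value: probability}.
    Returns (parent_scope, g); g maps each assignment to parent_scope to
    the expected value of h after one transition.
    """
    parent_scope = sorted({p for var in scope for p in cpds[var][0]})
    g = {}
    for assignment in product(*(domains[p] for p in parent_scope)):
        current = dict(zip(parent_scope, assignment))
        total = 0.0
        for nxt in product(*(domains[v] for v in scope)):
            prob = 1.0
            for var, val in zip(scope, nxt):
                parents, table = cpds[var]
                prob *= table[tuple(current[p] for p in parents)][val]
            total += prob * h[nxt]
        g[assignment] = total
    return parent_scope, g


# ASSUMPTION: three binary variables X, Y, Z; next-slice Y depends on X and Y.
domains = {"X": [0, 1], "Y": [0, 1], "Z": [0, 1]}
cpds = {
    "Y": (("X", "Y"), {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.3, 1: 0.7},
        (1, 0): {0: 0.5, 1: 0.5},
        (1, 1): {0: 0.1, 1: 0.9},
    }),
}
h = {(0,): 0.0, (1,): 1.0}  # basis function: indicator that Y is 1

parent_scope, g = backproject(h, ("Y",), cpds, domains)
print(parent_scope, g)
```

The result `g` has four entries, one per assignment to X and Y, and never enumerates Z, so the work scales with the size of the parent set rather than with the full state space.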
39
Policy Format
Factored value functions ? compact
action effect descriptions
Action 1
Action 2
Sorted result values form a decision list
If ... then action 1, else if ... then action
2, else if ... then action 1
40
Factored Policy Iteration: Summary
Structure induces decision-list policy
Guess V
π = greedy(V)
V = value of acting on π
Key operations isomorphic to BN inference
• Time per iteration reduced from O((2^n)^3) to
O(C_b k^3)
• C_b = cost of Bayes net inference (function of
structure)
• k = number of basis functions (k << 2^n)

41
Run Times
[Plot: CPU Seconds/States (0 to 70000) against State Variables (4 to 16), with curves labeled States, Seconds, and 3n3]
Note: Nearly optimal policy found in all cases (?
6).
42
Planning in Relational MDPs
• Replace DBN transition model with dynamic PRM
• Generalize factored policy iteration
• Define basis functions via relational formulas
• Replace BN inference with PRM inference as key
step
• Exploit hierarchical structure of complex actions
by encapsulating decision making along hierarchy
• Potential benefits
• Tractable approximate planning in relational
domains
• Unification of classical and stochastic planning

43
Conclusions: Past and Present
• PRMs compactly represent complex systems with
multiple interacting objects
• coherent (probabilistic) semantics
• structured representation: modularity and reuse.
• Scalable inference that exploits structure
• Tracking algorithms for DBNs that exploit system
decomposition
• Planning algorithms in MDPs that exploit
structure of system and of value functions

Theme: Representation and inference scale up,
if we exploit structure
44
Conclusions: Future
• Better inference for densely connected PRMs
• Extending PRMs with time, events, actions
• Exploit structure for inference in dynamic PRMs
• system decomposition into subsystems
• relational context
• varying time granularity
• Planning in dynamic PRMs
• extend factored policy iteration to PRMs
• exploit hierarchical action decomposition

45
Acknowledgements
• Students and postdocs
• Nir Friedman (→ Hebrew U.)
• Dirk Ormoneit
• Ron Parr (→ Duke)
• Xavier Boyen
• Urszula Chajewska
• Lise Getoor
• Carlos Guestrin
• Uri Lerner
• Uri Nodelman
• Avi Pfeffer (→ Harvard)
• Eran Segal
• Simon Tong
• Brian Milch (→ Berkeley)
• Ken Takusagawa (→ MIT)
• Support
• PECASE Award via ONR YIP
• DARPA's HPKB Program
• MURI Program Integrated Approach to Intelligent
Systems
• Sloan Faculty Fellowship
• DARPA's IA Program under subcontract to SRI
International
• DARPA's DMIF Program under subcontract to IET
Inc.
• ONR grant

Postdocs
PhD students