Decision Making Under Uncertainty presentation

Title: Decision Making Under Uncertainty

1
Decision Making Under Uncertainty

Russell and Norvig, ch. 16
CMSC421 Fall 2006

2
Utility-Based Agent
3
Non-deterministic vs. Probabilistic Uncertainty

Non-deterministic model (possible outcomes a, b, c): choose the decision that is best for the worst case, as in adversarial search
Probabilistic model
4
Expected Utility

Random variable $X$ with $n$ values $x_1, \ldots, x_n$ and distribution $(p_1, \ldots, p_n)$
E.g., $x_i$ is $Result_i(A) \mid Do(A), E$, the state reached after doing an action $A$ given $E$, what we know about the current state
Function $U$ of $X$. E.g., $U$ is the utility of a state
The expected utility of $A$ is
$EU[A \mid E] = \sum_{i=1}^{n} p(x_i \mid A)\, U(x_i) = \sum_{i=1}^{n} P(Result_i(A) \mid Do(A), E)\, U(Result_i(A))$

5
One State/One Action Example
$U(S_0) = 100 \times 0.2 + 50 \times 0.7 + 70 \times 0.1 = 20 + 35 + 7 = 62$
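The arithmetic on this slide is one instance of the expected-utility sum. A minimal Python sketch, assuming the three outcome utilities and probabilities shown here; the function name `expected_utility` is illustrative and not from the lecture.

```python
def expected_utility(outcomes):
    """Return the sum of p * u over (probability, utility) pairs."""
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return sum(p * u for p, u in outcomes)

# (probability, utility) pairs from the one-state/one-action example.
s0_outcomes = [(0.2, 100), (0.7, 50), (0.1, 70)]
u_s0 = expected_utility(s0_outcomes)
print(round(u_s0, 6))  # 62.0
```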
6
One State/Two Actions Example

$U_1(S_0) = 62$
$U_2(S_0) = 74$
$U(S_0) = \max\{U_1(S_0), U_2(S_0)\} = 74$

7
Introducing Action Costs

$U_1(S_0) = 62 - 5 = 57$
$U_2(S_0) = 74 - 25 = 49$
$U(S_0) = \max\{U_1(S_0), U_2(S_0)\} = 57$
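The effect of action costs on the choice can be sketched as follows. This assumes the expected utilities (62 and 74) and costs (5 and 25) from the slides; the action labels `A1` and `A2` and the helper `best_action` are hypothetical names used only for illustration.

```python
def best_action(expected_utilities, costs):
    """Return (action, net utility) maximizing expected utility minus cost."""
    net = {a: expected_utilities[a] - costs.get(a, 0) for a in expected_utilities}
    best = max(net, key=net.get)
    return best, net[best]

eu = {"A1": 62, "A2": 74}    # expected utilities before costs (from the slides)
cost = {"A1": 5, "A2": 25}   # action costs (from the slides)

print(best_action(eu, {}))    # ('A2', 74)
print(best_action(eu, cost))  # ('A1', 57)
```

Without costs the second action wins; once costs are subtracted the first one does.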

8
MEU Principle

A rational agent should choose the action that maximizes the agent's expected utility
This is the basis of the field of decision theory
It is a normative criterion for the rational choice of action

AI is Solved!!!
9
Not quite

Must have a complete model of:
Actions
Utilities
States
Even with a complete model, computing the best action will often be computationally intractable
In fact, a truly rational agent takes into account the utility of reasoning as well (bounded rationality)
Nevertheless, great progress has been made in this area recently, and we are able to solve much more complex decision-theoretic problems than ever before

10
We'll look at

Decision Theoretic Reasoning
Simple decision making (ch. 16)
Sequential decision making (ch. 17)

11
Preferences

An agent chooses among prizes ($A$, $B$, etc.) and lotteries, i.e., situations with uncertain prizes
Lottery $L = [p, A;\ (1 - p), B]$
Notation:
$A \succ B$: $A$ preferred to $B$
$A \sim B$: indifference between $A$ and $B$
$A \succsim B$: $B$ not preferred to $A$

12
Rational Preferences

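Idea: preferences of a rational agent must obey constraints
Axioms of Utility Theory
Orderability: $(A \succ B) \lor (B \succ A) \lor (A \sim B)$
Transitivity: $(A \succ B) \land (B \succ C) \Rightarrow (A \succ C)$
Continuity: $A \succ B \succ C \Rightarrow \exists p\ [p, A;\ 1-p, C] \sim B$
Substitutability: $A \sim B \Rightarrow [p, A;\ 1-p, C] \sim [p, B;\ 1-p, C]$
Monotonicity: $A \succ B \Rightarrow (p \ge q \Leftrightarrow [p, A;\ 1-p, B] \succsim [q, A;\ 1-q, B])$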

13
Rational Preferences

Violating the constraints leads to irrational behavior
E.g., an agent with intransitive preferences can be induced to give away all its money:
if $B \succ C$, then an agent who has $C$ would pay some amount, say 1, to get $B$
if $A \succ B$, then an agent who has $B$ would pay, say, 1 to get $A$
if $C \succ A$, then an agent who has $A$ would pay, say, 1 to get $C$
... oh, oh! The agent is back to holding $C$, 3 poorer, and the cycle can repeat
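The money-pump argument can be simulated in a few lines. This is a sketch under stated assumptions: preferences are given as (better, worse) pairs, each trade costs the fee of 1 used on the slide, and the starting budget of 10 and the function name `money_pump` are made up for illustration.

```python
def money_pump(prefers, holding, money, fee=1, max_trades=100):
    """Repeatedly sell the agent something it strictly prefers to what it holds."""
    trades = 0
    while money >= fee and trades < max_trades:
        better = [x for (x, y) in prefers if y == holding]
        if not better:
            break  # nothing is preferred to the current holding: the pump stops
        holding = better[0]
        money -= fee
        trades += 1
    return holding, money, trades

# Intransitive cycle from the slide: A over B, B over C, C over A.
cyclic = [("A", "B"), ("B", "C"), ("C", "A")]
print(money_pump(cyclic, holding="C", money=10))      # ('B', 0, 10)

# A transitive ordering A over B over C, for comparison.
transitive = [("A", "B"), ("B", "C"), ("A", "C")]
print(money_pump(transitive, holding="C", money=10))  # ('A', 8, 2)
```

With the cyclic preferences the agent trades until its money is gone; with the transitive ordering it stops as soon as it holds its most preferred item.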

14
Rational Preferences ⇒ Utility

Theorem (Ramsey, 1931; von Neumann and Morgenstern, 1944): Given preferences satisfying the constraints, there exists a real-valued function $U$ such that
$U(A) \ge U(B) \Leftrightarrow A \succsim B$
$U([p_1, S_1;\ \ldots;\ p_n, S_n]) = \sum_i p_i U(S_i)$
MEU principle: Choose the action that maximizes expected utility

15
Utility Assessment

Standard approach to assessment of human utilities:
compare a given state $A$ to a standard lottery $L_p$ that has
the best possible prize w/ prob. $p$
the worst possible catastrophe w/ prob. $(1 - p)$
adjust lottery probability $p$ until $A \sim L_p$

[Figure: $A \sim L_p$, where $L_p$ yields "continue as before" w/ prob. $p$ and "instant death" w/ prob. $1 - p$]
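The "adjust p until indifferent" procedure can be sketched as a bisection. This is only an illustration under assumptions that are not in the slides: utilities are normalized so that U(best prize) = 1 and U(worst catastrophe) = 0, the agent's answers are consistent, and `prefers_lottery` is a hypothetical oracle standing in for asking the person.

```python
def assess_utility(prefers_lottery, tol=1e-6):
    """Bisect on p until the agent is indifferent between A and L_p.

    prefers_lottery(p) is a hypothetical oracle: True if the agent prefers
    the standard lottery L_p (best prize w/ prob. p, worst w/ prob. 1 - p)
    to the state A. With U(best) = 1 and U(worst) = 0, the indifference
    point p equals U(A).
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        p = (lo + hi) / 2
        if prefers_lottery(p):
            hi = p  # lottery already attractive enough: try a smaller p
        else:
            lo = p  # A still preferred: the lottery needs a larger p
    return (lo + hi) / 2

# Simulated agent whose hidden utility for A is 0.73 on the 0..1 scale.
hidden_u = 0.73
estimate = assess_utility(lambda p: p > hidden_u)
print(round(estimate, 3))  # 0.73
```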
16
Aside: Money ≠ Utility function

Given a lottery $L$ with expected monetary value $EMV(L)$, usually $U(L) < U(EMV(L))$, i.e., people are risk-averse
Would you rather have 1,000,000 for sure, or a lottery $[0.5, 0;\ 0.5, 3{,}000{,}000]$?
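A small numeric illustration of risk aversion for this lottery. The square-root utility of money is an assumption chosen only because it is concave; the slides do not specify a utility function, and the helper names are illustrative.

```python
import math

def lottery_emv(lottery):
    """Expected monetary value of a list of (probability, amount) pairs."""
    return sum(p * x for p, x in lottery)

def lottery_utility(lottery, u):
    """Expected utility of the lottery under the utility function u."""
    return sum(p * u(x) for p, x in lottery)

u = math.sqrt  # assumed concave utility of money, for illustration only

lottery = [(0.5, 0), (0.5, 3_000_000)]
emv = lottery_emv(lottery)  # 1,500,000

print(lottery_utility(lottery, u) < u(emv))        # True: U(L) < U(EMV(L))
print(u(1_000_000) > lottery_utility(lottery, u))  # True: the sure amount is preferred
```

Under this assumed utility the agent takes the sure 1,000,000 even though the lottery has the higher expected monetary value.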

17
Decision Networks

Extend BNs to handle actions and utilities
Also called influence diagrams
Make use of BN inference
Can do Value of Information calculations

18
Decision Networks cont.

Chance nodes: random variables, as in BNs
Decision nodes: actions that the decision maker can take
Utility/value nodes: the utility of the outcome state

19
R&N example
20
Prenatal Testing Example
21
Umbrella Network
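Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella
Utility node: happiness
$P(rain) = 0.4$
$P(umb \mid take) = 1.0$, $P(\lnot umb \mid \lnot take) = 1.0$
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$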
22
Evaluating Decision Networks

Set the evidence variables for the current state
For each possible value of the decision node:
  Set the decision node to that value
  Calculate the posterior probability of the parent nodes of the utility node, using BN inference
  Calculate the resulting utility for the action
Return the action with the highest utility
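The evaluation loop above can be sketched for the umbrella network. This is a minimal illustration, not a general decision-network solver: it assumes there is no evidence, that umbrella depends only on the decision and rain has no parents, and it replaces BN inference with direct enumeration. The numbers are those of the first umbrella slide, where taking the umbrella is deterministic; all identifiers are illustrative.

```python
from itertools import product

P_RAIN = 0.4
# P(umb = 1 | decision); deterministic, as on the first umbrella slide.
P_UMB = {"take": 1.0, "dont_take": 0.0}
# Utility table indexed by (umb, rain).
U = {(0, 0): 100, (0, 1): -100, (1, 0): 0, (1, 1): -25}

def joint(decision):
    """P(umb, rain | decision) for the parents of the utility node."""
    dist = {}
    for umb, rain in product((0, 1), repeat=2):
        p_u = P_UMB[decision] if umb else 1 - P_UMB[decision]
        p_r = P_RAIN if rain else 1 - P_RAIN
        dist[(umb, rain)] = p_u * p_r
    return dist

def evaluate(decisions):
    """Return (best decision, table of expected utilities)."""
    eu = {d: sum(p * U[k] for k, p in joint(d).items()) for d in decisions}
    return max(eu, key=eu.get), eu

best, eu = evaluate(["take", "dont_take"])
print(best, eu)
```

With these numbers the expected utility is about -10 for taking the umbrella and about 20 for leaving it, so the loop returns `dont_take`.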

23
Umbrella Network
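Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella
Utility node: happiness
$P(rain) = 0.4$
$P(umb \mid take) = 1.0$, $P(umb \mid \lnot take) = 0$
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$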
24
Umbrella Network
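Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella
Utility node: happiness
$P(rain) = 0.4$
$P(umb \mid take) = 0.8$, $P(umb \mid \lnot take) = 0.1$
umb  rain  $P(umb, rain \mid take)$
0    0     $0.2 \times 0.6$
0    1     $0.2 \times 0.4$
1    0     $0.8 \times 0.6$
1    1     $0.8 \times 0.4$
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$
$EU(take) = 100 \times 0.12 - 100 \times 0.08 + 0 \times 0.48 - 25 \times 0.32 = ???$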
25
Umbrella Network
So, in this case I would?
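Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella
Utility node: happiness
$P(rain) = 0.4$
$P(umb \mid take) = 0.8$, $P(umb \mid \lnot take) = 0.1$
umb  rain  $P(umb, rain \mid \lnot take)$
0    0     $0.9 \times 0.6$
0    1     $0.9 \times 0.4$
1    0     $0.1 \times 0.6$
1    1     $0.1 \times 0.4$
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$
$EU(\lnot take) = 100 \times 0.54 - 100 \times 0.36 + 0 \times 0.06 - 25 \times 0.04 = ???$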
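One way to fill in the two expected utilities left open on the last two slides, using only the probabilities and utilities shown there (worked arithmetic, not part of the original slides):

```latex
\begin{align*}
EU(take) &= 100(0.12) - 100(0.08) + 0(0.48) - 25(0.32) = 12 - 8 + 0 - 8 = -4 \\
EU(\lnot take) &= 100(0.54) - 100(0.36) + 0(0.06) - 25(0.04) = 54 - 36 + 0 - 1 = 17
\end{align*}
```

Under these numbers, not taking the umbrella has the higher expected utility.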
26
Value of Information

Idea: Compute the expected value of acquiring possible evidence
Example: buying oil drilling rights
Two blocks, A and B; exactly one of them has oil, worth $k$
Prior probability of oil in each block: 0.5
Current price of each block is $k/2$
What is the value of getting a survey of A done?
Survey will say "oil in A" or "no oil in A", each w/ prob. 0.5
Compute the expected value of information (VOI): expected value of the best action given the information, minus expected value of the best action without the information
VOI(Survey) = [0.5 × value of "buy A" given oil in A + 0.5 × value of "buy B" given no oil in A] − 0 = ??
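The oil example can be worked through numerically. This sketch assumes the survey is perfectly accurate and that not buying at all (value 0) is always an available action; the function name `voi_survey` and the sample value of `k` are illustrative.

```python
def voi_survey(k, p_oil_a=0.5):
    """Value of a perfect survey of block A when each block costs k / 2."""
    price = k / 2
    # Without information: buy A, buy B, or buy nothing (value 0).
    ev_without = max(p_oil_a * k - price, (1 - p_oil_a) * k - price, 0.0)
    # With the survey: pick the best action for each possible report.
    ev_if_oil_in_a = max(k - price, 0 - price, 0.0)     # best is to buy A
    ev_if_no_oil_in_a = max(0 - price, k - price, 0.0)  # best is to buy B
    ev_with = p_oil_a * ev_if_oil_in_a + (1 - p_oil_a) * ev_if_no_oil_in_a
    return ev_with - ev_without

print(voi_survey(k=100.0))  # 50.0, i.e. k / 2
```

Under these assumptions the survey is worth k/2: without it every action has expected value 0, and with it the agent always buys the block that has oil.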

27
Value of Information (VOI)

Suppose the agent's current knowledge is $E$. The value of the current best action $\alpha$ is
$EU(\alpha \mid E) = \max_A \sum_i U(Result_i(A))\, P(Result_i(A) \mid Do(A), E)$

28
Umbrella Network
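Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella, forecast
Utility node: happiness
$P(rain) = 0.4$
$P(umb \mid take) = 0.8$, $P(umb \mid \lnot take) = 0.1$
R  $P(F = rainy \mid R)$
0  0.2
1  0.7
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$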
29
VOI

$VOI(forecast) = P(rainy)\, EU(\alpha_{rainy}) + P(\lnot rainy)\, EU(\alpha_{\lnot rainy}) - EU(\alpha)$
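The VOI formula can be evaluated for the umbrella network with a forecast node. A minimal sketch, assuming the forecast depends only on rain, the umbrella depends only on the decision, and the numbers are those given on the forecast slide; the posteriors come from Bayes' rule and every identifier is illustrative.

```python
P_RAIN = 0.4
P_RAINY_GIVEN = {0: 0.2, 1: 0.7}         # P(F = rainy | R)
P_UMB = {"take": 0.8, "dont_take": 0.1}  # P(umb | decision)
U = {(0, 0): 100, (0, 1): -100, (1, 0): 0, (1, 1): -25}  # U[(umb, rain)]

def eu(decision, p_rain):
    """Expected utility of a decision when P(rain) = p_rain."""
    total = 0.0
    for umb in (0, 1):
        for rain in (0, 1):
            p_umb = P_UMB[decision] if umb else 1 - P_UMB[decision]
            p_r = p_rain if rain else 1 - p_rain
            total += p_umb * p_r * U[(umb, rain)]
    return total

def best_eu(p_rain):
    """Expected utility of the best decision for a given P(rain)."""
    return max(eu(d, p_rain) for d in P_UMB)

# Forecast marginal and rain posteriors by Bayes' rule.
p_rainy = P_RAINY_GIVEN[1] * P_RAIN + P_RAINY_GIVEN[0] * (1 - P_RAIN)
p_rain_if_rainy = P_RAINY_GIVEN[1] * P_RAIN / p_rainy
p_rain_if_not = (1 - P_RAINY_GIVEN[1]) * P_RAIN / (1 - p_rainy)

voi = (p_rainy * best_eu(p_rain_if_rainy)
       + (1 - p_rainy) * best_eu(p_rain_if_not)
       - best_eu(P_RAIN))
print(round(p_rainy, 3), round(voi, 3))  # 0.4 6.3
```

Under these assumptions the best action flips to taking the umbrella after a rainy forecast, and the forecast is worth about 6.3 utility units.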

30
umb  rain  $P(umb, rain \mid take, rainy)$
0    0
0    1
1    0
1    1
$EU(take \mid rainy)$
umb  rain  $P(umb, rain \mid take, \lnot rainy)$
0    0
0    1
1    0
1    1
$EU(take \mid \lnot rainy)$
umb  rain  $P(umb, rain \mid \lnot take, rainy)$
0    0
0    1
1    0
1    1
$EU(\lnot take \mid rainy)$
umb  rain  $P(umb, rain \mid \lnot take, \lnot rainy)$
0    0
0    1
1    0
1    1
$EU(\lnot take \mid \lnot rainy)$
31
Umbrella Network
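Decision node: Take Umbrella (take / don't take)
Chance nodes: rain, umbrella, forecast
Utility node: happiness
$P(F = rainy) = 0.4$
F  $P(R = rain \mid F)$
0  0.2
1  0.7
$P(umb \mid take) = 0.8$, $P(umb \mid \lnot take) = 0.1$
$U(\lnot umb, \lnot rain) = 100$, $U(\lnot umb, rain) = -100$
$U(umb, \lnot rain) = 0$, $U(umb, rain) = -25$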
32
umb  rain  $P(umb, rain \mid take, rainy)$
0    0
0    1
1    0
1    1
$EU(take \mid rainy)$
umb  rain  $P(umb, rain \mid take, \lnot rainy)$
0    0
0    1
1    0
1    1
$EU(take \mid \lnot rainy)$
umb  rain  $P(umb, rain \mid \lnot take, rainy)$
0    0
0    1
1    0
1    1
$EU(\lnot take \mid rainy)$
umb  rain  $P(umb, rain \mid \lnot take, \lnot rainy)$
0    0
0    1
1    0
1    1
$EU(\lnot take \mid \lnot rainy)$
33
VOI

$VOI(forecast) = P(rainy)\, EU(\alpha_{rainy}) + P(\lnot rainy)\, EU(\alpha_{\lnot rainy}) - EU(\alpha)$

34
Summary Simple Decision Making

Decision Theory = Probability Theory + Utility Theory
A rational agent operates by MEU
Decision Networks
Value of Information
