Title: MAKING SIMPLE DECISIONS
Russell & Norvig
Ideas
- Utility Functions
- Utility functions that depend on several quantities
- Decision Theoretic Agent
- Decision Networks / Influence Diagrams
- Information Theory
Utility Functions
- A utility function expresses an agent's preferences
  between world states: it assigns a single number to
  express the desirability of a state.
- States are complete snapshots of the world.
- U(S) is the utility of state S according to the agent
  making the decisions.
- [Figure: a non-deterministic action A, given the evidence
  available to the agent, leads to outcomes Result_1(A),
  Result_2(A), Result_3(A), each occurring with probability
  P(Result_i(A) | Do(A), E).]
Maximum Expected Utility (MEU)
- Expected Utility:
  EU(A | E) = Σ_i P(Result_i(A) | Do(A), E) U(Result_i(A))
- Principle of MEU: a rational agent should choose an
  action that maximizes its EU.
- Problems with MEU:
  - Computationally prohibitive.
  - Difficult to formulate any problem completely.
  - Knowing the initial state of the world requires
    perception, inference, knowledge representation and
    learning.
  - Computing P(Result_i(A) | Do(A), E) requires a complete
    causal model of the entire world.
  - Computing U(Result_i(A)) requires search and planning:
    an agent does not know the goodness of a state until it
    gets there.
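The MEU principle above can be sketched directly: for each action, sum P(Result_i(A) | Do(A), E) · U(Result_i(A)) over its outcomes and take the argmax. The actions, probabilities, and utilities below are invented for illustration.

```python
# Minimal MEU action selection over invented outcome tables.
def expected_utility(outcomes):
    """outcomes: list of (probability, utility) pairs for one action."""
    return sum(p * u for p, u in outcomes)

def meu_action(actions):
    """actions: dict mapping an action name to its outcome list."""
    return max(actions, key=lambda a: expected_utility(actions[a]))

actions = {
    "take_umbrella": [(0.3, 70), (0.7, 80)],   # rain / no rain
    "leave_umbrella": [(0.3, 0), (0.7, 100)],
}
best = meu_action(actions)   # EU: take ≈ 77, leave ≈ 70
```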
Utility Functions
- If an agent maximizes a utility function that correctly
  reflects the performance measure by which its behavior is
  being judged, then it will achieve the highest possible
  performance score, averaged over the possible environments
  in which the agent could be placed.
- Here we consider single (one-shot) decisions, as opposed
  to sequential decisions.
Utility Theory
- Complex scenarios are modeled as lotteries.
- The different outcomes are like prizes, and they are
  determined by chance: L = [p, A; 1-p, B].
- A lottery can have multiple outcomes; outcomes can be
  atomic states or other lotteries.
- Preferences between lotteries are written as:
  - A ≻ B: A is preferred to B.
  - A ~ B: the agent is indifferent between A and B.
  - A ≽ B: the agent prefers A to B or is indifferent
    between them.
Axioms of Utility Theory
- Orderability: an agent should know what it wants.
  (A ≻ B) ∨ (B ≻ A) ∨ (A ~ B)
- Transitivity: preferences are transitive.
  (A ≻ B) ∧ (B ≻ C) ⇒ (A ≻ C)
- Continuity: for some state B that lies between A and C in
  preference,
  A ≻ B ≻ C ⇒ ∃p [p, A; 1-p, C] ~ B
- Substitutability:
  A ~ B ⇒ [p, A; 1-p, C] ~ [p, B; 1-p, C]
- Monotonicity:
  A ≻ B ⇒ (p ≥ q ⇔ [p, A; 1-p, B] ≽ [q, A; 1-q, B])
- Decomposability: an agent should not prefer a lottery
  just because it has more choices than another.
  [p, A; 1-p, [q, B; 1-q, C]] ~ [p, A; (1-p)q, B; (1-p)(1-q), C]
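Decomposability can be spot-checked numerically: the compound lottery [p, A; 1-p, [q, B; 1-q, C]] and its flattened form assign the same expected utility. The utilities and probabilities below are invented.

```python
# Numeric check of the decomposability axiom with invented values.
p, q = 0.4, 0.25
U = {"A": 10.0, "B": 4.0, "C": 1.0}

inner = q * U["B"] + (1 - q) * U["C"]       # utility of [q, B; 1-q, C]
compound = p * U["A"] + (1 - p) * inner     # utility of the nested lottery
flattened = (p * U["A"] + (1 - p) * q * U["B"]
             + (1 - p) * (1 - q) * U["C"])  # flattened three-way lottery

assert abs(compound - flattened) < 1e-12
```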
Utility Principle
- The utility axioms refer only to preferences, a basic
  property of rational agents.
- Utility Principle: if the agent's preferences obey the
  axioms of utility, then there is a real-valued function U
  over states such that:
  - U(A) > U(B) iff A ≻ B
  - U(A) = U(B) iff A ~ B
- MEU Principle:
  - Utility of a lottery = Σ (probability of each outcome) ×
    (utility of that outcome)
  - U([p1, s1; ...; pn, sn]) = Σ_i pi U(si)
Choice of Utility Function
- Utility theory has its roots in economics.
- Money is an obvious candidate for a utility measure.
- Agents have a monotonic preference for money.
- Money behaves as an ordinal utility measure: an agent
  prefers more money to less when considering definite
  amounts.
Definitions of Agent Behavior
- Risk-averse agents: for a lottery L, the utility of being
  faced with the lottery is less than the utility of being
  handed the expected monetary value (EMV) of the lottery as
  a sure thing:
  U(S_L) < U(S_EMV(L))
- Risk-seeking agents:
  U(S_L) > U(S_EMV(L))
- Certainty equivalent of a lottery: the value an agent will
  accept in lieu of the lottery.
- Insurance premium: the difference between the EMV of a
  lottery and its certainty equivalent.
- Risk-neutral: an agent with a linear curve (utility versus
  money).
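These definitions can be made concrete with a small worked example. The sketch below assumes a risk-averse agent with U(x) = √x facing a 50/50 lottery over $0 and $100; all numbers are for demonstration only.

```python
import math

# Worked example of certainty equivalent and insurance premium
# for a hypothetical risk-averse agent with U(x) = sqrt(x).
U = math.sqrt
lottery = [(0.5, 0.0), (0.5, 100.0)]

emv = sum(p * x for p, x in lottery)             # expected monetary value: 50
eu_lottery = sum(p * U(x) for p, x in lottery)   # U(S_L) = 5.0
certainty_equivalent = eu_lottery ** 2           # invert sqrt: 25.0
insurance_premium = emv - certainty_equivalent   # 25.0

# U(S_L) = 5.0 < U(S_EMV(L)) = sqrt(50) ≈ 7.07, so the agent is risk averse.
```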
Assessing Utilities
- Establish a scale with a best possible prize (utility u⊤)
  and a worst possible catastrophe (utility u⊥).
- Normalized utilities: u⊤ = 1, u⊥ = 0.
- Utilities of intermediate outcomes are assessed by asking
  the agent to indicate a preference between an outcome
  state S and a standard lottery [p, u⊤; (1-p), u⊥].
Multi-attribute Utility Functions
- If an option has attributes X1, X2, ..., Xn with values
  x1, x2, ..., xn, then the utility function is
  U(x1, x2, ..., xn) = f(f1(x1), ..., fn(xn)).
- Strict dominance: one option has higher values on all
  attributes than another option.
- Cumulative distribution: measures the probability that the
  cost is ≤ any given amount; it integrates the original
  probability distribution.
- Stochastic dominance: action A1 stochastically dominates
  A2 on attribute X if
  ∀x ∫_{-∞}^{x} p1(x') dx' ≤ ∫_{-∞}^{x} p2(x') dx'.
- With n attributes of m possible values each, there are m^n
  possible outcomes; therefore we need simplified decision
  procedures.
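The stochastic-dominance test can be sketched on a discretised attribute. The code below assumes the attribute's values are ordered so that larger is better, and compares cumulative distributions pointwise; both distributions are invented.

```python
from itertools import accumulate

# Sketch of a stochastic-dominance check over a discretised attribute.
def stochastically_dominates(p1, p2):
    """True if the CDF of p1 is <= the CDF of p2 at every point."""
    return all(a <= b + 1e-12
               for a, b in zip(accumulate(p1), accumulate(p2)))

p1 = [0.1, 0.2, 0.7]   # mass shifted toward higher (better) values
p2 = [0.3, 0.4, 0.3]
assert stochastically_dominates(p1, p2)
assert not stochastically_dominates(p2, p1)
```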
Mutual Preferential Independence
- Two attributes X1 and X2 are preferentially independent of
  a third attribute X3 if the preference between outcomes
  (x1, x2, x3) and (x1', x2', x3) does not depend on the
  particular value x3 of X3.
- MPI: each attribute may matter, but it does not affect the
  way in which one trades off the other attributes against
  each other.
- If the attributes are MPI, then the agent's preference
  behavior can be described as maximizing the additive value
  function
  V(S) = Σ_i Vi(Xi(S)).
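The additive value function licensed by MPI can be sketched directly; the attributes and subvalue functions below are invented for illustration.

```python
# Additive value function V(S) = sum_i V_i(X_i(S)) over invented attributes.
def V(state, subvalues):
    return sum(v(state[attr]) for attr, v in subvalues.items())

subvalues = {
    "noise": lambda db: -db,        # quieter is better
    "cost": lambda m: -0.5 * m,     # cheaper is better
    "deaths": lambda d: -1000 * d,  # safety weighted heavily
}
site = {"noise": 40, "cost": 20, "deaths": 0}
value = V(site, subvalues)          # -40 - 10 - 0 = -50.0
```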
Decision Networks
- Combine belief networks with additional node types for
  actions and utilities.
- A decision network represents information about the
  agent's current state, its possible actions, the state
  resulting from the agent's action, and the utility of
  that state.
- [Figure: decision network for the airport-siting problem,
  with decision node Airport Site, chance nodes Air Traffic,
  Litigation, Construction, Deaths, Noise, and Cost, and
  utility node U.]
Evaluating Decision Networks
- Algorithm for evaluating decision networks:
  - Set the evidence variables for the current state.
  - For each possible value of the decision node:
    - Set the decision node to that value.
    - Calculate the posterior probabilities for the parent
      nodes of the utility node.
    - Calculate the resulting utility for the action.
  - Return the action with the highest utility.
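The evaluation loop above can be sketched on a toy decision network. The network here is hypothetical: one decision (take or leave an umbrella), one chance parent of the utility node ("wet"), conditioned on the decision and on the evidence (forecast = rainy). All CPT numbers are invented.

```python
# Toy decision-network evaluation: iterate decision values,
# compute the posterior of the utility node's parent, and
# return the highest-expected-utility action.
P_wet = {  # P(wet | decision, forecast=rainy); invented CPT
    ("take", "rainy"): 0.05,
    ("leave", "rainy"): 0.60,
}
U = {  # utility of (wet?, decision)
    (True, "take"): -20, (False, "take"): -2,
    (True, "leave"): -100, (False, "leave"): 0,
}

def evaluate(decision_values, evidence):
    best, best_eu = None, float("-inf")
    for d in decision_values:        # set the decision node to each value
        p = P_wet[(d, evidence)]     # posterior of the utility node's parent
        eu = p * U[(True, d)] + (1 - p) * U[(False, d)]
        if eu > best_eu:             # keep the highest-utility action
            best, best_eu = d, eu
    return best, best_eu

action, eu = evaluate(["take", "leave"], "rainy")
```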
Information Theory
- An important part of decision making is knowing which
  questions to ask.
- Let α be the current best action; its value is
  EU(α | E) = max_A Σ_i P(Result_i(A) | Do(A), E) U(Result_i(A))
- Value of the best action after new evidence Ej is
  obtained:
  EU(α_Ej | E, Ej) = max_A Σ_i P(Result_i(A) | Do(A), E, Ej) U(Result_i(A))
- The value of discovering Ej is defined as
  VPI_E(Ej) = ( Σ_k P(Ej = e_jk | E) EU(α_{e_jk} | E, Ej = e_jk) ) - EU(α | E)
- A myopic (greedy) agent uses VPI to request one piece of
  evidence at a time.
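The VPI formula can be worked through on a small example with two actions and one boolean evidence variable Ej; all probabilities and utilities below are invented.

```python
# Illustrative VPI computation over invented outcome tables.
def eu(outcomes):
    return sum(p * u for p, u in outcomes)

def best_eu(actions):
    return max(eu(o) for o in actions.values())

prior = {"a1": [(0.5, 100), (0.5, 0)],   # EU 50
         "a2": [(1.0, 60)]}              # EU 60 -> current best action
posterior = {  # outcome distributions after observing Ej = e_jk
    "e_true":  {"a1": [(0.9, 100), (0.1, 0)], "a2": [(1.0, 60)]},
    "e_false": {"a1": [(0.1, 100), (0.9, 0)], "a2": [(1.0, 60)]},
}
P_e = {"e_true": 0.5, "e_false": 0.5}    # P(Ej = e_jk | E)

vpi = sum(P_e[e] * best_eu(posterior[e]) for e in P_e) - best_eu(prior)
# 0.5*90 + 0.5*60 - 60 = 15 -> observing Ej is worth up to 15 utility
```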
Modular Utility Representation for Decision Theoretic
Planning
Michael P. Wellman and Jon Doyle
Decision Theoretic Planning
- Design planning systems around constructs that can be
  represented in terms of probabilities and utilities.
- Support computationally tractable inference about plans
  and partial plans.
- Goals do not provide any means to resolve tradeoffs among
  competing objectives or to express partial satisfaction.
- Ad hoc measures to account for this include augmenting
  goals with numeric achievement values attached to
  individual goals.
- Problem: such measures lack any precise meaning.
Decision Theoretic Planning
- Using decision theoretic preferences allows designers to:
  - Judge the coherence of objectives.
  - Judge the effectiveness of the planning system.
- Two options for using decision theory:
  - Specify a utility function over the entire domain and
    rank plan results by desirability.
  - Use modular representations that separately specify
    preference information so as to allow dynamic
    combination of the relevant factors.
Modularity
- Modularity in knowledge representation: specification of
  flexibly composable model elements.
- Synthesis of a composite plan involves viewing the overall
  effects of the plan as a modular combination of the
  effects of its constituent actions.
- Specify utility functions over individual features or
  small groups of features, and combine these in decision
  problems involving sets of features.
Utility Functions
- In planning, an outcome is the state resulting from
  execution of a plan.
- Outcomes are ranked by comparing the numeric values the
  utility function assigns to them.
- If u(ω) ≥ u(ω') then φ(u(ω)) ≥ φ(u(ω')) whenever φ is a
  monotonically increasing function.
- Both functions represent the same preference order, and
  so sanction identical decisions.
- Therefore u and φ ∘ u are strategically equivalent.
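Strategic equivalence can be checked in a few lines: a monotone transform of u ranks outcomes identically. The outcomes and utility values below are invented, and exp(x) + 7 stands in for an arbitrary monotonically increasing φ.

```python
import math

# Check that u and a monotone transform phi∘u induce the same
# preference order (strategic equivalence).
u = {"w1": 2.0, "w2": 5.0, "w3": 3.5}
phi = lambda x: math.exp(x) + 7   # monotonically increasing

rank_u = sorted(u, key=lambda w: u[w])
rank_phi = sorted(u, key=lambda w: phi(u[w]))
assert rank_u == rank_phi         # identical decisions sanctioned
```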
Multi-attribute Outcomes
- Attributes: the preference-relevant features of an
  outcome.
- To separate overall preference into preferences for
  individual attributes, a structure is imposed on the
  outcome space.
- A framing defines a multi-attribute representation of the
  outcome space.
- Each outcome ω is represented as a vector ⟨ω1, ω2, ..., ωn⟩,
  with each outcome attribute ωi drawn from an attribute
  space Ai.
Separability and Utility Independence
- Specifying a multi-dimensional function as a combination
  of functions of lower dimension depends on the
  separability of the various dimensions.
- The lower-dimensional functions themselves must also have
  a meaningful interpretation in terms of preferences:
  subutility functions.
- Preferences for attribute i must be invariant in some
  sense with respect to the other attributes.
- Specifying a utility function for an attribute implies
  that all decisions involving that attribute are
  determined, assuming all the other attributes are fixed.
- When the decisions do not depend on the fixed values the
  other attributes take, we have utility independence.
Utility Independence
- One attribute is UI of the remaining attributes if
  preferences for prospects over this attribute, holding the
  other attribute values fixed, do not depend on the fixed
  values of those attributes.
- UI is not symmetric.
- Without UI relationships it is not possible to refer to
  preferences over individual outcome features via
  subutility functions.
- If A1 is UI of A2 then u(ω1, ω2) = a·u(ω1, ω2*) + b for
  some a > 0, where ω2* is a fixed reference value.
- a and b may depend on ω2:
  u(ω1, ω2) = g(ω2)·u(ω1, ω2*) + h(ω2)
Multi-linear Decomposition
- Multi-linear decomposition: an n-dimensional utility
  function is separable into n-1 subutility functions for
  the individual attributes:
  u(ω1, ω2, ..., ωn) = f(ω1, u2(ω2), ..., un(ωn))
- The function f is linear in each argument (holding the
  others fixed) except the first.
- All attributes (except possibly the first) are UI of the
  rest.
- Disadvantage: O(2^n) scaling constants must be specified
  in addition to the single-attribute functions.
Multiplicative Decomposition
- A sum or product of subutility functions, each weighted by
  a scaling constant.
- Requires only O(n) parameters.
- Each subset of attributes must be UI of its complement.
- It is impossible to specify UI for all subsets directly.
- If two attribute sets are considered, each UI of its
  complement, with nonempty intersection Y, then writing the
  sets as X ∪ Y and Y ∪ Z, the sets X, Y, Z, X ∪ Z, and
  X ∪ Y ∪ Z are all UI of their respective complements.
- There is a unique decomposition hierarchy: a utility tree
  corresponding to any set of UI conditions.
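The O(n) parameter count can be illustrated with the multiplicative form 1 + k·u = Π_i (1 + k·k_i·u_i(ω_i)), which needs only the n constants k_i plus the master constant k, versus O(2^n) for the general multilinear form. The subutility values and constants below are invented; in practice k is solved from the k_i so that u is normalised to [0, 1].

```python
# Hypothetical sketch of a multiplicative utility decomposition
# for n = 3 attributes (all numbers illustrative).
def multiplicative_u(subutils, ks, k):
    prod = 1.0
    for ui, ki in zip(subutils, ks):
        prod *= 1.0 + k * ki * ui    # one factor per attribute
    return (prod - 1.0) / k

ks = [0.4, 0.3, 0.2]   # per-attribute scaling constants k_i
k = 0.5                # master constant (illustrative value only)
u = multiplicative_u([1.0, 0.5, 0.0], ks, k)  # subutilities in [0, 1]
```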
Example
- u1, u{2,3}, u{2,4,5,6}, u{4,5}, and u7 are the subutility
  functions.
- [Figure: utility tree over the attributes (1) Bulk cargo,
  (2) Expenses, (3) Tardiness, (4) Vehicles, (5) Facilities,
  (6) Human Resources, and (7) Safety, with multilinear and
  multiplicative combination nodes.]
Rationality and Intelligence
Stuart Russell, University of California, Berkeley
Intelligence
- Motivation for studying AI: to create and understand
  intelligence as a general property of systems, rather
  than as a specific attribute of humans.
- This presupposes that there is a productive notion of
  intelligence.
- In the agent-based view of AI, intelligence is strongly
  related to the capacity for successful behavior.
- Rational agents are agents whose actions make sense from
  the point of view of the information they possess and
  their goals.
- Rationality is a property of the actions; it does not
  specify the process by which the actions are selected.
Perfect Rationality
- Perfect rationality: the capacity to generate maximally
  successful behavior given the available information.
- f_opt = argmax_f V(f, E, U): the agent does the best it
  can.
- V(f, E, U) is the expected value of agent function f in
  environment E under utility function U.
- Problems:
  - Specifying utilities over time.
  - The relationship between goals and utility.
- Perfectly rational agents do not exist: there is a time
  lag to process information and select actions.
Rationality
- Calculative rationality: a program that, if executed
  infinitely fast, would result in perfectly rational
  behavior.
- Systems based on influence diagrams satisfy the decision
  theoretic version of calculative rationality.
- Metalevel rationality: finding an optimal tradeoff between
  computational costs and decision quality.
- Object level: carries out computations concerned with the
  application domain (computing the utility functions, etc.).
  These are actions with costs and benefits.
- Metalevel: a decision-making process over the object-level
  computations. A rational metalevel selects computations
  based on their expected utility.
Rational Metareasoning
- Information value: assume that the decision theoretic
  value of acquiring an additional piece of information can
  be calculated.
- This requires simulating the decision process that would
  be followed given each possible outcome of the
  information request.
- A time-versus-optimality tradeoff has to be made for
  metalevel computations.
- In some environments the most effective agent design may
  simply be a reactive agent, one that does no metareasoning
  at all.
- Bounded optimality: the capacity to generate maximally
  successful behavior given the available information and
  computational resources.