Decision Theory and Game Theory - PowerPoint PPT Presentation

Title: Lecture 6: MultiAgent Interactions
Subject: Introduction to MultiAgent Systems
Author: Jeff Rosenschein
Last modified by: yzhang

Slides: 36
Provided by: JeffRo162

1
Decision Theory and Game Theory
  • An Introduction to Multi-Agent Systems:
    http://www.csc.liv.ac.uk/~mjw/pubs/imas
  • Multiagent Systems: Algorithmic, Game-Theoretic,
    and Logical Foundations:
    http://www.masfoundations.org/resources.html

2
Decision Theory
  • Probability
  • Self-interested agents
  • Utilities and Preferences
  • Rationality

3
Decision Theory
  • Decision theory can be viewed as a game between
    an agent and nature.
  • Examples: lotteries, slot machines.

4
Probability
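  • $X_i$ is a variable that captures some aspect of the
    current state of the environment.
  • $x_i$ is a possible value of $X_i$.
  • $P(X_i = x_i)$ is the probability that $X_i$ takes
    the value $x_i$.
  • $P(X_i = x_i) \in [0, 1]$.
  • $\sum_{x_i} P(X_i = x_i) = 1$.
  • Joint probability:
    $P(X_1 = x_1, X_2 = x_2) = P(X_1 = x_1)\,P(X_2 = x_2)$
    if $X_1$ and $X_2$ are independent.
  • If not:
    $P(X_1 = x_1, X_2 = x_2) = P(X_1 = x_1 \mid X_2 = x_2)\,P(X_2 = x_2)$.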

5
Self-Interested Agents
  • What does it mean to say that an agent is
    self-interested?
  • Not that it wants to harm other agents
  • Not that it only cares about things that benefit
    itself
  • Rather, that the agent has its own preferences
    over how the environment is, and that its actions
    are motivated by those preferences

6
State-Action Diagram
  • Set of agents
  • Each agent has a set of actions
  • Actions define outcomes
  • For each possible set of actions there is an
    outcome.
  • Outcomes define payoffs
  • Agents derive utility from different outcomes

[Figure: state-action diagram; surviving labels: 5, 10, -8]

7
Utilities and Preferences
  • Assume we have just two agents: $Ag = \{i, j\}$
  • Assume $W = \{w_1, w_2, \ldots\}$ is the set of
    outcomes that agents have preferences over
  • We capture preferences by utility functions:
    $u_i : W \to \mathbb{R}$, $u_j : W \to \mathbb{R}$
  • Utility functions lead to preference orderings
    over outcomes:
    $w \succeq_i w'$ means $u_i(w) \ge u_i(w')$ (preferred);
    $w \succ_i w'$ means $u_i(w) > u_i(w')$ (strictly preferred)
  • $w \sim_i w'$ means $w \succeq_i w'$ and $w' \succeq_i w$
    (indifferent)

8
What is Utility?
  • Utility is not money (but it is a useful analogy)
  • Typical relationship between utility and money

9
Rationality
  • Agent attempts to maximize its expected utility.

  • Lottery 1: 0.5M w.p. 1
  • Lottery 2: 1M w.p. 0.5, 0 w.p. 0.5
  • The agent's strategy is the choice of lottery
  • Risk aversion => insurance companies

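The comparison of the two lotteries can be sketched in Python. This is only an illustration: the square-root utility function is an assumption chosen because it is concave (risk averse), and the prizes are treated as plain numbers in units of M.

```python
import math

# A lottery is a list of (prize, probability) pairs.
lottery_1 = [(0.5, 1.0)]               # 0.5M with probability 1
lottery_2 = [(1.0, 0.5), (0.0, 0.5)]   # 1M w.p. 0.5, 0 w.p. 0.5


def expected_utility(lottery, utility):
    """Probability-weighted average of the utility of each prize."""
    return sum(p * utility(x) for x, p in lottery)


def risk_neutral(x):
    # Utility equals money: only the expected prize matters.
    return x


def risk_averse(x):
    # Hypothetical concave utility (an assumption, not from the slides).
    return math.sqrt(x)


eu_neutral = [expected_utility(l, risk_neutral) for l in (lottery_1, lottery_2)]
eu_averse = [expected_utility(l, risk_averse) for l in (lottery_1, lottery_2)]

print(eu_neutral)  # both lotteries have the same expected value
print(eu_averse)   # the certain prize scores higher under concave utility
```

Under these assumptions a risk-neutral agent is indifferent between the lotteries, while the risk-averse agent prefers the certain 0.5M.
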
10
Game Theory
  • What is a game
  • Strategies
  • Dominant strategies
  • Nash equilibrium
  • Pareto optimality
  • Competition games
  • Cooperation games
  • Coordination games
  • Axelrod's tournament
  • Bounded rationality

11
What is a Game
  • Game: a formal representation of a situation of
    strategic interdependence
  • We focus on games where:
  • There are 2 or more players.
  • There is some choice of action where strategy
    matters.
  • The game has one or more outcomes, e.g. someone
    wins, someone loses.
  • The outcome depends on the strategies chosen by
    all players: there is strategic interaction.
  • What does this rule out?
  • Games of pure chance, e.g. lotteries, slot
    machines. (Strategies don't matter).
  • Games without strategic interaction between
    players, e.g. Solitaire

12
Strategies
  • Strategy
  • A strategy $s_i$ is a comprehensive plan of
    action: it defines the actions agent i should take
    in all possible states of the world
  • Example (Prisoner's Dilemma): Cooperate (remain
    silent) or Defect (confess)
  • Strategy profile: $s = (s_1, \ldots, s_n)$
  • $s_{-i} = (s_1, \ldots, s_{i-1}, s_{i+1}, \ldots, s_n)$
  • Utility function: $u_i(s_i, s_{-i})$
  • Note that the utility of an agent depends on the
    strategy profile, not just its own strategy
  • We assume agents are expected utility maximizers

13
Three Elements of a Game
  • The players
  • How many players are there?
  • Two players and two actions
  • Does nature/chance play a role?
  • Pure strategies
  • Mixed strategies (actions chosen with some
    probability)
  • A complete description of the strategies of each
    player
  • A description of the consequences (payoffs) for
    each player for every possible profile of
    strategy choices of all players.

14
Assumptions Game Theorists Make
  • Payoffs are known and fixed. People treat
    expected payoffs the same as certain payoffs
    (they are risk neutral).
  • Example: a risk-neutral person is indifferent
    between 25 for certain and a 25% chance of
    earning 100 with a 75% chance of earning 0.
  • We can relax this assumption to capture
    risk-averse behavior.
  • All players behave rationally.
  • They understand and seek to maximize their own
    payoffs.
  • They are flawless in calculating which actions
    will maximize their payoffs.
  • The rules of the game are common knowledge
  • Each player knows the set of players, strategies
    and payoffs from all possible combinations of
    strategies.

15
Multi-Agent Encounters
  • We need a model of the environment in which these
    agents will act
  • agents simultaneously choose an action to
    perform, and as a result of the actions they
    select, an outcome in W will result
  • the actual outcome depends on the combination of
    actions
  • assume each agent has just two possible actions
    that it can perform, C (cooperate) and D
    (defect)
  • Environment behavior is given by a state
    transformer function, which maps each combination
    of the agents' actions to an outcome in W

16
Multiagent Encounters
  • One state transformer function makes the
    environment sensitive to the actions of both
    agents.
  • Another gives neither agent any influence over
    the environment.
  • A third puts the environment under the control
    of j alone.
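The three kinds of environment can be sketched as lookup tables in Python. The outcome labels `w1` to `w4` and the specific assignments are assumptions made for illustration, not definitions taken from the slides.

```python
from itertools import product

ACTIONS = ("C", "D")

# Each table maps (action of i, action of j) to an outcome label.
# All three assignments are hypothetical examples.
both_matter = {("D", "D"): "w1", ("D", "C"): "w2",
               ("C", "D"): "w3", ("C", "C"): "w4"}
nobody_matters = {pair: "w1" for pair in product(ACTIONS, repeat=2)}
j_controls = {("D", "D"): "w1", ("D", "C"): "w2",
              ("C", "D"): "w1", ("C", "C"): "w2"}


def influences(tau, agent):
    """True if changing this agent's action alone can change the outcome.

    agent 0 is i (first position in the key), agent 1 is j.
    """
    for other in ACTIONS:
        outcomes = set()
        for own in ACTIONS:
            pair = (own, other) if agent == 0 else (other, own)
            outcomes.add(tau[pair])
        if len(outcomes) > 1:
            return True
    return False


for name, tau in [("both", both_matter), ("neither", nobody_matters),
                  ("j only", j_controls)]:
    print(name, influences(tau, 0), influences(tau, 1))
```
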

17
Rational Action
  • Suppose we have the case where both agents can
    influence the outcome, and they have utility
    functions as follows:
  • With a bit of abuse of notation:
  • Then agent i's preferences are:
  • C is the rational choice for i. (Because i
    prefers all outcomes that arise through C over
    all outcomes that arise through D.)

18
Normal Form
  • We can characterize the previous scenario in a
    payoff matrix
  • Agent i is the column player
  • Agent j is the row player
  • Normal form is a way of describing a game: it
    represents the game as a matrix.
  • This representation is especially useful for
    identifying dominated strategies and Nash
    equilibria.

19
Dominant Strategies
  • Given any particular strategy (either C or D) of
    agent i, there will be a number of possible
    outcomes
  • We say s1 dominates s2 if every outcome possible
    by i playing s1 is preferred over every outcome
    possible by i playing s2
  • A rational agent will never play a dominated
    strategy
  • So in deciding what to do, we can delete
    dominated strategies
  • Unfortunately, there isn't always a unique
    undominated strategy
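The dominance test described on this slide can be sketched as follows. Both utility tables are hypothetical, and the strict comparison is an assumption made so that two strategies with identical outcomes do not eliminate each other.

```python
ACTIONS = ("C", "D")

# Hypothetical utilities for agent i, keyed by (i's action, j's action).
u_clear = {("C", "C"): 4, ("C", "D"): 4, ("D", "C"): 1, ("D", "D"): 1}
u_mixed = {("C", "C"): 3, ("C", "D"): 1, ("D", "C"): 4, ("D", "D"): 2}


def outcomes(u, s):
    """Utilities agent i can receive by playing s, over all actions of j."""
    return [u[(s, other)] for other in ACTIONS]


def dominates(u, s1, s2):
    """Every outcome of playing s1 beats every outcome of playing s2."""
    return min(outcomes(u, s1)) > max(outcomes(u, s2))


def undominated(u):
    """Strategies that remain after deleting dominated ones."""
    return [s for s in ACTIONS
            if not any(dominates(u, t, s) for t in ACTIONS if t != s)]


print(undominated(u_clear))  # ['C']
print(undominated(u_mixed))  # ['C', 'D']
```

In the second table neither strategy dominates the other under this definition, which illustrates the last bullet: deleting dominated strategies does not always leave a single choice.
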

20
Nash Equilibrium
  • In general, we will say that two strategies s1
    and s2 are in Nash equilibrium if
  • under the assumption that agent i plays s1, agent
    j can do no better than play s2 and
  • under the assumption that agent j plays s2, agent
    i can do no better than play s1.
  • Neither agent has any incentive to deviate from a
    Nash equilibrium
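A minimal sketch of this definition for two agents with pure strategies is shown below. The function name, the action labels `a` and `b`, and the payoff numbers are all hypothetical.

```python
from itertools import product


def pure_nash_equilibria(u_i, u_j, actions_i, actions_j):
    """Return all (s1, s2) where neither agent gains by deviating alone.

    u_i and u_j map (i's action, j's action) to each agent's utility.
    """
    equilibria = []
    for s1, s2 in product(actions_i, actions_j):
        # i cannot do better than s1, given that j plays s2.
        i_ok = all(u_i[(s1, s2)] >= u_i[(a, s2)] for a in actions_i)
        # j cannot do better than s2, given that i plays s1.
        j_ok = all(u_j[(s1, s2)] >= u_j[(s1, b)] for b in actions_j)
        if i_ok and j_ok:
            equilibria.append((s1, s2))
    return equilibria


# Hypothetical 2x2 game used only to exercise the function.
actions = ("a", "b")
u_i = {("a", "a"): 3, ("a", "b"): 0, ("b", "a"): 2, ("b", "b"): 1}
u_j = {("a", "a"): 2, ("a", "b"): 1, ("b", "a"): 0, ("b", "b"): 3}

print(pure_nash_equilibria(u_i, u_j, actions, actions))
# [('a', 'a'), ('b', 'b')]
```
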

21
Nash Equilibrium
  • Interpretations
  • Focal points, self-enforcing agreements, stable
    social convention, consequence of rational
    inference.
  • Criticisms
  • They may not be unique (Bach or Stravinsky)
  • Ways of overcoming this
  • Refinements of equilibrium concept, Mediation,
    Learning
  • They do not exist in all games (in the form
    defined above)
  • They may be hard to find
  • People don't always behave as equilibria would
    predict (e.g. ultimatum games and notions of
    fairness)

22
Pareto Optimality
  • Sometimes, one outcome O is at least as good for
    every agent as another outcome O', and there is
    some agent who strictly prefers O to O'
  • In this case, it seems reasonable to say that O
    is better than O'
  • We say that O Pareto-dominates O'
  • An outcome O is Pareto-optimal (or
    Pareto-efficient) if there is no other outcome
    that Pareto-dominates it.
  • Pareto optimality is implied by social welfare
    maximization.
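These definitions translate directly into a short Python sketch. The outcome names and utility pairs below are hypothetical, and social welfare is taken to be the sum of utilities, which is an assumption.

```python
def pareto_dominates(o1, o2):
    """o1 and o2 are tuples of utilities, one entry per agent."""
    at_least_as_good = all(a >= b for a, b in zip(o1, o2))
    strictly_better = any(a > b for a, b in zip(o1, o2))
    return at_least_as_good and strictly_better


def pareto_optimal(outcomes):
    """Keep the outcomes that no other outcome Pareto-dominates."""
    return {name: payoffs for name, payoffs in outcomes.items()
            if not any(pareto_dominates(other, payoffs)
                       for other in outcomes.values())}


# Hypothetical outcomes: (utility of agent i, utility of agent j).
outcomes = {"O1": (3, 3), "O2": (4, 1), "O3": (1, 4), "O4": (2, 2)}

optimal = pareto_optimal(outcomes)
welfare_max = max(outcomes, key=lambda name: sum(outcomes[name]))

print(sorted(optimal))  # ['O1', 'O2', 'O3']
print(welfare_max)      # O1
```

Here O4 is dominated by O1, and the welfare-maximizing outcome O1 is among the Pareto-optimal ones, as the last bullet states.
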

23
Competition Games
  • Where the preferences of agents are diametrically
    opposed, we have strictly competitive scenarios
  • Competition Games
  • Players have exactly opposed interests
  • There must be precisely two players (otherwise
    they can't have exactly opposed interests)
  • There is some constant $C$ such that, for every
    strategy profile $s \in S$,
  • $u_i(s) + u_j(s) = C$
24
Zero-Sum Games
  • Zero-sum games are those where utilities sum to
    zero
  • $u_i(s) + u_j(s) = 0$
  • E.g. Matching Pennies (a zero-sum game)

                  i: Head    i: Tail
    j: Head        1, -1      -1, 1
    j: Tail       -1, 1        1, -1

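The Matching Pennies table can be checked with a small Python sketch. Reading the first number in each cell as the row player's payoff is an assumption, as are the function names; the sketch also looks at mixed strategies, where each player randomizes between heads and tails.

```python
from itertools import product

ACTIONS = ("H", "T")

# payoffs[(row action, column action)] = (row payoff, column payoff).
payoffs = {("H", "H"): (1, -1), ("H", "T"): (-1, 1),
           ("T", "H"): (-1, 1), ("T", "T"): (1, -1)}


def is_zero_sum(table):
    return all(a + b == 0 for a, b in table.values())


def pure_equilibria(table):
    """Cells where neither player gains by switching action alone."""
    found = []
    for r, c in product(ACTIONS, repeat=2):
        row_ok = all(table[(r, c)][0] >= table[(r2, c)][0] for r2 in ACTIONS)
        col_ok = all(table[(r, c)][1] >= table[(r, c2)][1] for c2 in ACTIONS)
        if row_ok and col_ok:
            found.append((r, c))
    return found


def expected_row_payoff(p, q):
    """Row plays H with probability p, column plays H with probability q."""
    prob = {"H": (p, q), "T": (1 - p, 1 - q)}
    return sum(prob[r][0] * prob[c][1] * payoffs[(r, c)][0]
               for r, c in product(ACTIONS, repeat=2))


print(is_zero_sum(payoffs))           # True
print(pure_equilibria(payoffs))       # []
print(expected_row_payoff(0.5, 0.5))  # 0.0
```

With these payoffs there is no pure-strategy equilibrium, and when the column player mixes 50/50 the row player's expected payoff is 0 whatever it does.
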
25
Cooperation Game
  • Players have exactly the same interests.
  • No conflict: all players want the same things
  • $\forall s \in S, \forall i, j: u_i(s) = u_j(s)$

               i: A      i: B
    j: A       1, 1      0, 0
    j: B       0, 0      1, 1

There are two Nash equilibria, (A, A) and (B, B).
Both are also Pareto optimal.

26
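Coordination Game - The Prisoner's Dilemma
  • The Prisoner's Dilemma is any game with the
    payoff structure below,
  • where $T > R > P > S$ (T: temptation to defect,
    R: reward for mutual cooperation, P: punishment
    for mutual defection, S: sucker's payoff).

Each cell lists the row player's payoff first.

               D         C
    D        P, P      T, S
    C        S, T      R, R
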
27
Prisoner's Dilemma
  • Two people are arrested for a crime. If neither
    suspect confesses, both are released. If both
    confess, they both get sent to jail. If one
    confesses and the other does not, the
    confessor gets a light sentence and the other
    gets a heavy sentence.

Remain silent = Cooperate (C); Confess = Defect (D)

               D         C
    D        P, P      T, S
    C        S, T      R, R

28
The Prisoner's Dilemma
  • Payoff matrix for the prisoner's dilemma (i is
    the column player, j is the row player):

                     i defects       i cooperates
    j defects        i: 2, j: 2      i: 1, j: 4
    j cooperates     i: 4, j: 1      i: 3, j: 3

  • Top left: If both defect, then both get the
    punishment for mutual defection
  • Top right: If i cooperates and j defects, i gets
    the sucker's payoff of 1, while j gets 4
  • Bottom left: If j cooperates and i defects, j
    gets the sucker's payoff of 1, while i gets 4
  • Bottom right: Reward for mutual cooperation

29
The Prisoner's Dilemma
  • The individually rational action is to defect.
    This guarantees a payoff of no worse than 2,
    whereas cooperating guarantees a payoff of only 1
    (the sucker's payoff)
  • So defection is the best response to all possible
    strategies: both agents defect, and get payoff
    2
  • But intuition says this is not the best
    outcome. Surely they should both cooperate and
    each get payoff of 3!
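The argument can be checked with a few lines of Python, using the payoffs quoted on these slides (4, 3, 2, 1). The variable and function names are illustrative only.

```python
T, R, P, S = 4, 3, 2, 1  # temptation, reward, punishment, sucker's payoff

ACTIONS = ("C", "D")

# payoff[(my action, opponent's action)] is my payoff; the game is symmetric.
payoff = {("D", "D"): P, ("D", "C"): T, ("C", "D"): S, ("C", "C"): R}


def best_response(opponent_action):
    """My payoff-maximizing action against a fixed opponent action."""
    return max(ACTIONS, key=lambda mine: payoff[(mine, opponent_action)])


# Defection is the best response whatever the opponent does.
defect_always_best = all(best_response(o) == "D" for o in ACTIONS)

# Yet mutual cooperation is strictly better for both than mutual defection.
mutual_defection = (payoff[("D", "D")], payoff[("D", "D")])
mutual_cooperation = (payoff[("C", "C")], payoff[("C", "C")])
cooperation_better_for_both = all(
    c > d for c, d in zip(mutual_cooperation, mutual_defection))

print(defect_always_best)           # True
print(mutual_defection)             # (2, 2)
print(cooperation_better_for_both)  # True
```
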

30
The Prisoner's Dilemma
  • This apparent paradox is the fundamental problem
    of multi-agent interactions. It appears to imply
    that cooperation will not occur in societies of
    self-interested agents.
  • Real world examples:
  • Nuclear arms reduction (why don't I keep mine
    ...)
  • Free rider systems: public transport
  • In the UK: television licenses
31
Axelrod's Tournament
  • Suppose you play the iterated prisoner's dilemma
    against a range of opponents. What strategy
    should you choose, so as to maximize your overall
    payoff?
  • Axelrod (1984) investigated this problem, with a
    computer tournament for programs playing the
    prisoner's dilemma

32
Strategies in Axelrod's Tournament
  • ALLD
  • Always defect: the hawk strategy
  • TIT-FOR-TAT
  • On round $u = 0$, cooperate
  • On round $u > 0$, do what your opponent did on
    round $u - 1$
  • TESTER
  • On 1st round, defect. If the opponent retaliated,
    then play TIT-FOR-TAT. Otherwise intersperse
    cooperation and defection.
  • JOSS
  • As TIT-FOR-TAT, except periodically defect

33
Recipes for Success in Axelrod's Tournament
  • Axelrod suggests the following rules for
    succeeding in his tournament:
  • Don't be envious: Don't play as if it were zero
    sum!
  • Be nice: Start by cooperating, and reciprocate
    cooperation
  • Retaliate appropriately: Always punish defection
    immediately, but use measured force; don't
    overdo it
  • Don't hold grudges: Always reciprocate
    cooperation immediately

34
Game of Chicken
  • Consider another type of encounter: the game of
    chicken. (Think of James Dean in Rebel Without
    a Cause: swerving = cooperate, driving straight
    = defect.)
  • Strategies (c,d) and (d,c) are in Nash
    equilibrium
  • Difference from the prisoner's dilemma: mutual
    defection is the most feared outcome. (Whereas
    the sucker's payoff is most feared in the
    prisoner's dilemma.)
  • It refers to a situation in which there is a
    competition for a shared resource and the
    contestants can choose either conciliation or
    conflict.
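A small Python sketch can confirm both claims for one possible payoff assignment. The numbers are illustrative assumptions chosen so that mutual defection is the worst outcome; they are not taken from the slides.

```python
from itertools import product

ACTIONS = ("C", "D")

# payoff[(my action, opponent's action)] is my payoff; the game is symmetric.
# Hypothetical values: temptation 4, mutual cooperation 3,
# sucker's payoff 2, mutual defection 1.
payoff = {("D", "C"): 4, ("C", "C"): 3, ("C", "D"): 2, ("D", "D"): 1}


def is_best_response(mine, theirs):
    """True if `mine` maximizes my payoff against `theirs`."""
    return payoff[(mine, theirs)] == max(payoff[(a, theirs)] for a in ACTIONS)


# A profile is a Nash equilibrium if each action is a best response
# to the other.
equilibria = [(a, b) for a, b in product(ACTIONS, repeat=2)
              if is_best_response(a, b) and is_best_response(b, a)]

# The outcome with the lowest payoff for a player.
most_feared = min(payoff, key=payoff.get)

print(equilibria)   # [('C', 'D'), ('D', 'C')]
print(most_feared)  # ('D', 'D')
```
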

35
Bounded Rationality
  • According to Herbert Simon, perfectly rational
    decisions are often not feasible in practice
    because of the finite computational resources
    available for making them.
  • Game theory assumes that it is possible to
    characterize an agent's preferences with respect
    to possible outcomes. Humans, however, find it
    extremely hard to consistently define their
    preferences over outcomes.
  • Most game-theoretic negotiation techniques tend
    to assume the availability of unlimited
    computational resources to find an optimal
    solution: they have the characteristics of
    NP-hard problems.