Lecture V: Game Theory
Transcript and Presenter's Notes
1
Lecture V Game Theory
  • Zhixin Liu
  • Complex Systems Research Center,
  • Academy of Mathematics and Systems Sciences, CAS

2
In the last two lectures, we talked about
  • Multi-Agent Systems
  • Analysis
  • Intervention

3
In this lecture, we will talk about
  • Game theory
  • complex interactions between people

4
Start With A Game
  • Rock-paper-scissors

                         B
                  rock       paper      scissors
      rock      ( 0, 0)    (-1, 1)    ( 1,-1)
  A   paper     ( 1,-1)    ( 0, 0)    (-1, 1)
      scissors  (-1, 1)    ( 1,-1)    ( 0, 0)

(A's payoff listed first.)
Other games: poker, go, chess, bridge,
basketball, football, ...
5
From Games To Game Theory
  • Some hints from these games
  • Rules
  • Results (payoffs)
  • Strategies
  • Interactions between strategies and payoffs
  • Games are everywhere.
  • Economic systems: oligopoly and monopoly, markets,
    trade
  • Political systems: voting, presidential
    elections, international relations
  • Military systems: wars, negotiations, ...
  • Game theory:
  • the study of the strategic interactions among
    rational agents.
  • Rationality:
  • implies that each player tries to maximize
    his/her payoff

Not to beat the other players
6
History of Game Theory
  • 1928, John von Neumann proved the minimax theorem
  • 1944, John von Neumann & Oskar Morgenstern,
    Theory of Games and Economic Behavior
  • 1950s, John Nash, Nash Equilibrium
  • 1970s, John Maynard Smith, Evolutionarily stable
    strategy
  • Eight game theorists have won Nobel prizes in
    economics

7
Elements of A Game
  • Players
  • Who is interacting? N = {1, 2, ..., n}
  • Actions/Moves: what can the players do?
  • Action set
  • Payoff: what do the players get from the game?

8
Strategy
  • Strategy: a complete plan of actions
  • Mixed strategy: a probability distribution over the
    pure strategies
  • Payoff

A pure strategy is a special case of a mixed
strategy.
9
An Example Rock-paper-scissor
  • Players: A and B
  • Actions/Moves:
  • rock, paper, scissors
  • Payoffs:
  • u1(rock, scissors) = 1
  • u2(scissors, paper) = -1
  • Mixed strategies:
  • s1 = (1/3, 1/3, 1/3)
  • s2 = (0, 1/2, 1/2)
  • u1(s1, s2) = 1/3·(0·0 + 1/2·(-1) + 1/2·1)
               + 1/3·(0·1 + 1/2·0 + 1/2·(-1))
               + 1/3·(0·(-1) + 1/2·1 + 1/2·0)
               = 0

                         B
                  rock       paper      scissors
      rock      ( 0, 0)    (-1, 1)    ( 1,-1)
  A   paper     ( 1,-1)    ( 0, 0)    (-1, 1)
      scissors  (-1, 1)    ( 1,-1)    ( 0, 0)
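To make the expected-payoff calculation above concrete, here is a minimal Python sketch (it uses numpy; the payoff matrix and the mixed strategies s1, s2 are the ones on this slide):

```python
import numpy as np

# Player A's payoffs; rows = A's action, columns = B's action,
# both ordered (rock, paper, scissors). The game is zero sum, so B's payoff is -U.
U = np.array([[ 0, -1,  1],
              [ 1,  0, -1],
              [-1,  1,  0]])

s1 = np.array([1/3, 1/3, 1/3])   # A's mixed strategy
s2 = np.array([0.0, 0.5, 0.5])   # B's mixed strategy

# Expected payoff: u1(s1, s2) = sum_i sum_j s1[i] * s2[j] * U[i, j]
u1 = s1 @ U @ s2
print(u1)   # 0.0, as computed on the slide
```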
10
Classifications of Games
  • Cooperative and non-cooperative games
  • Cooperative game: players are able to form
    binding commitments.
  • Non-cooperative game: the players make
    decisions independently.
  • Zero-sum and non-zero-sum games
  • Zero-sum game: the total payoff to all
    players is zero. E.g., poker, go, ...
  • Non-zero-sum game: e.g., the prisoner's dilemma
  • Finite and infinite games
  • Finite game: the players and the actions are
    finite.
  • Simultaneous and sequential (dynamic) games
  • Simultaneous game: players move
    simultaneously, or, if they do not move
    simultaneously, the later players are unaware of
    the earlier players' actions.
  • Sequential game: later players have some
    knowledge about earlier actions.
  • Perfect-information and imperfect-information
    games
  • Perfect-information game: all players know
    the moves previously made by all other players.
    E.g., chess, go, ...
  • Perfect information ≠ complete information

Complete information: every player knows the strategies and payoffs of
the other players, but not necessarily their actions.
11
We will first focus on games that are simultaneous,
complete-information, non-cooperative, and finite.
  • What is the solution of the game?

12
Assumption
  • Assume that each player
  • knows the structure of the game,
  • attempts to maximize his payoff,
  • attempts to predict the moves of his opponents, and
  • knows that this is common knowledge among
    the players.

13
 Dominated Strategy 
A strategy is dominated if, regardless of what the
other players do, it earns the player a smaller
payoff than some other strategy.
S-i: the strategy set formed by all players
other than player i
  • A strategy s' of player i is called a strictly
    dominated strategy if there exists a strategy s
    such that
    ui(s, s-i) > ui(s', s-i)  for every s-i in S-i.

14
Elimination of Dominated Strategies 
Example (row player's payoff listed first)

Original game:
                 L        M        R
      U        4,3      5,1      6,2
      M        2,1      8,4      3,6
      D        3,0      9,6      2,8

Step 1: column M is strictly dominated by column R
(for the column player); eliminate it.
                 L        R
      U        4,3      6,2
      M        2,1      3,6
      D        3,0      2,8

Step 2: rows M and D are strictly dominated by row U;
eliminate them.
                 L        R
      U        4,3      6,2

Step 3: column R is strictly dominated by column L;
eliminate it.
                 L
      U        4,3

(U,L) is the solution of the game.
A dominant strategy may not exist!
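The elimination procedure above is mechanical enough to automate. Below is a small Python sketch of iterated elimination of strictly dominated pure strategies; the matrices A (row player) and B (column player) encode the example on this slide, and the function name eliminate is mine, not from the slides.

```python
import numpy as np

# Row player's and column player's payoffs for the 3x3 example above.
# Rows: U, M, D; columns: L, M, R.
A = np.array([[4, 5, 6],
              [2, 8, 3],
              [3, 9, 2]])
B = np.array([[3, 1, 2],
              [1, 4, 6],
              [0, 6, 8]])
rows, cols = ["U", "M", "D"], ["L", "M", "R"]

def eliminate(A, B, rows, cols):
    """Repeatedly delete strictly dominated pure strategies."""
    changed = True
    while changed:
        changed = False
        # Row i is strictly dominated if some row k does better against every column.
        for i in range(len(rows)):
            if any(np.all(A[k] > A[i]) for k in range(len(rows)) if k != i):
                A, B = np.delete(A, i, 0), np.delete(B, i, 0)
                rows = rows[:i] + rows[i+1:]
                changed = True
                break
        if changed:
            continue
        # Column j is strictly dominated if some column k does better against every row.
        for j in range(len(cols)):
            if any(np.all(B[:, k] > B[:, j]) for k in range(len(cols)) if k != j):
                A, B = np.delete(A, j, 1), np.delete(B, j, 1)
                cols = cols[:j] + cols[j+1:]
                changed = True
                break
    return rows, cols

print(eliminate(A, B, rows, cols))   # (['U'], ['L'])
```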
15
Definition of Nash Equilibrium
  • Nash Equilibrium (NE): a solution concept of a
    game
  • (N, S, u): a game
  • Si: strategy set of player i
  • S = S1 × S2 × ... × Sn: the set of
    strategy profiles
  • ui : S → R: payoff function of player i
  • s-i: strategy profile of all players except
    player i
  • A strategy profile s* is called a Nash
    equilibrium if
    ui(si*, s-i*) ≥ ui(si, s-i*)  for every player i,
    where si is any pure strategy of player i.

16
Remarks on Nash Equilibrium
  • A set of strategies, one for each player, such
    that each player's strategy is a best response to
    the others' strategies
  • Best response:
  • the strategy that maximizes a player's payoff given
    the others' strategies.
  • No player can do better by unilaterally changing
    his or her strategy.
  • A dominant-strategy profile is a NE.

17
Example
  • Players: Smith and Louis
  • Actions: Advertise, Do Not Advertise
  • Payoffs: the companies' profits
  • Each firm earns 50 million from its customers
  • Advertising costs a firm 20 million
  • Advertising captures 30 million from the competitor
  • How to represent this game?

18
Strategic Interactions
                         Smith
                   Ad            No Ad
  Louis   Ad     (30,30)        (60,20)
          No Ad  (20,60)        (50,50)

(Louis's payoff listed first.)
19
Best Responses
  • Best response for Louis:
  • If Smith advertises: advertise
  • If Smith does not advertise: advertise
  • The best response for Smith is the same.
  • (Ad, Ad) is a dominant-strategy profile!
  • (Ad, Ad) is a NE!
  • This is another Prisoner's Dilemma!
    (A small sketch verifying this follows the table below.)

                         Smith
                   Ad            No Ad
  Louis   Ad     (30,30)        (60,20)
          No Ad  (20,60)        (50,50)
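The payoffs above follow directly from the three numbers on the previous slide, and the equilibrium claim can be checked mechanically. A minimal Python sketch (the parameter names and the list comprehension for the Nash check are mine, not from the slides):

```python
import numpy as np

REVENUE, AD_COST, CAPTURE = 50, 20, 30   # millions, from the slide

def profit(my_ad: bool, other_ad: bool) -> int:
    """One firm's profit given both advertising decisions."""
    p = REVENUE
    if my_ad:
        p += CAPTURE - AD_COST      # capture 30 from the rival, pay 20 for ads
    if other_ad:
        p -= CAPTURE                # rival captures 30 from us
    return p

acts = [True, False]                 # Ad, No Ad
louis = np.array([[profit(a, b) for b in acts] for a in acts])  # row player
smith = np.array([[profit(b, a) for b in acts] for a in acts])  # column player
print(louis)   # [[30 60] [20 50]]

# Pure-strategy NE: no player gains by a unilateral deviation.
nash = [(i, j) for i in range(2) for j in range(2)
        if louis[i, j] == louis[:, j].max() and smith[i, j] == smith[i, :].max()]
print(nash)    # [(0, 0)]  ->  (Ad, Ad), as on the slide
```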
20
Nash Equilibrium
  • NE may be a pair of mixed strategies.
  • Example

                        B
                  Head          Tail
  A     Head    ( 1,-1)       (-1, 1)
        Tail    (-1, 1)       ( 1,-1)

Matching Pennies
Both players mixing (1/2, 1/2) is the Nash equilibrium.
21
Existence of NE
  • Theorem (J. Nash, 1950)
  • Every finite game has at least one
    Nash equilibrium (in pure or mixed
    strategies).

22
Nash Equilibrium
  • A NE may not be a good outcome of the game; it can
    differ from the socially optimal solution.
  • e.g.,

                         Smith
                   Ad            No Ad
  Louis   Ad     (30,30)        (60,20)
          No Ad  (20,60)        (50,50)
23
Nash Equilibrium
  • A game may have more than one NE.
  • e.g., the Battle of the Sexes
  • NE: (opera, opera), (football, football), and the
    mixed equilibrium ((2/3, 1/3), (1/3, 2/3))

                        Husband
                   Opera        Football
  Wife   Opera     (2,1)         (0,0)
         Football  (0,0)         (1,2)

(Wife's payoff listed first.)
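The mixed equilibrium quoted above comes from the indifference condition: each player mixes so that the other player is indifferent between opera and football. A small Python sketch under that assumption (sympy is used only to solve the two linear equations; the variable names p and q are mine):

```python
from sympy import symbols, solve

p, q = symbols("p q")   # p = P(Wife plays opera), q = P(Husband plays opera)

# Payoffs from the slide: (wife, husband) = (2,1) at (opera, opera), (1,2) at (football, football).
# Husband indifferent between opera and football when the Wife mixes with p:
husband_indiff = (p * 1 + (1 - p) * 0) - (p * 0 + (1 - p) * 2)
# Wife indifferent between opera and football when the Husband mixes with q:
wife_indiff = (q * 2 + (1 - q) * 0) - (q * 0 + (1 - q) * 1)

print(solve([husband_indiff, wife_indiff], [p, q]))   # {p: 2/3, q: 1/3}
```

So the Wife plays (opera 2/3, football 1/3) and the Husband plays (opera 1/3, football 2/3), matching the slide.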
24
Nash Equilibrium
  • Two-person zero-sum games: a saddle point is a
    solution.

25
Nash Equilibrium
  • Many refinements of NE exist: refined NE, Bayesian NE,
    subgame-perfect NE, perfect Bayesian NE, ...
  • Finding NEs is very difficult.
  • NE only tells us that if the game reaches such a
    state, then no player has an incentive to change
    his strategy unilaterally. It does not tell us
    how to reach such a state.

26
  • Iterated Prisoner's Dilemma

27
Cooperation
  • Groups of organisms
  • Mutual cooperation benefits all agents
  • Lack of cooperation is harmful to them
  • Another type of cooperation
  • Cooperating agents do well
  • Any single agent does better by failing to cooperate
  • The Prisoner's Dilemma is an elegant embodiment of this

28
Prisoner's Dilemma
  • The story of the prisoner's dilemma
  • Players: two prisoners
  • Actions: Cooperate (C), Defect (D)
  • Payoff matrix:

                      Prisoner B
                    C           D
  Prisoner A   C  (3,3)       (0,5)
               D  (5,0)       (1,1)

(Prisoner A's payoff listed first.)
29
Prisoner's Dilemma
  • No matter what the other does, the best choice
    is D.
  • (D,D) is a Nash equilibrium.
  • But if both choose D, both do worse than
    if both had selected C.

                      Prisoner B
                    C           D
  Prisoner A   C  (3,3)       (0,5)
               D  (5,0)       (1,1)
30
Iterated Prisoner's Dilemma
  • The individuals
  • meet many times,
  • can recognize a previous interactant,
  • remember the prior outcome.
  • A strategy specifies the probability of cooperating
    or defecting based on the history:
  • P(C) = f1(History)
  • P(D) = f2(History)

31
Strategies
  • Tit For Tat: cooperate on the first move, then
    repeat the opponent's last choice
    (a minimal implementation sketch follows).
  • Player A: C D D C C C C C D D D D C
  • Player B: D D C C C C C D D D D C
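Here is the minimal sketch referenced above: Tit For Tat written as a function of the opponent's history, replayed against Player B's move sequence from this slide (C = cooperate, D = defect).

```python
def tit_for_tat(opponent_history):
    """Cooperate on the first move, then repeat the opponent's last choice."""
    return "C" if not opponent_history else opponent_history[-1]

# Player B's moves from the slide; Player A responds with Tit For Tat.
b_moves = list("DDCCCCCDDDDC")
a_moves = []
for t in range(len(b_moves) + 1):            # A moves once more after B's last move
    a_moves.append(tit_for_tat(b_moves[:t]))
print("".join(a_moves))                      # CDDCCCCCDDDDC, matching Player A's row
```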

32
Strategies
  • Tit For Tat - cooperate on the first move, then
    repeat opponent's last choice.
  • Tit For Tat and Random - Repeat opponent's last
    choice skewed by random setting.
  • Tit For Two Tats and Random - Like Tit For Tat
    except that opponent must make the same choice
    twice in a row before it is reciprocated. Choice
    is skewed by random setting.
  • Tit For Two Tats - Like Tit For Tat except that the
    opponent must make the same choice twice in a row
    before it is reciprocated.
  • Naive Prober (Tit For Tat with Random Defection)
    - Repeat opponent's last choice (ie Tit For Tat),
    but sometimes probe by defecting in lieu of
    cooperating.
  • Remorseful Prober (Tit For Tat with Random
    Defection) - Repeat opponent's last choice (ie
    Tit For Tat), but sometimes probe by defecting in
    lieu of cooperating. If the opponent defects in
    response to probing, show remorse by cooperating
    once.
  • Naive Peace Maker (Tit For Tat with Random
    Co-operation) - Repeat opponent's last choice (ie
    Tit For Tat), but sometimes make peace by
    co-operating in lieu of defecting.
  • True Peace Maker (hybrid of Tit For Tat and Tit
    For Two Tats with Random Cooperation) - Cooperate
    unless opponent defects twice in a row, then
    defect once, but sometimes make peace by
    cooperating in lieu of defecting.
  • Random - cooperate or defect at random, always with 50%
    probability.

33
Strategies
  • Always Defect
  • Always Cooperate
  • Grudger (Co-operate, but only be a sucker once) -
    Cooperate until the opponent defects. Then always
    defect unforgivingly.
  • Pavlov (repeat last choice if good outcome) - If
    5 or 3 points scored in the last round then
    repeat last choice.
  • Pavlov / Random (repeat last choice if good
    outcome and Random) - If 5 or 3 points scored in
    the last round then repeat last choice, but
    sometimes make random choices.
  • Adaptive - Starts with c,c,c,c,c,c,d,d,d,d,d and
    then takes choices which have given the best
    average score re-calculated after every move.
  • Gradual - Cooperates until the opponent defects;
    in that case, defects as many times as the
    opponent has defected during the game so far,
    followed by two co-operations.
  • Suspicious Tit For Tat - As for Tit For Tat
    except begins by defecting.
  • Soft Grudger - Cooperates until the opponent
    defects; in that case the opponent is punished
    with d,d,d,d,c,c.
  • Customised strategy 1 - default settings are T=1,
    P=1, R=1, S=0, B=1; always co-operate unless
    a sucker (i.e., 0 points scored).
  • Customised strategy 2 - default settings are T=1,
    P=1, R=0, S=0, B=0; always play alternating
    defect/cooperate.

34
Iterated Prisoner's Dilemma
  • The same players repeat the prisoner's dilemma
    many times.
  • After ten rounds:
  • The best possible income is 50 (defecting every
    round against an unconditional cooperator).
  • A common outcome is mutual cooperation, which
    gives each player 30.
  • The extreme case is mutual defection, in which
    each player gets only 10.
  • The most likely case is that each player plays a
    mixture of defection and cooperation.

                      Prisoner B
                    C           D
  Prisoner A   C  (3,3)       (0,5)
               D  (5,0)       (1,1)
35
Iterated Prisoner's Dilemma
  • Which strategy can thrive? What is a good
    strategy?
  • Robert Axelrod, 1980s:
  • a computer round-robin tournament

AXELROD R. 1987. The evolution of strategies in
the iterated Prisoners' Dilemma. In L. Davis,
editor, Genetic Algorithms and Simulated
Annealing. Morgan Kaufmann, Los Altos, CA.
36
The first round
  • Strategies: 14 entries plus a random strategy,
  • including Markov processes and Bayesian inference
  • Every pair of strategies meets, giving 15 × 15
    pairings in total, and each pair plays the game
    200 times
  • Payoff: the average score, Σ_S' U(S, S') / 15
  • Tit For Tat wins (cooperation based on
    reciprocity); a round-robin sketch follows
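To give a feel for how such a tournament runs, here is a compact Python sketch of an Axelrod-style round robin under the payoff matrix used in these slides. It is a simplification: the entrant list here is just four strategies named on earlier slides, not Axelrod's actual 14 entries, and ties are possible in such a small field.

```python
import itertools

PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}

# Each strategy maps (my_history, opponent_history) to "C" or "D".
def tit_for_tat(me, opp):      return "C" if not opp else opp[-1]
def always_defect(me, opp):    return "D"
def always_cooperate(me, opp): return "C"
def grudger(me, opp):          return "D" if "D" in opp else "C"

STRATEGIES = {"TFT": tit_for_tat, "AllD": always_defect,
              "AllC": always_cooperate, "Grudger": grudger}

def match(s1, s2, rounds=200):
    """Play two strategies against each other and return their total scores."""
    h1, h2, score1, score2 = [], [], 0, 0
    for _ in range(rounds):
        m1, m2 = s1(h1, h2), s2(h2, h1)
        p1, p2 = PAYOFF[(m1, m2)]
        score1, score2 = score1 + p1, score2 + p2
        h1.append(m1); h2.append(m2)
    return score1, score2

# Round robin: every strategy meets every strategy (including itself), as in Axelrod's setup.
totals = {name: 0 for name in STRATEGIES}
for (n1, f1), (n2, f2) in itertools.product(STRATEGIES.items(), repeat=2):
    totals[n1] += match(f1, f2)[0]

for name, total in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(name, total / len(STRATEGIES))   # average score per pairing
```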

37
The first round
Naive Prober - Repeat opponent's last choice, but
sometimes probe by defecting in lieu of
cooperating
  • Characteristics of good strategies
  • Goodness: never defect first
  • TFT vs. Naive Prober
  • Forgiveness: may retaliate, but the memory is
    short
  • TFT vs. Grudger

Grudger - Cooperate until the opponent defects.
Then always defect unforgivingly
38
Winning Vs. High Scores
  • This is not a zero-sum game; there is a banker.
  • TFT never wins a single game; the best it can do
    is tie with its opponent.
  • Trying to beat the opponent is a kind of envy, and
    it does not work well.
  • Cooperation can emerge in a group of
    selfish agents.

39
The second round
  • Strategies: 62 entries plus a random strategy
  • "good" strategies
  • "wily" strategies
  • Tit For Tat wins again
  • Winning or losing depends on the circumstances.

40
Characteristics of good strategies
  • Goodness: never defect first
  • First round: the top eight strategies were all
    good
  • Second round: fourteen of the top fifteen
    strategies were good
  • Forgiveness: may retaliate, but the memory is
    short
  • Grudger is not a forgiving strategy
  • Goodness and forgiveness are a kind of
    collective behavior.
  • For a single agent, defection is the best strategy.

41
Evolution of the Strategies
  • Evolve good strategies by genetic algorithm
    (GA)

42
What is a good strategy?
  • Is TFT a good strategy?
  • Tit For Two Tats may have been the best strategy in
    the first round, but it is not a good strategy in
    the second round.
  • A good strategy depends on the environment.
  • Tit For Two Tats - Like Tit For Tat except that the
    opponent must make the same choice twice in a row
    before it is reciprocated.

Evolutionarily stable strategy
43
Evolutionarily stable strategy (ESS)
  • Introduced by John Maynard Smith and George R.
    Price in 1973
  • An ESS is a strategy such that, if all members of
    the population adopt it, then no mutant strategy
    can invade the population under the influence
    of natural selection.
  • An ESS is robust under evolution: it cannot be
    invaded by mutants.

John Maynard Smith, Evolution and the Theory of
Games
44
Definition of ESS
  • A strategy x is an ESS if, for all y ≠ x,
    u(x, εy + (1-ε)x) > u(y, εy + (1-ε)x)
    holds for all sufficiently small positive ε.
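For reference, a standard equivalent form of this definition (Maynard Smith's two conditions) is stated below in a short LaTeX block; it is not on the slide but follows from the ε-definition above.

```latex
% x is an ESS iff, for every mutant strategy y != x, either
%   (i)  x does strictly better against x than y does, or
%   (ii) they do equally well against x, and x does strictly better against y.
\forall y \neq x:\qquad
u(x,x) > u(y,x)
\quad\text{or}\quad
\bigl(\, u(x,x) = u(y,x) \ \text{and}\ u(x,y) > u(y,y) \,\bigr)
```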

45
ESS
  • An ESS is defined in a population with a large
    number of individuals.
  • The individuals cannot choose their strategies, and
    may not even be aware of the game they are playing.
  • An ESS is the result of natural selection.
  • Like NE, ESS only tells us that the state is robust
    under evolution; it cannot tell us how the
    population reaches such a state.

46
ESS in IPD
  • Tit For Tat cannot be invaded by wily
    strategies such as Always Defect.
  • TFT can be invaded by good strategies such
    as Always Cooperate, Tit For Two Tats, and
    Suspicious Tit For Tat.
  • Tit For Tat is not a strict ESS.
  • Always Cooperate can be invaded by Always
    Defect.
  • Always Defect is an ESS.

47
References
  • Drew Fudenberg and Jean Tirole, Game Theory, The MIT
    Press, 1991.
  • Axelrod, R. 1987. The evolution of strategies in
    the iterated Prisoner's Dilemma. In L. Davis,
    editor, Genetic Algorithms and Simulated
    Annealing. Morgan Kaufmann, Los Altos, CA.
  • Richard Dawkins, The Selfish Gene, Oxford
    University Press.

48
Concluding Remarks
  • The tip of the iceberg of game theory:
  • Basic concepts
  • Nash equilibrium
  • Iterated Prisoner's Dilemma
  • Evolutionarily stable strategy

49
Concluding Remarks
  • Many interesting topics deserve to be studied and
    further investigated:
  • Cooperative games
  • Incomplete-information games
  • Dynamic games
  • Combinatorial games
  • Learning in games
  • ...

50
Thank you!