Title: Lecture V: Game Theory
1Lecture V Game Theory
- Zhixin Liu
- Complex Systems Research Center,
- Academy of Mathematics and Systems Sciences, CAS
2In the last two lectures, we talked about
- Multi-Agent Systems
- Analysis
- Intervention
3In this lecture, we will talk about
- Game theory
- complex interactions between people
4Start With A Game
Payoffs are listed as (A, B); rows are A's move, columns are B's move.

                       B
             rock     paper    scissors
A  rock      0,0      -1,1     1,-1
   paper     1,-1     0,0      -1,1
   scissors  -1,1     1,-1     0,0

Other games: poker, go, chess, bridge, basketball, football, ...
5From Games To Game Theory
- Some hints from the games
- Rules
- Results (payoff)
- Strategies
- Interactions between strategies and payoff
- Games are everywhere.
- Economic systems: oligopoly, monopoly, market, trade
- Political systems: voting, presidential election, international relations
- Military systems: war, negotiation, ...
- Game theory: the study of the strategic interactions among rational agents.
- Rationality: each player tries to maximize his/her own payoff, not to beat the other players.
6History of Game Theory
- 1928, John von Neumann proved the minimax theorem
- 1944, John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior
- 1950s, John Nash, Nash equilibrium
- 1970s, John Maynard Smith, evolutionarily stable strategy
- Eight game theorists have won Nobel Prizes in economics
7Elements of A Game
- Players: who is interacting? $N = \{1, 2, \dots, n\}$
- Actions/moves: what can the players do?
- Action set
- Payoff: what the players can get from the game
8Strategy
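- Strategy: a complete plan of actions
- Mixed strategy: a probability distribution over the pure strategies
- Payoff: under mixed strategies, the expected payoff over the players' probability distributions
- A pure strategy is a special case of a mixed strategy.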
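9An Example: Rock-paper-scissors
- Players: A and B
- Actions/moves: rock, paper, scissors
- Payoff
- $u_1(\text{rock}, \text{scissors}) = 1$
- $u_2(\text{scissors}, \text{paper}) = -1$
- Mixed strategies (probabilities of rock, paper, scissors)
- $s_1 = (1/3, 1/3, 1/3)$
- $s_2 = (0, 1/2, 1/2)$
- $u_1(s_1, s_2) = \frac{1}{3}\left(0 \cdot 0 + \frac{1}{2}(-1) + \frac{1}{2} \cdot 1\right) + \frac{1}{3}\left(0 \cdot 1 + \frac{1}{2} \cdot 0 + \frac{1}{2}(-1)\right) + \frac{1}{3}\left(0 \cdot (-1) + \frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 0\right) = 0$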
Payoffs are listed as (A, B); rows are A's move, columns are B's move.

                       B
             rock     paper    scissors
A  rock      0,0      -1,1     1,-1
   paper     1,-1     0,0      -1,1
   scissors  -1,1     1,-1     0,0
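The expected payoff of a pair of mixed strategies can be checked numerically. The following is a minimal sketch, assuming the action order (rock, paper, scissors) and storing only A's payoffs, since the game is zero-sum; the names `U1` and `expected_payoff` are illustrative and not from the lecture.

```python
from fractions import Fraction as F

# Payoffs to player A; rows = A's action, columns = B's action,
# both in the order (rock, paper, scissors). The game is zero-sum,
# so B's payoff is the negative of each entry.
U1 = [
    [0, -1, 1],   # A plays rock
    [1, 0, -1],   # A plays paper
    [-1, 1, 0],   # A plays scissors
]

def expected_payoff(payoff, s_row, s_col):
    """Expected payoff: sum over i, j of s_row[i] * s_col[j] * payoff[i][j]."""
    return sum(
        p * q * payoff[i][j]
        for i, p in enumerate(s_row)
        for j, q in enumerate(s_col)
    )

s1 = [F(1, 3), F(1, 3), F(1, 3)]
s2 = [F(0), F(1, 2), F(1, 2)]
print(expected_payoff(U1, s1, s2))  # 0
```

A pure strategy is the special case where one probability is 1, for example `expected_payoff(U1, [1, 0, 0], [0, 0, 1])` for rock against scissors.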
10Classifications of Games
- Cooperative and non-cooperative games
- Cooperative game: players are able to form binding commitments.
- Non-cooperative game: the players make decisions independently.
- Zero-sum and non-zero-sum games
- Zero-sum game: the total payoff to all players is zero. E.g., poker, go, ...
- Non-zero-sum game: e.g., the prisoner's dilemma
- Finite and infinite games
- Finite game: the players and the actions are finite.
- Simultaneous and sequential (dynamic) games
- Simultaneous game: players move simultaneously, or if they do not move simultaneously, the later players are unaware of the earlier players' actions.
- Sequential game: later players have some knowledge about earlier actions.
- Perfect-information and imperfect-information games
- Perfect-information game: all players know the moves previously made by all other players. E.g., chess, go, ...
- Perfect information ≠ complete information. Complete information: every player knows the strategies and payoffs of the other players, but not necessarily their actions.
11We will first focus on games that are simultaneous, complete-information, non-cooperative and finite
- What is the solution of the game?
12Assumption
- Assume that each player
- knows the structure of the game
- attempts to maximize his payoff
- attempts to predict the moves of his opponents
- knows that all of this is common knowledge among the players
13 Dominated Strategy
A strategy is dominated if, regardless of what the other players do, it earns the player a smaller payoff than some other strategy.
$S_{-i}$: the set of strategy profiles of all players except player $i$
- Strategy $s'$ of player $i$ is called a strictly dominated strategy if there exists a strategy $s$ of player $i$ such that $u_i(s, s_{-i}) > u_i(s', s_{-i})$ for all $s_{-i} \in S_{-i}$.
14Elimination of Dominated Strategies
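Example: payoffs are listed as (row player, column player).

Step 1: for the column player, M is strictly dominated by R (1 < 2, 4 < 6, 6 < 8), so M is deleted.

       L      M      R
U     4,3    5,1    6,2
M     2,1    8,4    3,6
D     3,0    9,6    2,8

Step 2: for the row player, M and D are now strictly dominated by U, so both are deleted.

       L      R
U     4,3    6,2
M     2,1    3,6
D     3,0    2,8

Step 3: for the column player, R is now strictly dominated by L (2 < 3), so R is deleted.

       L      R
U     4,3    6,2

       L
U     4,3

(U,L) is the solution of the game.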
A dominant strategy may not exist!
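Iterated elimination of strictly dominated strategies can be sketched for a two-player game as follows. This is an illustration under the assumption that only domination by pure strategies is considered; the names `GAME`, `payoff` and `eliminate` are hypothetical helpers, and the payoff table is the 3x3 example of this slide.

```python
# Payoffs (row player, column player) of the 3x3 example.
GAME = {
    ("U", "L"): (4, 3), ("U", "M"): (5, 1), ("U", "R"): (6, 2),
    ("M", "L"): (2, 1), ("M", "M"): (8, 4), ("M", "R"): (3, 6),
    ("D", "L"): (3, 0), ("D", "M"): (9, 6), ("D", "R"): (2, 8),
}

def payoff(game, player, own, other):
    """Payoff to `player` (0 = row, 1 = column) for own/other actions."""
    profile = (own, other) if player == 0 else (other, own)
    return game[profile][player]

def eliminate(game, rows, cols):
    """Repeatedly delete strictly dominated pure strategies."""
    actions = [list(rows), list(cols)]
    changed = True
    while changed:
        changed = False
        for player in (0, 1):
            own, other = actions[player], actions[1 - player]
            for s in list(own):
                # s is dominated if some other strategy t is strictly
                # better against every remaining action of the opponent.
                dominated = any(
                    all(payoff(game, player, t, o) > payoff(game, player, s, o)
                        for o in other)
                    for t in own if t != s
                )
                if dominated:
                    own.remove(s)
                    changed = True
    return actions

print(eliminate(GAME, "UMD", "LMR"))  # [['U'], ['L']]
```

When no strategy is strictly dominated, the function returns the action sets unchanged, which matches the remark that this procedure does not always yield a solution.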
15Definition of Nash Equilibrium
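- Nash equilibrium (NE): a solution concept of a game
- $(N, S, u)$: a game
- $S_i$: strategy set for player $i$
- $S = S_1 \times S_2 \times \cdots \times S_n$: set of strategy profiles
- $u = (u_1(s), \dots, u_n(s))$: payoff function
- $s_{-i}$: strategy profile of all players except player $i$
- A strategy profile $s^*$ is called a Nash equilibrium if $u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*)$ for every player $i$, where $s_i$ is any pure strategy of player $i$.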
16Remarks on Nash Equilibrium
- A set of strategies, one for each player, such that each player's strategy is a best response to the others' strategies
- Best response: the strategy that maximizes the payoff given the others' strategies
- No player can do better by unilaterally changing his or her strategy
- A dominant-strategy solution is a NE
17Example
- Players: Smith and Louis
- Actions: Advertise, Do Not Advertise
- Payoffs: the companies' profits
- Each firm earns 50 million from its customers
- Advertising costs a firm 20 million
- Advertising captures 30 million from competitor
- How to represent this game?
18Strategic Interactions
Payoffs are listed as (Louis, Smith), in millions.

                  Smith
                No Ad      Ad
Louis  No Ad   (50,50)   (20,60)
       Ad      (60,20)   (30,30)
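The payoff table follows mechanically from the three rules of the game (50 million base profit, 20 million advertising cost, 30 million captured from the competitor). A minimal sketch, with the hypothetical helper `profit` and the assumption that the three effects simply add up:

```python
BASE, COST, CAPTURE = 50, 20, 30  # millions, as stated in the example

def profit(me_ads, rival_ads):
    """Profit of one firm given both firms' advertising decisions."""
    p = BASE
    if me_ads:
        p += CAPTURE - COST   # capture the rival's customers, pay for the ad
    if rival_ads:
        p -= CAPTURE          # lose customers to the rival's ad
    return p

actions = {"No Ad": False, "Ad": True}
# table[(louis, smith)] = (Louis's profit, Smith's profit)
table = {
    (l, s): (profit(actions[l], actions[s]), profit(actions[s], actions[l]))
    for l in actions
    for s in actions
}
for profile, payoffs in table.items():
    print(profile, payoffs)
```

Storing the game as a dictionary keyed by action profiles keeps the representation close to the strategic-form table.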
19Best Responses
- Best response for Louis
- If Smith advertises: advertise (30 > 20)
- If Smith does not advertise: advertise (60 > 50)
- The best response for Smith is the same.
- Advertising is a dominant strategy for both firms, so (Ad, Ad) is a dominant-strategy solution!
- (Ad, Ad) is a NE!
- This is another Prisoner's Dilemma!
Payoffs are listed as (Louis, Smith), in millions.

                  Smith
                No Ad      Ad
Louis  No Ad   (50,50)   (20,60)
       Ad      (60,20)   (30,30)
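The best-response reasoning above amounts to checking, for every pure strategy profile, that neither player gains by deviating alone. A short sketch of that check for two-player games, with `pure_nash` as an illustrative name and the advertising game as input:

```python
from itertools import product

def pure_nash(game, rows, cols):
    """Return all pure profiles where neither player gains by deviating."""
    equilibria = []
    for r, c in product(rows, cols):
        u_row, u_col = game[(r, c)]
        # The row player cannot improve by switching rows...
        row_ok = all(game[(r2, c)][0] <= u_row for r2 in rows)
        # ...and the column player cannot improve by switching columns.
        col_ok = all(game[(r, c2)][1] <= u_col for c2 in cols)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria

A = ["No Ad", "Ad"]
# Payoffs (Louis, Smith); Louis picks the row, Smith the column.
AD_GAME = {
    ("No Ad", "No Ad"): (50, 50), ("No Ad", "Ad"): (20, 60),
    ("Ad", "No Ad"): (60, 20), ("Ad", "Ad"): (30, 30),
}
print(pure_nash(AD_GAME, A, A))  # [('Ad', 'Ad')]
```

This brute-force check only finds pure-strategy equilibria; a game such as Matching Pennies returns an empty list.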
20Nash Equilibrium
- A NE may be a pair of mixed strategies.
- Example: Matching Pennies, with payoffs listed as (A, B)

             B
          head      tail
A  head  (1,-1)    (-1,1)
   tail  (-1,1)    (1,-1)

Each player choosing head and tail with probabilities (1/2, 1/2) is the Nash equilibrium.
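One way to see why (1/2, 1/2) is an equilibrium: against an opponent who mixes evenly, every pure action earns the same expected payoff, so no unilateral deviation helps. The sketch below checks this, assuming the action order (head, tail); `payoffs_vs_mix` is a hypothetical helper.

```python
from fractions import Fraction as F

# Payoffs (A, B); A wins when the coins match. Order: (head, tail).
GAME = [
    [(1, -1), (-1, 1)],   # A plays head
    [(-1, 1), (1, -1)],   # A plays tail
]

def payoffs_vs_mix(game, player, opp_mix):
    """Expected payoff of each pure action of `player` against opp_mix."""
    result = []
    for a in range(len(game)):
        total = F(0)
        for b, prob in enumerate(opp_mix):
            # game is indexed [A's action][B's action]
            cell = game[a][b] if player == 0 else game[b][a]
            total += prob * cell[player]
        result.append(total)
    return result

half = [F(1, 2), F(1, 2)]
print(payoffs_vs_mix(GAME, 0, half))  # A's payoffs for head, tail
print(payoffs_vs_mix(GAME, 1, half))  # B's payoffs for head, tail
```

Both lists contain equal entries, which is the indifference property of a mixed equilibrium.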
21Existence of NE
- Theorem (J. Nash, 1950s)
- For a finite game, there exists at least one
Nash Equilibrium (Pure strategy, or mixed
strategy).
22Nash Equilibrium
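- A NE may not be a good solution of the game; it can differ from the optimal solution.
- E.g., in the advertising game the NE (Ad, Ad) yields (30,30), while (No Ad, No Ad) would yield (50,50). Payoffs are listed as (Louis, Smith).

                  Smith
                No Ad      Ad
Louis  No Ad   (50,50)   (20,60)
       Ad      (60,20)   (30,30)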
23Nash Equilibrium
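- A game may have more than one NE.
- E.g., the Battle of the Sexes, with payoffs listed as (Wife, Husband)
- NE: (opera, opera), (football, football), and the mixed pair ((2/3,1/3),(1/3,2/3)), where each pair gives the probabilities of (opera, football), for the wife first and the husband second

                    Husband
                 football    opera
Wife  opera       (0,0)      (2,1)
      football    (1,2)      (0,0)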
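The mixed equilibrium of a 2x2 game can be obtained from the indifference conditions: each player mixes so that the other is indifferent between his or her two actions. A minimal sketch, assuming a fully mixed equilibrium exists (non-zero denominators) and ordering both players' actions as (opera, football); `mixed_2x2` is an illustrative name.

```python
from fractions import Fraction as F

def mixed_2x2(game):
    """Fully mixed equilibrium of a 2x2 game via indifference conditions.

    game[i][j] = (row payoff, column payoff). Returns (p, q), where p is the
    probability of the first row and q the probability of the first column.
    """
    (a, b), (c, d) = [[cell[0] for cell in row] for row in game]  # row payoffs
    (e, f), (g, h) = [[cell[1] for cell in row] for row in game]  # column payoffs
    # q makes the row player indifferent: q*a + (1-q)*b = q*c + (1-q)*d
    q = F(d - b, a - b - c + d)
    # p makes the column player indifferent: p*e + (1-p)*g = p*f + (1-p)*h
    p = F(h - g, e - g - f + h)
    return p, q

# Rows: wife (opera, football); columns: husband (opera, football).
# Payoffs are (Wife, Husband).
BATTLE = [
    [(2, 1), (0, 0)],
    [(0, 0), (1, 2)],
]
p, q = mixed_2x2(BATTLE)
print(p, q)  # wife plays opera with p = 2/3, husband with q = 1/3
```

The result corresponds to the mixed pair ((2/3,1/3),(1/3,2/3)) quoted for this game.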
24Nash Equilibrium
- Zero-sum games (two-person): a saddle point is a solution
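In a two-person zero-sum game given by the row player's payoff matrix, a pure saddle point is an entry that is the minimum of its row and the maximum of its column. The sketch below searches for such entries; the matrix `M` is a hypothetical example, not taken from the lecture.

```python
def saddle_points(matrix):
    """Pure saddle points of a zero-sum game (row player's payoffs).

    Entry (i, j) is a saddle point if it is the minimum of its row and
    the maximum of its column.
    """
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                points.append((i, j, value))
    return points

# Hypothetical payoff matrix for illustration only.
M = [
    [3, 1, 4],
    [5, 2, 6],
    [0, -1, 7],
]
print(saddle_points(M))  # [(1, 1, 2)]
```

Rock-paper-scissors has no pure saddle point, so the function returns an empty list for it; in that case the solution is in mixed strategies.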
25Nash Equilibrium
- Many varieties of NE: refined NE, Bayesian NE, subgame-perfect NE, perfect Bayesian NE
- Finding NEs is very difficult.
- A NE only tells us that if the game reaches such a state, then no player has an incentive to change strategy unilaterally. A NE cannot tell us how to reach such a state.
26Iterated Prisoner's Dilemma
27Cooperation
- Groups of organisms
- Mutual cooperation is of benefit to all agents
- Lack of cooperation is harmful to them
- Another type of cooperation
- Cooperating agents do well
- Any one agent will do better by failing to cooperate
- The Prisoner's Dilemma is an elegant embodiment
28Prisoner's Dilemma
- The story of the prisoner's dilemma
- Players: two prisoners
- Actions: cooperate (C), defect (D)
- Payoff matrix, with payoffs listed as (A, B)

                   Prisoner B
                   C        D
Prisoner A  C    (3,3)    (0,5)
            D    (5,0)    (1,1)
29Prisoner's Dilemma
- No matter what the other does, the best choice is D.
- (D,D) is a Nash equilibrium.
- But if both choose D, both do worse than if both choose C.

                   Prisoner B
                   C        D
Prisoner A  C    (3,3)    (0,5)
            D    (5,0)    (1,1)
30Iterated Prisoner's Dilemma
- The individuals
- Meet many times
- Can recognize a previous interactant
- Remember the prior outcome
- Strategy: specifies the probabilities of cooperation and defection based on the history
- $P(C) = f_1(\text{History})$
- $P(D) = f_2(\text{History})$
31Strategies
- Tit For Tat: cooperate in the first round, then repeat the opponent's last choice.
- Player A (Tit For Tat): C D D C C C C C D D D D C
- Player B: D D C C C C C D D D D C
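Tit For Tat can be written as a function of the history of play. The sketch below assumes, as the two move sequences suggest, that Player A is the one playing Tit For Tat; the `(my_history, opp_history)` strategy signature and the helper `respond` are illustrative choices, not part of the lecture.

```python
def tit_for_tat(my_history, opp_history):
    """Cooperate first, then repeat the opponent's last choice."""
    return "C" if not opp_history else opp_history[-1]

def respond(strategy, opp_moves):
    """Moves `strategy` makes in rounds 1..n+1, given the opponent's first n moves."""
    mine = []
    for t in range(len(opp_moves) + 1):
        mine.append(strategy(mine, opp_moves[:t]))
    return mine

B = "DDCCCCCDDDDC"                       # Player B's moves from the slide
print(" ".join(respond(tit_for_tat, B)))  # C D D C C C C C D D D D C
```

The output reproduces Player A's sequence: an opening C followed by a copy of B's moves, delayed by one round.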
32Strategies
- Tit For Tat - Cooperate in the first round, then repeat opponent's last choice.
- Tit For Tat and Random - Repeat opponent's last choice skewed by random setting.
- Tit For Two Tats and Random - Like Tit For Tat except that opponent must make the same choice twice in a row before it is reciprocated. Choice is skewed by random setting.
- Tit For Two Tats - Like Tit For Tat except that opponent must make the same choice twice in a row before it is reciprocated.
- Naive Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e. Tit For Tat), but sometimes probe by defecting in lieu of cooperating.
- Remorseful Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e. Tit For Tat), but sometimes probe by defecting in lieu of cooperating. If the opponent defects in response to probing, show remorse by cooperating once.
- Naive Peace Maker (Tit For Tat with Random Co-operation) - Repeat opponent's last choice (i.e. Tit For Tat), but sometimes make peace by co-operating in lieu of defecting.
- True Peace Maker (hybrid of Tit For Tat and Tit For Two Tats with Random Cooperation) - Cooperate unless opponent defects twice in a row, then defect once, but sometimes make peace by cooperating in lieu of defecting.
- Random - Always set at 50% probability.
33Strategies
- Always Defect
- Always Cooperate
- Grudger (Co-operate, but only be a sucker once) - Cooperate until the opponent defects. Then always defect unforgivingly.
- Pavlov (repeat last choice if good outcome) - If 5 or 3 points scored in the last round then repeat last choice.
- Pavlov / Random (repeat last choice if good outcome and Random) - If 5 or 3 points scored in the last round then repeat last choice, but sometimes make random choices.
- Adaptive - Starts with c,c,c,c,c,c,d,d,d,d,d and then takes choices which have given the best average score re-calculated after every move.
- Gradual - Cooperates until the opponent defects, in such case defects the total number of times the opponent has defected during the game. Followed up by two co-operations.
- Suspicious Tit For Tat - As for Tit For Tat except begins by defecting.
- Soft Grudger - Cooperates until the opponent defects, in such case opponent is punished with d,d,d,d,c,c.
- Customised strategy 1 - Default setting is T=1, P=1, R=1, S=0, B=1; always co-operate unless sucker (i.e. 0 points scored).
- Customised strategy 2 - Default setting is T=1, P=1, R=0, S=0, B=0; always play alternating defect/cooperate.
34Iterated Prisoners Dilemma
- The same players repeat the prisoner's dilemma many times.
- After ten rounds
- The best possible income for one player is 50.
- A realistic case is 30 for each player (mutual cooperation).
- An extreme case is that both players always defect; each then gets 10.
- The most likely case is that each player plays a mix of defection and cooperation.

Payoffs are listed as (row player, column player).

                   Prisoner A
                   C        D
Prisoner B  C    (3,3)    (0,5)
            D    (5,0)    (1,1)
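The ten-round totals quoted above (50, 30 and 10) can be reproduced by simulating the repeated game. This is a minimal sketch; the strategy signature `(my_history, opp_history)` and the function `play` are assumptions made for illustration.

```python
# Per-round payoffs (mine, theirs) for each pair of moves.
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def always_cooperate(mine, theirs):
    return "C"

def always_defect(mine, theirs):
    return "D"

def play(strategy_a, strategy_b, rounds=10):
    """Total scores of two strategies over a repeated prisoner's dilemma."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b

print(play(always_defect, always_cooperate))     # (50, 0)
print(play(always_cooperate, always_cooperate))  # (30, 30)
print(play(always_defect, always_defect))        # (10, 10)
```

The score of 50 is only reachable against an opponent who cooperates in every round, which is why it is a best case rather than a typical one.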
35Iterated Prisoners Dilemma
- Which strategy can thrive? What is a good strategy?
- Robert Axelrod, 1980s
- A computer round-robin tournament
Axelrod, R. 1987. The evolution of strategies in the iterated Prisoners' Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA.
36The first round
- Strategies: 14 entries + the random strategy
- Including Markov processes and Bayesian inference
- Each pair of strategies meets, giving 15 × 15 runs in total; each pair plays the game 200 times
- Payoff of strategy $S$: $\sum_{S'} U(S, S') / 15$
- Tit For Tat wins (cooperation based on reciprocity)
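The round-robin scoring rule (average payoff over all opponents, including a copy of oneself) can be sketched as below. The field of four deterministic strategies is a small illustrative assumption and does not reproduce Axelrod's 15 entries or his results; `play` and `tournament` are hypothetical names.

```python
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"

def grudger(mine, theirs):
    return "D" if "D" in theirs else "C"

def always_cooperate(mine, theirs):
    return "C"

def always_defect(mine, theirs):
    return "D"

def play(strategy_a, strategy_b, rounds):
    """Total score of strategy_a against strategy_b."""
    hist_a, hist_b, score_a = [], [], 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)][0]
        hist_a.append(a)
        hist_b.append(b)
    return score_a

def tournament(strategies, rounds=200):
    """Average score of each strategy against every entry, itself included."""
    return {
        name: sum(play(s, t, rounds) for t in strategies.values()) / len(strategies)
        for name, s in strategies.items()
    }

FIELD = {
    "Tit For Tat": tit_for_tat,
    "Grudger": grudger,
    "Always Cooperate": always_cooperate,
    "Always Defect": always_defect,
}
for name, score in sorted(tournament(FIELD).items(), key=lambda kv: -kv[1]):
    print(f"{name:18s} {score:.2f}")
```

In this small field the two strategies that never defect first and retaliate share the top score, while Always Defect ranks last; the ranking depends on which strategies are entered.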
37The first round
- Characteristics of good strategies
- Goodness: never defect first
- TFT vs. Naive Prober
- Forgiveness: may retaliate, but the memory is short.
- TFT vs. Grudger
Naive Prober - Repeat opponent's last choice, but sometimes probe by defecting in lieu of cooperating.
Grudger - Cooperate until the opponent defects, then always defect unforgivingly.
38Winning Vs. High Scores
- This is not a zero-sum game; there is a banker.
- TFT never wins a single game. The best result for it is to get the same score as its opponent.
- Playing to win is a kind of jealousy, and it does not work well.
- Cooperation can arise in a selfish group.
39The second round
- Strategies: 62 entries + the random strategy
- Good strategies
- Wily strategies
- Tit For Tat wins again
- Winning or losing depends on the circumstances.
40Characteristics of good strategies
- Goodness: never defect first
- First round: the top eight strategies all have goodness
- Second round: fourteen of the top fifteen strategies have goodness
- Forgiveness: may retaliate, but the memory is short.
- Grudger is not a forgiving strategy
- Goodness and forgiveness are a kind of collective behavior.
- For a single agent, defection is the best strategy.
41Evolution of the Strategies
- Evolve good strategies by genetic algorithm
(GA)
42What is a good strategy?
- Is TFT a good strategy?
- Tit For Two Tats may be the best strategy in the first round, but it is not a good strategy in the second round.
- A good strategy depends on the environment.
- Tit For Two Tats - Like Tit For Tat except that opponent must make the same choice twice in a row before it is reciprocated.
Evolutionarily stable strategy
43Evolutionarily stable strategy (ESS)
- Introduced by John Maynard Smith and George R. Price in 1973
- An ESS is a strategy such that, if all members of the population adopt it, no mutant strategy can invade the population under the influence of natural selection.
- An ESS is robust to evolution; it cannot be invaded by mutation.
John Maynard Smith, Evolution and the Theory of Games
44Definition of ESS
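- A strategy $x$ is an ESS if, for all $y \neq x$, $u(x, \epsilon y + (1 - \epsilon) x) > u(y, \epsilon y + (1 - \epsilon) x)$ holds for all sufficiently small positive $\epsilon$, where $u(a, b)$ is the payoff of strategy $a$ in a population playing $b$.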
45ESS
- An ESS is defined in a population with a large number of individuals.
- The individuals cannot control their strategy, and may not be aware of the game they are playing.
- An ESS is the result of natural selection.
- Like a NE, an ESS only tells us that a state is robust to evolution; it cannot tell us how the population reaches such a state.
46ESS in IPD
- Tit For Tat cannot be invaded by wily strategies, such as Always Defect.
- TFT can be invaded by good strategies, such as Always Cooperate, Tit For Two Tats and Suspicious Tit For Tat.
- Tit For Tat is not a strict ESS.
- Always Cooperate can be invaded by Always Defect.
- Always Defect is an ESS.
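These claims can be checked with Maynard Smith's two conditions for pairwise contests: $x$ resists a mutant $y$ if $E(x,x) > E(y,x)$, or if $E(x,x) = E(y,x)$ and $E(x,y) > E(y,y)$. The sketch below is an illustration under stated assumptions: mutants are restricted to the three pure strategies listed, each contest lasts 200 rounds, and `play` and `is_ess` are hypothetical helper names.

```python
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

STRATEGIES = {
    "TFT": lambda mine, theirs: theirs[-1] if theirs else "C",
    "ALLC": lambda mine, theirs: "C",
    "ALLD": lambda mine, theirs: "D",
}

def play(strategy_a, strategy_b, rounds=200):
    """Total score of strategy_a against strategy_b."""
    hist_a, hist_b, score_a = [], [], 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        score_a += PAYOFF[(a, b)][0]
        hist_a.append(a)
        hist_b.append(b)
    return score_a

# E[(x, y)] = payoff of x when playing against y.
E = {
    (x, y): play(STRATEGIES[x], STRATEGIES[y])
    for x in STRATEGIES
    for y in STRATEGIES
}

def is_ess(E, x, names):
    """Check Maynard Smith's conditions for x against each listed mutant."""
    for y in names:
        if y == x:
            continue
        strictly_better = E[(x, x)] > E[(y, x)]
        tie_broken = E[(x, x)] == E[(y, x)] and E[(x, y)] > E[(y, y)]
        if not (strictly_better or tie_broken):
            return False
    return True

for name in STRATEGIES:
    print(name, is_ess(E, name, STRATEGIES))
```

Under these assumptions only Always Defect passes: Tit For Tat ties with Always Cooperate on both conditions, so it is not strictly stable, and Always Cooperate is exploited by Always Defect.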
47references
- Drew Fudenberg, Jean Tirole, Game Theory, The MIT Press, 1991.
- Axelrod, R. 1987. The evolution of strategies in the iterated Prisoners' Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA.
- Richard Dawkins, The Selfish Gene, Oxford University Press.
48Concluding Remarks
- The tip of the iceberg of game theory
- Basic Concepts
- Nash Equilibrium
- Iterated Prisoners Dilemma
- Evolutionarily Stable Strategy
49Concluding Remarks
- Many interesting topics deserve to be studied and further investigated
- Cooperative games
- Incomplete information games
- Dynamic games
- Combinatorial games
- Learning in games
- ...
50Thank you!