Lecture V: Game Theory

- Zhixin Liu
- Complex Systems Research Center,
- Academy of Mathematics and Systems Sciences, CAS

In the last two lectures, we talked about

- Multi-Agent Systems
- Analysis
- Intervention

In this lecture, we will talk about

- Game theory
- complex interactions between people

Start With A Game

- Rock-paper-scissors (payoffs listed as A, B)

               B: rock   B: paper   B: scissors
A: rock        0,0       -1,1       1,-1
A: paper       1,-1      0,0        -1,1
A: scissors    -1,1      1,-1       0,0

Other games: poker, go, chess, bridge, basketball, football, ...

From Games To Game Theory

- Some hints from the games
  - Rules
  - Results (payoff)
  - Strategies
  - Interactions between strategies and payoff
- Games are everywhere.
  - Economic systems: oligopoly, monopoly, market, trade
  - Political systems: voting, presidential election, international relations
  - Military systems: war, negotiation
- Game theory
  - the study of the strategic interactions among rational agents.
- Rationality
  - implies that each player tries to maximize his/her own payoff, not to beat the other players.

History of Game Theory

- 1928, John von Neumann proved the minimax theorem
- 1944, John von Neumann and Oskar Morgenstern, Theory of Games and Economic Behavior
- 1950s, John Nash, Nash equilibrium
- 1970s, John Maynard Smith, evolutionarily stable strategy
- Eight game theorists have won Nobel prizes in economics

Elements of A Game

- Players: who is interacting? $N = \{1, 2, \dots, n\}$
- Actions/moves: what can the players do?
  - Action set
- Payoff: what can the players get from the game?

Strategy

- Strategy: a complete plan of actions
- Mixed strategy: a probability distribution over the pure strategies
- Payoff: under mixed strategies, the expected payoff over the players' randomized choices

A pure strategy is a special case of a mixed strategy.

An Example Rock-paper-scissor

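- Players: A and B
- Actions/moves
  - rock, paper, scissors
- Payoff
  - $u_1(\text{rock}, \text{scissors}) = 1$
  - $u_2(\text{scissors}, \text{paper}) = -1$
- Mixed strategies (probabilities in the order rock, paper, scissors)
  - $s_1 = (1/3, 1/3, 1/3)$
  - $s_2 = (0, 1/2, 1/2)$
  - $u_1(s_1, s_2) = \frac{1}{3}\left(0 \cdot 0 + \frac{1}{2}(-1) + \frac{1}{2} \cdot 1\right) + \frac{1}{3}\left(0 \cdot 1 + \frac{1}{2} \cdot 0 + \frac{1}{2}(-1)\right) + \frac{1}{3}\left(0 \cdot (-1) + \frac{1}{2} \cdot 1 + \frac{1}{2} \cdot 0\right) = 0$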
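The expected-payoff calculation above can be checked with a short Python sketch. The names `U1` and `expected_payoff` are illustrative assumptions, and the action order (rock, paper, scissors) is assumed to match the order of the probabilities in $s_1$ and $s_2$.

```python
from fractions import Fraction

# Payoff to player A (rows) against player B (columns).
# Assumed action order for both players: rock, paper, scissors.
U1 = [
    [0, -1, 1],   # A plays rock
    [1, 0, -1],   # A plays paper
    [-1, 1, 0],   # A plays scissors
]

def expected_payoff(u, s1, s2):
    """Expected payoff: sum over action pairs of s1[a] * s2[b] * u[a][b]."""
    return sum(s1[a] * s2[b] * u[a][b]
               for a in range(len(s1))
               for b in range(len(s2)))

s1 = [Fraction(1, 3)] * 3
s2 = [Fraction(0), Fraction(1, 2), Fraction(1, 2)]
value = expected_payoff(U1, s1, s2)
print(value)  # 0
```

A pure strategy is the special case where one probability is 1, so `expected_payoff(U1, [1, 0, 0], [0, 0, 1])` gives back $u_1(\text{rock}, \text{scissors})$.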


Classifications of Games

- Cooperative and non-cooperative games
  - Cooperative game: players are able to form binding commitments.
  - Non-cooperative game: the players make decisions independently.
- Zero-sum and non-zero-sum games
  - Zero-sum game: the total payoff to all players is zero, e.g., poker, go
  - Non-zero-sum game: e.g., prisoner's dilemma
- Finite and infinite games
  - Finite game: the players and the actions are finite.
- Simultaneous and sequential (dynamic) games
  - Simultaneous game: players move simultaneously, or, if they do not, the later players are unaware of the earlier players' actions.
  - Sequential game: later players have some knowledge about earlier actions.
- Perfect information and imperfect information games
  - Perfect information game: all players know the moves previously made by all other players, e.g., chess, go
  - Perfect information ≠ complete information. Complete information: every player knows the strategies and payoffs of the other players, but not necessarily their actions.

We will first focus on games that are simultaneous, complete-information, non-cooperative, and finite.

- What is the solution of the game?

Assumption

- Assume that each player
  - knows the structure of the game
  - attempts to maximize his payoff
  - attempts to predict the moves of his opponents
  - knows that all of this is common knowledge among the players

Dominated Strategy

A strategy is dominated if, regardless of what the other players do, it earns the player a smaller payoff than some other strategy.

- $S_{-i}$: the set of strategy profiles of all players except player $i$
- Strategy $s'$ of player $i$ is called a strictly dominated strategy if there exists a strategy $s$ of player $i$ such that $u_i(s, s_{-i}) > u_i(s', s_{-i})$ for all $s_{-i} \in S_{-i}$.

Elimination of Dominated Strategies

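Example (payoffs listed as row player, column player)

       L      M      R
U      4,3    5,1    6,2
M      2,1    8,4    3,6
D      3,0    9,6    2,8

Column M is strictly dominated by R; remove it:

       L      R
U      4,3    6,2
M      2,1    3,6
D      3,0    2,8

Rows M and D are now strictly dominated by U; remove them:

       L      R
U      4,3    6,2

Column R is now strictly dominated by L; remove it:

       L
U      4,3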

(U,L) is the solution of the game.
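As a minimal sketch of iterated elimination of strictly dominated strategies, the following Python code runs the procedure on a 3×3 game with rows U, M, D and columns L, M, R. The payoff entries in `GAME` are reconstructed from the slide and are an assumption of this sketch; the function names are illustrative.

```python
# Payoffs (row player, column player), keyed by (row, column).
GAME = {
    ("U", "L"): (4, 3), ("U", "M"): (5, 1), ("U", "R"): (6, 2),
    ("M", "L"): (2, 1), ("M", "M"): (8, 4), ("M", "R"): (3, 6),
    ("D", "L"): (3, 0), ("D", "M"): (9, 6), ("D", "R"): (2, 8),
}

def strictly_dominated(payoff, own, others, candidate):
    """True if another strategy in `own` beats `candidate` against every strategy in `others`."""
    return any(
        all(payoff(alt, o) > payoff(candidate, o) for o in others)
        for alt in own if alt != candidate
    )

def iterated_elimination(game, rows, cols):
    """Repeatedly delete strictly dominated rows and columns until none remain."""
    rows, cols = list(rows), list(cols)
    changed = True
    while changed:
        changed = False
        for r in list(rows):
            row_payoff = lambda a, o: game[(a, o)][0]
            if len(rows) > 1 and strictly_dominated(row_payoff, rows, cols, r):
                rows.remove(r)
                changed = True
        for c in list(cols):
            col_payoff = lambda a, o: game[(o, a)][1]
            if len(cols) > 1 and strictly_dominated(col_payoff, cols, rows, c):
                cols.remove(c)
                changed = True
    return rows, cols

survivors = iterated_elimination(GAME, ["U", "M", "D"], ["L", "M", "R"])
print(survivors)  # (['U'], ['L'])
```

When no strategy is strictly dominated (as in rock-paper-scissors), the function simply returns the full strategy sets, which illustrates why this method does not always yield a solution.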

A dominant strategy may not exist!

Definition of Nash Equilibrium

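- Nash Equilibrium (NE): a solution concept of a game
- $(N, S, u)$: a game
  - $S_i$: strategy set for player $i$
  - $S = S_1 \times S_2 \times \dots \times S_n$: set of strategy profiles
  - $u = (u_1(s), \dots, u_n(s))$: payoff function
  - $s_{-i}$: strategy profile of all players except player $i$
- A strategy profile $s^*$ is called a Nash equilibrium if, for every player $i$, $u_i(s_i^*, s_{-i}^*) \ge u_i(s_i, s_{-i}^*)$, where $s_i$ is any pure strategy of player $i$.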

Remarks on Nash Equilibrium

- A set of strategies, one for each player, such that each player's strategy is a best response to the others' strategies
- Best response
  - The strategy that maximizes a player's payoff given the others' strategies.
- No player can do better by unilaterally changing his or her strategy
- A profile of dominant strategies is a NE

Example

- Players: Smith and Louis
- Actions: Advertise, Do Not Advertise
- Payoffs: the companies' profits
  - Each firm earns 50 million from its customers
  - Advertising costs a firm 20 million
  - Advertising captures 30 million from the competitor
- How to represent this game?

Strategic Interactions

Payoffs are listed as (Louis, Smith), in millions.

                 Smith: No Ad   Smith: Ad
Louis: No Ad     (50,50)        (20,60)
Louis: Ad        (60,20)        (30,30)

Best Responses

- Best response for Louis
  - If Smith advertises: advertise
  - If Smith does not advertise: advertise
- The best response for Smith is the same.
- Ad is a dominant strategy for both firms, so (Ad, Ad) is a dominant-strategy solution!
- (Ad, Ad) is a NE!
- This is another Prisoner's Dilemma!
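The best-response reasoning for the advertising game can be sketched in Python. The payoffs are built directly from the three rules of the example (50 million base profit, 20 million advertising cost, 30 million captured from the competitor). The function names are illustrative, and the reading that a firm loses 30 million whenever its competitor advertises is an assumption of this sketch.

```python
from itertools import product

BASE, AD_COST, CAPTURED = 50, 20, 30  # millions, taken from the example
ACTIONS = ["Ad", "No Ad"]

def profit(me, other):
    """Profit of a firm choosing `me` while its competitor chooses `other`."""
    p = BASE
    if me == "Ad":
        p += CAPTURED - AD_COST   # gains customers, pays for the campaign
    if other == "Ad":
        p -= CAPTURED             # loses customers to the competitor
    return p

def payoffs(louis, smith):
    return profit(louis, smith), profit(smith, louis)

def pure_nash_equilibria():
    """Profiles where neither firm gains by deviating unilaterally."""
    equilibria = []
    for louis, smith in product(ACTIONS, repeat=2):
        u_louis, u_smith = payoffs(louis, smith)
        louis_ok = all(payoffs(d, smith)[0] <= u_louis for d in ACTIONS)
        smith_ok = all(payoffs(louis, d)[1] <= u_smith for d in ACTIONS)
        if louis_ok and smith_ok:
            equilibria.append((louis, smith))
    return equilibria

print(payoffs("Ad", "No Ad"))   # (60, 20)
print(pure_nash_equilibria())   # [('Ad', 'Ad')]
```

The only equilibrium pays 30 to each firm, while (No Ad, No Ad) would pay 50 to each, which is the prisoner's-dilemma structure noted above.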

Payoffs are listed as (Louis, Smith), in millions.

                 Smith: No Ad   Smith: Ad
Louis: No Ad     (50,50)        (20,60)
Louis: Ad        (60,20)        (30,30)

Nash Equilibrium

- NE may be a pair of mixed strategies.
- Example: Matching Pennies (payoffs listed as A, B)

            B: head    B: tail
A: head     (1,-1)     (-1,1)
A: tail     (-1,1)     (1,-1)

Each player choosing head and tail with probabilities (1/2,1/2) is the Nash equilibrium.

Existence of NE

- Theorem (J. Nash, 1950s)
  - For a finite game, there exists at least one Nash equilibrium (in pure or mixed strategies).

Nash Equilibrium

- A NE may not be a good solution of the game; it can differ from the optimal solution.
  - e.g., in the advertising game between Smith and Louis, the NE (Ad, Ad) pays (30,30), while (No Ad, No Ad) would pay (50,50).

Nash Equilibrium

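- A game may have more than one NE.
- e.g., the Battle of the Sexes
  - NE: (opera, opera), (football, football), and the mixed-strategy NE ((2/3,1/3),(1/3,2/3)), with probabilities in the order (opera, football)

Payoffs are listed as (Wife, Husband).

                 Husband: opera   Husband: football
Wife: opera      (2,1)            (0,0)
Wife: football   (0,0)            (1,2)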
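The mixed-strategy equilibria quoted for Matching Pennies and the Battle of the Sexes can be recovered from the indifference conditions: each player mixes so that the opponent is indifferent between his or her two actions. The sketch below assumes a 2×2 game with an interior mixed equilibrium; the function name and the matrix layout (wife as row player, husband as column player, actions ordered opera, football) are assumptions of this example.

```python
from fractions import Fraction

def mixed_equilibrium_2x2(A, B):
    """Interior mixed NE of a 2x2 game via indifference conditions.

    A[i][j], B[i][j]: payoffs to the row and column player.
    Returns (p, q): probabilities of the first row and the first column.
    Assumes an interior equilibrium exists (nonzero denominators).
    """
    # Row mixes with p so that the column player is indifferent:
    # p*B[0][0] + (1-p)*B[1][0] == p*B[0][1] + (1-p)*B[1][1]
    p = Fraction(B[1][1] - B[1][0], B[0][0] - B[1][0] - B[0][1] + B[1][1])
    # Column mixes with q so that the row player is indifferent:
    # q*A[0][0] + (1-q)*A[0][1] == q*A[1][0] + (1-q)*A[1][1]
    q = Fraction(A[1][1] - A[0][1], A[0][0] - A[0][1] - A[1][0] + A[1][1])
    return p, q

# Battle of the Sexes: rows = wife, columns = husband, order (opera, football).
wife = [[2, 0], [0, 1]]
husband = [[1, 0], [0, 2]]
print(mixed_equilibrium_2x2(wife, husband))  # (Fraction(2, 3), Fraction(1, 3))

# Matching Pennies: rows = A, columns = B, order (head, tail).
a = [[1, -1], [-1, 1]]
b = [[-1, 1], [1, -1]]
print(mixed_equilibrium_2x2(a, b))  # (Fraction(1, 2), Fraction(1, 2))
```

Here `p = 2/3` is the wife's probability of opera and `q = 1/3` is the husband's probability of opera, which corresponds to ((2/3,1/3),(1/3,2/3)).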

Nash Equilibrium

- Two-person zero-sum games: a saddle point is a solution.
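A pure-strategy saddle point of a two-person zero-sum game is an entry of the row player's payoff matrix that is simultaneously the minimum of its row and the maximum of its column. The sketch below searches for such entries; the function name is illustrative and the matrix `EXAMPLE` is hypothetical, chosen only to show a game that has one.

```python
def pure_saddle_points(matrix):
    """Return (row, column) indices that are a row minimum and a column maximum.

    matrix[i][j] is the payoff to the row player (maximizer);
    the column player receives the negative and therefore minimizes.
    """
    points = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            column = [r[j] for r in matrix]
            if value == min(row) and value == max(column):
                points.append((i, j))
    return points

# Hypothetical zero-sum game, not taken from the lecture.
EXAMPLE = [
    [3, 1, 4],
    [2, 0, 1],
]
print(pure_saddle_points(EXAMPLE))  # [(0, 1)]

# Rock-paper-scissors (payoff to A) has no pure saddle point.
RPS = [
    [0, -1, 1],
    [1, 0, -1],
    [-1, 1, 0],
]
print(pure_saddle_points(RPS))  # []
```

An empty result means that the solution must be sought in mixed strategies, as in rock-paper-scissors.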

Nash Equilibrium

- Many varieties of NE: refined NE, Bayesian NE, sub-game perfect NE, perfect Bayesian NE
- Finding NEs is very difficult.
- NE can only tell us that if the game reaches such a state, then no player has an incentive to change strategy unilaterally. But NE cannot tell us how to reach such a state.

- Iterated Prisoner's Dilemma

Cooperation

- Groups of organisms
  - Mutual cooperation is of benefit to all agents
  - Lack of cooperation is harmful to them
- Another type of cooperation
  - Cooperating agents do well
  - Any one agent will do better by failing to cooperate
- The Prisoner's Dilemma is an elegant embodiment

Prisoner's Dilemma

- The story of the prisoner's dilemma
- Players: two prisoners
- Actions: Cooperate (C), Defect (D)
- Payoff matrix, listed as (Prisoner A, Prisoner B)

                 Prisoner B: C   Prisoner B: D
Prisoner A: C    (3,3)           (0,5)
Prisoner A: D    (5,0)           (1,1)

Prisoner's Dilemma

- No matter what the other does, the best choice is D.
- (D,D) is a Nash equilibrium.
- But if both choose D, both do worse than if both choose C.


Iterated Prisoner's Dilemma

- The individuals
  - Meet many times
  - Can recognize a previous interactant
  - Remember the prior outcome
- Strategy: specifies the probabilities of cooperation and defection based on the history
  - $P(C) = f_1(\text{History})$
  - $P(D) = f_2(\text{History})$

Strategies

- Tit For Tat: cooperate on the first move, then repeat the opponent's last choice.
  - Player A (Tit For Tat): C D D C C C C C D D D D C
  - Player B:               D D C C C C C D D D D C
  - Each move of A after the first copies B's previous move.
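Tit For Tat is simple enough to write as a function of the opponent's history. The sketch below replays Player B's moves from the example and regenerates Player A's sequence, assuming that Player A is the one playing Tit For Tat; the function name is illustrative.

```python
def tit_for_tat(opponent_history):
    """Cooperate first, then repeat the opponent's last move."""
    return "C" if not opponent_history else opponent_history[-1]

# Player B's moves from the example above.
b_moves = list("DDCCCCCDDDDC")

# Player A sees only B's earlier moves when choosing each move.
a_moves = [tit_for_tat(b_moves[:t]) for t in range(len(b_moves) + 1)]
print(" ".join(a_moves))  # C D D C C C C C D D D D C
```

A's sequence is B's sequence shifted by one move, with an initial C in front.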

Strategies

- Tit For Tat - Cooperate on the first move, then repeat opponent's last choice.
- Tit For Tat and Random - Repeat opponent's last choice, skewed by a random setting.
- Tit For Two Tats and Random - Like Tit For Tat except that the opponent must make the same choice twice in a row before it is reciprocated. Choice is skewed by a random setting.
- Tit For Two Tats - Like Tit For Tat except that the opponent must make the same choice twice in a row before it is reciprocated.
- Naive Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes probe by defecting in lieu of cooperating.
- Remorseful Prober (Tit For Tat with Random Defection) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes probe by defecting in lieu of cooperating. If the opponent defects in response to probing, show remorse by cooperating once.
- Naive Peace Maker (Tit For Tat with Random Co-operation) - Repeat opponent's last choice (i.e., Tit For Tat), but sometimes make peace by co-operating in lieu of defecting.
- True Peace Maker (hybrid of Tit For Tat and Tit For Two Tats with Random Cooperation) - Cooperate unless the opponent defects twice in a row, then defect once, but sometimes make peace by cooperating in lieu of defecting.
- Random - Always set at 50% probability.

Strategies

- Always Defect
- Always Cooperate
- Grudger (Co-operate, but only be a sucker once) - Cooperate until the opponent defects. Then always defect unforgivingly.
- Pavlov (repeat last choice if good outcome) - If 5 or 3 points were scored in the last round, repeat the last choice.
- Pavlov / Random (repeat last choice if good outcome and Random) - If 5 or 3 points were scored in the last round, repeat the last choice, but sometimes make random choices.
- Adaptive - Starts with c,c,c,c,c,c,d,d,d,d,d and then takes the choices which have given the best average score, re-calculated after every move.
- Gradual - Cooperates until the opponent defects; in that case defects the total number of times the opponent has defected during the game, followed up by two co-operations.
- Suspicious Tit For Tat - As for Tit For Tat except begins by defecting.
- Soft Grudger - Cooperates until the opponent defects; in that case the opponent is punished with d,d,d,d,c,c.
- Customised strategy 1 - Default setting is T=1, P=1, R=1, S=0, B=1; always co-operate unless sucker (i.e., 0 points scored).
- Customised strategy 2 - Default setting is T=1, P=1, R=0, S=0, B=0; always play alternating defect/cooperate.

Iterated Prisoner's Dilemma

- The same players repeat the prisoner's dilemma many times.
- After ten rounds
  - The best possible income for one player is 50 (defecting every round against a cooperator).
  - A realistic case is 30 for each player (mutual cooperation every round).
  - An extreme case is that both players always defect; each player then gets 10.
  - The most likely case is that each player plays a mixture of defection and cooperation.

Payoffs are listed as (Prisoner B, Prisoner A).

                 Prisoner A: C   Prisoner A: D
Prisoner B: C    (3,3)           (0,5)
Prisoner B: D    (5,0)           (1,1)
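The ten-round totals follow directly from the one-shot payoffs (3 for mutual cooperation, 1 for mutual defection, 5 and 0 when one player defects on a cooperator). A minimal Python sketch, with illustrative names:

```python
# One-shot payoffs (first player, second player).
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

def total_scores(moves_a, moves_b):
    """Sum the one-shot payoffs over two equally long move sequences."""
    score_a = score_b = 0
    for a, b in zip(moves_a, moves_b):
        pa, pb = PAYOFF[(a, b)]
        score_a += pa
        score_b += pb
    return score_a, score_b

ROUNDS = 10
print(total_scores("D" * ROUNDS, "C" * ROUNDS))  # (50, 0)
print(total_scores("C" * ROUNDS, "C" * ROUNDS))  # (30, 30)
print(total_scores("D" * ROUNDS, "D" * ROUNDS))  # (10, 10)
```

The score of 50 is only reachable if the other player cooperates in every round and ends with 0, which is why it is not a realistic outcome between two adaptive players.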

Iterated Prisoner's Dilemma

- Which strategy can thrive? What is a good strategy?
- Robert Axelrod, 1980s
  - A computer round-robin tournament

Axelrod, R. 1987. The evolution of strategies in the iterated Prisoner's Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA.

The first round

- Strategies: 14 entries plus a random strategy
  - Including Markov process, Bayesian inference
- Each pair of strategies meets, for 15 × 15 runs in total; each pair plays the game 200 times
- Payoff of strategy $S$: $\sum_{S'} U(S, S') / 15$
- Tit For Tat wins (cooperation based on reciprocity)
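The tournament format can be illustrated with a much smaller field. The sketch below is not Axelrod's actual tournament: it uses only four deterministic strategies described in this lecture (Tit For Tat, Always Defect, Always Cooperate, Grudger), assumes that each strategy also meets a copy of itself, and scores each strategy by its average payoff over all opponents, in the spirit of the formula above. All function names are illustrative.

```python
PAYOFF = {
    ("C", "C"): (3, 3), ("C", "D"): (0, 5),
    ("D", "C"): (5, 0), ("D", "D"): (1, 1),
}

# Each strategy maps (own history, opponent history) to the next move.
def tit_for_tat(own, opp):
    return "C" if not opp else opp[-1]

def always_defect(own, opp):
    return "D"

def always_cooperate(own, opp):
    return "C"

def grudger(own, opp):
    return "D" if "D" in opp else "C"

def play(strat_a, strat_b, rounds=200):
    """Play the iterated game and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        move_a = strat_a(hist_a, hist_b)
        move_b = strat_b(hist_b, hist_a)
        pa, pb = PAYOFF[(move_a, move_b)]
        score_a += pa
        score_b += pb
        hist_a.append(move_a)
        hist_b.append(move_b)
    return score_a, score_b

def round_robin(strategies, rounds=200):
    """Average score of each strategy against every strategy, itself included."""
    names = list(strategies)
    return {
        name: sum(play(strategies[name], strategies[other], rounds)[0]
                  for other in names) / len(names)
        for name in names
    }

FIELD = {
    "Tit For Tat": tit_for_tat,
    "Always Defect": always_defect,
    "Always Cooperate": always_cooperate,
    "Grudger": grudger,
}
scores = round_robin(FIELD)
for name, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score}")
```

In this tiny field Tit For Tat and Grudger tie at the top, so the ranking depends on which strategies take part; the sketch shows the scoring mechanics only, not the historical result.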

The first round

- Characters of good strategies
  - Goodness: never defect first
    - TFT vs. Naive Prober
  - Forgiveness: may take revenge, but the memory is short.
    - TFT vs. Grudger

Naive Prober - Repeat opponent's last choice, but sometimes probe by defecting in lieu of cooperating.

Grudger - Cooperate until the opponent defects. Then always defect unforgivingly.

Winning Vs. High Scores

- This is not a zero-sum game; there is a banker.
- TFT never wins a single game. The best result for it is to get the same score as its opponent.
- Trying to win the game is a kind of jealousy; it does not work well.
- Cooperation can arise in a selfish group.

The second round

- Strategies: 62 entries plus a random strategy
  - goodness strategies
  - wiliness strategies
- Tit For Tat wins again
- Winning or losing depends on the circumstances.

Characters of good strategies

- Goodness: never defect first
  - First round: the top eight strategies are strategies with goodness
  - Second round: fourteen of the top fifteen strategies are strategies with goodness
- Forgiveness: may take revenge, but the memory is short.
  - Grudger is not a strategy with forgiveness
- Goodness and forgiveness are a kind of collective behavior.
- For a single agent, defect is the best strategy.

Evolution of the Strategies

- Evolve good strategies by genetic algorithm (GA)

What is a good strategy?

- Is TFT a good strategy?
- Tit For Two Tats may be the best strategy in the first round, but it is not a good strategy in the second round.
- A good strategy depends on the environment.

- Tit For Two Tats - Like Tit For Tat except that the opponent must make the same choice twice in a row before it is reciprocated.

Evolutionarily stable strategy

Evolutionarily stable strategy (ESS)

- Introduced by John Maynard Smith and George R. Price in 1973
- An ESS is a strategy such that, if all members of the population adopt it, then no mutant strategy can invade the population under the influence of natural selection.
- An ESS is robust under evolution; it cannot be invaded by mutation.

John Maynard Smith, Evolution and the Theory of Games

Definition of ESS

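- A strategy $x$ is an ESS if, for all $y \neq x$, $u(x, \epsilon y + (1-\epsilon)x) > u(y, \epsilon y + (1-\epsilon)x)$ holds for all sufficiently small positive $\epsilon$, where $u(a, b)$ is the payoff of strategy $a$ against the population mix $b$.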
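The invasion condition can be tested numerically: a resident strategy resists a mutant if it earns strictly more than the mutant in a population made of a fraction $1-\epsilon$ of residents and a small fraction $\epsilon$ of mutants. The sketch below applies this to the one-shot prisoner's dilemma with the payoffs used in this lecture. It checks pure strategies at a single small $\epsilon$ only, so it is an illustration of the condition rather than a proof; the names are illustrative.

```python
# Payoff to the first strategy when meeting the second (one-shot PD).
PAYOFF = {
    ("C", "C"): 3, ("C", "D"): 0,
    ("D", "C"): 5, ("D", "D"): 1,
}

def payoff_vs_population(strategy, resident, mutant, eps):
    """Expected payoff against a population of (1 - eps) residents and eps mutants."""
    return (1 - eps) * PAYOFF[(strategy, resident)] + eps * PAYOFF[(strategy, mutant)]

def resists_invasion(resident, mutant, eps=0.01):
    """True if the resident does strictly better than the mutant in the mixed population."""
    return (payoff_vs_population(resident, resident, mutant, eps)
            > payoff_vs_population(mutant, resident, mutant, eps))

print(resists_invasion("D", "C"))  # True: defectors resist cooperators
print(resists_invasion("C", "D"))  # False: cooperators are invaded by defectors
```

This matches the one-shot intuition that defection cannot be invaded, while a population of cooperators can be.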

ESS

- ESS is defined in a population with a large number of individuals.
- The individuals cannot control their strategy, and may not be aware of the game they are playing.
- ESS is the result of natural selection.
- Like NE, ESS can only tell us that a state is robust to evolution; it cannot tell us how the population reaches such a state.

ESS in IPD

- Tit For Tat cannot be invaded by wiliness strategies, such as Always Defect.
- TFT can be invaded by goodness strategies, such as Always Cooperate, Tit For Two Tats and Suspicious Tit For Tat.
- Tit For Tat is not a strict ESS.
- Always Cooperate can be invaded by Always Defect.
- Always Defect is an ESS.

References

- Drew Fudenberg, Jean Tirole, Game Theory, The MIT Press, 1991.
- Axelrod, R. 1987. The evolution of strategies in the iterated Prisoner's Dilemma. In L. Davis, editor, Genetic Algorithms and Simulated Annealing. Morgan Kaufmann, Los Altos, CA.
- Richard Dawkins, The Selfish Gene, Oxford University Press.

Concluding Remarks

- The tip of game theory
  - Basic concepts
  - Nash equilibrium
  - Iterated Prisoner's Dilemma
  - Evolutionarily stable strategy

Concluding Remarks

- Many interesting topics deserve to be studied and further investigated
  - Cooperative games
  - Incomplete information games
  - Dynamic games
  - Combinatorial games
  - Learning in games
  - ...

Thank you!