Title: Gametheory: A brief survey of some classic and recent topics for the Intro to AI course at CMU By Tu
1 Game theory: A brief survey of some classic and recent topics for the Intro to AI course at CMU
By Tuomas Sandholm
Associate Professor
Computer Science Department
Carnegie Mellon University
2 The heart of the problem
- In a 1-agent setting, the agent's expected-utility-maximizing strategy is well defined
- But in a multiagent system, the outcome may also depend on the others' strategies
3 Terminology
- Agent = player
- Action = move = choice that an agent can make at a point in the game
- Strategy $s_i$ = mapping from history (to the extent that agent $i$ can distinguish) to actions
- Strategy set $S_i$ = strategies available to the agent
- Strategy profile $(s_1, s_2, \ldots, s_{|A|})$ = one strategy for each agent
- An agent's utility is determined after each agent (including nature, which is used to model uncertainty) has chosen its strategy and the game has been played: $u_i = u_i(s_1, s_2, \ldots, s_{|A|})$
4 Game representations
5 Dominant strategy equilibrium
- Best response $s_i^*$: for all $s_i'$, $u_i(s_i^*, s_{-i}) \ge u_i(s_i', s_{-i})$
- Dominant strategy $s_i^*$: $s_i^*$ is a best response for all $s_{-i}$
  - Does not always exist
  - Inferior strategies are called dominated
- Dominant strategy equilibrium is a strategy profile where each agent has picked its dominant strategy
  - Does not always exist
  - Requires no counterspeculation
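
The definitions above can be checked mechanically on a small game. The following is a minimal sketch, assuming a two-player Prisoner's Dilemma with illustrative payoffs (the numbers and the names `PAYOFFS`, `utility`, and `dominant_strategies` are assumptions made for this example, not taken from the slides).

```python
from itertools import product

# Illustrative Prisoner's Dilemma payoffs (assumed, not from the slides).
# PAYOFFS[(row_action, col_action)] = (row player's utility, column player's utility)
ACTIONS = ["cooperate", "defect"]
PAYOFFS = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}


def utility(player, own, other):
    """Utility of `player` (0 = row, 1 = column) for playing `own` against `other`."""
    profile = (own, other) if player == 0 else (other, own)
    return PAYOFFS[profile][player]


def dominant_strategies(player):
    """Actions that are a best response to every action of the other player."""
    return [
        a for a in ACTIONS
        if all(utility(player, a, o) >= utility(player, b, o)
               for o in ACTIONS for b in ACTIONS)
    ]


def dominant_strategy_equilibria():
    """Profiles in which every player uses a dominant strategy."""
    return list(product(dominant_strategies(0), dominant_strategies(1)))


print(dominant_strategy_equilibria())  # [('defect', 'defect')]
```

With these payoffs, defecting is a best response whatever the other player does, so no counterspeculation is needed.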
6 Nash equilibrium [Nash50]
- Sometimes an agent's best response depends on others' strategies: a dominant strategy does not exist
- A strategy profile is a Nash equilibrium if no player has incentive to deviate from his strategy given that others do not deviate: for every agent $i$, $u_i(s_i^*, s_{-i}^*) \ge u_i(s_i', s_{-i}^*)$ for all $s_i'$
  - Dominant strategy equilibria are Nash equilibria but not vice versa
  - Defect-defect is the only Nash eq. in Prisoner's Dilemma
  - Battle of the Sexes game
    - Has no dominant strategy equilibria
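
The no-profitable-deviation condition can be tested by brute force over all pure strategy profiles. This is a rough sketch under assumed payoffs: the Prisoner's Dilemma and Battle of the Sexes numbers below are illustrative, and `pure_nash_equilibria` is a hypothetical helper name.

```python
from itertools import product


def pure_nash_equilibria(payoffs, row_actions, col_actions):
    """Return all pure profiles from which neither player gains by deviating alone."""
    equilibria = []
    for r, c in product(row_actions, col_actions):
        u_row, u_col = payoffs[(r, c)]
        # Row player compares against every alternative row, holding the column fixed.
        row_ok = all(payoffs[(r2, c)][0] <= u_row for r2 in row_actions)
        # Column player compares against every alternative column, holding the row fixed.
        col_ok = all(payoffs[(r, c2)][1] <= u_col for c2 in col_actions)
        if row_ok and col_ok:
            equilibria.append((r, c))
    return equilibria


# Illustrative payoffs (assumed, not from the slides).
PRISONERS_DILEMMA = {
    ("cooperate", "cooperate"): (3, 3),
    ("cooperate", "defect"): (0, 5),
    ("defect", "cooperate"): (5, 0),
    ("defect", "defect"): (1, 1),
}
BATTLE_OF_THE_SEXES = {
    ("ballet", "ballet"): (2, 1),
    ("ballet", "football"): (0, 0),
    ("football", "ballet"): (0, 0),
    ("football", "football"): (1, 2),
}

pd_actions = ["cooperate", "defect"]
bos_actions = ["ballet", "football"]
print(pure_nash_equilibria(PRISONERS_DILEMMA, pd_actions, pd_actions))
# [('defect', 'defect')]
print(pure_nash_equilibria(BATTLE_OF_THE_SEXES, bos_actions, bos_actions))
# [('ballet', 'ballet'), ('football', 'football')]
```

Battle of the Sexes has two pure equilibria under these payoffs, and neither player has a dominant strategy, which illustrates the non-uniqueness problem.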
7 Criticisms of Nash equilibrium
- Not unique in all games, e.g. Battle of the Sexes
  - Approaches for addressing this problem
    - Refinements of the equilibrium concept
      - Choose the Nash equilibrium with highest welfare
      - Subgame perfection
      - ...
    - Focal points
    - Mediation
    - Communication
    - Convention
    - Learning
- Does not exist in all games
- May be hard to compute
  - Finding a good one is NP-hard [Gilboa & Zemel GEB-89, Conitzer & Sandholm IJCAI-03]
8 Existence of (pure strategy) Nash equilibria
- IF a game is finite
- and at every point in the game, the agent whose turn it is to move knows what moves have been played so far
- THEN the game has a (pure strategy) Nash equilibrium
- (solvable by minimax search, at least as long as ties are ruled out)
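
As a small illustration of solving such a game by minimax search, here is a sketch for a finite, perfect-information, two-player zero-sum game tree. The tree encoding (nested dicts with numeric leaves), the example tree `TREE`, and the function name `minimax` are assumptions made for this example; the leaf payoffs are distinct, so ties are ruled out.

```python
def minimax(node, maximizing=True):
    """Solve a finite perfect-information zero-sum game tree.

    A leaf is a number (payoff to the maximizing player); an internal node is a
    dict mapping each action to a subtree. Players alternate moves.
    Returns (value, list of actions along the equilibrium path).
    """
    if not isinstance(node, dict):
        return node, []
    best_value, best_path = None, None
    for action, child in node.items():
        value, path = minimax(child, not maximizing)
        better = (
            best_value is None
            or (maximizing and value > best_value)
            or (not maximizing and value < best_value)
        )
        if better:
            best_value, best_path = value, [action] + path
    return best_value, best_path


# Hypothetical two-move game: the maximizer moves first, the minimizer second.
TREE = {"L": {"l": 3, "r": 5}, "R": {"l": 2, "r": 9}}

value, path = minimax(TREE)
print(value, path)  # 3 ['L', 'l']
```

The returned path is the play induced by a pure strategy equilibrium: the minimizer would answer `R` with `l` (payoff 2), so the maximizer prefers `L`.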
9 Rock-scissors-paper game
10 Rock-scissors-paper game
11 Mixed strategy Nash equilibrium
- Mixed strategy = agent's chosen probability distribution over pure strategies from its strategy set
- Each agent has a best response strategy and beliefs (consistent with each other)
- Symmetric mixed strategy Nash eq.: each player plays each pure strategy with probability 1/3
- In mixed strategy equilibrium, each strategy that occurs in the mix of agent $i$ has equal expected utility to $i$
- Information set (the mover does not know which node of the set she is in)
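
The equal-expected-utility property can be verified directly for rock-scissors-paper. This is a minimal sketch assuming the usual payoffs of +1 for a win, -1 for a loss, and 0 for a tie (the slides' payoff figure is not reproduced here, so these numbers are an assumption), with hypothetical helper names.

```python
from fractions import Fraction

ACTIONS = ["rock", "scissors", "paper"]
BEATS = {"rock": "scissors", "scissors": "paper", "paper": "rock"}


def payoff(a, b):
    """Assumed payoff to the player choosing `a` against `b`: win +1, loss -1, tie 0."""
    if a == b:
        return 0
    return 1 if BEATS[a] == b else -1


def expected_utilities(opponent_mix):
    """Expected utility of each pure strategy against the opponent's mixed strategy."""
    return {
        a: sum(p * payoff(a, b) for b, p in opponent_mix.items())
        for a in ACTIONS
    }


uniform = {a: Fraction(1, 3) for a in ACTIONS}
skewed = {"rock": Fraction(1, 2), "scissors": Fraction(1, 4), "paper": Fraction(1, 4)}

print(expected_utilities(uniform))
print(expected_utilities(skewed))
```

Against the uniform mix every pure strategy earns the same expected utility (zero), so no deviation helps and mixing uniformly is itself a best response. Against the skewed mix, paper earns strictly more than the other two strategies, so that mix cannot be part of an equilibrium.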
12 Existence of mixed strategy Nash equilibria
- Every finite-player, finite-strategy game has at least one Nash equilibrium if we admit mixed strategy equilibria as well as pure [Nash 50]
  - (Proof is based on Kakutani's fixed point theorem)
13 Complexity of finding a mixed-strategy Nash equilibrium?
- Still an open question!
- "Together with factoring the most important concrete open question on the boundary of P today" [Papadimitriou 2001]
- We solved several related questions
  - Conitzer & Sandholm, International Joint Conference on Artificial Intelligence 2003
14 A useful reduction (SAT -> game) [CS, IJCAI-03]
- Theorem. SAT-solutions correspond to mixed-strategy equilibria of the following game (each agent randomizes uniformly on support)

SAT formula: (x1 or -x2) and (-x1 or x2)
Solutions: x1=true, x2=true; x1=false, x2=false

Game (each cell: row player's payoff, column player's payoff). Strategies: the variables x1, x2; the literals +x1, -x1, +x2, -x2; the clauses; and default.

             x1     x2     +x1    -x1    +x2    -x2    (x1 or -x2)  (-x1 or x2)  default
x1           -2,-2  -2,-2  0,-2   0,-2   2,-2   2,-2   -2,-2        -2,-2        -2,1
x2           -2,-2  -2,-2  2,-2   2,-2   0,-2   0,-2   -2,-2        -2,-2        -2,1
+x1          -2,0   -2,2   1,1    -2,-2  1,1    1,1    -2,0         -2,2         -2,1
-x1          -2,0   -2,2   -2,-2  1,1    1,1    1,1    -2,2         -2,0         -2,1
+x2          -2,2   -2,0   1,1    1,1    1,1    -2,-2  -2,2         -2,0         -2,1
-x2          -2,2   -2,0   1,1    1,1    -2,-2  1,1    -2,0         -2,2         -2,1
(x1 or -x2)  -2,-2  -2,-2  0,-2   2,-2   2,-2   0,-2   -2,-2        -2,-2        -2,1
(-x1 or x2)  -2,-2  -2,-2  2,-2   0,-2   0,-2   2,-2   -2,-2        -2,-2        -2,1
default      1,-2   1,-2   1,-2   1,-2   1,-2   1,-2   1,-2         1,-2         0,0
- Proof sketch
  - Playing opposite literals (with any probability) is unstable
  - If you play literals (with probabilities), you should make sure that
    - for any clause, the probability of the literal being in that clause is high enough, and
    - for any variable, the probability that the literal corresponds to that variable is high enough
    - (otherwise the other player will play this clause/variable and hurt you)
  - So equilibria where both randomize over literals can only occur when both randomize over the same SAT solution
  - These are the only good equilibria
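
The correspondence claimed by the theorem can be spot-checked numerically on this slide's example. The sketch below transcribes the row player's payoffs from the game table (the game is symmetric, so the column player's payoffs are the transpose) and tests whether both players mixing uniformly over a given support is an equilibrium. The strategy labels (`var_x1`, `+x1`, and so on) and the helper names are assumptions introduced for this example.

```python
STRATEGIES = ["var_x1", "var_x2", "+x1", "-x1", "+x2", "-x2",
              "(x1 or -x2)", "(-x1 or x2)", "default"]

# Row player's payoffs, transcribed from the game table; rows and columns
# follow the order of STRATEGIES.
A = [
    [-2, -2,  0,  0,  2,  2, -2, -2, -2],  # var_x1
    [-2, -2,  2,  2,  0,  0, -2, -2, -2],  # var_x2
    [-2, -2,  1, -2,  1,  1, -2, -2, -2],  # +x1
    [-2, -2, -2,  1,  1,  1, -2, -2, -2],  # -x1
    [-2, -2,  1,  1,  1, -2, -2, -2, -2],  # +x2
    [-2, -2,  1,  1, -2,  1, -2, -2, -2],  # -x2
    [-2, -2,  0,  2,  2,  0, -2, -2, -2],  # (x1 or -x2)
    [-2, -2,  2,  0,  0,  2, -2, -2, -2],  # (-x1 or x2)
    [ 1,  1,  1,  1,  1,  1,  1,  1,  0],  # default
]


def uniform(support):
    """Mixed strategy that randomizes uniformly over `support`."""
    return {s: 1 / len(support) for s in support}


def payoffs_against(mix):
    """Expected payoff of every pure strategy against an opponent playing `mix`."""
    return [sum(A[i][j] * mix.get(s, 0) for j, s in enumerate(STRATEGIES))
            for i in range(len(STRATEGIES))]


def is_symmetric_equilibrium(mix, tol=1e-9):
    """True if `mix` is a best response to itself (both players play `mix`)."""
    values = payoffs_against(mix)
    best = max(values)
    return all(values[STRATEGIES.index(s)] >= best - tol
               for s, p in mix.items() if p > 0)


def equilibrium_payoff(mix):
    """Expected payoff to each player when both play `mix`."""
    values = payoffs_against(mix)
    return sum(p * values[STRATEGIES.index(s)] for s, p in mix.items())


for support in (["+x1", "+x2"], ["-x1", "-x2"], ["+x1", "-x2"],
                ["+x1", "-x1"], ["default"]):
    mix = uniform(support)
    print(support, is_symmetric_equilibrium(mix), equilibrium_payoff(mix))
```

The two satisfying assignments give equilibria with expected utility 1 each, the non-satisfying assignment {+x1, -x2} is broken by the clause (-x1 or x2), opposite literals are unstable, and default-default is an equilibrium with payoff 0.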
15 Complexity of mixed-strategy Nash equilibria with certain properties
- This reduction implies that there is an equilibrium where players get expected utility 1 each iff the SAT formula is satisfiable
  - Any reasonable objective would prefer such equilibria to the 0-payoff equilibrium
- Corollary. Deciding whether a "good" equilibrium exists is NP-hard:
  - 1. equilibrium with high social welfare
  - 2. Pareto-optimal equilibrium
  - 3. equilibrium with high utility for a given player $i$
  - 4. equilibrium with high minimal utility
- Also NP-hard (from the same reduction):
  - 5. Does more than one equilibrium exist?
  - 6. Is a given strategy ever played in any equilibrium?
  - 7. Is there an equilibrium where a given strategy is never played?
- (5) and weaker versions of (4), (6), (7) were known [Gilboa & Zemel GEB-89]
- All these hold even for symmetric, 2-player games
16 Counting the number of mixed-strategy Nash equilibria
- Why count equilibria?
  - If we cannot even count the equilibria, there is little hope of getting a good overview of the overall strategic structure of the game
- Unfortunately, our reduction implies:
  - Corollary. Counting Nash equilibria is #P-hard!
    - Proof. #SAT is #P-hard, and the number of equilibria is 1 + #SAT
  - Corollary. Counting connected sets of equilibria is just as hard
    - Proof. In our game, each equilibrium is alone in its connected set
  - These results hold even for symmetric, 2-player games
17 Subgame perfect equilibrium & credible threats [Selten 72]
- Proper subgame = subtree (of the game tree) whose root is alone in its information set
- Subgame perfect equilibrium = strategy profile that is in Nash equilibrium in every proper subgame (including the root), whether or not that subgame is reached along the equilibrium path of play
- E.g. Cuban missile crisis
  - Pure strategy Nash equilibria: (Arm, Fold), (Retract, Nuke)
  - Pure strategy subgame perfect equilibria: (Arm, Fold)
  - Conclusion: Kennedy's Nuke threat was not credible
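
The Cuban missile crisis example can be reproduced with a small enumeration. This is only a sketch: the game tree (Khrushchev chooses Arm or Retract, and Kennedy chooses Nuke or Fold after Arm) and all payoff numbers below are assumptions chosen for illustration, since the slide's figure is not part of this transcript.

```python
# Hypothetical payoffs as (Khrushchev, Kennedy); not taken from the slides.
RETRACT_PAYOFF = (-1, 1)
AFTER_ARM = {"Nuke": (-100, -100), "Fold": (10, -10)}

KHRUSHCHEV_MOVES = ["Arm", "Retract"]
KENNEDY_PLANS = ["Nuke", "Fold"]  # what Kennedy would do if Khrushchev arms


def outcome(k_move, j_plan):
    """Payoffs when Khrushchev plays `k_move` and Kennedy's plan is `j_plan`."""
    return AFTER_ARM[j_plan] if k_move == "Arm" else RETRACT_PAYOFF


def nash_equilibria():
    """Pure profiles where neither player gains by deviating unilaterally."""
    result = []
    for k in KHRUSHCHEV_MOVES:
        for j in KENNEDY_PLANS:
            u = outcome(k, j)
            k_ok = all(outcome(k2, j)[0] <= u[0] for k2 in KHRUSHCHEV_MOVES)
            j_ok = all(outcome(k, j2)[1] <= u[1] for j2 in KENNEDY_PLANS)
            if k_ok and j_ok:
                result.append((k, j))
    return result


def subgame_perfect_equilibria():
    """Nash equilibria in which Kennedy's plan is also optimal in the subgame after Arm."""
    best_for_kennedy = max(AFTER_ARM[j][1] for j in KENNEDY_PLANS)
    return [(k, j) for k, j in nash_equilibria()
            if AFTER_ARM[j][1] == best_for_kennedy]


print(nash_equilibria())             # [('Arm', 'Fold'), ('Retract', 'Nuke')]
print(subgame_perfect_equilibria())  # [('Arm', 'Fold')]
```

(Retract, Nuke) is a Nash equilibrium only because the subgame after Arm is never reached; once reached, Nuke is not a best response there, so the threat fails the subgame perfection test.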
18 Conclusions on game-theoretic analysis tools
- Tools for building robust, nonmanipulable systems with self-interested agents and different agent designers
- Different solution concepts
  - For existence, use strongest equilibrium concept
  - For uniqueness, use weakest equilibrium concept
19 Mechanism design
Necessary for building nonmanipulable multiagent
systems (e.g. for automated negotiation)
20 Goal of mechanism design
- Implementing a social choice function $f(u_1, \ldots, u_{|A|})$ using a game
- Center = "auctioneer" does not know the agents' preferences
  - Agents may lie
- Goal is to design the rules of the game (aka mechanism) so that in equilibrium $(s_1, \ldots, s_{|A|})$, the outcome of the game is $f(u_1, \ldots, u_{|A|})$
- Mechanism designer specifies the strategy sets $S_i$ and how the outcome is determined as a function of $(s_1, \ldots, s_{|A|}) \in (S_1, \ldots, S_{|A|})$
- Variants
  - Strongest: There exists exactly one equilibrium. Its outcome is $f(u_1, \ldots, u_{|A|})$
  - Medium: In every equilibrium the outcome is $f(u_1, \ldots, u_{|A|})$
  - Weakest: In at least one equilibrium the outcome is $f(u_1, \ldots, u_{|A|})$
21 Revelation principle
- Any outcome that can be supported in Nash
(dominant strategy) equilibrium via a complex
indirect mechanism can be supported in Nash
(dominant strategy) equilibrium via a direct
mechanism where agents reveal their types
truthfully in a single step
22 Uses of the revelation principle
- Literal: "Only direct mechanisms needed"
  - Problems:
    - Strategy formulator might be complex
      - Complex to determine and/or execute best-response strategy
      - Computational burden is pushed on the center (assumed away)
      - Thus the revelation principle might not hold in practice if these computational problems are hard
      - This problem is traditionally ignored in game theory
    - Agent might not know its own preferences, and figuring them out can be costly (e.g. computationally). In an indirect mechanism, the right outcome can sometimes be chosen without eliciting everything about each agent's preferences
      - See e.g. "Preference Elicitation in Combinatorial Auctions" by Conen & Sandholm
    - Even if the indirect mechanism has a unique equilibrium, the direct mechanism can have additional bad equilibria
- As an analysis tool
  - Best direct mechanism gives a tight upper bound on how well any indirect mechanism can do
    - Space of direct mechanisms is smaller than that of indirect ones
    - One can analyze all direct mechanisms and pick the best one
    - Thus one can know when one has designed an optimal indirect mechanism (when it is as good as the best direct one)
23 Implementation in dominant strategies
Strongest form of mechanism design
- Goal is to design the rules of the game (aka mechanism) so that in dominant strategy equilibrium $(s_1, \ldots, s_{|A|})$, the outcome of the game is $f(u_1, \ldots, u_{|A|})$
- Nice in that agents cannot benefit from counterspeculating each other
  - Others' preferences
  - Others' rationality
  - Others' endowments
  - Others' capabilities
24 Gibbard-Satterthwaite impossibility [1972]
- Thrm. If $|O| \ge 3$ (and each outcome would be the social choice under $f$ for some input profile $(u_1, \ldots, u_{|A|})$) and $f$ is implementable in dominant strategies, then $f$ is dictatorial
26 Special case where dominant strategy implementation is possible: Quasilinear preferences -> Clarke tax mechanism
- Outcome $(x_1, x_2, \ldots, x_k, m_1, m_2, \ldots, m_{|A|})$
- Quasilinear preferences: $u_i(x, m) = m_i + v_i(x_1, x_2, \ldots, x_k)$
- Utilitarian setting: social welfare maximizing choice
  - Outcome $s(v_1, v_2, \ldots, v_{|A|}) = \arg\max_x \sum_i v_i(x_1, x_2, \ldots, x_k)$
- Agent's payment $m_i = \sum_{j \ne i} v_j(s(v)) - \sum_{j \ne i} v_j(s(v_{-i})) \le 0$ is a "tax"
- Thrm: Every agent's dominant strategy is to reveal preferences truthfully
- Intuition: Agent internalizes the negative externality he imposes on others by affecting the outcome
  - Agent pays nothing if he does not change the outcome
- Example: $k = 1$, $x_1$ = "joint pool built" or "not", $m_i$ = monetary payment
  - E.g. equal sharing of construction cost: $-c / |A|$
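
The payment rule can be worked through on the pool example. The following is a minimal sketch of the Clarke tax computation under assumed numbers: three agents, a construction cost of 300 shared equally, and gross valuations of 50, 50, and 250 for the pool. These figures and the names `clarke_mechanism` and `utility_of` are hypothetical, introduced only for illustration.

```python
def clarke_mechanism(valuations, outcomes):
    """Clarke tax mechanism for reported valuations.

    valuations: list of dicts, valuations[i][x] = agent i's reported value v_i(x).
    Returns (welfare-maximizing outcome, list of payments m_i <= 0).
    """
    def best_outcome(agents):
        # Outcome maximizing the summed reported value of the given agents.
        return max(outcomes, key=lambda x: sum(valuations[j][x] for j in agents))

    everyone = range(len(valuations))
    chosen = best_outcome(everyone)
    payments = []
    for i in everyone:
        others = [j for j in everyone if j != i]
        without_i = best_outcome(others)  # what the others would choose without i
        m_i = (sum(valuations[j][chosen] for j in others)
               - sum(valuations[j][without_i] for j in others))
        payments.append(m_i)
    return chosen, payments


OUTCOMES = ["build", "no pool"]
COST = 300                      # assumed construction cost c
GROSS_VALUES = [50, 50, 250]    # assumed gross value of the pool to each agent
SHARE = COST // len(GROSS_VALUES)  # equal sharing of the cost: c / |A|

# Net valuations v_i(x): gross value minus cost share if built, 0 otherwise.
TRUE_VALUATIONS = [{"build": g - SHARE, "no pool": 0} for g in GROSS_VALUES]

chosen, payments = clarke_mechanism(TRUE_VALUATIONS, OUTCOMES)
print(chosen, payments)  # build [0, 0, -100]


def utility_of(agent, reported):
    """Quasilinear utility u_i = m_i + v_i(x) of `agent` under the reported profile."""
    x, m = clarke_mechanism(reported, OUTCOMES)
    return m[agent] + TRUE_VALUATIONS[agent][x]


# Agent 2 understating its value (reporting 0 for "build") changes the outcome.
misreport = TRUE_VALUATIONS[:2] + [{"build": 0, "no pool": 0}]
print(utility_of(2, TRUE_VALUATIONS), utility_of(2, misreport))  # 50 0
```

Only agent 2 is pivotal, so only agent 2 pays a tax, equal to the loss of 100 that building the pool imposes on the other two agents. The collected 100 is not redistributed, which is the budget-balance issue discussed on the next slide.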
27 Clarke tax mechanism
- Pros
  - Social welfare maximizing outcome
  - Truth-telling is a dominant strategy
  - Feasible in that it does not need a benefactor ($\sum_i m_i \le 0$)
- Cons
  - Budget balance not maintained (in pool example, generally $\sum_i m_i < 0$)
    - Have to burn the excess money that is collected
    - Thrm. [Green & Laffont 1979]. Let the agents have arbitrary quasilinear preferences. No social choice function that is (ex post) welfare maximizing (taking into account money burning as a loss) is implementable in dominant strategies
    - If there is some party that has no private information to reveal and no preferences over $x$, welfare maximization and budget balance can be obtained by having that party's payment be $m_0 = -\sum_{i=1}^{|A|} m_i$
      - Auctioneer could be called agent 0
      - Clarke tax mechanism => Second-price sealed-bid (Vickrey) auction
  - Vulnerable to collusion
    - Even by coalitions of just 2 agents
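
To make the connection to the second-price sealed-bid (Vickrey) auction concrete, here is a short sketch that checks truthful bidding against a range of alternative bids. The bid values, the tie-breaking rule (lowest index wins), and the helper names are assumptions for this example.

```python
def vickrey_auction(bids):
    """Second-price sealed-bid auction: highest bidder wins and pays the second-highest bid.

    Ties are broken in favor of the lowest index (an assumption of this sketch).
    Returns (winner index, price paid).
    """
    order = sorted(range(len(bids)), key=lambda i: bids[i], reverse=True)
    return order[0], bids[order[1]]


def bidder_utility(true_value, my_bid, other_bids):
    """Utility of bidder 0 with `true_value` when bidding `my_bid` against `other_bids`."""
    winner, price = vickrey_auction([my_bid] + other_bids)
    return true_value - price if winner == 0 else 0


TRUE_VALUE = 70
for others in ([50, 60], [50, 90]):
    truthful = bidder_utility(TRUE_VALUE, TRUE_VALUE, others)
    best_alternative = max(bidder_utility(TRUE_VALUE, b, others) for b in range(0, 121))
    print(others, truthful, best_alternative)
```

In both scenarios no alternative bid in the tested range beats bidding the true value: underbidding risks losing a profitable item, and overbidding risks winning at a price above the bidder's value.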
28 Another approach for circumventing the Gibbard-Satterthwaite impossibility (and potentially other impossibility results)
- Design the game so that (although manipulations exist) finding a beneficial manipulation is computationally so complex for an agent that the agent cannot do that
- E.g. "Complexity of Manipulating Elections with Few Candidates"
  - Conitzer & Sandholm, National Conference on Artificial Intelligence 2002
- E.g. "Universal Voting Protocol Tweaks for Making Manipulation Hard"
  - Conitzer & Sandholm, International Joint Conference on Artificial Intelligence 2003
29 Yet another approach for circumventing the Gibbard-Satterthwaite impossibility (and potentially other impossibility results)
- Designing the mechanism automatically to the situation at hand
  - Input is the probabilistic information that the center has about the agents
  - Output is an optimal mechanism where the agents are motivated to reveal their preferences truthfully, and a social objective is satisfied to the optimal extent
- Advantages
  - Can be used even without side payments and quasilinear preferences
  - Could achieve better outcomes than Clarke tax mechanism
  - Circumvents impossibility in many cases
- "Complexity of Mechanism Design" [Conitzer & Sandholm, Conference on Uncertainty in AI 2002]
  - Designing a deterministic mechanism is NP-complete
  - Designing a randomized mechanism is fast (polynomial time, via linear programming)
    - No loss in social objective, sometimes a gain
  - Both results hold for dominant strategy implementation as well as Bayes-Nash implementation