1
Best-Reply Mechanisms

Noam Nisan, Michael Schapira and Aviv Zohar

2
On The Agenda

Best-Reply Dynamics
Convergence issues - Max Solvable Games
Strategic issues - Universally Max Solvable
Games.
Best Reply as a Mechanism
Examples
Single Item Auction, Matching, Congestion Control.

3
Best-Reply Dynamics

Repeatedly:
Fix the strategies of all players but one.
Set that player's strategy to be a best reply to
the others.
Greedy, myopic.
A natural naïve approach for computing pure Nash.
Often used as an actual strategy (Internet
protocols, markets)
Does it make sense to use best-reply in such
settings?
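
A minimal sketch of the loop described above, for a two-player game given as a payoff table. The payoff values, the function names (`best_reply`, `best_reply_dynamics`), and the rule that a player keeps his current strategy when it is already a best reply are all assumptions made for illustration; none of them come from the talk.

```python
from typing import List, Optional, Tuple

Payoffs = List[List[Tuple[int, int]]]  # payoffs[row][col] = (row utility, column utility)

# Hypothetical game in which the dynamics settle down.
SOLVABLE: Payoffs = [[(3, 3), (0, 4)], [(4, 0), (1, 1)]]
# Matching pennies: no pure Nash exists, so the dynamics cycle.
PENNIES: Payoffs = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]


def best_reply(payoffs: Payoffs, player: int, profile: List[int]) -> int:
    """Best strategy for `player` while the other player's strategy stays fixed."""
    count = len(payoffs) if player == 0 else len(payoffs[0])

    def utility(strategy: int) -> int:
        trial = list(profile)
        trial[player] = strategy
        return payoffs[trial[0]][trial[1]][player]

    best = max(range(count), key=utility)
    # Assumed tie-breaking: stay put if the current strategy is already optimal.
    if utility(profile[player]) == utility(best):
        return profile[player]
    return best


def best_reply_dynamics(payoffs: Payoffs, start: Tuple[int, int],
                        max_rounds: int = 100) -> Optional[Tuple[int, int]]:
    """Round-robin best-reply; returns a pure Nash profile, or None if none is reached."""
    profile = list(start)
    for _ in range(max_rounds):
        changed = False
        for player in (0, 1):
            reply = best_reply(payoffs, player, profile)
            if reply != profile[player]:
                profile[player] = reply
                changed = True
        if not changed:  # nobody wants to move: this is a pure Nash equilibrium
            return (profile[0], profile[1])
    return None
```

On the first table the loop stops at the profile (1, 1); on matching pennies it never settles and returns `None` once `max_rounds` is exhausted.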

4
Example: Battle of the Sexes
[Payoff matrix not preserved in the transcript; axes: Row Player, Column Player]
5
Three Desirable Properties

An equilibrium point: pure Nash
At some point in time everything settles down.
Does not have to exist (e.g. rock-paper-scissors).
(Fast) convergence to equilibrium
Polynomial in the size of strategy spaces.
Incentive Compatibility
Players will want to follow the prescribed
strategy.

6
Potential Games

Defined using better-reply dynamics
[Monderer, Shapley].
Potential games: all games for which better-reply
dynamics always converge.
Convergence may take exponential time.
It is PLS-complete to find a pure Nash
[Fabrikant, Papadimitriou, Talwar].
Not incentive compatible (an example later).

7
Max Dominated Strategies

Definition: A strategy is max-dominated if it is
not a best-reply to any strategy-profile of the
other players.
Any strictly-dominated strategy is max-dominated.
Ties can be handled too. (Not in this talk.)

[Figure not preserved in the transcript: example of a max-dominated strategy]
8
Max Solvable Games

Definition: A max-solvable game is a game in
which iterated elimination of max-dominated
strategies leaves only one strategy for each
player.
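
The two definitions above can be sketched as a small elimination routine. This is an illustrative sketch only: the function names, the `utility(player, profile)` callback, and the example payoff tables are hypothetical, and tied best replies are all kept (the talk notes that ties need separate handling).

```python
from itertools import product
from typing import Callable, List, Sequence, Tuple

Profile = Tuple[int, ...]
Utility = Callable[[int, Profile], float]


def max_dominated(player: int, sets: Sequence[Sequence[int]], utility: Utility) -> List[int]:
    """Strategies of `player` that are a best reply to no profile of the others."""
    others = [[None] if i == player else list(s) for i, s in enumerate(sets)]
    best_replies = set()
    for combo in product(*others):
        def u(strategy: int) -> float:
            profile = tuple(strategy if i == player else combo[i]
                            for i in range(len(combo)))
            return utility(player, profile)
        top = max(u(s) for s in sets[player])
        best_replies.update(s for s in sets[player] if u(s) == top)
    return [s for s in sets[player] if s not in best_replies]


def iterated_elimination(sets: Sequence[Sequence[int]], utility: Utility) -> List[List[int]]:
    """Repeatedly delete max-dominated strategies until nothing more can be removed."""
    current = [list(s) for s in sets]
    while True:
        removed = False
        for player in range(len(current)):
            dominated = max_dominated(player, current, utility)
            if dominated:
                current[player] = [s for s in current[player] if s not in dominated]
                removed = True
        if not removed:
            return current


def is_max_solvable(sets: Sequence[Sequence[int]], utility: Utility) -> bool:
    return all(len(s) == 1 for s in iterated_elimination(sets, utility))


# Hypothetical payoff tables: TABLE[row][col] = (row utility, column utility).
SOLVABLE = [[(3, 3), (0, 4)], [(4, 0), (1, 1)]]
PENNIES = [[(1, -1), (-1, 1)], [(-1, 1), (1, -1)]]


def table_utility(table) -> Utility:
    return lambda player, profile: table[profile[0]][profile[1]][player]
```

In the first table each player's strategy 0 is never a best reply, so elimination leaves the single profile (1, 1). In matching pennies every strategy is a best reply to something, so nothing is eliminated.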

9
Convergence

Theorem: Max-solvable games have a unique pure
Nash equilibrium.
Theorem: In max-solvable games with n players,
any (round-robin) best-reply dynamics converges
in $n \cdot \sum_i m_i$ steps, where $m_i$ is the
size of the strategy space of player i.
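
As a quick worked instance of the step bound in the theorem above, assume a hypothetical game with n = 3 players and 4 strategies each (illustrative numbers, not from the talk):

```latex
n \sum_{i=1}^{n} m_i = 3 \cdot (4 + 4 + 4) = 36,
\qquad
\prod_{i=1}^{n} m_i = 4^3 = 64 .
```

The bound grows with the sum of the strategy-space sizes, whereas the number of strategy profiles grows with their product.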

10
Asynchronous Convergence

Players do not have to act one at a time.
Best-reply relies on the current actions of
others. What if these messages get delayed?

11
Asynchronous Convergence

Theorem: Max-solvable games converge under any
asynchronous timing that
does not delay any player's activation
indefinitely.
does not delay messages indefinitely.

12
Incentive Compatibility

Prescribed behavior: best-reply.
Will you follow it?
Notice: this is not a fully observable setting. A
player does not always know the utilities of
others.
To play best-reply, a player only needs to know
his own utility and the actions of others.
Max-solvability alone is not enough to guarantee
incentive compatibility.

13
Example: Not Incentive Compatible
[Payoff matrix not preserved in the transcript; axes: Row Player, Column Player]
14
Universally Max Dominated Strategies

Definition: A set of strategies for some player
is universally-max-dominated if its best payoff
is strictly worse than every payoff of that
player's other strategies.

[Figure not preserved in the transcript: one example labeled "Not universally-max-dominated" and one labeled "Universally-max-dominated"]
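
A hedged sketch of this definition as a check: the best payoff the player can get from the candidate set must be strictly below the worst payoff from any of his other strategies. The function name, the `utility(player, profile)` callback, and the two payoff tables are hypothetical and only illustrate the comparison.

```python
from itertools import product
from typing import Callable, Iterable, List, Sequence, Tuple

Profile = Tuple[int, ...]


def is_universally_max_dominated(player: int, candidate: Iterable[int],
                                 sets: Sequence[Sequence[int]],
                                 utility: Callable[[int, Profile], float]) -> bool:
    """True if the best payoff inside `candidate` is below every payoff outside it."""
    inside = set(candidate)
    outside = [s for s in sets[player] if s not in inside]
    if not inside or not outside:
        return False  # the comparison needs strategies on both sides

    def payoffs(strategies) -> List[float]:
        # All payoffs `player` can receive when playing one of `strategies`.
        return [utility(player, profile) for profile in product(*sets)
                if profile[player] in strategies]

    return max(payoffs(inside)) < min(payoffs(outside))


# Hypothetical row-player payoffs, indexed [row][col].
ROW_A = [[1, 2], [3, 4]]  # row 0 never pays more than 2; row 1 never pays less than 3
ROW_B = [[3, 0], [4, 1]]  # row 0 is never a best reply, yet its payoff 3 beats row 1's payoff 1


def row_utility(table) -> Callable[[int, Profile], float]:
    return lambda player, profile: table[profile[0]][profile[1]]
```

In `ROW_A` the set {0} is universally-max-dominated. In `ROW_B` row 0 is max-dominated (row 1 is the better reply to either column) but not universally-max-dominated, which shows that the universal condition is the stronger one.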
15
Universally Max Solvable Games

Definition: A game is universally max-solvable if
repeated elimination of universally-max-dominated
strategies leaves only one strategy profile.
Every universally-max-solvable game is also
max-solvable.

16
Universally-Max-Solvable Games

Theorem: The pure Nash equilibrium in
universally-max-solvable games is
collusion-proof:
no group of players can change strategies without
hurting at least one member.
Corollary: The pure Nash is also Pareto optimal.

17
Best Reply Mechanisms

Players have hidden utility functions (their
types).
For simplicity, we assume a central mechanism
that queries them about best-replies.
The goal: to decide on a strategy profile for
them to play that is, hopefully, a pure Nash.
Needed: a penalty that the mechanism can activate
to punish players that did not converge.
Natural in our examples.
Needs to be worse than the equilibrium outcome.

18
Best Reply Mechanisms

The mechanism:
Start with some strategy profile.
Go over the players in round-robin order and
repeatedly update their best-reply.
If in some round no one changes strategy, stop
and output the strategy profile.
If a certain (polynomial) number of rounds have
passed and players still did not converge, invoke
the penalty.
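
The steps above can be sketched as follows. The mechanism only sees what players report when queried, so each player is modeled as a callback; the names `best_reply_mechanism`, `truthful`, and `flipper`, the payoff table, and the use of `None` to signal the penalty are assumptions made for this sketch.

```python
from typing import Callable, List, Optional, Tuple

Profile = Tuple[int, ...]
Query = Callable[[int, Profile], int]  # (player, current profile) -> reported strategy


def best_reply_mechanism(n_players: int, query: Query, start: Profile,
                         max_rounds: int) -> Optional[Profile]:
    """Round-robin querying; returns the agreed profile, or None to signal the penalty."""
    profile: List[int] = list(start)
    for _ in range(max_rounds):
        changed = False
        for player in range(n_players):
            reported = query(player, tuple(profile))
            if reported != profile[player]:
                profile[player] = reported
                changed = True
        if not changed:  # a full round with no change: stop and output the profile
            return tuple(profile)
    return None  # round limit reached without convergence: invoke the penalty


# Hypothetical 2x2 game: PAYOFFS[row][col] = (row utility, column utility).
PAYOFFS = [[(3, 3), (0, 4)], [(4, 0), (1, 1)]]


def truthful(player: int, profile: Profile) -> int:
    """Reports a genuine best reply to the current profile."""
    def utility(strategy: int) -> int:
        trial = list(profile)
        trial[player] = strategy
        return PAYOFFS[trial[0]][trial[1]][player]
    return max(range(2), key=utility)


def flipper(player: int, profile: Profile) -> int:
    """Player 0 ignores best-reply and keeps switching; player 1 stays truthful."""
    if player == 0:
        return 1 - profile[0]
    return truthful(player, profile)
```

With truthful players the mechanism outputs the profile (1, 1). With the deviating player 0 it never sees a quiet round and falls through to the penalty.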

19
Best Reply Mechanisms

Theorem: For a universally-max-solvable game, the
given mechanism is incentive compatible in
ex-post Nash equilibrium.
Meaning:
When queried, you will always report your
best-reply, and not some other strategy.
The result of the mechanism will be the pure Nash
equilibrium of the game.
Ex-post means that you would not act differently
even if you knew the specific utility functions
of all others.
All you assume is that they also play best-reply.

20
Examples of Universally-Max-Solvable Games
21
Single Item Auction

A single item is being auctioned.
Each player has a private value in {1, 2, ..., k}.
Players announce what they are willing to pay.
Highest bidder gets the item for his bid.
(Ties are broken in some predefined way)

22
Single Item Auction

Utility of a player:
0 if he did not win.
Valuation minus payment if he did win.
Best-reply strategy:
If Bid > Valuation, decrease bid to valuation.
(This involves tie breaking.)
If not highest bidder and Bid < Valuation,
increase bid by 1.

23
The Mechanism

Start at any initial bids (Not necessarily 0)
Query players in order and ask if they want to
change their bid
When no one wants to change, allocate the item.
If there is no convergence after kn^2 rounds, give
the item to no one.
Notice:
We do not force ascending bids.
We do not have to start at 0.
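
A small simulation of this auction mechanism, under stated assumptions. It implements only the two bid-update rules listed on the previous slide, breaks ties in favor of the lowest-indexed bidder (the talk only says ties are broken "in some predefined way"), and is exercised here from low starting bids. All function names are hypothetical.

```python
from typing import List, Optional, Tuple


def winner(bids: List[int]) -> int:
    """Highest bidder; ties go to the lowest index (assumed tie-breaking rule)."""
    return max(range(len(bids)), key=lambda i: (bids[i], -i))


def updated_bid(player: int, bids: List[int], valuation: int) -> int:
    """The two update rules from the slide, applied to one player."""
    bid = bids[player]
    if bid > valuation:
        return valuation            # never stay above your own valuation
    if winner(bids) != player and bid < valuation:
        return bid + 1              # losing with room to spare: raise by 1
    return bid                      # otherwise keep the current bid


def run_auction(valuations: List[int], start_bids: List[int],
                k: int) -> Optional[Tuple[int, int]]:
    """Returns (winner, price), or None if the round limit k * n^2 is reached."""
    n = len(valuations)
    bids = list(start_bids)
    for _ in range(k * n * n):
        changed = False
        for player in range(n):
            new_bid = updated_bid(player, bids, valuations[player])
            if new_bid != bids[player]:
                bids[player] = new_bid
                changed = True
        if not changed:             # nobody wants to change: allocate the item
            w = winner(bids)
            return w, bids[w]
    return None                     # no convergence: the item goes to no one
```

For valuations 7 and 4 with both bids starting at 1, the bids climb until the lower-valued player stops at 4, and player 0 wins at a price of 4. This is the English-auction-like behavior the talk goes on to point out.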

24
Single Item Auction

Theorem: The single item auction is
universally-max-solvable (after tie breaking).
Therefore:
A unique pure Nash exists.
We converge to it quickly if everyone is truthful.
The mechanism we suggested is incentive
compatible.
Note that this is just the English auction
behavior (but with rules that are less strict).

25
Congestion Control

The setting:
A simplified model of packets flowing through a
computer network.
Assume a network graph with capacities on the
edges (Like a flow problem).

26
Congestion Control

Flows have a fixed unchangeable single path.
Vertices that get more flow than they can send
out must dump some.

27
Congestion Control

Policy of the vertices:
Distribute the capacity of an edge equally
between flows.
If some flow does not use its full share,
distribute the remainder evenly among the others.
Similar to the fair-queuing strategy in the
Internet.
Maximizes the minimal flow.
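
The sharing policy above, for a single edge, can be sketched as a progressive-filling loop. This is an illustrative sketch, not the talk's model of a whole network: the function name `fair_share` and the representation of flows as a list of demanded rates are assumptions.

```python
from typing import List


def fair_share(capacity: float, demands: List[float]) -> List[float]:
    """Split one edge's capacity equally, handing unused share to the other flows."""
    alloc = [0.0] * len(demands)
    active = [i for i, d in enumerate(demands) if d > 0]
    remaining = float(capacity)
    while active and remaining > 0:
        share = remaining / len(active)
        # Flows whose outstanding demand fits within an equal share are fully served.
        satisfied = [i for i in active if demands[i] - alloc[i] <= share]
        if not satisfied:
            # Every remaining flow wants more than its share: split evenly and stop.
            for i in active:
                alloc[i] += share
            break
        for i in satisfied:
            remaining -= demands[i] - alloc[i]
            alloc[i] = float(demands[i])
        # What the satisfied flows left unused is re-divided in the next pass.
        active = [i for i in active if i not in satisfied]
    return alloc
```

With capacity 10 and demands 2, 8, 8, the first flow takes only 2 of its equal share and the remaining 8 units are split evenly, giving 2, 4, 4.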

28
Congestion Control Game

Each flow is a player.
Utility of a player: how much he manages to send
through.
Each player decides alone how much to send
through the network.
Players do not know the structure of the network.
They only know how much of their flow goes
through, or if there is free capacity.

29
Congestion Control

Best-reply strategy:
If there is free capacity, increase your flow.
If you lose some of your flow, decrease your flow.
(This is tie breaking between outcomes with equal
payoff.)
Theorem: Congestion control is
universally-max-solvable.
Natural penalty: everyone sends full flows.
We thus have:
A Pareto optimal pure Nash that maximizes min
flow.
Fast convergence.
Incentive compatibility of following best-reply.

30
Stable Roommates

A set of college students needs to be paired up
to share dorm rooms.
Each student has strict preferences over the
other students (these are private).
We allow students to announce a single person
they want to pair up with.

31
Stable Roommates Game

A player gets the utility associated with the
roommate he selected if
that roommate selected him
that roommate would prefer him over his current
selection.
Nash equilibria in this game are stable matchings.
There may be several.

32
Stable Roommates

The mechanism:
Allow students to iteratively update their
selection.
Stop once students no longer change.
If after a while players do not stop, match no
one.

33
Preference Cycles

Definition: A preference cycle is a cycle of
players such that each player prefers the
following player more than the previous player.
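
A hedged sketch of the definition as a check on one given cycle. It assumes preferences are stored as lists ordered from most to least preferred and that a cycle has at least three distinct players; the function name and the three-student example are hypothetical.

```python
from typing import Dict, List, Sequence


def is_preference_cycle(cycle: Sequence[str], prefs: Dict[str, List[str]]) -> bool:
    """True if every player in `cycle` prefers the next player to the previous one.

    prefs[p] lists the other players from most preferred to least preferred.
    """
    k = len(cycle)
    if k < 3 or len(set(cycle)) != k:
        return False
    for idx, player in enumerate(cycle):
        following = cycle[(idx + 1) % k]
        previous = cycle[idx - 1]  # index -1 wraps around to the last player
        ranking = prefs[player]
        # A smaller index means more preferred.
        if ranking.index(following) >= ranking.index(previous):
            return False
    return True


# Hypothetical strict preferences of three students.
PREFS = {
    "a": ["b", "c"],
    "b": ["c", "a"],
    "c": ["a", "b"],
}
```

Here a prefers b to c, b prefers c to a, and c prefers a to b, so the order a, b, c is a preference cycle, while the reversed order is not.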

34
Stable Roommates

Theorem: A roommate matching game that has no
preference cycle is a universally-max-solvable
game.
Example of no preference cycle: bipartite graphs
with an agreed preference (med. students and
hospitals).
Therefore, for no-preference-cycle cases:
There is a unique stable matching.
Best-reply converges to it (asynchronously) and
quickly.
The mechanism we offered is incentive compatible.

35
Thanks!
