On the Convergence of Regret Minimizing Dynamics in Concave Games

1
On the Convergence of Regret Minimizing Dynamics
in Concave Games
Uri Nadav, Tel Aviv University, Tel Aviv, Israel
Joint work with Eyal Even-Dar and Yishay Mansour
Microsoft Research, Cambridge UK, March 26, 2009
2
Nash Equilibrium
  • Nash equilibrium is a steady state of the game
  • No player has an incentive to unilaterally
    deviate from his state

[Figure: two-player game matrix, Player I vs. Player II]
  • Existence (pure strategy)
  • Uniqueness
  • Quality
  • Price of Anarchy/Stability
  • Dynamics: reaching an equilibrium

3
Dynamics
[Figure: actions of Player 1 and Player 2 on Days 1 to 5]
4
Example Dynamics
  • Best Response
  • On each day adjust to other players
  • Ignore the fact that they also adjust

[Figure: best-response play of Player 1 and Player 2 on Days 1 to 5]
Unfortunately, best response does not always converge to
an equilibrium
5
No External Regret
A procedure is without external regret if, for
every sequence, the external regret is sublinear
in T
  • No single action significantly outperforms the
    dynamics
  • Define regret in T time steps as

$\mathrm{Regret}_{\mathrm{Alg}}(T) = (\text{total cost of Alg}) - (\text{total cost of best fixed row in hindsight})$
  • Many (different) algorithms can guarantee this:
    [Hannan 57], [Blackwell 56], [Banos 68], [Megiddo
    80], [Fudenberg, Levine 94], [Auer et al. 95]
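
The regret definition above can be checked numerically. The sketch below is an illustration, not taken from the talk: it assumes the Hedge (multiplicative weights) rule as one representative no-regret procedure, costs in [0, 1] drawn from a seeded random generator, and a known horizon T. The names `hedge_regret`, `costs`, and `eta` are hypothetical.

```python
import math
import random


def hedge_regret(costs):
    """Run Hedge (multiplicative weights) on a T x N cost table with entries in [0, 1].

    Returns (total expected cost of the algorithm, total cost of the best fixed action).
    """
    T, N = len(costs), len(costs[0])
    eta = math.sqrt(8.0 * math.log(N) / T)  # standard tuning for a known horizon T
    weights = [1.0] * N
    alg_cost = 0.0
    for row in costs:
        total = sum(weights)
        probs = [w / total for w in weights]
        # Expected cost of the mixed action played on this day.
        alg_cost += sum(p * c for p, c in zip(probs, row))
        # Exponentially down-weight the actions that were costly today.
        weights = [w * math.exp(-eta * c) for w, c in zip(weights, row)]
    best_fixed = min(sum(row[a] for row in costs) for a in range(N))
    return alg_cost, best_fixed


rng = random.Random(0)  # arbitrary cost sequence, seeded for repeatability
T, N = 2000, 4
costs = [[rng.random() for _ in range(N)] for _ in range(T)]
alg_cost, best_fixed = hedge_regret(costs)
regret = alg_cost - best_fixed
print(f"regret = {regret:.2f}, regret / T = {regret / T:.4f}")
```

With this step size the regret is at most sqrt((T / 2) ln N) for any cost sequence, so regret / T shrinks as T grows, which is the sublinearity the slide asks for.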

6
Our Main Result
  • If each player uses a procedure without regret in
    some class of interesting games (socially concave
    games), then their joint play converges to Nash
    equilibrium

Examples: Selfish Routing, Resource Allocation,
Cournot Oligopoly, TCP Congestion Control
7
Cournot Oligopoly [Cournot 1838]
  • Firms select production level (supply)
  • Market price depends on total supply
  • Firms maximize their profit = revenue - cost

[Figure: market price P as a function of overall quantity (X and Y), with firm costs Cost1(X) and Cost2(Y)]
We will show that no-regret dynamics converge to NE
for any number of players
  • Best response dynamics
  • Converges for 2 players
  • Diverges for n ≥ 5 [Theocharis 1960]
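
The contrast between 2 and 5 players can be reproduced with a small simulation. This is a sketch under stated assumptions, not the talk's own experiment: a linear inverse demand a - b * Q, identical unit cost c for every firm, a symmetric starting point, and simultaneous updates. Quantities are clamped at zero, so the divergence shows up as a persistent oscillation rather than unbounded growth. All names and parameter values are hypothetical.

```python
def best_response_dynamics(n, days, a=10.0, b=1.0, c=1.0):
    """Simultaneous best response in a linear Cournot market.

    Price is a - b * (total supply); every firm has unit cost c.
    Returns the largest distance from the Nash quantity on each day.
    """
    x_star = (a - c) / (b * (n + 1))      # symmetric Nash equilibrium quantity
    x = [0.5 * x_star] * n                # start below the equilibrium
    dist = []
    for _ in range(days):
        total = sum(x)
        # Each firm maximizes x_i * (a - b * (others + x_i)) - c * x_i,
        # treating the others' supply from the previous day as fixed.
        x = [max(0.0, (a - c - b * (total - xi)) / (2.0 * b)) for xi in x]
        dist.append(max(abs(xi - x_star) for xi in x))
    return dist


dist2 = best_response_dynamics(n=2, days=30)
dist5 = best_response_dynamics(n=5, days=30)
print("n = 2, distance on last day:", dist2[-1])
print("n = 5, distances on last days:", dist5[-4:])
```

In this model the distance halves every day for two firms, while for five firms the supplies keep jumping between zero and an overshoot.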

8
Resource Allocation Games
We can show that the best response dynamics
generally diverges for linear resource allocation
games
  • Equilibrium
  • Existence, uniqueness [Hajek, Gopalakrishnan]
  • Efficiency loss (POA): 3/4 [Johari, Tsitsiklis]
  • Advertisers set budgets

[Figure: advertiser budgets of 5M, 10M, 17M, and 25M]
  • Each advertiser wins a proportional market share

Allocated rate of the 25M advertiser: $s = \frac{25}{5 + 10 + 17 + 25}$
  • Utility
  • Concave utility from allocated rate
  • Quasi-linear with money

9
Routing Games
  • Atomic
  • Splittable flows

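  • $\mathrm{Cost}_i = \sum_{p \in (s_i, t_i)} \mathrm{Latency}(p) \cdot \mathrm{flow}_i(p)$

[Figure: player 1 routes flow f1 from s1 to t1, split into f1,L and f1,R; player 2 routes flow f2 from s2 to t2, split into f2,T and f2,B; edge e carries both f1,L and f2,T]
Latency on edge e: $L_e(f_{1,L} + f_{2,T})$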
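
The cost formula can be evaluated on a toy instance. The sketch below assumes a hypothetical three-edge network in which path L of player 1 and path T of player 2 share edge "e", with a linear latency on "e" and constant latencies elsewhere; none of these numbers come from the talk.

```python
def player_costs(paths, flows, latency):
    """Cost of each player in an atomic splittable routing game.

    paths[i][p]  : list of edges on path p of player i
    flows[i][p]  : amount of flow player i sends on path p
    latency[e]   : function mapping the total flow on edge e to its latency
    """
    # Total flow on every edge, summed over all players and paths.
    edge_flow = {e: 0.0 for e in latency}
    for i, player_paths in paths.items():
        for p, edges in player_paths.items():
            for e in edges:
                edge_flow[e] += flows[i][p]

    costs = {}
    for i, player_paths in paths.items():
        # Sum over the player's paths of (path latency) * (own flow on the path).
        costs[i] = sum(
            sum(latency[e](edge_flow[e]) for e in edges) * flows[i][p]
            for p, edges in player_paths.items()
        )
    return costs


# Hypothetical instance: both players can use the shared edge "e".
latency = {"e": lambda f: f, "r": lambda f: 1.0, "b": lambda f: 1.0}
paths = {1: {"L": ["e"], "R": ["r"]}, 2: {"T": ["e"], "B": ["b"]}}
flows = {1: {"L": 0.5, "R": 0.5}, 2: {"T": 0.5, "B": 0.5}}
costs = player_costs(paths, flows, latency)
print(costs)
```

Because the latency on "e" depends on the sum of both players' flows, each player's cost depends on the other's routing choice, which is what makes this a game.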
10
Socially Concave Games
  • Closed convex strategy set
  • A (weighted) social welfare is concave:
    there exist $\lambda_1, \dots, \lambda_n > 0$ such that
    $\lambda_1 u_1(x) + \lambda_2 u_2(x) + \dots + \lambda_n u_n(x)$ is concave
  • The utility of a player is convex in the vector
    of actions of other players

Zero-sum games ⊂ socially concave games
  • Some socially concave games
  • Subclasses of Cournot competition, resource
    allocation, selfish routing, TCP congestion
    control (near equilibrium)
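
As a worked instance of the two conditions, consider a Cournot market with linear price. This is an illustrative special case with assumed notation (inverse demand $a - bQ$ with $b > 0$, total supply $Q = \sum_j x_j$, unit costs $c_i$), taking all weights $\lambda_i = 1$:

```latex
\begin{align*}
u_i(x) &= x_i\Bigl(a - b\sum_{j} x_j\Bigr) - c_i x_i
  && \text{affine, hence convex, in } x_{-i} \text{ for fixed } x_i \ge 0,\\
\sum_{i} u_i(x) &= aQ - bQ^2 - \sum_i c_i x_i
  && \text{concave in } x, \text{ since } Q^2 \text{ is convex in } x \text{ and } b > 0.
\end{align*}
```

Both conditions hold, so this particular market is socially concave.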

11
Our Main Result
If each player uses a procedure without regret
in a socially concave game, then their joint play
converges to Nash equilibrium
  • The average action profile converges to NE

[Figure: actions of Player 1, Player 2, ..., Player n on Days 1, 2, 3, ..., T; the average over days 1..T is an ε(T)-Nash equilibrium]
  • The average daily payoff of each player converges
    to her payoff in NE
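
The convergence of the average action profile can be observed in simulation. The sketch below is an illustration under assumptions, not the talk's experiment: a linear Cournot market with five identical firms, and projected online gradient ascent with step size proportional to 1 / sqrt(t) as the no-regret procedure (the talk cites gradient-based dynamics from Zinkevich later on). All parameter values and names are hypothetical.

```python
import math


def gradient_play(n=5, days=2000, a=10.0, b=1.0, c=1.0, step=0.2):
    """Each firm runs projected online gradient ascent on its own profit.

    Profit of firm i: x_i * (a - b * sum(x)) - c * x_i, with x_i in [0, a / b].
    Returns (profile after the last update, average profile over all days).
    """
    x = [0.6 * k for k in range(n)]       # arbitrary, asymmetric starting point
    running_sum = [0.0] * n
    for t in range(1, days + 1):
        running_sum = [s + xi for s, xi in zip(running_sum, x)]
        total = sum(x)
        eta = step / math.sqrt(t)         # decreasing step size
        # d(profit_i) / d(x_i) = a - c - b * total - b * x_i
        x = [min(a / b, max(0.0, xi + eta * (a - c - b * total - b * xi)))
             for xi in x]
    average = [s / days for s in running_sum]
    return x, average


n, a, b, c = 5, 10.0, 1.0, 1.0
last, average = gradient_play(n=n, a=a, b=b, c=c)
x_star = (a - c) / (b * (n + 1))          # Nash quantity of this symmetric market
print("Nash quantity:", x_star)
print("average play :", [round(v, 3) for v in average])
```

In this instance the daily play itself also settles near the equilibrium; the theorem on the slide only promises convergence of the average.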

12
Convergence to NE Proof Outline
Definition of ε-Nash equilibrium
  • Goal: Show that for every player, the utility
    from the average action profile approaches the
    utility of playing best response to the average

(Utility of player i at the average) ≥ (Utility of i playing best response to the average) - ε

13
Convergence to NE Proof Outline
  • Upper bound on the utility of the average action
    profile

For each player i, by the definition of best response:
[Equation lost in extraction; its labeled terms were "sum of utilities" and "utility of average action profile"]
14
Convergence to NE Proof Outline
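  • Lower bound on the sum of average utilities

By assumption, there exist $\lambda_1, \dots, \lambda_n$ such that $\sum_i \lambda_i u_i$ is concave
[Equation lost in extraction; its labeled terms were "utility of average action profile" and "(average) sum of utilities"]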
15
Convergence to NE Proof Outline
  • Punch line

[Equation lost in extraction; it relates the upper bound, the lower bound, and the average regret]
Q.E.D.
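
The formulas on the proof-outline slides did not survive in this transcript. The derivation below is a reconstruction of one standard argument that fits the slide labels, not a copy of the speaker's equations. Assumed notation: $x^t$ is the action profile on day $t$, $\bar{x} = \frac{1}{T}\sum_t x^t$ is the average profile, $b_i$ is a best response of player $i$ to $\bar{x}_{-i}$, and $R_i(T) = \max_y \sum_t u_i(y, x^t_{-i}) - \sum_t u_i(x^t)$ is player $i$'s regret stated in utilities rather than costs.

```latex
\begin{align*}
u_i(b_i, \bar{x}_{-i})
  &\le \frac{1}{T}\sum_{t=1}^{T} u_i(b_i, x^t_{-i})
  && \text{($u_i$ is convex in $x_{-i}$)}\\
  &\le \frac{1}{T}\sum_{t=1}^{T} u_i(x^t) + \frac{R_i(T)}{T}
  && \text{(definition of regret)}\\[4pt]
\sum_i \lambda_i u_i(\bar{x})
  &\ge \frac{1}{T}\sum_{t=1}^{T}\sum_i \lambda_i u_i(x^t)
  && \text{($\textstyle\sum_i \lambda_i u_i$ is concave)}\\[4pt]
0 \le \sum_i \lambda_i\bigl[u_i(b_i, \bar{x}_{-i}) - u_i(\bar{x})\bigr]
  &\le \frac{1}{T}\sum_i \lambda_i R_i(T)
  && \text{(weighted sum of the first bound minus the second)}
\end{align*}
```

Every bracket on the left is nonnegative because $b_i$ is a best response, so each player gains at most $\varepsilon(T) = \max_i \frac{1}{\lambda_i T}\sum_j \lambda_j R_j(T)$ by deviating from $\bar{x}$. Sublinear regret makes $\varepsilon(T)$ vanish as $T$ grows.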
16
Convergence in Almost Socially Concave Games
  • TCP game is a concave game
  • [Karp, Koutsoupias, Papadimitriou, Shenker]
  • And the weighted social welfare is concave
  • But the utility of player i is not convex in the
    entire strategy space of the other players

Therefore, the convergence theorem cannot be
directly applied
Playing gradient-based dynamics guarantees no
regret in concave decision making [Zinkevich]
Playing gradient-based dynamics guarantees
playing in a socially concave zone
17
Regret Minimization and Equilibrium
  • Zero sum game
  • Guarantee at least min-max value
  • Correlated equilibrium
  • Internal/Swap regret dynamics converge to it
    [Foster, Vohra], [Hart, Mas-Colell], [Blum,
    Mansour]
  • Specific games
  • Routing [Blum, Even-Dar, Ligett]
  • Price of Total Anarchy [Blum, Ligett, Hajiaghayi,
    Roth]

18
Ongoing Research I
  • We studied the allocation of a single link
  • Extend to general resource allocation games
  • A set of resources
  • Players buy a path (subset of resources)
  • Resource allocation in parallel edges is socially
    concave
  • An equilibrium does not necessarily exist in
    general networks
  • Always exists in the [Johari, Tsitsiklis] extended
    game
  • Not socially concave

19
Ongoing Research II
  • Resource Allocation Game
  • Players act as price anticipators
  • Resource Allocation Market
  • Players act as price takers
  • Efficient competitive equilibrium exists (price
    bids) [Kelly]
  • Continuous time algorithms converge to
    equilibrium [Kelly et al.]
  • No regret
  • Players have no regret if they believe that they
    don't influence the market price
  • Simulation results: fast convergence to market
    equilibrium

20
Other stuff I work on
21
Thank you!
  • Questions, Comments?

22
TCP Congestion Control [KKPS 01]
Fraction of good-put determined by router policy
User action: push flow $f_i$
[Figure: channel carrying flow $f_i$, split into good-put $g_i$ and loss $l_i$]
Good-put $g_i$: fraction of $f_i$ forwarded
Loss $l_i$: fraction of $f_i$ discarded
User utility: $u_i = g_i - \alpha_i l_i$
$\alpha_i$: cost associated with lost flow (retransmission,
lost bandwidth, utilization)
23
Router Policy
  • Flow models
  • Random Early Discard (RED)
  • Number of dropped packets increases as queue grows
  • Tail Drop
  • Drop an incoming packet when out of space

Amount of flow to discard depends on the total
amount of flow
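
A fluid version of tail drop makes the dependence on total flow concrete. The sketch below is an assumption-laden illustration, not the model from the cited work: it assumes a single link of fixed capacity, excess flow discarded in proportion to each user's flow, and a per-user weight on lost flow in the utility. The names `tail_drop_utilities`, `capacity`, and `loss_weights` are hypothetical.

```python
def tail_drop_utilities(flows, capacity, loss_weights):
    """Fluid tail-drop model: flow beyond capacity is discarded proportionally.

    Returns one (goodput, loss, utility) triple per user, where
    utility = goodput - loss_weight * loss.
    """
    total = sum(flows)
    # Fraction of every user's flow that the router forwards.
    forwarded = 1.0 if total <= capacity else capacity / total
    result = []
    for flow, weight in zip(flows, loss_weights):
        goodput = flow * forwarded
        loss = flow - goodput
        result.append((goodput, loss, goodput - weight * loss))
    return result


congested = tail_drop_utilities([2.0, 3.0, 5.0], capacity=5.0,
                                loss_weights=[0.5, 0.5, 0.5])
idle = tail_drop_utilities([1.0, 1.0], capacity=5.0, loss_weights=[0.5, 0.5])
print(congested)
print(idle)
```

Once the link is saturated, pushing more flow raises a user's loss along with everyone else's, which is the congestion externality the game captures.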
24
Resource Allocation Games
We can show that the best response dynamics
generally diverges for linear resource allocation
games
  • Equilibrium
  • Existence, uniqueness [Hajek, Gopalakrishnan]
  • Efficiency loss (POA): 3/4 [Johari, Tsitsiklis]
  • Users choose payment per unit time

[Figure: payments of 5M, 10M, 17M, and 25M]
  • Users are allocated rate proportionally

Allocated rate of the user paying 25M: $s = \frac{25}{5 + 10 + 17 + 25}$
  • Utility
  • Concave utility from allocated rate
  • Quasi-linear with money

25
Nash Equilibrium
  • Nash equilibrium is a steady state of the game
  • No player has an incentive to unilaterally
    deviate from his state

[Figure: two-player game matrix, Player I vs. Player II, with each player's two actions annotated with ½]
  • Existence (pure strategy)
  • Uniqueness
  • Quality
  • Price of Anarchy/Stability
  • Dynamics: reaching an equilibrium