Analyzing Complex Strategic Interactions in Multi-Agent Systems. Walsh, Das, Tesauro, Kephart. In Proceedings of the Workshop on Game Theoretic and Decision Theoretic Agents (AAAI 2002)


1
Analyzing Complex Strategic Interactions in
Multi-Agent Systems. Walsh, Das, Tesauro,
Kephart. In Proceedings of the Workshop on Game
Theoretic and Decision Theoretic Agents (AAAI
2002)
  • Presented by Deniz Sarioz in Simon Parsons'
    e-commerce seminar, CUNY Grad Center,
    April 19th 2005.

2
Premise
  • Understanding interactions among various
    strategies can be valuable both to designers of
    markets (wishing to ensure economic efficiency
    and stability) and to designers of individual
    agents (wishing to maximize profits).
  • By demystifying strategic interactions among
    agents, we can improve our ability to predict
    (and design) the overall behavior of multi-agent
    systems.
  • Asking which strategy is best is often not the
    most appropriate question: a mix of strategies
    can be an equilibrium.
  • The tournament approach has the shortcoming of
    being one trajectory through an infinite space of
    possible interactions.
  • Authors present a more principled and complete
    method for analyzing interactions among
    strategies.

3
Modeling Approach
  • Start with a game that may include complex,
    repeated interactions between A agents.
  • The rules specify particular actions that agents
    may take as a function of the game state (e.g.,
    bid b at time t.)
  • Each of the agents has a choice among S
    exogenously specified, heuristic strategies.
  • Strategies are heuristic in that they are
    generally not the solution of a Bayes-Nash
    equilibrium analysis.
  • Compute a heuristic-payoff table that specifies
    the expected payoff to each agent as a function
    of the strategies played by all agents.
  • Agent strategies are selected independently from
    agent type (an assumption to help with
    tractability).

4
Modeling Approach
  • The heuristic-payoff table is an abstract
    representation of the fundamental game that
    reduces a potentially very complex game to a
    one-shot game in normal form.
  • Treat the choice of heuristic strategies rather
    than basic actions as the level of decision
    making for strategic analysis.
  • A standard payoff table for a normal-form game
    requires $S^A$ entries, which can be huge even
    for moderate S, A.
  • e.g., 3 agents X Y Z and 4 strategies S T U V
    need cells in the table for (XS, YS, ZS), (XS,
    YS, ZT), (XS, YS, ZU), ...
  • Because all agents draw their types from the same
    distribution, we can express the payoff of each
    strategy as a function of the number of agents
    playing each strategy, so that
    (XS, YS, ZT) ≡ (XT, YS, ZS).

5
Modeling Approach
  • What does this combinatorial reduction buy us?
  • Number of entries in the table is the number of
    unique ways A agents can be partitioned into S
    strategies.
  • $\binom{A+S-1}{A} = \frac{(A+S-1)(A+S-2) \cdots (A+1)}{(S-1)!}$
  • which is bounded below by, and is approximately
    (for $A \gg S$), $A^{S-1} / (S-1)!$
  • This is fine for situations where even a lot of
    agents get to pick from / switch among a few
    strategies.
  • For $A = 20$ and $S = 3$, the symmetric payoff
    table contains 231 entries, rather than the
    $3^{20} \approx 3.5 \times 10^9$ entries of the
    asymmetric table.
  • For sufficiently simple games, payoffs may be
    computed analytically. Complexity usually
    demands simulations.
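The counts on this slide can be checked with a few lines of Python. This is only an illustrative sketch; the function names are made up for this example and do not come from the paper.

```python
from math import comb


def symmetric_table_size(num_agents: int, num_strategies: int) -> int:
    """Number of ways to split num_agents among num_strategies: C(A+S-1, A)."""
    return comb(num_agents + num_strategies - 1, num_agents)


def asymmetric_table_size(num_agents: int, num_strategies: int) -> int:
    """Number of cells in the full normal-form table: S^A."""
    return num_strategies ** num_agents


def strategy_counts(num_agents: int, num_strategies: int):
    """Yield every tuple (n_1, ..., n_S) of non-negative counts summing to A."""
    if num_strategies == 1:
        yield (num_agents,)
        return
    for first in range(num_agents + 1):
        for rest in strategy_counts(num_agents - first, num_strategies - 1):
            yield (first,) + rest


print(symmetric_table_size(20, 3))         # 231
print(asymmetric_table_size(20, 3))        # 3486784401
print(len(list(strategy_counts(20, 3))))   # 231
```

Each tuple yielded by `strategy_counts` corresponds to one row of the symmetric heuristic-payoff table.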

6
Modeling Approach
  • Once a payoff table is computed, a variety of
    techniques can be used to analyze it. The paper
    uses 3 such techniques:
  • A static analysis, which entails computing Nash
    equilibria of the payoff table.
  • A dynamic analysis, which models agents that
    switch to strategies that appear more successful.
  • Techniques for understanding strategies at a
    deeper level, specifically perturbation
    analysis.

7
Equilibrium Computation
  • At the start of the game, each of the A agents
    chooses one of the S strategies
  • The payoff to agent i is a real function u of the
    strategies played by all agents.
  • Payoff is the expected reward
  • We assume symmetric strategy sets and payoffs, so
    payoff to an agent can be represented as the
    payoff to each strategy as a function of the
    number of agents playing each strategy.

8
Equilibrium Computation
  • Agent i may choose its strategy randomly
    according to a mixed strategy
  • $p_i = (p_{i,1}, p_{i,2}, \ldots, p_{i,S})$
  • where $p_{i,j}$ is the probability of agent i
    playing strategy j.
  • The (AS-dim) vector of all agents' mixed
    strategies is $p$, and the ((AS-S)-dim) vector of
    mixed strategies for all agents except i is
    $p_{-i}$.
  • $p_i = e^j$ denotes the special case of
    $p_{i,j} = 1$.
  • $u(e^j, p_{-i})$ denotes the payoff to agent i for
    playing pure strategy j while the others play a
    mixed strategy.
  • For mixed $p_i$ we have
    $u(p_i, p_{-i}) = \sum_{j=1}^{S} u(e^j, p_{-i}) \, p_{i,j}$

9
Equilibrium Computation
  • In game theoretic analysis, we usually assume
    everyone plays mixed Nash equilibrium strategies,
    i.e., no one can receive a higher payoff by
    unilaterally deviating to another strategy.
    Formally, $p^*$ is a Nash equilibrium iff, for
    every agent i and every mixed strategy $p_i$,
  • $u(p_i^*, p_{-i}^*) \geq u(p_i, p_{-i}^*)$
  • The rest of the paper focuses specifically on
    symmetric mixed strategy equilibria, that is,
    for all agents i, k: $p_i = p_k = p$
  • Denote an arbitrary (not necessarily equilibrium)
    symmetric mixed strategy by $p$ and the
    probability that a given agent plays pure
    strategy j by $p_j$
  • The symmetry assumption simplifies the analysis
    from A S-tuples of probabilities to
    only one such S-tuple.
  • Justification: a symmetric game (to be defined
    later) always has at least one (usually
    non-unique) symmetric Nash equilibrium.

10
Equilibrium Computation
  • There are several equivalent formulations of the
    concept of Nash equilibrium, but most do not
    exploit symmetry, which severely restricts the
    size of the problems that can be solved.
  • Formulate Nash equilibrium as the minimum of a
    function on a polytope.
  • Considering only symmetric equilibria, the
    problem is to find the $p$ that minimizes $v(p)$;
    $p$ is a Nash equilibrium iff it is a global
    minimum of $v$:
  • $v(p) = \sum_{j=1}^{S} \left( \max[\, u(e^j, p) - u(p, p),\ 0 \,] \right)^2$
  • Restatement of $v(p)$ in pseudo-C, by way of
    derivation:
  • sum over j = 1..S of
    (u(e^j, p) > u(p, p)) ? (u(e^j, p) - u(p, p))^2 : 0
  • The polytope is just the (S-1)-simplex in S
    dimensions defined by the constraints
    $\sum_{j=1}^{S} p_j = 1$ and $p_j \geq 0$ for all j
  • e.g., if we have 3 strategies, look for
    equilibria on a triangle (interior and boundary),
    which is a bounded subset of 2 dimensional space.
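As a rough illustration of the objective, the sketch below evaluates v(p) for a symmetric game stored as a table keyed by strategy counts. The table layout, the function names, and the two-agent Hawk-Dove payoffs are assumptions made up for this example; they are not taken from the paper. The expected payoff u(e^j, p) is computed by summing over the multinomial distribution of what the other A-1 agents play.

```python
from math import factorial


def compositions(total, parts):
    """Yield all tuples of `parts` non-negative ints that sum to `total`."""
    if parts == 1:
        yield (total,)
        return
    for first in range(total + 1):
        for rest in compositions(total - first, parts - 1):
            yield (first,) + rest


def multinomial_prob(counts, p):
    """Probability that independent draws from p produce exactly `counts`."""
    coeff = factorial(sum(counts))
    prob = 1.0
    for n, pj in zip(counts, p):
        coeff //= factorial(n)
        prob *= pj ** n
    return coeff * prob


def pure_payoff(j, p, table, num_agents):
    """u(e^j, p): expected payoff for playing j while the others each play p."""
    total = 0.0
    for others in compositions(num_agents - 1, len(p)):
        counts = list(others)
        counts[j] += 1  # add the deviating agent itself
        total += multinomial_prob(others, p) * table[tuple(counts)][j]
    return total


def nash_residual(p, table, num_agents):
    """v(p): sum of squared positive gains from deviating to a pure strategy."""
    pure = [pure_payoff(j, p, table, num_agents) for j in range(len(p))]
    average = sum(pj * uj for pj, uj in zip(p, pure))  # u(p, p)
    return sum(max(uj - average, 0.0) ** 2 for uj in pure)


# Hypothetical 2-agent Hawk-Dove game (strategy 0 = hawk, 1 = dove).
# Key: (number of hawks, number of doves); value: payoff per strategy.
hawk_dove_table = {(2, 0): (-1.0, 0.0), (1, 1): (2.0, 0.0), (0, 2): (0.0, 1.0)}

print(nash_residual((0.5, 0.5), hawk_dove_table, 2))  # zero at the mixed equilibrium
print(nash_residual((1.0, 0.0), hawk_dove_table, 2))  # positive: dove is a better reply
```

A nonlinear optimizer would then search the simplex for points where `nash_residual` reaches zero.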

11
Equilibrium Computation
  • Used a software package called amoeba, a
    non-linear optimizer, to find the minimum of the
    stated function on the S-simplex.
  • The function is evaluated at each vertex of the
    simplex and the polytope attempts to move down
    the estimated gradient by a series of geometric
    transformations that strive to replace the
    worst-performing vertex.
  • The authors ran amoeba repeatedly, restarting at
    random points on the S-simplex, and stopped once
    it had returned 30 previously discovered
    equilibria in a row.
  • For $A = 20$, $S = 3$, this took 10 minutes on a
    450MHz machine.

12
Dynamic Analysis
  • Nash equilibria provide a theoretically
    satisfying view of the ideal static properties of
    a multi-agent system.
  • Dynamic properties may be of greater or equal
    concern.
  • In actual systems, may be unreasonable to assume
    that agents all have correct and common knowledge
    necessary to compute equilibria.
  • Borrow a well-developed model from evolutionary
    game theory (Weibull 1995) to analyze strategy
    choice dynamics.
  • Posit a very large population of N agents, of
    which A agents ($A \ll N$) are randomly chosen to
    play at each tick. Each agent plays one of the S
    pure strategies, and the fraction of agents
    playing strategy j is $p_j$
  • For sufficiently large N, $p_j$ may be treated as
    a continuous variable

13
Dynamic Analysis
  • Use the replicator dynamics formalism to model
    the evolution of p with time as
  • $\dot{p}_j = [\, u(e^j, p) - u(p, p) \,] \, p_j$
  • where $u(p, p)$ is the population average payoff
    and
  • where $u(e^j, p)$ is the average payoff to agents
    currently using pure strategy j,
  • and $\dot{p}_j$ is the time derivative of $p_j$,
    i.e., the rate at which the jth component of p
    changes.
  • This equation models the tendency of strategies
    with greater than average payoff to attract
    followers and those with less than average payoff
    to suffer defections.
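The replicator equation can be integrated numerically with a simple Euler step. The sketch below is only an illustration: the step size, the function names, and the two-strategy Hawk-Dove payoffs are assumptions chosen for brevity, and the payoffs come from a two-player matrix rather than from a heuristic-payoff table.

```python
def replicator_step(p, pure_payoffs, dt=0.01):
    """One Euler step of dp_j/dt = [u(e^j, p) - u(p, p)] * p_j."""
    u = pure_payoffs(p)
    average = sum(pj * uj for pj, uj in zip(p, u))  # u(p, p)
    new_p = [pj + dt * (uj - average) * pj for pj, uj in zip(p, u)]
    total = sum(new_p)  # renormalize to guard against floating-point drift
    return [x / total for x in new_p]


def trajectory(p0, pure_payoffs, steps=5000, dt=0.01):
    """Population states visited when starting from p0."""
    path = [list(p0)]
    for _ in range(steps):
        path.append(replicator_step(path[-1], pure_payoffs, dt))
    return path


def hawk_dove_payoffs(p):
    """u(e^j, p) for a hypothetical Hawk-Dove matrix (row j against column k)."""
    matrix = [[-1.0, 2.0],   # hawk vs hawk, hawk vs dove
              [0.0, 1.0]]    # dove vs hawk, dove vs dove
    return [sum(m * pk for m, pk in zip(row, p)) for row in matrix]


final = trajectory([0.9, 0.1], hawk_dove_payoffs)[-1]
print(final)  # approaches the mixed equilibrium at (0.5, 0.5)
```

In this example the hawk share falls from 0.9 toward 0.5, the point at which both strategies earn the population average.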

14
Dynamic Analysis
  • Prefer that a dynamic model assume minimal
    informational requirements for agents beyond
    their own actions and payoffs.
  • The replicator dynamics eq implies that agents
    know u(p, p) and that is a very implausible
    assumption.
  • However, can obtain the same population dynamics
    with a replication by imitation model (Weibull
    1995)
  • An agent switches to the strategy of a randomly
    chosen opponent who appears to be receiving a
    higher payoff.
  • Interpret p at any given time as representing a
    symmetric mixed strategy for all N players in the
    game. Then the fixed points of equation 2
    (where $\dot{p}_j = 0$ for all j) correspond to
    Nash equilibria.
  • When strategy trajectories governed by equation 2
    converge to an equilibrium, the equilibrium is an
    attractor.
  • However, trajectories do not necessarily converge.

15
Dynamic Analysis
  • Two ways of thinking about this:
  • When multiple Nash equilibria exist, those that
    are attractors are the only plausible equilibria
    within the evolutionary model. Those with larger
    basins of attraction are more likely, assuming
    that every initial population state is equally
    likely.
  • We can use the basins of attraction to understand
    which initial population mixes will lead to which
    equilibrium.
  • Use the heuristic payoff table and the equation
    for $\dot{p}$ to generate a large number of
    strategy trajectories, starting from a broad
    distribution of initial strategy vectors p
  • For a choice of 3 strategies, the resulting flows
    can be plotted in a 2-D unit simplex and have an
    immediate visual interpretation.
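A basin of attraction can be estimated by launching many trajectories and recording where each one ends up. The sketch below does this for a hypothetical two-strategy coordination game, so the "simplex" is just the interval [0, 1]; the payoff matrix, grid, and step counts are assumptions for illustration, not values from the paper.

```python
def replicator_step(p, pure_payoffs, dt=0.01):
    """One Euler step of the replicator dynamics."""
    u = pure_payoffs(p)
    average = sum(pj * uj for pj, uj in zip(p, u))
    new_p = [pj + dt * (uj - average) * pj for pj, uj in zip(p, u)]
    total = sum(new_p)
    return [x / total for x in new_p]


def coordination_payoffs(p):
    """Hypothetical coordination game: matching on strategy 0 pays 2, on 1 pays 1."""
    matrix = [[2.0, 0.0],
              [0.0, 1.0]]
    return [sum(m * pk for m, pk in zip(row, p)) for row in matrix]


def final_state(p0, pure_payoffs, steps=2000, dt=0.01):
    """Population state after following the dynamics from p0."""
    p = list(p0)
    for _ in range(steps):
        p = replicator_step(p, pure_payoffs, dt)
    return p


def basin_counts(grid_size=100):
    """Count grid starting points that end near each pure equilibrium."""
    to_strategy_0 = to_strategy_1 = 0
    for k in range(1, grid_size):
        share = k / grid_size  # initial fraction playing strategy 0
        end = final_state([share, 1.0 - share], coordination_payoffs)
        if end[0] > 0.5:
            to_strategy_0 += 1
        else:
            to_strategy_1 += 1
    return to_strategy_0, to_strategy_1


print(basin_counts())  # starts above a 1/3 share of strategy 0 flow to strategy 0
```

The unstable mixed equilibrium at a 1/3 share of strategy 0 separates the two basins, so roughly two thirds of evenly spaced starting points flow to the higher-paying equilibrium.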

16
Perturbation of Payoffs
  • We are only considering very few strategies,
    whereas in fact the strategy space is infinite
    and strategies can in practice be slightly
    modified.
  • Perturb the payoff table in some meaningful ways,
    in order to perform some directed study of
    plausible effects of abstract changes in strategy
    behavior.
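  • Key basic idea: the $n_+$ agents of strategy
    $\sigma_+$ steal some fraction $\alpha$ of the
    payoff of the $n_-$ agents of strategy $\sigma_-$
    in a way that preserves the total
  • $q(\sigma_+) \mathrel{+}= \alpha \min(n_+, n_-) \, q(\sigma_+) / n_+$
  • $q(\sigma_-) \mathrel{-}= \alpha \min(n_+, n_-) \, q(\sigma_-) / n_-$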
  • Alternative changes pick on just one strategy,
    uniform improvements to all strategies, random,
    etc.
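One possible reading of the perturbation rule, applied to a count-keyed payoff table, is sketched below. It assumes the garbled update rules on this slide mean that the favored strategy's payoff q grows by alpha * min(n+, n-) * q / n+ and the other strategy's payoff shrinks by alpha * min(n+, n-) * q / n-; the symbol alpha, the table layout, and the function name are assumptions for this example.

```python
def perturb_payoffs(table, plus, minus, alpha):
    """Return a new table in which strategy `plus` gains at the expense of `minus`.

    table maps a tuple of strategy counts to a tuple of per-strategy payoffs.
    Entries where either strategy has no agents are left unchanged.
    """
    perturbed = {}
    for counts, payoffs in table.items():
        n_plus, n_minus = counts[plus], counts[minus]
        q = list(payoffs)
        if n_plus > 0 and n_minus > 0:
            shift = alpha * min(n_plus, n_minus)
            q[plus] += shift * payoffs[plus] / n_plus
            q[minus] -= shift * payoffs[minus] / n_minus
        perturbed[counts] = tuple(q)
    return perturbed


# Hypothetical two-strategy, two-agent table.
example = {(1, 1): (2.0, 4.0), (2, 0): (1.0, 0.0), (0, 2): (0.0, 3.0)}
print(perturb_payoffs(example, plus=0, minus=1, alpha=0.1))
```

Re-running the equilibrium and dynamic analyses on the perturbed table for a range of alpha values shows how much improvement a strategy needs before the picture changes.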

17
Applications
  • Applied the methodology to
  • Automated Dynamic Pricing (ADP)
  • Continuous Double Auction (CDA)
  • Chosen because computationally intractable to
    compute equilibria in these.
  • Body of literature exists which includes
    interesting heuristic strategies.
  • Simulators are available for computing the
    heuristic payoff tables.

18
Automated Dynamic Pricing (description)
  • Recent emergence of shopbots, e.g., buy.com
    monitors and undercuts competitors' prices
  • In this experiment, sellers choose one of three
    heuristic dynamic pricing strategies:
  • GT (game theory; Greenwald & Kephart 1999):
    plays a mixed-strategy Nash equilibrium computed
    for the underlying game, assuming all pricing and
    purchasing decisions are made simultaneously.
  • DF (derivative follower): simple hill-climbing
    adaptation based solely on observed
    profitability, ignoring assumptions about other
    buyers and sellers.
  • NIR (No Internal Regret; Greenwald & Kephart
    1999): adapted from Foster & Vohra (1997).

19
Automated Dynamic Pricing (analysis)
  • Looked at two scenarios for these 3 strategies,
    with 5 agents and with 20 agents.
  • Only A is a pure-strategy Nash equilibrium
  • When number of agents increased to 20, number of
    equilibria drops to one!

20
Automated Dynamic Pricing (analysis)

21
Automated Dynamic Pricing (analysis)
  • Since DF was never seen as an attractor, applied
    a perturbation analysis to compare the other two
    strategies, using the method specified
  • Result: NIR would start becoming a strong
    strategy with a 6.75% improvement, and nearly
    dominant with a 10% improvement.