Iterated Prisoner's Dilemma Using Genetic Algorithms

1
Iterated Prisoner's Dilemma Using Genetic
Algorithms
GROUP I: Preeti Goel (preetig_at_), Puneet Kaur
(puneetk_at_)
2
To Come
  • Introduction
  • Past Work
  • Our Approach
  • Implementation
  • Approach Justification
  • Expected Results

3
Introduction
  • IPD is a classic problem of conflict and
    co-operation
  • Payoff Matrix
  • Goal of the Game: to maximize the number of
    points
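The payoff matrix itself did not survive in this transcript. As a minimal sketch, the scoring can be written as below. The values T=5, R=3, P=1, S=0 are an assumption (they are the ones commonly used in Axelrod's tournaments), and the names `PAYOFF`, `play_round` and `total_scores` are illustrative only.

```python
# Payoffs as (first player, second player); "C" = co-operate, "D" = defect.
# ASSUMPTION: T=5, R=3, P=1, S=0; the slide's own matrix is not in the transcript.
PAYOFF = {
    ("C", "C"): (3, 3),  # mutual co-operation: reward R for both
    ("C", "D"): (0, 5),  # sucker's payoff S against temptation T
    ("D", "C"): (5, 0),  # temptation T against sucker's payoff S
    ("D", "D"): (1, 1),  # mutual defection: punishment P for both
}


def play_round(move_a, move_b):
    """Return the points earned by each player in a single round."""
    return PAYOFF[(move_a, move_b)]


def total_scores(moves_a, moves_b):
    """Sum each player's points over a sequence of simultaneous moves."""
    score_a = score_b = 0
    for a, b in zip(moves_a, moves_b):
        points_a, points_b = play_round(a, b)
        score_a += points_a
        score_b += points_b
    return score_a, score_b
```

Under these assumed values, defecting against a co-operator pays most in a single round, which is why repeated play is needed for co-operation to become worthwhile.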

4
Why is IPD such a widely studied problem?
  • It offers deep insight into the dynamics of
    competition and cooperation.
  • It asks what is necessary for cooperation to
    emerge, specifically under conditions in which
    individuals pursue their self-interest without
    the aid of a central authority to force them to
    cooperate.
  • IPD demonstrates human altruistic behavior and
    explains the philosophy of trust [Onora
    O'Neill, Autonomy and Trust in Bioethics].
  • The evolution of cooperation requires that
    individuals have a sufficiently large chance to
    meet again so that they have a stake in their
    future interaction.

5
continued..
  • Emory brain imaging studies: during mutually
    cooperative social interactions, functional MRI
    scans revealed activation in areas of the brain
    (the nucleus accumbens, the caudate nucleus,
    ventromedial frontal/orbitofrontal cortex and
    rostral anterior cingulate cortex) that are
    linked to reward processing. This suggests
    that the altruistic drive to cooperate is
    biologically embedded, either genetically
    programmed or acquired through socialization
    during childhood and adolescence. [July 18, 2002
    issue of the journal Neuron, published by Cell
    Press]

6
Why use genetic algorithms?
  • Evolution is a fundamental form of adaptation in
    a dynamic and complex environment. GA is an
    effective tool in the empirical study of
    evolution.
  • GA is a method that emulates the forces of
    evolution and natural selection to solve a
    problem. The premise of GA is that each
    subsequent generation is likely to contain better
    solutions, just as each subsequent generation of
    biological organisms is likely to be better
    adapted to its environment.
  • GAs make it possible to explore a far greater
    range of potential solutions to a problem than
    conventional programs do. They help obtain a
    near-optimal solution by combining useful
    substructures to obtain fitter individuals.
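To make the generational idea concrete, here is a minimal sketch of a genetic algorithm over bit strings. It is not the presenters' implementation: the operators (two-way tournament selection, one-point crossover, bit-flip mutation, keeping the single best individual) and all parameter values are assumptions chosen for illustration, and `evolve` is a hypothetical name.

```python
import random


def evolve(fitness, n_bits=20, pop_size=30, generations=40,
           mutation_rate=0.01, seed=0):
    """Evolve bit strings (lists of 0/1) towards higher `fitness` values."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]

    def select(population):
        # Two-way tournament: pick two individuals, keep the fitter one.
        a, b = rng.sample(population, 2)
        return a if fitness(a) >= fitness(b) else b

    for _ in range(generations):
        # Elitism (an assumption): the best individual survives unchanged.
        new_pop = [max(pop, key=fitness)]
        while len(new_pop) < pop_size:
            parent1, parent2 = select(pop), select(pop)
            cut = rng.randrange(1, n_bits)          # one-point crossover
            child = parent1[:cut] + parent2[cut:]   # combine substructures
            child = [bit ^ 1 if rng.random() < mutation_rate else bit
                     for bit in child]              # bit-flip mutation
            new_pop.append(child)
        pop = new_pop
    return max(pop, key=fitness)


# Toy usage: the fitness is simply the number of 1 bits (a placeholder problem).
best = evolve(sum)
```

Crossover is the step that combines useful substructures from two parents. For IPD, the placeholder fitness `sum` would be replaced by the score a strategy earns in play.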

7
GA for IPD
  • GA is a powerful technique for searching spaces
    that have the following difficulties:
  • - The search space is combinatorially large.
  • - Little is known a priori about the search space.
  • - The search space contains local optima.
  • Games whose strategies involve identifying and
    exploiting an opponent are search spaces of this
    character. Hence, GA seems a suitable technique
    for learning to play IPD.

8
Past Work
  • Robert M. Axelrod
  • Kristian Lindgren
  • John J. Grefenstette
  • P. J. Darwen and X. Yao

9
Axelrod's Approach
  • First Tournament
  • - Fourteen entries
  • - Round-robin IPD: every player played
    every other player (including a clone of itself)
  • - Number of rounds: 200
  • - Number of generations: 5
  • Winner: resembled the Tit-For-Tat (TFT) strategy:
    start by cooperating, then repeat the opponent's
    previous move
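The round-robin format described above can be sketched as follows. This is an illustrative reconstruction, not Axelrod's code: the payoff values (5, 3, 1, 0) are assumed, the three strategies are stand-ins for the real entries, and the function names are hypothetical.

```python
from itertools import combinations_with_replacement

# ASSUMPTION: standard payoffs T=5, R=3, P=1, S=0, as (first, second) player.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(mine, theirs):
    """Co-operate first, then repeat the opponent's previous move."""
    return theirs[-1] if theirs else "C"


def always_defect(mine, theirs):
    return "D"


def always_cooperate(mine, theirs):
    return "C"


def play_game(strategy_a, strategy_b, rounds=200):
    """Play one iterated game and return both total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        points_a, points_b = PAYOFF[(a, b)]
        score_a += points_a
        score_b += points_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b


def round_robin(players, rounds=200):
    """Every player meets every other player and a clone of itself."""
    totals = {name: 0 for name in players}
    for name_a, name_b in combinations_with_replacement(list(players), 2):
        score_a, score_b = play_game(players[name_a], players[name_b], rounds)
        totals[name_a] += score_a
        if name_a != name_b:  # in a clone game, count the score only once
            totals[name_b] += score_b
    return totals


totals = round_robin({"TFT": tit_for_tat,
                      "ALLD": always_defect,
                      "ALLC": always_cooperate})
```

The ranking depends heavily on which strategies are in the field, so a three-player toy field like this one should not be read as reproducing the tournament result.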

10
continued..
  • Second Tournament
  • - Sixty-two entries, plus RANDOM
  • - Results of the first tournament were used to
    initialize
  • Winner: closely resembled the TFT strategy
  • Ecological Tournament
  • - Entries (plus RANDOM) from the second
    tournament were used as the initial conditions
  • - Number of generations: 1000

11
continued..
  • - The score of strategies of type T in the
    population pool at the beginning of generation G
    was set equal to the total number of points won
    by strategies of type T in the previous
    generation (G-1).
  • Winner: closely resembled the TFT strategy
  • Co-operation rate rises to 100%.
  • Hence, Axelrod's strategies showed cooperative
    behavior and finally converged to a strategy
    closely resembling the TFT strategy.
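The ecological update rule above can be sketched as a share-based simulation: each type's weight in the next generation equals the points it won in the current one, then weights are normalized. The function names and the three-strategy score table are assumptions for illustration. The table holds the points each type earns in a 200-round game under the assumed payoffs 5, 3, 1, 0.

```python
def ecological_step(shares, score):
    """One generation: new weight of type T = total points won by type T.

    shares: dict mapping type -> fraction of the population
    score:  score[t][u] = points type t earns in one game against type u
    """
    points = {
        t: shares[t] * sum(shares[u] * score[t][u] for u in shares)
        for t in shares
    }
    total = sum(points.values())
    return {t: p / total for t, p in points.items()}


def run_ecology(shares, score, generations=1000):
    """Apply the update rule repeatedly and return the final shares."""
    for _ in range(generations):
        shares = ecological_step(shares, score)
    return shares


# ASSUMPTION: illustrative 200-round scores for three stand-in strategies.
SCORE = {
    "TFT":  {"TFT": 600, "ALLD": 199, "ALLC": 600},
    "ALLD": {"TFT": 204, "ALLD": 200, "ALLC": 1000},
    "ALLC": {"TFT": 600, "ALLD": 0,   "ALLC": 600},
}
final = run_ecology({"TFT": 1 / 3, "ALLD": 1 / 3, "ALLC": 1 / 3}, SCORE)
```

In this toy setting the always-defect type loses ground once the unconditional co-operators it exploits become scarce, which is the same qualitative mechanism the slide describes.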

12
Lindgren's Work
  • He simulated Axelrod's work and showed that
    when a population plays IPD against its own
    members, the high-performance strategies that
    dominate the population for long periods of time
    are suddenly wiped out.
  • This mirrors the punctuated equilibria of natural
    evolution: evolution is not a continual gradual
    change but long periods of stability
    intermittently disturbed by short bursts of new
    species creation.
  • He suggested that the strategies produced are not
    robust, i.e. they do well against the local
    population but fail when something new and
    innovative appears.
  • Darwen and Yao gave a modified technique to
    counter these problems.

13
Grefenstette's Work
  • Grefenstette: seed the initial population with
    strategies known to be good.
  • Several of his genetic algorithm packages are in
    the public domain. Genesis 5.0 was used by Darwen
    and Yao to implement their strategy.

14
Darwen and Yao's Approach
  • Number of rounds: 50; number of generations: 250
  • Genotype: 70 bits
  • - 64 bits to represent the moves for all possible
    histories with a look-back of 3 (4 × 4 × 4 = 64)
  • - 6 bits for the pre-game history, used to find
    the index during the first 3 rounds
  • - Juxtapose the 3 history bits of the two players
    to obtain the first 6 bits

15
Example
  • P1: C-D-C → 010 (C, co-operate, is represented by
    zero)
  • P2: D-C-D → 101 (D, defect, is represented by one)
  • P1's index: P2P1 → 101010 → 42nd action,
    starting from the 7th bit
  • P2's index: P1P2 → 010101 → 21st action,
    starting from the 7th bit
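The index calculation in this example can be checked with a short sketch. The helper names are hypothetical, and treating the index as a zero-based offset into the 64 action bits that follow the 6 pre-game bits is an assumption about the layout.

```python
def history_bits(moves):
    """Encode a move string such as "CDC" as bits: C -> "0", D -> "1"."""
    return "".join("0" if move == "C" else "1" for move in moves)


def action_index(own_moves, opponent_moves):
    """Juxtapose the opponent's 3 bits and the player's own 3 bits."""
    return int(history_bits(opponent_moves) + history_bits(own_moves), 2)


def next_move(genotype, own_moves, opponent_moves):
    """Look up the next move in a 70-character genotype string of 0s and 1s.

    ASSUMPTION: characters 0-5 hold the pre-game history and the action
    bits start at position 6 (the "7th bit"), indexed from zero.
    """
    bit = genotype[6 + action_index(own_moves, opponent_moves)]
    return "C" if bit == "0" else "D"


p1_index = action_index("CDC", "DCD")  # P2 bits 101 followed by P1 bits 010
p2_index = action_index("DCD", "CDC")  # P1 bits 010 followed by P2 bits 101
```

Because the two players juxtapose the same histories in opposite order, they generally read different action bits even after an identical sequence of rounds.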

16
continued..
  • Round-robin game
  • Fitness: sum of the scores that an individual
    achieves in all these games
  • To overcome bias towards all 0s (the all-cooperate
    strategy):
  • - Seed the initial population with some known
    expert rules, instead of a completely random
    initial population
  • - Include a few extra, unchanging genotypes in the
    round robin, so that each individual of the
    population has to play every other member plus
    the extra players.

17
Our Approach
  • Maintain two populations: Host and Parasite
  • Co-evolution of parasites
  • The fitness of parasites is judged by how
    effectively they find weaknesses in the hosts.
    Waves of epidemic and immunity create an arms
    race between the competing populations, keeping
    them in a constant state of flux. This is more
    likely to evolve generalized solutions that
    survive in competition against a wide variety of
    parasites.

18
Initial Population
  • Host: all random
  • Parasite: 75 random, 25 Tit for Tat. These 25
    TFT individuals do not take part in evolution.
  • Number of generations: 250
  • Number of rounds: random

19
Implementation
  • Host: genotypes initialized randomly.
    Fitness: weighted sum of scores.
  • Parasites: 25 TFT (do not evolve), 75 random
    (evolve). Fitness: number of individuals in host
    which fail against it.
20
Fitness
  • Individuals in host: weighted sum of the scores in
    the games they play against themselves (t) and in
    the round robin against individuals in parasite
    (p).
  • Fitness = $\alpha t + (1 - \alpha) p$
  • Individuals in parasite: the number of
    individuals in host which fail against it.
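The two fitness measures can be sketched as below. The slides do not define what it means for a host to "fail" against a parasite, so this sketch assumes it means the host scores fewer points than the parasite in their game. `play_game` is a hypothetical callable returning `(host_score, parasite_score)`, and the default `alpha=0.5` is also an assumption.

```python
def host_fitness(t, p, alpha=0.5):
    """Weighted sum of a host's scores.

    t: score from games against other hosts
    p: score from the round robin against parasites
    alpha: weight between 0 and 1 (value not given on the slides)
    """
    return alpha * t + (1 - alpha) * p


def parasite_fitness(parasite, hosts, play_game):
    """Count the hosts that fail against this parasite.

    ASSUMPTION: a host "fails" when it scores fewer points than the
    parasite in the game between them.
    """
    failures = 0
    for host in hosts:
        host_score, parasite_score = play_game(host, parasite)
        if host_score < parasite_score:
            failures += 1
    return failures
```

With `alpha` close to 1 hosts are rewarded mainly for doing well among themselves, and with `alpha` close to 0 mainly for resisting parasites.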

21
Approach Justification
  • To avoid atrophy of certain features
  • - For example, in IPD, TFT decays since in a
    co-operative environment it can pay not to
    retaliate. This atrophy opens the way for
    exploitation.
  • To obtain a robust strategy
  • - Performance should not be restricted to the
    local population. The strategy should perform
    well even against other opponents.

22
Evaluation Procedure
  • From the final population, find the average
    strategy and the best strategy.
  • Evaluate both strategies against the following:
  • - Tit for Tat
  • - All co-operate
  • - Trigger
  • - Random
  • - Highest performing parasite / average parasite
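An evaluation harness for the fixed opponents in this list might look like the sketch below. Several details are assumptions: "Trigger" is taken to be the grim trigger (co-operate until the opponent defects once, then defect forever), the payoffs are 5, 3, 1, 0, games last 50 rounds, and all names are illustrative. The parasite opponents are left out because they depend on the evolved population.

```python
import random

# ASSUMPTION: points for the first player under payoffs T=5, R=3, P=1, S=0.
PAYOFF = {("C", "C"): 3, ("C", "D"): 0, ("D", "C"): 5, ("D", "D"): 1}


def tit_for_tat(mine, theirs):
    return theirs[-1] if theirs else "C"


def all_cooperate(mine, theirs):
    return "C"


def trigger(mine, theirs):
    # ASSUMPTION: grim trigger, i.e. defect forever after the first defection.
    return "D" if "D" in theirs else "C"


def make_random(seed=0):
    rng = random.Random(seed)
    return lambda mine, theirs: rng.choice("CD")


def average_score(candidate, opponent, rounds=50):
    """Average points per round earned by `candidate` against `opponent`."""
    mine, theirs, total = [], [], 0
    for _ in range(rounds):
        a = candidate(mine, theirs)
        b = opponent(theirs, mine)
        total += PAYOFF[(a, b)]
        mine.append(a)
        theirs.append(b)
    return total / rounds


def evaluate(candidate, rounds=50):
    """Score a candidate strategy against each fixed benchmark opponent."""
    opponents = {"Tit for Tat": tit_for_tat, "All co-operate": all_cooperate,
                 "Trigger": trigger, "Random": make_random()}
    return {name: average_score(candidate, opponent, rounds)
            for name, opponent in opponents.items()}
```

A candidate is any function of `(own_history, opponent_history)` that returns "C" or "D", so a decoded genotype could be passed in the same way as the hand-written strategies.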

23
Expected Result
  • A robust strategy moving towards the global
    maximum.
  • It should not move towards 100% co-operation.

24
References
  • Robert Axelrod, Evolving New Strategies.
    www-personal.umich.edu/axe/research/Evolving.pdf
  • Benjamin Hosp, The Genetic Algorithm and the
    Prisoner's Dilemma. The Journal of Computing in
    Small Colleges, Volume 19, Issue 3 (January
    2004), pages 135-146.
  • P. J. Darwen and X. Yao, The Exploitation of
    Cooperation in Iterated Prisoner's Dilemma,
    citeseer.nj.nec.com/448649.html
  • Kristian Lindgren, Evolutionary phenomena in
    simple dynamics. In Artificial Life 2, Volume 10
    of Santa Fe Institute Studies in the Sciences of
    Complexity, pages 295-312, Addison-Wesley, 1991.
  • whsc.emory.edu/_releases/2002july/altruism.html