Efficient reinforcement learning through Evolving Neural Topologies (NEAT)

Provided by: tomassi
1
Efficient reinforcement learning through Evolving
Neural Topologies (NEAT)
  • Paper by K. Stanley, R. Miikkulainen
  • presented by Tomas Singliar
  • CS3120 Fall 2004

2
Outline
  • Introduction
  • ANNs and neuroevolution
  • Representational issues with evolving structure
  • Complexification and parsimony
  • NEAT (NeuroEvolution of Augmented Topologies)
  • Representational trick of innovation number
  • Evolutionary operators
  • Evaluation
  • Double pole balancing task
  • Results on various setups
  • Discussion

3
Perceptrons
  • Biologically inspired
  • Many dendrites: inputs x_i
  • One axon: output y

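[Figure: perceptron with a constant bias input 1 (weight w0), inputs x1 ... xn (weights up to wn), a summation unit Σ, and output y]

  • Computes a linear decision boundary
  • Multi-layer perceptrons compute non-linear decision boundaries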
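
The perceptron on this slide can be sketched in a few lines of Python. This is only an illustration: the function name, the step activation, and the AND weights are assumptions chosen for the example, not something taken from the slides.

```python
from typing import Sequence


def perceptron(weights: Sequence[float], bias: float, inputs: Sequence[float]) -> int:
    """Return 1 if w0 + sum(w_i * x_i) is positive, else 0 (step activation)."""
    s = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if s > 0 else 0


# Hypothetical weights implementing logical AND on binary inputs.
# The decision boundary x1 + x2 = 1.5 is a straight line in the input plane.
AND_WEIGHTS, AND_BIAS = (1.0, 1.0), -1.5
and_outputs = [perceptron(AND_WEIGHTS, AND_BIAS, (a, b)) for a in (0, 1) for b in (0, 1)]
```

A single unit like this cannot represent XOR, which is one reason the slide contrasts it with multi-layer perceptrons.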
4
NN topologies
  • feedforward
  • single-layer
  • multi-layer
  • recurrent

[Figure labels: learning easy, learning hard]
5
Neural network controller
6
NeuroEvolution
  • Use an evolutionary algorithm (EA) to obtain a neural network
  • Fixed topology
  • individual: vector of real-valued weights
  • Typically feed-forward
  • Easy evaluation and learning


9
NeuroEvolution
  • Mutation
  • Crossover
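
With a fixed topology, every individual is a weight vector of the same length, so the two operators are simple to write down. The sketch below assumes Gaussian weight noise and one-point crossover; both are common choices picked for illustration, and the function names and parameters are hypothetical rather than taken from the slides.

```python
import random
from typing import List


def mutate(weights: List[float], rate: float, sigma: float, rng: random.Random) -> List[float]:
    """Add Gaussian noise to each weight independently with probability `rate`."""
    return [w + rng.gauss(0.0, sigma) if rng.random() < rate else w for w in weights]


def one_point_crossover(a: List[float], b: List[float], rng: random.Random) -> List[float]:
    """Splice two equal-length weight vectors (length >= 2) at a random cut point."""
    if len(a) != len(b):
        raise ValueError("fixed-topology genomes must have equal length")
    cut = rng.randint(1, len(a) - 1)  # both parents contribute at least one weight
    return a[:cut] + b[cut:]


rng = random.Random(0)
parent_a = [0.0] * 6
parent_b = [1.0] * 6
child = one_point_crossover(parent_a, parent_b, rng)
mutant = mutate(child, rate=0.5, sigma=0.1, rng=rng)
```

Position i means the same connection in both parents, which is exactly the property that is lost once the topology is allowed to vary.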



10
Variable structure operators
  • Mutation of weights
  • Mutation of structure: remove link

11
Variable structure operators
  • Mutation of structure
  • Crossover



12
Topology summary
  • Fixed topology
  • individual: vector of real-valued weights
  • Benefit: simple to evolve
  • Drawback: designer must choose the topology
  • (and solve the problem)
  • Evolving topology
  • Variable-length genome
  • Genotype/phenotype mapping difficulties
  • Difficult to define crossover
  • Middle ground: feed-forward (FF), search for hidden layer size

13
Complexification
  • Choice of initial structures set
  • Structures never evolutionarily justified
  • Dimensionality problem
  • Init with simple structures and complexify
  • Parsimonious
  • Computationally efficient
  • Biologically plausible

14
Outline
  • Introduction
  • ANNs and neuroevolution
  • Representational issues with structure
  • Complexification
  • NEAT (NeuroEvolution of Augmented Topologies)
  • Representational trick of innovation number
  • Evolutionary operators
  • Speciation as innovation protection
  • Evaluation
  • Double pole balancing task
  • Results
  • Discussion

15
NEAT
  • NeuroEvolution of Augmented Topologies
  • Encoding scheme
  • Individual: sequence of genes (variable length)
  • Track historical origin of each gene
  • innovation number
  • Speciation model
  • Distance between individuals
  • Stochastic speciation
  • an individual joins the first species that is
    similar enough

16
NEAT Individual Encoding
  • List of nodes
  • Gene = link
  • Disabled bit
  • Innovation number
  • Global

[Figure: example genome for a network with nodes 1 to 5 (omitting the list of nodes)]
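
One way to picture this encoding is as a small data structure. The class and field names below are hypothetical. The sketch also assumes that the same (source, target) link always receives the same innovation number for the whole run, which is a simplification made here for brevity.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List, Tuple


@dataclass
class LinkGene:
    src: int          # source node id
    dst: int          # target node id
    weight: float
    enabled: bool     # False corresponds to the slide's "disabled bit"
    innovation: int   # global historical marker


class InnovationCounter:
    """Hands out one global innovation number per distinct (src, dst) link."""

    def __init__(self) -> None:
        self._next = count(1)
        self._seen: Dict[Tuple[int, int], int] = {}

    def get(self, src: int, dst: int) -> int:
        key = (src, dst)
        if key not in self._seen:
            self._seen[key] = next(self._next)
        return self._seen[key]


@dataclass
class Genome:
    nodes: List[int]
    links: List[LinkGene] = field(default_factory=list)

    def add_link(self, src: int, dst: int, weight: float, counter: InnovationCounter) -> None:
        self.links.append(LinkGene(src, dst, weight, True, counter.get(src, dst)))


counter = InnovationCounter()
genome = Genome(nodes=[1, 2, 3, 4, 5])
genome.add_link(1, 4, 0.7, counter)
genome.add_link(2, 4, -0.5, counter)
other = Genome(nodes=[1, 2, 3, 4, 5])
other.add_link(2, 4, 0.1, counter)  # same structure, so same innovation number
```

Because the counter is shared by the whole population, two genomes that carry the same link can be lined up later without comparing their topologies.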
17
NEAT operators
  • Mutations
  • Mutate link weight
  • Structural
  • Add link
  • Split link (Add node)

[Figure: example network with nodes 1 to 5]
21
NEAT operators
[Figure: the example network, now with nodes 1 to 6]
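
The two structural mutations named on these slides can be sketched as follows. The types and function names are hypothetical, and innovation numbers are passed in by hand to keep the example short. The weight rule in `split_link` (1.0 on the incoming link, the old weight on the outgoing link) follows the description in the NEAT paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LinkGene:
    src: int
    dst: int
    weight: float
    enabled: bool
    innovation: int


@dataclass
class Genome:
    nodes: List[int]
    links: List[LinkGene]


def add_link(genome: Genome, src: int, dst: int, weight: float, innovation: int) -> None:
    """Append a new enabled link gene between two existing nodes."""
    genome.links.append(LinkGene(src, dst, weight, True, innovation))


def split_link(genome: Genome, index: int, new_node: int, innov_in: int, innov_out: int) -> None:
    """Disable links[index] and reroute it through a newly added node."""
    old = genome.links[index]
    old.enabled = False  # the old gene stays in the genome, only switched off
    genome.nodes.append(new_node)
    genome.links.append(LinkGene(old.src, new_node, 1.0, True, innov_in))
    genome.links.append(LinkGene(new_node, old.dst, old.weight, True, innov_out))


g = Genome(
    nodes=[1, 2, 3, 4, 5],
    links=[LinkGene(1, 4, 0.7, True, 1), LinkGene(2, 5, -0.5, True, 2), LinkGene(5, 4, 0.4, True, 3)],
)
add_link(g, 3, 5, 0.2, innovation=4)
split_link(g, index=0, new_node=6, innov_in=5, innov_out=6)
```

Neither operation deletes a gene, so a genome can only grow; the split link survives as a disabled gene.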
22
NEAT operators - Crossover


23
NEAT operators - Crossover
[Figure: crossover of two genomes; gene labels: matching, disjoint, excess; fitness labels: 15, 15, 4]
24
NEAT operators - Crossover
[Figure: crossover of two genomes; gene labels: matching, disjoint, excess; fitness labels: 10, 15, 3]
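
The matching, disjoint, and excess labels can be made concrete with a short sketch. Genes are reduced here to (innovation number, weight) pairs, and the function names are hypothetical. Following the NEAT paper, matching genes are taken from either parent at random and unmatched genes from the fitter parent. For equal fitness, this sketch keeps each unmatched gene with probability 0.5, which is one possible reading of "inherited randomly" and an assumption of this example.

```python
import random
from typing import List, Set, Tuple

Gene = Tuple[int, float]  # (innovation number, link weight)


def classify(a: List[Gene], b: List[Gene]) -> Tuple[Set[int], Set[int], Set[int]]:
    """Split innovation numbers into (matching, disjoint, excess); genomes must be non-empty."""
    ia = {innov for innov, _ in a}
    ib = {innov for innov, _ in b}
    cutoff = min(max(ia), max(ib))  # beyond this, only one parent has genes
    unmatched = ia ^ ib
    disjoint = {i for i in unmatched if i <= cutoff}
    return ia & ib, disjoint, unmatched - disjoint


def crossover(a: List[Gene], fit_a: float, b: List[Gene], fit_b: float,
              rng: random.Random) -> List[Gene]:
    """Align genes by innovation number and build one child genome."""
    wa, wb = dict(a), dict(b)
    child: List[Gene] = []
    for innov in sorted(wa.keys() | wb.keys()):
        in_a, in_b = innov in wa, innov in wb
        if in_a and in_b:                      # matching: pick a parent at random
            child.append((innov, rng.choice((wa[innov], wb[innov]))))
        elif fit_a == fit_b:                   # tie: keep unmatched gene half the time
            if rng.random() < 0.5:
                child.append((innov, wa[innov] if in_a else wb[innov]))
        elif (fit_a > fit_b) == in_a:          # gene belongs to the fitter parent
            child.append((innov, wa[innov] if in_a else wb[innov]))
    return child


parent_a = [(i, 0.1) for i in (1, 2, 3, 4, 5, 8)]
parent_b = [(i, 0.9) for i in (1, 2, 3, 4, 5, 6, 7, 9, 10)]
matching, disjoint, excess = classify(parent_a, parent_b)
child = crossover(parent_a, 10.0, parent_b, 15.0, random.Random(0))
```

No graph matching is needed: sorting by innovation number is enough to align any two genomes.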
25
NEAT operators observations
  • Nodes are added by link split
  • No orphan nodes
  • Any two genomes can cross over
  • No topological analysis → efficient
  • Genomes are nondecreasing in size
  • No way to flip back the disabled genes
  • Dead genes
  • Discussion: why not just remove them?
  • Evolutionary history?

26
Protecting innovation
  • Change in structure
  • → weight reoptimization necessary
  • → poor performance of new structures
  • Why not fix the weights more cleverly?
  • alternatives: evolve structure genetically and
    have a period of learning (youth) (with tutor?)
  • Speciation
  • divide population into subspecies
  • strong competition inside species, weak outside
  • When is a new species born and when should it
    die?

27
Protecting innovation
  • Partition into species
  • define compatibility
  • place individual k into species s if, for a randomly
    chosen member m of s, the distance δ(k, m) is below
    the threshold δt
  • Explicit fitness sharing
  • individuals reproduce only within their species
  • next-generation species size proportional to average
    species fitness
  • generational: entire population replaced
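
The species partition and the size rule can be sketched as below. The distance is the compatibility measure from the NEAT paper, c1·E/N + c2·D/N + c3·W, where E and D count excess and disjoint genes, W is the mean weight difference of matching genes, and N is the size of the larger genome. Genomes are simplified to dictionaries from innovation number to weight. The function names, and drawing the random member from the species as it is being built, are assumptions of this sketch.

```python
import random
from typing import Dict, List

Genome = Dict[int, float]  # innovation number -> link weight (non-empty)


def compatibility(a: Genome, b: Genome, c1: float, c2: float, c3: float) -> float:
    """Compatibility distance c1*E/N + c2*D/N + c3*W between two genomes."""
    cutoff = min(max(a), max(b))
    unmatched = a.keys() ^ b.keys()
    disjoint = sum(1 for i in unmatched if i <= cutoff)
    excess = len(unmatched) - disjoint
    matching = a.keys() & b.keys()
    w_bar = sum(abs(a[i] - b[i]) for i in matching) / len(matching) if matching else 0.0
    n = max(len(a), len(b))
    return c1 * excess / n + c2 * disjoint / n + c3 * w_bar


def speciate(population: List[Genome], threshold: float, c1: float, c2: float,
             c3: float, rng: random.Random) -> List[List[Genome]]:
    """Put each genome into the first species whose random member is close enough."""
    species: List[List[Genome]] = []
    for genome in population:
        for members in species:
            if compatibility(genome, rng.choice(members), c1, c2, c3) < threshold:
                members.append(genome)
                break
        else:  # no compatible species found: found a new one
            species.append([genome])
    return species


def offspring_counts(mean_fitness: List[float], pop_size: int) -> List[int]:
    """Share the next generation in proportion to mean species fitness (rounded)."""
    total = sum(mean_fitness)
    return [round(pop_size * f / total) for f in mean_fitness]


a = {1: 0.5, 2: 0.5, 3: 0.5}
b = {1: 0.5, 2: 1.5, 4: 0.5, 5: 0.5}
c = {1: 9.0, 2: 9.0, 3: 9.0}
d_ab = compatibility(a, b, c1=1.0, c2=1.0, c3=0.4)
groups = speciate([a, b, c], threshold=3.0, c1=1.0, c2=1.0, c3=0.4, rng=random.Random(0))
```

Rounding in `offspring_counts` can leave the total slightly off the population size; a fuller implementation would have to redistribute the remainder.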

29
Experimental setup
  • Double pole balancing problem
  • Cart on a track, 2 poles of different lengths
  • Various tasks: combinations of variables
  • with velocity inputs (DPV)
  • without velocity inputs (DPNV)
  • Implemented as a simulation: a problem in classical
    mechanics, solved by numerical evaluation of
    differential equations

30
Setup simulated environment
  • Track: 4.8 m long
  • Actuation
  • force in <-10 N, 10 N>
  • weight not specified, unclear how to interpret
    force
  • time step: 0.02 s of simulated time
  • Poles
  • lengths 0.1 m and 1.0 m
  • short pole upright, long pole tilted by 1°
  • easy?
  • Success criteria
  • Both poles balanced within (-36°, 36°) for 30 min of
    simulated time
  • Generalizes well (1/3 of initialization
    space)
  • DPV fitness: proportion of time poles balanced
  • DPNV fitness: proportion of time balanced, combined with a wiggle penalty
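
The numbers on this slide translate into step counts and a failure test as sketched below. The sketch assumes that the angle limits are in degrees, that the cart position is measured from the centre of the track, and that fitness is simply balanced steps over total steps. All names are hypothetical, and the pole dynamics themselves are not modelled.

```python
import math

TIME_STEP_S = 0.02                       # one control step of simulated time
TRACK_LENGTH_M = 4.8
ANGLE_LIMIT_RAD = math.radians(36.0)     # assumes the slide's (-36, 36) is in degrees


def steps_for(sim_seconds: float) -> int:
    """Number of control steps needed to cover the given simulated time."""
    return round(sim_seconds / TIME_STEP_S)


def failed(cart_x: float, theta_short: float, theta_long: float) -> bool:
    """True once the cart leaves the track or either pole passes the angle limit.

    cart_x is in metres from the track centre; angles are in radians from vertical.
    """
    off_track = abs(cart_x) > TRACK_LENGTH_M / 2
    fallen = abs(theta_short) > ANGLE_LIMIT_RAD or abs(theta_long) > ANGLE_LIMIT_RAD
    return off_track or fallen


def dpv_fitness(steps_balanced: int, max_steps: int) -> float:
    """Proportion of the trial during which both poles stayed balanced."""
    return steps_balanced / max_steps


success_steps = steps_for(30 * 60)       # the 30-minute success criterion
```

Under these assumptions the 30-minute criterion corresponds to 90,000 control steps, and a 20-second trial corresponds to 1,000 steps.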

31
Setup EA parameters
  • Population size
  • 150 with velocities
  • 1000 without velocities (more difficult task)
  • Mutation rates
  • 80% chance to mutate all weights
  • 90% slightly perturbed, 10% reassigned
  • 5% or 30% (resp.) probability of a new link
  • Parent selection
  • stagnant species eradicated (improve in 15
    generations or die)
  • uniform within the 40% elite of each species
  • 0.1% chance for an elite individual to mate with
    another species
  • Speciation parameters
  • c1 = c2 = 1.0
  • c3 = 3.0, δt = 4.0 for DPNV; c3 = 0.4, δt = 3.0 for
    DPV
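
The weight-mutation rates above can be read as a two-stage rule: first decide whether a genome's weights are mutated at all, then decide per weight between a small perturbation and a fresh value. The sketch below assumes the listed values are percentages. The perturbation size `step` and the reassignment range `weight_range` are hypothetical parameters, since the slides do not give them.

```python
import random
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class MutationRates:
    genome_weight_mutation: float = 0.80  # chance that a genome's weights are mutated at all
    perturb: float = 0.90                 # per weight: perturb slightly, otherwise reassign


def mutate_weights(weights: List[float], rates: MutationRates, rng: random.Random,
                   step: float = 0.1, weight_range: float = 1.0) -> List[float]:
    """Apply the two-stage weight mutation; `step` and `weight_range` are illustrative."""
    if rng.random() >= rates.genome_weight_mutation:
        return list(weights)  # this genome is left untouched
    mutated = []
    for w in weights:
        if rng.random() < rates.perturb:
            mutated.append(w + rng.uniform(-step, step))               # slight perturbation
        else:
            mutated.append(rng.uniform(-weight_range, weight_range))   # reassignment
    return mutated


rng = random.Random(0)
original = [0.5, -0.3, 0.8, 0.0]
untouched = mutate_weights(original, MutationRates(genome_weight_mutation=0.0), rng)
perturbed = mutate_weights(original, MutationRates(1.0, 1.0), rng)
```

Most offspring therefore stay close to their parents in weight space, with an occasional reassigned weight to escape poor regions.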

32
Experimental results
  • NEAT evolves agents whose performance is
    comparable to previous approaches, at a lower
    computational cost.
  • DPNV champion generalizes (balances for at least
    20 seconds) from 286 out of 625 initial configurations
  • DPV champion not reported
  • Same number of generations as ESP, but fewer
    evaluations
  • More parsimonious individuals
  • Division into subpopulations important
  • Great amount of diversity preserved: 14 species
    at time of solution discovery

33
Thank you!
  • Discuss! Discuss! Discuss!

34
Discussion questions
  • Agent learns in simulated ideal, deterministic
    environment. Robustness?
  • Any disadvantages to complexification?
  • Why the distinction between excess and disjoint
    genes?

35
Enforced subpopulations
  • Fixed placeholder frame
  • Neurons selected to click in place
  • Subpopulations evolve naturally (SANE)
  • Speedup: make them explicit (ESP)
  • Reinforcement for network is evenly distributed
    to participating neurons

