1
Evolutionary Algorithms (EVO)
Introduction and Local Search (L1 and L2)
  • John A. Clark
  • Professor of Critical Systems
  • Non-Standard Computation Group

2
Aim of the Module
  • To familiarise you with a variety of
    nature-inspired problem solving techniques, in
    particular those inspired by concepts in
    evolution
  • Genetic algorithms (of various sorts)
  • Simple GA
  • More advanced GAs
  • Estimation of distribution algorithms (EDAs)
  • Multi-objective Genetic Algorithms (MOGAs)
  • Genetic programming
  • Grammatical evolution
  • Evolutionary strategies
  • Co-evolution.
  • Will also cover a variety of unusual applications towards the end of the module.

3
Presentational Strategy
  • Generally aim to indicate why techniques emerged.
  • Technique A will have its strengths.
  • Technique A also has its limitations
  • too limited domain of application, e.g. only
    applies to continuous functions.
  • cannot cope practically with large problems, e.g.
    algorithms might take longer than the age of the
    universe to find a solution, or may require
    infeasible amounts of memory.
  • basic operation seems ill suited to the problem
    at hand.
  • Technique B is developed to fix some aspect of A's deficiencies.
  • Technique B has its strengths
  • Technique B also has its limitations
  • ... and so on.

4
Lectures and Practicals
  • Lectures in weeks 6,7,8,9
  • Monday 10.15-12.15. P/L/001 (Physics)
  • Thursday 16.15-18.15. ATB/056 (Langwith)
  • Practicals in weeks 7,8,9,10
  • Wednesday 9.15-11.15 CS/007 (Computer Science)

5
Orders Matter!
  • Scenario 1
  • "Please tell me what your problem is."
  • Pause.
  • "I think the answer to your problem is nature-inspired computation."
  • Scenario 2
  • "I think the answer to your problem is nature-inspired computation."
  • Pause.
  • "Please tell me what your problem is."

6
A world outside NI-computation
  • Optimisation did not start with NI-computation.
  • There is a great deal of research in mathematics
    and operations research aimed at finding optimal
    or indeed good solutions to problems.
  • Before we decide "I think the answer to your problem is nature-inspired computation", let's take a brief look at a few of the techniques out there.
  • Let me give you an excellent piece of advice...

7
Evolutionary Computation: Just say No!
8
Linear Programming
  • Linear programming is an extremely powerful
    solution technique.
  • Maximise f(x1, ..., xn) = c0 + c1x1 + ... + cnxn
  • Subject to the linear constraints
  • a11x1 + ... + a1nxn ≤ b1
  • a21x1 + ... + a2nxn ≤ b2
  • ...
  • ak1x1 + ... + aknxn ≤ bk

9
LP Example (Taha)
  • Suppose a paint manufacturer makes two types of
    paint (interior and exterior).
  • Interior paint sells for 2000 per tonne and
    exterior paint sells for 3000 per tonne.
  • Making each tonne of paint requires the indicated
    amounts of raw materials A and B.
  • Demand for interior paint cannot exceed that for
    exterior by more than 1 tonne.
  • Maximum demand for interior paint is 2 tonnes.
  • How much of each type should be produced?

10
LP Example (Taha)
  • Let x be the amount of exterior paint and y be
    the amount of interior paint produced.
  • Maximise
  • f(x,y) = 2000y + 3000x
  • subject to
  • x + 2y ≤ 6
  • 2x + y ≤ 8
  • y - x ≤ 1
  • y ≤ 2
  • x ≥ 0, y ≥ 0

[Figure: the feasible region plotted with x (exterior paint) on the horizontal axis and y (interior paint) on the vertical axis. Dotted lines show contours f(x,y) = constant. The optimum occurs at a corner of the feasible region, at x = 3 1/3, y = 1 1/3.]
For an excellent introduction to LP, see Taha's Operations Research.
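As a quick check of the worked answer, the same LP can be handed to an off-the-shelf solver. A minimal sketch in Python, assuming SciPy is available (scipy.optimize.linprog minimises, so the objective is negated; x = exterior, y = interior as above):

# Sketch: the paint LP via SciPy's linprog (assumes SciPy is installed).
from scipy.optimize import linprog

c = [-3000, -2000]          # maximise 3000x + 2000y  ->  minimise the negation
A_ub = [[1, 2],             #  x + 2y <= 6
        [2, 1],             # 2x +  y <= 8
        [-1, 1],            #  y -  x <= 1
        [0, 1]]             #       y <= 2
b_ub = [6, 8, 1, 2]

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(0, None), (0, None)])
print(res.x, -res.fun)      # roughly x = 3.33, y = 1.33, revenue = 12666.67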
11
Linear Programming
  • A vast amount of work exists in the area of
    linear programming.
  • Commercial and freeware tools available.
  • What if the variables are not continuous?
  • Integer programming techniques exist.
  • Mixed variable type techniques too.
  • Don't blunder into using NI-computational techniques just because you don't know whether any special-purpose techniques exist for the problem at hand.
  • Look for them!!!

12
Dijkstra's Shortest Path Algorithm
Dijkstra's Algorithm solves the single-source shortest path problem in weighted graphs with non-negative edge weights.
function Dijkstra(G, w, s)
    for each vertex v in V[G]            // Initialisations
        d[v] := infinity
        previous[v] := undefined
    d[s] := 0                            // Distance from s to s
    S := empty set
    Q := V[G]                            // Set of all vertices
    while Q is not an empty set          // The algorithm itself
        u := Extract_Min(Q)
        S := S union {u}
        for each edge (u,v) outgoing from u
            if d[u] + w(u,v) < d[v]      // Relax (u,v)
                d[v] := d[u] + w(u,v)
                previous[v] := u
http://en.wikipedia.org/wiki/Dijkstra's_algorithm
13
Dijkstra's Shortest Path Algorithm
14
Dijkstra's Shortest Path Algorithm
  • With a simple implementation using linked lists this algorithm has complexity O(|V|²).
  • With sparse graphs and a smarter implementation (e.g. using a Fibonacci heap) this can be improved to O(|E| + |V| log |V|).
  • There are a few constraints (e.g. non-negative edge weights), but...
  • if you have a shortest path problem like the one shown, use a shortest path algorithm (a heap-based sketch follows below).
  • Note there are further algorithms that handle some limitations, e.g. the Bellman-Ford algorithm allows negative edge weights.
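To make the "smart implementation" point concrete, here is a minimal heap-based sketch in Python (a binary heap via heapq rather than a Fibonacci heap, giving O((|E| + |V|) log |V|); the dict-of-dicts graph representation is an assumption of this sketch):

# Sketch: Dijkstra's algorithm with a binary heap (heapq).
# graph: {vertex: {neighbour: non_negative_weight, ...}, ...}
import heapq

def dijkstra(graph, source):
    dist = {v: float('inf') for v in graph}
    dist[source] = 0
    previous = {v: None for v in graph}
    heap = [(0, source)]
    while heap:
        d_u, u = heapq.heappop(heap)
        if d_u > dist[u]:          # stale heap entry, skip it
            continue
        for v, w in graph[u].items():
            if d_u + w < dist[v]:  # relax edge (u, v)
                dist[v] = d_u + w
                previous[v] = u
                heapq.heappush(heap, (dist[v], v))
    return dist, previous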

15
Dijkstra's Shortest Path Algorithm
[Figure: a network with nodes A, B, C, D and E; each link is labelled with the probability (between 0.97 and 0.99) that a message passes reliably across it. What is the most reliable path from A to D?]
Now consider a network where the probability of a message passing reliably across a link is as shown. The reliability of a path is now the product of the edge weights (probabilities) along that path.
Time to reach for evolutionary computation? No! Take the negative logarithm of each edge probability: since each probability is at most 1, the transformed weights are non-negative, and the most reliable path is the shortest path under the new weights. And now use Dijkstra's SPA.
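A sketch of that transformation, reusing the dijkstra function sketched above (the function name and graph representation are assumptions of this example):

# Sketch: most-reliable path via Dijkstra on transformed weights.
# reliability: {vertex: {neighbour: probability the link works}, ...}
import math

def most_reliable(reliability, source):
    # maximising a product of probabilities == minimising the sum of -log(p),
    # and -log(p) >= 0 whenever 0 < p <= 1, so Dijkstra's requirement is met
    transformed = {u: {v: -math.log(p) for v, p in nbrs.items()}
                   for u, nbrs in reliability.items()}
    dist, previous = dijkstra(transformed, source)   # as sketched earlier
    return {v: math.exp(-d) for v, d in dist.items()}, previous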
16
Moral of the Tale
Even if your problem does not seem to be solvable
by known efficient algorithms, a transformation
of it might be.
17
Calculus 1 variable
  • Calculus is a well known method of finding
    optima.
  • Find the minimum of
  • y(x) = 3x² + 2x + 1
  • dy/dx = 6x + 2
  • Set dy/dx = 0
  • ⇒ x = -1/3, y = 2/3
  • Strictly, should check d²y/dx² = 6 > 0 for a minimum

y(x) is a differentiable function. Same sorts of
ideas apply in higher dimensions
18
Finding zeroes of a polynomial function
  • Calculus is a well known method of finding
    optima.
  • Find the zeroes of
  • y(x) = x² - 5x + 6
  • Analytic solutions exist for quadratics
  • y(x) = ax² + bx + c = 0
  • And for cubics
  • y(x) = ax³ + bx² + cx + d = 0
  • And for quartics
  • y(x) = ax⁴ + bx³ + cx² + dx + e = 0
  • And for quintics?????

Cubics and quartics: not pleasant! Look them up.
Quintics: no formulae exist.
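The same advice applies here: special-purpose numerical root finders are readily available. A minimal sketch, assuming NumPy (numpy.roots takes coefficients in descending order of power):

# Sketch: numerical root finding for polynomials with NumPy.
import numpy as np

print(np.roots([1, -5, 6]))            # x^2 - 5x + 6 -> roots 3 and 2
print(np.roots([1, 0, 0, 0, 0, -1]))   # the quintic x^5 - 1: no general formula,
                                       # but numerical roots are easy to obtain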
19
Moral of the Tale
If your problem looks like a (special purpose)
nail
use a (special purpose) hammer
Available from all good mathematics and
operations research departments at reasonable
cost. Read the small print.
20
A bit of guidance goes a long way
Newton-Raphson zero finding
[Figure: Newton-Raphson iterations. Starting from x0, each tangent line gives the next estimate (x1, x2, x3, ...), moving towards a zero of the function.]
The approach uses the gradient at the current point x_n to guide movement in the right direction and generate a better point x_(n+1).
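A minimal sketch of the iteration x_(n+1) = x_n - f(x_n)/f'(x_n) (the derivative is supplied by the caller; the names are illustrative):

# Sketch: Newton-Raphson zero finding.
def newton_raphson(f, f_prime, x0, tol=1e-10, max_iter=100):
    x = x0
    for _ in range(max_iter):
        fx = f(x)
        if abs(fx) < tol:
            return x
        x = x - fx / f_prime(x)    # the gradient guides the next guess
    return x

# e.g. a zero of x^2 - 5x + 6 starting from x0 = 1.0 (converges to 2)
print(newton_raphson(lambda x: x*x - 5*x + 6, lambda x: 2*x - 5, 1.0))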
21
Local Search
22
Local Search Procedure
  • A local search comprises a trace of execution
  • Trace = (s0, r0), (s1, r1), (s2, r2), ..., (send, rend)
  • The sk are members of the search space
  • The rk are the fitness values (evaluated in the solution space) of the corresponding sk
  • Consecutive sk are related in a particular way:
  • for all k = 0..(end-1), sk+1 is in Neighbourhood(sk)
  • The neighbourhood function N(sk) = Neighbourhood(sk) defines a set of points that are somehow deemed to be "near to", "close to" or "in the locality of" sk (a code skeleton is sketched below)
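The definitions above translate almost directly into a code skeleton. A minimal sketch in Python; the fitness, neighbourhood and next-state rules are placeholders to be supplied, and all names are illustrative:

# Sketch: a generic local search recording its trace of (state, fitness) pairs.
def local_search(s0, fitness, neighbourhood, choose_next, max_steps=1000):
    s, r = s0, fitness(s0)
    trace = [(s, r)]                      # (s0, r0), (s1, r1), ...
    for _ in range(max_steps):
        s_next = choose_next(s, neighbourhood(s), fitness)
        if s_next is None:                # no acceptable neighbour: stop
            break
        s, r = s_next, fitness(s_next)
        trace.append((s, r))              # sk+1 is in Neighbourhood(sk)
    return trace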

23
Some Local Search Questions
  • How do you determine the start state s0?
  • How do you define the neighbourhood function N()?
  • How do you determine which member of
    neighbourhood is selected to be the next state?

24
Hill Climbing
  • Let the current solution or point be x.
  • Define the neighbourhood N(x) to be the set of
    solutions that are close to x
  • If possible, move to a neighbouring solution that
    improves the value of f(x), otherwise stop.
  • Choose any y as the next solution provided f(y) > f(x)
  • weak hill-climbing (don't go down)
  • Choose y as the next solution such that f(y) = sup{ f(v) : v ∈ N(x) }
  • steepest gradient ascent (climb as fast as you can)
  • For many purposes hill-climbing works very well, particularly when you climb the right hill, but there is a problem (a sketch follows below).
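A minimal sketch of steepest-ascent hill climbing on the integers with N(x) = {x+1, x-1}, the neighbourhood used in the picture on the next slide (names are illustrative):

# Sketch: steepest-ascent hill climbing with neighbourhood N(x) = {x + 1, x - 1}.
def hill_climb(x0, f, max_steps=10000):
    x = x0
    for _ in range(max_steps):
        best = max([x + 1, x - 1], key=f)   # steepest gradient ascent
        if f(best) <= f(x):                 # no improving neighbour: local optimum
            return x
        x = best
    return x

# climbs to the top of whichever hill x0 lies on
print(hill_climb(0, lambda x: -(x - 7) ** 2))   # -> 7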

25
Local Optimisation - Hill Climbing
[Figure: a 1-D fitness landscape f(x). The neighbourhood of a point x might be N(x) = {x+1, x-1}. From the left-hand choice of x0 the search goes x0 → x1 → x2 and stops, since f(x0) < f(x1) < f(x2) but f(x2) > f(x3): a local optimum. From the right-hand choice of x0 the search goes x0 → x1 → x2 → xopt, since f(x0) < f(x1) < f(x2) < f(xopt) and f(xopt) > f(xopt+1). The second (right-hand) choice of x0 is much better!]
26
Landscapes
  • The hills, peaks and the like are what we
    generally refer to as the fitness landscape.
  • We have laid out the solutions in an ordered way
    and can see how the fitness varies as we traverse
    the solution space.
  • This is easy to visualise with 1- or 2-D solution spaces, but the idea generalises.

27
Landscapes
  • Things that may affect our searches
  • Fitness differences between neighbouring solution-space points
  • Generally referred to as ruggedness
  • Smooth landscapes, jagged/spiky landscapes,
    fractal landscapes.
  • The number of local optima
  • A single local optimum helps a lot!!!!!
  • Many, many problems of interest have multiple
    local optima
  • The distribution of the local optima in the
    search space
  • Are optima similar, e.g. do they all have
    particular important characteristics, or do
    optima occur in radically different parts of
    search space?
  • Can have implications if you try to mate
    solutions as part of the search process (e.g.
    genetic algorithms)
  • Sometimes possible to construct the global
    optimum from many local optima.
  • The topology of basins of attraction of the local
    optima
  • Search techniques have their own characteristics; once the candidate solution falls into a local optimum's territory it may not be able to escape!
  • Great if you fall into the territory of the
    solution you want. Not so good if you have been
    taken prisoner by mediocrity (stuck in a
    distinctly sub-optimal local optimum)
  • Basins of attractions can be very complex (cf
    Mandelbrot set)

28
Measures
  • Won't go into details here, but a variety of measures have been proposed, e.g.
  • Time series autocorrelation
  • Go on a random walk around the search space and measure the correlation between f(xt) and f(xt+k) (a sketch follows below)
  • Fitness distance correlation
  • Developed for use in genetic algorithms
  • Measures how much fitness increases as we
    approach a local optimum.

New Ideas in Optimisation. Editors Corne, Dorigo
and Glover
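A sketch of the random-walk autocorrelation measure (the random_neighbour step function is assumed to be supplied; names are illustrative):

# Sketch: estimating landscape ruggedness via random-walk autocorrelation.
def walk_autocorrelation(x0, f, random_neighbour, length=1000, lag=1):
    xs = [x0]
    for _ in range(length - 1):
        xs.append(random_neighbour(xs[-1]))      # random walk over the landscape
    fs = [f(x) for x in xs]
    mean = sum(fs) / len(fs)
    var = sum((v - mean) ** 2 for v in fs) / len(fs)
    cov = sum((fs[t] - mean) * (fs[t + lag] - mean)
              for t in range(len(fs) - lag)) / (len(fs) - lag)
    return cov / var if var > 0 else 0.0         # near 1: smooth, near 0: rugged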
29
Nastier - Fractal Landscapes
Given 1000 function evaluations, what's the best you can achieve? Huygens Competition.
http://gungurru.csse.uwa.edu.au/cara/huygens/cec2006.php
30
Local search solution: No pain, no gain
Allow non-improving moves, so that it is possible to go down... in order to rise again and reach the global optimum.
[Figure: a 1-D landscape z(x) in which the search must descend from a local optimum before it can climb to the global optimum.]
31
Simulated Annealing
  • Inspired by physics.
  • In condensed matter physics, annealing is known
    as a thermal process for obtaining low energy
    states of a solid in a heat bath.
  • Two steps
  • increase the temperature of the heat bath until
    the solid metal melts and
  • decrease carefully the temperature of the heat
    bath until the particles arrange themselves in
    the ground state of the solid.
  • In the liquid phase, particles arrange themselves randomly. In the ground state, the particles are arranged in a highly structured lattice and the energy of the system is minimal.
  • Compare with quenching: very rapid lowering of temperature (e.g. by dropping into a bath of cold water).

32
Simulated Annealing
  • Thermal equilibrium is characterised by the Boltzmann distribution: the probability of being in state i with energy Ei is proportional to exp(-Ei / (kB T)), where T is the temperature and kB is Boltzmann's constant.
  • Simulated annealing mimics the trajectory of
    physical transitions between states of various
    energies as thermal equilibrium is achieved.

33
Simulated Annealing
  • Candidate solutions in a combinatorial
    optimisation problem are equivalent to the states
    of a physical system.
  • The cost of a solution is equivalent to the
    energy of a state.
  • We know that if we cool metals carefully enough,
    we can achieve very low energy states.
  • Why not ape that process for optimisation?
  • That's the inspiration for simulated annealing.
    Transitions between states (candidate solutions)
    are carried out probabilistically in an analogous
    manner to the distribution for describing
    physical state transitions.

34
Simulated Annealing
  • Improving moves always accepted
  • Non-improving moves may be accepted
    probabilistically and in a manner depending on
    the temperature parameter T. Loosely
  • the worse the move the less likely it is to be
    accepted
  • a worsening move is less likely to be accepted
    the cooler the temperature
  • The temperature T starts high and is gradually
    cooled as the search progresses.
  • Initially (when things are hot) virtually
    anything is accepted, at the end (when things are
    nearly frozen) only improving moves are allowed
    (and the search effectively reduces to
    hill-climbing)

35
Simulated Annealing (Minimisation)
At each temperature Tk consider Lk moves
Always accept improving moves
Accept worsening moves probabilistically. Gets
harder to do this the worse the move. Gets
harder as Temp decreases.
Calculate next number of trial moves at
TkCalculate next temperature Tk
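The loop above in code form: a minimal simulated-annealing sketch for minimisation. Geometric cooling, a fixed number of trial moves per temperature and the acceptance test exp(-delta/T) > U(0,1) described on the next slides are assumptions of this sketch; names are illustrative.

# Sketch: simulated annealing for minimisation.
import math, random

def simulated_annealing(s0, cost, random_neighbour,
                        t0=100.0, alpha=0.95, moves_per_temp=400, t_min=1e-3):
    s, c = s0, cost(s0)
    best_s, best_c = s, c
    t = t0
    while t > t_min:
        for _ in range(moves_per_temp):          # Lk trial moves at temperature Tk
            s_new = random_neighbour(s)
            delta = cost(s_new) - c
            if delta < 0 or random.random() < math.exp(-delta / t):
                s, c = s_new, c + delta          # accept (always, if improving)
                if c < best_c:
                    best_s, best_c = s, c
        t *= alpha                               # geometric cooling: Tk+1 = a * Tk
    return best_s, best_c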
36
Acceptance Criterion
If Dlt0 then we clearly have an improvement and
we move to the trial state Snew. If Dgt 0 then
we clearly have a non-improving move. We also
have -D/Tk lt 0 and so 0ltexp(-D/Tk
)lt1 Therefore exp(-D/Tk )gtU(0,1) is a
probabilistic test for acceptance.
37
Cooling the System
  • It is possible to do this in various ways.
  • Most common is geometric cooling. This simply reduces the temperature by some multiplicative factor a, where 0 < a < 1. Thus we have Tk = a × Tk-1.
  • Cooling factors are most typically in the range 0.8 to 0.99 (with a bias towards the higher end).
  • Other methods are possible, e.g. logarithmic cooling, but the rough rate of cooling is generally found to be more important than the precise means of reduction.
  • THE RATE OF COOLING MATTERS A LOT
  • Also, if you think the search isn't going well (i.e. getting stuck) then you can reheat the system too.

38
Achieving Thermal Equilibrium
  • At each temperature a number Lk of trial moves
    are investigated.
  • How big should Lk be?
  • Harder to say.
  • There is some theoretical advice on how many
    moves you need to consider, but most people
    simply experiment.
  • People want results in good time and so feel a need to take short cuts. If experiments are not giving good enough results, then greater values of Lk will be used.
  • Some researchers spend less time at higher
    temperatures.
  • Many simply make Lk constant over all k.

39
Very Basic Simulated Annealing Example
Iteration 1: do 400 trial moves
Iteration 2: do 400 trial moves
Iteration 3: do 400 trial moves
Iteration 4: do 400 trial moves
...
Iteration m: do 400 trial moves
...
Iteration n: do 400 trial moves
(Here the number of trial moves Lk is simply held constant at 400 for every temperature.)
40
Simulated Annealing
  • Simulated annealing is a tremendously simple form
    of search.
  • The theory is based on Markov chains (Markov ⇒ lack of memory property)
  • Once the search reaches a state, it doesn't matter how it got there. It effectively forgets its past.
  • To move from sk to a neighbouring state sk+1, that state must
  • Be selected for consideration, with some probability pk,k+1
  • Pass the acceptance test, with some probability qk,k+1
  • These probabilities may vary between temperatures; within a temperature cycle they are history-independent.

41
Initial State
  • Most common approach to initial state selection
    is random choice.
  • Though to overcome some of the limitations of the
    technique, multiple runs with different starting
    states may be carried out.

42
Initial Temperature
  • How do you choose the initial temperature T0?
  • We want a temperature at which a lot of moves are
    accepted.
  • One way is to progressively increase the temperature and execute an inner loop. When the acceptance rate reaches, say, 95%, we have an appropriate T0 and can begin the annealing proper.
  • Some tools progressively double the temperature as the means of determining T0 (a sketch follows below).
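A sketch of the doubling approach (the 95% acceptance target and the number of probe moves are assumptions of this sketch; names are illustrative):

# Sketch: choose T0 by doubling until most trial moves would be accepted.
import math, random

def initial_temperature(s0, cost, random_neighbour, target=0.95, probes=200):
    t, base = 1.0, cost(s0)
    while True:
        accepted = 0
        for _ in range(probes):
            delta = cost(random_neighbour(s0)) - base
            if delta < 0 or random.random() < math.exp(-delta / t):
                accepted += 1
        if accepted / probes >= target:
            return t                  # hot enough: most trial moves are accepted
        t *= 2.0                      # otherwise double the temperature and retry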

43
I want to Stop!
  • Usually time-constrained.
  • Various criteria:
  • No state change for a long time (you decide).
  • Temperature below some threshold. The following criterion (by Lundy and Mees) aims to provide a result within ε of the global optimum with probability q.
  • A real solution has been detected.

44
Neighbourhoods
  • One aspect of neighbourhood definition concerns
    rapid cost function evaluation.
  • Sometimes it is possible to calculate just the change in the cost function, for example in the Travelling Salesperson Problem (see below).

[Figure: a TSP tour before and after a move. The old tour uses edges of cost 4, 7, 5, 6, 8 and 9; the move replaces the edges of cost 8 and 9 with edges of cost 4 and 5.]
New = 4+7+5+6+(4+5) = Old + ((4+5) - (8+9)) = Old + Delta
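A sketch of this kind of delta evaluation for a 2-opt style edge exchange on a symmetric TSP tour (an assumed move type; only the two removed and two added edges contribute, so the whole tour is never re-summed):

# Sketch: O(1) delta evaluation for a 2-opt move on a symmetric TSP tour.
def two_opt_delta(tour, dist, i, j):
    """Cost change if the tour segment tour[i+1 .. j] is reversed."""
    n = len(tour)
    a, b = tour[i], tour[(i + 1) % n]    # edge (a, b) is removed
    c, d = tour[j], tour[(j + 1) % n]    # edge (c, d) is removed
    # edges (a, c) and (b, d) are added in their place
    return (dist[a][c] + dist[b][d]) - (dist[a][b] + dist[c][d])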
45
Neighbourhoods
  • You are given a set of integers S = {S1, S2, ..., Sn}
  • Can you find a subset Q of these integers that
    sums to a particular value V?

S = Q ∪ (S-Q). A move could put an element of (S-Q) into Q, or remove an element from Q.
Let the target sum V be 274. Then the current cost = (43+76+96+32+86) - 274 = 333 - 274 = 59. The delta cost is easy to calculate too (maintain the current subset sum); a sketch follows below.
[Figure: elements in the subset Q: 43, 76, 96, 32, 86. Elements not in the subset: 40, 56, 13, 97.]
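A sketch of the move and its constant-time delta evaluation, using the figures above (taking the cost as |subset sum - target|, an assumption of this sketch):

# Sketch: subset-sum local search move with incremental cost evaluation.
def cost(subset_sum, target):
    return abs(subset_sum - target)       # cost taken as |sum - target| here

def flip_element(in_subset, subset_sum, element):
    """Toggle one element's membership and return the new running sum."""
    if element in in_subset:
        in_subset.remove(element)
        return subset_sum - element
    in_subset.add(element)
    return subset_sum + element

# with the slide's data: subset {43, 76, 96, 32, 86}, target 274
in_subset, total = {43, 76, 96, 32, 86}, 43 + 76 + 96 + 32 + 86
print(cost(total, 274))                   # 333 - 274 = 59
total = flip_element(in_subset, total, 96)   # move 96 out of the subset
print(cost(total, 274))                   # |237 - 274| = 37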
46
Neighbourhoods
  • But the real point to note is YOU DEFINE THE
    NEIGHBOURHOOD.
  • Variations are clearly possible. For subset
    problem
  • We changed the in/out status of a single element
  • But we could have changed the status of 2 (3,
    4,..) such elements.
  • Of course, the number of elements in the
    neighbourhood gets larger with such k-element
    modification.
  • Need to take the cost function into account too: the changes in cost should not be too radical; we should have some degree of continuity in the neighbourhood.

47
Lack of Memory may be a Problem
  • There is nothing in standard annealing to prevent
    you going back to previously visited states
    during the search.
  • Despite its ability to accept worsening moves you
    may still get stuck in local optima.
  • Our next technique (tabu search) aims to
    incorporate memory as part of its search
    procedure. This promotes diversification.

48
Tabu Search
  • Simulated annealing theory is based on Markov
    chains
  • No memory. Once the search reaches a point, it doesn't matter how it got there.
  • Tabu search adds memory to the search with notions such as:
  • Tabu list. When a move is taken (or a state visited), it is placed on the tabu list for some number L of moves. It usually cannot be retaken for the next L moves: it is "tabu". This helps avoid cycles in the search. It is a form of short-term memory for the search. Promotes diversification.
  • Aspiration. But if taking a tabu move would give
    the best result yet then it may be taken!
    Promotes convergence.
  • Frequency. Long-term memory. Can be used to
    ensure particular move types are not taken too
    frequently over the whole search. Again, promotes
    diversification.
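A minimal tabu-search skeleton showing the three notions above; the move representation (hashable moves, an apply_move function, a candidate_moves generator) is an assumption of this sketch, and the frequency memory is recorded but not otherwise used here:

# Sketch: tabu search with a short-term tabu list and an aspiration criterion.
from collections import deque, Counter

def tabu_search(s0, fitness, candidate_moves, apply_move, tenure=3, max_iters=100):
    s, best_s, best_f = s0, s0, fitness(s0)
    tabu = deque(maxlen=tenure)               # short-term memory of recent moves
    frequency = Counter()                     # long-term memory of move usage
    for _ in range(max_iters):
        best_move, best_move_f = None, None
        for move in candidate_moves(s):
            f = fitness(apply_move(move, s))
            aspiration = f > best_f           # tabu move allowed if best seen so far
            if (move not in tabu or aspiration) and \
               (best_move is None or f > best_move_f):
                best_move, best_move_f = move, f
        if best_move is None:
            break                             # every candidate is tabu: stop
        s = apply_move(best_move, s)
        tabu.append(best_move)                # tabu for the next `tenure` iterations
        frequency[best_move] += 1
        if best_move_f > best_f:
            best_s, best_f = s, best_move_f
    return best_s, best_f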

49
Tabu Search - Example
Current fitness 10 - top 5 candidate swaps:
1. (5,4), move value 6 (taken: new fitness 16)
2. (7,4), move value 4
3. (3,6), move value 2
4. (2,3), move value 0
5. (4,1), move value -1

Current fitness 16 - top 5 candidate swaps:
1. (3,1), move value 2
2. (2,3), move value 1
3. (3,6), move value -1
4. (7,1), move value -2
5. (6,1), move value -4

Tabu length 3 for this example.
The example is a permutation problem representing the order of filter applications (a move is simply a pairwise swap) and is taken from the Tabu Search tutorial by Glover and Laguna in Modern Heuristic Techniques for Combinatorial Problems (edited by Colin Reeves), an excellent introduction to search generally.
50
Tabu Search - Example
Current fitness 14 - top 5 candidate swaps:
1. (4,5), move value 6 - Tabu
2. (5,3), move value 2
3. (7,1), move value 0
4. (1,3), move value -3 - Tabu
5. (2,6), move value -6

Current fitness 18 - top 5 candidate swaps:
1. (1,3), move value -2 - Tabu
2. (2,4), move value -4
3. (7,6), move value -6
4. (4,5), move value -7 - Tabu
5. (5,3), move value -9

At fitness 14 the best candidate, swap (4,5), is tabu, but taking it would give a fitness of 20, better than the best seen so far (18). Aspiration suggests we should take the tabu move anyhow.
51
Local Search Trajectories
  • There may be a great deal of useful information
    in the trajectory (trace) of a search.
  • We are doing guided search and each decision
    whether to move to a new state is based on
    information from the cost function landscape.
  • Why throw all this away?
  • The final result is only one of the outputs
    from a search.
  • Sometimes analysis of the trajectory may provide
    information on the desired solution, even when
    the final result is not that desired solution.
  • Failure may not be as bad as it seems!

52
Local Searches
  • Why bother with anything else?
  • The local optimum you end up in may depend very
    much on the initial starting state. If the search
    space is sufficiently large it may be very
    unlikely that a local search technique will find
    the global optimum (within any reasonable amount
    of time anyhow).
  • Some population techniques have been found to
    sample search spaces more effectively and learn
    features of high performing solutions.
  • A good deal of experimentation has shown that
    population based approaches may give better
    overall results but often only when elements of
    local search are added to do the final tuning.
    The reading spectacles of local search are
    great. Other techniques have better binoculars.
  • Other approaches may get you near the summit of
    Everest very effectively. But local search then
    gives you the oxygen and the kit to ascend to the
    peak within view.
  • A simple local search (e.g. annealing) is very
    cheap to implement. Try local search first, and
    then something more sophisticated if you need to.

53
Local Searches
  • Local searches may be used in other ways too.
  • Multiple runs may be employed.
  • If we can identify several local optima we may be
    able to search the space more effectively using
    this information.

54
Summary and Comments
  • If your problem looks like a special-purpose nail, use a special-purpose hammer. Otherwise move on.
  • Local search progressively investigates solutions
    close to the current solution. Effectively, the
    search trace is a walk around the search space,
    where only small steps can be taken.
  • But "small step" is defined by you.
  • Some steps you just want to take, and others you
    take in the hope of reward later.
  • Convergence is promoted in various ways (e.g.
    always accept improving moves, aspiration
    criterion)
  • Diversification is promoted in various ways
    (e.g. probabilistically accept non-improving
    moves, tabu lists)
  • Local search is a good place to start...