Biologically Inspired Computing: Evolutionary Algorithms: Hillclimbing, Landscapes, Neighbourhoods - PowerPoint PPT Presentation

Slides: 65

Provided by: Profess52

Transcript and Presenter's Notes

1
Biologically Inspired Computing: Evolutionary
Algorithms: Hillclimbing, Landscapes,
Neighbourhoods

This is lecture 4 of
Biologically Inspired Computing
Contents:
hillclimbing, local search, landscapes

2
Back to Basics

With your thirst for seeing example EAs
temporarily quenched, the story now skips to
simpler optimization algorithms.
1. Hillclimbing
2. Local Search
These have much in common with EAs, but with no
population they just use a single solution and
keep trying to improve it.
HC and LS can be very effective algorithms when
appropriately engineered. But by looking at them,
we will discover certain limitations, and this
leads us directly to algorithm design strategies
that look like EAs. That is, we can arrive at
EAs from an algorithm design route, not just a
bio-inspiration route.

3
The Travelling Salesperson Problem (also see
http://francisshanahan.com/tsa/tsaGAworkers.htm )

An example (hard) problem, for illustration

The Travelling Salesperson Problem: Find the
shortest tour through the cities.
      A    B    C    D    E
A     -    5    7    4   15
B     5    -    3    4   10
C     7    3    -    2    7
D     4    4    2    -    9
E    15   10    7    9    -
The one below is length 33
[Figure: a tour through the cities A, B, C, D and E]
4
Simplest possible EA: Hillclimbing

0. Initialise: Generate a random solution c;
   evaluate its fitness, f(c). Call c the
   current solution.
1. Mutate a copy of the current solution; call
   the mutant m. Evaluate the fitness of m, f(m).
2. If f(m) is no worse than f(c), then replace c
   with m; otherwise do nothing (effectively
   discarding m).
3. If a termination condition has been reached,
   stop. Otherwise, go to 1.
Note: No population (well, population of 1).
This is a very simple version of an EA, although
it has been around for much longer.
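
The hillclimbing loop can be sketched in Python roughly as follows. This is a minimal sketch, not the lecturer's code: the names `hillclimb`, `fitness`, `mutate`, `toy_fitness` and `toy_mutate` are made up for illustration, a fixed iteration budget stands in for the termination condition, and fitness is treated as a cost to be minimised (as with tour length in the TSP example).

```python
import random

def hillclimb(initial, fitness, mutate, iterations=1000, rng=random):
    """Hillclimbing for a minimisation problem (lower fitness is better).

    `fitness` and `mutate` are caller-supplied functions (hypothetical names);
    a fixed iteration budget is an assumed termination condition.
    """
    current = initial
    f_current = fitness(current)           # step 0: evaluate the initial solution
    for _ in range(iterations):            # step 3: stop when the budget runs out
        mutant = mutate(current, rng)      # step 1: mutate a copy of current
        f_mutant = fitness(mutant)
        if f_mutant <= f_current:          # step 2: accept if no worse
            current, f_current = mutant, f_mutant
    return current, f_current

# Toy usage: minimise (x - 3)^2 over the integers with +/-1 mutations.
def toy_fitness(x):
    return (x - 3) ** 2

def toy_mutate(x, rng):
    return x + rng.choice([-1, 1])

best, best_f = hillclimb(20, toy_fitness, toy_mutate, iterations=500,
                         rng=random.Random(0))
```

Accepting mutants that are "no worse" (rather than strictly better) lets the search drift across plateaus of equal fitness.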
5
Why Hillclimbing?

Suppose that solutions are lined up along the x
axis, and that mutation always gives you a nearby
solution. Fitness is on the y axis; this is a
landscape.

[Figure: a 1D landscape with points labelled 1 to 10 in the order they are visited]

1. Initial solution
2. Rejected mutant
3. New current solution
4. New current solution
5. New current solution
6. New current solution
7. Rejected mutant
8. Rejected mutant
9. New current solution
10. Rejected mutant

6
Example: HC on the TSP

We can encode a candidate solution to the TSP as
a permutation.

Here is our initial random solution, ACEDB,
with fitness 32.

[Figure: the current solution, the tour ACEDB]
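
To make the fitness calculation concrete, here is a small sketch that computes the length of a tour given as a permutation. The distances are the ones from the slides; the names `DIST`, `distance` and `tour_length` are illustrative choices, not part of the lecture.

```python
# Symmetric distances from the slides, stored once per unordered pair.
DIST = {
    ("A", "B"): 5, ("A", "C"): 7, ("A", "D"): 4, ("A", "E"): 15,
    ("B", "C"): 3, ("B", "D"): 4, ("B", "E"): 10,
    ("C", "D"): 2, ("C", "E"): 7,
    ("D", "E"): 9,
}

def distance(a, b):
    """Look up the distance between two cities in either order."""
    return DIST[(a, b)] if (a, b) in DIST else DIST[(b, a)]

def tour_length(tour):
    """Length of the closed tour: consecutive legs plus the return leg."""
    return sum(distance(tour[i], tour[(i + 1) % len(tour)])
               for i in range(len(tour)))

print(tour_length("ACEDB"))  # 7 + 7 + 9 + 4 + 5 = 32
```

Note that "fitness" here is a cost: a shorter tour is a better one.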
7
Example: HC on the TSP

We can encode a candidate solution to the TSP as
a permutation.

[Figure: the current solution and a copy of it]
8
HC on the TSP

We randomly mutate it (swap randomly
chosen adjacent nodes) from ACEDB to
ACDEB, which has fitness 33, so Current stays
the same because we reject this mutant.

[Figure: the current solution ACEDB and the mutant ACDEB]
9
HC on the TSP

We now try another mutation of Current (swap
randomly chosen adjacent nodes) from ACEDB to
CAEDB. Fitness is 38, so we reject that too.

[Figure: the current solution ACEDB and the mutant CAEDB]
10
HC on the TSP

Our next mutant of Current is from ACEDB
to AECDB. Fitness 33; reject this too.

[Figure: the current solution ACEDB and the mutant AECDB]
11
HC on the TSP

Our next mutant of Current is from ACEDB
to ACDEB. Fitness 33; reject this too.

[Figure: the current solution ACEDB and the mutant ACDEB]
12
HC on the TSP

Our next mutant of Current is from ACEDB
to ACEBD. Fitness is 32. Equal to Current, so
this becomes the new Current.

[Figure: the current solution ACEDB and the mutant ACEBD]
13
HC on the TSP

ACEBD is our Current solution, with fitness 32.

[Figure: the current solution ACEBD]
14
HC on the TSP

ACEBD is our Current solution, with fitness
32. We mutate it to DCEBA (note that first and
last are adjacent nodes); fitness is 28. So this
becomes our new current solution.

[Figure: the current solution ACEBD and the mutant DCEBA]
15
HC on the TSP

Our new Current, DCEBA, with fitness 28.

[Figure: the current solution DCEBA]
16
HC on the TSP

Our new Current is DCEBA, with fitness 28. We
mutate it, this time getting DCEAB, with
fitness 33, so we reject that and DCEBA is still
our Current solution.

[Figure: the mutant DCEAB and the current solution DCEBA]
17
And so on

18
HC again, on a 1D landscape
[Figures on slides 18 to 26: fitness on the y axis, x (the genotype) on the x axis]
19
Initial random point
20
Mutant
21
Current solution (unchanged)
22
Next mutant
23
New current solution
24
Next mutant
25
New current solution
26
Etc.
27
How will HC do on this landscape?
28
Some other landscapes
29-32
(No Transcript)
33
Landscapes

Recall S, the search space, and f(s), the
fitness of a candidate in S.

[Figure: f(s) on the y axis; members of S lined up along the x axis]
The structure we get by imposing f(s) on S is
called a landscape.
34
Neighbourhoods

Recall S, the search space, and f(s), the
fitness of a candidate in S

[Figure: f(s) plotted over candidate solutions 0 to 17 on the x axis]
Showing the neighbourhoods of two candidate
solutions, assuming the mutation operator adds a
random number between -1 and 1.
35
Neighbourhoods

Recall S, the search space, and f(s), the
fitness of a candidate in S

[Figure: f(s) plotted over candidate solutions 0 to 17 on the x axis]
Showing the neighbourhood of a candidate
solution, assuming the mutation operator adds a
random integer between -2 and 2.
36
Neighbourhoods

Recall S, the search space, and f(s), the
fitness of a candidate in S

[Figure: f(s) plotted over candidate solutions 0 to 17 on the x axis]
Showing the neighbourhood of a candidate
solution, assuming the mutation operator simply
changes the solution to a new random number
between 0 and 20.
37
Neighbourhoods

Recall S, the search space, and f(s), the
fitness of a candidate in S

[Figure: f(s) plotted over candidate solutions 0 to 17 on the x axis]
Showing the neighbourhood of a candidate
solution, assuming the mutation operator adds a
Gaussian(ish) random number with mean zero.
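
The four mutation operators from the Neighbourhoods slides can be written down directly; each one implies a different neighbourhood for the same solution. This is only a sketch: the function names are invented here, and the Gaussian step size `sigma` is an assumed parameter that the slides do not specify.

```python
import random

def mutate_uniform(s, rng):
    """Add a random real number between -1 and 1."""
    return s + rng.uniform(-1.0, 1.0)

def mutate_integer(s, rng):
    """Add a random integer between -2 and 2 (inclusive, so 0 is possible)."""
    return s + rng.randint(-2, 2)

def mutate_random_restart(s, rng):
    """Ignore s entirely and return a new random number between 0 and 20."""
    return rng.uniform(0.0, 20.0)

def mutate_gaussian(s, rng, sigma=1.0):
    """Add Gaussian noise with mean zero; sigma is an assumed step size."""
    return s + rng.gauss(0.0, sigma)
```

With the third operator every point is a neighbour of every other point, so the search degenerates into random sampling; with the Gaussian operator nearby points are likely and distant ones are possible but rare.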
38
Neighbourhoods

Let s be an individual in S, f(s) our
fitness function, and M our mutation operator,
so that M(s1) → s2, where s2 is a mutant of s1.

Given M, we can usually work out the
neighbourhood of an individual point s: the
neighbourhood of s is the set of all possible
mutants of s.

E.g. Encoding: permutations of k objects (e.g. for k-city TSP)
Mutation: swap any adjacent pair of objects.
Neighbourhood: each individual has k neighbours.
E.g. neighbours of EABDC are AEBDC, EBADC, EADBC, EABCD, CABDE.

E.g. Encoding: binary strings of length L (e.g. L-item 2-bin-packing)
Mutation: choose a bit randomly and flip it.
Neighbourhood: each individual has L neighbours.
E.g. neighbours of 00110 are 10110, 01110, 00010, 00100, 00111.
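
Both neighbourhoods can be enumerated in a few lines. The sketch below assumes solutions are plain Python strings, and the function names are illustrative; for the permutation case the first and last positions are treated as adjacent, as in the TSP walk-through.

```python
def swap_neighbours(perm):
    """All permutations reachable by swapping one adjacent pair.

    The first and last positions count as adjacent (the tour is a cycle),
    so a permutation of k objects has k neighbours.
    """
    k = len(perm)
    result = []
    for i in range(k):
        j = (i + 1) % k
        items = list(perm)
        items[i], items[j] = items[j], items[i]
        result.append("".join(items))
    return result

def bitflip_neighbours(bits):
    """All binary strings reachable by flipping exactly one bit."""
    return [bits[:i] + ("1" if b == "0" else "0") + bits[i + 1:]
            for i, b in enumerate(bits)]

print(swap_neighbours("EABDC"))     # ['AEBDC', 'EBADC', 'EADBC', 'EABCD', 'CABDE']
print(bitflip_neighbours("00110"))  # ['10110', '01110', '00010', '00100', '00111']
```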
39
Landscape Topology

Mutation operators lead to slight changes in the
solution, which tend to lead to slight changes in
fitness.
Why are big mutations generally a bad idea to
have in a search algorithm?

40
Typical Landscapes
[Figure: f(s) on the y axis; members of S lined up along the x axis]
Typically, with large (realistic) problems, the
huge majority of the landscape has very poor
fitness; there are tiny areas where the decent
solutions lurk. So, big random changes are very
likely to take us outside the nice areas.
41
Typical Classes of Landscapes

[Figure panels: Plateau, Unimodal, Multimodal, Deceptive]
As we home in on the good areas, we can identify
broad types of landscape feature. Most
landscapes of interest are predominantly
multimodal. Despite being locally smooth, they
are globally rugged.
42
Beyond Hillclimbing

HC clearly has problems with typical landscapes.
There are two broad ways to improve HC, from the
algorithm viewpoint:
1. Allow downhill moves: a family of methods called
   Local Search does this in various ways.
2. Have a population, so that different regions can
   be explored inherently in parallel; i.e. we keep
   poor solutions around and give them a chance to
   develop.

43
Local Search

Initialise: Generate a random solution c;
evaluate its fitness, f(c) = b; call c the
current solution, and call b the best so far.
Repeat until termination condition reached:
1. Search the neighbourhood of c, and choose one
   neighbour, m. Evaluate the fitness of m; call that x.
2. According to some policy, maybe replace c with
   m, and update c and b as appropriate.

E.g. Monte Carlo search:
1. Same as hillclimbing.
2. If x is better, accept m as the new current
   solution; if x is worse, accept it with some
   probability (e.g. 0.1).
E.g. tabu search:
1. Evaluate all immediate neighbours of c.
2. Choose the best from (1) to be the next
   current solution, unless it is tabu (recently
   visited), in which case choose the next best, etc.
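
The Monte Carlo variant might be sketched as below. As before this is an illustration under assumptions rather than a reference implementation: the function and parameter names are invented, fitness is minimised, the termination condition is a fixed iteration budget, and mutants of equal fitness are accepted (as in hillclimbing), which the slide leaves open.

```python
import random

def monte_carlo_search(initial, fitness, mutate, p_accept_worse=0.1,
                       iterations=1000, rng=random):
    """Local search for minimisation that sometimes accepts worse mutants."""
    current, f_current = initial, fitness(initial)
    best, f_best = current, f_current          # b: the best seen so far
    for _ in range(iterations):
        mutant = mutate(current, rng)          # choose one neighbour, m
        x = fitness(mutant)
        # Policy: always accept a mutant that is no worse; accept a worse
        # one with probability p_accept_worse (0.1 in the slide's example).
        if x <= f_current or rng.random() < p_accept_worse:
            current, f_current = mutant, x
        if f_current < f_best:                 # keep track of the best so far
            best, f_best = current, f_current
    return best, f_best
```

Because the current solution can now get worse, the best-so-far has to be stored separately; that is the role of b in the slide.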
44
Population-Based Search

Local search is fine, but it tends to get stuck in
local optima; less so than HC, but it still gets
stuck.
In PBS, we no longer have a single current
solution; we now have a population of them. This
leads directly to the two main algorithmic
differences between PBS and LS:
1. Which of the set of current solutions do we
   mutate? We need a selection method.
2. With more than one solution available, we
   needn't just mutate; we can recombine,
   crossover, etc. two or more current solutions.
So this is an alternative route towards
motivating our nature-inspired EAs, and it also
starts to explain why they turn out to be so
good.
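
A bare-bones population-based search shows where selection and recombination slot in. Everything specific here is an assumption made for the sake of a runnable sketch: binary tournament selection, steady-state replacement of the worst member, minimisation, and the names `population_search`, `mutate` and `crossover`.

```python
import random

def population_search(fitness, mutate, crossover, population,
                      generations=100, rng=random):
    """Steady-state population-based search for minimisation (a sketch)."""
    population = list(population)

    def select():
        # Selection: binary tournament, one of many possible methods.
        a, b = rng.sample(population, 2)
        return a if fitness(a) <= fitness(b) else b

    for _ in range(generations):
        # Recombine two selected parents, then mutate the child.
        child = mutate(crossover(select(), select(), rng), rng)
        # Replacement: the child takes the place of the worst member
        # if it is no worse than that member.
        worst = max(range(len(population)),
                    key=lambda i: fitness(population[i]))
        if fitness(child) <= fitness(population[worst]):
            population[worst] = child
    return min(population, key=fitness)
```

With a population of 1 and no crossover this collapses back into something very like hillclimbing, which is the point of the slide.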

45
An extra bit about Encodings:
Direct vs Indirect

46
Encoding / Representation

Maybe the main issue in (applying) EC.
Note that:
Given an optimisation problem to solve, we need
to find a way of encoding candidate solutions.
There can be many very different encodings for
the same problem.
Each way affects the shape of the landscape and
the choice of best strategy for climbing that
landscape.

47
E.g. encoding a timetable I
         mon      tue    wed    thur
 9:00    E4, E5   E2            E3, E7
11:00    E8
 2:00             E6
 4:00    E1

4, 5, 13, 1, 1, 7, 13, 2
Exam1 in 4th slot
Exam2 in 5th slot
Etc.

Generate any string of 8 numbers between 1 and 16,
and we have a timetable!
Fitness may be based on <clashes>, <consecs>, etc.
Figure out an encoding, and a fitness function, and
you can try to evolve solutions.
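
Decoding this direct encoding is mechanical. The sketch below assumes the slot numbering implied by the slide's table (slots 1 to 4 are Monday's four times, 5 to 8 Tuesday's, and so on); the names `DAYS`, `TIMES` and `decode_direct` are illustrative.

```python
DAYS = ["mon", "tue", "wed", "thur"]
TIMES = ["9:00", "11:00", "2:00", "4:00"]

def decode_direct(genome):
    """Map gene i (the slot for exam i+1) to a (day, time) cell.

    Slots are numbered 1..16 down each day's column, which is the
    numbering the slide's table appears to use (an inference).
    """
    timetable = {}
    for exam, slot in enumerate(genome, start=1):
        day, time = divmod(slot - 1, len(TIMES))
        timetable.setdefault((DAYS[day], TIMES[time]), []).append(f"E{exam}")
    return timetable

timetable = decode_direct([4, 5, 13, 1, 1, 7, 13, 2])
for cell, exams in sorted(timetable.items()):
    print(cell, exams)
```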

48
Mutating a Timetable with Encoding 1
         mon      tue    wed    thur
 9:00    E4, E5   E2            E3, E7
11:00    E8
 2:00             E6
 4:00    E1

4, 5, 13, 1, 1, 7, 13, 2
Using straightforward single-gene mutation:
choose a random gene.
49
Mutating a Timetable with Encoding 1
         mon      tue    wed    thur
 9:00    E4, E5   E2            E7
11:00    E8       E3
 2:00             E6
 4:00    E1

4, 5, 6, 1, 1, 7, 13, 2
Using straightforward single-gene mutation:
one mutation changes the position of one exam.
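
Single-gene mutation on this encoding is a one-liner in spirit. A minimal sketch, assuming 16 slots and an invented function name:

```python
import random

def mutate_single_gene(genome, n_slots=16, rng=random):
    """Return a copy of genome with one randomly chosen gene set to a random slot.

    The new slot is drawn from 1..n_slots and may equal the old one, in
    which case the mutant is identical to its parent.
    """
    mutant = list(genome)                      # work on a copy
    position = rng.randrange(len(mutant))      # choose a random gene
    mutant[position] = rng.randint(1, n_slots) # move that exam to a random slot
    return mutant

parent = [4, 5, 13, 1, 1, 7, 13, 2]
child = mutate_single_gene(parent, rng=random.Random(3))
```

In the slide's example the third gene changes from 13 to 6, which moves exam 3 from Thursday 9:00 to Tuesday 11:00.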
50
Alternative ways to do it

This is called a direct encoding. Note that:
A random timetable is likely to have lots of
clashes.
The EA is likely (?) to spend most of its time
crawling through clash-ridden areas of the search
space.
Is there a better way?

51
E.g. encoding a timetable II
mon tue wed thur
900 E4 E7
1100 E8 E3
200 E6, E5, E2
400 E1

4, 5, 13, 1, 1, 7, 13, 2

Use the 4th clash-free slot for exam1
Use the 5th clash-free slot for exam2 (clashes
with E4 and E8)
Use the 13th clash-free slot for exam3 (clashes
with E1)
Etc.
52
So, a common approach is to build an encoding
around an algorithm that builds a solution

Don't encode a candidate solution directly;
instead, encode parameters/features for a
constructive algorithm that builds a candidate
solution.
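
For the timetable, such a constructive decoder might look like the sketch below: gene k for an exam means "use the k-th slot that is clash-free for this exam, given the exams already placed". Several details are assumptions, since the slides do not spell them out: exams are placed in order 1, 2, 3, ..., the count wraps around if k exceeds the number of clash-free slots, and the clash pairs are illustrative data taken from the slide's annotations. It does not attempt to reproduce the slide's table.

```python
def decode_indirect(genome, clash_pairs, n_slots=16):
    """Build a timetable in which clashing exams never share a slot.

    genome[i] = k means: put exam i+1 in the k-th clash-free slot.
    clash_pairs lists unordered pairs of exams that must not coincide.
    Wrapping with modulo when k is too large is an assumption.
    """
    slot_of = {}
    for exam, k in enumerate(genome, start=1):
        # Exams that clash with this one.
        rivals = {a if b == exam else b
                  for a, b in clash_pairs if exam in (a, b)}
        # Slots already occupied by a clashing exam are off limits.
        taken = {slot_of[r] for r in rivals if r in slot_of}
        free = [s for s in range(1, n_slots + 1) if s not in taken]
        if not free:
            raise ValueError(f"no clash-free slot left for exam {exam}")
        slot_of[exam] = free[(k - 1) % len(free)]
    return slot_of

# Illustrative clashes: E3 with E1, and E2 with both E4 and E8.
CLASHES = [(3, 1), (2, 4), (2, 8)]
slots = decode_indirect([4, 5, 13, 1, 1, 7, 13, 2], CLASHES)
```

Every genome now decodes to a clash-free timetable, so the search never wastes time in clash-ridden regions; the price is that the effect of changing one gene is harder to predict.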
53

An aside, for study in your own time

54
Encoding a human
acgtctctcactcgagctactactccattctcctagg ccgcgagcgcga
gcgcggttatctagctccttagc caatctctagtagtctcagcgcttta
ctactcacgca agagctcagcggcattataaattctatctcatctcatt
agagccgaacgagccggatactcgatcgatcgac ttcgaccgacgcgg
gaggcttcatagcatcgatcg
The nucleus contains our DNA. This is a very big
molecule resembling a long chain; actually it is
cut into bits called chromosomes, but for our
purposes we can usually think of it as one big
molecule, and in fact we can think of it as a
long string of letters from the set {A, C, G, T};
in humans, about 3,000,000,000 letters.
55
How the encoding works: Genes
CATCGGCTTATCTAGCTAATCGAGCTCTCTGAAGAGAAATATCATCTACG
ACTACTACGACACACATCGACGAGGCATC
Certain sections of DNA are genes. In most
organisms, they are few and far between (e.g. 3%
of the genome), so the distance above is
misleading, although they often come in clusters
close together.
You can think of a cell as a protein factory. Each
gene is a blueprint for a protein, which gets
manufactured in the cell, and then goes and does
some job elsewhere in the body, or maybe in the
same cell. A gene specifies how to make a
specific protein, using the materials typically
found inside the cell (amino acids).
56
From Genes to Proteins
This is what seems to happen
ACTCGCGATCGAGCTACGAGACTCATGCAGCTATGAC
First, owing to something about the shape and
chemical properties of the DNA molecule near the
start of the gene, a specific collection of
proteins in the cell are attracted to this region.
57
Genes → Proteins
ACTCGCGATCGAGCTACGAGACTCATGCAGCTATGAC
In turn, this collection attracts a big thing
(actually a collection of proteins itself) called
RNA polymerase.
58
Genes → Proteins
RNA
CUAGCUCGA
ACTCGCGATCGAGCTACGAGACTCATGCAGCTATGAC
The RNA polymerase then travels along the gene,
and on the way (via complex and elegant
biochemical gymnastics) it makes an RNA
transcript of the gene. This is a kind of
negative of the gene, carrying all of the
information in it. The RNA polymerase falls off
at the end of the gene, owing to something about
the pattern of nucleotides it finds there. RNA
is similar to DNA, but uses U in place of T.
59
Genes → Proteins
ACTCGCGATCGAGCTACGAGACTCATGCAGCTATGAC
ribosome
CUAGCUCGAUGCUCUGAGUACGUC
The RNA transcript then finds its way out of the
nucleus, and is attracted to a ribosome (which
is, you guessed it, made of a collection of
proteins).
The ribosome then manufactures a protein by, put
most simply and rather misleadingly, attaching
amino acids to the RNA transcript. There are 20
different amino acids floating about in the cell,
and which are attached where depends on 3-letter
sequences of RNA called codons. Note: if your
diet lacks the essential amino acids (those your
body cannot make itself), there aren't enough of
them in your cells to make certain key proteins.
60
Genes → Proteins
RNA transcript
CUAGCUCGAUGCUCUGAGUACGUCUAG
L A R C S E Y V stop
The ribosome translates each 3-letter codon
into a specific amino acid; this mapping is
called the genetic code.
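
The codon-by-codon translation of the slide's transcript can be checked with a tiny lookup. The table below is deliberately partial: it holds only the nine codons that appear in this example, whereas the full genetic code has 64. The names `CODON_TABLE` and `translate` are illustrative.

```python
# Partial codon table: only the codons that occur in the slide's example.
CODON_TABLE = {
    "CUA": "L", "GCU": "A", "CGA": "R", "UGC": "C",
    "UCU": "S", "GAG": "E", "UAC": "Y", "GUC": "V",
    "UAG": "stop",
}

def translate(rna):
    """Translate an RNA string three letters at a time until a stop codon."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "stop":
            break
        protein.append(amino_acid)
    return "".join(protein)

print(translate("CUAGCUCGAUGCUCUGAGUACGUCUAG"))  # LARCSEYV
```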
61
Protein Structure
L A R C S E Y V
[Figure: chemical structure of the protein backbone (C, N, O and H atoms) with residues such as L, A and R attached]
A protein has a backbone of carbon and nitrogen
atoms, with an amino acid residue attached to
every second carbon along it. The sequence of
amino acids determines the structure of the
protein.
62
Protein Structure II
[Figure: the backbone, with the sequence of amino acid residues attached along it]
This long chain (e.g. around 300-2000 atoms),
with amino acids strung along it like beads,
instantaneously folds into a 3D structure,
governed by the chemical and electrostatic
environment. The resulting structure is a
protein.
Notice how the information contained in the gene
is now reflected in the sequence of amino acids,
which in turn determines the 3D structure and
properties of the protein.
63
These proteins, and a few other things, then
interact in complex ways, and build you.
64

That was an example of an indirect encoding.
Some common terminology: the genotype is the data
structure (e.g. a vector of real numbers), and
the phenotype is the solution interpreted from the
genotype (e.g. a design for a two-phase jet
nozzle).
The genotype → phenotype mapping is an important
part of the encoding. Sometimes it is direct
(i.e. simple), sometimes it is indirect
(involving many complex steps).