Title: Estimating Stationary Distributions of Markov Chains Modeling EAs Using the Quotient Construction Method.
Estimating Stationary Distributions of Markov Chains Modeling EAs Using the Quotient Construction Method
- Boris Mitavskiy
- School of Medicine and Kroto Research Institute
- University of Sheffield
Notation
- $\Omega$ denotes a set, usually finite, called a search space.
- $f : \Omega \to (0, \infty)$ is the fitness function.
How Does an Evolutionary Computation Algorithm Work?
An initial population $P = (x_1, x_2, \ldots, x_m)$ is chosen randomly.
Selection is performed so that we obtain a new population $P'$. In other words, all the individuals present in $P'$ are also present in $P$ (no new individuals appear in $P'$).
Example: fitness-proportional selection, where an individual $x$ in the current population is selected with probability $f(x) / \sum_{i=1}^{m} f(x_i)$.
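As a minimal Python sketch of this selection rule (the population and fitness below are made-up placeholders, not from the slides):

```python
import random

def fitness_proportional_selection(population, f, m=None):
    """Draw a new population of size m (default: same size) where each
    individual is chosen with probability f(x) / (sum of all fitnesses)."""
    m = m or len(population)
    weights = [f(x) for x in population]
    # random.choices samples with replacement, proportionally to the weights
    return random.choices(population, weights=weights, k=m)

# Toy example: individuals are bit strings, fitness = number of ones (+1
# keeps the fitness strictly positive, as the framework requires).
pop = ["0011", "1111", "0000", "1010"]
f = lambda x: x.count("1") + 1
print(fitness_proportional_selection(pop, f))
```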
Recombination
Recombination is just some probabilistic rule which sends a given population $P$ to a new population $P'$.
Recombination
In our framework we only assume the following weak purity (in the sense of Radcliffe) about recombination: a uniform population $(x, x, \ldots, x)$ is sent to itself with probability 1. In other words, uniform populations stay uniform with probability 1.
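For instance (an illustration, not from the slides), one-point crossover is pure in this sense: crossing two identical parents can only reproduce the same individual. A quick Python check:

```python
import random

def one_point_crossover(a, b):
    """One-point crossover of two equal-length bit strings."""
    point = random.randint(1, len(a) - 1)
    return a[:point] + b[point:], b[:point] + a[point:]

# Purity check: a uniform pair of parents always yields the same children,
# whatever crossover point is drawn.
x = "10110"
children = one_point_crossover(x, x)
assert children == (x, x)
print(children)
```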
Mutation
For every individual $x_i$ of the population, select a mutation transformation $T_i \in \mathcal{F}$, where $\mathcal{F}$ is the family of mutation transformations.
Replace every individual $x_i$ of the population with the individual $T_i(x_i)$.
This once again gives us a new population. Assume the mutation transformations are selected independently.
Our assumption about mutation
Suppose $(\Omega, d)$ is a metric space with an integer-valued metric $d$ ($d$ is the Hamming distance in the case of classical GAs, and this is the correct intuition to keep in mind). In practice mutation is controlled by a positive parameter $\mu$, the mutation rate. We assume the individual $y$ is obtained from the individual $x$ with probability $\Theta(\mu^{d(x,y)})$ as $\mu \to 0$.
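For classical bitwise mutation this assumption is easy to check numerically: a string $x$ of length $l$ becomes $y$ with probability $\mu^{d(x,y)} (1-\mu)^{l-d(x,y)}$, which is $\Theta(\mu^{d(x,y)})$ as $\mu \to 0$. A small sketch (the strings below are arbitrary examples):

```python
def hamming(x, y):
    """Hamming distance between two equal-length bit strings."""
    return sum(a != b for a, b in zip(x, y))

def bitwise_mutation_prob(x, y, mu):
    """Probability that independent per-bit mutation with rate mu turns x into y."""
    d = hamming(x, y)
    return mu ** d * (1 - mu) ** (len(x) - d)

# As mu -> 0 the probability behaves like mu^d: the ratio below tends to 1
# because the factor (1 - mu)^(l - d) tends to 1.
x, y = "00000", "00011"   # d(x, y) = 2
for mu in (0.1, 0.01, 0.001):
    print(mu, bitwise_mutation_prob(x, y, mu) / mu ** 2)
```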
Quotients of Markov Chains
$X$ is the state space of our irreducible Markov chain with transition probabilities $\{ p_{x \to y} \}_{x, y \in X}$.
Partition $X$ into equivalence classes $[i], [j], \ldots$
How do we define the transition probabilities among the equivalence classes?
Imagine that the chain runs for a very long time. Let $\pi$ denote the stationary distribution of this Markov chain. Suppose we run the original Markov chain for an extensive period of time. We are interested in computing the probability of reaching the class $[j]$ given that we are at the class $[i]$. An element $x \in [i]$ arises with frequency $\pi(x)$. On the other hand, a given $x$ inside of $[i]$ occurs with relative frequency $\pi(x) / \pi([i])$ among the states inside of $[i]$, where $\pi([i]) = \sum_{x \in [i]} \pi(x)$. We therefore obtain the following transition probability formula:

$\tilde{p}_{[i] \to [j]} = \sum_{x \in [i]} \frac{\pi(x)}{\pi([i])} \, p_{x \to [j]}$
where $p_{x \to [j]}$ is the probability of getting somewhere inside of $[j]$ starting from $x$. Computing $p_{x \to [j]}$ is also quite easy: $p_{x \to [j]} = \sum_{y \in [j]} p_{x \to y}$. And so we finally obtain

$\tilde{p}_{[i] \to [j]} = \sum_{x \in [i]} \sum_{y \in [j]} \frac{\pi(x)}{\pi([i])} \, p_{x \to y}$
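As a concrete sketch (not from the slides; the 4-state chain below is made up for illustration), this construction can be coded directly: given a transition matrix `P`, its stationary distribution `pi`, and a partition into classes, build the quotient matrix $\tilde{p}_{[i] \to [j]}$.

```python
import numpy as np

def quotient_chain(P, pi, classes):
    """Lump states of a Markov chain: classes is a list of lists of state
    indices. Returns the quotient matrix p~_{[i]->[j]} built from P and pi."""
    k = len(classes)
    Q = np.zeros((k, k))
    for i, ci in enumerate(classes):
        pi_ci = sum(pi[x] for x in ci)   # pi([i])
        for j, cj in enumerate(classes):
            Q[i, j] = sum(pi[x] / pi_ci * P[x, y] for x in ci for y in cj)
    return Q

# Made-up irreducible 4-state chain, lumped into the classes {0,1} and {2,3}.
P = np.array([[0.5, 0.2, 0.2, 0.1],
              [0.1, 0.5, 0.2, 0.2],
              [0.3, 0.1, 0.4, 0.2],
              [0.3, 0.3, 0.2, 0.2]])
# Stationary distribution: left eigenvector of P for eigenvalue 1.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()
Q = quotient_chain(P, pi, [[0, 1], [2, 3]])
print(Q, Q.sum(axis=1))   # rows sum to 1: Q is again a stochastic matrix
```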
It is not surprising, then, that the quotient Markov chain is also irreducible and its stationary distribution is coherent with the original one. (The irreducibility is left as an exercise.) Let $\tilde{\pi}$ denote the distribution obtained from $\pi$ as follows: for every equivalence class $[i]$ we let $\tilde{\pi}([i]) = \sum_{x \in [i]} \pi(x)$. It can be verified by direct computation that $\tilde{\pi}$ is the stationary distribution of the quotient chain.
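Continuing the sketch above (reusing the assumed `P`, `pi`, and `quotient_chain` from the previous block), this coherence claim can be checked numerically:

```python
# Continuation of the previous sketch: same P, pi, quotient_chain assumed.
classes = [[0, 1], [2, 3]]
Q = quotient_chain(P, pi, classes)

# pi~([i]) = sum of pi(x) over x in [i]
pi_tilde = np.array([sum(pi[x] for x in c) for c in classes])

# Stationarity of the quotient chain: pi~ Q = pi~
print(np.allclose(pi_tilde @ Q, pi_tilde))   # True
```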
Although the transition probabilities of the quotient chain are defined in terms of the stationary distribution of the original chain, they may still give us some useful information.
Consider the quotient Markov chain with just two states, $O$ and $\bar{O} = X \setminus O$. Notice that stationarity gives the balance equation $\tilde{\pi}(O) \, \tilde{p}_{O \to \bar{O}} = \tilde{\pi}(\bar{O}) \, \tilde{p}_{\bar{O} \to O}$, so that

$\frac{\tilde{\pi}(O)}{\tilde{\pi}(\bar{O})} = \frac{\tilde{p}_{\bar{O} \to O}}{\tilde{p}_{O \to \bar{O}}}$

or, equivalently, since $\tilde{p}_{\bar{O} \to O}$ is a $\pi$-weighted average of the probabilities $p_{x \to O}$ over $x \in \bar{O}$ (and similarly for $\tilde{p}_{O \to \bar{O}}$),

$\frac{\tilde{\pi}(O)}{\tilde{\pi}(\bar{O})} \geq \frac{\min_{x \in \bar{O}} p_{x \to O}}{\max_{x \in O} p_{x \to \bar{O}}}$
Notice that the inequality involves only the one-step transition probabilities $p_{x \to O}$ and $p_{x \to \bar{O}}$, but not the stationary distribution $\pi$! (At least, not directly.)
In summary,

$\frac{\tilde{\pi}(O)}{\tilde{\pi}(\bar{O})} = \frac{\tilde{p}_{\bar{O} \to O}}{\tilde{p}_{O \to \bar{O}}} \geq \frac{\min_{x \in \bar{O}} p_{x \to O}}{\max_{x \in O} p_{x \to \bar{O}}}$
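A quick numeric check of this bound (again reusing the made-up chain, `pi_tilde`, and classes from the earlier sketches, with $O = \{0, 1\}$ and $\bar{O} = \{2, 3\}$):

```python
# Continuation of the previous sketches: same P and pi_tilde assumed.
O, O_bar = [0, 1], [2, 3]
p_in  = min(P[x, O].sum() for x in O_bar)    # min over x in O-bar of p_{x -> O}
p_out = max(P[x, O_bar].sum() for x in O)    # max over x in O of p_{x -> O-bar}

print(p_in / p_out)                # lower bound on pi~(O) / pi~(O-bar)
print(pi_tilde[0] / pi_tilde[1])   # the true ratio, for comparison
```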
Remark
- The quotient construction method presented in the current paper is a significant improvement over the one presented here two years ago, since it drops the requirement for the two sets of states to cover the whole state space $X$. In other words, the union of the two sets need not be all of $X$.
Markov Chains Modelling EAs
- The state space of such a Markov chain is $X = \Omega^m$, where $m$ is the number of individuals in a population and $\Omega$ is the search space.
- Notice, we assume ordered populations here.
A very simple example
Assume either a recombination → mutation → selection or a mutation → recombination → selection algorithm.
This is really not much of a restriction (one can later use continuity arguments to position mutation last).
Consider a binary GA with string length $l$ and population size $m$.
Say we have uniform populations $(x, x, \ldots, x)$ and $(y, y, \ldots, y)$.
To go from the population $(x, x, \ldots, x)$ to the population $(y, y, \ldots, y)$ after a single GA cycle, we may get into a population of the form $(x, \ldots, x, y, x, \ldots, x)$ first, after mutation, and then into $(y, y, \ldots, y)$ after selection.
Getting from $(x, x, \ldots, x)$ into $(x, \ldots, x, y, x, \ldots, x)$ for some position after mutation happens with probability $\Theta(\mu^{d(x,y)})$, while getting into $(y, y, \ldots, y)$ afterwards upon completion of selection happens with probability $\left( \frac{f(y)}{f(y) + (m-1) f(x)} \right)^{m}$ in the case of fitness-proportional selection.
Recombination has no effect at all here, and so the total transition probability is at least the product of the two probabilities above. Likewise for the reverse transition from $(y, y, \ldots, y)$ to $(x, x, \ldots, x)$.
Thus, our estimate gives us a lower bound on the ratio of the stationary probabilities of the two uniform populations in terms of these quantities.
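To make the example concrete, here is a sketch under stated assumptions (not the slides' exact setup): one-bit individuals, population size 2, fitness values $f(0) = 1$ and $f(1) = 2$, bitwise mutation followed by fitness-proportional selection, and recombination omitted since it cannot alter one-bit strings. It builds the exact transition matrix of the resulting 4-state chain and compares the true stationary ratio with the quotient bound:

```python
import itertools
import numpy as np

f = {0: 1.0, 1: 2.0}   # assumed fitness values: f(1) > f(0)
mu = 0.01              # small mutation rate
states = list(itertools.product([0, 1], repeat=2))  # all ordered populations

def mutation_prob(p, q):
    """Each bit of each individual flips independently with probability mu."""
    flips = sum(a != b for a, b in zip(p, q))
    return mu ** flips * (1 - mu) ** (len(p) - flips)

def selection_prob(q, r):
    """Fill every slot of r by fitness-proportional sampling from q."""
    total = f[q[0]] + f[q[1]]
    prob = 1.0
    for chosen in r:
        prob *= sum(f[a] for a in q if a == chosen) / total
    return prob

n = len(states)
P = np.zeros((n, n))
for i, p in enumerate(states):
    for j, r in enumerate(states):
        P[i, j] = sum(mutation_prob(p, q) * selection_prob(q, r) for q in states)

# Exact stationary distribution of the 4-state chain.
w, v = np.linalg.eig(P.T)
pi = np.real(v[:, np.argmax(np.real(w))])
pi /= pi.sum()

# Quotient bound with O = {(1, 1)}, the uniform optimal population.
O = [states.index((1, 1))]
O_bar = [i for i in range(n) if i not in O]
bound = min(P[i, O].sum() for i in O_bar) / max(P[i, O_bar].sum() for i in O)
true_ratio = pi[O].sum() / pi[O_bar].sum()
print(bound, true_ratio)   # bound <= true_ratio
```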
A Result from CEC2007
We then obtain a bound in terms of the maximal probability of not obtaining the target population starting with the given population.
Results from an upcoming paper
This is done in terms of the following parameters of the given EA:
To bound the desired ratio from below, consider the following parameters:
Summarizing the above, we obtain the two bounds, from which we then deduce the desired estimate.
The bound can be improved in the case when selection follows mutation, by observing that upon completion of mutation one still has to get away from the current uniform population by selecting the mutated individual at least once.
We then obtain the improved bound.
Newest developments
Although the bounds obtained above are often poor, they are, nevertheless, the first rigorous bounds of this type available so far. The lumping quotient method presented here may be pushed further as follows.
Recall we used the balance equation $\tilde{\pi}(O) \, \tilde{p}_{O \to \bar{O}} = \tilde{\pi}(\bar{O}) \, \tilde{p}_{\bar{O} \to O}$, which can be rewritten in an equivalent form. In the previous bounds we completely ignored one of the terms appearing in it. It turns out we can obtain better bounds by applying the same equation to the complementary class and then substituting into the above.
Solving for $\tilde{\pi}(O)$, we finally obtain the improved bound.
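As a sketch of the kind of computation this refers to (under the two-class setup from the earlier slides; the slides' own refinement may differ in detail), the balance equation together with $\tilde{\pi}(O) + \tilde{\pi}(\bar{O}) = 1$ solves explicitly:

$\tilde{\pi}(O) \, \tilde{p}_{O \to \bar{O}} = \bigl(1 - \tilde{\pi}(O)\bigr) \, \tilde{p}_{\bar{O} \to O} \;\Longrightarrow\; \tilde{\pi}(O) = \frac{\tilde{p}_{\bar{O} \to O}}{\tilde{p}_{O \to \bar{O}} + \tilde{p}_{\bar{O} \to O}}$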
At SEAL06 we managed to establish a comparable result (using inferior tools). Considering the limit as $\mu \to 0$ means that we select a high enough selection pressure depending on the selected small mutation rate $\mu$.
Thank you very much for your attention!
Questions?