1
Evolving the Structure of Hidden Markov Models
  • A paper from IEEE Transactions on Evolutionary Computation
  • Presented by Cyrus

2
Outline
  • Background and Motivation
  • The Proposed Method
  • Results and Analysis
  • Comments

3
Outline
  • Background and Motivation
  • The Proposed Method
  • Results and Analysis
  • Comments

4
Hidden Markov Models (HMMs)
  • probabilistic finite-state machines used to find the underlying
    structure of sequential data
  • defined by a set of states, the transition probabilities between
    states, and a table of emission probabilities associated with
    each state, covering all possible symbols (e.g., A, T, C, G in
    DNA) that occur in the sequence
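
As a concrete reference for this definition, here is a minimal sketch
of the three ingredients in Python; the class name HMM and the random
Dirichlet initialization are illustrative choices, not from the paper.

import numpy as np

ALPHABET = "ATCG"  # the possible emitted symbols (DNA)

class HMM:
    """A minimal HMM: states, transition probabilities, emission tables."""
    def __init__(self, n_states, seed=0):
        rng = np.random.default_rng(seed)
        # transition probabilities between states (each row sums to 1)
        self.A = rng.dirichlet(np.ones(n_states), size=n_states)
        # emission table per state over the alphabet (each row sums to 1)
        self.B = rng.dirichlet(np.ones(len(ALPHABET)), size=n_states)
        # initial state distribution
        self.pi = rng.dirichlet(np.ones(n_states))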

5
HMMs: Illustration
  • An example for DNA sequences

[Figure: an HMM over the DNA alphabet; each state carries an emission
table (e.g., C 0.6 / T 0.4; A 0.25 / T 0.5 / C 0.25; A 0.9 / T 0.1;
A 0.05 / C 0.05 / G 0.9; G 1.0). A hidden state path S1 S2 ... Sn
emits the observed symbols x1 x2 ... xn; different paths such as
S1 S2 S3 ... and S1 S2 S1 S2 ... generate sequences like AGTCG,
AGCTG, and AGTGC.]
6
HMMs: Given the Model and Parameters
  • We can calculate:
  • the probability of an observed sequence
    (forward-backward algorithm)
  • the most likely sequence of hidden states that could have
    generated a given output sequence
    (Viterbi algorithm: a dynamic programming algorithm)
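
A minimal sketch of the Viterbi algorithm in log space; it assumes the
A, B, pi arrays from the earlier HMM sketch and strictly positive
probabilities (real code would guard against log(0)).

import numpy as np

def viterbi(A, B, pi, obs):
    """Most likely hidden-state path for obs (a list of symbol indices)."""
    n, T = A.shape[0], len(obs)
    logA, logB, logpi = np.log(A), np.log(B), np.log(pi)
    delta = np.zeros((T, n))             # best log-probability per state
    back = np.zeros((T, n), dtype=int)   # backpointers
    delta[0] = logpi + logB[:, obs[0]]
    for t in range(1, T):
        scores = delta[t - 1][:, None] + logA   # scores[i, j]: i -> j
        back[t] = scores.argmax(axis=0)
        delta[t] = scores.max(axis=0) + logB[:, obs[t]]
    path = [int(delta[-1].argmax())]     # backtrack from best final state
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]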

7
HMMs: Parameter Estimation
  • Given a set of output sequences
  • The modeling problem: how to determine the transition and
    emission probabilities?
  • Baum-Welch algorithm
  • Update the parameters to maximize the model likelihood
  • An expectation-maximization (EM) method: iterate until
    convergence
  • Briefly introduced next

8
Baum-Welch Algorithm
  • HMM parameter estimation
  • Forward variable: α_t(i) = P(x_1 ... x_t, s_t = i | λ)
  • Backward variable: β_t(i) = P(x_{t+1} ... x_T | s_t = i, λ)
  • Both are computed recursively, exploiting the Markovian nature
    of the model, and will be used for the parameter calculation
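
A sketch of the two standard recursions, assuming the arrays from the
earlier HMM sketch; the recursions are left unscaled for clarity
(production code rescales each step to avoid numerical underflow).

import numpy as np

def forward(A, B, pi, obs):
    """alpha[t, i] = P(x_1 .. x_t, s_t = i); P(obs) = alpha[-1].sum()."""
    T, n = len(obs), A.shape[0]
    alpha = np.zeros((T, n))
    alpha[0] = pi * B[:, obs[0]]
    for t in range(1, T):                            # each step uses only
        alpha[t] = (alpha[t - 1] @ A) * B[:, obs[t]] # t-1: Markov property
    return alpha

def backward(A, B, obs):
    """beta[t, i] = P(x_{t+1} .. x_T | s_t = i)."""
    T, n = len(obs), A.shape[0]
    beta = np.ones((T, n))
    for t in range(T - 2, -1, -1):
        beta[t] = A @ (B[:, obs[t + 1]] * beta[t + 1])
    return beta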
9
Update Rules
  • An EM procedure: the parameters that maximize the likelihood
    value are calculated iteratively
  • Update of the transition probability:
    a_ij ← A_ij / Σ_j' A_ij'
  • where A_ij is the expected number of transitions from state i
    to state j, summed over the sequences
10
Update Rules
  • Update of the emission probability:
    e_i(a) ← E_i(a) / Σ_a' E_i(a')
  • where E_i(a) is the expected number of times the symbol a is
    emitted when in state i
  • In Baum-Welch, the unknown transition and emission frequencies
    are replaced by their expected values
11
The Procedure
  • The parameters are re-estimated by the update
    rules
  • The procedure is iterated until some convergence
    criterion is met
  • Pseudo-counts are used to avoid excessive
    over-fitting
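
The whole procedure, sketched as one re-estimation pass; it reuses
forward() and backward() from the earlier sketch, and the pseudo-count
value 0.1 is an illustrative choice, not the paper's.

import numpy as np

def baum_welch_step(A, B, pi, seqs, pseudo=0.1):
    """One re-estimation pass over all sequences; returns new (A, B, pi)."""
    n, m = B.shape
    A_num = np.full((n, n), pseudo)    # expected transition counts
    B_num = np.full((n, m), pseudo)    # expected emission counts
    pi_num = np.full(n, pseudo)
    for obs in seqs:
        alpha, beta = forward(A, B, pi, obs), backward(A, B, obs)
        p_obs = alpha[-1].sum()
        gamma = alpha * beta / p_obs   # P(s_t = i | obs)
        for t in range(len(obs) - 1):  # expected transitions (xi)
            A_num += np.outer(alpha[t], B[:, obs[t + 1]] * beta[t + 1]) * A / p_obs
        for t, x in enumerate(obs):    # expected emissions
            B_num[:, x] += gamma[t]
        pi_num += gamma[0]
    # normalize the expected counts into probabilities (the update rules)
    return (A_num / A_num.sum(axis=1, keepdims=True),
            B_num / B_num.sum(axis=1, keepdims=True),
            pi_num / pi_num.sum())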

12
What Architecture to Choose?
  • Usually expert knowledge is needed to design the HMM structure
    manually

13
Automatic discovery of HMM structures
  • Some significant attractions:
  • Allows the data to speak for themselves, removing the
    requirement for experts
  • Possibility of finding completely novel structures, free from
    theoretical prejudice
  • Automation allows many more structures to be tested than is
    possible with manual design

14
Motivation
  • The aim of the paper:
  • utilize genetic algorithms (GAs) to gain the advantage of
    automatic HMM structure discovery
  • retain some of the benefits of a hand-designed architecture
    for biological sequence analysis
  • HMMs have received little attention from the evolutionary
    computing community
  • Evolving the architecture of HMMs with a GA is novel

15
Outline
  • Background and Motivation
  • The Proposed Method
  • Results and Analysis
  • Comments

16
Representation of Block HMM
  • To constrain the search over HMM topologies to biologically
    meaningful structures:
  • the HMM structure is represented as a number of blocks
  • three basic structures used in biological analysis serve as
    the blocks

17
Blocks of HMMs
  • (a) linear (to model conserved regions)
  • (b) self-loop (to model a sequence of any length)
  • (c) forward blocks (to describe varying length
    subsequences)
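
One possible encoding of these three block types as transition
submatrices over k states; the adjacency patterns below are my
illustration of the figures, not the paper's exact parameterization
(the exits of a block's last state are added when blocks are linked).

import numpy as np

def linear_block(k):
    """Conserved region: state i moves only to state i + 1."""
    return np.eye(k, k=1)

def self_loop_block(k):
    """Any-length region: each state loops on itself or advances."""
    return 0.5 * np.eye(k) + 0.5 * np.eye(k, k=1)

def forward_block(k):
    """Varying-length region: each state may skip to any later state."""
    T = np.triu(np.ones((k, k)), k=1)
    return T / np.maximum(T.sum(axis=1, keepdims=True), 1)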

18
Block HMM
  • Types: tied or untied
  • Tied: all the emission and transition probabilities inside the
    block are set equal
  • The blocks are fully linked together to form the whole HMM
    architecture

19
Genetic Operators
  • Crossover
  • Each block is represented by a pair
  • First element: the block type (linear, self-loop, or forward)
  • Second element: tied or untied, plus other information
  • The whole HMM is represented by a string of such pairs (see the
    sketch after the next slide)

20
Crossover
  • Recombination between blocks
  • (full transitions between the blocks are not shown, for
    simplicity)
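
A sketch of the string-of-pairs genome and a one-point crossover that
recombines whole blocks; the info field names (tied, size) are
illustrative stand-ins for the "other info" mentioned above.

import random

BLOCK_TYPES = ["linear", "self_loop", "forward"]

def random_genome(n_blocks, rng=random):
    """A string of (type, info) pairs, one pair per block."""
    return [(rng.choice(BLOCK_TYPES),
             {"tied": rng.random() < 0.5, "size": rng.randint(2, 5)})
            for _ in range(n_blocks)]

def crossover(mom, dad, rng=random):
    """One-point crossover cutting both parents between blocks."""
    cut = rng.randint(1, min(len(mom), len(dad)) - 1)
    return mom[:cut] + dad[cut:], dad[:cut] + mom[cut:]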

21
Mutation
  • Delete/add a transition

22
Mutation
  • Delete/add a state
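
A sketch of the add/delete-a-state mutation on the genome above:
a randomly chosen block grows or shrinks by one state (the
transition-level mutations of the previous slide would instead edit
the expanded HMM's transition matrix).

import random

def mutate_block_size(genome, rng=random):
    """Add or delete one state in a randomly chosen block."""
    i = rng.randrange(len(genome))
    btype, info = genome[i]
    info = dict(info)  # copy so the parent genome is not modified
    info["size"] = max(1, info["size"] + rng.choice([-1, 1]))
    return genome[:i] + [(btype, info)] + genome[i + 1:]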

23
Fitness Evaluation
  • To achieve generalization ability and avoid over-fitting:
  • the training data are split into two sets
  • one half is used as the training set, on which Baum-Welch
    estimates the parameters
  • the other half is used as an evaluation set to measure the
    fitness for selecting members from the population

24
Fitness Function
  • presumably a length-normalized log-likelihood:
    f(λ) = (1/N) Σ_i log P(O_i | λ) / L_i
  • λ stands for the parameters of the HMM individual, O_i is the
    ith sequence for evaluation, and L_i is its length
  • Notice that the formula in the paper is not precise
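
A sketch of this fitness under my reading of the imprecise formula:
the average length-normalized log-likelihood over the evaluation half
of the data. It reuses forward() from the earlier sketch.

import numpy as np

def fitness(A, B, pi, eval_seqs):
    """Mean of log P(O_i | lambda) / L_i over the evaluation set."""
    scores = [np.log(forward(A, B, pi, obs)[-1].sum()) / len(obs)
              for obs in eval_seqs]
    return float(np.mean(scores))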

25
Reproduction
  • The individuals are selected for reproduction with a Boltzmann
    probability, presumably of the form
    p_i = exp(s · f_i) / Σ_{j=1..N} exp(s · f_j)
  • where s (> 1) is the parameter that controls the selection
    strength and N is the population size
  • Stochastic universal sampling is used to reduce genetic drift
    in selection

Stochastic universal sampling is a development of
fitness-proportionate selection which exhibits no bias and minimal
spread; it uses a single random value to sample all of the solutions
by choosing them at evenly spaced intervals.
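
Under the assumption that the Boltzmann weight is exp(s · f_i) (the
paper's exact form is not reproduced in the slides), the selection
probabilities take a few lines:

import numpy as np

def boltzmann_probs(fitness, s=2.0):
    """Selection probabilities p_i = exp(s*f_i) / sum_j exp(s*f_j)."""
    z = s * (np.asarray(fitness) - np.max(fitness))  # shift for stability
    w = np.exp(z)
    return w / w.sum()

These probabilities, times the population size, give the expectations
E(i) that feed the stochastic universal sampling on the next slides.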
26
Stochastic universal sampling
[Figure, panels (A)-(C): (A) the expectation pie, one slice per
individual i with area E(i); (B) divide another pie by the population
size to get the children pie; (C) choose a random number in (0, 1)
and spin the children pie by that amount.]
27
Stochastic universal sampling
[Figure, panel (D): superimpose the children pie on top of the
expectation pie; this gives the number of children of each
individual.]
The number of children generated cannot be less than the floor of
E(i) and cannot be greater than the ceiling of E(i).
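
The pie construction above amounts to a few lines: one random spin
plus evenly spaced pointers is what bounds each individual's child
count between floor(E(i)) and ceiling(E(i)).

import numpy as np

def sus(probs, n_children, seed=None):
    """Stochastic universal sampling: returns a parent index per child."""
    rng = np.random.default_rng(seed)
    # n evenly spaced pointers, all offset together by one random spin
    pointers = (rng.random() + np.arange(n_children)) / n_children
    return np.searchsorted(np.cumsum(probs), pointers)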
28
Outline
  • Background and Motivation
  • The Proposed Method
  • Results and Analysis
  • Comments

29
Artificial Data
  • Models: (ATG) and (AAGATGAGGACG)
  • Two-block models are used
  • [GA configuration and results tables not reproduced in this
    transcript]

30
Promoter Model of C. jejuni
  • For each individual HMM in the GA:
  • 175 sequences are available
  • 132 sequences are used for Baum-Welch training, 43 for fitness
    evaluation
  • The best HMMs are found with the nine- and eight-block settings
  • The 9- and 8-block HMMs find the AAGGA and TAtAAT regions
  • The 9- and 8-block HMMs find the presence of a semi-conserved
    TGx upstream of the TATA box
  • The 9-block HMM finds the ten-base periodicity that was
    discovered in a handcrafted HMM
  • With 7 blocks, only AAGGA is found

31
HMMs found by GA
[Figure: the 9-, 8-, and 7-block architectures found by the GA; the
9-block HMM exhibits the ten-base periodicity.]
32
Discrimination Test
  • These HMM architectures are tested for discrimination ability
  • The architectures are kept while the parameters are reset to
    random values for Baum-Welch training
  • Five-fold cross-validation: in each fold, 140 of the 175
    sequences are used for training and 35 for testing
  • Background sequences are generated by a third-order Markov chain
  • A log-odds threshold is set so that there are ten or fewer
    false positives (FP), and then the number of true positives
    (TP) is measured (see the sketch below)
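
A sketch of this threshold rule: pick the log-odds cutoff that allows
at most ten false positives on the background scores, then count the
true positives among the promoter scores. The score arrays are
assumed inputs (log-odds per sequence), and ties are ignored.

import numpy as np

def tp_at_fp(pos_scores, bg_scores, max_fp=10):
    """True positives at the cutoff allowing at most max_fp background hits."""
    thr = np.sort(bg_scores)[::-1][max_fp]   # (max_fp + 1)-th highest score
    return int(np.sum(pos_scores > thr)), thr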

33
Results
  • The total true positives over the five folds (out of 175) are
    reported
  • Compared with a previous handcrafted HMM (designed with expert
    knowledge)

34
Outline
  • Background and Motivation
  • The Proposed Method
  • Results and Analysis
  • Comments

35
Contributions
  • This paper proposes a novel method to automatically discover
    the structure of HMMs using a GA
  • To preserve biologically meaningful building blocks of HMMs, a
    block representation is employed
  • The GA explores different combinations of these blocks and
    mutates the blocks to form new HMMs
  • To avoid over-fitting, only half of the training data is used
    for Baum-Welch training while the other half is used for
    fitness evaluation

36
Problems
  • The huge computational complexity is still a great weakness, as
    the authors mention
  • Each individual involves a whole process of training and
    testing an HMM!
  • Some descriptions are not clear
  • I had to refer to three other related papers by the same
    authors to get a more complete view
  • an incorrect figure; no explanation of how to mutate to an
    untied block; no systematic description of the mutation cases;
    an imprecise fitness function

37
Questions
  • Experiments:
  • more details are desired, such as the emission and transition
    probabilities of the blocks in the HMMs
  • more explanation is needed about those blocks in the results
    that are not triple-equivalent, and about their effect
  • the rationale for measuring TP at a threshold that allows ten
    or fewer FP should be justified

38
Evolving the Structure of Hidden Markov Models
  • The End! Thank You!
  • Q & A
  • 06238760
  • Chan Tak Ming