1
Hidden Markov Model
  • 11/28/07

2
Bayes Rule
  • The posterior distribution
  • Select k with the largest posterior distribution.
  • Minimizes the average misclassification rate.
  • Maximum likelihood rule is equivalent to Bayes
    rule with uniform prior.
  • Decision boundary: the set of points x where two classes have equal posterior probability (see the formula below).
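A compact restatement of the rule in standard Bayes notation, with πk the class prior and p(x | k) the class-conditional density (the symbols are mine, since the slide's formula images are not in the transcript):

```latex
P(k \mid x) = \frac{p(x \mid k)\,\pi_k}{\sum_{j} p(x \mid j)\,\pi_j},
\qquad
\hat{k} = \arg\max_{k} P(k \mid x),
\qquad
\text{boundary between } k \text{ and } l:\ \{x : P(k \mid x) = P(l \mid x)\}.
```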

3
Naïve Bayes approximation
  • When x is high dimensional, it is difficult to estimate the class-conditional density p(x | k).

4
Naïve Bayes Classifier
  • When x is high dimensional, it is difficult to estimate the class-conditional density p(x | k).
  • But if we assume the components of x are independent, the estimation reduces to a set of 1-D problems (see the factorization below).
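The independence assumption factorizes the d-dimensional class-conditional density into one-dimensional pieces, each of which can be estimated separately:

```latex
p(x \mid k) \;=\; \prod_{j=1}^{d} p(x_j \mid k)
```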

5
Naïve Bayes Classifier
  • Usually the independence assumption is not valid.
  • But sometimes the NBC can still be a good
    classifier.
  • In practice, simple models often perform surprisingly well (a minimal sketch follows).
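A minimal Gaussian Naïve Bayes sketch. The Gaussian form of the 1-D densities, the helper names, and the toy data are illustrative assumptions, not taken from the slides:

```python
import numpy as np

def fit_gaussian_nb(X, y):
    """Estimate per-class priors and per-feature Gaussian parameters."""
    params = {}
    for k in np.unique(y):
        Xk = X[y == k]
        params[k] = (len(Xk) / len(X),          # prior pi_k
                     Xk.mean(axis=0),           # per-feature means
                     Xk.var(axis=0) + 1e-9)     # per-feature variances
    return params

def predict(params, x):
    """Pick the class with the largest (log) posterior."""
    best, best_score = None, -np.inf
    for k, (prior, mu, var) in params.items():
        # log prior + sum of 1-D Gaussian log-likelihoods (independence assumption)
        score = np.log(prior) - 0.5 * np.sum(np.log(2 * np.pi * var)
                                             + (x - mu) ** 2 / var)
        if score > best_score:
            best, best_score = k, score
    return best

# toy usage
X = np.array([[1.0, 2.0], [1.2, 1.9], [3.0, 0.5], [3.1, 0.4]])
y = np.array([0, 0, 1, 1])
print(predict(fit_gaussian_nb(X, y), np.array([1.1, 2.0])))  # -> 0
```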

6
Hidden Markov Model
7
A coin toss example
  • Scenario: You are betting with your friend using a coin toss, and you see (H, T, T, H, ...).

8
A coin toss example
Scenario: You are betting with your friend using a coin toss, and you see (H, T, T, H, ...). But your friend is cheating: he occasionally switches from a fair coin to a biased coin, and of course the switch is under the table!
[Figure: two coins, Fair and Biased, with arrows indicating switches between them]
9
A coin toss example
This is what is really happening: (H, T, H, T, H, H, H, H, T, H, H, T, ...). Of course, you can't see the color of the coin, so how can you tell your friend is cheating?
10
Hidden Markov Model
Hidden state (the coin)
Observed variable (H or T)
11
Markov Property
Hidden state (the coin)
Observed variable (H or T)
12
Markov Property
The next hidden state depends only on the current hidden state: P(xt | x1, ..., xt-1) = P(xt | xt-1), given by the transition probabilities; the chain starts from the prior distribution p(x1).
[Figure: Markov chain over the Fair and Biased coins with transition probabilities and the prior distribution]
13
Observation independence
Hidden state (the coin)
Observed variable (H or T)
Emission probability
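Together, the Markov property and observation independence factorize the joint distribution of hidden states x and observations y (a standard HMM identity, written here in the slides' notation):

```latex
P(x_{1:L},\, y_{1:L})
  \;=\; p(x_1)\,\prod_{t=2}^{L} a_{x_{t-1}\,x_t}\;\prod_{t=1}^{L} p(y_t \mid x_t)
```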
14
Model parameters
  • A = (aij): transition matrix, aij = P(xt = j | xt-1 = i)
  • p(yt | xt): emission probabilities
  • p(x1): prior distribution over the initial state
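A concrete parameterization of the coin-toss example as arrays; the switch probability (0.1) and the biased coin's heads probability (0.9) are assumed numbers chosen purely for illustration:

```python
import numpy as np

states = ["Fair", "Biased"]
symbols = ["H", "T"]

# Transition matrix A: A[i, j] = P(x_t = j | x_{t-1} = i)
A = np.array([[0.9, 0.1],    # Fair   -> Fair, Biased
              [0.1, 0.9]])   # Biased -> Fair, Biased

# Emission probabilities E: E[i, j] = P(y_t = symbols[j] | x_t = states[i])
E = np.array([[0.5, 0.5],    # fair coin:   P(H), P(T)
              [0.9, 0.1]])   # biased coin: P(H), P(T)

# Prior distribution over the initial hidden state
prior = np.array([0.5, 0.5])
```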

15
Model inference
  • Infer the hidden states when the model parameters are known.
  • Infer both the hidden states and the model parameters when both are unknown.

16
Viterbi algorithm
[Trellis diagram: states 1-4 (rows) against time t-1, t, t+1 (columns)]
17
Viterbi algorithm
  • Most probable path
[Trellis diagram: states 1-4 against time t-1, t, t+1]
18
Viterbi algorithm
  • Most probable path
[Trellis diagram: states 1-4 against time t-1, t, t+1]
19
Viterbi algorithm
  • Most probable path
[Trellis diagram: states 1-4 against time t-1, t, t+1]
Therefore, the path can be found iteratively.
20
Viterbi algorithm
  • Most probable path
[Trellis diagram: states 1-4 against time t-1, t, t+1]
Let vk(i) be the probability of the most probable path ending in state k at position i. Then vk(i) = p(yi | k) · maxj [ vj(i-1) ajk ].
21
Viterbi algorithm
  • Initialization (i = 0)
  • Recursion (i = 1, ..., L)
  • Termination
  • Traceback (i = L, ..., 1)
A code sketch of these four steps follows.
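A log-space sketch of the four steps, applied to the coin-toss parameters; the specific numbers are the illustrative assumptions carried over from the earlier parameter sketch:

```python
import numpy as np

def viterbi(obs, A, E, prior):
    """Most probable hidden state path; obs are symbol indices (columns of E)."""
    L, K = len(obs), len(prior)
    logA, logE, logp = np.log(A), np.log(E), np.log(prior)
    v = np.zeros((L, K))            # v[i, k]: log prob of best path ending in state k at step i
    ptr = np.zeros((L, K), dtype=int)
    v[0] = logp + logE[:, obs[0]]                      # initialization
    for i in range(1, L):                              # recursion
        scores = v[i - 1][:, None] + logA              # scores[j, k] = v[i-1, j] + log a_jk
        ptr[i] = scores.argmax(axis=0)
        v[i] = scores.max(axis=0) + logE[:, obs[i]]
    path = [int(v[-1].argmax())]                       # termination
    for i in range(L - 1, 0, -1):                      # traceback
        path.append(int(ptr[i, path[-1]]))
    return path[::-1]

# 0 = H, 1 = T; coin-toss parameters as before
A = np.array([[0.9, 0.1], [0.1, 0.9]])
E = np.array([[0.5, 0.5], [0.9, 0.1]])
prior = np.array([0.5, 0.5])
obs = [0, 1, 0, 0, 0, 0, 0, 1, 0]   # H T H H H H H T H
print(viterbi(obs, A, E, prior))    # 0 = Fair, 1 = Biased
```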

22
Advantage of Viterbi path
  • Identify the most probable path very efficiently.
  • The most probable path is legitimate, i.e., it is
    realizable by the HMM process.

23
Issue with Viterbi path
  • The most probable path does not indicate the confidence level of a state estimate.
  • The most probable path may not be much more probable than other paths.

24
Posterior distribution
  • Estimate p(xk | y1, ..., yL).
  • Strategy: combine a forward pass and a backward pass (see the decomposition below).
  • This is done by the forward-backward algorithm.
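The decomposition behind this strategy, with fk(i) and bk(i) as defined on the next slides (a standard identity in the notation of Durbin et al.):

```latex
P(x_i = k \mid y_{1:L})
  \;=\; \frac{P(x_i = k,\, y_{1:L})}{P(y_{1:L})}
  \;=\; \frac{f_k(i)\, b_k(i)}{P(y_{1:L})}
```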

25
Forward-backward algorithm
  • Estimate fk(i) = P(y1, ..., yi, xi = k).

26
Forward algorithm
Estimate fk(i) = P(y1, ..., yi, xi = k).
  • Initialization: fk(1) = p(x1 = k) p(y1 | k)
  • Recursion: fk(i) = p(yi | k) Σj fj(i-1) ajk
  • Termination: P(y1, ..., yL) = Σk fk(L)
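A direct implementation of these three steps for the coin-toss parameters (again, assumed numbers); probabilities are kept on the linear scale, so this sketch is only suitable for short sequences:

```python
import numpy as np

def forward(obs, A, E, prior):
    """f[i, k] = P(y_1..y_i, x_i = k); returns f and P(y_1..y_L)."""
    L, K = len(obs), len(prior)
    f = np.zeros((L, K))
    f[0] = prior * E[:, obs[0]]                  # initialization
    for i in range(1, L):                        # recursion
        f[i] = (f[i - 1] @ A) * E[:, obs[i]]
    return f, f[-1].sum()                        # termination: P(y) = sum_k f_k(L)

A = np.array([[0.9, 0.1], [0.1, 0.9]])
E = np.array([[0.5, 0.5], [0.9, 0.1]])
prior = np.array([0.5, 0.5])
f, p_y = forward([0, 1, 0, 0], A, E, prior)      # 0 = H, 1 = T
print(p_y)
```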
27
Backward algorithm
Estimate bk(i) = P(yi+1, ..., yL | xi = k).
28
Backward algorithm
Estimate bk(i) = P(yi+1, ..., yL | xi = k).
  • Initialization: bk(L) = 1
  • Recursion: bk(i) = Σl akl p(yi+1 | l) bl(i+1)
  • Termination: P(y1, ..., yL) = Σk p(x1 = k) p(y1 | k) bk(1)
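A matching sketch of the backward recursion with the same assumed coin-toss parameters, combined with a forward pass to produce the posterior P(fair) shown on the next slides:

```python
import numpy as np

def backward(obs, A, E):
    """b[i, k] = P(y_{i+1}..y_L | x_i = k)."""
    L, K = len(obs), A.shape[0]
    b = np.zeros((L, K))
    b[-1] = 1.0                                   # initialization
    for i in range(L - 2, -1, -1):                # recursion
        b[i] = A @ (E[:, obs[i + 1]] * b[i + 1])
    return b

A = np.array([[0.9, 0.1], [0.1, 0.9]])
E = np.array([[0.5, 0.5], [0.9, 0.1]])
prior = np.array([0.5, 0.5])
obs = [0, 1, 0, 0, 0]                             # H T H H H

# forward pass (same recursion as the previous sketch)
f = np.zeros((len(obs), 2))
f[0] = prior * E[:, obs[0]]
for i in range(1, len(obs)):
    f[i] = (f[i - 1] @ A) * E[:, obs[i]]

# posterior P(x_i = k | y) = f_k(i) b_k(i) / P(y)
b = backward(obs, A, E)
posterior = f * b / f[-1].sum()
print(posterior[:, 0])                            # P(fair) at each position
```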
29
Probability of fair coin
[Plot: posterior probability P(fair) at each position of the toss sequence, on a 0 to 1 scale]
30
Probability of fair coin
[Plot: posterior probability P(fair) at each position of the toss sequence, on a 0 to 1 scale]
31
Posterior distribution
  • Posterior distribution predicts the confidence
    level of a state estimate.
  • Posterior distribution combines information from
    all paths.
  • But the state-by-state prediction may not form a legitimate path, i.e., it may not be realizable by the HMM process.

32
Estimating parameters when state sequence is known
  • Given the state sequence x1, ..., xL, define
  • Ajk = number of transitions from state j to state k,
  • Ek(b) = number of emissions of symbol b from state k.
  • The maximum likelihood estimates of the parameters are ajk = Ajk / Σl Ajl and ek(b) = Ek(b) / Σb' Ek(b').
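A count-and-normalize sketch of these ML estimates; the toy state path and observations below are assumptions for illustration only:

```python
import numpy as np

def ml_estimates(states, obs, K, M):
    """a_jk = A_jk / sum_l A_jl  and  e_k(b) = E_k(b) / sum_b' E_k(b')."""
    A_counts = np.zeros((K, K))              # A_jk: transitions j -> k
    E_counts = np.zeros((K, M))              # E_k(b): emissions of symbol b from state k
    for x_prev, x_next in zip(states[:-1], states[1:]):
        A_counts[x_prev, x_next] += 1
    for x, y in zip(states, obs):
        E_counts[x, y] += 1
    # normalize each row (pseudocounts would be added here to avoid zero rows)
    return (A_counts / A_counts.sum(axis=1, keepdims=True),
            E_counts / E_counts.sum(axis=1, keepdims=True))

states = [0, 0, 0, 1, 1, 0]                  # known state path (0 = Fair, 1 = Biased)
obs    = [0, 1, 0, 0, 0, 1]                  # observed tosses  (0 = H, 1 = T)
A_hat, E_hat = ml_estimates(states, obs, K=2, M=2)
print(A_hat, E_hat, sep="\n")
```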

33
Infer hidden states together with model parameters
  • Viterbi training
  • Baum-Welch

34
Viterbi training
  • Main idea: use an iterative procedure.
  • Estimate the states for fixed parameters using the Viterbi algorithm.
  • Estimate the model parameters for the fixed states.
  • Repeat until the path no longer changes (a loop sketch follows).
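A sketch of the alternating loop, assuming the viterbi and ml_estimates helpers from the earlier sketches are defined in the same file (they are not repeated here):

```python
def viterbi_training(obs, A, E, prior, K, M, n_iter=20):
    """Alternate hard state assignment (Viterbi) with ML re-estimation of parameters."""
    path = None
    for _ in range(n_iter):
        new_path = viterbi(obs, A, E, prior)       # states for fixed parameters
        A, E = ml_estimates(new_path, obs, K, M)   # parameters for fixed states
        if new_path == path:                       # stop when the path no longer changes
            break
        path = new_path
    return A, E, path
```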

35
Baum-Welch algorithm
  • Instead of using the Viterbi path to estimate the states, consider the expected numbers of transitions Akl and emissions Ek(b), averaged over all possible state paths.

36
Baum-Welch algorithm
  • The expected counts are computed from the forward and backward quantities: Akl = (1/P(y)) Σi fk(i) akl p(yi+1 | l) bl(i+1) and Ek(b) = (1/P(y)) Σ{i: yi = b} fk(i) bk(i).
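A sketch of how these expected counts can be accumulated for a single sequence, reusing the forward/backward recursions with the assumed coin-toss parameters (no scaling, so short sequences only):

```python
import numpy as np

def baum_welch_counts(obs, A, E, prior):
    """Expected transition counts A_kl and emission counts E_k(b) for one sequence."""
    L, K = len(obs), len(prior)
    # forward and backward passes (same recursions as the earlier sketches)
    f = np.zeros((L, K)); b = np.zeros((L, K))
    f[0] = prior * E[:, obs[0]]
    for i in range(1, L):
        f[i] = (f[i - 1] @ A) * E[:, obs[i]]
    b[-1] = 1.0
    for i in range(L - 2, -1, -1):
        b[i] = A @ (E[:, obs[i + 1]] * b[i + 1])
    p_y = f[-1].sum()
    # expected k -> l transitions: sum_i f_k(i) a_kl e_l(y_{i+1}) b_l(i+1) / P(y)
    A_exp = sum(np.outer(f[i], E[:, obs[i + 1]] * b[i + 1]) for i in range(L - 1)) * A / p_y
    # expected emissions of symbol s from state k: sum_{i: y_i = s} f_k(i) b_k(i) / P(y)
    E_exp = np.zeros_like(E)
    for i, s in enumerate(obs):
        E_exp[:, s] += f[i] * b[i] / p_y
    return A_exp, E_exp

A = np.array([[0.9, 0.1], [0.1, 0.9]])
E = np.array([[0.5, 0.5], [0.9, 0.1]])
prior = np.array([0.5, 0.5])
A_exp, E_exp = baum_welch_counts([0, 1, 0, 0, 1], A, E, prior)
print(A_exp, E_exp, sep="\n")
```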

37
Baum-Welch is a special case of the EM algorithm
  • Given an estimate of the parameters θt, try to find a better θ.
  • Choose θ to maximize Q(θ | θt).

38
Baum-Welch is a special case of the EM algorithm
  • E-step: calculate the Q function Q(θ | θt).
  • M-step: maximize Q(θ | θt) with respect to θ.
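For the HMM, the Q function is the expected complete-data log-likelihood, with x the hidden path, y the observations, and θ the parameters (transition matrix, emission probabilities, prior):

```latex
Q(\theta \mid \theta^{t})
  \;=\; \sum_{x} P(x \mid y, \theta^{t}) \, \log P(x, y \mid \theta)
```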

39
Issue with EM
  • EM only finds local maxima.
  • Solutions:
  • Run EM multiple times starting from different initial guesses.
  • Use a more sophisticated algorithm, such as MCMC.

40
Dynamic Bayesian Network
Kevin Murphy
41
Software
  • Kevin Murphy's Bayes Net Toolbox for Matlab
  • http://www.cs.ubc.ca/~murphyk/Software/BNT/bnt.html

42
Applications
Copy number changes
(Yi Li)
43
Applications
Protein-binding sites
44
Applications
Sequence alignment
www.biocentral.com
45
Reading list
  • Hastie et al. (2001), The Elements of Statistical Learning, pp. 184-185.
  • Durbin et al. (1998), Biological Sequence Analysis, Chapter 3.