Transcript and Presenter's Notes

Title: Expectation Maximization


1
Expectation Maximization
  • An Approach to Parameter Estimation
  • Yang Ran
  • ENEE 698a

2
Outline
  • Basic ideas of Expectation Maximization
  • An example problem: a Gaussian mixture
  • General EM algorithm
  • Further readings and conclusion

3
EM-Background
  • How can we classify points and estimate the
    parameters of the models in a mixture at the same
    time?
  • (A chicken-and-egg problem)
  • In a mixture of models, two targets are intertwined
  • The parameters of the models
  • The assignment of each data point to the process
    that generated it

4
Intuition in EM
  • What is the INTUITION behind EM?
  • Each step is easy assuming the other is solved
  • Knowing the assignment of each data point, we can
    estimate the parameters
  • Knowing the parameters of the distributions, we can
    assign each point to a model (e.g. by MLE)

5
Key Factor in EM
  • Adaptive hard clustering (k-means): assign each
    point to only one class at each step.
  • Adaptive soft clustering (EM): data is assigned to
    each class with a probability equal to the
    relative likelihood of that point belonging to
    the class.

6
Structure of EM Algorithm
  • Initialization: pick starting values for the
    parameters
  • Iterate the following two steps until the parameters
    converge
  • Expectation (E) step: compute the responsibilities
    (weights) for every data point under the current
    parameters
  • Maximization (M) step: maximize the log-likelihood
    weighted by the responsibilities from the E step to
    update the parameters of the models (a minimal
    sketch of this loop follows below)
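A minimal Python sketch of this loop for the one-dimensional two-component Gaussian mixture used in the example that follows; it is not taken from the slides, and the function name em_gmm, the initialization scheme, and the convergence tolerance are illustrative assumptions.

import numpy as np
from scipy.stats import norm

def em_gmm(y, n_iter=200, tol=1e-8, seed=0):
    """EM for a 1-D two-component Gaussian mixture (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Initialization: two random data points as means, the overall sample
    # variance for both components, and an even mixing proportion
    mu1, mu2 = rng.choice(y, 2, replace=False)
    var1 = var2 = np.var(y)
    pi = 0.5
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E step: responsibility gamma_i = P(point i belongs to component 2)
        p1 = (1.0 - pi) * norm.pdf(y, mu1, np.sqrt(var1))
        p2 = pi * norm.pdf(y, mu2, np.sqrt(var2))
        gamma = p2 / (p1 + p2)
        # Observed-data log-likelihood at the current parameters;
        # stop once it no longer improves
        ll = np.sum(np.log(p1 + p2))
        if ll - prev_ll < tol:
            break
        prev_ll = ll
        # M step: responsibility-weighted maximum-likelihood updates
        mu1 = np.sum((1.0 - gamma) * y) / np.sum(1.0 - gamma)
        mu2 = np.sum(gamma * y) / np.sum(gamma)
        var1 = np.sum((1.0 - gamma) * (y - mu1) ** 2) / np.sum(1.0 - gamma)
        var2 = np.sum(gamma * (y - mu2) ** 2) / np.sum(gamma)
        pi = np.mean(gamma)
    return (pi, mu1, var1, mu2, var2), gamma, ll

# Example usage on simulated data from a two-component mixture
y = np.concatenate([np.random.normal(0.0, 1.0, 700),
                    np.random.normal(4.0, 0.5, 300)])
theta, gamma, ll = em_gmm(y)
print(theta, ll)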

7
Example: Gaussian Mixtures
  • Problem: N samples Yi are drawn from a two-component
    Gaussian mixture population,
  • where the membership Δ is 0 or 1 with P(Δ = 1) = π
  • Question: how can the parameters of the two
    Gaussians, the mixing proportion π, and the
    membership coefficients Δi be estimated using EM?

8
Example: Gaussian Mixture
  • The log-likelihood based on the N training samples
    is written out below
  • Maximizing it directly is numerically hard, but the
    problem becomes easier when the Δi are known
  • If the Δi are known, the log-likelihood takes the
    complete-data form also written out below
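The equations on this slide are not in the transcript. The following LaTeX is a reconstruction of the standard log-likelihoods for the two-component Gaussian mixture described above; the notation φ_θ for the normal density and θ = (π, μ1, σ1², μ2, σ2²) is an assumption about the original slides.

% Observed-data log-likelihood of the two-component mixture
\ell(\theta; \mathbf{y})
  = \sum_{i=1}^{N} \log\!\bigl[(1-\pi)\,\phi_{\theta_1}(y_i) + \pi\,\phi_{\theta_2}(y_i)\bigr]

% Complete-data log-likelihood when the memberships \Delta_i are known
\ell_0(\theta; \mathbf{y}, \boldsymbol{\Delta})
  = \sum_{i=1}^{N} \bigl[(1-\Delta_i)\log\phi_{\theta_1}(y_i) + \Delta_i\log\phi_{\theta_2}(y_i)\bigr]
  + \sum_{i=1}^{N} \bigl[(1-\Delta_i)\log(1-\pi) + \Delta_i\log\pi\bigr]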

9
Sample Gaussian Mixture
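The body of this slide is not in the transcript. For completeness, the standard E-step responsibilities and weighted M-step updates for this two-component mixture are sketched below in LaTeX; attributing exactly these formulas to the slide is an assumption.

% E step: responsibilities \hat{\gamma}_i = E[\Delta_i \mid \hat{\theta}, y_i]
\hat{\gamma}_i
  = \frac{\hat{\pi}\,\phi_{\hat{\theta}_2}(y_i)}
         {(1-\hat{\pi})\,\phi_{\hat{\theta}_1}(y_i) + \hat{\pi}\,\phi_{\hat{\theta}_2}(y_i)}

% M step: responsibility-weighted maximum-likelihood updates
\hat{\mu}_1 = \frac{\sum_i (1-\hat{\gamma}_i)\,y_i}{\sum_i (1-\hat{\gamma}_i)}, \quad
\hat{\sigma}_1^2 = \frac{\sum_i (1-\hat{\gamma}_i)(y_i-\hat{\mu}_1)^2}{\sum_i (1-\hat{\gamma}_i)}, \quad
\hat{\mu}_2 = \frac{\sum_i \hat{\gamma}_i\,y_i}{\sum_i \hat{\gamma}_i}, \quad
\hat{\sigma}_2^2 = \frac{\sum_i \hat{\gamma}_i\,(y_i-\hat{\mu}_2)^2}{\sum_i \hat{\gamma}_i}, \quad
\hat{\pi} = \frac{1}{N}\sum_i \hat{\gamma}_i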
10
Simulation
  • Nonparametric density estimate of 1,000 simulated
    data points generated from the mixture distribution,
    computed in Matlab

11
Simulation
  • Results

12
Discussion for the Gaussian Mixture
  • How should the initial values be constructed?
  • Simple way: pick initial values directly from the
    data, e.g. two random data points as the initial
    means and the overall sample variance as the initial
    variances
  • Think more carefully:
  • The global maximizer of the likelihood is degenerate
    (centering one component on a single data point and
    letting its variance shrink to zero makes the
    likelihood infinite), so it is not the solution we
    want
  • Instead, look for a good local maximizer whose
    estimated variances stay strictly positive
  • In this example:
  • Run EM with several different initial values
  • Choose the run giving the highest likelihood (see
    the sketch below)
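A minimal sketch of the multi-start strategy in Python (the slides used Matlab, so this is not from the original); it assumes scikit-learn is available and uses its GaussianMixture, whose n_init parameter runs EM from several random initializations and keeps the best run.

import numpy as np
from sklearn.mixture import GaussianMixture

# Simulated 1-D data from a two-component mixture (illustrative values)
y = np.concatenate([np.random.normal(0.0, 1.0, 700),
                    np.random.normal(4.0, 0.5, 300)]).reshape(-1, 1)

# n_init=10: run EM from ten different random initializations and keep
# the run with the highest likelihood lower bound
gm = GaussianMixture(n_components=2, n_init=10, random_state=0).fit(y)
print(gm.weights_, gm.means_.ravel(), gm.covariances_.ravel())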

13
Discussion for the Gaussian Mixture
  • Results for a mixture of three Gaussians

14
The EM in General
  • Data augmentation: maximization of the likelihood
    is difficult, but is made easier by enlarging the
    sample with latent (unobserved) data.
  • Latent (missing) data:
  • In the previous example, the model memberships Δi
  • In general, actual data that should have been
    observed but is missing

15
The EM in General (cont'd)
  • Equation formulation
  • 1. Bayes' rule relates the observed-data likelihood
    to the complete-data likelihood and the conditional
    density of the latent data (see the reconstruction
    below)
  • 2. Taking conditional expectations w.r.t. the
    distribution of the latent data given the observed
    data and the current parameter value gives the
    function Q
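The equations for these two steps are not in the transcript. The following LaTeX reconstructs the standard EM decomposition, consistent with the Q used on the next slide; here Z is the observed data, Z^m the latent data, T = (Z, Z^m) the complete data, and θ the current parameter value (this notation is an assumption).

% 1. Bayes' rule for the latent data
\Pr(Z^m \mid Z, \theta') = \frac{\Pr(Z^m, Z \mid \theta')}{\Pr(Z \mid \theta')}
\quad\Longrightarrow\quad
\ell(\theta'; Z) = \ell_0(\theta'; T) - \ell_1(\theta'; Z^m \mid Z)

% 2. Conditional expectation w.r.t. \Pr(T \mid Z, \theta)
\ell(\theta'; Z)
  = \mathbb{E}\bigl[\ell_0(\theta'; T)\mid Z, \theta\bigr]
  - \mathbb{E}\bigl[\ell_1(\theta'; Z^m \mid Z)\mid Z, \theta\bigr]
  \equiv Q(\theta', \theta) - R(\theta', \theta)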
16
The EM in General (cont'd)
  • Why does maximization of Q(θ', θ) result in
    maximization of the observed log-likelihood ℓ(θ'; Z)?
  • R(θ', θ) is the expectation of a log-likelihood taken
    w.r.t. the same density (at θ), so by Jensen's
    inequality R(θ', θ) ≤ R(θ, θ)
  • Hence, if θ* maximizes Q(θ', θ), the observed
    log-likelihood does not decrease (see the inequality
    below)
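The inequality itself is not shown in the transcript; a reconstruction of the standard argument, using the Q and R defined above:

% R(\theta', \theta) is maximized over \theta' at \theta' = \theta (Jensen's inequality),
% so if \theta^* maximizes Q(\cdot, \theta):
\ell(\theta^*; Z) - \ell(\theta; Z)
  = \bigl[Q(\theta^*, \theta) - Q(\theta, \theta)\bigr]
    - \bigl[R(\theta^*, \theta) - R(\theta, \theta)\bigr]
  \;\ge\; Q(\theta^*, \theta) - Q(\theta, \theta) \;\ge\; 0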

17
The EM in General (cont'd)
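The body of this slide is not in the transcript. As a sketch of what the preceding two slides build up to, the general EM iteration can be written as follows (standard formulation, not taken from the slide itself):

% Start from an initial guess \theta^{(0)} and repeat until \ell(\theta^{(j)}; Z) converges
\textbf{E step:}\quad Q(\theta', \theta^{(j)}) = \mathbb{E}\bigl[\ell_0(\theta'; T)\mid Z, \theta^{(j)}\bigr]
\qquad
\textbf{M step:}\quad \theta^{(j+1)} = \arg\max_{\theta'} Q(\theta', \theta^{(j)})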
18
Conclusion
  • Why? EM addresses the problem of learning from
    incomplete data, where the labels of the data are
    missing.
  • How? An iterative procedure, EM, based on maximum
    likelihood and Bayes' theorem is applied to estimate
    both the model and the labels.

19
Conclusion
  • Although the EM method is powerful in many cases, it
    still has the following difficulties
  • EM is a maximum-likelihood method based on the
    complete data, which assumes that the distribution
    family is given; in practice this assumption is
    often hard to verify.
  • It is difficult to validate whether the solution
    found by EM is optimal.
  • EM may converge only to a local maximum.

20
References
  • A. P. Dempster, N. M. Laird and D. B. Rubin,
    "Maximum likelihood from incomplete data via the
    EM algorithm", Journal of the Royal Statistical
    Society, Series B, vol. 39, no. 1, pp. 1-38, 1977
  • T. Moon, "The Expectation-Maximization algorithm",
    IEEE Signal Processing Magazine, pp. 47-60, Nov.
    1996
  • Y. Weiss, "Motion Segmentation using EM: a short
    tutorial", www.cs.huji.ac.il/yweiss/tutorials.html