EM Algorithm: Expectation Maximization Clustering Algorithm (book: Data Mining, Morgan Kaufmann, Witten & Frank) - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: EM Algorithm: Expectation Maximization Clustering Algorithm (book: Data Mining, Morgan Kaufmann, Witten & Frank)


1
EM Algorithm: Expectation Maximization Clustering Algorithm
book: Data Mining, Morgan Kaufmann, Witten & Frank
  • Data Mining, Morgan Kaufmann, pp. 218-227
  • Mining Lab.
  • October 27, 2004

2
Content
  • Clustering
  • K-Means via EM
  • Mixture Model
  • EM Algorithm
  • Simple examples of EM
  • EM Application: WEKA
  • References

3
Clustering (1/2)
  • What is clustering?
  • Clustering algorithms divide a data set into
    natural groups (clusters).
  • Instances in the same cluster are similar to each
    other; they share certain properties.
  • e.g., customer segmentation.
  • Clustering vs. Classification
  • Classification: supervised learning.
  • Clustering: unsupervised learning.
  • No target variable to be predicted.

4
Clustering (2/2)
  • Categorization of Clustering Methods
  • Partitioning methods
  • K-Means / K-Medoids / PAM / CLARA / CLARANS
  • Hierarchical methods
  • CURE / CHAMELEON / BIRCH
  • Density-based methods
  • DBSCAN / OPTICS
  • Grid-based methods
  • STING / CLIQUE / Wave-Cluster
  • Model-based methods
  • EM / COBWEB / Bayesian / Neural

Model-based clustering is also called probability-based or statistical clustering.
5
K-Means (1): Algorithm
  • Step 0
  • Select K objects as initial centroids.
  • Step 1 (Assignment)
  • For each object, compute its distance to each of the k centroids.
  • Assign each object to the cluster whose centroid is closest.
  • Step 2 (New Centroids)
  • Compute a new centroid for each cluster.
  • Step 3 (Convergence)
  • Stop if the change in the centroids is less than the selected
    convergence criterion (a sketch of the full loop follows below).
  • Otherwise repeat Step 1.
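A minimal sketch of these steps in plain Python; the function name kmeans, the tolerance tol, and the seeded random initialization are our own choices for illustration, not part of the slides.

```python
import math
import random

def kmeans(points, k, tol=1e-6, max_iter=100, seed=0):
    """Minimal K-Means; points are tuples of numbers."""
    rng = random.Random(seed)
    # Step 0: select k objects as the initial centroids.
    centroids = rng.sample(points, k)
    for _ in range(max_iter):
        # Step 1 (Assignment): each object joins its closest centroid's cluster.
        clusters = [[] for _ in range(k)]
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Step 2 (New Centroids): the mean of each cluster (keep the old
        # centroid if a cluster ends up empty).
        new_centroids = [
            tuple(sum(xs) / len(c) for xs in zip(*c)) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
        # Step 3 (Convergence): stop when no centroid moves more than tol.
        moved = max(math.dist(a, b) for a, b in zip(centroids, new_centroids))
        centroids = new_centroids
        if moved < tol:
            break
    return centroids, clusters
```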

6
K-Means (2): Simple Example
[Figure: input data → random centroids → assignment → new centroids (check) → assignment, repeated until the centroids stop changing.]
7
K-Means (3): Weakness on Outliers (Noise)
8
K-Means (4): Calculation
Data: (3,4), (4,4), (4,2), (0,2), (1,1), (1,0); outlier: (100,0).
Initial centroids in both runs: <4,4> and <3,4>.

Run without the outlier:
  1. New centroids <3.5, 4> and <1.5, 1.25>;
     <3.5, 4> gets (3,4), (4,4), (4,2); <1.5, 1.25> gets (0,2), (1,1), (1,0).
  2. New centroids <3.67, 3.3> and <0.67, 1>;
     assignment unchanged: <3.67, 3.3> gets (3,4), (4,4), (4,2);
     <0.67, 1> gets (0,2), (1,1), (1,0).
     Converged: the two natural clusters are found.

Run with the outlier (100,0) added:
  1. New centroids <3.5, 4> and <21, 1>;
     <3.5, 4> gets (0,2), (1,1), (1,0), (3,4), (4,4), (4,2); <21, 1> gets (100,0).
  2. New centroids <2.1, 2.1> and <100, 0>;
     assignment unchanged: <2.1, 2.1> gets all six regular points;
     <100, 0> gets (100,0).
     Converged: the outlier captures a centroid for itself and the two
     natural clusters are merged into one.
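The same comparison can be reproduced with the kmeans sketch from slide 5 (random initialization, so the exact intermediate centroids may differ from the hand calculation above):

```python
# Data from the slide; (100, 0) is the outlier.
data = [(3, 4), (4, 4), (4, 2), (0, 2), (1, 1), (1, 0)]

centroids, _ = kmeans(data, k=2)
print(centroids)  # roughly <3.67, 3.33> and <0.67, 1.0> without the outlier

centroids, _ = kmeans(data + [(100, 0)], k=2)
print(centroids)  # the outlier tends to capture a centroid for itself
```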
9
K-Means (5): Comparison with EM
  • K-Means
  • Hard clustering.
  • An instance belongs to exactly one cluster.
  • Based on Euclidean distance.
  • Not robust to outliers or to differing value ranges.
  • EM
  • Soft clustering.
  • An instance belongs to several clusters, each with a
    membership probability.
  • Based on probability density.
  • Can handle both numeric and nominal attributes.

[Figure: hard assignment (instance I belongs entirely to cluster C1) vs. soft assignment (I belongs to C1 with probability 0.7 and to C2 with probability 0.3).]
10
Mixture Model (1)
  • A mixture is a set of k probability distributions,
    representing k clusters.
  • Each probability distribution has a mean and a variance.
  • The mixture model combines several normal distributions;
    in symbols, the combined density is sketched below.

11
Mixture Model (2)
  • Only one numeric attribute, two clusters A and B.
  • Five parameters: the two means μA, μB, the two standard
    deviations σA, σB, and the mixing probability pA (pB = 1 - pA).

12
Mixture Model (3): Simple Example
  • Probability that an instance x belongs to cluster
    A, computed from the probability density function (see below).
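The formula this slide illustrates is, in the book's notation, Bayes' rule with the normal density f:

```latex
\Pr[A \mid x] \;=\; \frac{f(x;\mu_A,\sigma_A)\, p_A}{\Pr[x]},
\qquad
\Pr[x] \;=\; f(x;\mu_A,\sigma_A)\,p_A + f(x;\mu_B,\sigma_B)\,p_B
```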
13
Mixture Model (4): Probability Density Function
  • Normal distribution
  • Gaussian density function (standard form below)
  • Poisson distribution
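The two density functions named above, in their standard textbook forms:

```latex
% Normal (Gaussian) density with mean \mu and standard deviation \sigma:
f(x;\mu,\sigma) \;=\; \frac{1}{\sqrt{2\pi}\,\sigma}\,
  \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)

% Poisson distribution with rate \lambda:
\Pr[X=n] \;=\; \frac{\lambda^{n} e^{-\lambda}}{n!}
```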

14
Mixture Model (5): Probability Density Function
[Figure: the estimated density functions change from iteration to iteration.]
15
EM Algorithm (1)
  • Step 1. (Initialization)
  • Assign random probabilities (cluster weights) to each record.
  • Step 2. (Maximization Step)
  • Re-create the cluster model:
  • re-compute the parameters θ (mean, variance) of each
    normal distribution.
  • Step 3. (Expectation Step)
  • Update each record's weights.
  • Step 4.
  • Calculate the log-likelihood.
  • If the value saturates, exit (a sketch of the full loop follows below).
  • If not, go to Step 2.

[Diagram: the loop alternates parameter adjustment (M-step) and weight adjustment (E-step).]
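A minimal sketch of this loop for the one-attribute, two-cluster mixture of slide 11, in plain Python. The function names, the seeded random initialization, and the 1e-6 stopping threshold are our own assumptions for illustration, not taken from the slides.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Gaussian density f(x; mu, sigma)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def em_two_clusters(xs, max_iter=100, tol=1e-6, seed=0):
    """EM for a two-component 1-D Gaussian mixture (clusters A and B)."""
    rng = random.Random(seed)
    # Step 1 (Initialization): random weight w_i = Pr[A | x_i] per record.
    w = [rng.random() for _ in xs]
    prev_ll = -math.inf
    for _ in range(max_iter):
        # Step 2 (M-step): re-estimate the five parameters from the
        # weighted instances.
        sa = sum(w)
        sb = len(xs) - sa
        mu_a = sum(wi * x for wi, x in zip(w, xs)) / sa
        mu_b = sum((1 - wi) * x for wi, x in zip(w, xs)) / sb
        sd_a = math.sqrt(sum(wi * (x - mu_a) ** 2 for wi, x in zip(w, xs)) / sa)
        sd_b = math.sqrt(sum((1 - wi) * (x - mu_b) ** 2 for wi, x in zip(w, xs)) / sb)
        p_a = sa / len(xs)
        # Step 3 (E-step): update each record's weight by Bayes' rule.
        w, ll = [], 0.0
        for x in xs:
            fa = p_a * normal_pdf(x, mu_a, sd_a)
            fb = (1 - p_a) * normal_pdf(x, mu_b, sd_b)
            w.append(fa / (fa + fb))
            ll += math.log(fa + fb)
        # Step 4: stop when the log-likelihood saturates.
        if ll - prev_ll < tol:
            break
        prev_ll = ll
    return (p_a, mu_a, sd_a, mu_b, sd_b), ll
```

Calling em_two_clusters on a list of numbers returns the five fitted parameters (pA, μA, σA, μB, σB) and the final log-likelihood.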
16
EM Algorithm (2): Initialization
  • Random probability: each record starts with random cluster weights.
  • M-Step: initial parameters are then estimated from these random weights.
  • Example

17
EM Algorithm (3): M-Step Parameters (Mean, Dev)
  • Estimating the parameters from the weighted instances
    (see the formulas below).
  • Parameters:
  • means, standard deviations.
18
EM Algorithm (3): M-Step Parameters (Mean, Dev) (cont.)
19
EM Algorithm (4): E-Step Weight
  • Compute each record's weight (formula below),
  • where f is the normal density function defined earlier.
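The weight formula the slide refers to is the posterior of slide 12, applied to each record:

```latex
w_i \;=\; \Pr[A \mid x_i]
      \;=\; \frac{p_A\, f(x_i;\mu_A,\sigma_A)}
                 {p_A\, f(x_i;\mu_A,\sigma_A) + p_B\, f(x_i;\mu_B,\sigma_B)}
```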

20
EM Algorithm (5): E-Step Weight (cont.)
  • Compute the weights as on the previous slide,
  • where f is the normal density function.

21
EM Algorithm (6): Objective Function (check)
  • Log-likelihood function
  • Sums, over all instances, the log of the probability of the
    instance under the mixture (its probability of belonging to
    cluster A or B).
  • The log is used to make the analysis easier: products of
    probabilities become sums.

For 1-dimensional data with 2 clusters A and B, the objective is shown
below; for N-dimensional data with K clusters, each cluster has a mean
vector and a covariance matrix.
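For the 1-dimensional, two-cluster case this objective is:

```latex
\log L \;=\; \sum_{i=1}^{n}
  \log\bigl(p_A\, f(x_i;\mu_A,\sigma_A) + p_B\, f(x_i;\mu_B,\sigma_B)\bigr)
```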
22
EM Algorithm (7): Objective Function (check)
  • In the multidimensional case the per-cluster parameters are a mean
    vector and a covariance matrix; the density is sketched below.
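For d-dimensional data the per-cluster density is the standard multivariate normal (not spelled out on the slide):

```latex
f(\mathbf{x};\boldsymbol{\mu},\Sigma) \;=\;
  \frac{1}{(2\pi)^{d/2}\,|\Sigma|^{1/2}}\,
  \exp\!\Bigl(-\tfrac{1}{2}(\mathbf{x}-\boldsymbol{\mu})^{\top}
              \Sigma^{-1}(\mathbf{x}-\boldsymbol{\mu})\Bigr)
```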
23
EM Algorithm (8): Termination
  • Termination
  • The procedure stops when the log-likelihood saturates.

[Figure: log-likelihood values Q0, Q1, Q2, Q3, Q4 plotted against the number of iterations, rising and then flattening out.]
24
EM Algorithm (1): Simple Data
  • EM example
  • 6 data points (3 samples per class)
  • 2 classes (circle, rectangle)

25
EM Algorithm (2)
Likelihood function of the two component means θ1 and θ2
26
EM Algorithm (3)
27
EM Example (1)
  • Example dataset
  • 2 columns (Math, English), 6 records

28
EM Example (2)
  • Distribution of Math:
  • mean 56.67
  • variance 776.73
  • Distribution of English:
  • mean 82.5
  • variance 197.50

[Figure: histograms of the Math and English scores on a 0-100 scale.]
29
EM Example (3)
  • Random Cluster Weight

30
EM Example (4)
  • Iteration 1

Maximization Step (parameter adjustment)
31
EM Example (4)
32
EM Example (5)
  • Iteration 2

Expectation Step (Weight adjustment)
Maximization Step (parameter adjustment)
33
EM Example (6)
  • Iteration 3

Expectation Step (Weight adjustment)
Maximization Step (parameter adjustment)
34
EM Example (6)
  • Iteration 3

Expectation Step (Weight adjustment)
Maximization Step (parameter adjustment)
35
EM Application (1): Weka
  • Weka
  • University of Waikato, New Zealand
  • Open-source data mining tool
  • http://www.cs.waikato.ac.nz/ml/weka
  • Experiment data
  • Iris data
  • Real data
  • Department customer data
  • Modified customer data

36
EM Application (2): IRIS Data
  • Data info
  • Attribute information:
  • sepal length in cm / sepal width in cm / petal length in cm /
    petal width in cm
  • class: Iris Setosa / Iris Versicolour / Iris Virginica

37
EM Application (3): IRIS Data
38
EM Application (4): Weka Usage
  • Weka clustering package: weka.clusterers
  • Command-line execution (-t gives the training ARFF file,
    -N the number of clusters, -V verbose output)
  • GUI execution

Command line:
  java weka.clusterers.EM -t iris.arff -N 2
  java weka.clusterers.EM -t iris.arff -N 2 -V
GUI:
  java -jar weka.jar
39
EM Application (4): Weka Usage
  • Options for clustering in Weka

40
EM Application (5): Weka Usage
41
EM Application (5): Weka Usage: Input File Format

Summary statistics:
                 Min  Max  Mean  SD    Class correlation
  sepal length   4.3  7.9  5.84  0.83   0.7826
  sepal width    2.0  4.4  3.05  0.43  -0.4194
  petal length   1.0  6.9  3.76  1.76   0.9490 (high!)
  petal width    0.1  2.5  1.20  0.76   0.9565 (high!)

@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
42
EM Application (6): Weka Usage: Output Format

Number of clusters: 3

Cluster: 0  Prior probability: 0.3333
Attribute: sepallength
  Normal Distribution. Mean = 5.006  StdDev = 0.3489
Attribute: sepalwidth
  Normal Distribution. Mean = 3.418  StdDev = 0.3772
Attribute: petallength
  Normal Distribution. Mean = 1.464  StdDev = 0.1718
Attribute: petalwidth
  Normal Distribution. Mean = 0.244  StdDev = 0.1061
Attribute: class
  Discrete Estimator. Counts = 51 1 1 (Total = 53)

Clustered Instances
  0   50 (33%)
  1   48 (32%)
  2   52 (35%)

Log likelihood: -2.21138
43
EM Application (6): Result Visualization
44
References
  • Data Mining: Practical Machine Learning Tools and Techniques.
    Ian H. Witten and Eibe Frank. Morgan Kaufmann, pp. 218-255.
  • Data Mining: Concepts and Techniques. Jiawei Han. Chapter 8.
  • The Expectation Maximization Algorithm. Frank Dellaert, February 2002.
  • A Gentle Tutorial of the EM Algorithm and its Application to
    Parameter Estimation for Gaussian Mixture and Hidden Markov Models.
    Jeff A. Bilmes, 1998.