Title: EM Algorithm: Expectation Maximization Clustering Algorithm

1. EM Algorithm: Expectation Maximization Clustering Algorithm
- Data Mining, Morgan Kaufmann (Frank), pp. 218-227
- Mining Lab.
- October 27, 2004
2. Content
- Clustering
- K-Means via EM
- Mixture Model
- EM Algorithm
- Simple examples of EM
- EM Application WEKA
- References
3. Clustering (1/2)
- What is clustering?
- Clustering algorithms divide a data set into natural groups (clusters).
- Instances in the same cluster are similar to each other; they share certain properties.
- e.g., customer segmentation.
- Clustering vs. classification
- Classification is supervised learning.
- Clustering is unsupervised learning: there is no target variable to be predicted.
4. Clustering (2/2)
- Categorization of clustering methods
- Partitioning methods
- K-Means / K-Medoids / PAM / CLARA / CLARANS
- Hierarchical methods
- CURE / CHAMELEON / BIRCH
- Density-based methods
- DBSCAN / OPTICS
- Grid-based methods
- STING / CLIQUE / WaveCluster
- Model-based methods
- EM / COBWEB / Bayesian / Neural
Model-based clustering is also called probability-based or statistical clustering.
5. K-Means (1): Algorithm
- Step 0
- Select K objects as initial centroids.
- Step 1 (Assignment)
- For each object, compute its distance to each of the K centroids.
- Assign each object to the cluster whose centroid is closest.
- Step 2 (New Centroids)
- Compute a new centroid for each cluster.
- Step 3 (Convergence)
- Stop if the change in the centroids is less than the selected convergence criterion.
- Otherwise repeat from Step 1.
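The steps above can be sketched in a few lines of Python (a minimal illustration, not the book's code; `kmeans` and its signature are assumptions for this sketch):

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Plain K-means over 2-D points, following Steps 1-3 above."""
    for _ in range(max_iter):
        # Step 1 (Assignment): attach each point to its nearest centroid
        clusters = [[] for _ in centroids]
        for p in points:
            dists = [math.dist(p, c) for c in centroids]
            clusters[dists.index(min(dists))].append(p)
        # Step 2 (New centroids): mean of each cluster
        # (an empty cluster keeps its old centroid)
        new = [tuple(sum(coord) / len(cl) for coord in zip(*cl)) if cl else c
               for cl, c in zip(clusters, centroids)]
        # Step 3 (Convergence): stop when the centroids no longer move
        if new == centroids:
            return new
        centroids = new
    return centroids
```

For example, `kmeans([(4, 4), (3, 4), (4, 2), (0, 2), (1, 1), (1, 0)], [(4, 4), (3, 4)])` converges in a few iterations.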
6. K-Means (2): Simple Example
[Figure: input data → random centroids → assignment → new centroids (check) → assignment → new centroids (check) → ... repeated until the centroids stop changing]
7. K-Means (3): Weakness with Outliers (Noise)
8. K-Means (4): Calculation
Data without the outlier: (4,4), (3,4), (4,2), (0,2), (1,1), (1,0)
Data with the outlier: the same points plus (100,0)
Initial split in both cases: cluster 1 = {(4,4), (3,4)}, cluster 2 = the remaining points.

Without the outlier:
1. Centroids: <3.5, 4>, <1.5, 1.25>
   Assignment: <3.5, 4> - (3,4), (4,4), (4,2); <1.5, 1.25> - (0,2), (1,1), (1,0)
2. Centroids: <3.67, 3.33>, <0.67, 1>

With the outlier:
1. Centroids: <3.5, 4>, <21, 1>
   Assignment: <3.5, 4> - (0,2), (1,1), (1,0), (3,4), (4,4), (4,2); <21, 1> - (100,0)
2. Centroids: <2.17, 2.17>, <100, 0>
9. K-Means (5): Comparison with EM
- K-Means
- Hard clustering.
- An instance belongs to exactly one cluster.
- Based on Euclidean distance.
- Not robust to outliers or differing value ranges.
- EM
- Soft clustering.
- An instance belongs to several clusters, each with a membership probability.
- Based on probability density.
- Can handle both numeric and nominal attributes.
[Figure: instance I belongs entirely to cluster C1 under K-Means, but to C1 with probability 0.7 and C2 with probability 0.3 under EM]
10. Mixture Model (1)
- A mixture is a set of k probability distributions, representing k clusters.
- Each probability distribution has a mean and a variance.
- The mixture model combines several normal distributions.
11. Mixture Model (2)
- With only one numeric attribute and two clusters, there are five parameters: two means, two standard deviations, and the mixing probability.
12. Mixture Model (3): Simple Example
- Probability that an instance x belongs to cluster A, computed from the probability density function.
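Written out, this is the standard Bayes-rule form for a two-component mixture (f denotes a cluster's density, p_A and p_B the cluster priors):

```latex
\Pr[A \mid x] \;=\; \frac{f(x;\mu_A,\sigma_A)\,p_A}{f(x;\mu_A,\sigma_A)\,p_A + f(x;\mu_B,\sigma_B)\,p_B}
```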
13. Mixture Model (4): Probability Density Function
- Normal distribution
- Gaussian density function
- Poisson distribution
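The Gaussian density named above is:

```latex
f(x;\mu,\sigma) \;=\; \frac{1}{\sqrt{2\pi}\,\sigma}\, \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)
```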
14. Mixture Model (5): Probability Density Function
[Figure: mixture densities over successive iterations]
15. EM Algorithm (1)
- Step 1 (Initialization)
- Assign random membership probabilities.
- Step 2 (Maximization Step)
- Re-create the cluster model: re-compute the parameters θ (mean, variance) of each normal distribution.
- Step 3 (Expectation Step)
- Update each record's weights.
- Step 4
- Calculate the log-likelihood.
- If the value saturates, exit; otherwise go to Step 2.
The M-step adjusts the parameters; the E-step adjusts the weights.
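The loop above can be sketched for a two-component, one-dimensional Gaussian mixture (a minimal illustration under assumed initial values, not the book's code; for brevity it runs a fixed number of iterations rather than testing saturation):

```python
import math

def normal_pdf(x, mu, sigma):
    """Gaussian density f(x; mu, sigma)."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / (math.sqrt(2 * math.pi) * sigma)

def em_1d(data, n_iter=50):
    """EM for a two-component 1-D Gaussian mixture (clusters A and B)."""
    # Step 1 (Initialization): crude starting guesses
    mu_a, mu_b = min(data), max(data)
    sigma_a = sigma_b = (max(data) - min(data)) / 4 or 1.0
    p_a = 0.5
    log_lik = float("-inf")
    for _ in range(n_iter):
        # E-step: membership weight of each record in cluster A
        w = []
        for x in data:
            fa = p_a * normal_pdf(x, mu_a, sigma_a)
            fb = (1 - p_a) * normal_pdf(x, mu_b, sigma_b)
            w.append(fa / (fa + fb))
        # M-step: weighted re-estimation of means, deviations, and prior
        sw = sum(w)
        mu_a = sum(wi * x for wi, x in zip(w, data)) / sw
        mu_b = sum((1 - wi) * x for wi, x in zip(w, data)) / (len(data) - sw)
        sigma_a = math.sqrt(sum(wi * (x - mu_a) ** 2
                                for wi, x in zip(w, data)) / sw) or 1e-6
        sigma_b = math.sqrt(sum((1 - wi) * (x - mu_b) ** 2
                                for wi, x in zip(w, data)) / (len(data) - sw)) or 1e-6
        p_a = sw / len(data)
        # Step 4: log-likelihood (the quantity checked for saturation)
        log_lik = sum(math.log(p_a * normal_pdf(x, mu_a, sigma_a)
                               + (1 - p_a) * normal_pdf(x, mu_b, sigma_b))
                      for x in data)
    return p_a, mu_a, sigma_a, mu_b, sigma_b, log_lik
```

On well-separated data such as `[1, 2, 3, 10, 11, 12]` the means converge near 2 and 11.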
16. EM Algorithm (2): Initialization
- Random probability
- M-step
- Example
17. EM Algorithm (3): M-Step Parameters (Mean, Dev)
- Estimating parameters from weighted instances
- Parameters: means, deviations
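With membership weights w_i for cluster A, the weighted estimates take the standard form:

```latex
\mu_A = \frac{\sum_i w_i\, x_i}{\sum_i w_i}, \qquad
\sigma_A^2 = \frac{\sum_i w_i\, (x_i - \mu_A)^2}{\sum_i w_i}
```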
18. EM Algorithm (3): M-Step Parameters (Mean, Dev) (cont.)
19. EM Algorithm (4): E-Step Weights
20. EM Algorithm (5): E-Step Weights (cont.)
21. EM Algorithm (6): Objective Function (check)
- Log-likelihood function
- The product, over all instances, of each instance's probability under the mixture; the log is taken to make the product easier to analyze.
- 1-dimensional data, 2 clusters A and B
- N-dimensional data, K clusters: per-cluster mean vector and covariance matrix
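For the 1-dimensional, two-cluster case, the log-likelihood being monitored is:

```latex
\log L \;=\; \sum_i \log\!\big(\, p_A\, f(x_i;\mu_A,\sigma_A) + p_B\, f(x_i;\mu_B,\sigma_B) \,\big)
```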
22. EM Algorithm (7): Objective Function (check)
- Covariance matrix and mean vector
23. EM Algorithm (8): Termination
- The procedure stops when the log-likelihood saturates.
[Figure: log-likelihood versus number of iterations]
24. EM Algorithm (1): Simple Data
- EM example
- 6 data points (3 samples per class)
- 2 classes (circle, rectangle)
25. EM Algorithm (2)
Likelihood function of the two component means θ1 and θ2
26. EM Algorithm (3)
27. EM Example (1)
- Example dataset
- 2 columns (Math, English), 6 records
28. EM Example (2)
- Distribution of Math
- mean 56.67
- variance 776.73
- Distribution of English
- mean 82.5
- variance 197.50
[Figure: histograms of the two attributes on a 0-100 scale]
29. EM Example (3)
30. EM Example (4)
Maximization step (parameter adjustment)
31. EM Example (4) (cont.)
32. EM Example (5)
Expectation step (weight adjustment)
Maximization step (parameter adjustment)
33. EM Example (6)
Expectation step (weight adjustment)
Maximization step (parameter adjustment)
34. EM Example (6) (cont.)
Expectation step (weight adjustment)
Maximization step (parameter adjustment)
35. EM Application (1): Weka
- Weka
- Waikato University in New Zealand
- Open-source mining tool
- http://www.cs.waikato.ac.nz/ml/weka
- Experiment data
- Iris data
- Real data
- Department customer data
- Modified customer data
36. EM Application (2): IRIS Data
- Data info
- Attribute information
- sepal length in cm / sepal width / petal length / petal width in cm
- class: Iris Setosa / Iris Versicolour / Iris Virginica
37. EM Application (3): IRIS Data
38. EM Application (4): Weka Usage
- Weka clustering packages: weka.clusterers
- Command-line execution:
  java weka.clusterers.EM -t iris.arff -N 2
  java weka.clusterers.EM -t iris.arff -N 2 -V
- GUI execution:
  java -jar weka.jar
39. EM Application (4): Weka Usage (cont.)
- Options for clustering in Weka
40. EM Application (5): Weka Usage
41. EM Application (5): Weka Usage - Input File Format

Summary statistics:
               Min  Max  Mean  SD    Class Correlation
sepal length   4.3  7.9  5.84  0.83   0.7826
sepal width    2.0  4.4  3.05  0.43  -0.4194
petal length   1.0  6.9  3.76  1.76   0.9490 (high!)
petal width    0.1  2.5  1.20  0.76   0.9565 (high!)

@RELATION iris
@ATTRIBUTE sepallength REAL
@ATTRIBUTE sepalwidth REAL
@ATTRIBUTE petallength REAL
@ATTRIBUTE petalwidth REAL
@ATTRIBUTE class {Iris-setosa,Iris-versicolor,Iris-virginica}
@DATA
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
42. EM Application (6): Weka Usage - Output Format

Number of clusters: 3

Cluster: 0  Prior probability: 0.3333
Attribute: sepallength  Normal Distribution. Mean = 5.006  StdDev = 0.3489
Attribute: sepalwidth   Normal Distribution. Mean = 3.418  StdDev = 0.3772
Attribute: petallength  Normal Distribution. Mean = 1.464  StdDev = 0.1718
Attribute: petalwidth   Normal Distribution. Mean = 0.244  StdDev = 0.1061
Attribute: class        Discrete Estimator. Counts = 51 1 1 (Total = 53)

Clustered instances:
0  50 (33%)
1  48 (32%)
2  52 (35%)

Log likelihood: -2.21138
43. EM Application (6): Result Visualization
44. References
- Data Mining: Practical Machine Learning Tools and Techniques. Ian H. Witten and Eibe Frank. Morgan Kaufmann. pp. 218-255.
- Data Mining: Concepts and Techniques. Jiawei Han. Chapter 8.
- The Expectation Maximization Algorithm. Frank Dellaert. February 2002.
- A Gentle Tutorial of the EM Algorithm and its Application to Parameter Estimation for Gaussian Mixture and Hidden Markov Models.