# Independent Component Analysis: The Fast ICA algorithm

1
Independent Component Analysis: The Fast ICA algorithm
• Jonathan Kam
• EE 645

2
Overview
• The Problem
• Definition of ICA
• Restrictions
• Ways to solve ICA
• Nongaussianity
• Mutual Information
• Maximum Likelihood
• Fast ICA algorithm
• Simulations
• Conclusion

3
The Problem
• Cocktail Problem
• Several Sources
• Several Sensors
• Ex: Humans hear a mixed signal but are able to unmix the signals and concentrate on a single source
• Recover source signals given only mixtures
• No prior knowledge of sources or mixing matrix
• aka Blind Source Separation (BSS)

4
Assumptions
• Source signals are statistically independent
• Knowing the value of one of the components does
not give any information about the others
• ICs have nongaussian distributions
• Initial distributions unknown
• At most one Gaussian source
• Recovered sources can be permuted and scaled

5
Definition of ICA
• Observe n linear mixtures x1, ..., xn of n independent components
• xj = aj1 s1 + aj2 s2 + ... + ajn sn, for all j
• aj is a column of the mixing matrix A
• Assume each mixture xj and each IC sk is a random variable
• Time differences between mixtures are dropped
• Independent components are latent variables
• Cannot be directly observed

6
Definition of ICA
• ICA mixture model: x = As
• A is the mixing matrix; s is the vector of source signals
• Goal
• Find some matrix W, so that
• s = Wx
• W is the inverse of A
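
A minimal numerical sketch of this mixing model, using a hypothetical 2x2 mixing matrix and uniform (non-Gaussian) sources; it only illustrates x = As and s = Wx for the case where A happens to be known:

```python
import numpy as np

rng = np.random.default_rng(0)

# Two non-Gaussian (uniform) sources, 1000 samples each
s = rng.uniform(-1, 1, size=(2, 1000))

# Hypothetical mixing matrix A
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])

x = A @ s                     # observed mixtures: x = A s

# If A were known, W = A^-1 would recover the sources exactly
W = np.linalg.inv(A)
s_hat = W @ x
print(np.allclose(s_hat, s))  # True
```

In the actual blind source separation setting A is unknown, so W must be estimated from x alone; that is what the rest of the presentation addresses.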

7
Definition Independence
• Two variables y1, y2 are independent if
• E{h1(y1) h2(y2)} = E{h1(y1)} E{h2(y2)} for any functions h1 and h2
• If variables are independent, they are uncorrelated
• Uncorrelated variables
• Defined by E{y1 y2} − E{y1} E{y2} = 0
• Uncorrelatedness does not imply independence
• Ex: (0,1), (0,-1), (1,0), (-1,0), each with probability 1/4
• E{y1^2 y2^2} = 0 ≠ 1/4 = E{y1^2} E{y2^2}
• ICA must establish independence, not just uncorrelatedness
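
The four-point example above can be checked numerically; this is a small sketch with the points taken at face value:

```python
import numpy as np

# The four points (0,1), (0,-1), (1,0), (-1,0), each with probability 1/4
pts = np.array([[0, 1], [0, -1], [1, 0], [-1, 0]], dtype=float)
y1, y2 = pts[:, 0], pts[:, 1]

print(np.mean(y1 * y2))                  # 0.0  -> y1 and y2 are uncorrelated
print(np.mean(y1**2 * y2**2))            # 0.0
print(np.mean(y1**2) * np.mean(y2**2))   # 0.25 -> E{y1^2 y2^2} != E{y1^2} E{y2^2}, so not independent
```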

8
ICA restrictions
• Cannot determine variances
• s and A are unknown
• Scalar multipliers on s could be canceled out by
a divisor on A
• Multiplier could even be -1
• Cannot determine order
• The order of the terms can be changed

9
ICA restrictions
• At most 1 Gaussian source
• If x1 and x2 are Gaussian, uncorrelated, and of unit variance
• their joint density is completely symmetric
• so it contains no information on the directions of the columns of the mixing matrix A

10
ICA estimation
• Nongaussianity estimates independence
• Estimate y = w^T x
• Let z = A^T w, so y = w^T A s = z^T s
• y is a linear combination of the si; by the central limit theorem, z^T s is more Gaussian than any single si
• z^T s becomes least Gaussian when it is equal to one of the si
• Then w^T x = z^T s equals an independent component
• Maximizing the nongaussianity of w^T x gives us one of the independent components
• Ways to estimate the components
• Maximizing a measure of nongaussianity
• Minimizing mutual information
• Maximum Likelihood

11
Measuring nongaussianity
• Kurtosis
• Fourth order cumulant
• Classical measure of nongaussianity
• kurt(y) = E{y^4} − 3(E{y^2})^2
• For Gaussian y, the fourth moment equals 3(E{y^2})^2
• Kurtosis for Gaussian random variables is 0
• Con: not a robust measure of nongaussianity
• Sensitive to outliers
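
A short sketch of this kurtosis measure on synthetic data (the sample sizes and distributions below are illustrative, not from the presentation):

```python
import numpy as np

def kurt(y):
    """Classical kurtosis: kurt(y) = E{y^4} - 3 (E{y^2})^2, for (roughly) zero-mean y."""
    y = y - np.mean(y)
    return np.mean(y**4) - 3 * np.mean(y**2) ** 2

rng = np.random.default_rng(0)
print(kurt(rng.normal(size=100_000)))       # ~0 for a Gaussian variable
print(kurt(rng.uniform(-1, 1, 100_000)))    # negative (sub-Gaussian)
print(kurt(rng.laplace(size=100_000)))      # positive (super-Gaussian)
```

A single large outlier can change the sample kurtosis drastically, which is the robustness problem noted above.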

12
Measuring nongaussianity
• Entropy (H): the degree of information that observing a variable gives
• A Gaussian variable has the largest entropy among all random variables of equal variance
• Negentropy J
• J(y) = H(ygauss) − H(y), where ygauss is a Gaussian variable with the same covariance as y
• Based on the information-theoretic quantity of differential entropy
• Computationally difficult to estimate exactly

13
Negentropy approximations
• Classical method using higher-order moments: J(y) ≈ (1/12) E{y^3}^2 + (1/48) kurt(y)^2
• Validity is limited by the nonrobustness of kurtosis

14
Negentropy approximations
• Hyvärinen 1998b: maximum-entropy principle
• J(y) ≈ Σi ki [E{Gi(y)} − E{Gi(v)}]^2
• G is some contrast function; the ki are positive constants
• v is a Gaussian variable of zero mean and unit variance
• Taking G(y) = y^4 makes the equation the kurtosis-based approximation

15
Negentropy approximations
• Instead of the kurtosis function, choose a contrast function G that doesn't grow too fast, e.g.
• G1(u) = (1/a1) log cosh(a1 u)
• G2(u) = −exp(−u^2/2)
• where 1 ≤ a1 ≤ 2
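
A sketch of these contrast functions together with the derivatives g that reappear in the FastICA update on slide 22:

```python
import numpy as np

def G1(u, a1=1.0):       # (1/a1) * log cosh(a1 * u), with 1 <= a1 <= 2
    return np.log(np.cosh(a1 * u)) / a1

def g1(u, a1=1.0):       # derivative of G1: tanh(a1 * u)
    return np.tanh(a1 * u)

def G2(u):               # -exp(-u^2 / 2)
    return -np.exp(-u**2 / 2)

def g2(u):               # derivative of G2: u * exp(-u^2 / 2)
    return u * np.exp(-u**2 / 2)
```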

16
Minimizing mutual information
• Mutual information I is defined as
• I(y1, ..., yn) = Σi H(yi) − H(y)
• Measure of the dependence between random variables
• I = 0 if and only if the variables are statistically independent
• Minimizing I is equivalent to maximizing negentropy (when the estimates are constrained to be uncorrelated)

17
Maximum Likelihood Estimation
• Closely related to infomax principle
• Infomax (Bell and Sejnowski, 1995)
• Maximizing the output entropy of a neural network
with non-linear outputs
• Densities of ICs must be estimated properly
• If the estimation is wrong, ML will give incorrect results

18
Fast ICA
• Preprocessing
• Fast ICA algorithm
• Maximize nongaussianity
• Unmixing signals

19
Fast ICA Preprocessing
• Centering
• Subtract the mean vector to make x a zero-mean variable
• The ICA algorithm does not need to estimate the mean
• The mean vector of s can be estimated as A^-1 m, where m is the mean that was subtracted
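
A minimal centering helper, assuming x is stored as an (n_mixtures, n_samples) array:

```python
import numpy as np

def center(x):
    """Subtract the per-row sample mean so every mixture in x has zero mean.

    Returns the centered data and the mean m, so the source means could
    later be restored via A^-1 m if needed.
    """
    m = x.mean(axis=1, keepdims=True)
    return x - m, m
```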

20
Fast ICA Preprocessing
• Whitening
• Transform x so that its components are uncorrelated and their variances equal unity
• Use the eigenvalue decomposition (EVD) of the covariance matrix: E{x x^T} = E D E^T
• D is the diagonal matrix of its eigenvalues
• E is the orthogonal matrix of its eigenvectors
• The whitened data is then x̃ = E D^(−1/2) E^T x
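
A sketch of this whitening step (same data layout as the centering helper above):

```python
import numpy as np

def whiten(x):
    """Whiten centered data x (n_mixtures, n_samples) via an EVD of its covariance."""
    cov = np.cov(x)                        # sample estimate of E{x x^T} for zero-mean x
    d, E = np.linalg.eigh(cov)             # eigenvalues d, orthogonal eigenvectors E
    V = E @ np.diag(d ** -0.5) @ E.T       # whitening matrix E D^(-1/2) E^T
    return V @ x, V

# After whitening, np.cov(whiten(x)[0]) is approximately the identity matrix.
```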

21
Fast ICA Preprocessing
• Whitening
• Transforms the mixing matrix into a new matrix Ã
• Makes Ã orthogonal
• Lessens the number of parameters that have to be estimated from n^2 to n(n−1)/2
• In large dimensions an orthogonal matrix contains approximately half the parameters of an arbitrary matrix

22
Fast ICA Algorithm
• One-unit (component) version
• 1. Choose an initial (e.g., random) weight vector w
• 2. Let w = E{x g(w^T x)} − E{g'(w^T x)} w
• Derivatives g of the contrast functions G:
• g1(u) = tanh(a1 u)
• g2(u) = u exp(−u^2/2)
• 3. Let w = w/‖w‖ (normalization step)
• 4. If not converged, go back to step 2
• Converged if ‖wnew − wold‖ < ε or ‖wnew + wold‖ < ε (w and −w define the same component)
• ε typically around 0.0001
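
A compact sketch of this one-unit iteration, assuming the data z has already been centered and whitened, and using g = tanh (the a1 = 1 case of g1):

```python
import numpy as np

def fastica_one_unit(z, eps=1e-4, max_iter=200, seed=0):
    """One-unit FastICA on whitened data z (n, n_samples); returns a weight vector w."""
    g = np.tanh
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2
    rng = np.random.default_rng(seed)
    w = rng.normal(size=z.shape[0])
    w /= np.linalg.norm(w)                                          # step 1: random unit-norm start
    for _ in range(max_iter):
        wz = w @ z                                                  # projections w^T x
        w_new = (z * g(wz)).mean(axis=1) - g_prime(wz).mean() * w   # step 2: fixed-point update
        w_new /= np.linalg.norm(w_new)                              # step 3: normalize
        if min(np.linalg.norm(w_new - w),
               np.linalg.norm(w_new + w)) < eps:                    # step 4: convergence check
            return w_new
        w = w_new
    return w
```

The projection w^T z then gives one estimated independent component.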

23
Fast ICA Algorithm
• Several unit algorithm
• Define B as the mixing matrix and B' as the matrix whose columns are the previously found columns of B
• Add a projection step before step 3
• Step 3 becomes
• 3. Let w(k) = w(k) − B'B'^T w(k), then w(k) = w(k)/‖w(k)‖
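
A sketch of that projection (deflation) step, with Bp standing for B' (its columns are the previously found weight vectors):

```python
import numpy as np

def deflate(w, Bp):
    """Make w orthogonal to the previously found weight vectors (columns of Bp), then normalize."""
    w = w - Bp @ (Bp.T @ w)     # projection step: w <- w - B' B'^T w
    return w / np.linalg.norm(w)
```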

24
Simple Simulation
• Separation of 2 components
• Figure 1 Two independent non-Gaussian wav samples
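
Not from the presentation, but a hedged end-to-end sketch of the same kind of two-component experiment, using synthetic stand-ins for the wav samples and scikit-learn's FastICA as a reference implementation:

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 8000)
s = np.vstack([np.sign(np.sin(2 * np.pi * 5 * t)),     # square wave: stand-in for wav sample 1
               rng.uniform(-1, 1, t.size)])            # uniform noise: stand-in for wav sample 2
A = np.array([[1.0, 0.6], [0.4, 1.0]])                 # hypothetical mixing matrix
x = A @ s                                              # mixed signals (cf. Figure 2)

ica = FastICA(n_components=2, random_state=0)
s_hat = ica.fit_transform(x.T).T   # recovered sources, up to permutation and scaling
```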

25
Simple Simulation
• Figure 2 Mixed signals

26
Simple Simulation
• Recovered signals vs original signals

Figure 3 Recovered signals
Figure 4 Original signals
27
Simulation Results
• IC 1 recovered in 6 steps and IC 2 recovered in 2
steps
• Retested with 20000 samples
• Requires approximately the same number of steps

28
Gaussian Simulation
Figure 5 Two wav samples and a noise signal
29
Gaussian Simulation
Figure 6 3 mixed signals
30
Gaussian Simulation
• Comparison of recovered signals vs original
signals

Figure 7 Recovered signals
Figure 8 Original signal
31
Gaussian Simulation 2
• Tried with 2 Gaussian components
• Components were not estimated properly because there was more than one Gaussian component

Figure 10 Original signals
Figure 11 Recovered signals
32
Conclusion
• Fast ICA properties
• No step size, unlike gradient based ICA
algorithms
• Finds components of any non-Gaussian distribution using any nonlinear contrast function g
• Components can be estimated one by one
• Other Applications
• Separation of Artifacts in image data
• Find hidden factors in financial data
• Reduce noise in natural images
• Medical signal processing: fMRI, ECG, EEG (Makeig)

33
References
• [1] Aapo Hyvärinen and Erkki Oja, "Independent Component Analysis: Algorithms and Applications." Neural Networks Research Centre, Helsinki University of Technology. Neural Networks, 13(4-5): 411-430, 2000.
• [2] Aapo Hyvärinen and Erkki Oja, "A Fast Fixed-Point Algorithm for Independent Component Analysis." Helsinki University of Technology, Laboratory of Computer and Information Science. Neural Computation, 9: 1483-1492, 1997.
• [3] Anthony J. Bell and Terrence J. Sejnowski, "The 'Independent Components' of Natural Scenes are Edge Filters." Howard Hughes Medical Institute, Computational Neurobiology Laboratory.
• [4] Te-Won Lee, Mark Girolami, Terrence J. Sejnowski, "Independent Component Analysis Using an Extended Infomax Algorithm for Mixed Subgaussian and Supergaussian Sources." 1997.
• [5] Antti Leino, "Independent Component Analysis: An Overview." 2004.
• [6] Erik G. Learned-Miller, John W. Fisher III, "ICA Using Spacings Estimates of Entropy." Journal of Machine Learning Research 4 (2003): 1271-1295.