1
Three Algorithms for Nonlinear Dimensionality
Reduction
  • Haixuan Yang, Group Meeting
  • Jan. 11, 2005

2
Outline
  • Problem
  • PCA (Principal Component Analysis)
  • MDS (Multidimensional Scaling)
  • Isomap (isometric mapping)
  • A Global Geometric Framework for Nonlinear
    Dimensionality Reduction. Science, 290(5500),
    2319-2323, 2000.
  • LLE (locally linear embedding)
  • Nonlinear Dimensionality Reduction by Locally
    Linear Embedding. Science, 290(5500), 2323-2326,
    2000.
  • Eigenmap (Laplacian eigenmap)
  • Laplacian Eigenmaps and Spectral Techniques for
    Embedding and Clustering. NIPS 2001.

3
Problem
  • Given a set x1, ..., xk of k points in R^l, find a
    set of points y1, ..., yk in R^m (m << l) such
    that yi represents xi as accurately as possible.
  • If the data xi lie on a hyperplane in the
    high-dimensional space, traditional algorithms
    such as PCA and MDS work well.
  • However, when the data xi lie on a nonlinear
    manifold in the high-dimensional space, these
    linear-algebra techniques no longer work.
  • A nonlinear manifold can be roughly understood
    as a distorted hyperplane, which may be twisted,
    folded, or curved.

4
PCA (Principal Component Analysis)
  • Reduce dimensionality of data by transforming
    correlated variables (bands) into a smaller
    number of uncorrelated components
  • Reveals meaningful latent information
  • Best preserves the variance as measured in the
    high-dimensional input space.
  • Nonlinear structure is invisible to PCA

5
First, a graphical look at the problem
[Figure: two (correlated) bands of data plotted as Band 1 vs. Band 2]
6
Regression Line Summarizes the Two Bands
[Figure: regression line through the Band 1 vs. Band 2 scatter]
7
Rotate axes to create two orthogonal
(uncorrelated) components
[Figure: axes rotated (and reflected) over the Band 1 vs. Band 2 scatter to give orthogonal components PC1 and PC2]
8
Partitioning of Variance
[Figure: the variance of the data partitioned into Var(PC1) and Var(PC2) along the two components]
9
PCA algorithm description
  • Step 1 Calculate the average x of the xi:
    x = (1/k) sum_i xi.
  • Step 2 Estimate the covariance matrix by
    M = (1/k) sum_i (xi - x)(xi - x)^T.
  • Step 3 Let lambda_p be the p-th eigenvalue (in
    decreasing order) of the matrix M, and vp the
    corresponding eigenvector. Then set the p-th
    component of the d-dimensional coordinate vector
    yi equal to vp . (xi - x), the projection of the
    centered point onto vp (a sketch follows below).
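
A minimal NumPy sketch of these three steps (the function name pca_embed
and the toy data at the end are illustrative, not from the slides):

  import numpy as np

  def pca_embed(X, d):
      """PCA: X is a (k, l) data matrix, d the target dimensionality."""
      mean = X.mean(axis=0)                 # Step 1: the average x of the xi
      Xc = X - mean                         # center the data
      M = Xc.T @ Xc / X.shape[0]            # Step 2: covariance estimate (l x l)
      eigvals, eigvecs = np.linalg.eigh(M)  # eigendecomposition (ascending order)
      top = np.argsort(eigvals)[::-1][:d]   # Step 3: top-d eigenvalues, decreasing
      return Xc @ eigvecs[:, top]           # yi = projections onto the eigenvectors

  # Toy usage: 100 points in R^5 mapped into R^2.
  Y = pca_embed(np.random.randn(100, 5), d=2)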

10
MDS
  • Step 1 Given the distance d(i, j) between points
    i and j.
  • Step 2 From d(i, j), get the covariance (Gram)
    matrix M by double centering:
    M = -(1/2) H S H, where S_ij = d(i, j)^2 and
    H = I - (1/k) 1 1^T is the centering matrix.
  • Step 3 The same as PCA (see the sketch below).
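
A sketch of classical MDS from a distance matrix, assuming the standard
double-centering construction (the name classical_mds is illustrative);
the top eigenvectors, scaled by the square roots of their eigenvalues,
give the coordinates:

  import numpy as np

  def classical_mds(D, d):
      """Classical MDS: D is a (k, k) matrix of pairwise distances d(i, j)."""
      k = D.shape[0]
      H = np.eye(k) - np.ones((k, k)) / k          # centering matrix
      M = -0.5 * H @ (D ** 2) @ H                  # Step 2: double-centered Gram matrix
      eigvals, eigvecs = np.linalg.eigh(M)
      top = np.argsort(eigvals)[::-1][:d]          # Step 3: top-d eigenpairs, as in PCA
      scale = np.sqrt(np.maximum(eigvals[top], 0)) # guard against tiny negative eigenvalues
      return eigvecs[:, top] * scale               # yi^p = sqrt(lambda_p) * v_p^i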

11
An example of embedding a two-dimensional
manifold in a three-dimensional space
[Figure: the straight-line (Euclidean) distance is not the true distance;
the distance along the manifold is the true distance]
12
Isomap basic idea
  • Learn the global distances from the local
    distances.
  • The local distances computed with the Euclidean
    metric are relatively accurate, because a small
    patch of the nonlinear manifold looks like a
    plane, so the direct Euclidean distance
    approximates the true distance within the patch.
  • The global Euclidean distances are not accurate,
    because the manifold is curved.
  • Preserve the estimated distances in the embedded
    space, in the same way as MDS.

13
Isomap algorithm description
  • Step 1 Construct the neighborhood graph
  • Define the graph over all data points by
    connecting points i and j if they are closer than
    epsilon (epsilon-Isomap), or if i is one of the K
    nearest neighbors of j (K-Isomap). Set edge
    lengths equal to dX(i, j).
  • Step 2 Compute shortest paths
  • Initialize dG(i, j) = dX(i, j) if i and j are
    linked by an edge, and dG(i, j) = infinity
    otherwise. Then compute the shortest-path
    distances dG(i, j) between all pairs of points in
    the weighted graph G. Let DG = (dG(i, j)).
  • Step 3 Construct the d-dimensional embedding
  • Let lambda_p be the p-th eigenvalue (in decreasing
    order) of the matrix tau(DG) (the double-centered
    matrix of squared graph distances, as in MDS), and
    v_p^i the i-th component of the p-th eigenvector.
    Then set the p-th component of the d-dimensional
    coordinate vector yi equal to sqrt(lambda_p) * v_p^i
    (see the sketch below).
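
A small end-to-end sketch of K-Isomap along these three steps (the
function name isomap and the default n_neighbors=10 are illustrative
assumptions, not from the slides):

  import numpy as np
  from scipy.spatial.distance import cdist
  from scipy.sparse.csgraph import shortest_path

  def isomap(X, d, n_neighbors=10):
      """K-Isomap: X is a (k, l) data matrix, d the target dimensionality.
      Assumes the neighborhood graph is connected."""
      DX = cdist(X, X)                              # Euclidean distances dX(i, j)
      k = DX.shape[0]
      G = np.full((k, k), np.inf)                   # Step 1: inf means "no edge"
      for i in range(k):
          nbrs = np.argsort(DX[i])[1:n_neighbors + 1]
          G[i, nbrs] = DX[i, nbrs]                  # connect i to its K nearest neighbors
      G = np.minimum(G, G.T)                        # make the graph symmetric
      DG = shortest_path(G, method="D")             # Step 2: geodesic estimates dG(i, j)
      H = np.eye(k) - np.ones((k, k)) / k           # Step 3: embed DG as in classical MDS
      M = -0.5 * H @ (DG ** 2) @ H                  # tau(DG)
      eigvals, eigvecs = np.linalg.eigh(M)
      top = np.argsort(eigvals)[::-1][:d]
      return eigvecs[:, top] * np.sqrt(np.maximum(eigvals[top], 0))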

14
An example: each picture, a 4096 (64x64)-dimensional
point, can be mapped into a 2-dimensional plane
15
Another example: the 3-dimensional points are
mapped into a 2-dimensional plane
16
LLE basic idea
  • Learn the local linear relations from the local
    data.
  • The local data are approximately linear, because a
    small patch of the nonlinear manifold looks like a
    plane.
  • Globally the data are not linear, because the
    manifold is curved.
  • Preserve the local linear relations in the
    embedded space, in a similar way to PCA.

17
LLE algorithm description
  • Step 1 Discover the adjacency information
  • For each xi, find its K nearest neighbors.
  • Step 2 Construct the approximation matrix
  • Choose the weights Wij by minimizing the
    reconstruction error
    E(W) = sum_i | xi - sum_j Wij xj |^2,
    under the condition that sum_j Wij = 1 (and
    Wij = 0 if xj is not a neighbor of xi).
  • Step 3 Compute the embedding
  • The embedding vectors yi are found by minimizing
    Phi(Y) = sum_i | yi - sum_j Wij yj |^2
    (see the sketch below).
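
A compact sketch of these three LLE steps (the function name lle, the
regularization reg, and n_neighbors=10 are illustrative assumptions):

  import numpy as np
  from scipy.spatial.distance import cdist

  def lle(X, d, n_neighbors=10, reg=1e-3):
      """LLE: X is a (k, l) data matrix, d the target dimensionality."""
      k = X.shape[0]
      dist = cdist(X, X)
      W = np.zeros((k, k))
      for i in range(k):
          nbrs = np.argsort(dist[i])[1:n_neighbors + 1]  # Step 1: K nearest neighbors
          Z = X[nbrs] - X[i]                             # neighbors relative to xi
          C = Z @ Z.T                                    # local Gram matrix
          C += np.eye(n_neighbors) * reg * np.trace(C)   # regularize for stability
          w = np.linalg.solve(C, np.ones(n_neighbors))   # Step 2: least-squares weights
          W[i, nbrs] = w / w.sum()                       # enforce sum_j Wij = 1
      I = np.eye(k)
      M = (I - W).T @ (I - W)                            # Step 3: cost matrix for Phi(Y)
      eigvals, eigvecs = np.linalg.eigh(M)
      return eigvecs[:, 1:d + 1]                         # drop the constant bottom eigenvector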

18
An example: 4096-dimensional face pictures are
embedded into a 2-dimensional plane
19
Eigenmap Basic Idea
  • Use local information to determine the embedded
    data.
  • Motivated by the way heat diffuses from one point
    to another.

20
Eigenmap
  • Step 1 Construct the neighborhood graph
  • The same as Isomap.
  • Step 2 Compute the weights of the graph
  • If node i and node j are connected, put
    Wij = exp( -|| xi - xj ||^2 / t )
    (the heat kernel); otherwise Wij = 0.
  • Step 3 Construct the d-dimensional embedding
  • Compute the eigenvalues and eigenvectors of the
    generalized eigenvector problem L f = lambda D f,
    where D is the diagonal matrix with
    Dii = sum_j Wji and L = D - W is the graph
    Laplacian.

21
Cont.
  • Let f0, ..., f_{k-1} be the solutions of the above
    equation, ordered by increasing eigenvalue:
  • L f0 = lambda_0 D f0
  • L f1 = lambda_1 D f1
  • ...
  • L f_{k-1} = lambda_{k-1} D f_{k-1}
  • Then yi is determined by the i-th components of
    the d eigenvectors f1, ..., fd (the constant
    eigenvector f0 is left out), as in the sketch
    below.
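
A minimal sketch of the Laplacian eigenmap steps above (the function name
laplacian_eigenmap and the parameters n_neighbors=10 and t=1.0 for the
heat kernel are illustrative assumptions):

  import numpy as np
  from scipy.spatial.distance import cdist
  from scipy.linalg import eigh

  def laplacian_eigenmap(X, d, n_neighbors=10, t=1.0):
      """Laplacian eigenmap: X is a (k, l) data matrix, d the target dimensionality."""
      k = X.shape[0]
      dist = cdist(X, X)
      W = np.zeros((k, k))
      for i in range(k):
          nbrs = np.argsort(dist[i])[1:n_neighbors + 1]  # Step 1: neighborhood graph
          W[i, nbrs] = np.exp(-dist[i, nbrs] ** 2 / t)   # Step 2: heat-kernel weights
      W = np.maximum(W, W.T)                             # symmetrize the weights
      D = np.diag(W.sum(axis=1))                         # Dii = sum_j Wij
      L = D - W                                          # graph Laplacian
      eigvals, eigvecs = eigh(L, D)                      # Step 3: L f = lambda D f, ascending
      return eigvecs[:, 1:d + 1]                         # use f1, ..., fd; drop f0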

22
An example: 256-dimensional speech data are
represented in a 2-dimensional plane
23
Conclusion
  • Isomap, LLE and Eigenmap can find the
    meaningful low-dimensional structure hidden in
    high-dimensional observations.
  • These three algorithms work especially well when
    the data lie on a nonlinear manifold, where
    linear methods such as PCA and MDS fail.