1
Nonlinear Dimensionality Reduction
  • Presented by Dragana Veljkovic

2
Overview
  • Curse-of-dimensionality
  • Dimension reduction techniques
  • Isomap
  • Locally linear embedding (LLE)
  • Problems and improvements

3
Problem description
  • Large amount of data being collected leads to
    creation of very large databases
  • Most problems in data mining involve data with a
    large number of measurements (or dimensions)
  • E.g. Protein matching, fingerprint recognition,
    meteorological predictions, satellite image
    repositories
  • Reducing the number of dimensions improves our
    ability to extract knowledge

4
Problem definition
  • Original high dimensional data
  • X = (x1, ..., xn) where xi = (xi1, ..., xip)^T
  • Underlying low dimensional data
  • Y = (y1, ..., yn) where yi = (yi1, ..., yiq)^T and q << p
  • Assume X lies on a smooth low dimensional
    manifold in the high dimensional space
  • Find the mapping that captures the important
    features
  • Determine q that can best describe the data

5
Different approaches
  • Local or shape-preserving
  • Global or topology-preserving
  • Local embeddings
  • Local methods simplify the representation of each
    object regardless of the rest of the data
  • The selected features retain most of the
    information
  • E.g. Fourier decomposition, wavelet
    decomposition, piecewise constant approximation

6
Global or Topology preserving
  • Mostly used for visualization and classification
  • PCA or KL decomposition
  • MDS
  • SVD
  • ICA

7
Local embeddings (LE)
  • Overlapping local neighborhoods, collectively
    analyzed, can provide information on global
    geometry
  • LE preserves the local neighborhood of each
    object, while global distances are preserved
    only implicitly, through non-neighboring objects
  • Examples: Isomap and LLE

8
Another classification
  • Linear and non-linear methods

9
Neighborhood
  • Two ways to select neighboring objects (both
    sketched below)
  • k nearest neighbors (k-NN): can make neighbor
    distances non-uniform across the dataset
  • ε-ball: prior knowledge of the data is needed to
    choose a reasonable ε, and the size of the
    neighborhoods can vary
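
A minimal sketch of the two selection rules, assuming the data is a
NumPy array X of shape (n, p); the function names are illustrative,
not from the slides:

```python
import numpy as np

def knn_neighbors(X, k):
    """Indices of the k nearest neighbors of each point (excluding itself)."""
    # Pairwise Euclidean distances, shape (n, n).
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)            # a point is not its own neighbor
    return np.argsort(D, axis=1)[:, :k]    # k closest indices per row

def eps_ball_neighbors(X, eps):
    """For each point, indices of all points within distance eps."""
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(D, np.inf)
    return [np.flatnonzero(row <= eps) for row in D]
```

Note that knn_neighbors always returns exactly k neighbors per point,
while eps_ball_neighbors returns lists of varying length, which is
exactly the trade-off the bullets above describe.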

10
Isomap general idea
  • Only geodesic distances reflect the true low
    dimensional geometry of the manifold
  • MDS and PCA see only Euclidean distances and
    therefore fail to detect the intrinsic
    low-dimensional structure
  • Geodesic distances are hard to compute even if
    you know the manifold
  • In a small neighborhood, Euclidean distance is a
    good approximation of the geodesic distance
  • For faraway points, geodesic distance is
    approximated by adding up a sequence of short
    hops between neighboring points

11
Isomap algorithm
  • Find the neighborhood of each object by computing
    distances between all pairs of points and
    selecting the closest
  • Build a graph with a node for each object and an
    edge between neighboring points. The Euclidean
    distance between two objects is used as the edge
    weight
  • Use a shortest-path graph algorithm to fill in
    the distances between all non-neighboring points
  • Apply classical MDS to this distance matrix (see
    the sketch below)
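
A minimal, unoptimized sketch of these four steps, assuming SciPy is
available and the neighborhood graph is connected; the function and
variable names are illustrative, not from the slides:

```python
import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.csgraph import shortest_path

def isomap(X, k=7, q=2):
    """Embed the (n, p) array X into q dimensions via Isomap."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)

    # Steps 1-2: symmetric k-NN graph weighted by Euclidean distance.
    G = lil_matrix((n, n))
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]   # column 0 is the point itself
    for i in range(n):
        for j in nbrs[i]:
            G[i, j] = G[j, i] = D[i, j]

    # Step 3: geodesic distances approximated by shortest paths
    # (entries are inf if the graph is disconnected).
    geo = shortest_path(G.tocsr(), directed=False)

    # Step 4: classical MDS on the geodesic distance matrix.
    J = np.eye(n) - np.ones((n, n)) / n        # centering matrix
    B = -0.5 * J @ (geo ** 2) @ J              # double-centered Gram matrix
    vals, vecs = np.linalg.eigh(B)             # ascending eigenvalues
    top = np.argsort(vals)[::-1][:q]           # keep the q largest
    lam = np.maximum(vals[top], 0)             # clip tiny negatives for sqrt
    return vecs[:, top] * np.sqrt(lam)         # (n, q) coordinates
```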

12
Isomap
13
Isomap on face images
14
Isomap on hand images
15
Isomap on handwritten twos
16
Isomap - summary
  • Inherits features of MDS and PCA:
  • guaranteed asymptotic convergence to the true
    structure
  • polynomial runtime
  • non-iterative
  • Ability to discover manifolds of arbitrary
    dimensionality
  • Performs well when the data comes from a single,
    well-sampled cluster
  • Few free parameters
  • Good theoretical basis for its metric-preserving
    properties

17
Problems with Isomap
  • Embeddings are biased to preserve the separation
    of faraway points, which can lead to distortion
    of local geometry
  • Fails to produce a good projection when the data
    is spread among multiple clusters
  • Well-conditioned algorithm but computationally
    expensive for large datasets

18
Improvements to Isomap
  • Conformal Isomap capable of learning the
    structure of certain curved manifolds
  • Landmark Isomap approximates large global
    computations by a much smaller set of
    calculations
  • Reconstruct distances using k/2 closest objects,
    as well as k/2 farthest objects

19
Locally Linear Embedding (LLE)
  • Isomap attempts to preserve geometry on all
    scales, mapping nearby points close and distant
    points far away from each other
  • LLE attempts to preserve local geometry of the
    data by mapping nearby points on the manifold to
    nearby points in the low dimensional space
  • Advantages: computational efficiency and
    representational capacity

20
LLE general idea
  • Locally, on a fine enough scale, everything looks
    linear
  • Represent object as linear combination of its
    neighbors
  • Representation is invariant to affine
    transformations of the neighborhood
  • Assumption: the same linear representation will
    hold in the low dimensional space

21
LLE matrix representation
  • X ≈ WX, where
  • X is the n x p matrix of original data (one
    object per row)
  • W is the n x n matrix of weights, and
  • Wij = 0 if Xj is not a neighbor of Xi
  • rows of W sum to one
  • Then solve the system Y ≈ WY, where
  • Y is the n x q matrix of underlying low
    dimensional data
  • Minimize the reconstruction error (spelled out
    below)
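
The error left implicit on this slide is the quadratic reconstruction
cost of Roweis and Saul (2000), minimized twice, first over W and
then over Y:

```latex
\varepsilon(W) = \sum_i \Big\| x_i - \sum_j W_{ij}\, x_j \Big\|^2
\qquad \text{(over } W \text{, with } X \text{ fixed)}

\Phi(Y) = \sum_i \Big\| y_i - \sum_j W_{ij}\, y_j \Big\|^2
\qquad \text{(over } Y \text{, with } W \text{ fixed)}
```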

22
LLE - algorithm
  • Find the k nearest neighbors of each point in X
    space
  • Solve for the reconstruction weights W
  • Compute embedding coordinates Y using the weights
    W
  • create the sparse matrix M = (I - W)^T (I - W)
  • compute the bottom q+1 eigenvectors of M
  • discard the bottom (constant) eigenvector and set
    the i-th coordinate of Y to the (i+1)-th smallest
    eigenvector (see the sketch below)
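
A minimal dense sketch of these steps, assuming a NumPy array X of
shape (n, p) with no duplicate points (duplicates make the local
covariance singular); reg and the other names are illustrative:

```python
import numpy as np

def lle(X, k=10, q=2, reg=1e-3):
    """Embed the (n, p) array X into q dimensions via LLE."""
    n = X.shape[0]
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    nbrs = np.argsort(D, axis=1)[:, 1:k + 1]      # step 1: k-NN, skipping self

    # Step 2: reconstruction weights, one small least-squares solve per point.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[nbrs[i]] - X[i]                     # neighbors centered on x_i
        C = Z @ Z.T                               # local k x k covariance
        C += np.eye(k) * reg * np.trace(C)        # regularization (next slide)
        w = np.linalg.solve(C, np.ones(k))
        W[i, nbrs[i]] = w / w.sum()               # rows sum to one

    # Step 3: bottom eigenvectors of M = (I - W)^T (I - W).
    M = (np.eye(n) - W).T @ (np.eye(n) - W)
    vals, vecs = np.linalg.eigh(M)                # ascending eigenvalues
    return vecs[:, 1:q + 1]                       # drop the constant eigenvector
```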

23
Numerical Issues
  • The covariance matrix used to compute W can be
    ill-conditioned, so regularization needs to be
    used
  • Small eigenvalues are subject to numerical
    precision errors and to getting mixed
  • But the sparse matrices used in this algorithm
    make it much faster than Isomap (see the sketch
    below)
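
Since each row of W has only k nonzero entries, M stays sparse and
its bottom eigenvectors can be found with a sparse shift-invert
solver instead of the dense decomposition used in the previous
sketch. A sketch, assuming W is stored as a scipy.sparse matrix;
shift-invert around sigma=0 is a common trick but can fail when M is
exactly singular, in which case a dense eigh is the fallback:

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

def bottom_embedding(W, q=2):
    """Embedding from a sparse (n, n) weight matrix with k nonzeros per row."""
    n = W.shape[0]
    I = sp.eye(n, format="csr")
    M = (I - W).T @ (I - W)            # stays sparse because W is sparse
    # Shift-invert around 0 targets the smallest eigenvalues.
    vals, vecs = eigsh(M.tocsc(), k=q + 1, sigma=0.0)
    order = np.argsort(vals)
    return vecs[:, order[1:q + 1]]     # drop the constant eigenvector
```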

24
LLE
25
LLE effect of neighborhood size
26
LLE with face picture
27
LLE Lips pictures
28
PCA vs. LLE
29
Problems with LLE
  • If the data is noisy, sparse, or weakly
    connected, the coupling between faraway points
    can be attenuated
  • The most common failure of LLE is mapping points
    that are far apart in the original space to
    nearby points, often arising when the manifold is
    undersampled
  • The output depends strongly on the selection of k

30
References
  • Roweis, S. T. and Saul, L. K. (2000). "Nonlinear
    dimensionality reduction by locally linear
    embedding." Science 290(5500): 2323-2326.
  • Tenenbaum, J. B., de Silva, V., et al. (2000). "A
    global geometric framework for nonlinear
    dimensionality reduction." Science 290(5500):
    2319-2323.
  • Vlachos, M., Domeniconi, C., et al. (2002).
    "Non-linear dimensionality reduction techniques
    for classification and visualization." Proc. of
    the 8th ACM SIGKDD, Edmonton, Canada.
  • de Silva, V. and Tenenbaum, J. (2003). "Local
    versus global methods for nonlinear
    dimensionality reduction." Advances in Neural
    Information Processing Systems 15.

31
Questions?