1
Local and Global Structures Preserving Projection
  • Author Hao Cheng, Kien A Hua, and Khanh Vu
  • University of Central Florida
  • ICTAI 07

2
Overview
  • Introduction
  • Proposed Algorithm
  • Experiments
  • Conclusions

3
Introduction
  • Data usually reside in a high-dimensional space.
  • The intrinsic dimensionality of the data is much
    lower.
  • Manifold learning
  • finds a low-dimensional embedding of the raw
    data that preserves the intrinsic structures of
    the data well.
  • has recently become a popular research topic.

4
Related Work
  • Principal Component Analysis (PCA)
  • Locality Preserving Projection (LPP)
  • Many others

5
PCA
  • Principal Component Analysis (PCA)
  • PCA projects the data along the axes that
    exhibit the greatest variance.
  • PCA minimizes the distortion of all the pairwise
    distances of the data after the reduction.
  • PCA preserves the global structure of the data
    well.
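The projection described above can be sketched in a few lines of NumPy; the function name and variables are illustrative, not from the slides.

```python
import numpy as np

def pca(X, d):
    """Project the rows of X onto the d axes of greatest variance."""
    Xc = X - X.mean(axis=0)              # center the data
    cov = Xc.T @ Xc / (len(X) - 1)       # sample covariance matrix
    vals, vecs = np.linalg.eigh(cov)     # eigenvalues in ascending order
    axes = vecs[:, ::-1][:, :d]          # top-d principal axes
    return Xc @ axes                     # low-dimensional embedding
```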

6
LPP
  • Locality Preserving Projection (LPP)
  • LPP constructs a similarity matrix W.
  • If point i is among the K nearest neighbors of
    point j, then W(i,j) = W(j,i) = 1. Otherwise
    W(i,j) = 0.
  • W encodes local neighborhood information.
  • LPP finds a set of axes that minimize the
    weighted pairwise distances of the data (with
    weights given by W).
  • LPP preserves local neighborhoods well.
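The construction of W can be sketched as follows; a minimal NumPy version, where symmetrizing via the elementwise maximum is one common convention (an assumption here, not stated on the slide).

```python
import numpy as np

def knn_similarity(X, K):
    """Symmetric 0/1 similarity matrix: W[i, j] = 1 iff point i is among
    the K nearest neighbors of point j, or vice versa."""
    n = len(X)
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    W = np.zeros((n, n))
    for j in range(n):
        nbrs = np.argsort(dist[j])[1:K + 1]  # skip j itself at index 0
        W[nbrs, j] = 1
    return np.maximum(W, W.T)                # enforce W(i,j) = W(j,i)
```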

7
Nonlinear Methods
  • Both PCA and LPP are linear methods.
  • Nonlinear methods
  • ISOMAP, Locally Linear Embedding (LLE), Hessian
    LLE (HLLE), Local Tangent Space Alignment (LTSA),
    Diffusion Maps (DM).
  • Problems
  • Computationally intensive.
  • Do not scale well.
  • Performance is not very robust.

8
Motivation
  • PCA: global structure
  • LPP: local structure
  • Both global and local structures are important,
    and should be properly preserved!
  • Look at the toy examples.

9
Toy Example 1
  • Two classes of data

10
Toy Example 2
  • Two classes of data

Neither of them does well!
11
LGSPP
  • Local and Global Structures Preserving Projection
    (LGSPP)
  • Extracts local and global structures.
  • Derives an embedding that preserves these
    structures and minimizes distortion.

12
Local Structure
  • For each data point x,
  • S(x) is the set of points consisting of x itself
    and its Ks nearest neighbors (Ks is a system
    parameter).
  • S(x) is the local neighborhood around the point x.
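Building S(x) is a plain K-nearest-neighbor query; a minimal sketch, with illustrative names not taken from the slides:

```python
import numpy as np

def local_set(X, i, Ks):
    """S(x): indices of point i itself and its Ks nearest neighbors."""
    d = np.linalg.norm(X - X[i], axis=1)
    return np.argsort(d)[:Ks + 1]   # index 0 is i itself (distance 0)
```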

13
Global Structure
  • For each data point x,
  • D(x) is a set of Kd points that are far from
    point x and also far from each other (Kd is
    another system parameter).
  • For example

Blue dot: point x. Red/green dots: points in D(x).
Points in D(x) and point x come from different
dense regions.
14
Extraction Algorithm
  • Select a random sample set.
  • Pick the sample point farthest from point x;
    denote it d1.
  • Pick the sample point farthest from both x and
    d1; denote it d2.
  • Continue until Kd points have been found.
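The greedy farthest-point selection above can be sketched as follows; the sample size and all names are illustrative assumptions, and each pick maximizes its minimum distance to x and to the points chosen so far.

```python
import numpy as np

def extract_far_set(X, x, Kd, sample_size=50, seed=0):
    """Greedy farthest-point selection for D(x) from a random sample."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(X), size=min(sample_size, len(X)), replace=False)
    candidates = X[idx]
    chosen = [x]                    # distances are measured from x as well
    picks = []
    for _ in range(Kd):
        # minimum distance of each candidate to everything chosen so far
        d = np.min(
            [np.linalg.norm(candidates - c, axis=1) for c in chosen], axis=0)
        far = int(np.argmax(d))     # candidate farthest from all chosen
        picks.append(candidates[far])
        chosen.append(candidates[far])
    return np.array(picks)
```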

15
S(x) and D(x)
  • S(x): the local neighborhood of x.
  • D(x): point x and the points in D(x) are highly
    likely to come from different dense regions of
    the dataset.
  • Local and global structures
  • S(x) and D(x) for each point x.

16
Embedding
  • Goals of embedding
  • Keep the points in S(x) close to each other in
    the reduced space: minimize the pairwise
    distances within S(x).
  • Keep the points in D(x) far from those in S(x) in
    the reduced space: maximize the pairwise
    distances between S(x) and D(x).

17
Optimization
  • Find a set of projection axes p_i.
  • Equivalent to
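The slide's formulas were images and are not preserved in this transcript; a plausible form of the objective, consistent with the embedding goals stated earlier (the projection axis p and the summation structure are assumptions), is:

```latex
% Minimize distances within each S(x):
\min_{p}\ \sum_{x}\ \sum_{x_i,\,x_j \in S(x)}
    \bigl( p^{\top}x_i - p^{\top}x_j \bigr)^2
% while maximizing distances between S(x) and D(x):
\max_{p}\ \sum_{x}\ \sum_{x_i \in S(x),\; x_j \in D(x)}
    \bigl( p^{\top}x_i - p^{\top}x_j \bigr)^2
% Combined as a single ratio to maximize:
\max_{p}\
\frac{\sum_{x}\sum_{x_i \in S(x),\, x_j \in D(x)}
        (p^{\top}x_i - p^{\top}x_j)^2}
     {\sum_{x}\sum_{x_i,\,x_j \in S(x)}
        (p^{\top}x_i - p^{\top}x_j)^2}
```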

18
Rewrite
  • Equivalent to
  • Generalized Eigenvalue Problem.
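A generalized eigenvalue problem of this kind can be solved directly with SciPy; below is a minimal sketch in which the two scatter matrices A (between-structure) and B (within-structure) are random stand-ins, since the slide's own matrix definitions are not preserved in this transcript.

```python
import numpy as np
from scipy.linalg import eigh

# Illustrative symmetric stand-ins for the scatter matrices built from
# S(x) and D(x); A is positive semidefinite, B is positive definite.
rng = np.random.default_rng(1)
M = rng.normal(size=(5, 5))
A = M @ M.T                 # numerator scatter
B = M.T @ M + np.eye(5)     # denominator scatter

# Solve A p = lambda B p.  eigh returns eigenvalues in ascending order,
# so the last d columns are the axes maximizing the ratio.
vals, vecs = eigh(A, B)
P = vecs[:, -2:]            # top-2 projection axes
```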

19
Toy Examples Revisited
  • LGSPP

20
Synthetic datasets
  • 2-dimensional data.
  • A free variable ranges from -1 to 1.
  • 1st dimension
  • 2nd dimension

21
More datasets
  • LGSPP

22
Conclusions
  • LGSPP
  • Extracts local and global structures.
  • Computes a salient embedding.
  • LGSPP
  • Addresses the limitations of PCA and LPP.
  • Linear, fast, and robust.
  • Works well on both synthetic and real-world
    examples.

23
Questions?