Graph Embedding and Extensions: A General Framework for Dimensionality Reduction
1
Graph Embedding and Extensions: A General
Framework for Dimensionality Reduction
  • IEEE TRANSACTIONS ON PATTERN ANALYSIS AND
    MACHINE INTELLIGENCE
  • Shuicheng Yan, Dong Xu, Benyu Zhang, Hong-Jiang
    Zhang, Qiang Yang, Stephen Lin
  • Presented by meconin

2
Outline
  • Introduction
  • Graph Embedding (GE)
  • Marginal Fisher Analysis (MFA)
  • Experiments
  • Conclusion and Future Work

3
Introduction
  • Dimensionality Reduction
  • Linear methods
  • PCA and LDA are the two most popular, owing to
    their simplicity and effectiveness
  • LPP preserves local relationships in the data
    set and uncovers its essential manifold structure

4
Introduction
  • Dimensionality Reduction
  • For nonlinear methods, ISOMAP, LLE, and Laplacian
    Eigenmap are three recently developed algorithms
  • Kernel trick
  • turns linear methods into nonlinear ones
  • performs linear operations in a higher- or even
    infinite-dimensional space induced by a kernel
    mapping function

5
Introduction
  • Dimensionality Reduction
  • Tensor based algorithms
  • 2DPCA, 2DLDA, DATER

6
Introduction
  • Graph Embedding is a general framework for
    dimensionality reduction
  • With its linearization, kernelization, and
    tensorization, it provides a unified view for
    understanding DR algorithms
  • The above-mentioned algorithms can all be
    reformulated within it

7
Introduction
  • This paper shows that GE can be used as a platform
    for developing new DR algorithms
  • Marginal Fisher Analysis (MFA)
  • overcomes the limitations of LDA

8
Introduction
  • LDA (Linear Discriminant Analysis)
  • Finds the linear combination of features that best
    separates classes of objects
  • The number of available projection directions is
    lower than the number of classes
  • Based on interclass and intraclass scatters;
    optimal only when the data of each class are
    approximately Gaussian distributed

9
Introduction
  • MFA advantages (compared with LDA)
  • The number of available projection directions is
    much larger
  • No assumption on the data distribution, more
    general for discriminant analysis
  • The interclass margin can better characterize the
    separability of different classes

10
Graph Embedding
  • For a classification problem, the sample set is
    represented as a matrix X = [x_1, x_2, ..., x_N],
    x_i ∈ R^m
  • In practice, the feature dimension m is often
    very high, so it is necessary to transform the
    data to a low-dimensional representation:
    y_i = F(x_i), for all i

11
Graph Embedding
12
Graph Embedding
  • Despite the different motivations of DR
    algorithms, their objectives are similar: to
    derive a lower-dimensional representation
  • Can we reformulate them within a unifying
    framework? And does such a framework assist in
    designing new algorithms?

13
Graph Embedding
  • Graph embedding gives a possible answer
  • Represent each vertex of a graph as a
    low-dimensional vector that preserves
    similarities between the vertex pairs
  • The similarity matrix of the graph characterizes
    certain statistical or geometric properties of
    the data set

14
Graph Embedding
  • Let G = {X, W} be an undirected weighted graph
    with vertex set X and similarity matrix W ∈ R^{N×N}
  • The diagonal matrix D and the Laplacian matrix L
    of the graph G are defined as
    L = D - W,  D_ii = Σ_{j≠i} W_ij, ∀i
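
A minimal NumPy sketch of these definitions (names are illustrative, not from the paper):

    import numpy as np

    def graph_laplacian(W):
        # W: symmetric N x N similarity matrix (zero diagonal),
        # D: diagonal matrix with D_ii = sum_j W_ij
        D = np.diag(W.sum(axis=1))
        return D - W  # L = D - W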

15
Graph Embedding
  • Graph embedding of G is an algorithm that finds
    low-dimensional vector representations preserving
    the relationships among the vertices of G
  • B is the constraint matrix and d is a constant;
    the constraint avoids a trivial solution (see the
    objective written out below)
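
Written out, the direct graph-embedding objective from the paper is

    \mathbf{y}^* = \arg\min_{\mathbf{y}^\top B \mathbf{y} = d}
        \sum_{i \neq j} \| y_i - y_j \|^2 W_{ij}
      = \arg\min_{\mathbf{y}^\top B \mathbf{y} = d}
        \mathbf{y}^\top L \mathbf{y}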

16
Graph Embedding
  • The larger the similarity between samples x_i and
    x_j, the smaller the distance between y_i and y_j
    should be in order to minimize the objective
    function
  • To offer mappings for data points throughout the
    entire feature space, three extensions are used:
  • Linearization, Kernelization, Tensorization

17
Graph Embedding
  • Linearization: assuming y = X^T w
  • Kernelization: φ: x → F, assuming the projection
    lies in the span of the mapped samples,
    w = Σ_i α_i φ(x_i)
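
Substituting these assumptions into the objective gives the linearized and kernelized problems (a reconstruction consistent with the paper's formulation; K is the kernel Gram matrix):

    w^* = \arg\min_{w^\top X B X^\top w = d} \; w^\top X L X^\top w

    \alpha^* = \arg\min_{\alpha^\top K B K \alpha = d} \;
        \alpha^\top K L K \alpha,
    \qquad K_{ij} = \phi(x_i)^\top \phi(x_j)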

18
Graph Embedding
  • The solutions are obtained by solving the
    generalized eigenvalue decomposition problem
  • F. Chung, Spectral Graph Theory, Regional Conf.
    Series in Math., no. 92, 1997
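
A sketch of how the linearized problem could be solved with SciPy's generalized symmetric eigensolver (assuming X is m × N and X B X^T is positive definite; in practice a PCA step is often applied first to guarantee this):

    import numpy as np
    from scipy.linalg import eigh

    def linear_graph_embedding(X, L, B, n_dims):
        # Solve (X L X^T) w = lambda (X B X^T) w and keep the
        # eigenvectors with the smallest eigenvalues.
        A = X @ L @ X.T
        C = X @ B @ X.T
        vals, vecs = eigh(A, C)   # eigenvalues in ascending order
        return vecs[:, :n_dims]   # columns are projection vectors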

19
Graph Embedding
  • Tensor
  • the features extracted from an object may contain
    higher-order structure
  • Examples:
  • an image is a second-order tensor
  • sequential data such as a video sequence is a
    third-order tensor

20
Graph Embedding
  • Tensor
  • In an n-dimensional space there are n^r directions,
    where r is the rank (order) of the tensor
  • For tensors A, B ∈ R^{m_1×m_2×...×m_n}, the inner
    product is ⟨A, B⟩ = Σ_{i_1,...,i_n} A_{i_1...i_n} B_{i_1...i_n}

21
Graph Embedding
  • Tensor
  • For a matrix U ∈ R^{m_k × m'_k}, the mode-k
    product is B = A ×_k U
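
A sketch of both tensor operations in NumPy (the axis convention for the mode-k product is one common choice; the paper's indexing may differ):

    import numpy as np

    def tensor_inner(A, B):
        # <A, B>: sum of elementwise products over all indices
        return np.sum(A * B)

    def mode_k_product(A, U, k):
        # B = A x_k U for U of shape (p, m_k): contract A's
        # k-th axis with U's columns, then restore axis order.
        B = np.tensordot(A, U, axes=([k], [1]))  # new axis is last
        return np.moveaxis(B, -1, k)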

22
Graph Embedding
  • The objective function
  • In many cases there is no closed-form solution,
    but we can obtain a local optimum by optimizing
    one projection vector at a time while fixing the
    others

23
General Framework for DR
  • The differences among DR algorithms lie in
  • the computation of the similarity matrix of the
    graph
  • the selection of the constraint matrix

24
General Framework for DR
25
General Framework for DR
  • PCA
  • seeks the projection directions with maximal
    variance
  • equivalently, it finds and removes the projection
    direction with minimal variance
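
In graph-embedding terms (as listed in the paper), PCA corresponds to the intrinsic graph W_{ij} = 1/N for i ≠ j with constraint B = I, which recovers the usual covariance formulation:

    w^* = \arg\max_{w^\top w = 1} w^\top C w,
    \qquad C = \frac{1}{N} \sum_{i=1}^{N}
        (x_i - \bar{x})(x_i - \bar{x})^\top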

26
General Framework for DR
  • KPCA
  • applies the kernel trick on PCA, hence it is a
    kernelization of graph embedding
  • 2DPCA is a simplified second-order tensorization
    of PCA and only optimizes one projection direction

27
General Framework for DR
  • LDA
  • searches for the directions that are most
    effective for discrimination by minimizing the
    ratio between the intraclass and interclass
    scatters

28
General Framework for DR
  • LDA

29
General Framework for DR
  • LDA
  • follows the linearization of graph embedding
  • the intrinsic graph connects all pairs with the
    same class label
  • the weights are in inverse proportion to the
    sample size of the corresponding class
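
Concretely (as given in the paper), the LDA intrinsic graph has weights W_{ij} = \delta_{c_i, c_j} / n_{c_i}, where n_c is the size of class c; using the PCA graph as the penalty graph then recovers the familiar scatter-ratio criterion:

    w^* = \arg\min_{w} \frac{w^\top S_w w}{w^\top S_b w}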

30
General Framework for DR
  • The intrinsic graph of PCA is used as the penalty
    graph of LDA

(Figure: intrinsic graphs of PCA and LDA)
31
General Framework for DR
  • KDA is the kernel extension of LDA
  • 2DLDA is the second-order tensorization of LDA
  • DATER is the tensorization of LDA in arbitrary
    order

32
General Framework for DR
  • LPP
  • ISOMAP
  • LLE
  • Laplacian Eigenmap (LE)

33
Related Works
  • Kernel Interpretation
  • Ham et al.
  • KPCA, ISOMAP, LLE, LE share a common KPCA
    formulation with different kernel definitions
  • Kernel matrix vs. Laplacian matrix derived from
    the similarity matrix
  • Only unsupervised vs. more general

34
Related Works
  • Out-of-Sample Extension
  • Brand
  • mentioned the concept of graph embedding
  • Brand's work can be considered a special case of
    our graph embedding

35
Related Works
  • Laplacian Eigenmap
  • works with only a single graph, i.e., the
    intrinsic graph, and cannot be used to explain
    algorithms such as ISOMAP, LLE, and LDA
  • Some works use a Gaussian function to compute the
    nonnegative similarity matrix

36
Marginal Fisher Analysis
  • Marginal Fisher Analysis

37
Marginal Fisher Analysis
  • Intraclass compactness (intrinsic graph)
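
Reconstructed from the paper: each sample is connected to its k_1 nearest neighbors of the same class, and compactness is measured by

    \tilde{S}_c = \sum_{i} \sum_{j \in N^+_{k_1}(i)}
        \| w^\top x_i - w^\top x_j \|^2
      = 2\, w^\top X (D - W) X^\top w

where W_{ij} = 1 if x_i is among the k_1 same-class nearest neighbors of x_j or vice versa, and 0 otherwise.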

38
Marginal Fisher Analysis
  • Interclass separability (penalty graph)
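
Likewise (reconstructed from the paper), the penalty graph connects the k_2 nearest between-class sample pairs:

    \tilde{S}_p = \sum_{i} \sum_{(i,j) \in P_{k_2}(c_i)}
        \| w^\top x_i - w^\top x_j \|^2
      = 2\, w^\top X (D^p - W^p) X^\top w

The Marginal Fisher Criterion then minimizes the ratio \tilde{S}_c / \tilde{S}_p over w.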

39
The first step of MFA
40
The second step of MFA
41
Marginal Fisher Analysis
  • Intraclass compactness (intrinsic graph)

42
Marginal Fisher Analysis
  • Interclass separability (penalty graph)

43
The third step of MFA
44
The fourth step of MFA
45
LDA vs. MFA
  1. The number of available projection directions is
    much greater than that of LDA
  2. There is no assumption on the data distribution
    of each class
  3. The interclass margin in MFA can better
    characterize the separability of different
    classes than the interclass variance in LDA

46
Kernel MFA
  • The distance between two samples (see below)
  • For a new data point x, its projection onto the
    derived optimal direction (see below)
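
Both quantities follow from standard kernel-trick identities (reconstructed here; the paper's exact notation may differ slightly):

    d(x_i, x_j) = \sqrt{k(x_i, x_i) + k(x_j, x_j) - 2\, k(x_i, x_j)},
    \qquad F(x) = w^\top \phi(x) = \sum_{i=1}^{N} \alpha_i\, k(x_i, x)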

47
Tensor MFA
48
Experiments
  • Face Recognition
  • XM2VTS, CMU PIE, ORL
  • A Non-Gaussian Case

49
Experiments
  • XM2VTS, PIE-1, PIE-2, ORL

50-54
Experiments
  (Result figures and tables for the face databases)