1
The University of York
Recent Progress on Learning with Graph Representations
Edwin R. Hancock and Richard Wilson
With help from Bai Xiao, Bin Luo, Antonio Robles-Kelly and Andrea Torsello.
University of York, Computer Science Department, York YO10 5DD, UK.
erh_at_cs.york.ac.uk
2
Outline
  • Motivation and background
  • Graphs from images
  • Spectral invariants
  • Lifting cospectrality
  • Generative models and description length
  • Conclusions

3
Motivation
4
Problem
In computer vision, graph structures are used to
abstract image structure. However, the algorithms
used to segment the image primitives are not
reliable. As a result there are both additional
and missing nodes (due to segmentation error) and
variations in edge structure. Hence image
matching and recognition cannot be reduced to a
graph isomorphism or even a subgraph isomorphism
problem. Instead, inexact graph matching methods
are needed.
7
Measuring similarity of graphs
  • Early work on graph matching in vision (Barrow
    and Popplestone) introduced the association graph
    and showed how it could be used to locate the
    maximum common subgraph.
  • Work on syntactic and structural pattern
    recognition in the 1980s unearthed problems with
    inexact matching (Sanfeliu, Eshera and Fu,
    Haralick and Shapiro, Wong, etc.) and extended the
    concept of edit distance from strings to graphs.
  • Recent work has aimed to develop probability
    distributions for graph matching (Christmas,
    Kittler and Petrou, Wilson and Hancock, Serratosa
    and Sanfeliu) and to match using advanced
    optimisation methods (Simic, Gold and Rangarajan).
  • Renewed interest in placing classical methods
    such as edit distance (Bunke) and max-clique
    (Pelillo) on a more rigorous footing.

8
Viewed from the perspective of learning
This work has shown how to measure the similarity
of graphs. It can be used to locate inexact
matches when significant levels of structural
error are present. It may also provide a means by
which modes of structural variation can be
assessed.
9
Learning with graphs (circa 2000)
  • Learn class structure: assign graphs to classes.
    Needs a distance measure or a vector of graph
    characteristics. Central clustering is possible
    with characteristics but difficult when the number
    of nodes and edges varies and correspondences are
    not known. It is easier to perform pairwise
    clustering (Bunke, Buhmann).
  • Embed graphs in a low-dimensional space:
    correspondences are again needed, but spectral
    methods may offer a solution. Standard statistical
    and geometric learning methods can then be applied
    to graph-vectors.
  • Learn modes of structural variation: understand
    how edge (connectivity) structure varies for
    graphs belonging to the same class
    (Dickinson, Williams).
  • Build generative model: borrow ideas from
    graphical models (Langley, Friedman, Koller).

10
Why is structural learning difficult?
  • Graphs are not vectors: there is no natural
    ordering of nodes and edges. Correspondences must
    be used to establish order.
  • Structural variations: the numbers of nodes and
    edges are not fixed. They can vary due to
    segmentation error.
  • Not easily summarised: since graphs do not reside
    in a vector space, the mean and covariance are
    hard to characterise.

11
Structural Variations
12
Contributions
  • Permutation invariant graph characteristics from
    Laplacian spectrum (Wilson, Hancock, Luo PAMI
    2005).
  • Computation of edit distance between graphs and
    spectral clustering (Robles-Kelly, Torsello and
    Hancock IJCV 2007, Robles-Kelly and Hancock
    PAMI 2005).
  • Embedding based on properties of random walk and
    geometric characterisation of embedded nodes (Qiu
    and Hancock, PAMI 2007).
  • Spectral embedding of graphs (Luo, Wilson and
    Hancock Patt. Rec. 2004).
  • Learn generative model of tree structure using
    description length (Torsello and Hancock PAMI
    2006).

13
Spectral Methods
Use the eigenvalues and eigenvectors of the adjacency
matrix (or Laplacian matrix) - Biggs, Cvetkovic,
Fan Chung.
  • Singular value methods for exact graph matching
    and point-set alignment (Umeyama).
  • Singular value methods for point-set
    correspondence (Scott and Longuet-Higgins,
    Shapiro and Brady).
  • Use of eigenvalues for image segmentation (Shi
    and Malik) and for perceptual grouping (Freeman
    and Perona, Sarkar and Boyer).
  • Graph-spectral methods for indexing shock trees
    (Dickinson and Shokoufandeh).

14
Graph (structural) representations of shape
  • Region adjacency graphs (Popplestone, Worthington,
    Pizlo, Rosenfeld, etc.)
  • View graphs (Freeman, Ponce)
  • Aspect graphs (Dickinson)
  • Trees (Forsyth, Geiger).
  • Shock graphs (Siddiqi, Zucker, Kimia).

The idea is to segment shape primitives from image
data and abstract them using a graph. Shape
recognition then becomes a problem of graph matching.
However, statistical learning of modes of shape
variation is difficult because the available
methodology is limited.
15
Delaunay Graph
16
MOVI Sequence
17
Shock graphs
Type 1 shock (monotonically increasing radius)
Type 2 shock (minimum radius)
Type 3 shock (constant radius)
Type 4 shock (maximum radius)
18
Graph characteristics
  • The Laplacian spectrum provides a natural
    permutation invariant for a graph, but discards
    information in the eigensystem.
  • Symmetric polynomials over the spectral matrix
    give a rich family of invariants.
  • Can be extended to attributed graphs using
    complex number encoding and Hermitian extension
    of Laplacian.
  • Recent work has shown how invariants are linked
    to moments from Mellin transform of heat kernel.

19
Pairwise clustering
  • Compute tree/graph similarity using edit
    distance.
  • Simplifying structure can simplify process
    (convert graph to string).
  • Extract pairwise clusters using the EM algorithm and
    eigenvectors of an affinity matrix between graphs.
  • Applied to learn shape classes.

20
Embeddings
  • Embed nodes of a graph into a vector space so as
    to preserve node affinity properties of graph.
  • Examples include Laplacian eigenmap, diffusion
    map.
  • We have shown how commute time leads to an embedding
    that is robust to modifications in edge structure
    (a sketch follows below).
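A minimal numpy sketch of a commute-time style embedding (the function name and example graph are illustrative assumptions, not taken from the slides): node coordinates are read off the pseudo-inverse of the Laplacian, so that squared Euclidean distances between embedded nodes equal commute times.

import numpy as np

def commute_time_embedding(A):
    """Embed nodes so that squared embedding distances equal commute times."""
    d = A.sum(axis=1)
    vol = d.sum()                          # graph volume (sum of degrees)
    L = np.diag(d) - A
    evals, evecs = np.linalg.eigh(L)
    nz = ~np.isclose(evals, 0.0)           # drop the zero eigenvalue(s)
    # Coordinates: sqrt(vol) * Phi * Lambda^{-1/2}, restricted to nonzero eigenvalues.
    return np.sqrt(vol) * (evecs[:, nz] / np.sqrt(evals[nz]))

A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Y = commute_time_embedding(A)              # row u = embedded position of node u
print(np.sum((Y[0] - Y[3]) ** 2))          # commute time between nodes 0 and 3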

21
Generative model
  • In the structural domain the model can be learned
    using the EM algorithm, fitting a mixture over
    classes to a sample of trees.
  • Each class is characterised by a prototype from
    which the trees belonging to the class can be
    obtained through tree edit operations.
  • Prototypes formed by merging trees. Merging
    criterion is description length.
  • Edit distance between trees is linked to
    description length advantage, and entropy
    associated with ML node probabilities.

22
Spectral Generative Model
  • Embed nodes of graph in vector space using
    heat-kernel.
  • Align embedded node positions using Procrustes
    alignment.
  • Compute covariance matrix for node positions.
  • Deform node positions in the directions of the
    eigenvectors of the covariance matrix (a sketch of
    these steps follows below).
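A heavily hedged sketch of these four steps, assuming graphs of equal size with known node correspondences, and assuming a heat-kernel embedding Y = exp(-t*Lambda/2) * Phi^T (so that Y^T Y equals the heat kernel); all function names are illustrative.

import numpy as np
from scipy.linalg import orthogonal_procrustes

def heat_kernel_embedding(A, t=1.0):
    """Node coordinates Y with Y.T @ Y equal to the heat kernel exp(-t L)."""
    L = np.diag(A.sum(axis=1)) - A
    evals, evecs = np.linalg.eigh(L)
    return np.diag(np.exp(-0.5 * t * evals)) @ evecs.T   # columns = node positions

def spectral_generative_model(adjacency_list, t=1.0):
    """Align the embeddings, then model variation via the covariance of node
    positions.  Assumes equal node counts and known node correspondences."""
    Ys = [heat_kernel_embedding(A, t) for A in adjacency_list]
    ref = Ys[0]
    aligned = []
    for Y in Ys:
        R, _ = orthogonal_procrustes(Y.T, ref.T)   # rotate this embedding onto the reference
        aligned.append((Y.T @ R).ravel())
    X = np.vstack(aligned)
    mean = X.mean(axis=0)
    cov = np.cov(X, rowvar=False)
    # Modes of structural variation: deform node positions along the leading
    # eigenvectors of the covariance matrix.
    eigvals, modes = np.linalg.eigh(cov)
    return mean, eigvals[::-1], modes[:, ::-1]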

23
Algebraic graph theory (PAMI 2005)
  • Use symmetric polynomials to construct
    permutation invariants from spectral matrix

24
...joint work with Richard Wilson
25
Spectral Representation
  • Compute the Laplacian matrix L = D - A, where A is
    the adjacency matrix and D is the diagonal matrix
    of node degrees.
  • Perform the spectral decomposition of the Laplacian
    matrix.
  • Construct the spectral matrix (a sketch follows
    below).
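A minimal numpy sketch of this construction, assuming the spectral matrix is formed by scaling each Laplacian eigenvector by the square root of its eigenvalue (the function name and example graph are illustrative):

import numpy as np

def spectral_matrix(A):
    """Spectral matrix of a graph from its adjacency matrix A."""
    D = np.diag(A.sum(axis=1))            # degree matrix
    L = D - A                             # Laplacian L = D - A
    evals, evecs = np.linalg.eigh(L)      # spectral decomposition (L is symmetric)
    # Scale each eigenvector by the square root of its (non-negative) eigenvalue.
    return evecs @ np.diag(np.sqrt(np.clip(evals, 0.0, None)))

# Example: a 4-node path graph.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Phi = spectral_matrix(A)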

26
Properties of the Laplacian
  • The eigenvalues are non-negative and the smallest
    eigenvalue is zero.
  • The multiplicity of the zero eigenvalue is the
    number of connected components of the graph.
  • The zero eigenvalue is associated with the
    all-ones vector.
  • The eigenvector associated with the second smallest
    eigenvalue is the Fiedler vector.
  • The Fiedler vector can be used to cluster the nodes
    of the graph by recursive bisection (see the
    demonstration below).
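A small numpy demonstration of these properties on a hypothetical example graph (two triangles joined by a bridge); the graph and tolerances are illustrative.

import numpy as np

def laplacian(A):
    return np.diag(A.sum(axis=1)) - A

# Two triangles (0-1-2 and 3-4-5) joined by a single bridge edge 2-3.
A = np.zeros((6, 6))
for u, v in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[u, v] = A[v, u] = 1.0

evals, evecs = np.linalg.eigh(laplacian(A))
print(np.sum(np.isclose(evals, 0)))   # 1 zero eigenvalue: one connected component
fiedler = evecs[:, 1]                 # eigenvector of the second smallest eigenvalue
print(fiedler > 0)                    # its sign bisects the graph into the two triangles

# Deleting the bridge disconnects the graph, and the zero eigenvalue is repeated.
A[2, 3] = A[3, 2] = 0.0
evals2, _ = np.linalg.eigh(laplacian(A))
print(np.sum(np.isclose(evals2, 0)))  # 2 connected components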

27
Eigenvalue spectrum
The vector of ordered eigenvalues is a permutation
invariant.
28
Eigenvalues are invariant to permutations of the
Laplacian.
  • ...we would like to construct a family of permutation
    invariants from the full spectral matrix.

29
Why?
  • According to perturbation analysis eigenvalues
    are relatively stable to noise.
  • Eigenvectors are not stable to noise and undergo
    large rotations for small additions of noise.

30
Symmetric polynomials
31
Power symmetric polynomials
32
Symmetric polynomials on spectral matrix
  • Symmetric polynomials and power symmetric
    polynomials are related by the Newton-Girard
    formula (see the example below).
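A short Python example of the Newton-Girard recurrence, recovering the elementary symmetric polynomials e_1..e_n from the power sums p_r = sum_i x_i^r (function names are illustrative):

import numpy as np

def power_sums(x, n):
    """Power symmetric polynomials p_r = sum_i x_i**r for r = 1..n."""
    x = np.asarray(x, dtype=float)
    return [np.sum(x ** r) for r in range(1, n + 1)]

def elementary_from_power(p):
    """Elementary symmetric polynomials e_1..e_n via Newton-Girard:
       r * e_r = sum_{k=1..r} (-1)**(k-1) * e_{r-k} * p_k, with e_0 = 1."""
    e = [1.0]
    for r in range(1, len(p) + 1):
        e.append(sum((-1) ** (k - 1) * e[r - k] * p[k - 1] for k in range(1, r + 1)) / r)
    return e[1:]

x = [1.0, 2.0, 3.0]
print(elementary_from_power(power_sums(x, 3)))   # [6.0, 11.0, 6.0] = e1, e2, e3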

33
Spectral Feature Vector
  • Construct a matrix F of permutation invariants by
    applying the symmetric polynomials to the elements
    in the columns of the spectral matrix. Use an
    entropy measure to flatten the distribution of
    values.
  • Stack the columns of F to form a long-vector B.
  • A set of graphs is then represented by a
    data-matrix with one such long-vector per graph
    (a sketch follows below).
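A numpy sketch of the feature vector, with sign(x) * log(1 + |x|) standing in for the entropy-style flattening mentioned above (that choice, and the function names, are assumptions of this sketch rather than the authors' exact formulation):

import numpy as np

def elementary_sym(values, n_poly):
    """Elementary symmetric polynomials e_1..e_{n_poly} via Newton-Girard."""
    p = [np.sum(np.asarray(values, dtype=float) ** r) for r in range(1, n_poly + 1)]
    e = [1.0]
    for r in range(1, n_poly + 1):
        e.append(sum((-1) ** (k - 1) * e[r - k] * p[k - 1] for k in range(1, r + 1)) / r)
    return np.array(e[1:])

def spectral_feature_vector(A, n_poly=4):
    """Permutation-invariant long-vector B for a graph with adjacency matrix A."""
    L = np.diag(A.sum(axis=1)) - A
    evals, evecs = np.linalg.eigh(L)
    Phi = evecs @ np.diag(np.sqrt(np.clip(evals, 0.0, None)))    # spectral matrix
    F = np.column_stack([elementary_sym(Phi[:, j], n_poly)       # invariants per column
                         for j in range(A.shape[0])])
    F = np.sign(F) * np.log1p(np.abs(F))                         # flatten the distribution
    return F.ravel(order="F")                                    # stack columns into B

# A data-matrix for a set of graphs stacks one such long-vector per row.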

34
...extend to weighted attributed graphs.
35
Complex Representation
  • Encode attributes as complex numbers.
  • Off-diagonal elements: edge weights (W) as the
    modulus and normalised attributes as the phase (y).
  • Diagonal elements encode the node attributes (x)
    and ensure that H is positive semi-definite
    (a sketch follows below).
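A hedged numpy sketch of one such complex encoding. The off-diagonal form W * exp(i*y) with antisymmetric phases makes H Hermitian; padding the diagonal with the weighted degree plus a non-negative node attribute is one simple way of keeping H positive semi-definite, and is an assumption of this sketch rather than the authors' exact construction.

import numpy as np

def hermitian_property_matrix(W, phase, node_attr):
    """Complex (Hermitian) encoding of an attributed graph (sketch).
    W:         symmetric edge-weight matrix (modulus of off-diagonal entries)
    phase:     antisymmetric matrix of normalised edge attributes (the phase)
    node_attr: non-negative node attributes placed on the diagonal."""
    H = W * np.exp(1j * phase)                        # off-diagonal: modulus and phase
    np.fill_diagonal(H, node_attr + W.sum(axis=1))    # diagonal dominance keeps H PSD
    return H

W = np.array([[0.0, 1.0, 0.5],
              [1.0, 0.0, 0.0],
              [0.5, 0.0, 0.0]])
phase = np.array([[0.0, 0.3, -0.7],
                  [-0.3, 0.0, 0.0],
                  [0.7, 0.0, 0.0]])
H = hermitian_property_matrix(W, phase, np.array([0.2, 0.1, 0.4]))
evals, evecs = np.linalg.eigh(H)     # real eigenvalues, complex eigenvectors
print(np.allclose(H, H.conj().T), evals)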

36
Spectral analysis
  • Perform spectral analysis on H. Real eigenvalues
    and complex eigenvectors
  • Construct spectral matrix of scaled complex
    eigenvectors
  • Complex Laplacian

37
Pattern Spaces
  • PCA: project the long-vectors onto the leading
    eigenvectors of the covariance matrix.
  • MDS: embed the graphs in a low-dimensional space
    spanned by the eigenvectors of the distance matrix.
  • LLP: locally linear projection (Niyogi); perform an
    eigenvector analysis on a weighted covariance
    matrix, a PCA/MDS hybrid (sketches of the first two
    follow below).
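Minimal numpy sketches of the first two projections (PCA on the long-vectors and classical MDS on a pairwise distance matrix between graphs); the LLP hybrid is omitted and the function names are illustrative.

import numpy as np

def pca_embed(X, k=2):
    """Project rows of X (one long-vector per graph) onto the k leading
    eigenvectors of the covariance matrix."""
    Xc = X - X.mean(axis=0)
    evals, evecs = np.linalg.eigh(np.cov(Xc, rowvar=False))
    return Xc @ evecs[:, ::-1][:, :k]

def mds_embed(D, k=2):
    """Classical MDS: double-centre the squared distances and take the
    k leading eigenvectors, scaled by the square roots of the eigenvalues."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n
    B = -0.5 * J @ (D ** 2) @ J
    evals, evecs = np.linalg.eigh(B)
    idx = np.argsort(evals)[::-1][:k]
    return evecs[:, idx] * np.sqrt(np.clip(evals[idx], 0.0, None))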

38
Manifold learning methods
  • ISOMAP: construct a neighbourhood graph and pairwise
    geodesic distances between data-points; obtain a
    low-distortion embedding by applying MDS to the
    weighted graph (Tenenbaum).
  • Locally linear embedding: apply a variant of PCA to
    the data (Roweis and Saul).
  • Locally linear projection: use interpoint distances
    to compute a weighted covariance matrix, and apply
    PCA (He and Niyogi).

39
Separation under structural error
Mahalanobis distance between feature vectors for
noise corrupted graph and remaining graphs
Distance between graph and edge-edited variants
Distance between graph and random graphs of same
size and edge density
40
Variation under structural error (MDS)
MDS applied to Mahalanobis distances between
feature vectors.
41
CMU Sequence
42
MOVI Sequence
43
YORK Sequence
44
Visualisation (LLP, Laplacian polynomials)
45
Cospectrality problem for trees
  • Classical random walks are determined by the
    spectrum of the Laplacian matrix, which gives the
    path-length distribution, hitting and commute times.
  • Non-isomorphic graphs can have the same spectra
    (co-spectrality). This problem is severe for
    trees.
  • Turn to quantum walks to overcome this problem
    and develop new algorithms for graph analysis
    based on random walks.

46
Cospectral trees
  • Nearly every tree has an (adjacency, Laplacian, ...)
    cospectral partner.
  • Such trees can be easily generated.
  • The spectrum of S(U^3) distinguished all such
    trees it was tested on.

pairs of cospectral trees
47
Overcome using quantum random walk
  • The unitary operator governing the evolution of
    the walk can be written in matrix form as
  • where the basis states are the set of all
    ordered pairs (i,j) such that
  • Eigenvalues of U are

48
The positive support of a matrix
  • For a real-valued matrix M, define its positive
    support S(M) by
  • S(U^r)_ij is non-zero if and only if the sum of
    all the paths of length r from state i to state j
    is positive.
  • Interference effects in the quantum walk ensure
    that S(U^r)_ij gives useful information about the
    graph when classical analogues do not (a sketch
    follows below).
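A numpy sketch of the pipeline on these slides, assuming a Grover-coined discrete-time walk on the ordered arcs of the graph with entries U[(j,k),(i,j)] = 2/d_j - delta_{ik}; that explicit form, and the function names, are assumptions of this sketch.

import numpy as np
from itertools import product

def quantum_walk_unitary(A):
    """Discrete-time quantum walk unitary on the ordered arcs (i, j) of a graph."""
    n = A.shape[0]
    arcs = [(i, j) for i, j in product(range(n), range(n)) if A[i, j] > 0]
    index = {a: t for t, a in enumerate(arcs)}
    d = A.sum(axis=1)
    U = np.zeros((len(arcs), len(arcs)))
    for (i, j) in arcs:                       # incoming arc (i, j)
        for k in range(n):
            if A[j, k] > 0:                   # outgoing arc (j, k)
                U[index[(j, k)], index[(i, j)]] = 2.0 / d[j] - (1.0 if k == i else 0.0)
    return U

def positive_support(M, tol=1e-12):
    """S(M): 1 where the entry is strictly positive, 0 elsewhere."""
    return (M > tol).astype(float)

def s_u3_spectrum(A):
    """Sorted eigenvalues of S(U^3), used as a graph signature."""
    U = quantum_walk_unitary(A)
    return np.sort(np.linalg.eigvals(positive_support(np.linalg.matrix_power(U, 3))))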

49
Cospectral Trees
The spectrum of the positive support of U^3 is not
determined by the spectrum of L, and so lifts the
cospectrality problem.
50
Strongly regular graphs
  • There is no method proven to be able to decide
    whether two SRGs are isomorphic in polynomial
    time.
  • There are large families of strongly regular
    graphs that we can test the method on.

MDS embeddings of the SRGs with parameters
(25,12,5,6) - red, (26,10,3,4) - blue,
(29,14,6,7) - black, (40,12,2,4) - green, using the
adjacency spectrum (top) and the spectrum of
S(U^3) (bottom).
51
Generative Tree Union Model
  • Probability distribution over the union tree

52
...work with Andrea Torsello
53
Ingredients
  • Set of tree unions
  • Set of node observation probabilities for each
    node (probability of observing ith
    node of union c).
  • Set of node correspondences

54
Illustration
55
Cluster structure
  • Cluster indicator
  • Number of trees assigned to cluster c
  • Number of nodes in union c

56
Model
  • Describe data using a mixture of tree unions
  • Where N is the node-set and O is the order
    relation of the tree union and is the set
    of node probabilities.

57
Union as tree distribution
  • For each node in the union we know how often it
    is encountered in the sampled trees.
  • We can generate new trees by sampling with the
    node probability equal to the normalised sample
    frequency.
  • The union represents a generative model for a
    distribution of trees (a toy sketch follows below).
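A toy Python sketch of sampling a tree from such a union. It assumes each union node is observed independently with its normalised sample frequency and that a node is kept only when its parent is kept (so the sample is a subtree of the union); this independence assumption and all names are illustrative, not necessarily the authors' exact scheme.

import random

def depth(i, parent):
    """Depth of node i in the union, following parent links to the root."""
    d = 0
    while parent[i] is not None:
        i, d = parent[i], d + 1
    return d

def sample_tree(parent, theta, rng=random):
    """Sample one tree from a tree-union generative model (toy sketch).
    parent[i] is the parent of union node i (None for the root) and theta[i]
    is its observation probability (the normalised sample frequency)."""
    kept = set()
    for i in sorted(theta, key=lambda i: depth(i, parent)):  # parents before children
        if (parent[i] is None or parent[i] in kept) and rng.random() < theta[i]:
            kept.add(i)
    return {i: parent[i] for i in kept}

# Hypothetical union: root 0 with children 1 and 2; node 3 is a child of node 2.
parent = {0: None, 1: 0, 2: 0, 3: 2}
theta = {0: 1.0, 1: 0.8, 2: 0.5, 3: 0.9}
print(sample_tree(parent, theta))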

58
Generative Model
  • The aim is to make a maximum likelihood estimate of
    the model.
  • Problem: we do not know how sample-nodes map to
    model-nodes.
  • Let the node observation probability depend on the
    correspondence map M (determined later).

59
Max-likelihood parameters
  • Log-likelihood
  • Given M, L is maximized by any T consistent with
    the hierarchies and by

60
Description length
  • Model coding cost of encoding k-dimensional
    parameterisation of an m-dimensional
    sample-vector is

Expected value of the data log-likelihood given the
best-fit model parameters
Cost of coding the model (parameters + structure)
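For reference, the standard asymptotic two-part code length that these two captions describe can be written as below; this is a hedged reconstruction, since the slide's own formula is not reproduced in this transcript.

\mathcal{L} \;\approx\; -\,\mathbb{E}\!\left[\log P(D \mid \hat{\theta})\right] \;+\; \frac{k}{2}\log m

The first term is the expected data log-likelihood at the best-fit parameters; the second is the cost of coding a k-parameter model for an m-dimensional sample-vector.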
61
The expectation of the observation density depends on
the node entropy.
62
Tree Union
  • Cost of describing tree union

Negative likelihood of data given model
Cost of encoding node probabilities
Cost of encoding mixture
Cost of encoding tree structure
63
Simplified Description Cost
  • Cost of describing tree union

64
Description Length Gain
  • Which nodes should be merged?
  • The description advantage obtained by merging
    nodes v and v'
  • The set of merges M that minimizes the description
    length maximizes
  • Edit distance linked to node entropy

65
Unattributed
Pairwise clustering of tree edit distance.
Mixture of tree unions
66
Future
  • Links between spectral geometry and
    graph-spectra.
  • MDL in spectral domain.