Title: Spectral LPM: An Optimal Locality-Preserving Mapping using the Spectral (not Fractal) Order
1Spectral LPM: An Optimal Locality-Preserving
Mapping using the Spectral (not Fractal) Order
- Mohamed F. Mokbel
- Walid G. Aref
- Ananth Grama
2Locality-Preserving Mapping
- Multi-dimensional data is more difficult to process than one-dimensional data.
- A mapping function f is required to map the multi-dimensional space into the one-dimensional space.
- Locality-Preservation is a desirable property for the mapping function f.
- Mapping data from the multi-dimensional space into the one-dimensional space is considered locality-preserving if the points that are nearby in the multi-dimensional space are nearby in the one-dimensional space.
3Applications of Locality-Preserving Mapping
- Range Queries, nearest-neighbor queries
- Multi-dimensional spatial join.
- Multi-dimensional indexing.
- R-Tree Packing
- Spatial Access Method.
- Declustering
- Memory Management
- GIS
- Disk Scheduling
- Traveling Salesman problem
- Parallel processing.
4Distance-Preserving or Locality-Preserving
- An optimal Distance-Preserving mapping algorithm maps the multi-dimensional space into the one-dimensional space such that the distance between each pair of points in the multi-dimensional space is preserved in the one-dimensional space.
- A point P in the D-dimensional space has 2D (that is, 2 x D) neighbors with Manhattan distance M = 1. Mapping P and its neighbors into the one-dimensional space allows only two neighbors to have M = 1. Thus, the distance to the other 2(D-1) neighbors cannot be preserved. For D = 2, P has four neighbors, of which only two can stay adjacent to P.
- An optimal Distance-Preserving mapping is therefore infeasible. What about a Locality-Preserving mapping?
5The Good Mapping
- Divide the 2D neighbors of P into two equal groups.
- Map the first group to the left of P, and the second group to the right of P.
- The same algorithm is applied to points with Manhattan distance M > 1.
- The result is an optimal Locality-Preserving mapping with respect to P.
- What about Q and R?
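The recipe above can be sketched for a small grid. This is only an illustration under stated assumptions: the function name `good_mapping_for`, the 5x5 integer grid, and the rule used to split each distance ring into two halves (sorted order, first half to the left) are hypothetical choices, not taken from the talk.

```python
def manhattan(p, q):
    """Manhattan distance between two integer points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def good_mapping_for(p, points):
    """Linear order in which points closer to p (Manhattan) sit closer to p."""
    # Group every other point by its Manhattan distance from p.
    rings = {}
    for q in points:
        if q != p:
            rings.setdefault(manhattan(p, q), []).append(q)
    left, right = [], []
    for m in sorted(rings):
        ring = sorted(rings[m])
        half = len(ring) // 2          # assumed split rule: first half goes left
        left.extend(ring[:half])
        right.extend(ring[half:])
    # Reverse the left side so that the nearest ring ends up next to p.
    return left[::-1] + [p] + right

grid = [(x, y) for x in range(5) for y in range(5)]
order = good_mapping_for((2, 2), grid)
```

The order is tailored to the single point P = (2, 2): its four neighbors occupy the four positions closest to P. A point Q near either end of the order gets no such guarantee, which is the concern raised by the last bullet.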
6The Bad Mapping (Fractals)
- Fractals divide the space into a number of fragments.
- Once a fractal starts to visit points from a certain fragment, no other fragment is visited until the current one is completely exhausted.
- Fractals perform a local optimization based on the current fragment.
- Boundary Effect problem. Two points Pi and Pj lie on the boundaries of two different fragments and |Pi - Pj| = 1. However, Pi and Pj will be very far from each other in the one-dimensional space.
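The boundary effect can be reproduced with a few lines of code. The sketch below uses a bit-interleaving Z-order curve on an 8x8 grid as the example fractal; the choice of curve, the grid size, and the function name `z_order` are assumptions made for illustration.

```python
def z_order(x, y, bits=3):
    """Interleave the bits of x and y (x in the even bit positions)."""
    code = 0
    for b in range(bits):
        code |= ((x >> b) & 1) << (2 * b)
        code |= ((y >> b) & 1) << (2 * b + 1)
    return code

# All three pairs are at Manhattan distance 1 in the 8x8 grid.
inside = abs(z_order(2, 0) - z_order(3, 0))      # same quadrant
across_x = abs(z_order(3, 0) - z_order(4, 0))    # straddles the vertical midline
across_y = abs(z_order(0, 3) - z_order(0, 4))    # straddles the horizontal midline
print(inside, across_x, across_y)                # 1 11 22
```

Neighbors inside one fragment stay adjacent, while neighbors that straddle a fragment boundary are pushed 11 or 22 positions apart, even though the curve has only 64 positions in total.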
7How bad are Fractals?
         Peano   Gray   Hilbert
P1-P2      6       5      11
P3-P4     22      47      43
- Things become even worse as the grid size increases.
8Can we avoid the Boundary Effect in Fractals?
- No, the boundary problem is an inherent property of fractals [Man77].
- Only two attempts have been made to avoid the boundary effect:
- [SZM98] uses different space-filling curves for the same data. If two points lie on the boundary of one space-filling curve, they will not be on the boundary of the other space-filling curve.
- [LLL01] uses multiple shifted copies of the Hilbert SFC. If two points are on the boundary of one copy of the Hilbert SFC, they will not be on the boundary of another copy.
- The main idea: by using more than one SFC, we can hopefully get better results.
9The Optimal Mapping (Spectral LPM)
- Unlike Fractals, Spectral LPM achieves global optimization where all multi-dimensional points are taken into account when performing the mapping.
- Spectral LPM does not favor any set of points over the others.
- Spectral LPM is proved to be globally optimal with respect to all multi-dimensional points.
10Overview of the Spectral Algorithms
- Spectral algorithms are based on spectral theory, which relates a matrix to its eigenvalues and eigenvectors.
- A milestone in Spectral algorithms is due to Fiedler [Fie75], who proposed using the eigenvalues and eigenvectors of the Laplacian matrix L(G) instead of the Adjacency matrix A(G).
- Spectral Algorithms have been widely used in:
- Graph Partitioning
- Data Clustering
- Linear labeling of a graph
- To the best of the authors' knowledge, the use of spectral mapping to support similarity search queries is a novel application.
11The Spectral LPM Algorithm
- Algorithm Spectral Locality-Preserving Mapping (Spectral LPM)
- Input: P, a set of multi-dimensional points.
- Output: S, a linear order of the set P.
- Model the set of multi-dimensional points P as a graph G(V,E) such that each point Pi ∈ P is represented by a vertex vi ∈ V, and there is an edge (vi,vj) ∈ E iff |Pi - Pj| = 1.
- Compute the graph Laplacian matrix L(G) = D(G) - A(G), where D(G) is the diagonal degree matrix and A(G) is the adjacency matrix.
- Compute the second smallest eigenvalue λ2 of L(G) and its corresponding eigenvector X2.
- For each i = 1, ..., n, assign the i-th component xi of X2 to vi, and hence to Pi.
- The linear order S of P is the order of the values assigned to the Pi's.
- Return S
- End
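The steps above translate fairly directly into NumPy. The following is a minimal sketch rather than the authors' implementation: the function name `spectral_lpm`, the dense eigensolver, and the 4x3 example grid are assumptions, and the graph is assumed to be connected (otherwise λ2 is 0 and the eigenvector carries no ordering information).

```python
import numpy as np

def spectral_lpm(points):
    """Return (order, lambda2, x2) for integer points given as equal-length tuples.

    `order` lists point indices sorted by their Fiedler-vector component.
    """
    pts = np.asarray(points, dtype=int)            # shape (n, d)
    # Pairwise Manhattan distances, shape (n, n).
    dist = np.abs(pts[:, None, :] - pts[None, :, :]).sum(axis=2)
    A = (dist == 1).astype(float)                  # edge iff |Pi - Pj| = 1
    D = np.diag(A.sum(axis=1))                     # degree matrix
    L = D - A                                      # graph Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)           # eigenvalues in ascending order
    x2 = eigvecs[:, 1]                             # eigenvector of the 2nd smallest
    order = np.argsort(x2, kind="stable").tolist()
    return order, float(eigvals[1]), x2

# Example: a 4x3 grid of points.
grid = [(x, y) for x in range(4) for y in range(3)]
order, lam2, x2 = spectral_lpm(grid)
S = [grid[i] for i in order]                       # the linear order S of P
```

The sign of an eigenvector is arbitrary, so the order may come out reversed. Points with equal components in X2 are tied, and this sketch resolves such ties by input order up to floating-point noise. On a square grid λ2 is a repeated eigenvalue, so X2 is not unique there; the example uses a non-square grid to avoid that case.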
12Example of The Spectral Mapping
13The Optimality of the Spectral Mapping
- Definition
- A vector X = (x1, x2, ..., xn) that represents the n one-dimensional values of n multi-dimensional points represented as a graph G(V,E) is considered to provide the globally optimal locality-preserving mapping from the multi-dimensional space into the one-dimensional space if X satisfies the following optimization problem
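The formula itself is not reproduced in the text here. As an assumption consistent with the Laplacian-based algorithm and the Fiedler theorem used in this talk, the problem can be written as:

```latex
\begin{aligned}
\text{minimize}\quad   & f(X) = \sum_{(v_i, v_j) \in E} (x_i - x_j)^2 \\
\text{subject to}\quad & \sum_{i=1}^{n} x_i^2 = 1, \qquad \sum_{i=1}^{n} x_i = 0
\end{aligned}
```

In this form, the first constraint rules out the trivial solution X = 0, and the second rules out mapping every point to the same value.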
14The Optimality of the Spectral Mapping
- The optimization problem in the optimality
definition is equivalent to
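The matrix form is not reproduced in the text here. A standard way to state it, given as an assumption, uses the Laplacian L = D - A, the vertex degrees d_i, and the all-ones vector e. Expanding the quadratic form shows why the sum over edges and the matrix expression agree:

```latex
\begin{aligned}
X^{T} L X &= X^{T} D X - X^{T} A X \\
          &= \sum_{i=1}^{n} d_i x_i^2 \;-\; 2 \sum_{(v_i, v_j) \in E} x_i x_j \\
          &= \sum_{(v_i, v_j) \in E} (x_i - x_j)^2 \\[4pt]
\text{minimize}\quad f(X) &= X^{T} L X
\quad \text{subject to} \quad X^{T} X = 1, \; X^{T} e = 0
\end{aligned}
```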
15The Optimality of the Spectral Mapping
- Theorem (Fiedler, 1973) [Fie73]
- The solution of the optimization problem
- is the second smallest eigenvalue λ2 and its corresponding eigenvector X2
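The theorem can be checked numerically on a small graph. The sketch below assumes the problem is to minimize X^T L X subject to X^T X = 1 and X^T e = 0, and uses a path graph on 8 vertices for brevity; the helper names are hypothetical.

```python
import numpy as np

def path_laplacian(n):
    """Laplacian L = D - A of a path graph with n vertices."""
    A = np.zeros((n, n))
    for i in range(n - 1):
        A[i, i + 1] = A[i + 1, i] = 1.0
    return np.diag(A.sum(axis=1)) - A

def objective(L, x):
    """Quadratic form x^T L x, equal to the sum of (xi - xj)^2 over edges."""
    return float(x @ L @ x)

def random_feasible(n, rng):
    """Random vector with sum(x) = 0 and sum(x^2) = 1."""
    x = rng.standard_normal(n)
    x -= x.mean()
    return x / np.linalg.norm(x)

L = path_laplacian(8)
eigvals, eigvecs = np.linalg.eigh(L)
lam2, x2 = float(eigvals[1]), eigvecs[:, 1]

rng = np.random.default_rng(0)
trials = [objective(L, random_feasible(8, rng)) for _ in range(1000)]
print(objective(L, x2) - lam2)    # close to 0: X2 attains the value lambda2
print(min(trials) >= lam2)        # no random feasible vector does better
```

Random sampling is not a proof, but it illustrates the statement: X2 is feasible, its objective value equals λ2, and no sampled feasible vector falls below that value.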
16Extensibility of the Spectral LPM
- Spectral LPM can change the way of constructing the graph G.
- Spectral LPM can incorporate any number of additional constraints.
17Extensibility of the Spectral LPM
- Spectral LPM can model the multi-dimensional points as a weighted graph. The weight w of an edge e(v1,v2) represents the priority of mapping v1 and v2 to nearby locations in the one-dimensional space. In this case the objective function will be
- The proof of optimality of the Spectral LPM is valid regardless of the graph type. Spectral LPM is optimal for the chosen graph type.
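A weighted variant can be sketched by replacing the adjacency matrix with a weight matrix W, so that the quadratic form becomes the sum of w_ij (xi - xj)^2 over all pairs. The weighted objective written this way, the function names, and the example weight (inverse Manhattan distance within a radius of 2) are assumptions for illustration; the talk does not specify a weight function.

```python
import numpy as np

def inverse_manhattan(p, q, radius=2):
    """Hypothetical weight: 1/distance for pairs within `radius`, else 0."""
    d = sum(abs(a - b) for a, b in zip(p, q))
    return 1.0 / d if d <= radius else 0.0

def weighted_spectral_lpm(points, weight):
    """Order points by the Fiedler vector of the weighted Laplacian."""
    pts = [tuple(p) for p in points]
    n = len(pts)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            W[i, j] = W[j, i] = weight(pts[i], pts[j])
    L = np.diag(W.sum(axis=1)) - W       # weighted Laplacian
    eigvals, eigvecs = np.linalg.eigh(L)
    x2 = eigvecs[:, 1]                   # eigenvector of the 2nd smallest eigenvalue
    return np.argsort(x2, kind="stable").tolist(), L

pts = [(x, y) for x in range(3) for y in range(2)]
order, L = weighted_spectral_lpm(pts, inverse_manhattan)
```

Only the construction of the matrix changes; the eigenvector step is the same as in the unweighted algorithm.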
18Experimental Results
- If the Manhattan distance between any two points Pi, Pj in the multi-dimensional space is MD, then what is the Manhattan distance M1 between the same two points in the one-dimensional space? The lower M1, the better the locality-preserving mapping.
- For any multi-dimensional range query, what is the difference between the minimum and the maximum one-dimensional values of the points that lie inside the range query? The smaller the difference, the better the locality-preserving mapping.
- The answers to these questions should have:
- Bounded worst-case
- Dimension Independence
- Location Independence
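Both questions can be turned into simple measurements over any candidate mapping. The sketch below is an assumption about how one might compute them, not the experimental code behind the results: a mapping is represented as a dictionary from points to one-dimensional ranks, and a row-major order on a 4x4 grid stands in as the mapping under test.

```python
from itertools import combinations

def manhattan(p, q):
    """Manhattan distance between two integer points."""
    return sum(abs(a - b) for a, b in zip(p, q))

def worst_neighbor_gap(rank, md=1):
    """Largest 1-D distance between two points whose Manhattan distance is md."""
    return max(abs(rank[p] - rank[q])
               for p, q in combinations(rank, 2) if manhattan(p, q) == md)

def range_query_span(rank, lo, hi):
    """Max minus min 1-D value over the points inside the box [lo, hi]."""
    inside = [r for p, r in rank.items()
              if all(l <= c <= h for l, c, h in zip(lo, p, hi))]
    return max(inside) - min(inside)

# Hypothetical mapping under test: row-major order on a 4x4 grid.
rank = {(x, y): y * 4 + x for x in range(4) for y in range(4)}
print(worst_neighbor_gap(rank))                   # 4
print(range_query_span(rank, (0, 0), (1, 1)))     # 5
```

Taking the maximum over all pairs, or over all query positions, gives the worst-case view. Repeating the measurement per dimension or per query location would probe dimension and location independence.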
19Bounded Worst-Case (Nearest-Neighbor Queries)
20Bounded Worst-Case (Range Queries)
21Dimension Independence
22Dimension Independence (Range Queries)
23Location Independence
24Summary
- A locality-preserving mapping from the multi-dimensional space into the one-dimensional space is needed for a variety of applications.
- We argue against the use of fractals as a basis for locality-preserving mapping.
- The Spectral LPM algorithm is proposed as an optimal locality-preserving mapping that depends on the spectral properties of the multi-dimensional points.
- The optimality proof of the Spectral LPM algorithm is provided.
- Unlike Fractals, the Spectral LPM algorithm can be extended in several ways.
- We provide experimental evidence that Spectral LPM is superior to the fractal-based algorithms.
25References
- [Man77] B. Mandelbrot. Fractal Geometry of Nature. W. H. Freeman, NY, 1977.
- [LLL01] S. Liao, M. Lopez, and S. Leutenegger. High Dimensional Similarity search with space-filling curves. ICDE, 2001.
- [SZM98] J. Shepherd, X. Zhu, and N. Megiddo. A fast indexing method for multidimensional nearest neighbor search. Proc. of SPIE, Storage and Retrieval for Image and Video Databases, 1998.
- [Ste73] L.A. Steen. Highlights in the history of spectral theory. American Mathematical Monthly, 80(4), 1973.
- [Fie73] M. Fiedler. Algebraic Connectivity of Graphs. Czechoslovak Mathematical Journal, 23(98), 1973.
- [Fie75] M. Fiedler. A property of eigenvectors of nonnegative symmetric matrices and its application to graph theory. Czechoslovak Mathematical Journal, 25(100), 1975.