Title: NHDC and PHDC: Local and Global Heat Diffusion Based Classifiers
1NHDC and PHDC: Local and Global Heat Diffusion Based Classifiers
- Haixuan Yang
- Group Meeting
- Sep 26, 2005
2Outline
- Introduction
- Graph Heat Diffusion Model
- NHDC and PHDC algorithms
- Connections with other models
- Experiments
- Conclusions and future work
3Introduction
- Kondor & Lafferty (NIPS 2002)
  - Construct a diffusion kernel on a graph
  - Handle discrete attributes
  - Apply to a large margin classifier
  - Achieve good performance in accuracy on 5 data sets from UCI
- Lafferty & Lebanon (JMLR 2005)
  - Construct a diffusion kernel on a special manifold
  - Handle continuous attributes
  - Restrict to text classification
  - Apply to SVM
  - Achieve good performance in accuracy on WebKB and Reuters
- Belkin & Niyogi (Neural Computation 2003)
  - Reduce dimension by heat kernel and local distance
- Tenenbaum et al. (Science 2000)
  - Reduce dimension by local distance
4Introduction
- We inherit the ideas
  - Local information is relatively accurate in a nonlinear manifold.
  - The way heat diffuses on a manifold is related to the density of the data on the manifold: the point where heat diffuses rapidly is one that has high density.
  - For example, in the ideal case when the manifold is the Euclidean space, heat diffuses in the same way as the Gaussian density.
  - The way heat diffuses on a manifold can be understood as a generalization of the Gaussian density from Euclidean space to a manifold.
  - Learn local information by K nearest neighbors.
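For reference, the Euclidean case mentioned on this slide can be written out. The block below is the standard heat kernel on R^m for the heat equation du/dt = Δu; the symbols m, x, y and t are notation assumed here, not taken from the slides. As a function of y it is a Gaussian density centered at x with variance 2t in each coordinate.

```latex
K_t(x, y) = \frac{1}{(4\pi t)^{m/2}}
            \exp\!\left(-\frac{\|x - y\|^2}{4t}\right),
\qquad x, y \in \mathbb{R}^m,\ t > 0.
```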
5Introduction
- We think differently
  - The manifold is unknown in most cases.
  - Even for a known manifold, the solution is usually unknown.
  - The explicit form of the approximation to the solution in (Lafferty & Lebanon, JMLR 2005) is a rare case.
  - Establish the heat diffusion equation directly on a graph that is formed by K nearest neighbors.
  - The solution on the graph always has an explicit form.
  - Form a classifier directly from the solution.
6Illustration
The first heat diffusion
The second heat diffusion
7Illustration
8Illustration
9Illustration
- Heat received from class A: 0.018; heat received from class B: 0.016
- Heat received from class A: 0.002; heat received from class B: 0.08
- Figure label: SVM
10Graph Heat Diffusion Model
- Given a directed weighted graph G = (V, E, W), where
  - V = {1, 2, ..., n},
  - E = {(i, j) : there is an edge from i to j},
  - W = (w(i, j)) is the weight matrix.
- The edge (i, j) is imagined as a pipe that connects i and j; w(i, j) is the pipe length.
- Let f(i, t) be the heat at node i at time t.
- At time t, i receives M(i, j, t, dt) amount of heat from its neighbor j during a period of dt.
11Graph Heat Diffusion Model
- Suppose that M(i, j, t, dt) is proportional to the time period dt.
- Suppose that M(i, j, t, dt) is proportional to the heat difference f(j, t) - f(i, t).
- Moreover, the heat flows from j to i through the pipe, and therefore the heat diffuses in the pipe in the same way as it does in the Euclidean space, as described before.
12Graph Heat Diffusion Model
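- The heat difference between f(i, t+dt) and f(i, t) can be expressed as the sum of the heat M(i, j, t, dt) that i receives from its neighbors j.
- It can be expressed in matrix form.
- Letting dt tend to zero turns the matrix form into a differential equation.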
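The three steps on this slide can be sketched as follows. This is a reconstruction under stated assumptions, not the slide's own formulas: N(i) denotes the neighbors that send heat to i, α is a hypothetical proportionality constant, and the pipe factor exp(-w(i,j)^2/β) is assumed from the Gaussian analogy, with β the parameter that appears later in the KNN connection. The symbol γ = αt is likewise an assumed shorthand.

```latex
% Heat gained by node i over [t, t + dt] (assumed form of M)
f(i, t + dt) - f(i, t) = \sum_{j \in N(i)} M(i, j, t, dt)
  = \alpha \sum_{j \in N(i)} e^{-w(i,j)^2/\beta}
    \bigl(f(j, t) - f(i, t)\bigr)\, dt

% Matrix form, with f(t) = (f(1,t), \dots, f(n,t))^T
\frac{f(t + dt) - f(t)}{dt} = \alpha H f(t), \qquad
H_{ij} =
\begin{cases}
  -\sum_{k \in N(i)} e^{-w(i,k)^2/\beta}, & j = i, \\
  e^{-w(i,j)^2/\beta}, & j \in N(i), \\
  0, & \text{otherwise.}
\end{cases}

% Limit dt -> 0 and its solution
\frac{d f(t)}{dt} = \alpha H f(t)
\quad\Longrightarrow\quad
f(t) = e^{\gamma H} f(0), \qquad \gamma = \alpha t.
```

Each row of H sums to zero under this form, so heat is only redistributed along edges.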
13NHDC and PHDC algorithm - Step 1
- Construct neighborhood graph
  - Define graph G over all data points, both in the training data set and in the test data set.
  - Add an edge from j to i if j is one of the K nearest neighbors of i.
  - Set edge weight w(i, j) = d(i, j) if j is one of the K nearest neighbors of i, where d(i, j) is the Euclidean distance between point i and point j.
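Step 1 can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `knn_graph` and the use of `np.inf` to mark a missing edge are assumptions of the sketch.

```python
import numpy as np

def knn_graph(X, K):
    """Return an n x n weight matrix W for the K-nearest-neighbor graph.

    W[i, j] = d(i, j) if j is one of the K nearest neighbors of i,
    and np.inf otherwise (np.inf is this sketch's marker for "no edge").
    """
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Pairwise Euclidean distances d(i, j).
    diff = X[:, None, :] - X[None, :, :]
    D = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(D, np.inf)  # a point is not its own neighbor
    W = np.full((n, n), np.inf)
    # Indices of the K smallest distances in each row.
    nbrs = np.argsort(D, axis=1)[:, :K]
    rows = np.repeat(np.arange(n), K)
    W[rows, nbrs.ravel()] = D[rows, nbrs.ravel()]
    return W

# Four points on a line; with K = 1 each point keeps only its closest neighbor.
X = np.array([[0.0], [1.0], [3.0], [7.0]])
W = knn_graph(X, K=1)
```

The resulting graph is directed: in the example, point 2 lists point 1 as its nearest neighbor, but point 1 lists point 0, so W[2, 1] is finite while W[1, 2] is not.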
14NHDC and PHDC algorithm - Step 2
- Compute the Heat Kernel
- Using equation
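A possible way to carry out Step 2, assuming the heat kernel is the matrix exponential e^{γH} with H built from exp(-w(i,j)^2/β) on the edges (both forms are assumptions of this sketch, as are the names `diffusion_matrix` and `heat_kernel`). The exponential is approximated with a NumPy-only scaling-and-squaring Taylor series; in practice `scipy.linalg.expm` would be the usual choice.

```python
import numpy as np

def diffusion_matrix(W, beta):
    """Build H from a weight matrix W, where np.inf marks a missing edge.

    Assumed form: H[i, j] = exp(-W[i, j]**2 / beta) on an edge, 0 otherwise,
    and H[i, i] = -(sum of the off-diagonal entries in row i).
    """
    W = np.asarray(W, dtype=float)
    H = np.exp(-np.square(W) / beta)   # exp(-inf) = 0 for missing edges
    np.fill_diagonal(H, 0.0)
    np.fill_diagonal(H, -H.sum(axis=1))
    return H

def heat_kernel(H, gamma, terms=20, squarings=10):
    """Approximate expm(gamma * H) by scaling and squaring a Taylor series."""
    n = H.shape[0]
    A = gamma * H / (2 ** squarings)   # scale down so the series converges fast
    term = np.eye(n)
    result = np.eye(n)
    for k in range(1, terms + 1):
        term = term @ A / k            # A^k / k!
        result = result + term
    for _ in range(squarings):
        result = result @ result       # undo the scaling
    return result

# Two nodes joined in both directions by a pipe of length 1.
W = np.array([[np.inf, 1.0],
              [1.0, np.inf]])
H = diffusion_matrix(W, beta=1.0)
kernel = heat_kernel(H, gamma=1.0)
```

Because every row of H sums to zero, every row of the kernel sums to one.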
15NHDC and PHDC algorithm - Step 3
- Compute the Heat Distribution
- Set f(0): for each class c, the nodes labeled with class c have an initial unit heat at time 0; all other nodes have no heat at time 0.
- In PHDC, use equation
- to compute the heat distribution.
- In NHDC, use equation
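Step 3 can be sketched as below. The formulas are assumptions of this sketch, since the slide's equations are not available: PHDC is taken to apply e^{γH} to f(0), and NHDC to apply only the first-order term γH. The matrix H, the value of γ, the helper names, and the convention that label -1 marks a test node are all hypothetical.

```python
import numpy as np

def initial_heat(labels, n_classes):
    """Column c is f(0) for class c: 1 on nodes labeled c, 0 elsewhere.

    labels[i] = -1 marks an unlabeled (test) node.
    """
    labels = np.asarray(labels)
    F0 = np.zeros((labels.size, n_classes))
    idx = np.flatnonzero(labels >= 0)
    F0[idx, labels[idx]] = 1.0
    return F0

def expm_series(A, terms=30):
    """Truncated Taylor series for the matrix exponential (small ||A|| only)."""
    term = np.eye(A.shape[0])
    total = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        total = total + term
    return total

# Hypothetical 3-node H with zero row sums; node 1 is the test node.
H = np.array([[-0.4, 0.4, 0.0],
              [0.4, -0.5, 0.1],
              [0.0, 0.1, -0.1]])
labels = [0, -1, 1]
gamma = 0.1

F0 = initial_heat(labels, n_classes=2)
F_phdc = expm_series(gamma * H) @ F0   # PHDC (assumed form)
F_nhdc = (gamma * H) @ F0              # NHDC (assumed form)
```

Row i of each result holds the heat that node i has received from each class. In this example the test node receives more heat from class 0 under both variants.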
16NHDC and PHDC algorithm - Step 4
- Classify the nodes
- For each node in the test data set, classify it
to the class from which it receives most heat.
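Step 4 is an argmax over the heat received from each class. The sketch below reuses the two pairs of heat values from the illustration slide as input; treating column 0 as class A and column 1 as class B, and the function name `classify`, are assumptions of the example.

```python
import numpy as np

def classify(F_test):
    """Return, for each test node (row), the class that sent it the most heat."""
    return np.argmax(np.asarray(F_test), axis=1)

# Rows: two test nodes. Columns: heat received from class A and from class B.
F_test = np.array([[0.018, 0.016],
                   [0.002, 0.08]])
predicted = classify(F_test)
```

The first node goes to class A (index 0) and the second to class B (index 1).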
17Connections with other models
- The Parzen window approach (when the window function takes the normal form) is a special case of the NHDC.
- It is a non-parametric method for probability density estimation.
- The class-conditional density is estimated for each class k.
- Assign x to the class whose estimated value is maximal.
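For comparison, a Parzen window classifier with a normal window can be sketched as follows. This uses the textbook Gaussian estimate; the bandwidth `h`, the function names and the toy data are assumptions of the sketch, and the slide's own formula may be normalized differently.

```python
import numpy as np

def parzen_class_density(x, X_k, h):
    """Gaussian Parzen estimate of p(x | class k) from the class-k samples X_k."""
    X_k = np.asarray(X_k, dtype=float)
    d = X_k.shape[1]
    sq = ((X_k - x) ** 2).sum(axis=1)              # squared distances to x
    norm = (2.0 * np.pi * h ** 2) ** (d / 2.0)     # Gaussian normalizer
    return np.exp(-sq / (2.0 * h ** 2)).mean() / norm

def parzen_classify(x, class_samples, h):
    """Return the index of the class with the largest estimated density at x."""
    scores = [parzen_class_density(x, X_k, h) for X_k in class_samples]
    return int(np.argmax(scores))

# Toy 1-D data: class 0 sits near 0, class 1 sits near 5.
A = np.array([[0.0], [0.5], [-0.5]])
B = np.array([[5.0], [5.5], [4.5]])
near_a = parzen_classify(np.array([1.0]), [A, B], h=1.0)
near_b = parzen_classify(np.array([4.0]), [A, B], h=1.0)
```

Each sample contributes a Gaussian bump exp(-d^2/(2h^2)), which has the same shape as the assumed pipe factor exp(-d^2/β) when β = 2h^2.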
18Connections with other models
- The Parzen window approach (when the window function takes the normal form) is a special case of the NHDC.
- In our model, let K = n-1; then the graph constructed in Step 1 will be a complete graph.
The matrix H will be
Heat that xp receives from the data points in
class k
19Connections with other models
- KNN is a special case of the NHDC.
- For each test point, assign it to the class that has the most members among its K nearest neighbors.
20Connections with other models
- KNN is a special case of the NHDC.
- In our model, let β tend to infinity; then the matrix H becomes
The number of cases in class q among its K nearest neighbors.
Heat that xp receives from the data points in
class k
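The limit can be checked numerically. The sketch assumes the edge factor exp(-w^2/β), which tends to 1 as β grows, so the heat a test node receives from a class tends to the number of its neighbors in that class (up to a common positive factor, omitted here). The distances and labels are made up for illustration.

```python
import numpy as np

# Hypothetical test node with K = 3 nearest neighbors at these distances.
w_test = np.array([1.0, 2.0, 0.5])
labels = np.array([0, 0, 1])       # classes of the three neighbors
one_hot = np.eye(2)[labels]        # 3 x 2 class indicator matrix

def heat_per_class(beta):
    """Heat received from each class, using the assumed factor exp(-w^2/beta)."""
    row = np.exp(-np.square(w_test) / beta)
    return row @ one_hot

finite_beta = heat_per_class(1.0)    # distance-weighted vote
large_beta = heat_per_class(1e12)    # approaches the plain neighbor counts
```

With β = 1 the single close neighbor of class 1 outweighs the two farther neighbors of class 0; with a very large β the vote reduces to the KNN majority count, which favors class 0.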
21Connections with other models
- PHDC can approximate NHDC.
- If γ is small, then e^{γH} ≈ I + γH.
- Since the identity matrix has no effect on the heat distribution, PHDC and NHDC have similar classification accuracy when γ is small.
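A small numerical check of this approximation, under the assumption that PHDC applies e^{γH} and NHDC applies the first-order term γH. The matrix H and the initial heat f0 are hypothetical; node 1 plays the role of an unlabeled test node.

```python
import numpy as np

def expm_series(A, terms=30):
    """Truncated Taylor series for the matrix exponential (small ||A|| only)."""
    term = np.eye(A.shape[0])
    total = np.eye(A.shape[0])
    for k in range(1, terms + 1):
        term = term @ A / k
        total = total + term
    return total

# Hypothetical 3-node H with zero row sums.
H = np.array([[-0.4, 0.4, 0.0],
              [0.4, -0.5, 0.1],
              [0.0, 0.1, -0.1]])
f0 = np.array([1.0, 0.0, 0.0])   # unit heat on the labeled node only

gaps = {}
same_at_test_node = {}
for gamma in (0.01, 0.1, 1.0):
    phdc = expm_series(gamma * H) @ f0
    first_order = (np.eye(3) + gamma * H) @ f0
    nhdc = (gamma * H) @ f0      # identity term dropped (assumed NHDC form)
    gaps[gamma] = np.abs(phdc - first_order).max()
    # The identity term only adds f0, which is zero at the test node.
    same_at_test_node[gamma] = bool(np.isclose(first_order[1], nhdc[1]))
```

The gap between e^{γH} f0 and (I + γH) f0 shrinks roughly like γ^2, and dropping the identity never changes the value at a test node because its initial heat is zero.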
22Connections with other models
- Diagram of the four models: PHDC, NHDC, KNN, PWA (Parzen window approach)
23Experiments
- 2 artificial data sets
  - Spiral-100
  - Spiral-1000
- Compare with Parzen window (the window function takes the normal form), KNN and SVM.
- Each result is the average over ten-fold cross-validation.
24Experiments
Accuracy (%) of each algorithm on each data set:
Data set      NHDC    PHDC    KNN     PWA     SVM
Spiral-100    84      84      67      83      34
Spiral-1000   99.6    99.8    99.3    99.7    68.7
Credit-g      76.1    76.06   75.59   72.35   71.5
Diabetes      76.3    76.22   75.78   74.96   76.6
Glass         72.99   73.12   70.64   71.56   68.1
Iris          97.36   97.79   97.36   97.07   96
Sonar         88.75   89.07   82.86   88.28   84.8
Vehicle       72.90   72.93   71.41   72.45   88.5
25Conclusions and future work
- Avoid the difficulty of finding the explicit expression for the unknown geometry.
- Avoid the difficulty of finding a closed-form heat kernel for some complicated geometries.
- Both NHDC and PHDC achieve good accuracy.
- There is room to develop the approach further.
  - The assumption in the local heat diffusion is not fully justified.
  - We are now using a directed graph. Converting it into an undirected graph may be more reasonable, because in reality heat diffuses symmetrically.
  - Apply it to SVM?