Title: The Dirichlet Labeling Process
1 The Dirichlet Labeling Process for Functional Data Analysis
XuanLong Nguyen, Alan E. Gelfand
Duke University Machine Learning Group
Presented by Lu Ren
2 Outline
- Introduction
- Formalizing the model
- Properties of the Labeling Process
- Identifiability
- Model fitting and inference
- Applications
- Conclusions
3 Introduction
1. Functional Data
- Suppose we have a collection of functions,
each viewed as a stochastic process realization
with observations at a common set of locations.
- e.g., a random curve or surface.
2. Dirichlet Labeling Process
- For a particular process realization, we
assume that the observation at a given location
can be allocated to separate groups via a random
allocation process.
3. The Primary Objective
- Examine clustering of the set of curves.
4 Introduction
4. The connections with other models
- Dirichlet Process (DP) mixture model
- global clustering
- Dependent Dirichlet Process (DDP) mixture model
- local clustering
- Generalized Spatial DP mixture model
- thresholding latent Gaussian process
5 Model Formalization
Noisy curve realizations
over
obtained at local sites
The corresponding latent curves
Each curve is described by the label function
The Dirichlet labeling process generates a random distribution
and also a marginal multinomial distribution with
for
6 Model Formalization
Assume a collection of canonical species
is realized at each location by indexing
with the labels, i.e., if
.
Or, it is equivalent to
7 Model Formalization
is a random probability measure on
where is a base measure on
and constructed such that
- has a uniform marginal distribution at every
location
- inherits the spatial dependence structure via
on .
Denote by the finite-dimensional
distributions of . Let
and consider
where denotes the cumulative distribution
function at for .
8 Model Formalization
The vector has uniform marginals and induces a
joint distribution function denoted by
on . Let
be an increasing sequence of thresholds
in such that for
. If define
, then
Discretize into hypercubes; then
So an drawn from yields a label
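The thresholding construction on this slide can be sketched in code. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a mean-zero, unit-variance Gaussian process with an exponential covariance (the slides do not show the covariance form), maps each value through the standard normal CDF to obtain uniform marginals, and cuts [0, 1] at cumulative label probabilities. The names `phi`, `beta`, `sample_gp`, and `labels_from_uniform` are hypothetical.

```python
import math
import random
from bisect import bisect_left
from statistics import NormalDist


def exp_cov(xs, phi):
    """Exponential covariance matrix exp(-phi * |x - x'|) (an assumed form)."""
    return [[math.exp(-phi * abs(a - b)) for b in xs] for a in xs]


def cholesky(a):
    """Lower-triangular Cholesky factor of a symmetric positive-definite matrix."""
    n = len(a)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(low[i][k] * low[j][k] for k in range(j))
            if i == j:
                low[i][j] = math.sqrt(a[i][i] - s)
            else:
                low[i][j] = (a[i][j] - s) / low[j][j]
    return low


def sample_gp(xs, phi, rng):
    """Draw one mean-zero, unit-variance GP realization at locations xs."""
    low = cholesky(exp_cov(xs, phi))
    z = [rng.gauss(0.0, 1.0) for _ in xs]
    return [sum(low[i][k] * z[k] for k in range(i + 1)) for i in range(len(xs))]


def labels_from_uniform(u, beta):
    """Map values in [0, 1] to labels 1..k using cumulative thresholds of beta."""
    cuts = []
    total = 0.0
    for b in beta[:-1]:
        total += b
        cuts.append(total)
    # Label j is assigned when the value falls in (cut[j-2], cut[j-1]].
    return [bisect_left(cuts, ui) + 1 for ui in u]


rng = random.Random(0)
xs = [float(i) for i in range(10)]          # hypothetical locations
eta = sample_gp(xs, phi=0.5, rng=rng)       # spatially dependent latent values
u = [NormalDist().cdf(v) for v in eta]      # uniform marginal at each location
labels = labels_from_uniform(u, beta=[0.5, 0.5])
```

Because neighbouring values of `eta` are correlated, neighbouring locations tend to fall on the same side of a threshold and so tend to share a label, while each location's label still has marginal probabilities given by `beta`.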
9 Model Formalization
Similarly, we define auxiliary variables
on for such that
where
According to the definition of DP,
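For reference, the defining property of a Dirichlet process that this step relies on can be written out. The symbols $G$, $G_0$, $\alpha$ and the partition $B_1, \dots, B_m$ are notation assumed here for illustration, since the slide's own formula did not survive in the transcript.

```latex
\[
G \sim \mathrm{DP}(\alpha, G_0)
\;\Longleftrightarrow\;
\bigl(G(B_1), \dots, G(B_m)\bigr)
\sim \mathrm{Dirichlet}\bigl(\alpha G_0(B_1), \dots, \alpha G_0(B_m)\bigr)
\quad \text{for every finite measurable partition } (B_1, \dots, B_m).
\]
```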
10 Properties
1. Properties of .
2. Properties of .
Assume is a mean-zero, isotropic Gaussian
process
with covariance function
11 Properties
Under the assumptions on , the quantile
threshold functions are constant with
respect to and the sequence satisfies
.
12 Identifiability
- Larger will lead to smoother learned
canonical curves that are only weakly distinguishable,
while smaller will make the posteriors of the
curves cover different regions of the
function space.
- As is close to 0, label switching is
discouraged, giving global clustering; if the curve
realizations tend to switch often, the canonical
curves become more weakly identified.
- Similar locations tend to be (correctly) assigned
the same labels, but it is possible that a
whole segment is incorrectly labeled relative
to some other segments.
- Strong constraints (ordering of label values)
can be imposed.
- Model identifiability cannot be ensured with
constraints, but the mixing for posterior
inference would be expected to improve.
13 Model fitting and inference
The joint distribution associated with model
parameters
For canonical curves, the prior for vector
is normal with mean and covariance
matrix
The full conditional for still has a Gaussian
form, but it is high-dimensional for a large data
set .
Inference for the label vectors relies
on the Polya urn sampling scheme, in
terms of and
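The Polya urn scheme mentioned above can be illustrated in its simplest form, as sequential draws from a Dirichlet process. This is a sketch only: the sampler in the model conditions on the data and on the canonical curves, which is not shown here, and the names `alpha` and `base_draw` are hypothetical.

```python
import random


def polya_urn(n, alpha, base_draw, rng):
    """Draw n values sequentially from the Polya urn of a DP with concentration alpha.

    Draw number i (0-based) is a fresh value from the base measure with
    probability alpha / (alpha + i); otherwise it copies one of the i
    earlier draws, chosen uniformly at random.
    """
    draws = []
    for i in range(n):
        if rng.random() < alpha / (alpha + i):
            draws.append(base_draw(rng))
        else:
            draws.append(rng.choice(draws))
    return draws


# Hypothetical base measure: a binary label vector over 5 locations.
def random_label_vector(rng):
    return tuple(rng.randint(1, 2) for _ in range(5))


label_vectors = polya_urn(20, alpha=1.0, base_draw=random_label_vector,
                          rng=random.Random(0))
```

Copying an earlier draw is what produces ties, so several curves can share one label vector; a smaller `alpha` makes fresh draws rarer and ties more common.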
14 Applications
1. Synthetic Data
- Specify locations
while leaving the other 20 locations for validation
purposes.
- for are iid draws
from at locations
, where
are constructed by
.
- The data collection is
obtained by mixing with
an independent error process drawn from
.
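The last two steps of this recipe, indexing canonical curves by labels and then adding independent noise, can be sketched as follows. Everything concrete here is an assumption for illustration: the two canonical curves (sine and cosine), the hand-picked label sequence, the locations, and the noise level are hypothetical and are not the values used in the slides.

```python
import math
import random


def compose_curve(labels, canonical, noise_sd, rng):
    """Pick canonical[label][loc] at each location and add Gaussian noise."""
    return [canonical[lab][loc] + rng.gauss(0.0, noise_sd)
            for loc, lab in enumerate(labels)]


xs = [0.5 * i for i in range(8)]            # hypothetical locations
canonical = {                               # hypothetical canonical curves
    1: [math.sin(x) for x in xs],
    2: [math.cos(x) for x in xs],
}
labels = [1, 1, 1, 2, 2, 2, 1, 1]           # hypothetical label function values
y = compose_curve(labels, canonical, noise_sd=0.1, rng=random.Random(0))
```

With `noise_sd=0.0` the function returns the latent curve itself, which makes it easy to check that the indexing by labels is doing what is intended before noise is added.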
17 Applications
2. Progesterone modeling
- The data record the natural logarithm of the
progesterone metabolite during a monthly cycle
for 51 female subjects.
- Each cycle ranges from -8 to 15 (8 days
pre-ovulation to 15 days post-ovulation).
- There are a total of 88 cycles: the first 66
cycles belong to the non-contraceptive group, and the
remaining 22 cycles belong to the contraceptive
group.
- We also consider a modified data set in which the curves
of the contraceptive group are down-shifted by 2.
- We focus our analysis on the case k = 2.
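The construction of the modified data set can be sketched as below. The curve values are zero placeholders, not the progesterone measurements; only the layout (88 cycles, days -8 to 15, the last 22 cycles shifted down by 2) follows the slide. The helper `shift_group` is a hypothetical name.

```python
def shift_group(curves, start, offset):
    """Return a copy of curves with offset added to every curve from index start on."""
    return [[v + offset for v in c] if i >= start else list(c)
            for i, c in enumerate(curves)]


days = list(range(-8, 16))                   # day -8 through day 15
curves = [[0.0] * len(days) for _ in range(88)]   # placeholder values only
modified = shift_group(curves, start=66, offset=-2.0)
```

The first 66 curves are left unchanged and the remaining 22 are lowered by 2 at every day, so the two groups differ by a constant vertical offset in the modified data.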
21 Applications
3. Image modeling
- 80 color images, each of size
.
- Each image is represented by a surface
realization , where is the color
intensity at the location .
- represents the RGB color intensity.
- We introduce canonical species curves.
23 Conclusions
- The Dirichlet labeling process provides a highly
flexible prior for modeling collections of
functions.
- The inter-relationships between the model parameters
are complex with regard to process behavior.
- MCMC inference is shown to mix quickly
and yields good results.
- The model is applied to two real-data applications.