The Dynamic Hierarchical Dirichlet Process - PowerPoint PPT Presentation

About This Presentation
Title:

The Dynamic Hierarchical Dirichlet Process

Description:

Presenter: John Paisley. Duke University. 07/06/08 ... Sharing statistical strength (HDP and Dynamic DP) Dynamic hierarchical Dirichlet process (dHDP) ... – PowerPoint PPT presentation


Transcript and Presenter's Notes



1
The Dynamic Hierarchical Dirichlet Process
The 25th International Conference on Machine Learning (ICML 2008)
Lu Ren, David B. Dunson and Lawrence Carin
Presenter: John Paisley
Duke University
07/06/08
2
Outline
  • Dirichlet process (DP) mixture model
  • Sharing statistical strength (HDP and Dynamic
    DP)
  • Dynamic hierarchical Dirichlet process (dHDP)
  • dHDP for music segmentation (HMM)
  • Time-evolving model for gene analysis (GMM)
  • Conclusions and future work

3
Dirichlet Process (DP)
  • Dirichlet process (DP): a measure on measures
  • $G \sim \mathrm{DP}(\alpha, G_0)$

Precision parameter $\alpha$ and base measure $G_0$
  • Good clustering property
  • Non-parametric Bayesian prior for density
    estimation
  • Explicit mathematical form: stick-breaking
    process
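
The stick-breaking form mentioned above can be sketched in a few lines. This is a minimal illustration, not code from the presentation: the truncation level, the value of the precision parameter and the standard normal base measure are all assumptions chosen for the example.

```python
import numpy as np

def stick_breaking_weights(alpha, num_atoms, rng):
    """Truncated stick-breaking weights for a DP with precision alpha."""
    # Break off a Beta(1, alpha) fraction of the remaining stick at each step.
    v = rng.beta(1.0, alpha, size=num_atoms)
    v[-1] = 1.0  # truncation: the last atom takes whatever mass is left
    remaining = np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))
    return v * remaining

rng = np.random.default_rng(0)
weights = stick_breaking_weights(alpha=2.0, num_atoms=50, rng=rng)
atoms = rng.normal(0.0, 1.0, size=50)  # assumed base measure G0 = N(0, 1)
```

The pair (weights, atoms) is one truncated draw of G. A smaller precision parameter concentrates the mass on fewer atoms, which is the clustering property listed on the slide.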

4
DP Mixture Model
Assume we have data points
  • Infinite number of atoms
  • infinite mixture model

Figure 1. Graphical model for DP mixture
Independence assumption
5
Sharing Statistical Strength
  • A recurring theme
  • Separate observations into groups
  • The groups remain linked

Two methods
1. Hierarchical Dirichlet Process (HDP)
  • Assume the data are subdivided
  • Parameters are shared among groups
  • Different groups are exchangeable

6
Hierarchical Dirichlet Process
Stick-breaking construction
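
The equations for this slide did not survive in the transcript. As a hedged illustration of the standard HDP construction (global weights from a stick-breaking process, group weights drawn around them), the sketch below uses a finite truncation, under which each group's weights reduce to a Dirichlet draw. The names gamma, alpha and truncation, and the normal base measure, are assumptions for the example.

```python
import numpy as np

def gem_weights(concentration, truncation, rng):
    """Truncated stick-breaking (GEM) weights."""
    v = rng.beta(1.0, concentration, size=truncation)
    v[-1] = 1.0  # last stick absorbs the remaining mass
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

def hdp_group_weights(gamma, alpha, num_groups, truncation, rng):
    """Global weights beta, then one weight vector per group."""
    beta = gem_weights(gamma, truncation, rng)
    # With finitely many atoms, DP(alpha, beta) is Dirichlet(alpha * beta).
    pi = rng.dirichlet(alpha * beta, size=num_groups)
    return beta, pi

rng = np.random.default_rng(1)
beta, pi = hdp_group_weights(gamma=2.0, alpha=5.0, num_groups=3,
                             truncation=8, rng=rng)
atoms = rng.normal(size=8)  # atoms are shared by every group
```

Every row of pi places mass on the same atoms, which is how parameters are shared among groups while each group keeps its own mixture weights.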
7
Dynamic Mixture DPs
2. Dynamic mixture of DPs (DMDP)
  • Accommodates autocorrelation in the
    distributions
  • The atom (parameter) number might be huge

8
Problem Definition
Motivation
  • Data collected sequentially
  • Temporal evolution is assumed
  • Non-parametric prior and sparse model

Solution
  • Dynamic hierarchical Dirichlet process (dHDP)
  • The parameters are shared globally
  • The mixture weights change dynamically

9
Dynamic HDP
10
Dynamic HDP
11
Dynamic HDP
Fig. 3 Graphical model for dHDP
12
Dynamic HDP
Two indicator variables for each observation.
The model specification
Fig. 4 Stick-breaking representation for dHDP
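
The model equations are missing from this transcript, so the following is only a sketch under a stated assumption: that each group's distribution is a mixture of the previous group's distribution and a new innovation distribution, G_j = (1 - w_{j-1}) G_{j-1} + w_{j-1} H_{j-1}. Under that assumed recursion, unrolling gives the weight each component receives in every group. The name w_tilde is hypothetical.

```python
import numpy as np

def dhdp_mixing_weights(w_tilde):
    """Unroll the assumed recursion into per-group component weights.

    w_tilde has length J - 1; entry j is the innovation weight used when
    moving from group j + 1 to group j + 2 (1-based groups).
    Row j of the result holds the weights of G_1, H_1, ..., H_j in group j + 1.
    """
    num_groups = len(w_tilde) + 1
    w = np.concatenate(([1.0], w_tilde))  # w[0] = 1: group 1 is all G_1
    mix = np.zeros((num_groups, num_groups))
    for j in range(num_groups):
        for l in range(j + 1):
            # Component l enters with weight w[l], then decays at each later step.
            mix[j, l] = w[l] * np.prod(1.0 - w[l + 1 : j + 1])
    return mix

mix = dhdp_mixing_weights(np.array([0.0, 1.0, 0.5]))
```

In this example the zero innovation weight makes groups 1 and 2 identical, the weight of one makes group 3 a complete break from the past, and group 4 splits its mass evenly between group 3's component and a new one.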
13
Dynamic HDP
Theorem 1
Theorem 2
14
Posterior Inference
  • A modification of the block Gibbs sampler
  • Collect the samples for each random variable
  • Approximate the posterior distribution
  • Details depend on the specific application:

1. For the HMM mixture
2. For the GMM
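
The modified sampler itself is not reproduced in the transcript. For orientation, this is a sketch of one step of the standard blocked Gibbs sampler for a single truncated DP, which the slides say is the starting point: given the current component assignments, the stick variables have conjugate Beta updates. The function name and arguments are illustrative, and the dHDP-specific steps are not shown.

```python
import numpy as np

def sample_stick_weights(z, alpha, truncation, rng):
    """Draw mixture weights given assignments z (integers in [0, truncation))."""
    counts = np.bincount(z, minlength=truncation)
    # tail[k] = number of observations assigned to components after k
    tail = counts[::-1].cumsum()[::-1] - counts
    v = rng.beta(1.0 + counts, alpha + tail)
    v[-1] = 1.0  # truncation: last stick takes the remaining mass
    return v * np.concatenate(([1.0], np.cumprod(1.0 - v[:-1])))

rng = np.random.default_rng(0)
z = np.array([0, 0, 1, 3])
weights = sample_stick_weights(z, alpha=1.0, truncation=5, rng=rng)
```

Collecting such draws over many iterations, together with draws of the assignments and atoms, gives the samples used to approximate the posterior.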
15
Music Segmentation
  • dHDP HMMs mixture
  • Contiguous parts are clustered together
  • Segment changes are detected as innovations

The music: the Largo-Allegro movement of Beethoven's
Piano Sonata No. 17, also referred to as
The Tempest.
Fig. 5 Auditory waveform of the Sonata
  • MFCC features extracted
  • Discretized by vector quantization (VQ)
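
The discretization step can be illustrated with a plain k-means codebook. This is a sketch only: the slides do not say which VQ algorithm or codebook size was used, and the feature matrix below is random data standing in for real MFCC frames.

```python
import numpy as np

def vector_quantize(features, codebook_size, num_iters, rng):
    """Lloyd's k-means; returns the codebook and one code index per frame."""
    start = rng.choice(len(features), codebook_size, replace=False)
    codebook = features[start].copy()
    for _ in range(num_iters):
        dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
        codes = dists.argmin(axis=1)
        for k in range(codebook_size):
            members = features[codes == k]
            if len(members) > 0:  # keep the old codeword if a cell is empty
                codebook[k] = members.mean(axis=0)
    dists = ((features[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return codebook, dists.argmin(axis=1)

rng = np.random.default_rng(0)
fake_mfcc = rng.normal(size=(200, 13))  # stand-in for MFCC frames, not real audio
codebook, codes = vector_quantize(fake_mfcc, codebook_size=8, num_iters=10, rng=rng)
```

The resulting integer sequence is the kind of discrete observation stream a discrete-output HMM can model.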

16
Music Segmentation
Fig. 6 Segmentation result on the auditory
waveform of the Sonata
  • Dominant and temporally localized auditory
    phenomena
  • Makes the model sparse and keeps the temporal
    coherence
  • Automatically annotates the music in a Bayesian
    setting

17
Music Segmentation
Fig. 7 Similarity matrix from HMM mixture modeling: (a) dHDP-HMMs, (b) HDP-HMMs
Temporal dependence makes the dHDP-HMMs
segmentation less sensitive to local
temporal bursts than the HDP-HMMs.
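
The transcript does not say how the similarity matrix is computed. One common choice, assumed here purely for illustration, is the posterior co-clustering frequency: the fraction of collected Gibbs samples in which two subsequences are assigned to the same mixture component.

```python
import numpy as np

def coclustering_similarity(indicator_samples):
    """Fraction of posterior samples in which each pair shares a component."""
    z = np.asarray(indicator_samples)      # shape (num_samples, num_items)
    same = z[:, :, None] == z[:, None, :]  # shape (num_samples, items, items)
    return same.mean(axis=0)

# Three hypothetical posterior samples of the indicators for three items.
samples = np.array([[0, 0, 1],
                    [0, 1, 1],
                    [2, 2, 0]])
similarity = coclustering_similarity(samples)
```

Here items 0 and 1 share a component in two of the three samples, so their similarity is 2/3, while items 0 and 2 never do. Because only equality of labels matters, the measure is unaffected by label switching between samples.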
18
Gene Analysis
Problem
  • Time-evolving modeling: disease development
  • Related genes: high-dimensional gene expression

Time after infection (t):  3hr  6hr  12hr  24hr  48hr  72hr
Number of samples:          10   12    12    10    12     9
Step 1: Prune the genes with the Fisher score
Step 2: dHDP mixture model developed for further analysis
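
The pruning step can be sketched with the usual two-class Fisher score, the squared difference of class means divided by the sum of class variances. The slides do not state which definition or which class split was used, so the binary labels and the helper names below are assumptions.

```python
import numpy as np

def fisher_scores(x, labels):
    """Two-class Fisher score per gene (column of x); labels are 0 or 1."""
    x = np.asarray(x, dtype=float)
    labels = np.asarray(labels)
    a, b = x[labels == 0], x[labels == 1]
    numerator = (a.mean(axis=0) - b.mean(axis=0)) ** 2
    denominator = a.var(axis=0) + b.var(axis=0)
    return numerator / (denominator + 1e-12)  # small constant avoids 0 / 0

def prune_genes(x, labels, keep):
    """Indices of the `keep` highest-scoring genes, in ascending order."""
    scores = fisher_scores(x, labels)
    top = np.argsort(scores)[::-1][:keep]
    return np.sort(top)

# Toy data: gene 0 separates the two classes, gene 1 does not.
x = np.array([[0.0, 5.0], [1.0, 5.2], [10.0, 5.1], [11.0, 4.9]])
labels = np.array([0, 0, 1, 1])
selected = prune_genes(x, labels, keep=1)
```

Only the retained columns would then be passed to the mixture model in Step 2.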
Assume samples at each time point.
Consider:
1. Individual diversity of samples
2. Similar temporal pattern of infection level
19
Gene Analysis
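3. At each time point, a hidden variable represents the virus
infection level for each sample; the associated p-dimensional
vector has each component drawn i.i.d. from a
Student-t distribution
Fig. 8 Time-evolving model for gene analysis (time points T1, T2, T3, ..., TJ)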
20
Gene Analysis
4. For each time point, the hidden variables are assumed
to be drawn from a Gaussian mixture
Fig. 9 Median values and associated uncertainty
based on posterior distributions of the hidden
variables.
21
Gene Analysis
Fig. 10 The dHDP GMM modeling for the gene
expression data. (a) Posterior distribution.
(b) Similarity matrix.
22
Gene Analysis
Fig. 11 The first ten inferred important genes
(red and blue) and the relatively unrelated
genes (green).
23
Gene Analysis
dHDP encourages proper sharing, which improves
parameter estimation when the amount of data is
limited.
Fig. 12 Similarity matrix with four
samples for each temporal group. (a) HDP, (b)
dHDP.
24
Gene Analysis
Fig. 13 Comparison of dHDP and HDP with box
plots of the hidden variables as the sample
size is reduced to four for each temporal group
(the standard deviation based on dHDP is reduced by 12.1
on average relative to HDP; the means are
very similar).
25
Gene Analysis
Fig. 14 Similarity matrix between data at
different time points based on the correlation
coefficients (Theorem 2), as computed from the
dHDP posterior. (a) Using all available data,
(b) using four samples for each temporal group.
26
Conclusions
  • A non-parametric prior, the dynamic HDP, is proposed
  • Time dependence is explored
  • A modification of the block Gibbs sampler
  • The dHDP HMMs mixture for music modeling
  • Temporally dependent model for analyzing the
    Dengue gene expression data

27
Thanks!