Cumulative distribution networks: Graphical models for cumulative distribution functions

1
Cumulative distribution networks: Graphical models for cumulative distribution functions

Jim C. Huang and Brendan J. Frey
Probabilistic and Statistical Inference Group,
Department of Electrical and Computer Engineering,
University of Toronto, Toronto, ON, Canada

2
Motivation

Problems where density models may be intractable or unsuitable:
e.g. Models with latent variables: unidentifiability, intractability
e.g. Learning to rank
Cumulative distribution network (CDN)

3
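Cumulative distribution functions (CDFs)
Negative convergence: F(x) → 0 as any single argument → −∞
Positive convergence: F(x) → 1 as all arguments → +∞
Monotonicity: F(x) is non-decreasing in each argument

Marginalization ↔ maximization
Conditioning ↔ differentiation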
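
The two correspondences on this slide can be checked numerically. The following is a minimal sketch, assuming a toy joint CDF built from two independent standard Gaussians (chosen only for illustration, not a model from the talk); all function names are hypothetical.

```python
import math

def gauss_cdf(x):
    """Standard Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def joint_cdf(x, y):
    """Toy joint CDF F(x, y) of two independent standard Gaussians (assumption)."""
    return gauss_cdf(x) * gauss_cdf(y)

BIG = 50.0  # numerical stand-in for +infinity

# Marginalization: push the unwanted variable to its maximum (+infinity).
marginal_x = joint_cdf(0.3, BIG)

def d_dy(f, x, y, h=1e-5):
    """Central finite difference of f with respect to its second argument."""
    return (f(x, y + h) - f(x, y - h)) / (2.0 * h)

# Conditioning on Y = y: differentiate the joint CDF with respect to y,
# then normalize by the same derivative with x pushed to +infinity.
cond_cdf = d_dy(joint_cdf, 0.3, -1.0) / d_dy(joint_cdf, BIG, -1.0)
```

Because the toy variables are independent, both `marginal_x` and `cond_cdf` should agree with `gauss_cdf(0.3)`, which makes the sketch easy to sanity-check.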

4
Cumulative distribution networks (CDNs)

Bipartite graph for representing CDFs
Example
Sufficient for the CDN functions to be CDFs (Huang and Frey, 2008)
e.g. Multivariate Gaussian CDFs, multivariate sigmoids, copulas
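
As a rough illustration of the construction, a CDN evaluates the joint CDF as a product of functions, each defined over the variables neighboring one function node. The sketch below assumes a hypothetical three-variable CDN and uses a bivariate sigmoid for every function; the graph, the names, and the choice of function are illustrative assumptions, not taken from the slides.

```python
import math

def sigmoid2(u, v):
    """Bivariate sigmoid 1 / (1 + e^-u + e^-v), itself a valid bivariate CDF."""
    return 1.0 / (1.0 + math.exp(-u) + math.exp(-v))

# Hypothetical CDN: function node "a" touches (X, Y), node "b" touches (Y, Z).
CDN = {"a": ("X", "Y"), "b": ("Y", "Z")}

def cdn_cdf(values):
    """Joint CDF as the product of the CDN functions over their neighbors."""
    out = 1.0
    for scope in CDN.values():
        out *= sigmoid2(*(values[name] for name in scope))
    return out

F_origin = cdn_cdf({"X": 0.0, "Y": 0.0, "Z": 0.0})  # (1/3) * (1/3)
```

Storing the graph as a mapping from function node to variable scope keeps the bipartite structure explicit: adding a function node is one more dictionary entry.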

5
Necessary/sufficient conditions on CDN functions

Negative convergence (necessity and sufficiency)
Positive convergence (sufficiency)

For each node a, at least one neighboring function → 0
All functions → 1
6
Necessary/sufficient conditions on CDN functions

Monotonicity lemma (sufficiency)
(assuming derivatives exist!)

All functions monotonically non-decreasing
Sufficient condition for a valid joint CDF
Each CDN function can be a CDF of its
arguments
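
The conditions on the last two slides can be spot-checked on a small example. This is a minimal sketch under the assumption of a hypothetical CDN over (X, Y, Z) whose two functions are bivariate sigmoids; the grid check below is a numerical sanity test, not a proof, and it does not replace the derivative-based lemma.

```python
import itertools
import math

def phi(u, v):
    """Bivariate sigmoid used as a CDN function (illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-u) + math.exp(-v))

def F(x, y, z):
    """Hypothetical CDN over (X, Y, Z) with functions on (X, Y) and (Y, Z)."""
    return phi(x, y) * phi(y, z)

LO, HI = -40.0, 40.0  # numerical stand-ins for -infinity and +infinity
grid = [-2.0, -1.0, 0.0, 1.0, 2.0]

# Negative convergence: F -> 0 when any single argument -> -infinity.
neg_ok = all(
    F(*[LO if i == k else 0.0 for i in range(3)]) < 1e-12 for k in range(3)
)

# Positive convergence: F -> 1 when all arguments -> +infinity.
pos_ok = abs(F(HI, HI, HI) - 1.0) < 1e-12

# Monotonicity: increasing any one argument never decreases F.
mono_ok = all(
    F(*[p[i] + (0.5 if i == k else 0.0) for i in range(3)]) >= F(*p)
    for p in itertools.product(grid, repeat=3)
    for k in range(3)
)
```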
7
Example: Bivariate CDN distributions

Using Gaussian bivariate CDFs

8
Example: Bivariate CDN distributions

Using Gumbel copulas with marginal t-CDFs and
Gaussian CDFs
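
A CDN function of this kind can be sketched by plugging marginal CDFs into a Gumbel copula. The example below is an assumption-laden illustration: it uses only Gaussian marginals (a Student-t CDF would need a numerical library), and the parameter value `theta=2.0` and all function names are hypothetical.

```python
import math

def gauss_cdf(x, mu=0.0, sigma=1.0):
    """Gaussian CDF via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def gumbel_copula(u, v, theta=2.0):
    """Gumbel copula C(u, v) for u, v in (0, 1] and theta >= 1."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))

def cdn_function(x, y, theta=2.0):
    """Bivariate CDF: Gumbel copula applied to Gaussian marginal CDFs."""
    return gumbel_copula(gauss_cdf(x), gauss_cdf(y), theta)

value = cdn_function(0.0, 0.0)
```

With `theta=1.0` the copula reduces to the product `u * v`, i.e. independence; larger values of `theta` give stronger positive dependence.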

9
Conditional independence and graph separation in
CDNs

For any disjoint variable node sets
separated by with respect to

10
Conditional independence in CDNs

For any disjoint variable node sets
separated by with respect to
e.g. X and Y are conditionally dependent given Z
e.g. X and Y are conditionally independent given
Z

11
Conditional independence and graph separation in
CDNs
12
Connection to bi-directed graphs

Graphs for representing marginal independence
e.g.
Covariance graphs (Kauermann, 1996)
Binary models for marginal independence (Drton
and Richardson, 2008)
Factorial mixture models (Silva and Ghahramani,
2009)
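
Marginal independence is easy to see numerically in a CDN: two variables that share no function node have a joint marginal CDF that factorizes. The sketch below assumes a hypothetical chain X, Y, Z with bivariate-sigmoid functions on (X, Y) and (Y, Z); the graph and names are illustrative only.

```python
import math

def phi(u, v):
    """Bivariate sigmoid used as a CDN function (illustrative choice)."""
    return 1.0 / (1.0 + math.exp(-u) + math.exp(-v))

def F(x, y, z):
    """Hypothetical CDN: X and Z share no function node, Y neighbors both."""
    return phi(x, y) * phi(y, z)

INF = 40.0  # numerical stand-in for +infinity

# Marginals are obtained by pushing the other variables to +infinity.
F_xz = F(0.5, INF, -1.0)
F_x = F(0.5, INF, INF)
F_z = F(INF, INF, -1.0)
gap_xz = abs(F_xz - F_x * F_z)  # ~0: X and Z are marginally independent

# X and Y share function "a", so their marginal CDF does not factorize.
gap_xy = abs(F(0.0, 0.0, INF) - F(0.0, INF, INF) * F(INF, 0.0, INF))
```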

13
Null-dependence in CDNs
14
Null-dependence in CDNs
15
Mapping between CDNs and factor graphs

Equivalence between bi-directed graph and
directed graph
Equivalence between CDN and factor graph

16
Inference by message passing

Conditioning ↔ differentiation
Replace sum in sum-product with differentiation
Recursively apply product rule via message-passing with messages μ, λ
Derivative-Sum-Product algorithm (Huang and Frey, 2008)
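
The idea of replacing sums with derivatives and applying the product rule can be illustrated on a tiny chain. This is not the Derivative-Sum-Product algorithm itself (no messages are passed); it is a minimal sketch, assuming a hypothetical CDN F(x, y, z) = φ(x, y) φ(y, z) with bivariate-sigmoid functions, that computes the joint density as the mixed derivative of F and checks it against finite differences.

```python
import itertools
import math

def _s(u, v):
    return 1.0 + math.exp(-u) + math.exp(-v)

def phi(u, v):
    """Bivariate sigmoid CDN function (symmetric in its arguments)."""
    return 1.0 / _s(u, v)

def d1(u, v):
    """Partial derivative of phi with respect to its first argument."""
    return math.exp(-u) / _s(u, v) ** 2

def d12(u, v):
    """Mixed partial derivative of phi with respect to both arguments."""
    return 2.0 * math.exp(-u) * math.exp(-v) / _s(u, v) ** 3

def density(x, y, z):
    """d^3 F / dx dy dz for F = phi(x, y) * phi(y, z).

    d/dx and d/dz each hit one factor; d/dy is spread over both factors
    by the product rule. By symmetry, d phi(y, z) / dz equals d1(z, y).
    """
    return d12(x, y) * d1(z, y) + d1(x, y) * d12(y, z)

def density_fd(x, y, z, h=1e-3):
    """Same mixed derivative by central finite differences (for checking)."""
    total = 0.0
    for sx, sy, sz in itertools.product((-1, 1), repeat=3):
        total += sx * sy * sz * phi(x + sx * h, y + sy * h) * phi(y + sy * h, z + sz * h)
    return total / (8.0 * h ** 3)
```

On larger graphs, writing out every product-rule term by hand becomes impractical, which is the motivation for organizing the same computation as local messages.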

17
The derivative-sum-product algorithm

In a CDN
In a factor graph

18
Derivative-Sum-Product

Message from function to variable

19
Derivative-Sum-Product

Message from variable to function

20
Application: Ranking in multiplayer gaming

e.g. Halo 2 game with 7 players, 3 teams

Given game outcomes, update player skills as a
function of all player/team performances
21
Ranking in multiplayer gaming
Local cumulative model linking team rank r_n with player performances x_n
e.g. Team 2 has rank 2
22
Ranking in multiplayer gaming
Pairwise model of team ranks r_n, r_{n+1}
Enforce stochastic orderings between teams via h
23
Application: Ranking in multiplayer gaming

CDN functions: Gaussian CDFs
Skill updates
Prediction

24
Interpretation of skill updates

For any given player let
denote the outcomes of games he/she has
played previously
Then the skill function corresponds to

25
Results

Previous methods for ranking players
Elo (Elo, 1978)
TrueSkill (Graepel, Minka and Herbrich, 2006)
After message-passing

26
Factor graph and CDN for multiplayer games
27
Factor graph and CDN for multiplayer games
28
Factor graph and CDN for multiplayer games
Dual factor graph
TrueSkill factor graph
29
Learning to rank from observations

Goal: Learn a ranking function that minimizes the probability of misranking on test queries

Training data
Learning
Predict on test data
30
Structured ranking learning

Define structured loss functional as likelihood
of generating order graphs
Use stochastic gradients to minimize structured
loss functional given independent observations

31
Converting from an order graph to a CDN
Edge in order graph
Preference variable node in CDN
32
Probabilistic models for rank data as CDNs
Bradley-Terry model
e.g. RankNet (Burges et al., 2005), RankMotif (Chen, Hughes and Morris, 2007)
Plackett-Luce model
e.g. ListNet (Cao et al., 2007), ListMLE (Xia et al., 2008)
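
For reference, the two rank models named on this slide can be written down in a few lines. The sketch below uses the common exponential-score parameterization, which is an assumption here (the slide's own formulas did not survive), and it does not show how either model maps onto a CDN; the function names are hypothetical.

```python
import itertools
import math

def bradley_terry(s_i, s_j):
    """P(item i is preferred to item j) under Bradley-Terry with scores s."""
    return math.exp(s_i) / (math.exp(s_i) + math.exp(s_j))

def plackett_luce(scores, order):
    """P(order) under Plackett-Luce; `order` lists item indices, best first."""
    prob = 1.0
    remaining = list(order)
    while remaining:
        denom = sum(math.exp(scores[j]) for j in remaining)
        prob *= math.exp(scores[remaining[0]]) / denom
        remaining.pop(0)  # the chosen item leaves the pool
    return prob

scores = [1.2, 0.4, -0.3]
total = sum(plackett_luce(scores, p) for p in itertools.permutations(range(3)))
```

With only two items, Plackett-Luce reduces to Bradley-Terry, which is one way to see the pairwise model as a special case of the listwise one.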
33
Ranking documents for information retrieval

Loss functional
Multivariate sigmoids

34
Ranking documents for information retrieval

Ranking function
Nadaraya-Watson estimator with Gaussian kernel
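
A Nadaraya-Watson estimator with a Gaussian kernel predicts a kernel-weighted average of training targets. The following is a minimal sketch of the textbook estimator; the bandwidth, the toy data, and the function names are assumptions, and how the talk chooses or learns the targets is not recoverable from the transcript.

```python
import math

def gaussian_kernel(x, xi, bandwidth):
    """Isotropic Gaussian kernel between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, xi))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))

def nadaraya_watson(x, train_x, train_y, bandwidth=1.0):
    """Kernel-weighted average of train_y, used here as a ranking score."""
    weights = [gaussian_kernel(x, xi, bandwidth) for xi in train_x]
    return sum(w * y for w, y in zip(weights, train_y)) / sum(weights)

# Toy data (hypothetical): one feature, two training documents.
train_x = [(0.0,), (1.0,)]
train_y = [0.0, 1.0]
score_mid = nadaraya_watson((0.5,), train_x, train_y)
```

Documents for a query would then be sorted by their scores; a smaller bandwidth makes the score follow the nearest training points more closely.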

35
Ranking documents for information retrieval

Performance metrics
Precision
Average precision
Normalized Discounted Cumulative Gain (NDCG) for a ranked list of documents with labels r(j)
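
These metrics can be sketched as follows. The NDCG variant uses gain 2^r(j) − 1 and discount 1/log2(1 + j), which is one common convention and an assumption here, since the slide's exact formula did not survive; the function names are hypothetical.

```python
import math

def precision_at_k(relevant, k):
    """Fraction of the top-k documents that are relevant (binary labels)."""
    return sum(relevant[:k]) / k

def average_precision(relevant):
    """Mean of precision@j over the positions j that hold a relevant document."""
    hits, total = 0, 0.0
    for j, rel in enumerate(relevant, start=1):
        if rel:
            hits += 1
            total += hits / j
    return total / hits if hits else 0.0

def dcg_at_k(labels, k):
    """Discounted cumulative gain over the top-k graded labels r(j)."""
    return sum((2 ** r - 1) / math.log2(1 + j)
               for j, r in enumerate(labels[:k], start=1))

def ndcg_at_k(labels, k):
    """DCG normalized by the DCG of the ideal (sorted) ordering."""
    ideal = dcg_at_k(sorted(labels, reverse=True), k)
    return dcg_at_k(labels, k) / ideal if ideal > 0 else 0.0
```

Each function takes labels in ranked order, best-scored document first, so a perfectly ordered list gives an NDCG of 1.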

36
Application: Information retrieval

OHSUMED dataset (LETOR 2.0)

37
Application: Information retrieval

OHSUMED dataset (LETOR 2.0)

38
Application: Information retrieval

OHSUMED dataset (LETOR 2.0)

39
Application: Computational systems biology
40
Ranking transcription factor binding sites

Learn from protein binding microarray data (Berger et al., 2006)

41
Ranking transcription factor binding sites

Ranking function depends on position weight
matrix M

[Figure: position weight matrix, showing probability of occurrence by position]
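
One common way for a score to depend on a position weight matrix M is to sum log-probabilities over a sequence window and take the best window. The sketch below illustrates that generic scheme only; the matrix values, the sequence, and the function names are hypothetical, and the talk's actual ranking function is not recoverable from the transcript.

```python
import math

# Hypothetical 3-position PWM: M[pos][base] = probability of base at position.
M = [
    {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1},
    {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1},
    {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1},
]

def window_log_prob(window, pwm):
    """Log-probability of one window under the PWM (positions independent)."""
    return sum(math.log(col[base]) for col, base in zip(pwm, window))

def best_site_score(sequence, pwm):
    """Score a sequence by its best-matching window."""
    width = len(pwm)
    return max(
        window_log_prob(sequence[i:i + width], pwm)
        for i in range(len(sequence) - width + 1)
    )

score = best_site_score("TTAGCTT", M)  # best window is "AGC"
```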
42
Ranking transcription factor binding sites
43
Ranking transcription factor binding sites

Learn to rank microRNA targets using diverse
datasets

44
Ranking microRNA targets

Combine quantitative features and sequence data
Quantitative features can be obtained from
diverse experimental data and computational
prediction methods

45
Ranking microRNA targets