1
Untangling graphs: denoising protein-protein interaction networks
  • Quaid Morris
  • (joint work with Brendan Frey)

2
Motivation
  • High-throughput graph data is noisy, e.g.,
  • protein-protein interaction networks
  • synthetic lethal interaction networks
  • Real world graphs are highly structured

Idea Use prior knowledge about structure to
denoise graphs
3
Protein-Protein interaction network
Jeong et al, Nature 2001
4
Overview
  • Illustrative example
  • Model and inference algorithm
  • Protein-protein interaction network denoising

5
Example: spy rings
Suspects
Phone Records
Call
  • Spies call exactly two other spies
  • Suspects may call other suspects
  • Phone records may be lost

6
Example: spy rings (cont.)
Phone Records
Possible Rings
7
Denoising example
Noise assumptions
  • No lost calls
  • Rare, independent social calls

Possible Rings
8
Denoising example
Noise assumptions
  • No lost calls
  • Rare, independent social calls

Possible Rings
9
Denoising example
Noise assumptions
  • No lost calls
  • Rare, independent social calls

Possible Rings
10
Untangling example
Telemarketing Example
Noise assumptions
  • Lost calls and rare social calls
  • Telemarketing

Possible Decompositions
11
Summary
  • With structured noise, the observed graph is
    composed of several different graphs, each with
    its own properties.

12
Graph generative model
  1. Sample hidden graphs E^1 and E^2 from P(E^h)
  2. Sample each observation x_{i,j} from P(x_{i,j} | e^1_{i,j}, e^2_{i,j}), giving the observed graph X
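The two sampling steps above can be sketched in Python. The slides leave P(x | e^1, e^2) unspecified, so the flip-noise observation model and all rates below are illustrative assumptions:

```python
import random

def sample_observed_graph(n, p_edge=(0.1, 0.3), p_flip=0.05, seed=0):
    """Sample H=2 hidden graphs E^1, E^2, then an observed graph X.

    Hypothetical noise model (the slides leave P(x | e^1, e^2)
    unspecified): x_ij is the OR of the two hidden edges, flipped
    independently with probability p_flip.
    """
    rng = random.Random(seed)
    E1, E2, X = {}, {}, {}
    for i in range(n):
        for j in range(i):          # one variable per pair i > j
            # Step 1: sample hidden edges e^h_ij from P(E^h)
            e1 = rng.random() < p_edge[0]
            e2 = rng.random() < p_edge[1]
            # Step 2: sample x_ij from P(x_ij | e^1_ij, e^2_ij)
            x = (e1 or e2) != (rng.random() < p_flip)
            E1[i, j], E2[i, j], X[i, j] = e1, e2, x
    return E1, E2, X
```

Inference then runs this process in reverse: given X, recover the posterior over the hidden edges.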
13
Model and inference
Joint:
  P(X, E^1, E^2, ..., E^H) = ∏_h P(E^h) ∏_{i>j} P(x_{i,j} | e^1_{i,j}, e^2_{i,j}, ..., e^H_{i,j})
Posterior marginal:
  P(e^h_{i,j} | X) = P(e^h_{i,j}, X) / P(X)
Probability of evidence (generally intractable sums):
  P(X) = Σ_{E^1} Σ_{E^2} ... Σ_{E^H} P(X, E^1, E^2, ..., E^H)
14
Three tricks for tractability
  • Degree-based graph priors
  • Sum-product approximate inference
  • Dynamic programming trick

15
Degree-based graph priors
  P(E^h) = ∏_i f^h(d^h_i) / Z
where d^h_i = Σ_j e^h_{i,j} is the degree of vertex i and f^h is the degree potential for graph h.
  • Captures real-world network structure
  • Admits a nice sum-product (loopy belief propagation) algorithm
  • Introduces a dummy degree variable, d
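The prior above can be evaluated up to the normalizer Z, which cancels when comparing candidate graphs. A minimal sketch, assuming a scale-free potential f(k) = k^(-p) and a hypothetical floor f(0) for isolated vertices (neither specified in the slides):

```python
import math

def degree_prior_logp(adj, p=2.5, f0=1e-3):
    """Unnormalized log P(E) = sum_i log f(d_i) for a degree-based prior.

    adj: symmetric 0/1 adjacency matrix (list of lists).
    Scale-free potential f(k) = k^(-p) for k >= 1; f(0) = f0 is an
    assumed floor for isolated vertices.  The normalizer Z is omitted:
    it cancels when comparing graphs under the same prior.
    """
    logp = 0.0
    for row in adj:
        d = sum(row)                      # d_i = sum_j e_ij
        logp += math.log(f0) if d == 0 else -p * math.log(d)
    return logp
```

For example, a 4-vertex star (degrees 3, 1, 1, 1) scores log f(3) = -p·log 3, since f(1) contributes nothing.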
16
Two types of random graphs
Exponential
Scale-free
Jeong et al, Nature 2000
17
Random graph degree distributions
  Scale-free: f(k) = C k^(-p), p > 1
  Exponential: Poisson(⟨k⟩)
Jeong et al, Nature 2000
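The two degree distributions above can be compared directly; a small sketch, where the truncation bound kmax is an assumption standing in for the exact power-law normalizer (the Riemann zeta function ζ(p)):

```python
import math

def poisson_pmf(k, mean):
    """Exponential (Erdos-Renyi-like) degree distribution: Poisson(<k>)."""
    return math.exp(-mean) * mean**k / math.factorial(k)

def powerlaw_pmf(k, p=2.5, kmax=10000):
    """Scale-free degree distribution f(k) = C * k^(-p), p > 1.

    C is fixed by normalizing over k = 1..kmax, a truncation of the
    exact normalizer 1/zeta(p).
    """
    C = 1.0 / sum(j**-p for j in range(1, kmax + 1))
    return C * k**-p
```

The heavy tail is what matters for real networks: at k = 20 with mean degree 2, the power law assigns roughly ten orders of magnitude more probability than the Poisson, so hubs are expected rather than astronomically unlikely.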
18
Other real-world structure
  • Small-worldness
  • Degree correlations
  • Clustering

Google "Mark Newman Michigan" for more info
19
Factor graph for denoising
[Figure: factor graph with edge variables e_{1,2}, e_{1,3}, e_{1,4}, e_{2,3}, e_{2,4}, e_{3,4}, their observations x_{1,2}, ..., x_{3,4}, and degree variables d_1, ..., d_4]
20
Factor graph for denoising
[Same factor graph, adding degree potentials f(d) attached to the degree variables]
21
Factor graph for denoising
[Same factor graph, adding indicator functions I(d_i, Σ_j e_{i,j}) that constrain each degree variable to the sum of its incident edge variables]
22
Factor graph for denoising
[Same factor graph, adding likelihood functions P(x|e) that connect each observation x_{i,j} to its edge variable e_{i,j}]
23
Factor graph for denoising
[Complete factor graph: degree potentials f(d), indicator constraints I(d_i, Σ_j e_{i,j}), and likelihood functions P(x|e)]
24
Sum-product approximate inference
  • Two types of binary messages:
  • edge variables → constraint nodes
  • constraint nodes → edge variables

25
Calculating edge → constraint messages
For e = 0 or 1, the message from edge e_{i,j} to degree constraint I_j is
  m_{e_{i,j}→I_j}(e) = m_{I_i→e_{i,j}}(e) · m_{x_{i,j}→e_{i,j}}(e)
where m_{x_{i,j}→e_{i,j}}(e) = P(x_{i,j} | e_{i,j} = e) is the likelihood message and m_{I_i→e_{i,j}} is the constraint → edge message from the edge's other endpoint.
26
Calculating constraint → edge messages
The message from degree constraint I_j to edge e_{i,j} is
  m_{I_j→e_{i,j}}(e) = Σ_d f(d) Σ_{e_1, ..., e_N : e_i = e, Σ_k e_k = d} ∏_{k≠i} m_{e_{k,j}→I_j}(e_k)
where f(d) is the degree prior. An intractable sum? No: use dynamic programming.
27
Dynamic programming solution
Introduce partial sums s_1, s_2, ..., s_N over the edge variables e_{1,j}, e_{2,j}, ..., e_{N,j} attached to constraint I_j, with d_j = s_N. Summing over edge configurations directly takes O(2^N) time; passing messages along the chain of partial sums takes O(N^2) time.
28
Dynamic programming solution
Each partial sum obeys the constraint s_i = s_{i-1} + e_{i,j}, so the chain s_1, ..., s_N deterministically accumulates the degree d_j = s_N. Message passing on this chain replaces the O(2^N) brute-force sum with an O(N^2) computation.
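The partial-sum trick can be sketched as follows: the DP accumulates the distribution over Σ_k e_k one edge at a time in O(N^2), and a brute-force O(2^N) enumeration verifies it. Function names are illustrative:

```python
from itertools import product

def degree_distribution_dp(messages):
    """P(sum_k e_k = d) for binary edges e_k with independent incoming
    messages m_k = (m_k(0), m_k(1)), via the partial-sum trick.

    After processing k edges, s[d] holds the summed weight of all
    configurations of those edges with partial sum d.
    O(N^2) time instead of O(2^N).
    """
    s = [1.0]
    for m0, m1 in messages:
        new = [0.0] * (len(s) + 1)
        for d, w in enumerate(s):
            new[d] += w * m0      # edge absent: partial sum unchanged
            new[d + 1] += w * m1  # edge present: partial sum + 1
        s = new
    return s

def degree_distribution_brute(messages):
    """O(2^N) check: enumerate all edge configurations explicitly."""
    N = len(messages)
    s = [0.0] * (N + 1)
    for cfg in product((0, 1), repeat=N):
        w = 1.0
        for (m0, m1), e in zip(messages, cfg):
            w *= m1 if e else m0
        s[sum(cfg)] += w
    return s
```

The constraint → edge message then follows by running this DP over the other N-1 edges' messages and weighting the resulting degree distribution by the potential f(d + e).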
29
Inference for untangling
  • Message passing is the same as for denoising,
    except the likelihood message must be recalculated.
  • The likelihood message now incorporates edge
    information from the other hidden graphs.

30
Factor graph for untangling
[Figure: two copies of the denoising factor graph — Graph 1 with edge variables e^1_{i,j}, degree variables d^1_i, and potential f^1(d); Graph 2 with e^2_{i,j}, d^2_i, and f^2(d) — joined through the shared observations x_{i,j} via likelihoods P(x | e^1, e^2), with indicator constraints I(d, Σe) on each side]
31
Protein-protein interaction network denoising
Von Mering et al (2002) dataset
  • Eight PPI networks consisting of
  • Low quality direct evidence (high-throughput)
  • Indirect evidence
  • Gold standard
  • A small set of confirmed interactions

32
Empirical degree distributions
33
Methods
  • Split the 6k ORFs into training and test sets
  • On training set
  • Fit degree priors to both true graph and
    false graph.
  • Construct likelihood function out of all
    observations using Naïve Bayes
  • Every observed interaction must be placed in
    exactly one of the two hidden graphs.
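The Naïve Bayes step above can be sketched as a per-edge log-likelihood ratio. The evidence-source names and their (detection rate, false-positive rate) parameters below are hypothetical placeholders for values fit on the training split:

```python
import math

def naive_bayes_log_likelihood_ratio(observations, rates):
    """Combine evidence sources for one candidate edge under Naive Bayes.

    observations: dict source -> 0/1 (did the source report the edge?).
    rates: dict source -> (P(obs=1 | true edge), P(obs=1 | no edge)),
    hypothetical per-source detection and false-positive rates fit on
    the training split.  Returns log P(obs | e=1) - log P(obs | e=0);
    treating sources as independent given e is the Naive Bayes step.
    """
    llr = 0.0
    for src, x in observations.items():
        tp, fp = rates[src]
        p1 = tp if x else 1.0 - tp   # P(x | edge present)
        p0 = fp if x else 1.0 - fp   # P(x | edge absent)
        llr += math.log(p1) - math.log(p0)
    return llr
```

This ratio supplies the likelihood message P(x | e) at each edge of the factor graph; the degree priors then decide which hidden graph each observed interaction belongs to.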

34
Results
[Figure: results comparing untangling to the baseline]
35
Summary
  • Generative model for observed graphs, composed of
    many hidden graphs
  • Sum-product approximate inference algorithm for
    degree-based priors
  • Application to protein-protein interaction
    network noise removal