Gene and Protein Networks I Wednesday, April 11 2006 - PowerPoint PPT Presentation

Provided by: debrago

Transcript and Presenter's Notes


1
Gene and Protein Networks I
Wednesday, April 11, 2006
  • CSCI 4830 Algorithms for Molecular Biology
  • Debra Goldberg

2
Outline
  1. Introduction
  2. Network models
  3. Implications from topology
  4. Confidence assessment, edge prediction

3
Outline
  1. Introduction
  2. Network models
  3. Implications from topology
  4. Confidence assessment, edge prediction

4
What is a network?
  • A collection of objects (nodes, vertices)
  • Binary relationships (edges)
  • May be directed
  • Also called a graph

5
Networks are everywhere
6
Social networks
Nodes: People
Edges: Friendship
from www.liberality.org
7
Sexual networks
Nodes: People
Edges: Romantic and sexual relations
8
Transportation networks
Nodes: Locations
Edges: Roads
9
Power grids
Nodes: Power stations
Edges: High-voltage transmission lines
10
Airline routes
Nodes: Airports
Edges: Flights
11
Internet
Nodes: MBone routers
Edges: Physical connection
12
World-Wide-Web
Nodes: Web documents
Edges: Hyperlinks
13
Gene and protein networks
14
Metabolic networks
Nodes: Metabolites
Edges: Biochemical reaction (enzyme)
from web.indstate.edu
15
Protein interaction networks
Nodes: Proteins
Edges: Observed interaction
from www.embl.de
  • Gene function predicted

16
Gene regulatory networks
Nodes: Genes or gene products
Edges: Regulation of expression
  • Inferred from error-prone gene expression data

17
Signaling networks
Nodes: Molecules (e.g., proteins or neurotransmitters)
Edges: Activation or deactivation
from pharyngula.org
18
Signaling networks
Nodes: Molecules (e.g., proteins or neurotransmitters)
Edges: Activation or deactivation
from www.life.uiuc.edu
19
Synthetic sick or lethal (SSL)
20
SSL networks
  • Gene function, drug targets predicted

21
Other biological networks
  • Coexpression
    • Nodes: genes
    • Edges: transcribed at same times, conditions
  • Gene knockout / knockdown
    • Nodes: genes
    • Edges: similar phenotype (defects) when suppressed

22
What they really look like
23
We need models!
24
Outline
  1. Introduction
  2. Network models
  3. Implications from topology
  4. Confidence assessment, edge prediction

25
Traditional graph modeling
from GD2002
Random Regular
26
Introduce small-world networks
27
Small-world Networks
  • Six degrees of separation
  • 100 to 1000 friends each
  • Six steps: 10^12 to 10^18
  • But we live in communities

28
Small-world measures
  • Typical separation between two vertices
    • Measured by characteristic path length
  • Cliquishness of a typical neighborhood
    • Measured by clustering coefficient
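
A minimal Python sketch of the two measures, assuming an undirected graph stored as a dict of neighbor sets. The function names and the toy graph are illustrative only and do not come from the slides.

```python
from collections import deque


def characteristic_path_length(adj):
    """Mean shortest-path length over all connected ordered node pairs."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest hop counts from `source`.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0


def clustering_coefficient(adj):
    """Average, over nodes, of the fraction of neighbor pairs that are linked."""
    coeffs = []
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            coeffs.append(0.0)  # convention assumed here for degree < 2
            continue
        links = sum(1 for a in nbrs for b in nbrs if a < b and b in adj[a])
        coeffs.append(2.0 * links / (k * (k - 1)))
    return sum(coeffs) / len(coeffs)


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
path_length = characteristic_path_length(toy)
cliquishness = clustering_coefficient(toy)
```

Treating nodes of degree below 2 as having coefficient 0 is one common convention; others exclude such nodes from the average.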

29
Watts-Strogatz small-world model
30
Measures of the W-S model
  • Path length drops faster than cliquishness
  • Wide range of p has both small-world properties
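
A rough sketch of the rewiring idea behind the model, assuming the usual construction: start from a ring lattice and rewire each lattice edge with probability p. The function name, parameters, and the choice to forbid self-loops and duplicate edges are assumptions of this sketch.

```python
import random


def watts_strogatz(n, k, p, seed=None):
    """Ring lattice of n nodes, each joined to its k nearest neighbors (k even);
    each lattice edge is rewired to a random new endpoint with probability p."""
    rng = random.Random(seed)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(1, k // 2 + 1):
            adj[i].add((i + j) % n)
            adj[(i + j) % n].add(i)
    for j in range(1, k // 2 + 1):
        for i in range(n):
            old = (i + j) % n
            if rng.random() >= p or old not in adj[i]:
                continue
            # Candidate endpoints: no self-loops and no duplicate edges.
            candidates = [t for t in range(n) if t != i and t not in adj[i]]
            if not candidates:
                continue
            new = rng.choice(candidates)
            adj[i].discard(old)
            adj[old].discard(i)
            adj[i].add(new)
            adj[new].add(i)
    return adj


regular = watts_strogatz(20, 4, 0.0)          # p = 0: unchanged ring lattice
small_world = watts_strogatz(20, 4, 0.3, seed=1)
```

At p = 0 the graph stays regular, and as p approaches 1 it approaches a random graph; the number of edges is the same throughout.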

31
Small-world measures of various graph types
Graph type          Cliquishness   Characteristic path length
Regular graph       High           Long
Random graph        Low            Short
Small-world graph   High           Short
32
Another network property: Degree distribution P(k)
  • The degree (notation k) of a node is the number
    of its neighbors
  • The degree distribution is a histogram showing
    the frequency of nodes having each degree
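
As a small illustration, the degree distribution can be computed from an adjacency structure as follows. The dict-of-sets representation and the toy graph are assumptions of this sketch.

```python
from collections import Counter


def degree_distribution(adj):
    """Map each degree k to the fraction of nodes having that degree."""
    degrees = [len(nbrs) for nbrs in adj.values()]
    counts = Counter(degrees)
    n = len(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


# Toy graph: a triangle (0, 1, 2) with a pendant node 3 attached to node 2.
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
dist = degree_distribution(toy)
```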

33
Degree distribution of E-R random networks
Erdős-Rényi random graphs
  • Binomial degree distribution, well-approximated
    by a Poisson

Network figures from Strogatz, Nature 2001
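
To see how close the Poisson approximation is, the sketch below compares the two probability mass functions for a random graph in which each pair of nodes is linked independently with probability p. The values n = 1000 and p = 0.005 are illustrative assumptions, not values from the slides.

```python
import math


def binomial_degree_pmf(k, n, p):
    """P(degree = k): each of the other n - 1 nodes is a neighbor with probability p."""
    return math.comb(n - 1, k) * p**k * (1 - p) ** (n - 1 - k)


def poisson_pmf(k, lam):
    """Poisson probability of k with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


n, p = 1000, 0.005          # assumed example values
lam = (n - 1) * p           # mean degree
max_gap = max(
    abs(binomial_degree_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(30)
)
```

The approximation improves as n grows with the mean degree held fixed.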
34
Degree distribution of many real-world networks
  • Scale-free networks
  • Degree distribution follows a power law
  • $P(k = x) \propto x^{-\gamma}$

35
Model for scale-free networks
  • Growth and preferential attachment
  • New node has edge to existing node v with
    probability proportional to degree of v
  • Biologically plausible?
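
A minimal sketch of growth with preferential attachment, under assumptions not stated on the slide: the graph starts from a small clique, and each new node adds m edges to distinct existing nodes. The function name and parameters are hypothetical.

```python
import random


def preferential_attachment(n, m=2, seed=None):
    """Grow a graph to n nodes; each new node links to m existing nodes
    chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Start from a clique of m + 1 nodes so every node has nonzero degree.
    adj = {i: set(range(m + 1)) - {i} for i in range(m + 1)}
    # Each node appears once per incident edge, so a uniform draw from
    # this list is a degree-proportional draw over nodes.
    endpoints = [i for i in adj for _ in adj[i]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:  # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj


grown = preferential_attachment(200, m=2, seed=0)
```

Requiring distinct targets slightly perturbs exact degree proportionality, which is a simplification made here to keep the graph simple.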

36
Another scale-free network model
  • Duplication and divergence
  • New nodes are copies of existing nodes
  • Same neighbors, then some gain/loss

Solé, Pastor-Satorras, et al. (2002)
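
The duplication and divergence idea can be sketched as below. This is a simplified illustration, not the exact model of the cited paper: the seed graph, the loss probability p_loss, and the gain probability p_gain are all assumptions.

```python
import random


def duplication_divergence(n, p_loss=0.4, p_gain=0.01, seed=None):
    """Grow a graph by copying a random existing node, then perturbing the copy's edges."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed graph: a single edge
    for new in range(2, n):
        parent = rng.randrange(new)
        # Duplication: the copy inherits the parent's neighbors.
        # Divergence (loss): each inherited edge is dropped with probability p_loss.
        kept = {u for u in adj[parent] if rng.random() >= p_loss}
        # Divergence (gain): each other existing node is linked with probability p_gain.
        gained = {
            u for u in range(new)
            if u not in adj[parent] and rng.random() < p_gain
        }
        adj[new] = kept | gained
        for u in adj[new]:
            adj[u].add(new)
    return adj


network = duplication_divergence(100, seed=0)
```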
37
Other degree distributions
Amaral, Scala, et al., PNAS (2000)
38
Hierarchical Networks
Ravasz, et al., Science 2002
39
Properties of hierarchical networks



40
Clustering coefficient C of 43 metabolic networks
  • Independent of network size N

Ravasz, et al., Science 2002
41
Scaling of the clustering coefficient C(k)
  • Metabolic networks

Ravasz, et al., Science 2002
42
Summary of network models
Random: Poisson degree distribution
Small-world: high CC, short path lengths
Scale-free: power-law degree distribution
Hierarchical: high CC, modular, power-law degree distribution
43
Many real-world networks are small-world,
scale-free
  • World-wide-web
  • Collaboration of film actors (Kevin Bacon)
  • Mathematical collaborations (Erdős number)
  • Power grid of US
  • Syntactic networks of English
  • Neural network of C. elegans
  • Metabolic networks
  • Protein-protein interaction networks

44
So What?
45
There is information in a gene's position in the
network
  • We can use this to predict
    • Relationships
      • Interactions
      • Regulatory relationships
    • Protein function
      • Process
      • Complex / molecular machine

46
Outline
  1. Introduction
  2. Network models
  3. Implications from topology
  4. Confidence assessment, edge prediction

47
SSL hubs might be good cancer drug targets
[Figure: cancer cells with random mutations versus a normal cell; outcome labels: Dead, Alive, Dead]
(Tong et al, Science, 2004)
48
Lethality
  • Hubs are more likely to be essential

Jeong, et al., Nature 2001
49
Degree anti-correlation
  • Few edges directly between hubs
  • Edges between hubs and low-degree genes are
    favored

Maslov and Sneppen, Science 2002
50
Outline
  1. Introduction
  2. Network models
  3. Implications from topology
  4. Confidence assessment, edge prediction

51
Confidence assessment
  • Traditionally, biological networks determined
    individually
    • High confidence
    • Slow
  • New methods look at entire organism
    • Lower confidence (~50% false positives)
  • Inferences made based on this data

52
Confidence assessment
  • Can use topology to assess confidence if true
    edges and false edges have different network
    properties
  • Assess how well each edge fits topology of true
    network
  • Can also predict unknown relations

Goldberg and Roth, PNAS 2003
53
Use clustering coefficient, a local property
  • Number of triangles: $|N(v) \cap N(w)|$
  • Normalization factor?

$N(x)$: the neighborhood of node x
54
Mutual clustering coefficient
  • Jaccard index: $\frac{|N(v) \cap N(w)|}{|N(v) \cup N(w)|}$
  • Meet / Min: $\frac{|N(v) \cap N(w)|}{\min(|N(v)|, |N(w)|)}$
  • Geometric: $\frac{|N(v) \cap N(w)|^2}{|N(v)| \cdot |N(w)|}$
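
The three set-based scores can be sketched in Python as follows, taking the two neighborhoods as sets. Returning 0 for empty neighborhoods is an assumption of this sketch, and the node labels are made up.

```python
def jaccard(nv, nw):
    """Shared neighbors divided by the size of the combined neighborhood."""
    union = nv | nw
    return len(nv & nw) / len(union) if union else 0.0


def meet_min(nv, nw):
    """Shared neighbors divided by the size of the smaller neighborhood."""
    smaller = min(len(nv), len(nw))
    return len(nv & nw) / smaller if smaller else 0.0


def geometric(nv, nw):
    """Squared shared-neighbor count divided by the product of neighborhood sizes."""
    denom = len(nv) * len(nw)
    return len(nv & nw) ** 2 / denom if denom else 0.0


# Hypothetical neighborhoods of two nodes v and w.
nv = {"a", "b", "c", "d"}
nw = {"c", "d", "e"}
scores = (jaccard(nv, nw), meet_min(nv, nw), geometric(nv, nw))
```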
55
Mutual clustering coefficient
  • Hypergeometric: $-\log P(\text{intersection at least as large by chance})$
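
One way to compute such a score is as a hypergeometric tail probability, sketched below. The assumptions here are: `total` is the number of nodes a neighbor could be drawn from, the logarithm is natural (the slide does not give a base), and the function name is hypothetical.

```python
import math


def hypergeometric_score(nv, nw, total):
    """-log of the chance that two random neighborhoods of these sizes,
    drawn from `total` nodes, overlap at least as much as observed."""
    a, b = len(nv), len(nw)
    observed = len(nv & nw)
    # Sum the hypergeometric probabilities for every overlap size >= observed.
    tail = sum(
        math.comb(a, i) * math.comb(total - a, b - i)
        for i in range(observed, min(a, b) + 1)
    ) / math.comb(total, b)
    return -math.log(tail)


# Hypothetical example: two neighborhoods of size 3 sharing 2 nodes, out of 10 nodes.
score = hypergeometric_score({1, 2, 3}, {2, 3, 4}, total=10)
```

A larger score means the observed overlap is less likely to arise by chance.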
56
Prediction
  • A v-w edge would have a high clustering
    coefficient

57
Interaction generality
  • Confidence measure for edge based on topology
    around neighbors.

Saito, Suzuki, and Hayashizaki 2002, 2003
58
Confidence assessment
  • Integrate experimental details with local
    topology
    • Degree
    • Clustering coefficient
    • Degree of neighbors
    • Etc.
  • Used logistic regression

Bader, et al., Nature Biotechnology 2003
59
The synthetic lethal network has many triangles
Xiaofeng Xin, Boone Lab
60
2-hop predictors for SSL
  • SSL – SSL (S-S)
  • Homology – SSL (H-S)
  • Co-expressed – SSL (X-S)
  • Physical interaction – SSL (P-S)
  • 2 physical interactions (P-P)

Wong, et al., PNAS 2004
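
A 2-hop predictor of this kind can be sketched as a search for intermediate genes. The sketch assumes each relationship type is stored as a dict of neighbor sets; the gene names and the function name are hypothetical.

```python
def two_hop_support(first, second, v, w):
    """Intermediates u with a `first`-type edge v-u and a `second`-type edge u-w."""
    return (first.get(v, set()) & second.get(w, set())) - {v, w}


# Hypothetical toy data: physical interactions and SSL relationships.
physical = {"A": {"C"}, "C": {"A"}}
ssl = {"C": {"B", "D"}, "B": {"C"}, "D": {"C"}}

# P-S: A physically interacts with C, and C is SSL with B,
# so (A, B) becomes a candidate SSL pair.
ps_support = two_hop_support(physical, ssl, "A", "B")

# S-S: B and D are both SSL with C, so (B, D) becomes a candidate SSL pair.
ss_support = two_hop_support(ssl, ssl, "B", "D")
```

The number of supporting intermediates could then serve as a feature for ranking candidate pairs.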
61
Multi-color motifs
Zhang, et al., Journal of Biology 2005