Title: Gene and Protein Networks I Wednesday, April 11 2006
1Gene and Protein Networks I
- Wednesday, April 11, 2006
- CSCI 4830 Algorithms for Molecular Biology
- Debra Goldberg
2Outline
- Introduction
- Network models
- Implications from topology
- Confidence assessment, edge prediction
4What is a network?
- A collection of objects (nodes, vertices)
- Binary relationships (edges)
- May be directed
- Also called a graph
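A minimal sketch of the definition above, assuming a graph is stored as a dictionary mapping each node to the set of its neighbors. The function name `build_graph` and the toy node labels are illustrative, not from the lecture.

```python
from collections import defaultdict

def build_graph(edges, directed=False):
    """Return an adjacency-set dictionary for a list of (u, v) edges."""
    adj = defaultdict(set)
    for u, v in edges:
        adj[u].add(v)
        adj[v]  # touch v so that a node with no outgoing edges still appears
        if not directed:
            adj[v].add(u)
    return dict(adj)

# Hypothetical toy data: four nodes joined in a chain.
edges = [("A", "B"), ("B", "C"), ("C", "D")]
undirected = build_graph(edges)
directed = build_graph(edges, directed=True)
```

In the undirected version every edge is stored in both directions; in the directed version only the source node lists the target.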
5Networks are everywhere
6Social networks
- Nodes: People
- Edges: Friendship
from www.liberality.org
7Sexual networks
- Nodes: People
- Edges: Romantic and sexual relations
8Transportation networks
- Nodes: Locations
- Edges: Roads
9Power grids
- Nodes: Power stations
- Edges: High-voltage transmission lines
10Airline routes
- Nodes: Airports
- Edges: Flights
11Internet
- Nodes: MBone routers
- Edges: Physical connection
12World-Wide-Web
- Nodes: Web documents
- Edges: Hyperlinks
13Gene and protein networks
14Metabolic networks
- Nodes: Metabolites
- Edges: Biochemical reaction (enzyme)
from web.indstate.edu
15Protein interaction networks
- Nodes: Proteins
- Edges: Observed interaction
from www.embl.de
16Gene regulatory networks
- Nodes: Genes or gene products
- Edges: Regulation of expression
- Inferred from error-prone gene expression data
17Signaling networks
- Nodes: Molecules (e.g., proteins or neurotransmitters)
- Edges: Activation or deactivation
from pharyngula.org
18Signaling networks
- Nodes: Molecules (e.g., proteins or neurotransmitters)
- Edges: Activation or deactivation
from www.life.uiuc.edu
19Synthetic sick or lethal (SSL)
20SSL networks
- Gene function, drug targets predicted
21Other biological networks
- Coexpression
  - Nodes: genes
  - Edges: transcribed at same times, conditions
- Gene knockout / knockdown
  - Nodes: genes
  - Edges: similar phenotype (defects) when suppressed
22What they really look like
23We need models!
24Outline
- Introduction
- Network models
- Implications from topology
- Confidence assessment, edge prediction
25Traditional graph modeling
from GD2002
- Panels: Random graph, Regular graph
26Introduce small-world networks
27Small-world Networks
- Six degrees of separation
- 100 - 1000 friends each
- Six steps: 10^12 - 10^18 (i.e., 100^6 to 1000^6)
- But: we live in communities
28Small-world measures
- Typical separation between two vertices
- Measured by characteristic path length
- Cliquishness of a typical neighborhood
- Measured by clustering coefficient
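The two measures above can be sketched as follows. This is one common reading, not necessarily the exact definitions used in the lecture: path length is averaged over all connected ordered pairs using breadth-first search, and the clustering coefficient of a node is the fraction of its neighbor pairs that are themselves linked. The function names and the toy graph are illustrative.

```python
from collections import deque

def path_lengths_from(adj, source):
    """Breadth-first search: hop distance from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist

def characteristic_path_length(adj):
    """Mean shortest-path length over all ordered pairs of connected nodes."""
    total, pairs = 0, 0
    for s in adj:
        for t, d in path_lengths_from(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs

def clustering_coefficient(adj, v):
    """Fraction of pairs of v's neighbors that are linked to each other."""
    nbrs = list(adj[v])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return links / (k * (k - 1) / 2)

def mean_clustering(adj):
    """Average of the per-node clustering coefficients."""
    return sum(clustering_coefficient(adj, v) for v in adj) / len(adj)

# Toy graph: a triangle A-B-C with a tail C-D.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
cpl = characteristic_path_length(adj)
cc = mean_clustering(adj)
```

On this toy graph the triangle gives A and B a clustering coefficient of 1, while the tail lowers it for C and D.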
29Watts-Strogatz small-world model
30Measures of the W-S model
- Path length drops faster than cliquishness
- Wide range of p has both small-world properties
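A rough sketch of the Watts-Strogatz construction, assuming the usual formulation: start from a ring in which each node is linked to its k nearest neighbors, then rewire each lattice edge with probability p. The function name, the parameter names, and the rule of never creating self-loops or duplicate edges are choices made for this sketch.

```python
import random

def watts_strogatz(n, k, p, seed=None):
    """Ring of n nodes, each linked to its k nearest neighbors (k even, k < n),
    after which each lattice edge is rewired with probability p."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k // 2 + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
    for step in range(1, k // 2 + 1):
        for v in range(n):
            w = (v + step) % n
            if w in adj[v] and rng.random() < p:
                # New endpoint must not be v itself or a current neighbor of v.
                choices = [u for u in range(n) if u != v and u not in adj[v]]
                if choices:
                    new = rng.choice(choices)
                    adj[v].discard(w)
                    adj[w].discard(v)
                    adj[v].add(new)
                    adj[new].add(v)
    return adj

regular = watts_strogatz(20, 4, 0.0, seed=1)     # p = 0: the regular ring
rewired = watts_strogatz(20, 4, 0.3, seed=1)     # intermediate p
```

With p = 0 the regular ring is returned unchanged; as p approaches 1 the graph approaches a random graph with the same number of edges.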
31Small-world measures of various graph types
Graph type        | Cliquishness | Characteristic path length
Regular graph     | High         | Long
Random graph      | Low          | Short
Small-world graph | High         | Short
32Another network property: Degree distribution P(k)
- The degree (notation: k) of a node is the number of its neighbors
- The degree distribution is a histogram showing the frequency of nodes having each degree
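As a small illustration of the definition above, the degree distribution can be tabulated directly from an adjacency dictionary. The function name and the toy graph are illustrative.

```python
from collections import Counter

def degree_distribution(adj):
    """Return {k: fraction of nodes having degree k}, ordered by k."""
    counts = Counter(len(nbrs) for nbrs in adj.values())
    n = len(adj)
    return {k: c / n for k, c in sorted(counts.items())}

# Toy graph: a triangle A-B-C with a tail C-D.
adj = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
dist = degree_distribution(adj)
print(dist)  # {1: 0.25, 2: 0.5, 3: 0.25}
```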
33Degree distribution of E-R random networks
Erdős-Rényi random graphs
- Binomial degree distribution, well-approximated
by a Poisson
Network figures from Strogatz, Nature 2001
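A quick numerical check of the Poisson approximation, assuming the G(n, p) form of the model, in which each of the n - 1 possible edges at a node is present independently with probability p. The values of n and p are hypothetical.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(degree = k) when each of n possible edges is present with probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """Poisson probability of k with mean lam."""
    return exp(-lam) * lam**k / factorial(k)

n, p = 1000, 0.004          # hypothetical network size and edge probability
lam = (n - 1) * p           # mean degree
max_gap = max(abs(binomial_pmf(k, n - 1, p) - poisson_pmf(k, lam))
              for k in range(20))
```

For a large, sparse graph such as this one, `max_gap` is small, which is why the two distributions are hard to tell apart on a plot.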
34Degree distribution of many real-world networks
- Scale-free networks
- Degree distribution follows a power law
- $P(k = x) \propto x^{-\gamma}$
35Model for scale-free networks
- Growth and preferential attachment
- New node has edge to existing node v with probability proportional to degree of v
- Biologically plausible?
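A minimal sketch of growth with preferential attachment. The slide gives only the attachment rule; the seed graph (a small clique), the number of edges m per new node, and the trick of sampling from a list of edge endpoints are assumptions of this sketch.

```python
import random

def preferential_attachment(n, m, seed=None):
    """Grow a graph to n nodes; each new node links to m distinct existing
    nodes, drawn with probability proportional to their current degree."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(m + 1)}
    # Seed graph (an assumption of this sketch): a clique on m + 1 nodes.
    for u in range(m + 1):
        for w in range(u + 1, m + 1):
            adj[u].add(w)
            adj[w].add(u)
    # Each node appears once per incident edge, so a uniform draw from this
    # list is a degree-proportional draw over nodes.
    endpoints = [v for v in adj for _ in adj[v]]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:   # redraw until m distinct targets are found
            targets.add(rng.choice(endpoints))
        adj[new] = set()
        for t in targets:
            adj[new].add(t)
            adj[t].add(new)
            endpoints.extend([new, t])
    return adj

g = preferential_attachment(200, 2, seed=0)
max_degree = max(len(nb) for nb in g.values())
```

Early nodes tend to accumulate many edges, which produces the hubs typical of scale-free networks.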
36Another scale-free network model
- Duplication and divergence
- New nodes are copies of existing nodes
- Same neighbors, then some gain/loss
Solé, Pastor-Satorras, et al. (2002)
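The duplication and divergence steps above can be sketched as follows. This is a simplified illustration, not the published model: the parameter names `p_loss` and `p_gain`, the two-node seed graph, and the uniform choice of the node to copy are all assumptions.

```python
import random

def duplication_divergence(n, p_loss, p_gain, seed=None):
    """Grow a graph by copying a random node, then perturbing the copy's edges."""
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}          # seed graph (assumption): a single edge
    while len(adj) < n:
        parent = rng.choice(list(adj))
        new = len(adj)
        # Duplication: the copy starts with the same neighbors as its parent.
        inherited = set(adj[parent])
        # Divergence, loss: each inherited edge is dropped with probability p_loss.
        kept = {w for w in inherited if rng.random() >= p_loss}
        # Divergence, gain: each other existing node is linked with probability p_gain.
        gained = {w for w in adj if w not in inherited and rng.random() < p_gain}
        adj[new] = set()
        for w in kept | gained:
            adj[new].add(w)
            adj[w].add(new)
    return adj

exact_copy = duplication_divergence(3, 0.0, 0.0, seed=0)
network = duplication_divergence(50, 0.4, 0.02, seed=1)
```

With both probabilities set to zero, each new node is an exact copy of its parent's neighborhood.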
37Other degree distributions
Amaral, Scala, et al., PNAS (2000)
38Hierarchical Networks
Ravasz, et al., Science 2002
39Properties of hierarchical networks
40Clustering coefficient C of 43 metabolic networks
Ravasz, et al., Science 2002
41Scaling of the clustering coefficient C(k)
Ravasz, et al., Science 2002
42Summary of network models
- Random: Poisson degree distribution
- Small-world: high CC, short path lengths
- Scale-free: power-law degree distribution
- Hierarchical: high CC, modular, power-law degree distribution
43Many real-world networks are small-world,
scale-free
- World-wide-web
- Collaboration of film actors (Kevin Bacon)
- Mathematical collaborations (Erdős number)
- Power grid of US
- Syntactic networks of English
- Neural network of C. elegans
- Metabolic networks
- Protein-protein interaction networks
44So What?
45There is information in a gene's position in the network
- We can use this to predict
  - Relationships
    - Interactions
    - Regulatory relationships
  - Protein function
    - Process
    - Complex / molecular machine
46Outline
- Introduction
- Network models
- Implications from topology
- Confidence assessment, edge prediction
47SSL hubs might be good cancer drug targets
- Figure: normal cell and cancer cells w/ random mutations, with outcomes labeled Dead, Alive, Dead
(Tong et al, Science, 2004)
48Lethality
- Hubs are more likely to be essential
Jeong, et al., Nature 2001
49Degree anti-correlation
- Few edges directly between hubs
- Edges between hubs and low-degree genes are
favored
Maslov and Sneppen, Science 2002
50Outline
- Introduction
- Network models
- Implications from topology
- Confidence assessment, edge prediction
51Confidence assessment
- Traditionally, biological networks determined individually
  - High confidence
  - Slow
- New methods look at entire organism
  - Lower confidence (roughly 50% false positives)
- Inferences made based on this data
52Confidence assessment
- Can use topology to assess confidence if true edges and false edges have different network properties
- Assess how well each edge fits topology of true network
- Can also predict unknown relations
Goldberg and Roth, PNAS 2003
53Use clustering coefficient, a local property
- Number of triangles = $|N(v) \cap N(w)|$
- Normalization factor?
N(x) = the neighborhood of node x
54Mutual clustering coefficient
- Jaccard Index: $\frac{|N(v) \cap N(w)|}{|N(v) \cup N(w)|}$
- Meet / Min: $\frac{|N(v) \cap N(w)|}{\min(|N(v)|, |N(w)|)}$
- Geometric: $\frac{|N(v) \cap N(w)|^2}{|N(v)| \cdot |N(w)|}$
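The three ratio forms of the mutual clustering coefficient translate directly into set operations. In this sketch the neighborhoods are passed as Python sets; the function names and the toy neighborhoods are illustrative, and returning 0 for an empty neighborhood is a convention chosen here.

```python
def jaccard(nv, nw):
    """Shared neighbors divided by the size of the union of neighborhoods."""
    union = nv | nw
    return len(nv & nw) / len(union) if union else 0.0

def meet_min(nv, nw):
    """Shared neighbors divided by the size of the smaller neighborhood."""
    smaller = min(len(nv), len(nw))
    return len(nv & nw) / smaller if smaller else 0.0

def geometric(nv, nw):
    """Squared shared-neighbor count divided by the product of the sizes."""
    product = len(nv) * len(nw)
    return len(nv & nw) ** 2 / product if product else 0.0

# Hypothetical neighborhoods of two nodes v and w; they share "c" and "d".
nv = {"a", "b", "c", "d"}
nw = {"c", "d", "e"}
scores = (jaccard(nv, nw), meet_min(nv, nw), geometric(nv, nw))
```

All three scores lie between 0 and 1 and differ only in how the shared-neighbor count is normalized.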
55Mutual clustering coefficient
- Hypergeometric: $-\log P(\text{intersection at least as large by chance})$
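A sketch of the hypergeometric score, assuming the chance model is that the two neighborhoods are drawn at random from all `total` nodes in the network. The function and parameter names are illustrative, and the natural logarithm is used here since the slide does not state a base.

```python
from math import comb, log

def hypergeometric_score(n_v, n_w, shared, total):
    """-log of P(overlap >= shared) for random neighborhoods of size n_v
    and n_w drawn from `total` nodes. Assumes shared <= min(n_v, n_w)."""
    upper = min(n_v, n_w)
    favorable = sum(comb(n_v, i) * comb(total - n_v, n_w - i)
                    for i in range(shared, upper + 1))
    tail = favorable / comb(total, n_w)
    return -log(tail)

# Hypothetical sizes: neighborhoods of 4 and 3 nodes in a 100-node network.
one_shared = hypergeometric_score(4, 3, 1, 100)
two_shared = hypergeometric_score(4, 3, 2, 100)
```

A larger overlap is less likely by chance, so it yields a higher score.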
56Prediction
- A v-w edge would have a high clustering coefficient
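One way to turn this idea into a predictor is to score every pair of nodes that is not yet linked and rank the pairs. The sketch below uses the Jaccard form as the score, which is a choice made for this example; the function name and the toy network are hypothetical.

```python
from itertools import combinations

def predict_edges(adj, top=3):
    """Rank non-adjacent node pairs by Jaccard overlap of their neighborhoods."""
    scored = []
    for v, w in combinations(sorted(adj), 2):
        if w in adj[v]:
            continue  # already linked, so nothing to predict
        union = adj[v] | adj[w]
        score = len(adj[v] & adj[w]) / len(union) if union else 0.0
        scored.append((score, v, w))
    # Highest score first; ties are broken alphabetically for repeatability.
    scored.sort(key=lambda t: (-t[0], t[1], t[2]))
    return scored[:top]

# Toy network: v and w are not linked but share both of their neighbors.
adj = {
    "v": {"a", "b"}, "w": {"a", "b"},
    "a": {"v", "w", "x"}, "b": {"v", "w", "y"},
    "x": {"a"}, "y": {"b"},
}
best = predict_edges(adj, top=1)
```

Here the unlinked pair v-w has identical neighborhoods, so it ranks first as a candidate edge.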
57Interaction generality
- Confidence measure for edge based on topology
around neighbors.
Saito, Suzuki, and Hayashizaki 2002,2003
58Confidence assessment
- Integrate experimental details with local topology
  - Degree
  - Clustering coefficient
  - Degree of neighbors
  - Etc.
- Used logistic regression
Bader, et al., Nature Biotechnology 2003
59The synthetic lethal network has many triangles
Xiaofeng Xin, Boone Lab
60 2-hop predictors for SSL
- SSL - SSL (S-S)
- Homology - SSL (H-S)
- Co-expressed - SSL (X-S)
- Physical interaction - SSL (P-S)
- 2 physical interactions (P-P)
Wong, et al., PNAS 2004
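A hedged sketch of how a 2-hop feature such as P-S might be computed, assuming each relation type is stored as its own undirected adjacency dictionary and that a 2-hop path means one edge of the first type followed by one edge of the second type through a shared intermediate gene. The function name and gene labels are hypothetical.

```python
def two_hop_pairs(first, second):
    """Pairs (u, w) joined by a path u -first- x -second- w through some x."""
    pairs = set()
    for x in set(first) & set(second):
        for u in first[x]:
            for w in second[x]:
                if u != w:
                    pairs.add(tuple(sorted((u, w))))
    return pairs

# Hypothetical data: g1 physically interacts with g2, and g2 is SSL with g3.
physical = {"g1": {"g2"}, "g2": {"g1"}}
ssl = {"g2": {"g3"}, "g3": {"g2"}}

p_s = two_hop_pairs(physical, ssl)   # physical interaction, then SSL
s_s = two_hop_pairs(ssl, ssl)        # two SSL edges in a row
```

Each returned pair is a candidate SSL edge that closes a triangle with the two known edges.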
61Multi-color motifs
Zhang, et al., Journal of Biology 2005