Prediction of Synthetic Genetic Interactions In Silico Using Protein Network Data

1
Prediction of Synthetic Genetic Interactions In
Silico Using Protein Network Data
  • Sri R. Paladugu
  • Keck Graduate Institute
  • July 07, 2007

2
Outline
  • What?
  • Prediction of synthetic sick or lethal (SSL) interactions
  • Why?
  • Goals of our study
  • How?
  • Computational Methods (SVMs)
  • Results and Conclusion

3
Protein Interaction Networks
Nodes represent proteins; edges represent observed interactions
4
Synthetic sick or lethal (SSL)
  • Extension of single gene dispensability
    Dispensability of one gene versus dispensability
    of a pair of genes.

Cells live (wild type)
Cells live
Cells live
Cells die or grow slowly
5
SSL Interaction Networks
Nodes represent non-essential genes; edges represent
synthetic lethality or synthetic slow growth
6
Scenarios giving rise to SSL interactions
Wong SL, Trends in Genetics. 21, 8 (2005)
7
SSL hubs might be good cancer targets
Cancer cells w/ random mutations
Normal cell
Alive
Dead
Dead
A. Kamb, J. Theor. Biol. 223, 205 (2003)
8
SSL Interactions and Chemical Genetics
  • SSL Interactions together with chemical genetics
    can help us predict the target pathways of magic
    bullets.

B. Stockwell, Nature Biotechnology. 22, 1 (2003)
9
Goals of our study
  • Study the predictive power of protein interaction
    networks for synthetic genetic interactions
    (SGIs) in S. cerevisiae using machine learning
    methods
  • Predict novel SGIs in yeast
  • Validate the method and carry out novel
    predictions in other organisms

10
How?
  • Experimental Methods
  • Synthetic Genetic Array (SGA) analysis
  • dSLAM
  • Computational Methods
  • Use machine learning analysis to predict
    synthetic lethal (SL) interactions
  • This has been done previously by Wong et al. (2004),
    who used a lot of functional information
    (e.g. common upstream regulator, correlated mRNA
    expression, same MIPS function) to predict SL
    interactions in Saccharomyces
  • Data for most of the predictor variables that
    were used in the above approach are not readily
    available for other species.

11
Can We Predict?
Protein Interaction Network in S. cerevisiae
Synthetic Lethal/Sick Interaction Network in S.
cerevisiae
http://www.visualcomplexity.com/vc/project_details.cfm?index=19&id=184&domain=Biology
12
Network Data
  • PIN data and the synthetic lethal/sick
    interactions for Saccharomyces were obtained from
    Reguly et al. (2006).
  • Non-synthetic-lethal/non-synthetic-sick
    interactions (negatives) were constructed by
    generating all pairwise combinations of SGA baits
    (227) with all other genes in yeast and then
    subtracting the lethal/sick interactions
    confirmed by SGA and dSLAM methods. Pairs where
    either member is not part of the PIN data were also
    removed.
  • The genes that were used in SGA analysis were
    obtained from Tong et al. (2001).

Ref 1: Reguly et al. Comprehensive curation and
analysis of global interaction networks in
Saccharomyces cerevisiae. Journal of Biology 5:11
(2006).
Ref 2: Tong et al. Systematic Genetic
Analysis with Ordered Arrays of Yeast Deletion
Mutants. Science 294, 2364-68 (2001).
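
The negative-set construction described on this slide can be sketched roughly as follows. This is only an illustration under stated assumptions: the function name build_negatives and the toy gene names are hypothetical, and the real inputs would be the 227 SGA baits, the yeast gene list, the confirmed SSL pairs, and the set of genes present in the PIN.

```python
from itertools import product

def build_negatives(baits, all_genes, ssl_pairs, pin_genes):
    """Return bait-gene pairs that are not known SSL pairs and lie fully in the PIN."""
    # Store pairs as frozensets so (a, b) and (b, a) count as the same pair.
    positives = {frozenset(p) for p in ssl_pairs}
    negatives = set()
    for bait, gene in product(baits, all_genes):
        if bait == gene:
            continue  # a gene is not paired with itself
        pair = frozenset((bait, gene))
        if pair in positives:
            continue  # subtract confirmed lethal/sick interactions
        if bait not in pin_genes or gene not in pin_genes:
            continue  # drop pairs with a member missing from the PIN data
        negatives.add(pair)
    return negatives

# Toy data (hypothetical gene names, not from the study)
baits = ["g1", "g2"]
all_genes = ["g1", "g2", "g3", "g4"]
ssl_pairs = [("g1", "g3")]
pin_genes = {"g1", "g2", "g3"}
negatives = build_negatives(baits, all_genes, ssl_pairs, pin_genes)
```

Storing pairs as unordered sets avoids counting the same gene pair twice when both members happen to be baits.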
13
Input properties for SGI prediction
  • For every pair of genes we have computed the
    following PIN properties, which were supplied as
    input to the SVM based classifier.
  • Degree
  • Clustering Coefficient
  • Betweenness Centrality
  • Closeness Centrality
  • Eigenvector Centrality
  • Information Centrality
  • Current-Flow Betweenness Centrality
  • Bridge Centrality
  • Stress Centrality
  • Shortest Distance
  • Mutual Neighborhood
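
A few of the simpler properties in this list can be computed with plain Python, as in the sketch below. It is not the code used in the study. The toy graph is made up, and "mutual neighborhood" is taken here to mean the number of shared neighbors, which is an assumption since the slides do not define it. The other centralities (betweenness, closeness, eigenvector, and so on) are omitted for brevity.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency map: node -> set of neighbors."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def degree(adj, node):
    return len(adj[node])

def clustering_coefficient(adj, node):
    """Fraction of pairs of neighbors of `node` that are themselves linked."""
    nbrs = list(adj[node])
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if nbrs[j] in adj[nbrs[i]])
    return 2.0 * links / (k * (k - 1))

def shortest_distance(adj, source, target):
    """Breadth-first search; returns None if target is unreachable."""
    seen = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        if node == target:
            return dist
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, dist + 1))
    return None

def mutual_neighborhood(adj, a, b):
    """Number of neighbors shared by a and b (assumed definition)."""
    return len(adj[a] & adj[b])

# Toy protein interaction network (hypothetical)
edges = [("A", "B"), ("A", "C"), ("B", "C"), ("C", "D")]
adj = build_adjacency(edges)
```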

14
Graph Properties
SSL Pairs
Non-SSL Pairs
15
Graph Properties
SSL Pairs
Non-SSL Pairs
16
Graph Properties
SSL Pairs
Non-SSL Pairs
17
Support Vector Machines
SVM Classifier
Input: graph properties of a pair of proteins P1, P2
  • Average of Betweenness Centralities of pair P1-P2
  • Average of Closeness Centralities of pair P1-P2
  • Absolute Difference between Betweenness
    Centralities of pair P1-P2
  • Absolute Difference between Closeness
    Centralities of pair P1-P2
  • Shortest Distance P1 - P2
  • Mutual Neighborhood P1 - P2
Output between 0 and 1 measures the propensity
for SL interaction
Figure labels: optimal hyperplane, support vectors
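
The per-pair inputs named on this slide (average and absolute difference of each node-level centrality, plus the two pair-level properties) could be assembled into one feature vector along these lines. The function name pair_features, the dictionary layout, and the numbers are illustrative assumptions; the classifier itself is not shown.

```python
def pair_features(centrality, p1, p2, shortest_distance, mutual_neighborhood):
    """Build a feature vector for the pair (p1, p2).

    `centrality` maps a property name to a dict of per-protein values.
    Properties are visited in sorted name order so that every pair
    yields features in the same positions.
    """
    features = []
    for name in sorted(centrality):
        v1, v2 = centrality[name][p1], centrality[name][p2]
        features.append((v1 + v2) / 2.0)  # average over the pair
        features.append(abs(v1 - v2))     # absolute difference over the pair
    features.append(shortest_distance)    # pair-level property
    features.append(mutual_neighborhood)  # pair-level property
    return features

# Hypothetical centrality values for two proteins
centrality = {
    "betweenness": {"P1": 0.25, "P2": 0.75},
    "closeness": {"P1": 0.5, "P2": 0.5},
}
vector = pair_features(centrality, "P1", "P2",
                       shortest_distance=2, mutual_neighborhood=1)
```

Both the average and the absolute difference are symmetric in P1 and P2, so the vector does not depend on the order in which the two proteins are given.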
18
Results
  • Receiver operating characteristic (ROC) curves
    for SVM classifier trained using various
    predictor variables.
  • The accuracy of SVM Classification can be
    measured by the Area Under the ROC Curve (AUC).
  • AUC for the classifier trained using all features:
    0.7960
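
For readers unfamiliar with the AUC, one way to compute it is as the probability that a randomly chosen positive pair scores higher than a randomly chosen negative pair, with ties counted as half. The sketch below uses that rank-based definition on made-up scores; it does not reproduce the study's numbers.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

# Hypothetical classifier outputs and true labels (1 = SSL, 0 = non-SSL)
scores = [0.9, 0.8, 0.4, 0.3]
labels = [1, 0, 1, 0]
auc = roc_auc(scores, labels)
```

The quadratic loop is fine for a small illustration; a sort-based version would be preferable for millions of pairs.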

19
Synthetic lethal network has many triangles
20
2Hop Characteristics
  • 2Hop S-S is 1 if there is an SL interaction between
    the pair (Gene A, Gene C) and also between the
    pair (Gene B, Gene C)
  • 2Hop S-P is 1 if there is an SL interaction between
    the pair (Gene A, Gene C) and a physical
    interaction between the pair (Gene B, Gene C), or
    vice versa.

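[Figure: 2Hop Y-Z. Gene A is linked to Gene C by an interaction of type Y and Gene B is linked to Gene C by an interaction of type Z; the edge between Gene A and Gene B is marked "?".]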
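
The two indicator features defined on this slide can be sketched as set intersections over neighbor lists. The function name two_hop_features, the adjacency-dict layout, and the toy interactions are assumptions made for illustration.

```python
def two_hop_features(a, b, sl_adj, phys_adj):
    """Return (2Hop S-S, 2Hop S-P) indicator features for the pair (a, b).

    sl_adj and phys_adj map each gene to the set of its SL partners and
    physical interaction partners, respectively.
    """
    sl_a, sl_b = sl_adj.get(a, set()), sl_adj.get(b, set())
    ph_a, ph_b = phys_adj.get(a, set()), phys_adj.get(b, set())

    def third_genes(candidates):
        # The intermediate Gene C must differ from both members of the pair.
        return candidates - {a, b}

    # S-S: some Gene C is an SL partner of both a and b.
    s_s = 1 if third_genes(sl_a & sl_b) else 0
    # S-P: some Gene C is an SL partner of one gene and a physical partner of the other.
    s_p = 1 if third_genes(sl_a & ph_b) or third_genes(ph_a & sl_b) else 0
    return s_s, s_p

# Toy interaction data (hypothetical)
sl_adj = {"A": {"C"}, "B": {"C", "D"}, "C": {"A", "B"}, "D": {"B"}}
phys_adj = {"A": {"D"}, "D": {"A"}}
features_ab = two_hop_features("A", "B", sl_adj, phys_adj)
features_ad = two_hop_features("A", "D", sl_adj, phys_adj)
```

When these features are computed for a pair whose own SL status is being predicted, the SL edge between the pair itself should be left out of sl_adj to avoid leaking the label.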
21
2Hop Characteristics
  • Addition of 2Hop input properties to the SVM
    Classifier improves the ROC curves significantly.

22
Positive Predictive Value (PPV)

[Plot: positive predictive value versus threshold (cut-off)]
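
As a reminder of what the plotted quantity means, the PPV at a given cut-off is the fraction of pairs scoring at or above the cut-off that are true SSL pairs. A minimal sketch with made-up scores follows; whether the comparison should be ">=" or ">" is an assumption here.

```python
def positive_predictive_value(scores, labels, threshold):
    """TP / (TP + FP) among pairs with score >= threshold; None if nothing is predicted."""
    predicted = [y for s, y in zip(scores, labels) if s >= threshold]
    if not predicted:
        return None  # PPV is undefined when no pair passes the cut-off
    return sum(predicted) / len(predicted)

# Hypothetical classifier outputs and true labels (1 = SSL, 0 = non-SSL)
scores = [0.95, 0.9, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]
ppv_curve = {t: positive_predictive_value(scores, labels, t)
             for t in (0.5, 0.9, 0.99)}
```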
23
Robustness Analysis
  • We added some random edges to protein network
    data and recomputed the input properties for each
    gene pair.
  • The SVM classifier seems to be robust to bias in
    PIN data
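
One way to perturb a network in the spirit of this robustness check is sketched below: add a fixed number of random edges that are not already present, then recompute the input properties on the noisy network. The function name add_random_edges, the seed, and the toy network are assumptions; the slides do not say how many edges were added or how they were drawn.

```python
import random

def add_random_edges(edges, nodes, n_new, seed=0):
    """Return the edge set plus n_new random edges that were not already present.

    Assumes every existing edge joins two distinct members of `nodes`.
    """
    rng = random.Random(seed)          # fixed seed for repeatability
    node_list = sorted(nodes)
    existing = {frozenset(e) for e in edges}
    max_edges = len(node_list) * (len(node_list) - 1) // 2
    if len(existing) + n_new > max_edges:
        raise ValueError("not enough free node pairs for the requested edges")
    noisy = set(existing)
    while len(noisy) < len(existing) + n_new:
        u, v = rng.sample(node_list, 2)   # two distinct nodes, so no self-loops
        noisy.add(frozenset((u, v)))      # duplicates are ignored by the set
    return noisy

# Toy network (hypothetical)
nodes = ["A", "B", "C", "D", "E"]
edges = [("A", "B"), ("B", "C"), ("C", "D")]
noisy_edges = add_random_edges(edges, nodes, n_new=2)
```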

24
Novel Predictions
  • Trained the SVM classifier using all the known
    positives and an equal number of known negatives.
  • Carried out predictions on the 2 million
    non-essential pairs that were never screened for
    synthetic lethality.
  • Repeated the training and testing 5 times, each
    time using a different training set.
  • Based on a threshold level of 1, we selected the
    pairs that were predicted to be lethal in all 5
    runs.
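
The selection rule in the last bullet amounts to intersecting the above-threshold predictions of every run. A small sketch, assuming each run's output is a dict from gene pair to score (the gene names, scores, and use of three runs instead of five are made up):

```python
def consensus_predictions(run_scores, threshold):
    """Pairs whose score reaches `threshold` in every run."""
    selected = None
    for scores in run_scores:
        hits = {pair for pair, s in scores.items() if s >= threshold}
        # Keep only pairs that have passed in all runs seen so far.
        selected = hits if selected is None else selected & hits
    return selected if selected is not None else set()

# Hypothetical scores from three runs
runs = [
    {("g1", "g2"): 1.0, ("g1", "g3"): 1.0, ("g2", "g3"): 0.4},
    {("g1", "g2"): 1.0, ("g1", "g3"): 0.7, ("g2", "g3"): 1.0},
    {("g1", "g2"): 1.0, ("g1", "g3"): 1.0, ("g2", "g3"): 0.9},
]
consensus = consensus_predictions(runs, threshold=1.0)
```

Requiring agreement across runs trades recall for precision: a pair that narrowly misses the cut-off in a single run is dropped.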

25
New Predictions
26
Graph Properties
True Positives
False Positives
27
Graph Properties
True Positives
False Positives
28
Graph Properties
True Positives
False Positives
29
Experimental validation of SSL pairs
30
New Predictions
Legend:
  • Pairs not being validated, because single-gene
    knockout strains are not available for one of the
    two genes in the pair
  • Pairs which are being validated experimentally
31
[Figure: interaction network. Nodes: rad1, dep1, end3, ras2, apn2, cti6, thi3, sap30, apn1, atx1, erf4/shr5, cis3, pir1, ccc2, erf2, thi2, spt4, hsp150, YPL183C, YKL050C, bug1, trm7. Edge legend: Known Interactions, Predicted Interactions.]
SS-SL interactions were color coded based on the
localization information.
32
[Figure: interaction network. Nodes: trm82, pus7, dus3, ncl1, smm1/dus2, dus1, trm8. Edge legend: Known Interactions, Predicted Interactions.]
33
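[Figure: interaction network. Nodes: tsl1, rvs167, fps1, hsp82, tps1, rvs161, gpd1, tps3, npt1, rim15, ypt6, ric1. Edge legend: Known Interactions, Predicted Interactions. Localization color legend: Nucleus, Cytoplasm, ER, Cell wall / Plasma Membrane, Golgi Apparatus, Peroxisome, Vacuole, Other.]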
34
[Figure: interaction network. Nodes: rad1, dep1, end3, ras2, apn1, apn2, cti6, thi3, sap30, atx1, erf4/shr5, cis3, ccc2, erf2, pir1, thi2, spt4, hsp150, YPL183C, YKL050C, bug1, trm7. Edge legend: Known Interactions, Predicted Interactions.]
SS-SL interactions were color coded based on the
biological process.
35
[Figure: interaction network. Nodes: trm82, pus7, dus3, ncl1, smm1/dus2, dus1, trm8. Edge legend: Known Interactions, Predicted Interactions.]
36
[Figure: interaction network. Nodes: tsl1, rvs167, fps1, hsp82, tps1, rvs161, gpd1, tps3, npt1, rim15, ypt6, ric1. Edge legend: Known Interactions, Predicted Interactions.]
37
Future Work
  • Train the model on Saccharomyces Synthetic
    Genetic Interactions data and test on
    Caenorhabditis elegans data.
  • Quantify the role of graph centrality properties
    in the prediction task.
  • Identify misclassified gene pairs and understand
    where our methods go wrong.

38
Acknowledgements
  • Shan Zhao (University of Rochester)
  • Dr. Animesh Ray
  • Dr. Alpan Raval
  • NSF

39
References
  • Ye P, Peyser BD, Pan X, Boeke JD, Spencer FA and
    Bader JS. Gene function prediction from congruent
    synthetic lethal interactions in yeast. Molecular
    Systems Biology 1 (Nov 22), 2005.
  • Newman M. A measure of betweenness centrality
    based on random walks. Social Networks, 27(1),
    39-54, 2005.
  • Wong SL et al. Combining biological networks to
    predict genetic interactions. Proc. Natl. Acad.
    Sci. USA, 101, 15682-15687, 2004.
  • Reguly T et al. Comprehensive curation and
    analysis of global interaction networks in
    Saccharomyces cerevisiae. Journal of Biology,
    5:11, 2006.

40
In Pursuit of Modularity
  • Sri R Paladugu
  • May 08, 2007

41
Modular structure
42
Where are the modules?
43
Algorithm
  • 1. Choose a centrality measure (e.g., Edge
    Betweenness, Edge Information Centrality)
  • 2. Calculate the centrality score for each
    edge
  • 3. Remove the edge with the highest centrality score
  • 4. Perform an analysis of the components
    generated
  • 5. Repeat steps 2-4 until the desired number of
    components is generated

S. Fortunato, V. Latora, and M. Marchiori,
cond-mat/0402522 (2004)
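
The steps on this slide can be sketched in plain Python as below. This sketch picks edge betweenness as the centrality measure (one of the two examples named in step 1) and computes it with a Brandes-style breadth-first accumulation on an unweighted graph; edge information centrality is not implemented. The function names and the toy graph are illustrative assumptions, and the graph is assumed to have no self-loops.

```python
from collections import deque

def edge_betweenness(adj):
    """Edge betweenness for an unweighted, undirected graph (no self-loops)."""
    score = {frozenset((u, v)): 0.0 for u in adj for v in adj[u]}
    for s in adj:
        # BFS from s, counting shortest paths (sigma) and recording predecessors.
        dist = {s: 0}
        sigma = {v: 0.0 for v in adj}
        sigma[s] = 1.0
        preds = {v: [] for v in adj}
        order = []
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate path dependencies from the farthest nodes back towards s.
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):
            for v in preds[w]:
                c = sigma[v] / sigma[w] * (1.0 + delta[w])
                score[frozenset((v, w))] += c
                delta[v] += c
    return score  # every path is counted from both of its endpoints

def components(adj):
    """Connected components as a list of node sets."""
    seen, comps = set(), []
    for start in adj:
        if start in seen:
            continue
        comp, stack = set(), [start]
        while stack:
            v = stack.pop()
            if v not in comp:
                comp.add(v)
                stack.extend(adj[v] - comp)
        seen |= comp
        comps.append(comp)
    return comps

def split_into_modules(edges, n_modules):
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    # Step 5: repeat until enough components exist or no edges remain.
    while len(components(adj)) < n_modules and any(adj.values()):
        score = edge_betweenness(adj)       # step 2: score every edge
        u, v = max(score, key=score.get)    # step 3: pick the top-scoring edge
        adj[u].discard(v)
        adj[v].discard(u)
    return components(adj)                  # step 4: inspect the components

# Toy graph: two triangles joined by a single bridge (c-d)
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
modules = split_into_modules(edges, n_modules=2)
```

Recomputing the scores after every removal (rather than once up front) matters, because removing an edge reroutes shortest paths and changes which remaining edge is most central.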
44
Modular structure
[Figure: network diagram with numeric edge labels ranging from 0.01 to 0.61]