Protein Interaction Networks - PowerPoint PPT Presentation

1
Protein Interaction Networks
Feb. 21, 2013
Aalt-Jan van Dijk
Applied Bioinformatics, PRI, Wageningen UR
Mathematical and Statistical Methods, Biometris, Wageningen University
aaltjan.vandijk@wur.nl
2
My research
  • Protein complex structures
  • Protein-protein docking
  • Correlated mutations
  • Interaction site prediction/analysis
  • Protein-protein interactions
  • Enzyme active sites
  • Protein-DNA interactions
  • Network modelling
  • Gene regulatory networks
  • Flowering related

3
Overview
  • Introduction: protein interaction networks
  • Sequences and networks: predicting interaction
    sites
  • Predicting protein interactions
  • Sequence and network evolution
  • Interaction network alignment

4
Protein Interaction Networks
hemoglobin
Obligatory
5
Protein Interaction Networks
hemoglobin
Mitochondrial Cu transporters
Obligatory
Transient
6
Experimental approaches (1)
  • Yeast two-hybrid (Y2H)

7
Experimental approaches (2)
  • Affinity Purification mass spectrometry (AP-MS)

8
Interaction Databases
  • STRING http://string.embl.de/

9
Interaction Databases
10
Interaction Databases
  • STRING http://string.embl.de/
  • HPRD http://www.hprd.org/

11
Interaction Databases
12
Interaction Databases
  • STRING http://string.embl.de/
  • HPRD http://www.hprd.org/
  • MINT http://mint.bio.uniroma2.it/mint/

13
Interaction Databases
14
Interaction Databases
  • STRING http://string.embl.de/
  • HPRD http://www.hprd.org/
  • MINT http://mint.bio.uniroma2.it/mint/
  • INTACT http://www.ebi.ac.uk/intact/

15
Interaction Databases
16
Interaction Databases
  • STRING http://string.embl.de/
  • HPRD http://www.hprd.org/
  • MINT http://mint.bio.uniroma2.it/mint/
  • INTACT http://www.ebi.ac.uk/intact/
  • BIOGRID http://thebiogrid.org/

17
Interaction Databases
18
Some numbers
Organism           Number of known interactions
H. sapiens         113,217
S. cerevisiae      75,529
D. melanogaster    35,028
A. thaliana        13,842
M. musculus        11,616
Source: BioGRID (physical interactions)
19
Overview
  • Introduction: protein interaction networks
  • Sequences and networks: predicting interaction
    sites
  • Predicting protein interactions
  • Sequence and network evolution
  • Interaction network alignment

20
Binding site
21
Binding site prediction
  • Applications

22
Binding site prediction
  • Applications
  • Understanding network evolution
  • Understanding changes in protein function
  • Predict protein interactions
  • Manipulate protein interactions

23
Binding site prediction
  • Applications
  • Understanding network evolution
  • Understanding changes in protein function
  • Predict protein interactions
  • Manipulate protein interactions
  • Input data
  • Interaction network
  • Sequences (possibly structures)

24
Sequence-based predictions
25
Sequences and networks
  • Goal: predict interaction sites and/or motifs

26
Sequences and networks
  • Goal: predict interaction sites and/or motifs
  • Data: interaction networks, sequences

27
Sequences and networks
  • Goal: predict interaction sites and/or motifs
  • Data: interaction networks, sequences
  • Validation: structure data, motif databases

28
Motif search in groups of proteins
  • Group proteins which have the same interaction
    partner
  • Use motif search, e.g. find PWMs

Neduva et al., PLoS Biol 2005
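
The grouping step on this slide can be sketched in a few lines of Perl. This is a minimal illustration, not the published pipeline: the protein names and the helper group_by_partner are hypothetical, and the motif search itself (e.g. fitting PWMs) is left out.

```perl
use strict;
use warnings;

# Toy interaction list with hypothetical protein names, one pair per entry.
my @interactions = (['P1', 'HUB'], ['P2', 'HUB'], ['P3', 'HUB'],
                    ['P3', 'Q1'],  ['P4', 'Q1']);

# Return a hash: protein => sorted list of the proteins that interact with it.
sub group_by_partner {
    my @pairs = @_;
    my %seen;
    for my $pair (@pairs) {
        my ($p, $q) = @$pair;
        $seen{$p}{$q} = 1;    # interactions are undirected,
        $seen{$q}{$p} = 1;    # so store both directions
    }
    my %members_of;
    for my $partner (keys %seen) {
        $members_of{$partner} = [sort keys %{ $seen{$partner} }];
    }
    return %members_of;
}

my %group = group_by_partner(@interactions);
for my $partner (sort keys %group) {
    my @members = @{ $group{$partner} };
    next if @members < 2;    # a shared motif needs at least two sequences
    print "$partner: @members\n";
}
```

Each printed group is a set of sequences that could be handed to a motif finder.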
29
Correlated Motifs
30
Correlated Motifs
  • Motif model
  • Search
  • Scoring

31
Predefined motifs
32
Predefined motifs
33
Predefined motifs
34
Predefined motifs
35
Predefined motifs
36
Correlated Motif Mining
Find motifs in one set of proteins which interact
with (almost) all proteins with another motif
37
Correlated Motif Mining
  • Find motifs in one set of proteins which interact
    with (almost) all proteins with another motif
  • Motif models
  • PWM: so far not applied
  • (l,d): with l = length, d = number of wildcards
  • Score: overrepresentation, e.g. χ²
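
To make the (l,d) motif model and the χ² score concrete, here is a small Perl sketch. It assumes motifs are written with '.' as the wildcard and that overrepresentation is tested on a 2x2 table of protein pairs (covered by the motif pair or not, interacting or not). How the published tools build their table is not stated on the slide, so that part is an assumption; all names and sequences are made up.

```perl
use strict;
use warnings;

# (l,d) of a motif written with '.' as wildcard: l = length, d = number of wildcards.
sub motif_l_d {
    my ($motif) = @_;
    my $d = ($motif =~ tr/.//);
    return (length($motif), $d);
}

# 1 if the sequence contains the motif, else 0.
sub has_motif {
    my ($seq, $motif) = @_;
    die "motif may only contain A-Z and '.'\n" unless $motif =~ /^[A-Z.]+$/;
    return $seq =~ /$motif/ ? 1 : 0;
}

# Count unordered protein pairs as
# (covered+interacting, covered+not, uncovered+interacting, uncovered+not).
sub motif_pair_table {
    my ($seqs, $interacts, $m1, $m2) = @_;
    my @names = sort keys %$seqs;
    my @t = (0, 0, 0, 0);
    for my $i (0 .. $#names) {
        for my $j ($i + 1 .. $#names) {
            my ($x, $y) = @names[$i, $j];
            my $covered = (has_motif($seqs->{$x}, $m1) && has_motif($seqs->{$y}, $m2))
                       || (has_motif($seqs->{$x}, $m2) && has_motif($seqs->{$y}, $m1));
            my $linked = $interacts->{"$x $y"};    # key: sorted names joined by a space
            $t[ ($covered ? 0 : 2) + ($linked ? 0 : 1) ]++;
        }
    }
    return @t;
}

# Pearson chi-square statistic for a 2x2 table (no continuity correction).
sub chi_square_2x2 {
    my ($n11, $n12, $n21, $n22) = @_;
    my $n = $n11 + $n12 + $n21 + $n22;
    my $denom = ($n11 + $n12) * ($n21 + $n22) * ($n11 + $n21) * ($n12 + $n22);
    return 0 if $denom == 0;
    return $n * ($n11 * $n22 - $n12 * $n21) ** 2 / $denom;
}

# Hypothetical toy data.
my %seqs = (X1 => 'AALKPAA', X2 => 'GGLRPGG', Y1 => 'TTWDETT',
            Y2 => 'SSWEESS', Z1 => 'AAAAAAA', Z2 => 'GGGGGGG');
my %interacts = map { $_ => 1 } ('X1 Y1', 'X1 Y2', 'X2 Y1', 'Z1 Z2');

my @table = motif_pair_table(\%seqs, \%interacts, 'L.P', 'W.E');
printf "l,d = %d,%d  table = %s  chi-square = %.3f\n",
    motif_l_d('L.P'), "@table", chi_square_2x2(@table);
```

A large statistic means the motif pair covers interacting protein pairs more often than expected by chance in this toy network.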

38
Correlated Motif Mining
  • Find motifs in one set of proteins which interact
    with (almost) all proteins with another motif
  • Search
  • Interaction driven
  • Motif driven

39
Interaction driven approaches
Mine for (quasi-)bicliques → most-versus-most
interaction
Then derive motif pair from sequences
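
What "most-versus-most" means can be illustrated with a density check on two candidate protein sets. The sketch below only tests a given pair of sets; the actual mining of quasi-bicliques is a harder search problem and is not shown. The function name, the toy edges, and the 0.8 cut-off are assumptions for illustration.

```perl
use strict;
use warnings;

# Fraction of all left-right protein pairs that are connected by an interaction.
sub cross_density {
    my ($left, $right, $edges) = @_;
    my ($present, $total) = (0, 0);
    for my $x (@$left) {
        for my $y (@$right) {
            $total++;
            $present++ if $edges->{"$x $y"} || $edges->{"$y $x"};
        }
    }
    return $total ? $present / $total : 0;
}

# Hypothetical toy network: A2-B3 is the only missing cross edge.
my %edges = map { $_ => 1 } ('A1 B1', 'A1 B2', 'A1 B3', 'A2 B1', 'A2 B2');
my $density   = cross_density(['A1', 'A2'], ['B1', 'B2', 'B3'], \%edges);
my $threshold = 0.8;    # hypothetical cut-off for "most-versus-most"

printf "density %.2f: %s\n", $density,
    $density >= $threshold ? 'quasi-biclique' : 'too sparse';
```

A density of 1 corresponds to a true biclique; anything between the cut-off and 1 is a quasi-biclique under this simple definition.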
40
Motif driven approaches
Starting from candidate motif pairs, evaluate
their support in the network (and improve them)
41
D-MOTIF
Tan et al., BMC Bioinformatics 2006
42
(No Transcript)
43
IMSS: application of D-MOTIF
[Figure: test error versus number of selected motif pairs; motif pair on protein X and protein Y]
Van Dijk et al., Bioinformatics 2008; Van Dijk et
al., PLoS Comput Biol 2010
44
Experimental validation
protein Y
protein X
Van Dijk et al., Bioinformatics 2008; Van Dijk et
al., PLoS Comput Biol 2010
45
Experimental validation
protein Y
protein X
Van Dijk et al., Bioinformatics 2008; Van Dijk et
al., PLoS Comput Biol 2010
46
Experimental validation
protein Y
protein X
Van Dijk et al., Bioinformatics 2008; Van Dijk et
al., PLoS Comput Biol 2010
47
SLIDER
Boyen et al. Trans Comp Biol Bioinf 2011
48
SLIDER
  • Faster approach, enabling genome-wide search
  • Scoring: χ²
  • Search: steepest ascent
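
Steepest ascent can be sketched generically: from the current motif, score every neighbour, move to the best one, and stop when nothing improves. The Perl sketch below assumes a neighbour is a motif that differs at one position; the moves SLIDER actually uses are not given on the slide. The toy scoring function simply counts agreement with a hidden motif and stands in for a χ² score.

```perl
use strict;
use warnings;

# Steepest ascent over fixed-length motifs.
# $score is a code reference; $alphabet lists the allowed characters.
sub steepest_ascent {
    my ($start, $score, $alphabet) = @_;
    my ($current, $best) = ($start, $score->($start));
    while (1) {
        my ($next, $next_score) = ($current, $best);
        for my $pos (0 .. length($current) - 1) {
            for my $char (@$alphabet) {
                my $cand = $current;
                substr($cand, $pos, 1) = $char;    # one-position change
                my $s = $score->($cand);
                ($next, $next_score) = ($cand, $s) if $s > $next_score;
            }
        }
        last if $next eq $current;    # no neighbour scores higher: local optimum
        ($current, $best) = ($next, $next_score);
    }
    return ($current, $best);
}

# Toy score (hypothetical): number of positions that agree with a hidden motif.
my $hidden = 'L.P';
my $toy_score = sub {
    my ($motif) = @_;
    return scalar grep { substr($motif, $_, 1) eq substr($hidden, $_, 1) }
                       0 .. length($hidden) - 1;
};

my ($motif, $value) = steepest_ascent('A.A', $toy_score, ['A', 'L', 'P', '.']);
print "local optimum: $motif (score $value)\n";
```

The search only guarantees a local optimum, which is why the choice of starting motifs matters in practice.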

49
Validation
  • Performance assessment on simulated data
  • Performance assessment using protein
    structures

50
Extensions of SLIDER
  • Extension I: better coverage of network

Boyen et al. Trans Comp Biol Bioinf 2013
51
Extensions of SLIDER
  • Extension I: better coverage of network
  • Extension II: use of more biological information

52
bioSLIDER
DGIFELELYLPDDYPMEAPKVRFLTKI
53
bioSLIDER
DGIFELELYLPDDYPMEAPKVRFLTKI
conservation
54
bioSLIDER
DGIFELELYLPDDYPMEAPKVRFLTKI
conservation
accessibility
55
bioSLIDER
DGIFELELYLPDDYPMEAPKVRFLTKI
conservation
accessibility
Thresholds for conservation and accessibility
Extension of motif model: amino acid similarity
(BLOSUM)
56
bioSLIDER
DGIFELELYLPDDYPMEAPKVRFLTKI
conservation
accessibility
Using human and yeast data for training and
optimizing parameters
[Figure: two panels, "No conservation, no accessibility" and "Conservation and accessibility"; axes: interaction-coverage and motif-accuracy]
Leal Valentim et al., PLoS ONE 2012
57
Application to Arabidopsis
Input data: 6200 interactions, 2700 proteins
Interface predictions for 985 proteins
(on average 20 residues)
Arabidopsis Interactome Mapping Consortium,
Science 2011
58
Ecotype sequence data (SNPs)
SNPs tend to avoid predicted binding sites
In 263 proteins there is a SNP in a binding site →
these proteins are much more connected to each
other than would be randomly expected
59
Summary
  • Prediction of interaction sites using protein
    interaction networks and protein sequences
  • Correlated motif approaches

60
Overview
  • Introduction: protein interaction networks
  • Sequences and networks: predicting interaction
    sites
  • Predicting protein interactions
  • Sequence and network evolution
  • Interaction network alignment

61
Protein Interaction Prediction
Lots of genomes are being sequenced
(www.genomesonline.org)

            Complete   Incomplete
ARCHAEA          182          264
BACTERIA        3767        14393
EUKARYA          183         2897
TOTAL           4132        17514
62
Protein Interaction Prediction
Lots of genomes are being sequenced
(www.genomesonline.org)

            Complete   Incomplete
ARCHAEA          182          264
BACTERIA        3767        14393
EUKARYA          183         2897
TOTAL           4132        17514

But how do we know how the proteins in there work together?!
63
Protein Interaction Prediction
  • Interactions of orthologs: interologs
  • Phylogenetic profiles
  • Domain-based predictions

A 1 0 1 1 0 0 1
B 1 0 1 1 0 0 1
64
Orthology based prediction
65
Orthology based prediction
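
The interolog idea (an interaction between two proteins is transferred to their orthologs in another species) can be sketched as follows. Protein names, the ortholog map, and the helper predict_interologs are hypothetical; real orthology data are usually many-to-many, and predictions would need a confidence score.

```perl
use strict;
use warnings;

# Transfer interactions from a source species to a target species.
# $source_pairs: list of [protein, protein]
# $ortholog_of:  source protein => list of target orthologs
sub predict_interologs {
    my ($source_pairs, $ortholog_of) = @_;
    my %predicted;
    for my $pair (@$source_pairs) {
        my ($p, $q) = @$pair;
        for my $x (@{ $ortholog_of->{$p} || [] }) {
            for my $y (@{ $ortholog_of->{$q} || [] }) {
                # sort the two names so that each pair is stored only once
                $predicted{ join ' ', sort($x, $y) } = 1;
            }
        }
    }
    return sort keys %predicted;
}

# Hypothetical toy data: AtC has no known ortholog, so AtB-AtC is not transferred.
my @source   = (['AtA', 'AtB'], ['AtB', 'AtC']);
my %ortholog = (AtA => ['OsA'], AtB => ['OsB1', 'OsB2']);

print "$_\n" for predict_interologs(\@source, \%ortholog);
```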
66
Phylogenetic profiles
A 1 0 1 1 0 0 1
B 1 0 1 1 1 0 1
C 1 0 1 1 1 0 1
D 0 1 0 1 0 0 1
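
The four profiles on this slide can be compared directly. The Perl sketch below uses the Hamming distance (number of genomes in which presence/absence differs) as the similarity measure; that choice is an assumption for illustration, and other measures such as mutual information are also used in practice.

```perl
use strict;
use warnings;

# Presence (1) / absence (0) of each protein across seven genomes, as on the slide.
my %profile = (
    A => [1, 0, 1, 1, 0, 0, 1],
    B => [1, 0, 1, 1, 1, 0, 1],
    C => [1, 0, 1, 1, 1, 0, 1],
    D => [0, 1, 0, 1, 0, 0, 1],
);

# Number of genomes in which two profiles differ.
sub hamming {
    my ($p, $q) = @_;
    die "profiles differ in length\n" unless @$p == @$q;
    my $distance = 0;
    for my $i (0 .. $#{$p}) {
        $distance++ if $p->[$i] != $q->[$i];
    }
    return $distance;
}

my @names = sort keys %profile;
for my $i (0 .. $#names) {
    for my $j ($i + 1 .. $#names) {
        my ($x, $y) = @names[$i, $j];
        printf "%s-%s: %d\n", $x, $y, hamming($profile{$x}, $profile{$y});
    }
}
```

Pairs with a small distance (here B and C, which are identical) are the candidates for a functional link.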
67
Domain Based Predictions
68
Domain Based Predictions
69
Overview
  • Introduction: protein interaction networks
  • Sequences and networks: predicting interaction
    sites
  • Predicting protein interactions
  • Sequence and network evolution
  • Interaction network alignment

70
Duplications
71
Duplications and interactions
Gene duplication
72
Duplications and interactions
Gene duplication
73
Duplications and interactions
Gene duplication
Interaction loss
0.1 Myear⁻¹
0.001 Myear⁻¹
74
Duplications and interaction loss
Duplicate pairs share interaction partners
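
One simple way to quantify how strongly a duplicate pair shares partners is the fraction of shared partners among all partners of either protein (Jaccard index). The measure and the names below are assumptions for illustration; the slide does not say which statistic was used.

```perl
use strict;
use warnings;

# Jaccard index of two partner lists: |shared| / |union|.
sub shared_partner_fraction {
    my ($partners_x, $partners_y) = @_;
    my %in_x = map { $_ => 1 } @$partners_x;
    my %in_y = map { $_ => 1 } @$partners_y;
    my $shared = grep { $in_y{$_} } keys %in_x;    # count in scalar context
    my %union  = (%in_x, %in_y);
    my $total  = keys %union;
    return $total ? $shared / $total : 0;
}

# Hypothetical duplicate pair: two of four distinct partners are shared.
my $fraction = shared_partner_fraction(['P1', 'P2', 'P3'], ['P2', 'P3', 'P4']);
print "shared partner fraction: $fraction\n";
```

Comparing this value for duplicate pairs against random protein pairs would show whether duplicates share more partners than expected.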
75
Interaction network evolution
Science 2011
76
Overview
  • Introduction: protein interaction networks
  • Sequences and networks: predicting interaction
    sites
  • Predicting protein interactions
  • Sequence and network evolution
  • Interaction network alignment

77
Network alignment
Local Network Alignment: find multiple, unrelated
regions of isomorphism
Global Network Alignment: find the best overall alignment
78
PATHBLAST
Kelley, PNAS 2003
79
PATHBLAST scoring
homology
interaction
Kelley, PNAS 2003
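
A path score that combines homology and interaction evidence can be sketched as a sum of log-odds terms, one per aligned protein pair and one per aligned interaction. This follows the general idea named on the slide; the exact probabilities and background values used by PATHBLAST should be taken from Kelley et al., and every number below is a made-up placeholder.

```perl
use strict;
use warnings;

# Additive log-odds score of an aligned path.
# $p_nodes: homology probabilities of the aligned protein pairs
# $q_edges: interaction probabilities of the aligned edges
# $p_bg, $q_bg: background (random expectation) values, assumed given
sub path_score {
    my ($p_nodes, $q_edges, $p_bg, $q_bg) = @_;
    my $score = 0;
    $score += log($_ / $p_bg) / log(10) for @$p_nodes;    # log10 odds per node
    $score += log($_ / $q_bg) / log(10) for @$q_edges;    # log10 odds per edge
    return $score;
}

# A path of length L=4 has four aligned nodes and three aligned edges.
my $score = path_score([0.9, 0.5, 0.8, 0.7], [0.6, 0.6, 0.3], 0.05, 0.1);
printf "path score: %.3f\n", $score;
```

Nodes or edges that are less likely than the background contribute negative terms, so weak links lower the score of the whole path.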
80
PATHBLAST results
Kelley, PNAS 2003
81
PATHBLAST results
For yeast vs H. pylori, with L=4, all resulting
paths with p<0.05 can be merged into just five
network regions
Kelley, PNAS 2003
82
Multiple alignment
Scoring: probabilistic model for interaction
subnetworks
Sub-networks: bottom-up search, starting with
exhaustive search for L=4 followed by local search
Sharan et al., PNAS 2005
83
Multiple alignment results
Sharan PNAS 2005
84
Multiple alignment results
Applications include protein function
prediction and interaction prediction
Sharan PNAS 2005
85
Global alignment
Singh PNAS 2008
86
Global alignment
Singh PNAS 2008
87
Global alignment
Alignment: greedy selection of matches
Singh PNAS 2008
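
The greedy selection step can be sketched as follows: given a similarity score for every pair of nodes from the two networks, repeatedly take the highest-scoring pair whose nodes are both still unmatched. How the scores are computed is not shown here; the node names and scores are hypothetical.

```perl
use strict;
use warnings;

# Greedy one-to-one matching.
# $score->{$u}{$v}: similarity of node $u (network 1) and node $v (network 2).
sub greedy_alignment {
    my ($score) = @_;
    my @candidates;
    for my $u (keys %$score) {
        for my $v (keys %{ $score->{$u} }) {
            push @candidates, [$u, $v, $score->{$u}{$v}];
        }
    }
    # Highest score first; ties are broken by name to keep the result deterministic.
    @candidates = sort {
        $b->[2] <=> $a->[2] or $a->[0] cmp $b->[0] or $a->[1] cmp $b->[1]
    } @candidates;

    my (%used1, %used2, @matches);
    for my $c (@candidates) {
        my ($u, $v) = @$c;
        next if $used1{$u} || $used2{$v};    # each node is matched at most once
        push @matches, [$u, $v];
        $used1{$u} = $used2{$v} = 1;
    }
    return @matches;
}

# Hypothetical scores between two yeast (y) and two human (h) proteins.
my %score = (
    y1 => { h1 => 0.90, h2 => 0.80 },
    y2 => { h1 => 0.85, h2 => 0.10 },
);
print join('-', @$_), "\n" for greedy_alignment(\%score);
```

In this toy case the greedy result (y1-h1, y2-h2; total 1.00) is worse than the alternative (y1-h2, y2-h1; total 1.65), which shows that greedy selection is fast but not guaranteed to be optimal.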
88
Network alignment: the future?
Sharan & Ideker, Nature Biotech 2006
89
Summary
  • Interaction network evolution: mostly
    comparative, not much mechanistic
  • Approaches exist to integrate and model network
    analysis within context of phylogeny (not
    discussed)
  • Outlook: combine interaction site prediction with
    network evolution analysis

90
Exercises
The datafiles arabidopsis_proteins.lis and
interactions_arabidopsis.data contain
Arabidopsis MADS proteins (which regulate various
developmental processes including flowering), and
their mutual interactions, respectively.
91
Exercise 1
  • Start by getting familiar with the basic
    Cytoscape features described in section 1 of the
    tutorial
    http://opentutorials.cgl.ucsf.edu/index.php/Tutorial:Introduction_to_Cytoscape
  • Load the data into Cytoscape
  • Visualize the network and analyze the number of
    interactions per protein: which proteins have
    a lot of interactions?

92
Exercise 2
Write a script that reads interaction data and
implements a data structure which enables further
analysis of the data (see setup on next
slides). Use the datafiles arabidopsis_proteins.lis
and interactions_arabidopsis.data and let
the script print a table in the following
format:
PROTEIN Number_of_interactions
Make a plot of those data.
93
# two subroutines
# input:  filename
# output: list with content of file
sub read_list {
    my $infile = $_[0];
    # YOUR CODE
    return @newlist;
}

# input:  protein list and interaction list
# output: hash with proteins -> list of their partners
sub combine_prot_int {
    my ($plist, $intlist) = @_;
    # YOUR CODE
    return %inthash;
}
94
# reading input data
my @plist   = read_list($ARGV[0]);
my @intlist = read_list($ARGV[1]);

# obtaining hash with interactions
my %inthash = combine_prot_int(\@plist, \@intlist);

# YOUR CODE: loop over all proteins and print their name and
# their number of interactions
95
(No Transcript)
96
Exercise 3
In orthology_relations.data we have a set of
predicted orthologs for the Arabidopsis proteins
from exercise 1. protein_information.data
describes, among other things, which species these
proteins come from. Finally, interactions.data contains
interactions between those proteins. Use the
Arabidopsis interaction data from exercise 1 to
predict interactions in other species using the
orthology information. Compare your predictions
with the real interaction data and make a plot
that visualizes how good your predictions are.