Protein-Protein Interactions Networks

1
Protein-Protein Interactions Networks
  • A comprehensive analysis of protein-protein
    interactions in Saccharomyces cerevisiae. P. Uetz
    et al., Nature 2000
  • Functional organisation of the yeast proteome
    by systematic analysis of protein complexes. A.-C.
    Gavin et al., Nature 2002
  • Global Mapping of the Yeast Genetic Interaction
    Network. Tong et al., Science 2004
  • Global analysis of protein activities using
    proteome chips. Zhu, H. et al., Science 2001
  • Conserved patterns of protein interaction in
    multiple species. R. Sharan et al., PNAS 2005

2
Genomics
  • Genomics: the large-scale study of genomes and
    their functions.
  • Why protein networks?

3
Why protein networks?
  • Assemblies represent more than the sum of their
    parts.
  • 'Complexity' may partly rely on the contextual
    combination of the gene products.

4
Yeast as a model
  • Why yeast genomics? Yeast is a model eukaryotic
    organism.

5
(No Transcript)
6
The best-studied organism
  • 5,500 genes.
  • 16(!) chromosomes.
  • 13 Mb of DNA (humans have 3,000 Mb).
  • We know (?) the function of > 1/2 of the yeast
    genes.
  • All the essential functions are conserved from
    yeast to humans.

7
Example: cell cycle
Lee Hartwell, Nobel Prize 2001
8
4 methodologies for high-throughput research
  • Two-hybrid systems
  • Analysis of protein complexes
  • Synthetic lethality
  • Protein chips (?)

9
Two hybrid system
  • Aim: identify pairs of physically interacting
    proteins.
  • Solution: use the transcription mechanism of the
    cell.

10
The central dogma
11
Transcription factors
Movie: transcription (molecular model, real time), 7.2
12
Transcription in real time (video)
13
Reporter gene
14
Two hybrid system
  • Isolate cells that carry both plasmids, using
    reporter or selection methods.

15
All against All
16
Focus on the baits
  • Baits are analyzed separately.
  • 192 baits vs. 6,000 prey yeast strains.

Example: a component of RNA polymerases I and III;
identification of three new interacting proteins
17
Two hybrid system
18
Two hybrid system
  • A comprehensive two-hybrid analysis to explore
    the yeast protein interactome Ito T. et al, PNAS
    2001.

19
Analysis of protein complexes
  • Aim: identification of complexes and their
    subunits.
  • Solution: a two-step method
  • Isolation of only the relevant complexes.
  • Identification of the complex subunits.

20
Double Isolation
21
Identification of the members
  • Divide and conquer:
  • Denature the assembly
  • Digest with protease
  • Mass spectrometry

22
How does it work?
  • The deflection path of ionized molecules, which
    depends on their mass-to-charge ratio, is used to
    determine each molecule's mass.
  • The output

23
Analysis of protein complexes
  • Cross-reference the measured peptide masses with
    a protein database.
  • Mass spectrometry can be applied again if the
    data are not sufficient, this time to the
    peptides themselves.
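
The database lookup described above can be sketched as a small peptide-mass matching routine. This is only an illustration, not the pipeline used in the cited studies: it assumes trypsin as the protease (cleaving after K or R unless followed by P), uses standard monoisotopic residue masses, and the protein names, sequences, observed masses and tolerance are all made up.

```python
import re

# Monoisotopic residue masses in daltons (standard reference values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free termini


def tryptic_peptides(sequence):
    """In-silico digest: cut after K or R unless the next residue is P."""
    return [p for p in re.split(r"(?<=[KR])(?!P)", sequence) if p]


def peptide_mass(peptide):
    """Neutral monoisotopic mass of a peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


def best_match(observed_masses, database, tolerance=0.01):
    """Return the database entry explaining the most observed masses."""
    def hits(sequence):
        theoretical = [peptide_mass(p) for p in tryptic_peptides(sequence)]
        return sum(
            any(abs(m - t) <= tolerance for t in theoretical)
            for m in observed_masses
        )
    return max(database, key=lambda name: hits(database[name]))


# Hypothetical toy database and spectrum, for illustration only.
database = {"protein_1": "AKGGR", "protein_2": "WWKSTR"}
observed = [217.143, 288.155]
print(best_match(observed, database))
```

Real search engines score matches statistically rather than by a raw hit count; the count is used here only to keep the idea visible.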

24
Analysis of protein complexes
  • Systematic (1): 1,739 bait proteins.
  • 232 complexes with 589 baits.
  • Systematic (2): 725 bait proteins.
  • 3,617 interactions with 493 baits.

25
(No Transcript)
26
Analysis of protein complexes
  • About 25% false-positive rate.
  • Covers 56/60 of known complexes (10/35 in Y2H).
  • Only 7% of the interactions were seen by Y2H
    assays.
  • But it can evaluate a protein's
  • Concentration.
  • Localization.
  • Post-translational modifications.

27
Synthetic lethality
  • First, a few words on essentiality.
  • Create new strains, each strain with one gene
    deleted (96% coverage).
  • Tag each strain with a unique sequence.
  • Grow all the strains.
  • Measure the amount of each sequence.
  • Some 18.7% (1,105) are essential.

28
Synthetic lethality
  • High genetic redundancy hampers the discovery of
    many gene functions (30%).
  • Only the double mutation is lethal; each of the
    single mutations is viable.
  • Why?
  • Single biochemical pathway.
  • Two distinct pathways for one process.
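
The definition above (both single mutants viable, double mutant lethal) can be written as a small classifier that also collects the resulting pairs into an interaction map. A minimal sketch; the gene names and viability outcomes are invented for illustration.

```python
def classify_pair(viable_a, viable_b, viable_ab):
    """Classify a gene pair from the viability of its single and double mutants."""
    if not (viable_a and viable_b):
        # A gene whose single deletion is lethal cannot be tested this way.
        return "single mutant inviable"
    return "viable" if viable_ab else "synthetic lethal"


def genetic_interaction_map(single_viability, double_viability):
    """Return the synthetic lethal pairs as a sorted list of edges."""
    edges = []
    for (a, b), viable_ab in double_viability.items():
        label = classify_pair(single_viability[a], single_viability[b], viable_ab)
        if label == "synthetic lethal":
            edges.append(tuple(sorted((a, b))))
    return sorted(edges)


# Hypothetical toy data: names and outcomes are made up.
singles = {"geneA": True, "geneB": True, "geneC": True, "geneD": False}
doubles = {
    ("geneA", "geneB"): False,  # both singles viable, double dies
    ("geneA", "geneC"): True,   # double mutant is fine
    ("geneB", "geneD"): False,  # geneD is essential on its own
}
print(genetic_interaction_map(singles, doubles))
```

Only the geneA/geneB pair qualifies here, because the geneB/geneD double mutant is already explained by the essential single deletion.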

29
The naïve approach
  • But how do you scale it up to the whole genome?

30
All vs. All
  • 5,100 non-essential mutants.
  • Main tricks
  • 1. Haploid strains.
  • 2. Resistance markers.
  • 3. Extra marker for the library haploid.

31
Synthetic lethality: making it genomic
  • Mass analysis: crossing the query haploid with a
    library (synthetic genetic array).
  • Tetrad analysis: validation and finding
    synthetic-sick pairs.

32
The genetic interaction map
  • 8 genes against all produced a network of
    synthetic lethal pairs.

33
Synthetic lethality: making it genomic
  • 132 query genes vs. 4,700
  • False negatives: 17-42%.
  • At least 4 times denser than the PPI network.
  • Predicting 100,000 interactions (?)

34
PPI Summary (2003)
35
PPI Summary
  • S. cerevisiae (yeast): 4,389 proteins, 14,319
    interactions
  • C. elegans (worm): 2,718 proteins, 3,926
    interactions
  • D. melanogaster (fly): 7,038 proteins, 20,720
    interactions

Sharan et al. PNAS 2005
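
To compare the three data sets on one scale, the protein and interaction counts above can be turned into a mean degree (2E/N) and an edge density. A small sketch using only the numbers on this slide; treating each network as a simple undirected graph is an assumption.

```python
# (proteins, interactions) as listed on the slide.
NETWORKS = {
    "yeast": (4389, 14319),
    "worm": (2718, 3926),
    "fly": (7038, 20720),
}


def mean_degree(n_proteins, n_interactions):
    """Average number of partners per protein in an undirected network."""
    return 2 * n_interactions / n_proteins


def density(n_proteins, n_interactions):
    """Fraction of all possible protein pairs that are reported to interact."""
    return n_interactions / (n_proteins * (n_proteins - 1) / 2)


for name, (n, m) in NETWORKS.items():
    print(f"{name}: mean degree {mean_degree(n, m):.2f}, density {density(n, m):.5f}")
```

All three maps are very sparse: each protein has only a handful of reported partners out of thousands of possible ones.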
36
We like Networks
  • Exploit graph theory methods.
  • Provide a general solution for data integration.

37
Network Structure and Function
  • Identify highly nonrandom network structural
    patterns that reflect function
  • Ideker et al.: finding co-regulated sub-graphs.
  • Lee et al.: the repeated instances of each motif
    are the result of evolutionary convergence.
  • Barabasi et al.: network motifs are associated
    with specific cellular tasks.

38
Conserved patterns of PPI in multiple species
Baker's yeast (Saccharomyces cerevisiae): 15,000
interactions, 5,000 interacting genes
Bacterial pathogen (Helicobacter pylori): 1,500
interactions, 700 interacting genes
Kelley et al. PNAS 2003
39
Goals
  • Separating true PPI from false positives.
  • Assign functional roles to interactions.
  • Predict interactions.
  • Organizing the data into models of cellular
    signaling and regulatory machinery.
  • How?
  • Use an approach based on evolutionary
    cross-species comparisons.

40
Interaction graph (per species)
  • Vertices are the organism's interacting proteins.
  • Edges are pairwise interactions between
    proteins.
  • Edges are weighted using a logistic regression
    model with three inputs:
  • A: number of times an interaction was observed.
  • For fly and worm: observed in one experiment.
  • B: correlation coefficient of the gene
    expression.
  • Shown to be correlated with interaction.
  • C: the proteins' small-world clustering
    coefficient.
  • Sum of the neighbors' logHG probs.
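
A logistic model of this kind maps the three inputs (A, B, C) to a reliability between 0 and 1. The sketch below shows only the functional form; the parameter values and the input numbers are placeholders chosen for illustration, not fitted values from the paper.

```python
import math


def interaction_probability(observations, expression_corr, clustering, beta):
    """Logistic model: P(true) = 1 / (1 + exp(-(b0 + b1*A + b2*B + b3*C)))."""
    b0, b1, b2, b3 = beta
    z = b0 + b1 * observations + b2 * expression_corr + b3 * clustering
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical parameters (intercept, then one weight per input).
beta = (-2.0, 1.0, 1.5, 0.5)

weak = interaction_probability(1, 0.0, 0.0, beta)    # seen once, no support
strong = interaction_probability(3, 0.8, 1.0, beta)  # seen 3 times, co-expressed
print(round(weak, 3), round(strong, 3))
```

With positive weights, every extra observation or higher co-expression pushes the edge weight towards 1.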

41
How do we find sub-network conservation?
  • Interactions within each species should
    approximate the desired structure
  • Pathway: signal transduction.
  • Cluster: protein complex.
  • Many-to-many correspondence between the sets of
    proteins.

42
Network alignment graph
  • Each node corresponds to k sequence-similar
    proteins.
  • BLAST E-value < 10^-7, considering the 10 best
    matches only.
  • Cannot be split into two parts with no sequence
    similarity between them.
  • An edge represents a conserved interaction.
  • Match -> One pair of proteins directly interacts
    and all other pairs include proteins at distance
    <= 2 in the interaction maps.
  • Gap -> All protein pairs are at distance 2 in the
    interaction maps.
  • Match-Gap -> At least max{2, k - 1} protein pairs
    directly interact.
  • A subgraph corresponds to a conserved
    sub-network.
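
The three edge rules can be checked mechanically once the per-species distances between the two proteins of each species are known. A minimal sketch: it reads the garbled distance threshold in the first rule as "at most 2" (an assumption), tests the rules in the order listed on the slide, and takes one shortest-path distance per species as input.

```python
def alignment_edge_type(distances):
    """Classify an edge between two alignment-graph nodes.

    distances[i] is the shortest-path distance, in species i's interaction
    map, between that species' proteins in the two nodes (1 = direct).
    Returns "match", "gap", "match-gap", or None when no edge is drawn.
    """
    k = len(distances)
    if k < 2:
        raise ValueError("need at least two species")
    direct = sum(1 for d in distances if d == 1)
    if direct >= 1 and all(d <= 2 for d in distances):
        return "match"
    if all(d == 2 for d in distances):
        return "gap"
    if direct >= max(2, k - 1):
        return "match-gap"
    return None


print(alignment_edge_type([1, 2]))     # direct in one species, distance 2 in the other
print(alignment_edge_type([2, 2]))     # distance 2 in both species
print(alignment_edge_type([1, 1, 3]))  # direct in two of three species
print(alignment_edge_type([1, 3]))     # too far apart in the second species
```

Unreachable protein pairs can be passed as `float("inf")`, which fails every distance test.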

43
A probabilistic model
[Equation not captured in the transcript; surviving
labels: P, S, and q(e), the interaction similarity]
44
Searching for conserved sub-networks
  • Identifying high-scoring subgraphs of the network
    alignment graph.
  • This problem is computationally hard.
  • We exhaustively find seeds: paths with 4 nodes.
  • Expand high-scoring seeds by greedily
    adding/removing nodes.
  • Filter subgraphs with a high degree of overlap
    (>80%).
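
The expand-and-filter steps can be sketched as a greedy local search followed by an overlap filter. This is an illustration under stated assumptions: a subgraph's score is taken to be the sum of edge weights inside it, overlap is measured relative to the smaller subgraph, `max_size` is a hypothetical cap, connectivity is not re-checked after a removal, and seed enumeration is left out.

```python
def score(nodes, weights):
    """Sum of the weights of all edges that lie inside the node set."""
    return sum(w for pair, w in weights.items() if pair <= nodes)


def neighbors(nodes, weights):
    """Nodes outside the set that share an edge with a node inside it."""
    out = set()
    for pair in weights:
        if pair & nodes:
            out |= pair - nodes
    return out


def expand_seed(seed, weights, max_size=15):
    """Greedily add or remove one node at a time while the score improves."""
    current = frozenset(seed)
    while True:
        candidates = [current | {v} for v in neighbors(current, weights)
                      if len(current) < max_size]
        candidates += [current - {v} for v in current if len(current) > 2]
        best = max(candidates, key=lambda s: score(s, weights), default=current)
        if score(best, weights) <= score(current, weights):
            return current
        current = best


def filter_overlapping(subgraphs, weights, max_overlap=0.8):
    """Keep subgraphs in score order, dropping those that overlap a kept one."""
    kept = []
    for s in sorted(subgraphs, key=lambda s: score(s, weights), reverse=True):
        if all(len(s & k) / min(len(s), len(k)) <= max_overlap for k in kept):
            kept.append(s)
    return kept


# Hypothetical weighted alignment graph; edges are keyed by frozenset pairs.
weights = {
    frozenset("ab"): 1.0, frozenset("bc"): 2.0,
    frozenset("cd"): -1.0, frozenset("ce"): 0.5,
}
grown = expand_seed({"a", "b"}, weights)
print(sorted(grown))
print(filter_overlapping([grown, frozenset("abc"), frozenset("xy")], weights))
```

Because each accepted move strictly raises the score, the loop always terminates, but it can stop in a local optimum, which is the usual price of a greedy heuristic for a hard search problem.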

45
Statistical evaluation of sub-networks
  • Randomized data are produced:
  • Random shuffling of each of the interaction
    graphs.
  • Randomizing the sequence-similarity
    relationships.
  • Find the highest-scoring sub-networks of a given
    size.
  • The p-value is computed from the distribution of
    the top scores.
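
Given the top scores from the randomized runs, an empirical p-value is the share of random runs that score at least as well as the real sub-network. A minimal sketch; the "+1" correction that avoids a p-value of exactly zero is a common convention assumed here, and the scores are invented.

```python
def empirical_p_value(observed_score, random_top_scores):
    """Fraction of randomized runs whose top score reaches the observed one."""
    at_least = sum(1 for s in random_top_scores if s >= observed_score)
    return (at_least + 1) / (len(random_top_scores) + 1)


# Hypothetical top scores from 9 randomized data sets.
random_scores = [3.1, 2.8, 4.0, 3.5, 2.2, 3.9, 3.0, 2.7, 3.3]
print(empirical_p_value(5.2, random_scores))  # beats every random run
print(empirical_p_value(3.4, random_scores))  # matched or beaten by 3 runs
```

The smallest p-value this estimate can report is 1/(n + 1), so a cutoff such as 0.01 needs at least 99 randomized runs.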

46
The final product
47
3-way Comparison
  • S. cerevisiae: 4,389 proteins, 14,319
    interactions
  • C. elegans: 2,718 proteins, 3,926 interactions
  • D. melanogaster: 7,038 proteins, 20,720
    interactions

Sharan et al. PNAS 2005
48
Multiple Network Alignment
Subnetwork search
Network alignment
Preprocessing: interaction scores from logistic
regression on observations, expression
correlation, clustering coefficient
Filtering and visualizing: p-value < 0.01, <= 80%
overlap
Conserved paths
Conserved clusters
Protein groups
Conserved interactions
49
(No Transcript)
50
Reduced false positives
  • Compared these conserved clusters to known
    complexes in yeast:
  • Pure cluster: contains >2 annotated proteins, and
    >1/2 of these share the same annotation.
  • 94% pure clusters (> 83% for single-species
    clusters).
  • Did sticky proteins bias the clusters?
  • Of 39 proteins (> 50 neighbors), only 10 were
    included in conserved clusters. And they were
    annotated so.
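
The purity criterion can be stated as a short check. The sketch follows the thresholds as they read on this slide (more than 2 annotated proteins, more than half sharing one annotation); whether the original cutoffs were strict or inclusive is not recoverable from the transcript, so the strict comparisons are an assumption, and the complex labels are invented.

```python
from collections import Counter


def is_pure(annotations):
    """annotations: one complex label per protein, None if unannotated."""
    labelled = [a for a in annotations if a is not None]
    if len(labelled) <= 2:
        return False  # too few annotated proteins to judge
    most_common_count = Counter(labelled).most_common(1)[0][1]
    return most_common_count > len(labelled) / 2


# Hypothetical clusters with made-up complex labels.
print(is_pure(["proteasome", "proteasome", "ribosome", None]))  # 2 of 3 agree
print(is_pure(["proteasome", "ribosome", "exosome"]))           # no majority
print(is_pure(["proteasome", "proteasome", None]))              # only 2 annotated
```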

51
Cross Validation Function
  • Guilt by association.
  • Enrichment of GO annotation (p < 0.01).
  • More than half of the annotated proteins had the
    annotation.
  • Outperforms the sequence-based approach (37-53%).
52
Cross Validation Interaction
  • 1. Evidence that proteins with similar sequences
    interact within other species.
  • 2. Co-occurrence of these proteins in the same
    conserved cluster.

53
Wet Validation Interaction
  • The tests were performed by using two-hybrid
    assays.
  • Of the 65 predicted yeast interactions:
  • 5 were self-inducing.
  • 31 tested positive.

54
Conclusions
  • Associates proteins that are not necessarily each
    other's best sequence match.
  • 177/679 conserved clusters.
  • 31/129 conserved paths.
  • Inter-module interaction is reinforced by
    inter-species observations.
  • Success rate of 40-52% >> 0.042 for a random PPI
    prediction.
  • Many PPI circuits are conserved over evolution.

55
Thanks!!!
  • Recoverin, a calcium-activated myristoyl switch.

56
GO: Gene Ontology
  • all : all ( 171472 )
    • GO:0008150 biological_process ( 109503 )
      • GO:0007582 physiological process ( 70981 )
        • GO:0008152 metabolism ( 41395 )
          • GO:0009058 biosynthesis ( 10256 )
            • GO:0009059 macromolecule biosynthesis ( 6876 )
              • GO:0006412 protein biosynthesis ( 4611 )
          • GO:0043170 macromolecule metabolism ( 17198 )
            • GO:0009059 macromolecule biosynthesis ( 6876 )
              • GO:0006412 protein biosynthesis ( 4611 )
            • GO:0019538 protein metabolism ( 12856 )
              • GO:0006412 protein biosynthesis ( 4611 )
    • GO:0005575 cellular_component ( 98453 )
    • GO:0003674 molecular_function ( 108120 )

57
Interaction distribution
58
Expression data
  • Yeast - 794 conditions.
  • Fly - over 90 CC time points and 170 profiles.
  • Worm - over 553 conditions.

59
Edge weight
  • where β0, . . . , β3 are the parameters of the
    distribution.
  • Maximize the likelihood:
  • Positive: MIPS interactions.
  • Negative: random or false positives in the
    cross-validation test.
  • Yeast - 1,006 positive and negative examples.
  • Fly - 96 positive and negative examples.
  • Worm - 24 positive and 50 negative examples.
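
Maximizing the likelihood of a logistic model over labelled positive and negative examples can be sketched with plain gradient ascent. This is a generic illustration, not the authors' fitting procedure: the optimizer, learning rate, step count and the one-feature training set are all assumptions.

```python
import math


def predict(beta, x):
    """Logistic probability for feature vector x; beta[0] is the intercept."""
    z = beta[0] + sum(b * v for b, v in zip(beta[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(samples, labels, learning_rate=0.1, steps=2000):
    """Gradient ascent on the mean log-likelihood of the labelled examples."""
    beta = [0.0] * (len(samples[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(samples, labels):
            error = y - predict(beta, x)  # gradient of the log-likelihood
            grad[0] += error
            for j, v in enumerate(x, start=1):
                grad[j] += error * v
        beta = [b + learning_rate * g / len(samples) for b, g in zip(beta, grad)]
    return beta


# Hypothetical training set: one feature (e.g. number of observations),
# label 1 for a positive example and 0 for a negative one.
samples = [(0.0,), (1.0,), (2.0,), (3.0,)]
labels = [0, 0, 1, 1]
beta = fit_logistic(samples, labels)
print([round(predict(beta, x), 2) for x in samples])
```

With as few labelled examples as the worm set has, estimates of this kind are unstable, which is worth keeping in mind when comparing edge weights across species.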

60
71 conserved regions: 183 significant clusters
and 240 significant paths.
61
A probabilistic model
  • Ms - the sub-network model.
  • Mn - the null model.
  • Ouv - the set of available observations on u-v.
  • Puv - fraction of graphs in the order-preserving
    graph family that contain edge (u,v).
  • Tuv/Fuv - the event that edge (u,v) is true/false.

62
A probabilistic model
  • Each species' interaction map was randomly
    constructed.
  • Randomizing assumptions:
  • Each interaction should be present independently
    with high probability.
  • The probability depends on the proteins' total
    number of connections in the network.

63
Why Yeast?

Comparative Genomics of the Eukaryotes. Rubin
GM. et al., Science 2000
64
Analysis of protein complexes
  1. Isolation: a straightforward method, using
    affinity chromatography. A target protein is
    attached to polymer beads that are packed into a
    column. Cell proteins are washed through the
    column. Proteins that interact with the target
    protein adhere to the affinity matrix and are
    eluted later.

65
Analysis of protein complexes
  1. Isolation: co-immunoprecipitation. An antibody
    that recognizes the target protein is used to
    isolate the protein. Usually there isn't a
    highly specific antibody for the target protein,
    so a chimeric protein is formed from the target
    protein and an epitope tag. A common tag is the
    enzyme glutathione S-transferase (GST).

66
Analysis of protein complexes
  1. Isolation: isolation of the complex using the
    chimera

Glutathione coated beads
Cell extract
Glutathione solution
67
MIPS
  • Munich Information Center for Protein Sequences
    (MIPS).
  • Hierarchy Structure.
  • Only manually annotated complexes from DIP.
  • Left with 486 proteins spanning 57 categories at
    level 3.
