Title: Chapter 4: Protein Interactions and Disease
1Chapter 4 Protein Interactions and Disease
- Mileidy W. Gonzalez, Maricel G. Kann
Presented by Md Jamiul Jahid
2What to learn in this chapter
- Experimental and computational methods to detect
protein interactions
- Protein networks and disease
- Studying the genetic and molecular basis of
disease
- Using protein interactions to understand disease
3What is Protein interaction
- Proteins are the main agents of biological function
- Proteins determine the phenotype of all organisms
- Proteins don't function alone
- interaction with other proteins
- interaction with other molecules (e.g. DNA, RNA)
4What is Protein interaction
- Protein interaction generally means physical
contact between proteins and their interacting
partners.
- Proteins associate physically to create
macromolecular structures of various complexities
and heterogeneities
- Protein pairs can form dimers, multi-protein
complexes or long chains
5What is Protein interaction
- But the interaction need not always be physical
- Besides physical contact, protein interaction
can also mean metabolic or genetic correlation or
co-localization
- Metabolic -> in the same pathway
- Genetically correlated -> co-expressed
- Co-localization -> proteins in the same cellular
compartment
6PPI Network
- A PPI network represents interactions among proteins
- Each node represents a protein
- Each link represents an interaction
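A minimal sketch of the node/link representation described on this slide. The protein names and interactions are made-up placeholders, not data from the chapter, and `build_ppi_network` is a hypothetical helper name.

```python
from collections import defaultdict


def build_ppi_network(interactions):
    """Build an undirected PPI network as an adjacency map.

    Each protein is a node; each (a, b) pair is a link between two nodes.
    """
    network = defaultdict(set)
    for a, b in interactions:
        network[a].add(b)
        network[b].add(a)
    return dict(network)


# Hypothetical interaction list, for illustration only.
example_interactions = [("P1", "P2"), ("P2", "P3"), ("P2", "P4")]
example_network = build_ppi_network(example_interactions)
```

Storing neighbors in sets keeps duplicate reports of the same interaction from being counted twice.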
7PPI Network
A PPI network of the proteins encoded by
radiation-sensitive genes in mouse, rat, and
human, reproduced from [89].
8PPI Network
- Some uses of PPI networks
- To learn about the evolution of different proteins
- To learn about the different systems they are
involved in
- A network can be used to infer interactions in
other species
- Helpful for identifying functions of uncharacterized
proteins
9Experimental Identification of PPIs
- Biophysical Methods
- High-Throughput Methods
- Direct high-throughput methods
- Indirect high-throughput methods
10Biophysical Methods
- Mainly biochemical, physical and genetic methods
- X-ray crystallography
- NMR spectroscopy
- Fluorescence
- Atomic force microscopy
11Biophysical Methods
- Biophysical methods identify interacting partners
and the chemical features of the interaction
- Problem
- Time and resource consumption is high
- Applicable only at small scale
12High Throughput Methods
- Direct high-throughput methods
- Indirect high-throughput methods
13Direct high-throughput methods
- Yeast two-hybrid (Y2H)
- Most common
- Each of the two proteins is fused to one domain of
a transcription factor (the DNA-binding domain or
the activation domain)
- If the proteins interact -> the transcription
complex is activated
14Direct high-throughput methods
Y2H overview
- Image courtesy Wikipedia.org
15Direct high-throughput methods
- Problem (Yeast two-hybrid)
- Cannot identify complex interactions, i.e. those
involving more than two proteins
- Cannot test proteins that initiate transcription
on their own
16Indirect high-throughput methods
- Look at characteristics of the genes that encode
the proteins
- Gene co-expression
- Assumption: genes of interacting proteins must be
co-expressed to provide the products for the protein
interaction
17Computational Predictions of PPIs
- Empirical predictions
- Theoretical predictions
- Coevolution at the residue level
- Coevolution at the full sequence level
18Empirical predictions
- Based on
- Relative frequency of interacting domains
- Maximum likelihood estimation
- Co-expression
- Disadvantage
- Rely on existing network
- Propagate inaccuracies
19Theoretical Predictions of PPIs Based on
Coevolution
- Coevolution at the residue level
- Coevolution at the full sequence level
- In biology, coevolution is "the change of a
biological object triggered by the change of a
related object."
20Coevolution at the residue
- Pairs of residues of the same protein can
co-evolve because of three-dimensional proximity or
shared function
- A pair of proteins is assumed to interact if they
show enrichment of the same correlated mutations
21Coevolution at the full sequence level
- Basic idea: changes in one protein are
compensated by correlated changes in its
interacting partners to preserve the interaction
- -> interacting proteins have phylogenetic trees
with topologies more similar than expected by chance
- Mirrortree is the most accurate option to identify
interactions
22Mirrortree
- Identify the orthologs of both proteins in common
species
- Create a multiple sequence alignment (MSA) for
each protein and its orthologs
- Create a distance matrix from each MSA
- Calculate the correlation coefficient between the
two distance matrices
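The last Mirrortree step can be sketched as follows. This is an illustration only: it assumes the two distance matrices have already been computed from the MSAs, with species in the same order in both, and it uses a plain Pearson correlation over the upper triangles. The function names and the toy matrices are hypothetical.

```python
import math


def upper_triangle(matrix):
    """Flatten the values above the diagonal of a square distance matrix."""
    n = len(matrix)
    return [matrix[i][j] for i in range(n) for j in range(i + 1, n)]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant distances")
    return cov / math.sqrt(var_x * var_y)


def mirrortree_score(dist_a, dist_b):
    """Correlate two distance matrices built over the same ordered species."""
    if len(dist_a) != len(dist_b):
        raise ValueError("matrices must cover the same species")
    return pearson(upper_triangle(dist_a), upper_triangle(dist_b))


# Toy 4-species matrices (made-up values, not real alignments).
dist_a = [[0.0, 0.1, 0.4, 0.5],
          [0.1, 0.0, 0.3, 0.6],
          [0.4, 0.3, 0.0, 0.2],
          [0.5, 0.6, 0.2, 0.0]]
# dist_b mirrors dist_a at twice the evolutionary rate.
dist_b = [[2 * d for d in row] for row in dist_a]
# dist_c has an unrelated pattern of distances.
dist_c = [[0.0, 0.6, 0.1, 0.2],
          [0.6, 0.0, 0.5, 0.1],
          [0.1, 0.5, 0.0, 0.6],
          [0.2, 0.1, 0.6, 0.0]]

score_ab = mirrortree_score(dist_a, dist_b)
score_ac = mirrortree_score(dist_a, dist_c)
```

A score near 1 means the two trees mirror each other, which is read as evidence of interaction. The unrelated toy matrix gives a negative score.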
23Mirrortree
24Different methods for computing PPI
25Protein Network and Disease
- Studying the Genetic Basis of Disease
- Studying the Molecular Basis of Disease
26Studying the Genetic Basis of Disease
- After Mendelian genetics in the 1900s, a lot of
effort went into categorizing disease genes
- Positional cloning: the process of isolating a gene
on the chromosome based on its position
- Genes identified by this approach
- cystic fibrosis, Huntington's disease (HD), breast
cancer etc.
- Still, mutations in a gene do not always correlate
with symptoms
27Studying the Genetic Basis of Disease
- Several reasons
- pleiotropy
- influence of other genes
- environmental factors
28Studying the Genetic Basis of Disease
- Pleiotropy: when a single gene produces multiple
phenotypes
- Problem: complicates the disease elucidation process
because a mutation in such a gene can affect
some, all or none of its traits.
- That is, a mutation in a pleiotropic gene may cause
multiple syndromes or cause disease in only some
of the biological processes
29Studying the Genetic Basis of Disease
- Influence of other genes
- Interact synergistically
- Modify one another
30Studying the Genetic Basis of Disease
- Environmental factors
- diet
- infection etc.
- Cancers are believed to be caused by several genes
and affected by several environmental factors
31Studying the Molecular Basis of Disease
- Identifying the genes associated with a disease is
important
- Molecular details are also important to identify
the mechanisms triggering, participating in and
controlling the perturbed biological functions
32The role of protein interaction in disease
- Protein interactions provide a vast source of
molecular information because they are
involved in
- metabolic
- signaling
- immune
- gene regulatory networks
- Protein interactions should be a key target for
understanding the molecular basis of disease
33The role of protein interaction in disease
- Protein-DNA interaction disruption
- Protein misfolding
- New undesired protein interaction
34Protein-DNA interaction disruption
- p53 tumor suppressor
- Mutations in the p53 DNA-binding domain destroy its
ability to bind its target DNA sequences
- This prevents several of the anticancer
mechanisms it mediates
35Protein misfolding and undesired interaction
- Protein misfolding
- Protein folding: the process by which a protein
reaches its 3D functional shape
- New undesired protein interaction
- Main cause of several diseases such as Huntington's
disease, cystic fibrosis, Alzheimer's disease
etc.
36Using PPI network to understand disease
- PPI networks can help identify novel pathways
- PPI networks can help explore differences
between healthy and disease states
- Protein interaction studies play a major role in
the prediction of genotype-phenotype associations
37Using PPI network to understand disease
- New diagnostic tools can result from
genotype-phenotype associations
- Can identify disease subnetworks
- Drug design
38PPI Network can help identify novel pathway
- PPI network: maps physical and functional
interactions of protein pairs
- Pathway: represents genetic, metabolic, signaling
or neural processes as a series of sequential
biochemical reactions
39PPI Network can help identify novel pathway
- Pathways alone cannot uncover disease details
- When performing pathway analysis to study disease,
differential expression is the key
- The majority of human genes haven't been assigned to
a pathway
40PPI Network can help identify novel pathway
- In this scenario PPI networks can help
identify novel pathways
- Some key findings
- Disease genes generally occupy peripheral
positions in the PPI network
- Few cancer genes are hubs
- Disease genes tend to cluster together
- Proteins involved in similar phenotypes are highly
connected
41PPI network can be helpful to explore difference
between healthy and disease states
Source: Dynamic modularity in protein interaction
networks predicts breast cancer outcome, Nature
Biotechnology 27, 2009
42 Genotype-phenotype association and new disease
genes
- New disease genes can be predicted from the
interacting partners of already known disease genes
- Topological features can be used to predict
disease genes
- 970/5000 genes are disease genes
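One simple way to act on the "interacting partners" idea is to score each candidate gene by how many of its partners are already known disease genes. The sketch below is a toy illustration of that scoring, not the classifier behind the numbers on this slide. The gene names, the network and `rank_candidates` are all hypothetical.

```python
def rank_candidates(network, known_disease_genes):
    """Rank non-disease genes by their number of known-disease-gene partners.

    `network` maps each gene to the set of genes it interacts with.
    Genes with no disease-gene partner are left out of the ranking.
    """
    scores = {}
    for gene, partners in network.items():
        if gene in known_disease_genes:
            continue
        hits = len(partners & known_disease_genes)
        if hits:
            scores[gene] = hits
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Made-up network and disease-gene set, for illustration only.
toy_network = {
    "G1": {"G2", "G3"},
    "G2": {"G1", "G3", "G4"},
    "G3": {"G1", "G2"},
    "G4": {"G2", "G5"},
    "G5": {"G4"},
}
toy_known = {"G1", "G2"}
toy_ranking = rank_candidates(toy_network, toy_known)
```

Here `G3` ranks first because both of its partners are known disease genes, while `G5` is not ranked at all.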
43Disease subnetwork identification
44Disease subnetwork identification
45Drug design
- Hub nodes in a PPI network are not good drug targets
- Less connected nodes may be good drug targets
46Exercise
- Objective: investigate Epstein-Barr virus (EBV)
pathogenesis using PPIs
- EBV is one of the most common human viruses
- 95% of adults are infected with this virus
- EBV replicates in epithelial cells and establishes
latency in B lymphocytes
- 35-50% of the time it causes mononucleosis
- Sometimes cancer
47Dataset
- Dataset S1: EBV interactome
- Dataset S2: EBV-Human interactome
- Software requirement
- Cytoscape (download link: www.cytoscape.org)
48Questions
- How many nodes and edges are featured in this
network?
- How many self interactions does the network have?
- How many pairs are not connected to the largest
connected component?
- Define the following topological parameters and
explain how they might be used to characterize a
protein-protein interaction network: node degree
(or average number of neighbors), network
heterogeneity, average clustering coefficient
distribution, network centrality.
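The exercise expects these numbers to be read off in Cytoscape. As a cross-check, several of them can also be computed by hand. The sketch below assumes the interactome has been exported as a list of (protein, protein) pairs, and uses a made-up edge list rather than Dataset S1. `summarize_network` is a hypothetical helper.

```python
from collections import deque
from itertools import combinations


def summarize_network(edges):
    """Basic topology of an undirected network given as (a, b) pairs."""
    neighbors = {}
    self_interactions = set()
    links = set()
    for a, b in edges:
        neighbors.setdefault(a, set())
        neighbors.setdefault(b, set())
        if a == b:
            self_interactions.add(a)
        else:
            links.add(frozenset((a, b)))
            neighbors[a].add(b)
            neighbors[b].add(a)
    if not neighbors:
        raise ValueError("empty network")

    # Connected components by breadth-first search.
    seen = set()
    component_sizes = []
    for start in neighbors:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for nxt in neighbors[node]:
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        component_sizes.append(size)
    component_sizes.sort(reverse=True)

    def clustering(node):
        """Fraction of a node's neighbor pairs that are themselves linked."""
        k = len(neighbors[node])
        if k < 2:
            return 0.0
        linked = sum(1 for u, v in combinations(neighbors[node], 2)
                     if v in neighbors[u])
        return 2.0 * linked / (k * (k - 1))

    n = len(neighbors)
    return {
        "nodes": n,
        "edges": len(links),  # self interactions are counted separately
        "self_interactions": len(self_interactions),
        "component_sizes": component_sizes,
        "average_neighbors": sum(len(p) for p in neighbors.values()) / n,
        "average_clustering": sum(clustering(x) for x in neighbors) / n,
    }


# Made-up edge list: a triangle with a tail, one self interaction, one pair.
toy_edges = [("A", "B"), ("B", "C"), ("A", "C"), ("C", "D"),
             ("E", "E"), ("F", "G")]
toy_summary = summarize_network(toy_edges)
```

In this sketch self interactions are excluded from the degree and edge counts. Cytoscape may count them differently, so the two sets of numbers should be compared with that in mind.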
49Questions
- How many unique proteins were found to interact
in each organism?
- How many interactions are mapped?
- How many human proteins are targeted by multiple
(i.e. how many individual human proteins interact
with >1) EBV proteins?
- How does identifying the multi-targeted human
proteins help you understand the pathogenicity of
the virus? Hint: Speculate about the role of the
multi-targeted human proteins in the virus life
cycle.
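Counting the multi-targeted human proteins amounts to grouping the virus-host pairs by human protein. The sketch below assumes the EBV-Human interactome is available as (EBV protein, human protein) pairs. That layout, the protein names and `multi_targeted` are assumptions for illustration, not the actual format of Dataset S2.

```python
from collections import defaultdict


def multi_targeted(virus_host_pairs, min_partners=2):
    """Map each human protein to its viral partners, keeping only those
    targeted by at least `min_partners` distinct viral proteins."""
    partners = defaultdict(set)
    for viral, human in virus_host_pairs:
        partners[human].add(viral)
    return {human: sorted(viral_set)
            for human, viral_set in partners.items()
            if len(viral_set) >= min_partners}


# Made-up pairs; the repeated pair is only counted once.
toy_pairs = [
    ("EBV_a", "HUMAN_1"),
    ("EBV_b", "HUMAN_1"),
    ("EBV_a", "HUMAN_2"),
    ("EBV_a", "HUMAN_1"),
    ("EBV_c", "HUMAN_3"),
    ("EBV_b", "HUMAN_3"),
]
toy_multi = multi_targeted(toy_pairs)
```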
50Questions
- Based on the degree property, what can you
deduce about the connectedness of ET-HPs? What
does this tell you about the kind of proteins
(i.e. what type of network component) EBV
targets?
51Questions
- What do the number and size of the largest
components tell you about the inter-connectedness
of the ET-HP subnetwork?
52Questions
- Why is distance relevant to network centrality?
What is unusual about the distance of ET-HPs to
other proteins and what can you deduce about the
importance of these proteins in the Human-Human
interactome?
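Distance here means the number of links on the shortest path between two proteins. A node with a small average distance to all others sits near the center of the network. The sketch below computes this with a breadth-first search on a made-up chain of five proteins. The network and function names are hypothetical, and unreachable nodes are simply ignored.

```python
from collections import deque


def distances_from(network, source):
    """Shortest-path length (in links) from `source` to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in network[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist


def average_distance(network, source):
    """Mean distance from `source` to the other nodes it can reach."""
    dist = distances_from(network, source)
    others = [d for node, d in dist.items() if node != source]
    if not others:
        return None  # isolated node: no distance to average
    return sum(others) / len(others)


# Made-up chain A - B - C - D - E.
toy_chain = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}
center_avg = average_distance(toy_chain, "C")
end_avg = average_distance(toy_chain, "A")
```

The middle protein `C` has a smaller average distance than the end protein `A`, which is the sense in which shorter distances indicate a more central node.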
53Questions
- Based on your conclusions from questions i-iii,
explain why EBV targets the ET-HP set over the
other human proteins and speculate on the
advantages to virus survival the protein set
might confer.
54Thanks