Protein-Protein Interaction - PowerPoint PPT Presentation

About This Presentation
Title:

Protein-Protein Interaction

Slides: 74
Provided by: bioInform1
Transcript and Presenter's Notes

Title: Protein-Protein Interaction


1
Protein-Protein Interaction
  • L519 presentation

2
Group Members and Content
  • Introduction
  • - biological aspect of protein-protein
    interaction. (Zhenli Su)
  • Protein-protein interaction databases
  • - BIND (Xin Hong)
  • - DIP (Xiang Zhou)
  • Pathway databases and Algorithms (Paul Ma)
  • Visualization Tools (James Coleman)
  • presented by Xiang Zhou

3
Biological Aspects of Protein-Protein Interaction
  • Zhenli Su

4
  • Introduction to protein-protein interactions
  • The importance of the interactions
  • Impact of protein interaction technologies on
    other fields
  • The types of protein interactions
  • The methods for detecting protein interactions

5
Introduction to protein-protein interactions
  • Proteins control and mediate many of the
    biological activities of cells
  • A cell is not static
  • Changes in shape
  • Division
  • Metabolism
  • All cells are not equivalent
  • Lymphoid
  • Neural

6
Why are protein-protein interactions so
important?
  • The binding of one signaling protein to another
    can have a number of consequences:
  • Such binding can serve to recruit a signaling
    protein to a location where it is activated
    and/or where it is needed to carry out its
    function.
  • The binding of one protein to another can induce
    conformational changes that affect activity or
    accessibility of additional binding domains,
    permitting additional protein interactions.

7
Why are protein-protein interactions so
important?
  • Imagine a cell in which, suddenly, the specific
    interactions between proteins would disappear.
    This unfortunate cell would become deaf and
    blind, paralytic and finally would disintegrate,
    because specific interactions are involved in
    almost any physiological process.

8
Impact on other fields
  • Cancer Biology
  • The study of protein-protein interactions has
    provided important insights into the functions of
    many of the known oncogenes, tumor suppressors,
    and DNA repair proteins.
  • Pharmacogenetics
  • Pharmacogenetic research has expanded to
    include the study of drug transporters, drug
    receptors, and drug targets.

9
The types of protein interactions
  • Binary protein-protein interactions
  • Scaffolding proteins
  • http://www.udel.edu/chem/bahnson/chem667/crotty/scaffolding_proteins.html#scaffolding

10
The types of protein interactions: another
classification
  • Metabolic and signaling (genetic) pathways
  • Morphogenic pathways in which groups of proteins
    participate in the same cellular function during
    a developmental process
  • Structural complexes and molecular machines in
    which numerous macromolecules are brought together

11
Signaling pathways
12
Morphogenic pathways
13
Structural complexes and molecular machines
  • Chaperones: protein refolding machines
  • http://www-cryst.bioc.cam.ac.uk/cgi-bin/cgiwrap/homstrad/showpage.cgi?family=GroEL&disp=str
  • http://www.nature.com/nsb/web_specials/movies/saibil_side.html

14
Experimental methods
  • Tagged Fusion Proteins
  • Coimmunoprecipitation
  • Yeast Two-hybrid
  • Biacore
  • Atomic Force Microscopy (AFM)
  • Fluorescence Resonance Energy Transfer (FRET)
  • X-ray Diffraction

15
Experimental methods
  • The first level comprises atomic observation, in
    which the protein interaction is detected using,
    for example, X-ray crystallography. These
    experiments can yield specific information on the
    atoms or residues involved in the interaction.
  • The second is direct interaction observation,
    where a protein interaction between two partners
    can be detected, as in a two-hybrid experiment.
  • At a third level of observation, multi-protein
    complexes can be detected using methods such as
    immunoprecipitation or mass-spectrometric
    analysis. This type of experiment does not unveil
    the chemical detail of the interactions or even
    reveal which proteins are in direct contact, but
    it gives information as to which proteins are
    found in a complex at a given time.
  • The fourth category comprises measurements at the
    cellular level, where an activity bioassay is
    used to observe an interaction; for example,
    proliferation assays of cells stimulated by a
    receptor-ligand interaction.

16
Protein-Protein Interaction Databases
  • BIND
  • (Biomolecular Interaction Network Database)
  • Xin Hong

17
Introduction of BIND
  • Background
  • What is BIND
  • MCODE Algorithm
  • How to use BIND
  • Reference

18
Background
  • Recent advances in proteomics technologies such
    as two-hybrid, phage display and mass
    spectrometry have enabled us to create a detailed
    map of biomolecular interaction networks. Initial
    mapping efforts have already produced a wealth of
    data. As the size of the interaction set
    increases, databases and computational methods
    will be required to store, visualize and analyze
    the information in order to effectively aid in
    knowledge discovery.
  • Many websites cover protein-protein
    interactions; only several are shown
    here.
  • BIND (Biomolecular Interaction Network Database)
  • DIP (Database of Interacting Proteins)
  • Protein-Protein Interaction Server
  • Protein-Protein Interface

19
What is BIND?
  • The Biomolecular Interaction Network Database is
    a database designed to store full descriptions of
    interactions, molecular complexes and pathways.
  • Development of the BIND 2.0 data model has led to
    the incorporation of virtually all components of
    molecular mechanisms, including interactions
    between any two molecules composed of proteins,
    nucleic acids and small molecules. Chemical
    reactions, photochemical activation and
    conformational changes can also be described.
  • Everything from small molecule biochemistry to
    signal transduction is abstracted in such a way
    that graph theory methods may be applied for data
    mining.
  • The database can be used to study networks of
    interactions, to map pathways across taxonomic
    branches and to generate information for kinetic
    simulations.
  • BIND anticipates the coming large influx of
    interaction information from high-throughput
    proteomics efforts, including detailed
    information about post-translational
    modifications from mass spectrometry.

20
  • What kind of data is stored in BIND?
  • INTERACTION: the interaction between two
    molecules, as well as any chemical reactions that
    occur as a direct result of the interaction.
  • Examples: protein-protein (P-P), protein-nucleic
    acid (P-n), protein-small molecule (P-s)
    (phosphorylation of a protein, methylation of
    DNA, hydrolysis of a sugar)
  • COMPLEX: describes a molecular complex by listing
    the series of interaction records that are
    present in the complex.
  • Examples: multi-subunit enzyme, actin fiber,
    ribosome
  • PATHWAY: describes a cellular process as a
    sequential list of interaction records and their
    associated Chemical Action data.
  • Examples: cell-signaling pathway, synthesis of an
    amino acid, transcription and splicing of a
    pre-messenger RNA.
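
The three record types above can be sketched as plain data classes. This is only an illustration of the idea that complexes and pathways are built from lists of interaction records; the class names, field names and molecule names (A, B, C) are hypothetical and do not reproduce the actual BIND ASN.1 specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Interaction:
    """Two molecules plus any chemical actions that result from their interaction."""
    record_id: int
    molecule_a: str
    molecule_b: str
    chemical_actions: List[str] = field(default_factory=list)


@dataclass
class Complex:
    """A molecular complex, described by the interaction records present in it."""
    name: str
    interaction_ids: List[int]


@dataclass
class Pathway:
    """A cellular process, described by a sequential list of interaction records."""
    name: str
    interaction_ids: List[int]  # order is meaningful for a pathway


def molecules_in(record, interactions):
    """List the molecules a Complex or Pathway refers to, in first-seen order."""
    by_id = {inter.record_id: inter for inter in interactions}
    seen = []
    for record_id in record.interaction_ids:
        inter = by_id[record_id]
        for molecule in (inter.molecule_a, inter.molecule_b):
            if molecule not in seen:
                seen.append(molecule)
    return seen


# Toy data: molecule names and the chemical action are placeholders.
interactions = [
    Interaction(1, "A", "B"),
    Interaction(2, "B", "C", ["phosphorylation"]),
]
pathway = Pathway("toy pathway", [1, 2])
print(molecules_in(pathway, interactions))
```

Because every higher-level object is just a list of interaction identifiers, the whole database can be read as a graph, which is what makes the graph-theory methods mentioned earlier applicable.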

21

22
What BIND can and cannot do right now
  • The design of the BIND database structure is a
    robust one that has been built to accept data
    from all cell systems. The interface that you see
    is NOT the data structure, and it does not
    accurately reflect the full potential of
    the database. Tools are being built to realize
    this potential, and changes are constantly
    being made to the interface to make the database
    easier to use and understand.
  • BIND is currently able to accept records that
    describe protein-protein and protein-nucleic acid
    interactions.
  • The BIND data specification is available as ASN.1
    and XML DTD. ASN.1 data can describe details
    underlying biochemical and genetic networks. XML
    versions of all data with accompanying DTDs are
    supported through the use of the NCBI programming
    toolkit.

23
Demonstrating the use of Binding sites and
Binding Site Pairs for a protein-protein
interaction
  • The grey shapes represent autonomous domains in
    proteins A and B that mediate a protein-protein
    interaction. The black lines in these grey shapes
    represent polypeptide chains that continue
    outside of these domains to make up the rest of
    proteins A and B.
  • The protein-protein interaction between these two
    domains is mediated by two Binding Site pairs.
    The first pair (a salt bridge) consists of a
    single amino acid on molecule A (SLID 0) and a
    single amino acid on B (SLID 0). These two amino
    acids form the first Binding Site Pair. The
    second pair consists of a range of amino acids on
    A (SLID 1) and a range of amino acids on B (SLID
    1). These two ranges of amino acids form the
    second Binding Site Pair.

24
The electronic version of this article is the
complete one and can be found online at
http://www.biomedcentral.com/1471-2105/4/2
  • The Algorithm: MCODE
  • - An automated method for finding molecular
    complexes in large protein interaction networks.
  • The MCODE algorithm operates in three stages:
    vertex weighting, complex prediction and,
    optionally, post-processing to filter or add
    proteins in the resulting complexes by certain
    connectivity criteria.
  • Background
  • Recent advances in proteomics technologies such
    as two-hybrid, phage display and mass
    spectrometry have enabled us to create a detailed
    map of biomolecular interaction networks.

25
The Algorithm: MCODE - An automated method for
finding molecular complexes in large protein
interaction networks
  • Results: The algorithm has the advantage over
    other graph clustering methods of having a
    directed mode that allows fine-tuning of clusters
    of interest without considering the rest of the
    network and allows examination of cluster
    interconnectivity, which is relevant for protein
    networks. Protein interaction and complex
    information from the yeast Saccharomyces
    cerevisiae was used for evaluation.
  • Conclusion: Dense regions of protein interaction
    networks can be found, based solely on
    connectivity data, many of which correspond to
    known protein complexes. The algorithm is not
    affected by a known high rate of false positives
    in data from high-throughput interaction
    techniques.
  • The program is available from
    ftp://ftp.mshri.on.ca/pub/BIND/Tools/MCODE
  • http://www.biomedcentral.com/1471-2105/4/2

26
How to use BIND Pathway
  • The INAD Pathway in Drosophila Photoreceptors - A
    Tutorial
  • http://bind.ca/index2.phtml?site=tutor

27
  • How to use BIND
  • BIND Interaction Viewer Java Applet
  • BIND Interaction Viewer Java applet showing how
    molecules can be connected in the database from
    molecular complex to small molecule.
  • Yellow: protein
  • Purple: small molecule
  • White: molecular complex
  • Red: a square that is fixed in place and will not
    be moved by the graph layout algorithm.
  • This session was seeded by the interaction
    between human LAT and Grb2 proteins involved in
    cell signaling in the T-cell.

28
  • Reference
  • Gary D. Bader et al., BMC Bioinformatics, 2003 Jan
    13;4(1):2
  • An automated method for finding molecular
    complexes in large protein interaction networks
  • Gary D. Bader, Nucleic Acids Research, 2001, Vol.
    29, No. 1, 242-245. BIND: The Biomolecular
    Interaction Network Database
  • http://bind.ca/
  • http://nar.oupjournals.org/cgi/content/full/29/1/242

29
Protein-Protein Interaction Databases
  • DIP
  • (Database of Interacting Proteins)
  • Xiang Zhou

30
What is DIP?
  • Established in 1999 at UCLA
  • Primary goal
  • extract and integrate protein-protein interaction
    information and build a user-friendly environment.
  • The usage of DIP

31
The usage of DIP
  • Study
  • Protein function
  • Protein-protein relationship
  • Evolution of protein-protein interaction
  • The network of interacting proteins
  • The environments of protein-protein interactions
  • Predict
  • Unknown protein-protein interaction
  • The best interaction conditions

32
The structure of DIP
Protein Table
Method Table
Reference Table
Interaction Table
33
Protein Table
  • DIP accession number <DIP:nnnN>
  • Identification numbers from
  • SWISS-PROT, GenBank, PIR
  • Protein Name and description
  • Cross references
  • Graph

34
A sample DIP protein table
35
Interaction Table
  • Interacting proteins
  • Links to
  • - Methods
  • - Original papers
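
The four-table structure of DIP (protein, method, reference and interaction tables, with each interaction linking to its methods and original papers) can be sketched as a small relational schema. This is a hedged illustration using an in-memory SQLite database; all table names, column names and rows are hypothetical and are not taken from the real DIP schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE protein (dip_id TEXT PRIMARY KEY, name TEXT, description TEXT);
    CREATE TABLE method  (method_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE article (article_id INTEGER PRIMARY KEY, citation TEXT);
    CREATE TABLE interaction (
        interaction_id TEXT PRIMARY KEY,
        protein_a  TEXT REFERENCES protein(dip_id),
        protein_b  TEXT REFERENCES protein(dip_id),
        method_id  INTEGER REFERENCES method(method_id),
        article_id INTEGER REFERENCES article(article_id)
    );
    """
)

# Placeholder rows, invented purely for the example.
conn.executemany(
    "INSERT INTO protein VALUES (?, ?, ?)",
    [("DIP:1N", "protein A", "toy entry"), ("DIP:2N", "protein B", "toy entry")],
)
conn.execute("INSERT INTO method VALUES (1, 'two-hybrid')")
conn.execute("INSERT INTO article VALUES (1, 'placeholder citation')")
conn.execute("INSERT INTO interaction VALUES ('DIP:1E', 'DIP:1N', 'DIP:2N', 1, 1)")


def partners(conn, dip_id):
    """Return (partner name, detection method) pairs for one protein."""
    query = """
        SELECT p.name, m.name
        FROM interaction AS i
        JOIN protein AS p
          ON p.dip_id = CASE WHEN i.protein_a = ? THEN i.protein_b ELSE i.protein_a END
        JOIN method AS m ON m.method_id = i.method_id
        WHERE i.protein_a = ? OR i.protein_b = ?
        ORDER BY p.name
    """
    return conn.execute(query, (dip_id, dip_id, dip_id)).fetchall()


print(partners(conn, "DIP:1N"))
```

Keeping methods and articles in their own tables lets one interaction row point at the experimental evidence without repeating it for every protein pair.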

36
A sample interaction table
37
The current status of DIP
  • Number of proteins: 6978
  • Number of organisms: 101
  • Number of interactions: 18260
  • Number of distinct experiments describing an
    interaction: 22229
  • Number of articles: 2203

38
Other satellite databases
  • DLRP (http://dip.doe-mbi.ucla.edu/dip/DLRP.cgi)
  • - Database of Ligand-Receptor Partners
  • LiveDIP (http://dip.doe-mbi.ucla.edu/ldipc/tmpl/livedip.cgi)
  • - data on protein states and state
    transitions in protein-protein interactions.
  • JDIP
  • - a stand-alone Java application that provides a
    graphical, browser-independent interface to the
    DIP database.

39
Document types and annotations
  • Document types
  • - XIN and tab-delimited formats
  • Annotations
  • - Node <DIP:nnnN>
  • - Edge <DIP:nnnE>
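
As a small illustration of the node and edge annotations, the sketch below parses identifiers on the assumption that they take the form DIP:<digits>N for a node (protein) and DIP:<digits>E for an edge (interaction). The function name and the example numbers are hypothetical.

```python
import re

# Assumed identifier shape: "DIP:" + digits + "N" (node) or "E" (edge).
_DIP_ID = re.compile(r"^DIP:(\d+)([NE])$")


def parse_dip_id(accession):
    """Return (number, kind), where kind is 'node' or 'edge'."""
    match = _DIP_ID.match(accession.strip())
    if match is None:
        raise ValueError(f"not a DIP accession: {accession!r}")
    number, suffix = match.groups()
    kind = "node" if suffix == "N" else "edge"
    return int(number), kind


print(parse_dip_id("DIP:123N"))
print(parse_dip_id("DIP:456E"))
```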

40
Search DIP
  • http://dip.doe-mbi.ucla.edu/dip/Search.cgi

41
BIND and DIP Comparison
42
BIND and DIP Comparison
  • Size of the databases

43
BIND and DIP Comparison
  • Graphic tools
  • Data display layout

44
Pathway Databases and Algorithms
  • Paul Ma

45
1) KEGG (Kyoto Encyclopedia of Genes and Genomes)
  • Representation of higher-order functions in
    terms of the network of interacting molecules
  • The GENES database contains 240,943 entries from
    published genomes, including bacteria, mouse
    and human.
  • Has 3 databases: GENES, PATHWAY and LIGAND.
  • Each entry has the form database:entry or
    organism:gene
  • ex) EC:6.3.2.3 (enzyme)
  • genbank:DROALPC (gene)
  • D.melanogaster:dpp (organism-specific
    gene)
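
The entry naming scheme can be illustrated with a short sketch, assuming that each identifier is a prefix (database or organism) and an entry name separated by a single colon. The function name is hypothetical and the code does not query KEGG itself.

```python
def split_kegg_id(identifier):
    """Split 'prefix:entry' into (prefix, entry); the prefix is a database or an organism."""
    prefix, separator, entry = identifier.partition(":")
    if not separator or not prefix or not entry:
        raise ValueError(f"expected 'database:entry' or 'organism:gene', got {identifier!r}")
    return prefix, entry


for example in ("EC:6.3.2.3", "genbank:DROALPC", "D.melanogaster:dpp"):
    print(split_kegg_id(example))
```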

46
  • By matching genes in the genome and gene
    products in the pathway, KEGG can be used to
    predict protein interaction networks and
    associated cellular function.
  • The data object stored in the PATHWAY database is
    called the generalized protein interaction
    network, which is a network of gene products with
    three types of interactions or relations:
    enzyme-enzyme relations between enzymes that
    catalyze successive reaction steps in a metabolic
    pathway, direct protein-protein interactions, and
    gene expression relations. Currently, only
    enzyme-enzyme relations are maintained.
  • The PATHWAY database contains 5761 entries,
    including 201 pathway diagrams with 14,960
    enzyme-enzyme relations.

47
An example of a pathway entry in KEGG- Glycolysis
48
2) WIT database (Oak Ridge National Laboratory)
  • Similar to KEGG
  • 3) EcoCyc (E. coli Encyclopedia)
  • Covers the genome and gene products of E. coli,
    its metabolic and signal transduction pathways and
    its RNAs. Contains 4391 genes, 904 metabolic
    reactions and 129 metabolic pathways.

49
Graph-theoretical algorithm for finding the
molecular complex
  • Small-world networks
  • How can a set of central metabolites, such as
    those in the BIND database, be identified? MCODE
  • Many biological networks have small-world
    characteristics
  • ex) Erdos number
  • Paul Erdos: a prominent Hungarian
    graph theorist. He is the center of mathematical
    collaboration. Coauthors of a paper with Erdos
    are one step from Erdos and have Erdos number 1.
    Coauthors of a paper with mathematicians with
    Erdos number 1 have Erdos number 2. Most
    mathematicians active in this century have a small
    Erdos number.
  • ex) Kevin Bacon game
  • It aims at connecting an arbitrary actor with the
    actor Kevin Bacon by the shortest sequence of
    actor pairs who have appeared together in a film.
    The average Bacon number for an arbitrary actor
    turns out to be 2.87. (However, Kevin Bacon is
    not the center of this small world of film actor
    collaboration. The center turns out to be
    Christopher Lee, with a mean path length of 2.60.)

50
  • Small-world networks lie between two extremes of
    graphs: completely regular and completely random.
  • Regular networks have long path lengths and are
    clustered, while random graphs have short path
    lengths but show little clustering.
  • Small-world networks have short path lengths but
    are highly clustered.
  • The metabolic network of E. coli is a
    small-world network. The center of the map is
    glutamate with a mean path length of 2.46,
    followed by pyruvate with a value of 2.59.
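
The notions of an Erdos-style number, mean path length and network center can be made concrete with a breadth-first search. The sketch below assumes an undirected, connected graph stored as an adjacency dictionary; the toy graph and vertex names are invented, and it does not reproduce the E. coli or film-actor figures quoted in the slides.

```python
from collections import deque


def bfs_distances(graph, source):
    """Shortest path length (in edges) from source to every reachable vertex."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        vertex = queue.popleft()
        for neighbor in graph[vertex]:
            if neighbor not in distance:
                distance[neighbor] = distance[vertex] + 1
                queue.append(neighbor)
    return distance


def mean_path_length(graph, source):
    """Average distance from source to all other reachable vertices."""
    distance = bfs_distances(graph, source)
    others = [d for vertex, d in distance.items() if vertex != source]
    return sum(others) / len(others)


def center(graph):
    """The vertex with the smallest mean path length to all others."""
    return min(graph, key=lambda vertex: mean_path_length(graph, vertex))


# Toy collaboration graph (hypothetical): "hub" plays the role of Erdos.
graph = {
    "hub": ["a", "b", "c"],
    "a": ["hub", "d"],
    "b": ["hub"],
    "c": ["hub"],
    "d": ["a"],
}
print(bfs_distances(graph, "hub")["d"])   # Erdos-style number of "d"
print(mean_path_length(graph, "hub"))
print(center(graph))
```

An Erdos or Bacon number is simply the BFS distance to one fixed vertex, while the center is found by repeating the search from every vertex and comparing the averages.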

51
Three Cases of Networks
52
MCODE(Molecular Complex Detection) in BIND
database
  • Algorithms for finding clusters are an active
    area of computer science
  • - often based on network flow/minimum cut
    theory or spectral clustering
  • - MCODE uses a vertex-weighting scheme based on
    the clustering coefficient, $C_i$, which measures
    the cliquishness of the neighborhood of a vertex.
  • - $C_i = \frac{2n}{k_i (k_i - 1)}$, where $k_i$ is the
    vertex size of the neighborhood of vertex $i$ and
    $n$ is the number of edges in the neighborhood.
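
A minimal sketch of the clustering coefficient follows. It assumes the common reading of the formula, in which k_i counts the neighbors of vertex i (excluding i itself) and n counts the edges among those neighbors; the graph representation and the toy graph are hypothetical.

```python
from itertools import combinations


def clustering_coefficient(graph, vertex):
    """C_i = 2n / (k_i (k_i - 1)) for an undirected graph given as {vertex: set_of_neighbors}."""
    neighbors = graph[vertex]
    k = len(neighbors)
    if k < 2:
        return 0.0  # fewer than two neighbors: no pair of neighbors can be linked
    n = sum(1 for u, v in combinations(neighbors, 2) if v in graph[u])
    return 2 * n / (k * (k - 1))


# Toy graph: a triangle a-b-c with one extra vertex d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
for vertex in sorted(graph):
    print(vertex, clustering_coefficient(graph, vertex))
```

Vertex a has three neighbors but only one edge among them (b-c), so its coefficient is 2*1 / (3*2) = 1/3, whereas b and c each sit in a fully connected neighborhood and score 1.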

53
  • Density of a subgraph is the number of edges
    divided by the maximum possible number of edges,
    so it ranges from 0.0 to 1.0
  • A k-core is a subgraph of minimal degree k, i.e.,
    every vertex in it has degree $\ge k$.
  • So, the highest k-core of a graph is the
    central, most densely connected subgraph.
  • We define the core-clustering coefficient of a
    vertex v to be the density of the highest k-core of
    the immediate neighborhood of v, including v.
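
The three definitions on this slide (density, k-core, core-clustering coefficient) can be sketched directly. The code assumes a simple undirected graph without self-loops stored as {vertex: set_of_neighbors}, so the maximum number of edges on V vertices is V(V-1)/2; the helper names and the toy graph are hypothetical and this is not the published MCODE implementation.

```python
def density(graph):
    """Number of edges divided by the maximum possible number of edges."""
    v = len(graph)
    if v < 2:
        return 0.0
    edges = sum(len(neighbors) for neighbors in graph.values()) / 2
    return edges / (v * (v - 1) / 2)


def k_core(graph, k):
    """Repeatedly strip vertices of degree < k; what remains is the k-core (possibly empty)."""
    core = {vertex: set(neighbors) for vertex, neighbors in graph.items()}
    changed = True
    while changed:
        low = [vertex for vertex, neighbors in core.items() if len(neighbors) < k]
        changed = bool(low)
        for vertex in low:
            for neighbor in core.pop(vertex):
                if neighbor in core:
                    core[neighbor].discard(vertex)
    return core


def highest_k_core(graph):
    """Return (k, core) for the largest k whose k-core is non-empty."""
    k, best = 0, {vertex: set(neighbors) for vertex, neighbors in graph.items()}
    while True:
        candidate = k_core(best, k + 1)
        if not candidate:
            return k, best
        k, best = k + 1, candidate


def core_clustering_coefficient(graph, vertex):
    """Density of the highest k-core of the neighborhood of vertex, including vertex."""
    keep = graph[vertex] | {vertex}
    neighborhood = {u: graph[u] & keep for u in keep}
    _, core = highest_k_core(neighborhood)
    return density(core)


# Toy graph: a triangle a-b-c with one extra vertex d attached to a.
graph = {
    "a": {"b", "c", "d"},
    "b": {"a", "c"},
    "c": {"a", "b"},
    "d": {"a"},
}
print(highest_k_core(graph)[0], sorted(highest_k_core(graph)[1]))
print(core_clustering_coefficient(graph, "a"))
```

For vertex a the sparsely attached vertex d is peeled away before the density is measured, so the coefficient reflects only the triangle.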

54
  • The core-clustering coefficient amplifies the
    weighting of heavily interconnected graph
    regions while removing the many less connected
    vertices that are characteristic of
    biomolecular interaction networks.
  • The weight of a vertex is then the product of
    the vertex core-clustering coefficient and the
    highest k-core level, $k_{max}$, of the immediate
    neighborhood of the vertex.
  • MCODE then seeds a complex with the
    highest-weighted vertex and recursively moves
    outward from this vertex, including vertices whose
    weight is above a given threshold relative to the
    weight of the seed vertex. In this way the densest
    regions of the network are identified.
  • The time complexity is $O(nmh^3)$, where n is the
    number of vertices, m is the number of edges and
    h is the vertex size of the average neighborhood
    in the graph.
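
The weighting and complex-prediction stages described above can be sketched as follows. This is a simplified illustration, not the published MCODE program: it assumes a simple undirected graph without self-loops, omits the optional post-processing stage, and uses a hypothetical parameter `weight_cutoff` so that a vertex joins a complex when its weight exceeds (1 - weight_cutoff) times the seed weight.

```python
def density(graph):
    v = len(graph)
    if v < 2:
        return 0.0
    edges = sum(len(neighbors) for neighbors in graph.values()) / 2
    return edges / (v * (v - 1) / 2)


def k_core(graph, k):
    core = {vertex: set(neighbors) for vertex, neighbors in graph.items()}
    changed = True
    while changed:
        low = [vertex for vertex, neighbors in core.items() if len(neighbors) < k]
        changed = bool(low)
        for vertex in low:
            for neighbor in core.pop(vertex):
                if neighbor in core:
                    core[neighbor].discard(vertex)
    return core


def highest_k_core(graph):
    k, best = 0, {vertex: set(neighbors) for vertex, neighbors in graph.items()}
    while True:
        candidate = k_core(best, k + 1)
        if not candidate:
            return k, best
        k, best = k + 1, candidate


def vertex_weight(graph, vertex):
    """Stage 1: weight = k_max * core-clustering coefficient of the neighborhood."""
    keep = graph[vertex] | {vertex}
    neighborhood = {u: graph[u] & keep for u in keep}
    k_max, core = highest_k_core(neighborhood)
    return k_max * density(core)


def find_complexes(graph, weight_cutoff=0.2):
    """Stage 2: grow complexes outward from seeds, highest weight first."""
    weights = {vertex: vertex_weight(graph, vertex) for vertex in graph}
    seen, complexes = set(), []
    for seed in sorted(graph, key=lambda vertex: (-weights[vertex], vertex)):
        if seed in seen:
            continue
        threshold = weights[seed] * (1 - weight_cutoff)
        members, stack = {seed}, [seed]
        seen.add(seed)
        while stack:
            for neighbor in graph[stack.pop()]:
                if neighbor not in seen and weights[neighbor] > threshold:
                    seen.add(neighbor)
                    members.add(neighbor)
                    stack.append(neighbor)
        if len(members) > 1:  # single-vertex results are not reported as complexes
            complexes.append(members)
    return complexes


# Toy network: a fully connected group a-b-c-d with a tail d-e-f.
edges = ["ab", "ac", "ad", "bc", "bd", "cd", "de", "ef"]
graph = {vertex: set() for vertex in "abcdef"}
for u, v in edges:
    graph[u].add(v)
    graph[v].add(u)

print(find_complexes(graph))
```

Because the weights are computed once up front, different values of `weight_cutoff` can be tried without repeating the expensive weighting stage.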

55
  • It is slower than the fastest min-cut graph
    clustering algorithm, which has $O(n^2 \log n)$ time
    complexity, but MCODE has a number of advantages.
    Since weighting is done only once and accounts
    for most of the execution time, many parameter
    settings can be tried cheaply. MCODE is also
    relatively easy to implement.

56
Structure Visualization Tools
  • Written by James Coleman
  • Presented by Xiang Zhou

57
Structure Visualization
  • One of the primary activities in proteomics R&D
    is determining and visualizing the 3D structure
    of proteins in order to find where drugs might
    modulate their activity. Other activities
    include identifying all of the proteins produced
    by a given cell or tissue and determining how
    these proteins interact.
  • BIOINFORMATICS COMPUTING, p.186, Bryan Bergeron,
    M.D., Prentice Hall 2002

58
Structure Visualization
  • It's generally understood by the molecular
    biology research community that the sequencing of
    the human genome, which will likely take several
    more years to complete, is relatively trivial
    compared to definitively characterizing the
    interactions within the proteome.
  • BIOINFORMATICS COMPUTING, p.186, Bryan Bergeron,
    M.D., Prentice Hall 2002

59
Non-Static Structure Visualization
  • Unlike a nucleotide sequence, which is a
    relatively static structure, proteins are dynamic
    entities that change their shape and association
    with other molecules as a function of
    temperature, chemical interactions, pH, and other
    changes in the environment.
  • BIOINFORMATICS COMPUTING, p.186, Bryan Bergeron,
    M.D., Prentice Hall 2002

60
Primary vs. Secondary and Tertiary Structure
  • In contrast to visualizing the sequence of
    nucleotides on a strand of DNA, visualizing the
    primary structure of a protein adds little to the
    knowledge of protein function. More interesting
    and relevant are the higher-order structures.

61
Why Visualize?
  • In each area of bioinformatics, the rationale for
    using graphics instead of tables or strings of
    data is to shift the user's mental processing
    from reading and mathematical, logical
    interpretation to faster pattern recognition.
  • BIOINFORMATICS COMPUTING, p.180, Bryan
    Bergeron, M.D., Prentice Hall 2002
  • Pattern recognition is an area where humans are
    much more efficient than computers.

62
Some Common Tools
  • Hundreds of visualization tools have been developed
    in bioinformatics.
  • Many are specific to hardware such as microarray
    devices.
  • Shareware utilities for PCs
  • PDB Viewer, WebMol, RasMol, Protein Explorer,
    Cn3D
  • VMD, MolMol, MidasPlus, PyMOL, Chime, Chimera

63
Application Feature Summary
64
Molecule Representations
65
Wireframe used to show individual chains
66
Stick view showing atoms and bonds
67
Surface View showing surface fields
68
Ribbon view of secondary structure
69
Distinct geometrical features by color
70
Other properties that can be visualized
  • MolMol supports the display of electrostatic
    potentials across a protein molecule.
  • MidasPlus (a predecessor of Chimera) allows for
    the editing of sequences visually to see the
    effects of point mutations.

71
HCI and Protein-Protein Interaction
  • Creating a suitable metaphor to transform data
    into a form that means something to the user.
  • Large volumes of complex data require more
    complex metaphors than, for example, the pie
    chart used in business graphics.
  • Different users require different levels of
    complexity and therefore different metaphors.
  • The desktop, folder, trashcan metaphor could be
    replaced by a chromosome, gene, protein, pathway
    metaphor.

72
For Protein interactions, we need a metaphor that
reveals dynamics
  • Haptic joystick: provides force feedback when the
    user manipulates a molecule near another one.
  • 3D goggles combined with haptic gloves: feel
    electrostatic potentials and see tertiary
    structure dynamics.
  • PyMOL provides scripting that can produce a movie
    in 3D of the geometrical relationship between
    multiple proteins.

Stereo view of interaction of two proteins.
Scripting allows for the movement of individual
molecules creating a movie.
73
The field is wide open.
  • To definitively characterize the interactions
    within the proteome, we need more tools.
  • We need new metaphors for managing this complex
    data.
  • We need tools to reveal dynamic relationships.