Proteomics & Bioinformatics - PowerPoint PPT Presentation

About This Presentation
Title:

Proteomics & Bioinformatics

Description:

Proteomics & Bioinformatics. MBI, Master's Degree Program in Helsinki, Finland. Lecture 5, 11 May, 2007. Sophia Kossida, BRF, Academy of Athens, Greece

Slides: 65
Provided by: bioacadem
Learn more at: http://www.bioacademy.gr
Transcript and Presenter's Notes

Title: Proteomics & Bioinformatics


1
Proteomics & Bioinformatics
MBI, Master's Degree Program in Helsinki, Finland
Lecture 5
11 May, 2007
Sophia Kossida, BRF, Academy of Athens, Greece
Esa Pitkänen, University of Helsinki, Finland
Juho Rousu, University of Helsinki, Finland
2
Mining proteomes
  • To identify as many components of the proteome
    as possible
  • Mapping of proteomes of various organisms and
    tissues
  • Comparison of protein expression levels for the
    detection of disease biomarkers
3
How to select proteome?
A proteome is defined by the state of the
organism, tissue, or cell that produces it.
Because these states are constantly changing, so
are the proteomes.
Examples of proteomes: different kinds of cells
(e.g. liver) and extracellular fluids (blood
plasma, urine, CSF)
4
Applications
  • Systems biology - understand cell pathways,
    networks, and complex interactions
  • Biological processes - characterize sub-proteomes
    such as protein complexes, cellular machines,
    organelles
  • Biomarkers - discovery of disease markers
    (serological, urine, other biological fluids)
    for diagnostics, treating patients, monitoring
    therapies
  • Drug targets - evaluate toxicity and other
    biological or pharmaceutical parameters
    associated with drug treatment
5
Protein Profiling
Measure the expression of a set of proteins in
two samples and compare them - Comparative
proteomics
  • 2D gel electrophoresis
  • Difference gel electrophoresis (DIGE)
  • LC-MS/MS using coded affinity tagging
    (ICAT, iTRAQ, SILAC, ...)
  • ProteinChip Array (SELDI analysis)
  • Antibody arrays

6
Laser-Capture Microdissection, LCM
Technique for selectively sampling certain cells
within a tissue
Biopsy
Tissue sample
Transfer film
Tumor
Glass slide
Laser beam activates film
Cells
Selected cells are transferred
Genomic/proteomic analysis
Modified from National Cancer Institute, US
National Institutes of Health
http://www.cancer.gov/cancertopics/understandingcancer/moleculardiagnostics/Slide29
7
2D gels, DIGE
  • High resolving power
  • Absolute / relative quantity
  • Easily archived for further comparison
  • Detects some PTMs and alternative splice forms
  • Low throughput
  • Poor detection of large, acidic, basic and
    membrane proteins
  • Only high abundance proteins
8
DIGE
Proteins are labeled prior to running the first
dimension with up to three different fluorescent
cyanine dyes
Mix labeled extracts
Internal standard
  • Allows use of an internal standard in each gel,
    which reduces gel-to-gel variation and the
    number of gels to be run
  • Adds 500 Da to the labeled protein
  • Additional post-electrophoretic staining needed
9
Human brain proteins
Differences in Expression Level in Thalamus
10
Example of different expression
11
LC-MS/MS using coded affinity tagging
  • Moderate throughput, but can be automated
  • Detects some low abundance proteins
  • Most isotope label experiments are limited to
    two versions, heavy and light isotope, i.e.
    binary comparisons only
  • Poor detection of alternative splices and PTMs
12
Labeling
Chemical (ICAT, iTRAQ)
  • Chemical modifications to amino acids, generally
    after digestion
  • Most labels differ by 3-10 Da in mass (not
    complete / interferences)
  • Compares only 2-8 samples
SILAC
  • Stable isotopes incorporated during cell growth
  • Must be able to grow cells
  • Compares 2 or 3 samples
  • Lys (8 Da) and Arg (10 Da)
Ion Current (label-free)
  • No labeling of any kind
  • See everything in the sample, not just what gets
    labeled
  • Normalization issues (2 separate runs are
    compared)
  • Standards needed
  • Robust, and many samples and experimental
    conditions can be compared
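The label mass differences quoted above translate into an m/z spacing between the light and heavy peaks that depends on the peptide's charge. The following is a minimal sketch of that arithmetic, assuming both heavy Lys (8 Da) and heavy Arg (10 Da) are used in the same experiment; the function name and the example peptide are hypothetical and not from the slides.

```python
# Mass difference (Da) between heavy and light residues, as quoted on the slide.
SILAC_SHIFT_DA = {"K": 8.0, "R": 10.0}


def silac_mz_spacing(peptide, charge):
    """Expected m/z spacing between the light and heavy form of one peptide."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    # Every labeled residue in the peptide contributes its own mass shift.
    total_shift = sum(SILAC_SHIFT_DA.get(residue, 0.0) for residue in peptide)
    # The mass spectrometer measures mass over charge, so divide by the charge.
    return total_shift / charge


# Hypothetical tryptic peptide with one lysine, observed as a doubly charged ion.
spacing = silac_mz_spacing("LVNELTEFAK", 2)
print(spacing)
```

A peptide with neither Lys nor Arg gives a spacing of zero, which is one reason tryptic digests (cleaving after Lys and Arg) pair well with this labeling scheme.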
13
Isotope Coded Affinity Tag (ICAT)
Two protein samples are labeled with normal and
heavy versions of the same isotope-coded affinity
tag (ICAT) reagent, respectively. The reagent
binds to cysteine residues and carries a
biotin tag.
Samples are mixed and digested, and ICAT-labeled
peptides are recovered via the biotin tag of the
ICAT reagents by affinity chromatography.
Drawback: cysteine-containing peptides only
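The drawback can be illustrated with an in-silico digest: only peptides that contain a cysteine can carry the tag and be recovered. This is a rough sketch, assuming a simplified trypsin rule (cleave after K or R unless the next residue is P); the protein sequence and function names are made up for illustration.

```python
import re


def tryptic_digest(sequence):
    """Split after K or R, except when the next residue is P (simplified rule)."""
    pieces = re.split(r"(?<=[KR])(?!P)", sequence)
    # A cleavage site at the very end of the sequence yields an empty piece.
    return [piece for piece in pieces if piece]


def icat_recoverable(peptides):
    """Keep only peptides with at least one cysteine, the residue ICAT labels."""
    return [peptide for peptide in peptides if "C" in peptide]


# Hypothetical protein sequence, for illustration only.
protein = "MKCLLRPAGCKDEFRGHIK"
peptides = tryptic_digest(protein)
visible = icat_recoverable(peptides)
print(peptides)
print(visible)
```

In this toy case only one of the four peptides would survive the affinity step, so the remaining peptides contribute nothing to quantification.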
14
ICAT
  • Label protein samples with heavy and light
    reagent
  • Reagent contains affinity tag and heavy or light
    isotopes

  • Chemically reactive group: forms a covalent bond
    to the protein or peptide
  • Isotope-labeled linker: heavy or light,
    depending on which isotope is used
  • Affinity tag: enables the protein or peptide
    bearing an ICAT to be isolated by affinity
    chromatography in a single step
Modified from
http://skop.genetics.wisc.edu/AhnaMassSpecMethodsTheory.ppt#260,11,Mass Spectrometry
15
Example of an ICAT Reagent
Modified from
http://skop.genetics.wisc.edu/AhnaMassSpecMethodsTheory.ppt#260,11,Mass Spectrometry
16
Stable-isotope labeling
Aebersold and Mann, Nature, 2004
17
Isobaric tag reagent
Isobaric tags for relative and absolute
quantification
Allows us to compare the relative abundance of
proteins from four different samples in a single
mass spectrometry experiment
Isobaric tag (total mass 145 Da), amine specific
  • Peptide reactive group
  • Reporter (mass 114 to 117): gives strong
    signature ion in MS/MS; good b- and y-series;
    maintains charge state and ion masses;
    signature ion masses lie in quiet low mass
    region
  • Balance (mass 31 to 28): balances the mass
    change of reporter to maintain a total mass of
    145; neutral loss in MS/MS
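The mass bookkeeping of the isobaric tag can be checked directly: each reporter is paired with a balance group so that the total stays at 145. The sketch below assumes the intermediate balance masses 30 and 29 (the slide gives only the range 31 to 28), and the reporter ion intensities are invented numbers used to show how relative abundance could be read from one MS/MS spectrum.

```python
REPORTER_MASSES = [114, 115, 116, 117]
BALANCE_MASSES = [31, 30, 29, 28]  # 30 and 29 are assumed from the 31-to-28 range
TAG_TOTAL = 145

# Each reporter/balance pair has the same total mass, which is why the four
# labeled versions of a peptide are indistinguishable in the MS scan.
tag_totals = [r + b for r, b in zip(REPORTER_MASSES, BALANCE_MASSES)]


def relative_abundance(reporter_intensities):
    """Convert reporter ion intensities into fractions of their summed signal."""
    total = sum(reporter_intensities.values())
    if total == 0:
        raise ValueError("no reporter ion signal")
    return {mass: value / total for mass, value in reporter_intensities.items()}


# Hypothetical reporter ion intensities from a single MS/MS spectrum.
intensities = {114: 1000.0, 115: 2000.0, 116: 500.0, 117: 500.0}
fractions = relative_abundance(intensities)
print(tag_totals)
print(fractions)
```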
18
iTRAQ
Uses up to 4 tag reagents that bind covalently to
the N-terminus of the peptide and any Lysine side
chains at the amine group (global tagging).
Each sample set is digested separately and then
mixed with the specific iTRAQ tag
Samples mixed
MS: Reporter - Balance - Peptide intact, 4 samples
identical m/z
MS/MS: Peptide fragments equal, reporter ions
different
Modified from Quantitative Proteomics Using
Isotope Tagging of Peptides by Kathryn Lilley
19
iTRAQ spectrum
20
Stable isotope labeling in cell culture
SILAC
1. Cell culture with normal Arginine
2. Cell culture plus heavy Arginine.
Combine, digest, (purification)
LC-MS/MS
Quantify levels from peak ratio
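Quantifying from the peak ratio amounts to dividing the heavy peak intensity by the light one for each peptide pair, usually on a log2 scale so that up- and down-regulation are symmetric. This is a minimal sketch with invented intensities; the peptide names and the choice of heavy over light as the ratio direction are assumptions.

```python
import math


def silac_log2_ratio(light_intensity, heavy_intensity):
    """log2(heavy / light) for one light/heavy peptide peak pair."""
    if light_intensity <= 0 or heavy_intensity <= 0:
        raise ValueError("both peak intensities must be positive")
    return math.log2(heavy_intensity / light_intensity)


# Hypothetical (light, heavy) peak intensities for three peptides.
peak_pairs = {
    "peptide_1": (2.0e6, 4.0e6),  # heavy twice as abundant
    "peptide_2": (1.0e6, 1.0e6),  # unchanged
    "peptide_3": (8.0e5, 2.0e5),  # heavy four times less abundant
}

log2_ratios = {
    name: silac_log2_ratio(light, heavy)
    for name, (light, heavy) in peak_pairs.items()
}
print(log2_ratios)
```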
21
SILAC Example
From presentation by Nicholas E. Sherman, Ph.D.
http://www.healthsystem.virginia.edu/internet/biomolec/Keck_Dec12_2006.ppt#387,15,Slide 15
22
SELDI
Surface Enhanced Laser Desorption Ionization
Ionized proteins are detected and their mass
accurately determined by Time-of-Flight Mass
Spectrometry
  • High throughput
  • Small amounts of sample
  • More reproducible than 2DE, but lower resolving
    power
  • Applied for the analysis of crude samples
  • Process is not standardized
23
The SELDI-chip
24
Antibody arrays
  • Not discovery based
  • Must have 1 or 2 specific high affinity
    antibodies
  • Very high throughput
  • Can be highly quantitative - relative and
    absolute
  • Can design reagents to detect PTMs, splice forms
25
Antibody array
Modified from slide by Full Moon Biosystems Inc.
(http://www.fullmoonbio.com/Doc/Overview.pdf)
26
Protein-Protein Interactions
From single proteins to systems biology
27
Protein-Protein Interactions
Proteins work together, forming multiprotein
complexes to carry out specific functions
28
Identification of interactions
  • Computational
      • Genomic data
          • Phylogenetic profiling
          • Gene context
          • Gene fusion
          • Symmetric evolution
      • Structural data
          • Sequence profile
          • 3D structural distance matrix
          • Surface patches
          • Binding interactions
  • Experimental
      • X-ray crystallography
      • NMR spectroscopy
      • Mass spectrometry
        (Tandem affinity purification)
      • Immunoprecipitation
      • Yeast two-hybrid
      • Microarrays

29
X-ray crystallography
  • Crystals hard to obtain
  • Good for large proteins
Modified from presentation, Bioinformatics center,
University of Copenhagen
http://www.biosys.dk/courses/Previous_courses/Introductory_Bioinformatics/protein_structure.pdf
30
Nuclear Magnetic Resonance
Multidimensional NMR
NMR Spectroscopy
  • For proteins in solution
  • Better for small proteins than large ones
31
Identification by mass spectrometry
SDS-PAGE
MALDI-TOF
Immunoprecipitate
anti-
Peptide mixture
LC-MS-MS
shotgun identification
32
Immunoprecipitation
Immunoprecipitation of a protein of interest,
analyzed by 1D SDS-PAGE. After electrophoretic
transfer to a membrane, the membrane is probed
with antibodies against proteins suspected to be
partners of the target protein
SDS-PAGE
Immunoprecipitation
Western blot
undetected
Only detects what one sets out to look for.
Obtaining a suitable antibody is important. The
antibody might immuno-precipitate the protein
successfully, but not when other interacting
proteins are present.
33
Yeast Two-Hybrid System
A transcription factor is split into 2 domains
and two hybrid proteins are designed. One
protein of interest (bait) is typically fused to
a DNA-binding domain. The proteins being screened
for interactions with the bait (preys) are fused
to a transcription-activating domain. An
interaction between the bait and a prey will
bring these 2 domains close together which in
turn results in the transcription of a reporter
gene.
The reporter can be essential, in which case the
colony dies if there is no interaction.
Conversely, the reporter gene can be coupled to a
green fluorescent protein
Prey protein
Bait protein
mRNA
Activation Domain
Binding Domain
Reporter Gene
Promoter Region
The rate of false positives is high (estimated
> 45%)
34
Microarray co-expression
Microarrays study the expression of genes as a
function of time, or following treatment with a
drug. Co-expression of genes is usually taken as
a sign that the two proteins interact.
Gene A Gene B
Expression level
Time or treatment
35
Identification of Co-expressed Genes
To determine which genes have similar/correlated
expression patterns, in order to derive their
functional relationships
Data clustering
  • We can represent each gene as a vector, e.g.
    (5, 15, 10, 7, 5, 3)
  • So a set of expression data can be represented
    as a collection of data points in K-dimensional
    space
  • Genes with similar expression patterns form data
    clusters
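One common way to decide whether two expression vectors are "similar" is the Pearson correlation between them; clustering methods then group genes whose pairwise similarity is high. The sketch below is only an illustration: gene_A uses the vector from the slide, while gene_B, gene_C and the 0.9 threshold are invented.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if norm_x == 0 or norm_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / (norm_x * norm_y)


profiles = {
    "gene_A": [5, 15, 10, 7, 5, 3],     # vector from the slide
    "gene_B": [10, 30, 20, 14, 10, 6],  # hypothetical: same shape, doubled
    "gene_C": [3, 5, 7, 10, 15, 5],     # hypothetical: peaks later
}

THRESHOLD = 0.9  # arbitrary cut-off chosen for this example

coexpressed = [
    (a, b)
    for a, b in combinations(profiles, 2)
    if pearson(profiles[a], profiles[b]) > THRESHOLD
]
print(coexpressed)
```

Correlation ignores absolute expression level, so gene_A and gene_B are grouped together even though one is expressed at twice the level of the other.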
36
In silico Prediction of PPI
Phylogenetic Profile
The phylogenetic profile of a protein is a string
that encodes the presence or absence of the
protein in every sequenced genome
Conserved presence or absence of a protein pair
suggests functional coupling.
Phylogenetic profile (against N genomes): for
each gene X in a target genome, if gene X has a
homolog in genome i, the ith bit of X's
phylogenetic profile is 1; otherwise it is 0
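The profile definition above maps directly onto bit strings, and two genes can then be compared by counting the positions where their profiles differ. This is a small sketch with made-up genomes and homolog assignments; using the Hamming distance as the comparison is an assumption, since the slides do not name a specific measure.

```python
def phylogenetic_profile(gene, homologs, genome_order):
    """Bit string: '1' where the gene has a homolog in genome i, else '0'."""
    present_in = homologs[gene]
    return "".join("1" if genome in present_in else "0" for genome in genome_order)


def hamming(profile_a, profile_b):
    """Number of genomes in which the two presence/absence calls differ."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same genomes")
    return sum(a != b for a, b in zip(profile_a, profile_b))


# Hypothetical reference genomes and homolog assignments.
genomes = ["genome_1", "genome_2", "genome_3", "genome_4", "genome_5"]
homologs = {
    "gene_X": {"genome_1", "genome_3", "genome_4"},
    "gene_Y": {"genome_1", "genome_3", "genome_4"},
    "gene_Z": {"genome_2", "genome_5"},
}

profiles = {gene: phylogenetic_profile(gene, homologs, genomes) for gene in homologs}
print(profiles)
print(hamming(profiles["gene_X"], profiles["gene_Y"]))
```

Here gene_X and gene_Y share an identical profile, which under the slide's reasoning would suggest functional coupling, while gene_Z differs in every genome.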
37
In silico Prediction of PPI
Gene Context: conserved gene neighbourhood
suggests position-function coupling
Gene Fusion (Rosetta stone): seemingly unrelated
proteins are sometimes found fused in another
organism
Though gene fusion has low prediction coverage,
its false-positive rate is low
38
In silico Prediction of PPI
Symmetric Evolution: interaction positions on
different proteins should co-evolve so as to
maintain the interface. Look for correlation
between sequence changes at one position and
those at another position in a multiple sequence
alignment.
Docking: determination of protein complex
structure from individual protein structures
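For the co-evolution idea, one way to score correlation between two alignment columns is their mutual information; the slides do not specify a measure, so this choice is an assumption. The toy alignment below is invented, with columns 0 and 1 changing together and column 2 varying independently.

```python
import math
from collections import Counter


def column(alignment, index):
    """Residues found at one position across all aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns."""
    n = len(col_a)
    if n == 0 or n != len(col_b):
        raise ValueError("columns must be non-empty and of equal length")
    freq_a = Counter(col_a)
    freq_b = Counter(col_b)
    freq_ab = Counter(zip(col_a, col_b))
    score = 0.0
    for (a, b), count in freq_ab.items():
        p_joint = count / n
        p_independent = (freq_a[a] / n) * (freq_b[b] / n)
        score += p_joint * math.log2(p_joint / p_independent)
    return score


# Hypothetical alignment: positions 0 and 1 co-vary (A with D, G with E),
# position 2 varies on its own, position 3 is fully conserved.
alignment = ["ADKE", "ADRE", "GEKE", "GERE"]

mi_covarying = mutual_information(column(alignment, 0), column(alignment, 1))
mi_independent = mutual_information(column(alignment, 0), column(alignment, 2))
print(mi_covarying, mi_independent)
```

To look for inter-protein contacts, the two columns would come from the alignments of two different proteins, paired by organism; the scoring itself stays the same.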
39
Structure- and interaction databases
  • STRING (EMBL)
  • BOND (Unleashed Informatics)
  • DIP (UCLA)
  • iHOP
40
STRING
http://string.embl.de
41
BOND
Biomolecular Object Network Databank
http://bond.unleashedinformatics.com
42
Database of Interacting Proteins
The DIP database catalogs experimentally
determined interactions between proteins. It
combines information from a variety of sources to
create a single, consistent set of
protein-protein interactions.
http://dip.doe-mbi.ucla.edu/
43
iHOP
http://www.ihop-net.org/UniPub/iHOP/
44
Proteomics in human diseases
45
Fingerprinting of bladder cancer
Combination of protein extract
MALDI-TOF/TOF
LC
blood/urine
Application of bioinformatics tools (feature
extraction, classification algorithms)
Disease classification
46
Strategy for Biomarker Discovery
Disease vs. Normal
Proteomic analysis (2D gels / MS)
Genomic analysis (mRNA level)
Discovery: candidate gene
Validation: in situ hybridization,
immunohistochemistry
Application: large samples, small candidates
Clinical Application
Diagnostic
Therapeutic
Prognostic
47
Proteins as biomarkers
The protein composition may be associated with
disease processes in the organism and thus have
potential utility as diagnostic markers.
  • Proteins are closer to the actual disease
    process, in most cases, than parent genes
  • Proteins are ultimate regulators of cellular
    function
  • Most cancer markers are proteins
  • The vast majority of drug targets are proteins
Individual biomarkers are not sufficient for
accurate disease detection; a panel of biomarkers
should be established
48
Benefits of Molecular Diagnostics
proteins
MS
Patient's blood sample
Ovarian pattern
  • Create new cancer screening tools
  • Inform design of new treatments
  • Monitor treatment effectiveness
  • Predict patient's response to treatment

49
From known samples to serum proteins
Patterns as screening tool
MS
Protein patterns
Early diagnosis of disease
Early warning of toxicity
MS
50
Proteomics in nutrition of food
Development of fingerprinting techniques to
identify changes in modified organisms at
different integration levels (2D gels, MALDI-MS).
51
Identification of unintended side effects
A proteome analysis of livers from mice treated
with WY14.643
Isolation of protein spots
Peptide mapping
MALDI-TOF analysis
Amino acid sequence
Data base
16 proteins
Protein identified
Proteins from animals after treatment
Liver proteins from control
http://i-council-biomed-biotech.org/Contacts%20to%20Add_files/Haoudi%20Oman%20Feb%202005.pdf
52
Identification of breast cancer biomarkers by
ICAT LC-MS
53
Biomarker Discovery
  • Markers can be easily found by comparing protein
    maps.
  • SELDI is faster and more reproducible than 2D
    PAGE.
  • Has been used to discover protein biomarkers of
    diseases such as ovarian cancer, breast cancer,
    prostate and bladder cancers.

 
(Modified from Ciphergen Web Site)
54
Gene Ontology
  • A knowledge representation about the world or
    some part of it.
  • An ontology is used as a description of the
    concepts and relationships that exist for a
    community of agents.
  • An ontology generally describes:
      • Individuals: the basic or ground level
        objects
      • Classes: sets, collections, or types of
        objects
      • Attributes: properties, features,
        characteristics, or parameters that objects
        can have and share
      • Relations: ways that objects can be related
        to one another

From Wikipedia
55
Goals
  • Develop a set of controlled, structured
    vocabularies, the gene ontology (GO), to
    describe aspects of molecular biology
  • Describe gene products using vocabulary terms
    (annotation)
  • Provide a public resource, allowing access to
    the GO, annotations and software tools developed
    for use with the GO data
www.geneontology.org
56
The Three Ontologies
  • Molecular Function: describes activities, or
    tasks, performed by individual gene products or
    by assembled complexes of gene products.
    Examples: DNA binding, transcription factor
  • Biological Process: a series of events
    accomplished by one or more ordered assemblies
    of molecular functions. NOT a pathway!
    Examples: mitosis, signal transduction,
    metabolism
  • Cellular Component: location or complex; a
    component of a cell that is also part of some
    larger object. Examples: nucleus, ribosome,
    origin recognition complex
57
Relationships between terms
Directed acyclic graph: each child may have one
or more parents
Every path from a node back to the root must be
biologically accurate (the true path rule)
Relationship types
  • is_a: class-subclass relationship; a is_a b
    means that a is a type of b. Example: nuclear
    chromosome is_a chromosome.
  • part_of: physical part of (component) or
    subprocess of (process); c part_of d means that
    whenever c is present, it is a part of d, but c
    doesn't always have to be present. Example:
    nucleus part_of cell, meaning that nuclei are
    always part of a cell, but not all cells have
    nuclei.
58
Relationships between terms
Example: the biological process term hexose
biosynthesis has two parents, hexose metabolism
and monosaccharide biosynthesis. This is because
biosynthesis is a subtype of metabolism, and a
hexose is a subtype of monosaccharide. When any
gene involved in hexose biosynthesis is annotated
to this term, it is automatically annotated to
both hexose metabolism and monosaccharide
biosynthesis, because every GO term must obey the
true path rule: if the child term describes
the gene product, then all its parent terms must
also apply to that gene product.
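The automatic annotation described here is a walk up the directed acyclic graph, collecting every ancestor of the directly annotated term. The sketch below restricts the graph to the three terms named on the slide (the real ontology has further ancestors above them); the gene name and function names are hypothetical.

```python
# Child term -> list of parent terms, limited to the terms named on the slide.
PARENTS = {
    "hexose biosynthesis": ["hexose metabolism", "monosaccharide biosynthesis"],
}


def ancestors(term, parents):
    """All terms reachable by following parent links upward from a term."""
    found = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in found:  # a term can be reached via several paths
                found.add(parent)
                stack.append(parent)
    return found


def apply_true_path_rule(direct_annotations, parents):
    """Extend each gene's direct annotations with every ancestor term."""
    expanded = {}
    for gene, terms in direct_annotations.items():
        all_terms = set(terms)
        for term in terms:
            all_terms |= ancestors(term, parents)
        expanded[gene] = all_terms
    return expanded


# Hypothetical gene product annotated directly to the child term only.
direct = {"gene_1": {"hexose biosynthesis"}}
annotations = apply_true_path_rule(direct, PARENTS)
print(sorted(annotations["gene_1"]))
```

Tracking visited terms matters because in a DAG with multiple parents the same ancestor is often reachable along more than one path.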
59
Evidence codes
  • IC: Inferred by Curator
  • IDA: Inferred from Direct Assay
  • IEA: Inferred from Electronic Annotation
  • IEP: Inferred from Expression Pattern
  • IGC: Inferred from Genomic Context
  • IGI: Inferred from Genetic Interaction
  • IMP: Inferred from Mutant Phenotype
  • IPI: Inferred from Physical Interaction
  • ISS: Inferred from Sequence or Structural
    Similarity
  • NAS: Non-traceable Author Statement
  • ND: No biological Data available
  • RCA: Inferred from Reviewed Computational
    Analysis
  • TAS: Traceable Author Statement
  • NR: Not Recorded
60
Gene Ontology Home
61
GO tools
  • search for gene products and view the terms with
    which they are associated
  • search or browse the ontology for GO terms of
    interest and see term details and gene product
    annotations.
  • AmiGO also provides a BLAST search engine, which
    searches the sequences of genes and gene products
    that have been annotated to a GO term and
    submitted to the GO Consortium.

62
Annotation tools
63
ReBIL
64
Gene expression tools