Proteomics and Glycoproteomics (Bio-)Informatics of Protein Isoforms

1
Proteomics and Glycoproteomics (Bio-)Informatics
of Protein Isoforms
  • Nathan Edwards
  • Department of Biochemistry and Molecular &
    Cellular Biology
  • Georgetown University Medical Center

2
Outline
  • Tandem mass-spectrometry of peptides
  • Detection of alternative splicing protein
    isoforms
  • Phyloproteomics using top-down mass-spec.
  • Characterization of glycoprotein
    microheterogeneity by mass-spectrometry

3
Mass Spectrometer
  • Electron Multiplier (EM)
  • Time-Of-Flight (TOF)
  • Quadrupole
  • Ion-Trap
  • MALDI
  • Electrospray Ionization (ESI)

4
Mass Spectrum
5
Mass is fundamental
6
Sample Preparation for MS/MS
7
Single Stage MS
MS
8
Tandem Mass Spectrometry (MS/MS)
Precursor selection
9
Tandem Mass Spectrometry (MS/MS)
Precursor selection + collision-induced
dissociation (CID)
MS/MS
10
Why Tandem Mass Spectrometry?
  • MS/MS spectra provide evidence for the amino-acid
    sequence of functional proteins.
  • Key concepts
  • Spectrum acquisition is unbiased
  • Direct observation of amino-acid sequence
  • Sensitive to small sequence variations
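
A minimal Python sketch of why MS/MS is sensitive to small sequence variations: b- and y-ion m/z values follow directly from the residue masses, so a single substitution shifts every fragment that contains it. The function name and the example peptides are illustrative and not from the slides; the masses are standard monoisotopic residue masses.

```python
# Monoisotopic residue masses (Da) for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.010565
PROTON = 1.007276


def fragment_ions(peptide, charge=1):
    """Return (b_ions, y_ions) as lists of m/z values for an unmodified peptide."""
    masses = [MONO[aa] for aa in peptide]
    b_ions, y_ions = [], []
    for i in range(1, len(peptide)):
        b_neutral = sum(masses[:i])            # N-terminal fragment
        y_neutral = sum(masses[-i:]) + WATER   # C-terminal fragment keeps the water
        b_ions.append((b_neutral + charge * PROTON) / charge)
        y_ions.append((y_neutral + charge * PROTON) / charge)
    return b_ions, y_ions


# Hypothetical example: a single D -> E substitution at position 6 of 7.
b_ref, y_ref = fragment_ions("PEPTIDE")
b_var, y_var = fragment_ions("PEPTIEE")
shifted_b = [i + 1 for i, (p, q) in enumerate(zip(b_ref, b_var)) if abs(p - q) > 0.01]
print("b-ions shifted by the substitution:", shifted_b)
```

Note that I and L have identical masses, so this kind of comparison cannot separate them.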

11
Unannotated Splice Isoform
  • Human Jurkat leukemia cell-line
  • Lipid-raft extraction protocol, targeting T cells
  • von Haller, et al. MCP 2003.
  • LIME1 gene
  • LCK interacting transmembrane adaptor 1
  • LCK gene
  • Leukocyte-specific protein tyrosine kinase
  • Proto-oncogene
  • Chromosomal aberration involving LCK in
    leukemias.
  • Multiple significant peptide identifications

12
Unannotated Splice Isoform
13
Unannotated Splice Isoform
14
Splice Isoform Anomaly
  • Human erythroleukemia K562 cell-line
  • Depth of coverage study
  • Resing et al. Anal. Chem. 2004.
  • PeptideAtlas A8_IP
  • SULT1A2 gene
  • Sulfotransferase family, cytosolic, 1A
  • 2 ESTs, 1 mRNA
  • mRNA from lung, small-cell carcinoma sample
  • Single (significant) peptide identification
  • Five agreeing search engines
  • PepArML FDR < 1%
  • All source engines have non-significant E-values
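
A minimal sketch of how an FDR cutoff such as the 1% above can be estimated from a target/decoy search. This is the generic decoy-counting estimate, assumed here for illustration; the slides do not describe how PepArML computes its FDR, and the function name and input layout are hypothetical.

```python
def score_threshold_at_fdr(psms, max_fdr=0.01):
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    psms: iterable of (score, is_decoy) pairs; a higher score is better.
    FDR at a cutoff is estimated as (#decoys above cutoff) / (#targets above cutoff).
    Returns None if no cutoff qualifies.
    """
    ranked = sorted(psms, key=lambda p: p[0], reverse=True)
    targets = decoys = 0
    cutoff = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets and decoys / targets <= max_fdr:
            cutoff = score
    return cutoff
```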

15
Splice Isoform Anomaly
16
Splice Isoform Anomaly
17
Translation start-site correction
  • Halobacterium sp. NRC-1
  • Extreme halophilic Archaeon, insoluble membrane
    and soluble cytoplasmic proteins
  • Goo, et al. MCP 2003.
  • GdhA1 gene
  • Glutamate dehydrogenase A1
  • Multiple significant peptide identifications
  • Observed start is consistent with Glimmer 3.0
    prediction(s)

18
Halobacterium sp. NRC-1 ORF GdhA1
  • K-score E-value vs PepArML @ 10% FDR
  • Many peptides inconsistent with annotated
    translation start site of NP_279651
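
One way to flag peptides that are inconsistent with an annotated start site, as a sketch: translate the ORF with its upstream extension, then report identified peptides that begin before the annotated start codon. This assumes the inconsistent peptides lie upstream of the annotated start. The sequences below are made-up toy data, not GdhA1, and the function name is hypothetical.

```python
def upstream_peptides(extended_orf, annotated_start, peptides):
    """Return (peptide, offset) pairs that begin before the annotated start.

    extended_orf: protein translation of the ORF including upstream sequence.
    annotated_start: 0-based offset of the annotated start residue in extended_orf.
    """
    hits = []
    for pep in peptides:
        offset = extended_orf.find(pep)
        if offset != -1 and offset < annotated_start:
            hits.append((pep, offset))
    return hits


# Toy example: annotated start is the second M (offset 10).
toy_orf = "MSTAKLVEERMQPLTGR"
print(upstream_peptides(toy_orf, 10, ["LVEER", "MQPLTGR"]))
```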

19
Translation start-site correction
20
What if there is no "smoking gun" peptide?
21
What if there is no "smoking gun" peptide?
22
What if there is no "smoking gun" peptide?
23
HER2/Neu Mouse Model of Breast Cancer
  • Paulovich, et al. JPR, 2007
  • Study of normal and tumor mammary tissue by
    LC-MS/MS
  • 1.4 million MS/MS spectra
  • Peptide-spectrum assignments
  • Normal samples (Nn): 161,286 (49.7%)
  • Tumor samples (Nt): 163,068 (50.3%)
  • 4270 proteins identified in total
  • 2-unique generalized protein parsimony
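
The sample totals above give the baseline for comparing per-protein spectral counts between normal and tumor. The sketch below uses an exact two-sided binomial test with the null proportion Nt / (Nn + Nt); the choice of test is an assumption made for illustration, since the slides do not say how differential abundance was scored, and any per-protein counts passed in are hypothetical.

```python
from math import comb

N_NORMAL = 161_286  # peptide-spectrum assignments, normal samples (from the slide)
N_TUMOR = 163_068   # peptide-spectrum assignments, tumor samples (from the slide)


def binomial_two_sided(k, n, p):
    """Exact two-sided binomial p-value: sum of outcomes no more likely than k."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    return min(1.0, sum(q for q in probs if q <= observed * (1 + 1e-9)))


def tumor_vs_normal_p(tumor_count, normal_count):
    """P-value for one protein's spectral counts under equal relative abundance."""
    p_null = N_TUMOR / (N_NORMAL + N_TUMOR)
    return binomial_two_sided(tumor_count, tumor_count + normal_count, p_null)


print(f"normal share: {100 * N_NORMAL / (N_NORMAL + N_TUMOR):.1f}%")
```

This enumeration is only practical for modest per-protein counts; larger counts would call for a statistics library.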

24
Nascent polypeptide-associated complex subunit
alpha
7.3 x 10^-8
25
Pyruvate kinase isozymes M1/M2
2.5 x 10^-5
26
Phyloproteomics
  • Fragment intact proteins (top-down MS)
  • Match the spectra to protein sequences
  • Place the organism phylogenetically
  • Works even for unknown microorganisms without any
    available sequences

27
CID Protein Fragmentation Spectrum from Y. rohdei
28
CID Protein Fragmentation Spectrum from Y. rohdei
Match to Y. pestis 50S Ribosomal Protein L32
29
Exact match sequence
30
Phylogeny Protein vs DNA
Protein Sequence
16S-rRNA Sequence
31
What about mixtures?
32
Shared Small Ribosomal Proteins
33
Shared Small Ribosomal Proteins
34
Identified E. herbicola proteins
  • DNA-binding protein HU-alpha
  • m/z 732.71, z 13, E-value 7.5e-26, Δ -14.128
  • Eight proteins identified with "large" Δ
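
As a worked example of the numbers on this slide, the neutral mass of the intact protein follows from the precursor m/z and charge. The sketch assumes a purely protonated ion; the function names are illustrative.

```python
PROTON = 1.007276  # mass of a proton in Da


def neutral_mass(mz, charge):
    """Neutral mass of an ion carrying `charge` protons."""
    return charge * (mz - PROTON)


def mz_from_mass(mass, charge):
    """Inverse of neutral_mass."""
    return mass / charge + PROTON


# Precursor from the slide: m/z 732.71 at z = 13.
print(f"{neutral_mass(732.71, 13):.3f} Da")
```

A mass shift of about 14 Da on a protein of roughly 9.5 kDa moves the z = 13 precursor by only about 1.1 m/z units.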

35
Identified E. herbicola proteins
  • DNA-binding protein HU-alpha
  • m/z 732.71, z 13, E-value 1.91e-58
  • Use "Sequence Gazer" to find mass shift
  • ΔM mode can "tolerate" one shift for free!

36
Identified E. herbicola proteins
  • DNA-binding protein HU-alpha
  • m/z 732.71, z 13, E-value 7.5e-26, Δ -14.128
  • Extract N- and C-terminus sequence supported by
    at least 3 b- or y-ions

37
E. herbicola protein sequences
38
E. herbicola sequences found in other species
39
Phylogenetic placement of E. herbicola
Cladogram
Phylogram
phylogeny.fr "One-Click"
40
Glycoprotein Microheterogeneity
  • Glycosylation is important, but our analytic
    tools are rather rudimentary
  • Detach glycans (PNGase-F) and analyze glycans
  • Detach glycans (PNGase-F) and analyze peptides
  • Get glycan structures, but no association with
    protein or protein site, or
  • Get glycosylation sites, but no association with
    glycan structures.
  • We analyze glycopeptides directly
  • Challenges all facets of glycoproteomics

41
Altered N-Glycosylation in Cancer
Glycosyltransferase Expression or Glycan Analyses
Monosaccharide key: GalNAc, Sialic Acid, Gal, GlcNAc, Man
K. Chandler
42
The informatics challenge
  • Identify glycopeptides in large-scale tandem
    mass-spectrometry datasets
  • Many glycopeptide enriched fractions
  • Many tandem mass-spectra / fraction
  • Good, but not great, instrumentation
  • QStar Elite CID, good MS1/MS2 resolution
  • Strive for hypothesis-generating analysis
  • Site-specific glycopeptide characterization
  • Glycoform occupancy in differentiated samples

43
CID Glycopeptide Spectrum
44
Observations
  • Oxonium ions (204, 366) help distinguish
    glycopeptides from peptides
  • but do little to identify the glycopeptide
  • Few peptide b/y-ions to identify peptides
  • but intact peptide fragments are common
  • If the peptide can be guessed, then
  • the glycan's mass can be determined
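
The two observations above can be sketched in Python: check a spectrum for the HexNAc (204) and HexHexNAc (366) oxonium ions, then subtract a guessed peptide mass from the precursor mass to get the glycan mass. Tolerances, function names, and any input values are assumptions for illustration, not the parameters used in the talk.

```python
PROTON = 1.007276
# Monoisotopic m/z of the singly charged oxonium ions cited on the slide.
OXONIUM = {"HexNAc": 204.0867, "HexHexNAc": 366.1395}


def oxonium_hits(peaks, tol=0.02):
    """peaks: list of (mz, intensity). Report which oxonium ions are present."""
    return {
        name: any(abs(mz - target) <= tol for mz, _ in peaks)
        for name, target in OXONIUM.items()
    }


def glycan_mass(precursor_mz, charge, peptide_mass):
    """Glycan mass (sum of monosaccharide residue masses) for a guessed peptide.

    peptide_mass is the neutral monoisotopic mass of the unglycosylated peptide.
    """
    return charge * (precursor_mz - PROTON) - peptide_mass
```

The resulting mass can then be compared against candidate glycan compositions within a match tolerance.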

45
Haptoglobin Standard
  • N-glycosylation motif (N-X-S/T, X ≠ P)
  • Site of GluC cleavage

Pompach et al. Journal of Proteome Research 11.3
(2012): 1728-1740.
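
A short sketch of the two annotations on this slide: locating N-X-S/T sequons and predicting GluC peptides. It assumes cleavage C-terminal to Glu only, with no missed cleavages; the toy sequence is made up and is not haptoglobin, and the function names are illustrative.

```python
import re

# Lookahead so that overlapping motifs are all reported; X may be any residue but P.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """1-based positions of the Asn in each N-X-S/T motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


def gluc_digest(protein):
    """Split after every Glu (E); no missed cleavages."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        if aa == "E":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


toy = "MKNLTAEVNPSGENGSK"  # hypothetical sequence
print(sequon_positions(toy), gluc_digest(toy))
```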
46
Tuning the filters
  • Oxonium ions
  • Number & intensity
  • Match tolerance
  • "Intact-peptide" fragments
  • Number & intensity
  • Match tolerance
  • Glycan composition
  • ICScore
  • Constrain search space
  • Match tolerance
  • Glycan database
  • Constrain search space
  • Match tolerance
  • Precursor ion
  • Non-monoisotopic selection
  • Sodium adducts
  • Charge state
  • Peptide search space
  • Semi-specific peptides
  • Non-specific peptides
  • Peptide MW range
  • Variable modifications
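
The precursor-ion filters can be illustrated by enumerating the neutral monoisotopic masses that one observed precursor m/z might represent once charge state, non-monoisotopic selection, and sodium adducts are allowed. This is a sketch under assumed parameter names and default ranges, not the settings used in the talk.

```python
PROTON = 1.007276
C13_SHIFT = 1.003355   # spacing between isotope peaks (13C - 12C), Da
NA_MINUS_H = 21.98194  # mass added when Na+ replaces H+ as a charge carrier, Da


def candidate_masses(precursor_mz, charges=(2, 3, 4), max_isotope=2, max_sodium=1):
    """Return (charge, isotope_offset, n_sodium, neutral_monoisotopic_mass) tuples."""
    candidates = []
    for z in charges:
        for iso in range(max_isotope + 1):
            # An ion of charge z can carry at most z sodium ions.
            for na in range(min(max_sodium, z) + 1):
                mass = z * (precursor_mz - PROTON) - iso * C13_SHIFT - na * NA_MINUS_H
                candidates.append((z, iso, na, mass))
    return candidates
```

Each relaxed filter multiplies the number of candidates, which is why the search space has to be constrained elsewhere.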

47
Tuning the filters
  • We estimate the number of false positives so
    that the user can tune the search parameters

48
Application of Exoglycosidases to locate Fucose
  • At ITIH4 site N517

K. Chandler
49
NVVFVIDK ITIH4 Glycopeptide
K. Chandler
50
Similar Glycopeptide Spectra (mass Δ 162 Da)
51
Fragmented Glycopeptides (mass Δ 162 Da)
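
A mass difference of 162 Da corresponds to one hexose residue (162.0528 Da monoisotopic). As a sketch, precursors whose neutral masses differ by that amount can be paired as candidates for sharing a peptide and differing by a single hexose. The tolerance and function name are assumptions, and the input masses would come from the user's own data.

```python
HEXOSE = 162.0528  # monoisotopic mass of a hexose residue, Da


def hexose_pairs(masses, tol=0.02):
    """Return (lighter, heavier) pairs of neutral masses that differ by one hexose."""
    ordered = sorted(masses)
    pairs = []
    for i, low in enumerate(ordered):
        for high in ordered[i + 1:]:
            diff = high - low
            if diff > HEXOSE + tol:
                break  # list is sorted, so later masses are even further away
            if abs(diff - HEXOSE) <= tol:
                pairs.append((low, high))
    return pairs
```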
52
Propagating Annotations
G. Berry
53
Summary
  • Mass-spectrometry coupled with protein chemistry
    and good informatics can look beyond the obvious
    to the unexpected...
  • and there is plenty to find!

54
Acknowledgements
  • Edwards lab
  • Kevin Chandler
  • Gwenn Berry
  • Fenselau lab (UMD)
  • Colin Wynne
  • Avantika Dhabaria
  • Goldman lab (GU)
  • Kevin Chandler
  • Petr Pompach
  • NSF Graduate Fellowship (Chandler)
  • Funding: NCI