Title: BioMarker
1A biased look at Biomarkers
2BioMarker
Definition
A biomarker is a substance used as an indicator of a biological state:
- The existence of living organisms or biological processes
- A particular disease state
Kinds of biomarker molecules:
- Proteins
- Nucleic acids
- Metabolites
- Carbohydrates
- Lipids
- Small molecules
3Biomarker
Detection of biomarker
Detection of a biomarker for diagnosis:
- Self properties, e.g. enzymatic activities
- Antibodies: IHC, ELISA
Kinds of detection:
- Quantitative: a link between the quantity of the marker and disease
- Qualitative: a link between the existence of a marker and disease
4Biomarker Diagnosis
Ideal Marker for diagnosis
Should have great sensitivity, specificity, and
accuracy in reflecting total disease burden. A
tumor marker should also be prognostic of outcome
and treatment
Biomarker for Screening
- The marker must be highly specific and minimize false positives and false negatives.
- The marker must be able to clearly reflect the different stages of the disease (including early stages).
- The marker must be easily detected without complicated medical procedures. Disease markers released into serum and urine are good targets for early screening.
- The method for screening should be cost effective.
Samples for biomarker detection
Blood, urine, or other body fluids samples
Tissue samples
5Prostate Cancer marker PSA
PSA is a protein normally made in the prostate
gland, in the ductal cells that make some of the
semen. PSA helps to keep the semen liquid. PSA,
also known as kallikrein III, seminin,
semenogelase, γ-seminoprotein and P-30 antigen,
is a glycoprotein and a serine protease.
6Prostate Cancer Diagnosis with PSA
Cancer of the prostate does not cause any
symptoms until it is locally advanced or
metastatic.
There is a correlation between elevated PSA and
prostate cancer.
Detection of PSA is a surrogate for early
detection of prostate cancer.
Large screening trials have shown that PSA nearly
doubles the rate of detection when combined with
other methods. Based on these data, PSA testing
was approved by the US FDA for the screening and
early detection of prostate cancer.
PSA is also found in the cytoplasm of benign
prostate cells.
"I never dreamed that my discovery four decades
ago would lead to such a profit-driven public
health disaster." -Richard Ablin (inventor of
the PSA test)
PSA screening generates $1.7 billion annually
in the U.S. alone.
7Sensitivity: the ability of the test to detect the disease (true positive rate).
Specificity: the likelihood that your test will be normal if you are disease free (true negative rate).
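As a minimal sketch of the two definitions above, both rates can be computed from the four outcome counts of a test. The function names and the counts are hypothetical, chosen only for illustration, and do not come from any study.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of people with the disease that the test flags (true positive rate)."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """Fraction of disease-free people that the test clears (true negative rate)."""
    return true_neg / (true_neg + false_pos)


# Hypothetical counts, for illustration only.
counts = {"true_pos": 20, "false_neg": 80, "true_neg": 940, "false_pos": 60}

print("sensitivity:", sensitivity(counts["true_pos"], counts["false_neg"]))
print("specificity:", specificity(counts["true_neg"], counts["false_pos"]))
```

Note that neither number depends on how common the disease is; that only matters once you ask what a positive result means for one person.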
8A brief aside about Statistics and Probability
- Statistics are the formalization of common sense
- Because they have to handle many different situations, they can be really complicated
- They should make you feel really good or really bad about your data
- People are inherently bad at statistics and probability
Case Study
Rate for being HIV positive: 1/10000
False positive rate of HIV test: 1/1000
If I test positive, what is the chance that I am really HIV negative?
9A brief aside about Statistics and Probability
Case Study (continued)
Rate for being HIV positive: 1/10000
False positive rate of HIV test: 1/1000
What is the chance that I am HIV negative?
0.0001   0.001   0.01   0.1   0.9   0.99   0.9999
10A brief aside about Statistics and Probability
Case Study (answer)
Rate for being HIV positive: 1/10000
False positive rate of HIV test: 1/1000
What is the chance that I am HIV negative?
Of 10000 people tested, 1 is truly positive and about 10 of the others test falsely positive (1/1000 of 9999).
For every 1 true positive there will be 10 false positives, so my
chance of being negative is 10/11 (about 0.9).
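The 10/11 answer can be checked with Bayes' rule. The sketch below assumes, as the slide implicitly does, that the test catches every true positive (sensitivity of 1.0); the function name is made up for this example.

```python
def chance_negative_given_positive(prevalence: float,
                                   false_positive_rate: float,
                                   sensitivity: float = 1.0) -> float:
    """P(disease free | positive test) from Bayes' rule.

    sensitivity defaults to 1.0, which is an assumption matching the slide's
    simplified arithmetic, not a property of any real test.
    """
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * false_positive_rate
    return false_pos / (true_pos + false_pos)


p = chance_negative_given_positive(prevalence=1 / 10_000,
                                   false_positive_rate=1 / 1_000)
print(f"chance of being negative after a positive test: {p:.3f}")
```

The exact value differs very slightly from 10/11 because the false positives come from 9999 people rather than 10000.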
11How about the PSA test?
Rate is 15/10000. False positive rate is 60/1000.
For every 15 true positives, there will be 600 false positives!
Chance of being negative: 600/615 = 0.976
Chance of being positive: 0.024 (before the test, the chance was 0.0015)
- Is this true?
12How about the PSA test?
Recap: the rate is 15/10000 and the false positive rate is 60/1000, so if no cases are missed there are 600 false positives for every 15 true positives.
- Is this true?
The test will miss 80% of the true positives (sensitivity 20%), so only 3 true positives will be detected.
Chance of being negative: 600/603 = 0.995
Chance of being a true positive: 0.005
Follow-up for an HIV test is another blood test. Follow-up for a PSA test is a tissue biopsy.
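The PSA arithmetic can be reproduced by counting expected outcomes in a screened cohort. This is a rough sketch using the slide's numbers (rate 15/10000, false positive rate 60/1000, sensitivity 20%); the function name and the cohort size of 10000 are choices made for the example.

```python
def screen_cohort(n_people: int, prevalence: float,
                  false_positive_rate: float, sensitivity: float):
    """Expected numbers of true and false positives when n_people are screened once."""
    diseased = n_people * prevalence
    disease_free = n_people - diseased
    true_positives = diseased * sensitivity
    false_positives = disease_free * false_positive_rate
    return true_positives, false_positives


for sens in (1.0, 0.20):
    tp, fp = screen_cohort(10_000, 15 / 10_000, 60 / 1_000, sens)
    print(f"sensitivity {sens:.0%}: {tp:.0f} true positives, "
          f"{fp:.0f} false positives, "
          f"chance a positive is disease free = {fp / (tp + fp):.3f}")
```

The false positive count comes out just under the slide's 600 because the 15 diseased people are excluded from the disease-free group.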
13How good does a Biomarker have to be?
By age 65 the rate of prostate cancer climbs to
8/1000 and the test performs much better. For
every 8 true positives, there will be 60 false
positives! Chance of being negative: 60/68 = 0.88.
Chance of being positive: 0.12 (before the test,
the chance was 0.008).
14How good does a Biomarker have to be?
Prostate cancer is one of the most frequent
cancers (15/10000); most cancers are much less
frequent (1/10000 to 1/50000), so a biomarker
would have to be much better than the PSA test.
It is currently believed that a new biomarker
would need sensitivity and specificity better
than 95%.
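To see how strongly disease frequency drives test performance, the sketch below computes the chance that a positive result is a true positive (the positive predictive value) for the frequencies quoted above. The 95% sensitivity and specificity are plugged in as an illustrative threshold, not as measured values for any real marker.

```python
def positive_predictive_value(prevalence: float, sensitivity: float,
                              specificity: float) -> float:
    """P(disease | positive test) from Bayes' rule."""
    true_pos = prevalence * sensitivity
    false_pos = (1.0 - prevalence) * (1.0 - specificity)
    return true_pos / (true_pos + false_pos)


# Frequencies quoted on the slide; 0.95/0.95 is an illustrative threshold.
frequencies = [("15/10000", 15 / 10_000),
               ("1/10000", 1 / 10_000),
               ("1/50000", 1 / 50_000)]

for label, prevalence in frequencies:
    ppv = positive_predictive_value(prevalence, sensitivity=0.95, specificity=0.95)
    print(f"{label:>9}  chance a positive is real = {ppv:.4f}")
```

Even at the 95% mark, most positives for a rare cancer are false alarms, which is why the requirement is stated as better than 95%.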
15Early Proteomics-Based Biomarker work was based on
SELDI
SELDI can detect 200-300 features in a sample.
It has been used to look for biomarkers in
everything from blood to tears.
16Early Biomarker work has largely been discredited
- Biomarkers with similar masses kept being rediscovered
- When the proteins were identified, they were abundant serum proteins, and the same proteins kept turning up
- Multi-center studies failed to validate the biomarkers in a clinical setting
- Realization that serum and other biofluids are incredibly complex
- Realization that serum and other biofluids are incredibly variable and fragile
- Some strong "biomarkers":
  - blood collection tube
  - number of freeze-thaw cycles
  - diet
17Key Concept: Proteins vary widely in concentration
18A typical biomarker discovery study will take 50
samples per condition. It typically takes 10 samples
per condition to have a 90% chance of finding
differences of 2-fold. Validation will take 1000s
of samples. Finally, the assay will have to be
converted to something that can be done in a
clinical lab.
19PCA or other Clustering is used for Biomarker
discovery
20Common Serum Markers for Cancer
Diagnosis/prognosis
21Conclusions
- Biomarker discovery is difficult
  - biofluids are complex
  - biofluids have a high dynamic range
  - biomarkers are usually low abundance
  - even taking proximal fluids typically does not help
  - there is a lot of person-to-person variability
- Most biomarkers will never become clinically relevant
  - statistical standards for diagnostic tools are very high
  - the more prevalent the disease, the better the biomarker will perform
- An MS-based biomarker assay is unlikely due to the greater analytical performance of antibody-based methods.
- For a biomarker workflow to be meaningful it must be quantitative!
22Quantitative Approaches
Stable isotope labeling methods
- add heavy isotopes to one sample so chemically identical compounds are mass shifted
- added to the peptides/proteins using reactive groups
- added to the proteins in vivo using heavy amino acids
- can be multiplexed
Label-free methods
- extracted ion chromatograms
- spectral counting
24ISOTOPE-CODED AFFINITY TAG (ICAT)
- Label protein samples with heavy and light reagent
- Reagent contains affinity tag and heavy or light isotopes
Parts of the reagent:
- Chemically reactive group: forms a covalent bond to the protein or peptide
- Isotope-labeled linker: heavy or light, depending on which isotope is used
- Affinity tag: enables the protein or peptide bearing an ICAT to be isolated by affinity chromatography in a single step
25Example of an ICAT Reagent
Biotin: affinity tag; binds tightly to streptavidin-agarose resin
Reactive group: thiol-reactive group will bind to Cys
Linker: the heavy version has deuteriums where the light version has hydrogens
26The ICAT Reagent
27How ICAT works?
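[Figure: ICAT workflow. Lyse and label the two samples with light and heavy reagent; mix; proteolysis (e.g. trypsin); affinity isolation on streptavidin beads; quantification by MS and identification by MS/MS. Example peptide: NH2-EACDPLR-COOH. Spectrum axes: m/z.]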
28ICAT Quantitation
29ICAT: Advantages vs. Disadvantages
Advantages
- Estimates relative protein levels between samples with a reasonable level of accuracy (within 10%)
- Can be used on complex mixtures of proteins
- Cys-specific label reduces sample complexity
- Can set up the mass spectrometer to fragment only those peaks with a certain ratio
Disadvantages
- Yield and non-specificity
- Slight chromatography differences
- Expensive
- Tag fragmentation
- Meaning of relative quantification information
- Cysteine residues may be absent or not accessible to the ICAT reagent
30iTRAQ Reagent Design
Isobaric tag (total mass 145): a charged reporter group and a neutral-loss balance group
31Isobaric Tagging - General Method (4-Plex)
32Spotfire K-means Clustering of Protein-level
Ratios
33MS/MS Spectra of a Singly-charged Peptide
-TPHPALTEAK-
34Reporter Group Placement: Selection of Quiet
Region
[Figure: summed ion intensity (75,000 spectra)]
35Simplified Workflow (One extra step)
[Figure: workflow diagram. Labels: MIX; SCX; LC MS/MS Analysis; MS/MS; "ID and"]
36Differential Expression using iTRAQ Reagent
Approach: Overexpression of Chaperonin 10, a
Non-Cysteine-Containing Protein
[Figure: MS/MS spectrum of the iTRAQ-labeled peptide VLQATVVAVGSGSK with b- and y-ion series (x-axis: m/z, amu). Inset: quantitation from the reporter ions at m/z 114, 115, 116 and 117 for the Cancer and Normal samples.]
37iTRAQ: Advantages vs. Disadvantages
Advantages
- Estimates relative protein levels between samples with a reasonable level of accuracy (>10%)
- Can be used on complex mixtures of proteins
- Isobaric, so the tag is only visible in the MS/MS, keeping the precursor scans as clean as possible
- The abundances of the peptides sum together, making analysis of low-abundance peptides easier
- Replicates are analyzed in the same LC-MS/MS run, minimizing run-to-run variability
Disadvantages
- Reagent not completely specific
- Expensive
- Does not work on ion trap instruments
- Reporters tend to dominate the spectra
- You have to fragment everything and sort out the iTRAQ reporters later. The mass spec spends a lot of time analyzing peptides with no quantitative differences.
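As a rough sketch of how the reporter ions turn into relative quantitation, the code below sums the MS/MS signal near each 4-plex reporter mass and expresses every channel relative to one reference channel. The reporter m/z values are approximate, and the tolerance, function names, and peak list are assumptions made for this example; real values should come from the reagent documentation and the instrument data.

```python
# Approximate 4-plex reporter ion m/z values, keyed by nominal channel (assumed).
REPORTER_MZ = {114: 114.11, 115: 115.11, 116: 116.11, 117: 117.11}


def reporter_intensities(peaks, tol=0.05):
    """Sum intensity within +/- tol of each reporter m/z.

    peaks is a list of (mz, intensity) pairs from one MS/MS spectrum.
    """
    totals = {channel: 0.0 for channel in REPORTER_MZ}
    for mz, intensity in peaks:
        for channel, reporter_mz in REPORTER_MZ.items():
            if abs(mz - reporter_mz) <= tol:
                totals[channel] += intensity
    return totals


def ratios_to_reference(totals, reference=114):
    """Express each channel relative to the reference channel."""
    ref = totals[reference]
    if ref == 0:
        raise ValueError("reference channel has no signal")
    return {channel: value / ref for channel, value in totals.items()}


# Hypothetical spectrum: four reporter peaks plus one unrelated fragment ion.
spectrum = [(114.11, 100.0), (115.11, 110.0), (116.11, 200.0),
            (117.11, 210.0), (300.20, 500.0)]
print(ratios_to_reference(reporter_intensities(spectrum)))
```

Protein-level ratios would then be combined from many such spectra, which is outside this sketch.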
38Stable Isotope Labeling with Amino Acids in Cell Culture (SILAC)
41SILAC: Advantages vs. Disadvantages
Advantages
- Estimates relative protein levels between samples with a high level of accuracy (<5%)
- Can be used on complex mixtures of proteins
- Can set up the mass spectrometer to fragment only those peaks with a certain ratio
- Extremely flexible and can be adapted to many systems
Disadvantages
- Labeling may be incomplete
- Urea cycle may cause incorporation of heavy isotopes into other amino acids
- Expensive
- Works best on high-resolution instruments
42Label-Free Quantitation
- All approaches so far require purchase of isotopically labeled reagents (can be expensive).
- What if you want to compare large numbers of samples (10)?
- What if you can't afford lots of reagents?
Label-free options:
- Peak/Spectral counting
- Peak area comparison (Extracted Ion Chromatograms)
43Spectral Counting
- Count the number of peptides identified from a protein in each sample.
- Typically do not count repeat identifications of the same peptide.
- Not accurate at quantifying magnitude of change, but can be used to determine if there is a difference.
- In general, need a spectral count difference of about 4 peptides in order to be confident of a difference being real.
- Most proteins in complex mixtures are identified by fewer than 4 peptides.
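The counting rule above can be sketched in a few lines: count distinct peptides per protein, then flag proteins whose counts differ by about 4 or more between two samples. The function names, the threshold default, and the identifications are hypothetical and only illustrate the bookkeeping.

```python
from collections import defaultdict


def spectral_counts(identifications):
    """Count distinct peptides per protein.

    identifications is an iterable of (protein, peptide) pairs; repeat
    identifications of the same peptide are counted once.
    """
    peptides = defaultdict(set)
    for protein, peptide in identifications:
        peptides[protein].add(peptide)
    return {protein: len(seqs) for protein, seqs in peptides.items()}


def flag_differences(counts_a, counts_b, min_difference=4):
    """Return proteins whose counts differ by at least min_difference (b minus a)."""
    flagged = {}
    for protein in set(counts_a) | set(counts_b):
        diff = counts_b.get(protein, 0) - counts_a.get(protein, 0)
        if abs(diff) >= min_difference:
            flagged[protein] = diff
    return flagged


# Hypothetical identifications from two samples.
sample_a = [("P1", "pepA"), ("P1", "pepA"), ("P1", "pepB"), ("P2", "pepC")]
sample_b = [("P1", f"pep{i}") for i in range(7)]

print(flag_differences(spectral_counts(sample_a), spectral_counts(sample_b)))
```

Because most proteins are identified by fewer than 4 peptides, a filter like this will leave most of a complex mixture unflagged.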
44EIC (Extracted Ion Chromatogram)
- Measure intensity of the peak during its elution off the HPLC column and into the mass spectrometer.
- Measure area of the peak in the EIC (also written XIC).
- More accurate than selecting peak intensity for one given scan.
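A minimal sketch of the peak-area idea: pull the signal for one m/z out of every scan to build the chromatogram, then integrate it over retention time with the trapezoid rule. The scan layout (a list of retention times with peak lists), the tolerance, and the function names are assumptions for this example, not the format of any particular vendor or library.

```python
def extract_ion_chromatogram(scans, target_mz, tol=0.01):
    """Build an extracted ion chromatogram for one m/z.

    scans is a list of (retention_time, [(mz, intensity), ...]) entries.
    Returns a list of (retention_time, summed_intensity) points.
    """
    chromatogram = []
    for retention_time, peaks in scans:
        signal = sum(i for mz, i in peaks if abs(mz - target_mz) <= tol)
        chromatogram.append((retention_time, signal))
    return chromatogram


def peak_area(chromatogram):
    """Trapezoidal area under the chromatogram points."""
    area = 0.0
    for (t0, y0), (t1, y1) in zip(chromatogram, chromatogram[1:]):
        area += 0.5 * (y0 + y1) * (t1 - t0)
    return area


# Hypothetical scans: the 500.0 peak rises and falls; 600.0 is unrelated.
scans = [(0.0, [(500.0, 0.0)]),
         (1.0, [(500.0, 10.0), (600.0, 99.0)]),
         (2.0, [(500.0, 0.0)])]

xic = extract_ion_chromatogram(scans, target_mz=500.0)
print(xic, peak_area(xic))
```

Integrating across the elution profile uses every scan that saw the peptide, which is why it is steadier than reading the intensity from a single scan.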
45emPAI (Exponentially Modified Protein Abundance
Index)
- $\mathrm{emPAI} = 10^{\mathrm{PAI}} - 1$
- where $\mathrm{PAI} = N_{\mathrm{observed}} / N_{\mathrm{observable}}$
- What is an observable peptide?
  - Peptides with a precursor mass between 800 and 2400 Da.
- There is a roughly linear relationship between log protein concentration and the ratio of observed to observable peptides in the range of 3-500 fmoles.
- If you know how much total protein you analyzed you can derive absolute abundances.
Ishihama et al. Mol Cell Proteomics (2005) 4(9):
1265-1272
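A short sketch of the emPAI calculation: PAI is the ratio of observed to observable peptides, emPAI is 10 to the power of PAI minus 1, and each protein's share of the total emPAI gives its molar percentage, as described by Ishihama et al. The protein names and peptide counts below are hypothetical.

```python
def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10**PAI - 1, where PAI = n_observed / n_observable."""
    if n_observable <= 0:
        raise ValueError("a protein needs at least one observable peptide")
    return 10 ** (n_observed / n_observable) - 1


def protein_content_mol_percent(empai_by_protein):
    """Each protein's emPAI as a percentage of the summed emPAI."""
    total = sum(empai_by_protein.values())
    return {name: 100.0 * value / total for name, value in empai_by_protein.items()}


# Hypothetical (observed, observable) peptide counts per protein.
peptide_counts = {"protein_A": (5, 10), "protein_B": (2, 20), "protein_C": (12, 15)}

empai_values = {name: empai(obs, possible)
                for name, (obs, possible) in peptide_counts.items()}
for name, share in protein_content_mol_percent(empai_values).items():
    print(f"{name}: emPAI = {empai_values[name]:.2f}, content = {share:.1f} mol %")
```

Multiplying each molar fraction by the total amount of protein analyzed gives the absolute estimate the slide refers to.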
46MRM (Multiple Reaction Monitoring)
Look for a component of a specific mass that when
fragmented forms a fragment of another specific
mass.
- Transition: precursor m/z 521.7 -> fragment m/z 757.6
- Very sensitive and specific.
47MRM
- Best performed on a triple quadrupole instrument.
- Scans are very fast, so can perform multiple transition scans on a chromatographic time-scale.
- Requires a lot of optimization:
  - Verify transitions are reproducible; typically want 2-3 transitions/peptide, 3-4 peptides/protein.
  - Determine the retention time to maximize the number of peptides that can be analyzed per run.
- It is possible to analyze 100s of transitions per hour.
- MRM coupled to isotopically labeled peptides allows for very high sensitivity and high accuracy analysis and can give absolute quantification.
- Once optimized, 1000s of samples can be run in a short time frame.
- Not for discovery! You must already know what you are looking for; sometimes referred to as targeted proteomics.
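The absolute quantification mentioned above can be sketched as a ratio calculation: spike in a known amount of the isotopically labeled (heavy) peptide, measure the same transitions for the light and heavy forms, and scale the spiked amount by the light/heavy area ratio. The class, function name, peak areas, and spike amount are hypothetical; the transition is the one quoted on the previous slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transition:
    """One precursor -> fragment pair monitored by MRM."""
    precursor_mz: float
    fragment_mz: float


def absolute_amount_fmol(light_areas, heavy_areas, heavy_spike_fmol):
    """Estimate the endogenous (light) peptide amount.

    light_areas and heavy_areas hold peak areas for the same transitions,
    in the same order. The mean light/heavy ratio scales the known spike.
    """
    if not light_areas or len(light_areas) != len(heavy_areas):
        raise ValueError("need matching, non-empty area lists")
    ratios = [light / heavy for light, heavy in zip(light_areas, heavy_areas)]
    return heavy_spike_fmol * sum(ratios) / len(ratios)


example_transition = Transition(precursor_mz=521.7, fragment_mz=757.6)

# Hypothetical peak areas for two transitions and a 50 fmol heavy spike.
amount = absolute_amount_fmol([200.0, 100.0], [100.0, 50.0], heavy_spike_fmol=50.0)
print(example_transition, amount)
```

Averaging over 2-3 transitions per peptide gives a simple check that the transitions agree with each other.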
48Issues with MS Quantitation Analysis
- Should you use all data for quantitation?
- Minimum peak intensity?
  - Peaks near the signal-to-noise limit will have much higher variability in quantitation accuracy.
  - Very intense peaks may be saturated.
- Proteins identified by a single peptide are probably not accurately quantified.
- It is best to ignore sequences with more than one form: PTMs, missed cleavages, etc.
- Multiple charge states should be summed.
Results are normally reported with a mean and
standard deviation
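A small sketch of how these rules might be applied: sum a peptide's charge states, skip peptides with saturated or noise-level peaks, and report a protein ratio as a mean and standard deviation over its peptides. The thresholds, function names, and numbers are placeholders for illustration; suitable cutoffs depend on the instrument.

```python
from statistics import mean, stdev


def sum_charge_states(intensity_by_charge, noise_floor, saturation_limit):
    """Sum a peptide's intensity over charge states.

    Returns None when the peptide should be skipped: any charge state is
    saturated, or nothing rises above the noise floor.
    """
    values = list(intensity_by_charge.values())
    if any(v >= saturation_limit for v in values):
        return None  # saturated peaks under-report the true signal
    kept = [v for v in values if v > noise_floor]
    return sum(kept) if kept else None


def summarize(peptide_ratios):
    """Mean and sample standard deviation of peptide-level ratios."""
    return mean(peptide_ratios), stdev(peptide_ratios)


# Hypothetical peptide with 2+ and 3+ charge states, and three peptide ratios.
print(sum_charge_states({2: 1000.0, 3: 500.0}, noise_floor=100.0, saturation_limit=1e6))
print(summarize([1.0, 2.0, 3.0]))
```

The standard deviation needs at least two peptides, which is one more reason single-peptide proteins are hard to quantify.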
50Conclusions
- There are many different ways to quantitate proteomics data.
- Quantitative studies need to be approached carefully, because it is easy to make mistakes.
- No one strategy is best.
- MRM is the most sensitive and accurate, but requires the most optimization and cannot be used for discovery.