Title: Clinical Validation of Prognostic Biomarkers of Risk and Predictive Biomarkers of Drug Efficacy or Safety

1
Clinical Validation of Prognostic Biomarkers of
Risk and Predictive Biomarkers of Drug Efficacy
or Safety

Gene Pennello, Ph.D.
Team Leader, Diagnostics Devices Branch
Division of Biostatistics
Office of Surveillance and Biometrics
Center for Devices and Radiological Health, FDA

SAMSI Risk: Perception, Policy & Practice Workshop
October 3, 2007
2
Outline

FDA and Device Regulation
Types of Biomarkers
Validation of Diagnostics
Predictive and Prognostic Biomarkers
Definitions, Endpoints
Study Designs for Predictive Biomarkers
Prospective Designs: efficiency comparison
Prospective-Retrospective Designs
Summary

3
FDA
CDER,Drugs
CDRH,Devices
CBER,Biologics
CVM,Veterinary
CFSAN,Food
NCTR
4
What are Medical Devices?
An item for treating or diagnosing a health
condition whose intended use is not achieved
primarily by chemical or biological action within
the body (Section 201(h) of the Federal Food, Drug,
and Cosmetic (FD&C) Act).
Definition by exclusion: simply put, a medical device is any
medical item for use in humans that is neither a drug
nor a biological product.
5
Example of Medical Devices

Relatively Simple Devices: tongue depressors, thermometers, latex gloves, simple surgical instruments
Ophthalmic Devices: intraocular lenses, PRK lasers
Radiological Devices: MRI machines, CT scanners, digital mammography, computer-aided detection

Cardiovascular Devices: pacemakers, defibrillators, heart valves, coronary stents, artificial hearts
Monitoring Devices: glucometers, bone densitometers
Diagnostic Devices: diagnostic test kits for HIV, prostate-specific antigen (PSA) test, human papillomavirus (HPV) test
6
Example of Medical Devices
Dental, Ear, Nose, and Throat Devices: hearing aids, bronchoscopy system
General, Surgical, and Restorative Devices: breast implants, artificial hips, spinal fixation devices, artificial skin
Emerging technologies: multiplex genetic tests (e.g., for multiple mutations or microbes)
Genomic and proteomic Dx tests
Nanotechnological devices
Microspheres for molecular treatment of cancer
Robotics
Theranostics (predictive biomarkers of response or adverse reaction to therapy)
Artificial pancreas
7
Example of Medical Devices
Due to the wide variety in technology,
complexity, and intended use, medical devices
can present novel statistical design and analysis
challenges.
8
Device Regulation

Decision to approve a PMA application must rely
upon valid scientific evidence to determine
whether there is reasonable assurance that the
device is safe and effective.
Valid scientific evidence is evidence from well-controlled
studies, partially controlled studies
and objective trials without matched controls,
well-documented case histories conducted by
qualified experts ... that there is a reasonable
assurance of safety and effectiveness . . .
U.S. Code of Federal Regulations, Title 21 (Food
and Drugs), U.S. Government Printing Office,
Washington DC, 2001, Part 860.7. Web address:
http://www.access.gpo.gov/nara/cfr/waisidx_01/21cfr860_01.html
(Accessed February 2002)

9
Device Regulation

Least Burdensome Provisions of FDA Modernization
Act (1997)
Secretary shall only request information that is
necessary to making substantial equivalence
determinations.
Secretary shall consider, ..., the least
burdensome appropriate means of evaluating device
effectiveness that would have a reasonable
likelihood of resulting in approval.
U.S. Code of Federal Regulations, Title 21 (Food
and Drugs), U.S. Government Printing Office,
Washington DC, 2001, Part 513(i)(1)(D) and
513(a)(3)(D)(ii). Web address:
http://www.access.gpo.gov/nara/cfr/waisidx_01/21cfr860_01.html

10
FDA Least Burdensome Guidance

FDA Guidance: The Least Burdensome Provisions of
the FDA Modernization Act of 1997: Concept and
Principles (2002)
"Modern statistical methods may also play an
important role in achieving a least burdensome
path to market. For example, through the use of
Baysian [sic] analyses, studies can be combined
in order to help reduce the sample size needed
for the experimental and/or control device."

11
Examples of Less Burdensome

Non-U.S. data
Surrogate endpoints (e.g., acute follow-up)
Interim analysis, Adaptive design
Bayesian methods (e.g., to reduce sample size)
Propensity Scores for historical controls
Sensitivity analysis for missing data
Note: could trade clinical for statistical burden
FDA Draft Guidance for the Use of Bayesian
Statistics in Medical Device Clinical Trials (released May 23,
2006): www.fda.gov/cdrh/osb/guidance/1601.html

12
Least Burdensome Provision

Least burdensome provision in FDAMA of 1997 is
directed to both medical devices and diagnostics
(including biomarkers).

13
Device Risk Classification

Class I: Devices for which general controls
provide reasonable assurance of the safety and
effectiveness.
Class II: General controls insufficient. Can
establish special controls (performance
standards: CLIA, ISO, FDA guidance). May require
clinical data in a 510(k).
Class III: General and special controls
insufficient. Life-sustaining/supporting,
substantial importance in preventing impairment
of human health, potential unreasonable risk of
illness or injury. Needs pre-market approval
(PMA).

14
Post-Market Transformation

Make postmarket data more widely available to
Center staff and supplement search and reporting
tools
Investigate the use of data and text mining
techniques to identify the "needles in the
haystack" by identifying patterns in the incoming
data that equate to public health signals.
Example is WebVDME Bayesian data-mining
Design a pilot project to test the usefulness of
quantitative decision-making methods for medical
device regulation across the total product life
cycle

http://www.fda.gov/cdrh/postmarket/mdpi-report-1106.html
15
Types of Biomarkers

Diagnostic
Early detection (screening), enabling
intervention at an earlier and potentially more
curable stage than under usual clinical
diagnostic conditions
Monitoring of disease response during therapy,
with potential for adjusting level of
intervention (e.g. dose) on a dynamic and
personal basis
Risk assessment leading to preventive
interventions for those at sufficient risk
Prognosis, allowing for more aggressive therapy
for patients with poorer prognosis
Prediction of safety or efficacy (response) of a
therapy, thereby providing guidance in choice of
therapy

16
Types of Biomarkers

Diagnostic
Early Detection (screening)
Monitoring
Risk Assessment
Prognostic
Predictive of Safety or Efficacy
The first three are considered together, where
the focus is on identifying the disease or
condition.

17
Types of Biomarkers

Diagnostic
Early Detection (screening)
Monitoring
Risk Assessment
Prognostic
Predictive of Safety or Efficacy
The last three are attempting to predict the future.

18
Analytical Validation

How well are you measuring the measurand?
Precision / Reproducibility
Method Comparison
LoB, LoD, LoQ
Linearity
Stability
Clinical and Laboratory Standards Institute (CLSI)
(http://www.nccls.org/)

19
Clinical Validation (Qualification)

Does the test have clinical utility?
Does it have added value over standard tests
(e.g., clinical covariates like age, tumor size,
stage)?
May or may not require a clinical study
EX. Roche AmpliChip

CDRH guidance document, Statistical Guidance on
Reporting Results from Studies Evaluating
Diagnostic Tests, issued in final form in March
2007, concerns reporting agreement when there is
no perfect standard and also discrepancy
resolution.
http://www.fda.gov/cdrh/osb/guidance/1620.html
20
Roche AmpliChip CYP450 Test (CDRH de novo 510(k)
K042259)

Genotypes two cytochrome P450 genes (29
polymorphisms in CYP2D6 gene, 2 in CYP2C19) to
provide the predictive phenotype of the metabolic
rate for a class of therapeutics metabolized
primarily by CYP2D6 or CYP2C19 gene products.
The phenotypes are
(1) Poor metabolizers
(2) Intermediate metabolizers
(3) Extensive metabolizers
(4) Ultrarapid metabolizers
Cytochrome P450s are a large multi-gene family of
enzymes found in the liver, and are linked to the
metabolism of approximately 70-80% of all drugs.
Among them, the polymorphic CYP2D6 and CYP2C19
genes are responsible for approximately 25% of
all CYP450-mediated drug metabolism. A
polymorphism in these enzymes can lead to an
excessive or prolonged therapeutic effect or
drug-related toxicity after a typical dose by
failing to clear a drug from the blood or by
changing the pattern of metabolism to produce
toxic metabolites.

http://www.accessdata.fda.gov/scripts/cdrh/cfdocs/cfPMN/pmn.cfm
21
Adding Value to Standard Clinical Predictors

Head to Head: Marker superior to clinical
predictors at predicting outcome.
Incremental Improvement: Combination superior to
clinical predictors alone.
Marker Predictive within Clinical Strata: e.g.,
HR(+, -) significant within age, tumor grade,
tumor size groups.

22
Multivariate Index Assays

An IVDMIA is a device that
Combines the values of multiple variables using
an interpretation function to yield a single,
patient-specific result (e.g., a
classification, score, index, etc.), that
is intended for use in the diagnosis of disease
or other conditions, or in the cure, mitigation,
treatment or prevention of disease, and
Provides a result whose derivation is
non-transparent and cannot be independently
derived or verified by the end user.
MIA result could be binary (dichotomous) (such as yes or
no), categorical (such as disease type), ordinal
(such as low, medium, high), or on a continuous
scale.
Source: FDA MIA Draft Guidance
http://www.fda.gov/cdrh/oivd/guidance/1610.html

23
Typical Endpoints for Prognostic or Predictive
Biomarkers

Time to Event

Treatment   Median Survival Time
A           6 months
B           12 months
Hazard Ratio = 0.5

Event by Time t

Treatment   R     Not R   Response Rate
A           30    30      0.50 (30/60)
B           10    50      0.17 (10/60)
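
The hazard ratio of 0.5 quoted with the two median survival times can be reproduced with a short calculation. This is only a sketch under an assumption the slide does not state, namely that survival is exponential in both arms (constant hazard equal to ln 2 divided by the median). The function names are illustrative.

```python
import math


def hazard_from_median(median_months: float) -> float:
    """Constant hazard implied by a median survival time.

    ASSUMPTION: exponential survival, so S(t) = exp(-hazard * t)
    and the median satisfies hazard = ln(2) / median.
    """
    return math.log(2) / median_months


def hazard_ratio(median_reference: float, median_comparator: float) -> float:
    """Hazard ratio of the comparator arm relative to the reference arm."""
    return hazard_from_median(median_comparator) / hazard_from_median(median_reference)


# Medians from the slide: treatment A = 6 months, treatment B = 12 months.
hr_b_vs_a = hazard_ratio(median_reference=6.0, median_comparator=12.0)
print(f"HR (B vs. A) = {hr_b_vs_a:.2f}")
```

Under this assumption the hazard ratio is simply the inverse ratio of the medians; with non-exponential survival the two quantities need not line up this neatly.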
24
Relative Risk vs. Diagnostic Accuracy
Event by Time t

            E     No E    Total
Marker +    30    30      60
Marker -    10    50      60
Total       40    80      120

Relative Risk = 3.0 = (30/60)/(10/60)
Se = 0.75 (30/40)
Sp = 0.63 (50/80)
PPV = 0.50 (30/60)
NPV = 0.83 (50/60)

Relative Risk looks good, but Dx accuracy not
great: limited clinical utility?

Example taken from Emir, Wieand, Su, Cha,
Analysis of repeated markers used to predict
progression of cancer, Statist. Med., 17, 2563-78,
1998.
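
The figures on this slide all come from the same 2x2 table, which the following minimal sketch makes explicit. The function name and dictionary keys are illustrative; the counts are the ones given on the slide, with the marker-positive row taken as the row with 30 events (consistent with Se = 30/40).

```python
def two_by_two_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Accuracy and risk measures from a 2x2 marker-by-event table.

    tp: marker positive, event      fp: marker positive, no event
    fn: marker negative, event      tn: marker negative, no event
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # Event risk in marker-positive patients over risk in marker-negative patients.
        "relative_risk": (tp / (tp + fp)) / (fn / (fn + tn)),
    }


# Counts from the slide: marker + row (30, 30), marker - row (10, 50).
metrics = two_by_two_metrics(tp=30, fp=30, fn=10, tn=50)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

The same table yields a relative risk of 3 and a PPV of only 0.5, which is the contrast the slide is drawing.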
25
Hazard Ratio vs. Diagnostic Accuracy

NCCTG Mayo Clinic Study. CA15-3 ratio as
diagnostic for progression of breast cancer (as
determined by physical exam).

Hazard Ratio = 2.3 (p = 0.0002)
Se = 0.30 (0.17, 0.43)
Sp = 0.82 (0.74, 0.89)
PPV = 0.27 (0.21, 0.33)
Example taken from Emir, Wieand, Su, Cha,
Analysis of repeated markers used to predict
progression of cancer, Statist. Med., 17, 2563-78,
1998.
26
Diagnostic Performance
Sensitivity (TP rate): fraction of responders who test +
Specificity (TN rate): fraction of non-responders who test -
FP rate (1 - specificity): fraction of non-responders who test +
Test is useful if TP rate > FP rate, i.e.,
sensitivity + specificity > 1.
EX. Useless test: sensitivity = 0.80, specificity = 0.20
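
One way to see why a test with sensitivity 0.80 and specificity 0.20 is useless is to apply Bayes' theorem: when sensitivity + specificity = 1, the probability of response after a positive (or negative) result equals the overall response rate, so the test changes nothing. The sketch below assumes a few illustrative response rates, which are not from the slides.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float):
    """Return (PPV, NPV) from sensitivity, specificity and prevalence via Bayes' theorem.

    'prevalence' here is the overall fraction of responders.
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Illustrative response rates (assumed); sensitivity and specificity from the slide.
for prevalence in (0.1, 0.3, 0.5):
    ppv, npv = predictive_values(0.80, 0.20, prevalence)
    # Both post-test response probabilities equal the pre-test response rate.
    print(prevalence, round(ppv, 3), round(1.0 - npv, 3))
```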
27
Diagnostic Performance
Positive predictive value (PPV): fraction of test +s who respond
Negative predictive value (NPV): fraction of test -s who don't respond
1 - NPV: fraction of test -s who respond
Test is useful if PPV + NPV > 1.
EX. Useless test: PPV = 0.60, NPV = 0.40
28
A ROC curve is a plot of sensitivity (true
positive rate) vs. 1-specificity (false positive
rate) over all possible cutoff points for the
test. The test is informative if the area under
the curve is greater than 0.5.
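
The area under the ROC curve can be computed without drawing the curve, because it equals the probability that a randomly chosen responder has a higher marker value than a randomly chosen non-responder (ties counted as one half). The sketch below uses made-up marker values purely for illustration; they are not data from the presentation.

```python
def roc_auc(values_responders, values_nonresponders) -> float:
    """Area under the ROC curve via the Mann-Whitney pairwise comparison."""
    wins = 0.0
    for r in values_responders:
        for n in values_nonresponders:
            if r > n:
                wins += 1.0
            elif r == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(values_responders) * len(values_nonresponders))


# HYPOTHETICAL marker values, for illustration only.
responders = [0.9, 0.8, 0.6, 0.55]
nonresponders = [0.7, 0.5, 0.4, 0.3]
auc = roc_auc(responders, nonresponders)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to a marker whose values are distributed identically in responders and non-responders.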
29
Prognostic Biomarker (Strong Defn)

Prognostic factor. Informs about an outcome
independent of specific treatment (ability of
tumor to proliferate, invade, and/or spread).
Prognostic biomarker is associated with
likelihood of an outcome (e.g., survival,
response, recurrence) such that magnitude of
association is independent of treatment.
On some scale, treatment and biomarker effects
are additive, that is, do not interact.

30
HR(A,B) = 0.67
HR(A,B) = 0.67
31
(No Transcript)
32
Prognostic Biomarker (Weak Defn)

Prognostic factor. Informs about an outcome
independent of specific treatment (ability of
tumor to proliferate, invade, and/or spread).
Prognostic biomarker is associated with
likelihood of an outcome (e.g., survival,
response, recurrence) in a population that is
untreated or on a standard (non-targeted)
treatment.
If population is clearly defined, then can use
to choose more or less aggressive therapy, but
not specific therapies, per se.

33
HR(A,B) = 0.67
HR(A,B) = 0.67
34
Prognostic Biomarker

Her2-neu for node-negative women with breast
cancer, prognostic for recurrence
Breast cancer prognostic test based on microarray
gene expression of RNAs extracted from breast
tumor tissue to assess a patient's risk for
distant metastasis, for women younger than 61 with
Stage I or II disease with tumor size less than
or equal to 5.0 cm and who are lymph node negative.
(Ref. Buyse et al., JNCI 98, 1183-1192)

35
Agendia MammaPrint Gene Signature for Time to
Distant Metastasis (N = 302)

           Low risk group       High risk group
5-year     0.95 (0.91-0.99)     0.78 (0.72-0.84)
10-year    0.90 (0.85-0.96)     0.71 (0.65-0.78)

Buyse et al., JNCI (2006), 98, 1183-1192
36
Proportion alive at 10 years

Clinical     Gene Signature   N      Proportion
Low Risk     Low Risk         52     0.88 (0.74 to 0.95)   Sp
Low Risk     High Risk        28     0.69 (0.45 to 0.84)   1 - Se
High Risk    Low Risk         59     0.89 (0.77 to 0.95)   Sp
High Risk    High Risk        163    0.69 (0.61 to 0.76)   1 - Se
Buyse et al., JNCI 2006

37
Predictive Biomarker

Predictive factor. Implies relative sensitivity
or resistance to specific treatments or agents.
Predictive biomarker predicts differential effect
of treatment on outcome.
Treatment and biomarker interact.
Predictive biomarker can be useful for selecting specific
therapy.

38
HR(A,B) = 0.5
HR(A,B) = 1.0
39
Predictive Biomarker of Efficacy

Marker: HER2/neu
Treatment: Trastuzumab (Herceptin)
Objective response rate

          Herceptin + Chemo    Chemo
FISH+     95/176 (54%)         51/168 (30%)
FISH-     19/50 (38%)          22/57 (39%)

Arch. Pathol. Lab. Med., Jan 2007 (ASCO/CAP
Guidelines)
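
The response counts on this slide can be used to illustrate what "treatment and biomarker interact" means numerically. The sketch below compares the treatment odds ratio in FISH+ and FISH- patients with a Wald test on the log odds ratio scale. That choice of scale and test is an assumption made here for illustration; the slide reports only the response rates and does not say how an interaction was tested.

```python
import math


def log_odds_ratio(resp_trt: int, n_trt: int, resp_ctl: int, n_ctl: int):
    """Log odds ratio (treatment vs. control) and its large-sample variance."""
    a, b = resp_trt, n_trt - resp_trt  # treatment: responders, non-responders
    c, d = resp_ctl, n_ctl - resp_ctl  # control: responders, non-responders
    estimate = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return estimate, variance


# Counts from the slide (Herceptin + chemo vs. chemo alone).
lor_pos, var_pos = log_odds_ratio(95, 176, 51, 168)  # FISH+
lor_neg, var_neg = log_odds_ratio(19, 50, 22, 57)    # FISH-

# Wald test for interaction: difference of the two log odds ratios.
z = (lor_pos - lor_neg) / math.sqrt(var_pos + var_neg)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(f"OR in FISH+: {math.exp(lor_pos):.2f}")
print(f"OR in FISH-: {math.exp(lor_neg):.2f}")
print(f"interaction z = {z:.2f}, two-sided p = {p_two_sided:.3f}")
```

A treatment odds ratio well above 1 in one marker subset and near 1 in the other is the pattern expected of a predictive marker; whether the difference is convincing depends on the scale chosen and on the subset sizes.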

40
Predictive Biomarkers for Safety

Predict risk of an adverse event dependent on the
biomarker
Example
UGT1A1, cleared by FDA, to predict the risk of
neutropenia in patients taking irinotecan for
colorectal cancer

41
Prospective Study Designs for Predictive Markers

Untargeted Design (Reference)
Validate Treatment, Marker Simultaneously
Marker by Treatment Design
Targeted Design (Marker Subset Only)
Marker Strategy Design
Historical Control

42
Untargeted Design (Reference)

Test if drug works in entire population.
Mixture of marker and drug effects.
Can store samples if test is not ready.

43
Marker by Treatment (Interaction) Design

A Randomized Block Design
Can test for biomarker by treatment interaction
(predictive biomarker)
Test needs to be available before trial ensues.

44
Marker by Treatment Design Questions

Test Drug Overall and within Marker Subset
Tests at levels 0.04 (overall) and 0.01 (subset) suggested to
control Type I error rate at 0.05 (Simon), but subset could
drive overall result.
Frequentist multiplicity penalty may preclude
subset testing as good business strategy.
Statement about drug, not biomarker
Test Marker Overall and within Drug Subset
Statement about marker, not drug.
Test for Treatment by Marker Interaction
Simultaneously validates drug and marker.

45
Targeted Design

Test if drug works in subset.
Cannot test if marker discriminates. Only PPV
available.

46
Efficiency of Designs
                                         Relative Efficiency
Marker Prevalence   Relative Efficacy    Targeted Design   Interaction Design
25%                 0%                   16x               8x
50%                 0%                   4x                2x
75%                 0%                   1.8x              0.9x

Efficiency gain depends on marker prevalence,
relative efficacy, and difference tested.

Relative efficacy: marker - to marker + patients
Targeted Design: Simon & Maitournam, CCR 2004
Marker by Treatment Design: test for interaction, approx. efficiency
when enriching with half +s, half -s.
47
Efficiency of Designs
                                         Relative Efficiency
Marker Prevalence   Relative Efficacy    Targeted Design   Interaction Design
25%                 25%                  5.2x              1.5x
50%                 25%                  2.6x              0.7x
75%                 25%                  1.5x              0.4x

Efficiency gain depends on marker prevalence,
relative efficacy, and difference tested.

Relative efficacy: marker - to marker + patients
Targeted Design: Simon & Maitournam, CCR 2004
Marker by Treatment Design: test for interaction, approx. efficiency
when enriching with half +s, half -s.
48
Efficiency of Designs
                                         Relative Efficiency
Marker Prevalence   Relative Efficacy    Targeted Design   Interaction Design
25%                 50%                  2.5x              0.3x
50%                 50%                  1.8x              0.2x
75%                 50%                  1.3x              0.1x

Efficiency gain depends on marker prevalence,
relative efficacy, and difference tested.

Relative efficacy: marker - to marker + patients
Targeted Design: Simon & Maitournam, CCR 2004
Marker by Treatment Design: test for interaction, approx. efficiency
when enriching with half +s, half -s.
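
The targeted-design column of the three efficiency tables is consistent with the approximate formula of Simon and Maitournam for the ratio of patients randomized (untargeted over targeted), which the sketch below evaluates. It assumes a perfectly accurate assay and equal outcome variance in the marker subsets. The second function is only an inference from the tabulated numbers, not something stated on the slides: the interaction-design column matches the targeted value multiplied by (1 - relative efficacy)^2 / 2 to within rounding in most rows.

```python
def targeted_relative_efficiency(prevalence: float, relative_efficacy: float) -> float:
    """Approximate ratio of patients randomized: untargeted design / targeted design.

    prevalence: fraction of patients who are marker positive.
    relative_efficacy: treatment effect in marker-negative patients divided by
        the effect in marker-positive patients (0 means no effect in negatives).
    """
    # Average effect in the unselected population, in units of the marker + effect.
    diluted_effect = prevalence + (1.0 - prevalence) * relative_efficacy
    return (1.0 / diluted_effect) ** 2


def interaction_relative_efficiency(prevalence: float, relative_efficacy: float) -> float:
    """ASSUMPTION: relation inferred from the tabulated values, not from the source."""
    targeted = targeted_relative_efficiency(prevalence, relative_efficacy)
    return targeted * (1.0 - relative_efficacy) ** 2 / 2.0


for relative_efficacy in (0.0, 0.25, 0.5):
    for prevalence in (0.25, 0.5, 0.75):
        t = targeted_relative_efficiency(prevalence, relative_efficacy)
        i = interaction_relative_efficiency(prevalence, relative_efficacy)
        print(f"prevalence={prevalence:.2f} rel.efficacy={relative_efficacy:.2f} "
              f"targeted={t:.2f}x interaction={i:.2f}x")
```

The formula makes the qualitative message of the tables explicit: the targeted design gains most when the marker is rare and the drug does little in marker-negative patients, and the gain shrinks quickly as either condition is relaxed.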
49
Improving Efficiency of Interaction Design

Enrich with Test Positives if Pr(+) is low
Find scale such that marker and treatment effects
are additive
Adaptive Randomization
Bayesian subset analysis
If reader variability (e.g., IHC), then use
multiple readers.
Prior Information

50
Possibilities for Increasing Efficiency of
Interaction Design

Enrich with Test Positives if Pr(+) is low
Estimates of Sensitivity and Specificity are
biased because they depend on Pr(+).
Use inverse probability weighting (Horvitz,
Thompson, 1952) or Bayes' Theorem (Begg, Greenes,
1983) to obtain unbiased estimates.

51
A Marker-Based Strategy

Pro: More ethical, perhaps. More patients given
experimental drug. Test utility based on PPV_E,
NPV_E.
Con: Cannot assess test-treatment interaction.

52
Marker-Based Strategy
Response

Test +
        R     Not R
E       a     b
P       0     0

Test -
        R     Not R
E       c     d
P       e     f

        E, Naïve        E, Unbiased
Se      a / (a+c)       a / (a+2c)
Sp      d / (d+b)       2d / (2d+b)
PPV     a / (a+b)       same
NPV     d / (c+d)       same
53
A Marker-Based Strategy
Response

Test +
        R     Not R   Total
E       20    20      40
P       0     0       0
Total   20    20      40

Test -
        R     Not R   Total
E       23    157     180
P       24    156     180
Total   46    314     360

        E, Naïve           E, Unbiased
Se      20/43 (0.47)       20/66 (0.30)
Sp      157/177 (0.89)     314/334 (0.94)
PPV     20/40 (0.50)       Same
NPV     157/180 (0.87)     Same
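
The naive and unbiased columns differ only in how test-negative patients on E are weighted. In this design all test-positive patients receive E, while test-negative patients are split between E and P, so each test-negative patient on E stands in for more than one test-negative patient. A minimal sketch of that inverse probability weighting follows; the default weight of 2 assumes 1:1 randomization of test negatives, which matches the 180/180 split in the example, and the function name is illustrative.

```python
def strategy_accuracy(a: int, b: int, c: int, d: int, weight_neg: float = 2.0) -> dict:
    """Se, Sp, PPV, NPV for response to E in a marker-based strategy design.

    a, b: responders / non-responders among test-positive patients (all on E).
    c, d: responders / non-responders among test-negative patients on E.
    weight_neg: inverse of the probability that a test-negative patient is
        assigned to E (ASSUMPTION: 1:1 randomization, so the weight is 2).
    """
    return {
        "se_naive": a / (a + c),
        "sp_naive": d / (d + b),
        "se_weighted": a / (a + weight_neg * c),
        "sp_weighted": weight_neg * d / (weight_neg * d + b),
        # Predictive values condition on the test result, so no weighting is needed.
        "ppv": a / (a + b),
        "npv": d / (c + d),
    }


# Counts from the worked example: a=20, b=20 (test +); c=23, d=157 (test -, on E).
result = strategy_accuracy(a=20, b=20, c=23, d=157)
for name, value in result.items():
    print(f"{name}: {value:.2f}")
```

The weighting lowers the sensitivity estimate and raises the specificity estimate because the unweighted sample over-represents test positives among patients on E.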
54
Possibilities for Increasing Efficiency of
Interaction Design

Transformation
Find a transformation (Box-Cox?) of outcome that
makes treatment and marker effects additive.
Can then pool marker effect estimates within
treatments A and B.
Can also pool drug effect estimates within marker
+s and marker -s.

55
Possibilities for Increasing Efficiency of
Interaction Design

Adaptive Randomization
Adapt randomization ratio to treatment A and B
within biomarker subsets to maximize (a) power,
or (b) fraction of patients on better treatment
If response rate < 0.5 for both treatments, then
(a) and (b) are compatible, otherwise in tension.
Pr(+) disturbed, so need to adjust Se, Sp

56
Possibilities for Increasing Efficiency of
Interaction Design

Bayesian subset analysis (cf. Dixon, Simon)
Subsets modeled as exchangeable via random
effects.
Subset estimate borrows strength from complement
subset, increasing precision of estimate.
However, interaction estimate more conservative
relative to usual non-Bayesian analysis.

57
Bayesian Subset Analysis

Power is enhanced to show drug works in marker
subset (blue).
Power is enhanced to show marker works
(discriminates) in patients taking drug (red)

58
Possibilities for Increasing Efficiency of
Interaction Design

Use Multiple Readers
EGFR IHC test (Dako) and Cetuximab and
Panitumumab (Amgen) for Colorectal Cancer. % of
cells stained and maximum staining intensity
subject to reader variability
Use multiple readers, account for random reader
effects.
Multiple Reader, Multiple Case Designs (MRMC) are
used for digital mammography systems and
computer-aided detection (CAD) systems
Analysis can be difficult.

59
Possibilities for Increasing Efficiency of
Interaction Design

Prior Information (Bayesian analysis)
Borrow strength from previous study regarded as
exchangeable with current study.

60
Marker Based Strategy Design
[Flow diagram. Boxes: Register; Test Marker; Randomize;
Marker Based Strategy, with Marker Level (-) to Treatment A
and Marker Level (+) to Treatment B;
Non Marker Based Strategy, with Treatment A.]
Sargent et al., JCO 2005
61
Marker Based Strategy Design
[Flow diagram. Boxes: Register; Test Marker; Randomize;
Marker Based Strategy, with Marker Level (-) to Treatment A
and Marker Level (+) to Treatment B;
Non Marker Based Strategy, with a second Randomize to
Treatment A or Treatment B.]
Sargent et al., JCO 2005
62
Marker Based Strategy Design

Lacks power: Differential effect comparison
diluted because some patients in non-marker-based
strategy arm get marker-based treatment (could
eliminate these to increase power).
Might be best suited if have > 2 treatments or >
2 markers
EX. Irinotecan regimen (dose, timing, frequency)
determined by UGT1A1 genotype (6/6, 6/7, or 7/7)
in colorectal cancer patients.

63
Marker Based Strategy

If no gold standard, then can be only way to
assess effectiveness of a test.
EX. Detection of tumor of origin (TOO) in cancers of
unknown primary.
No gold standard: IHC, imaging, may fail to
identify TOO.
Randomize patients to be managed with
new test + standard, or
with standard alone
Compare arms on survival

64
Targeted Design w. Historical Control

Drug already on market, but has poor response
rate.
If response rate in marker study is
significantly greater than historical rate, then
marker discriminates.
Limitations
Lacks power because effect diluted.
Need to calibrate historical rate to marker
study (adjust for covariates).
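
The comparison against a historical response rate can be sketched as a one-sided exact binomial test. Every number below is hypothetical and chosen only for illustration, and the sketch ignores the calibration issue listed under Limitations: it treats the historical rate as a fixed, known value that applies to the marker study population.

```python
import math


def binomial_upper_tail(successes: int, n: int, p0: float) -> float:
    """P(X >= successes) for X ~ Binomial(n, p0): a one-sided exact p-value."""
    return sum(
        math.comb(n, k) * p0**k * (1.0 - p0) ** (n - k)
        for k in range(successes, n + 1)
    )


# HYPOTHETICAL inputs, not taken from the presentation.
historical_rate = 0.10   # response rate of the marketed drug in unselected patients
n_marker_positive = 40   # marker-positive patients treated in the marker study
responders = 9           # responders observed among them

p_value = binomial_upper_tail(responders, n_marker_positive, historical_rate)
print(f"observed rate = {responders / n_marker_positive:.3f}, "
      f"one-sided p = {p_value:.4f}")
```

Because the historical rate is itself an estimate from a different population, a small p-value here is weaker evidence than the same p-value from a randomized comparison.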

65
Prospective-Retrospective Designs

Prospectively apply marker to stored samples (in
retrospect).
Can test overall, w. subset, or for interaction.
Missing samples could introduce bias.
RCT samples. Randomization ensures case and
control samples have similar characteristics.
Case-control samples. Avoid selection bias by
matching on sample processing date, processing
sites, etc., and not excluding censored times.
Reserve samples only for analytically validated
markers that are biologically plausible.

66
The Challenge of Multiplicity

Multiplicity of classifiers
Microarrays and proteomics
Many predictive models could be built with so
many inputs.
The challenge is to confirm any such model with
an independent data set.
A caveat: the independent test data set cannot
be continually reused. Great discipline is
required in this regard.

67
Cross-Validation Pitfall
Simon, Radmacher, Dobbin, McShane (2003),
Pitfalls in the Use of DNA Microarray Data for
Diagnostic and Prognostic Classification, JNCI,
95 (1)
68
Summary Remarks

How to assess a test or biomarker is well-known,
but not as well-known in therapeutic circles.
Need to assess whether the biomarker adds
anything to what we already know.
The number of possibly good biomarker candidates
is enormous but great care is needed in
restricting the search.

69
Summary Remarks

Need to encourage least burdensome approaches to
validating biomarkers without compromising level
of evidence
Essential to confirm marker in independent
dataset
Studies to demonstrate informativeness of a
biomarker can be quite difficult to design,
conduct and analyze.

70
Acknowledgements

CDRH Division of Biostatistics (DBS)
Greg Campbell, Division Director
Diagnostic Devices Branch (DDB)
Lakshmi Vishnuvajjala, Branch Chief
Estelle Russek-Cohen, Team Leader
Gene Pennello, Team Leader

Bipasa Biswas, Kyungsook Kim, Harry Bushar, Samir
Lababidi, Arkendra De, Kristen Meier, Shanti
Gomatam, Kyunghee Song, Thomas Gwise, Rong Tang
71
More References

Sargent et al. (2005). Clinical trial designs for
predictive marker validation in cancer treatment
trials. J Clin Oncol 23:2020-2027.
Pennello & Vishnuvajjala (2005). Statistical
design and analysis issues with pharmacogenomic
drug-diagnostic co-development. In American Stat.
Assoc. 2005 Proc. of the Biopharm. Section, Joint
Statistical Meetings, Minneapolis, MN, August
2005. American Stat. Assoc., Alexandria, VA.
FDA Drug-Diagnostic Co-Development Concept Paper.
April 2005.
http://www.fda.gov/cder/genomics/pharmacoconceptfn.pdf

72
(No Transcript)
