Sample-Size Analysis: Considering Traditional and Crucial Type I and Type II Error Rates

1
Sample-Size Analysis: Considering Traditional and
Crucial Type I and Type II Error Rates
  • Ralph O'Brien, PhD, Center for Clinical
    Investigation, Case Western Reserve University

Edvard Munch, The Scream, 1893
2
Objectives
  • Explain Type I and II errors and error rates,
    both classical and crucial.
  • Understand what factors affect these error rates.
  • Determine the sample size to achieve given
    objectives.
  • Use the software PASS and O'Brien's Excel program
    to analyze power and sample size.

3
Eric Topol, et al. (1997)
  • The calculation and justification of sample size
    is at the crux of the design of a trial. Ideally,
    clinical trials should have adequate power, 90%,
    to detect a clinically relevant difference
    between the experimental and control therapies.
    Unfortunately, the power of clinical trials is
    frequently influenced by budgetary concerns as
    well as pure biostatistical principles

4
Eric Topol, et al. (1997)
  • Yet an underpowered trial is, by definition,
    unlikely to demonstrate a difference between the
    interventions assessed and may ultimately be
    considered of little or no clinical value. From
    an ethical standpoint, an underpowered trial may
    put patients needlessly at risk of a new therapy
    without being able to come to a clear conclusion.

5
Richard Feynman
Richard Feynman, 1918-1988
Nobel Laureate in Physics
Adventuresome, joking, ever the curious character
6
The March of Science
  • Scientific knowledge is a body of statements of
    varying degrees of uncertainty,
  • some mostly unsure,
  • some nearly sure,
  • none absolutely certain

Richard Feynman, 1918-1988
Nobel Laureate in Physics
Adventuresome, joking, ever the curious character
7
March of Science in clinical research
8
Peter Stacpoole
  • Does DCA decrease mortality in children
    with severe malaria?

9
Sol Capote
  • Does QCA decrease mortality in children
    with severe malaria?

Pablo Picasso, Portrait of Max Jacob, 1907
10
Capote's proposed study?
  • Design? Allocation ratio?
  • Subjects?
  • Primary efficacy outcome measure?
  • Primary analysis? One- or two-sided test?
  • Scenario for the infinite data set?

11
Design? Allocation ratio?
  • Two groups
  • Randomized, double blind.
  • 1 pt gets Usual Care; 2 pts get QCA (1:2 allocation)

12
Subjects?
  • Set inclusion and exclusion criteria.
  • Total N? Consider 700, 1400, 2100.

13
Primary efficacy outcome measure?
  • Death before 10 days (0 vs. 1)
  • No censoring
  • Disregard exact survival times

14
Primary analysis? One- or two-sided test?
  • Compare deaths
  • Likelihood ratio test of 2 independent
    proportions. (Others would be OK, too.)
  • Two-sided (This is arguable.)

15
Scenario for the infinite data set?
  • Usual Care mortality rate = 0.15
  • QCA cuts mortality by 25% or 33.33%
  • ⇒ QCA mortality rate = 0.1125 or 0.10.

18
  • Do classical power analysis with PASS.
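As a rough cross-check of what a tool such as PASS reports, the scenario above can be sketched in Python. This is only a sketch under stated assumptions: it uses a normal-approximation z test instead of the likelihood ratio test named on the slides, it assumes the 1:2 allocation applies at every total N, and the function name `power_two_proportions` is made up for illustration. Its results should land close to, but not exactly on, the powers quoted in the deck.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p_control, p_treat, n_total, alloc=(1, 2), alpha=0.05):
    """Approximate power of a two-sided z test for two independent proportions.

    Assumption: normal approximation, not the likelihood ratio test.
    alloc = (control share, treatment share), e.g. (1, 2) for 1:2.
    """
    z = NormalDist()
    n_control = n_total * alloc[0] / sum(alloc)
    n_treat = n_total * alloc[1] / sum(alloc)
    diff = abs(p_control - p_treat)
    # Standard error under the null (pooled) and under the alternative (unpooled).
    p_pool = (n_control * p_control + n_treat * p_treat) / n_total
    se_null = sqrt(p_pool * (1 - p_pool) * (1 / n_control + 1 / n_treat))
    se_alt = sqrt(p_control * (1 - p_control) / n_control
                  + p_treat * (1 - p_treat) / n_treat)
    z_crit = z.inv_cdf(1 - alpha / 2)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return z.cdf((diff - z_crit * se_null) / se_alt)


usual_care = 0.15
for cut in (0.25, 1 / 3):
    qca = usual_care * (1 - cut)  # 0.1125 or 0.10
    for n_total in (700, 1400, 2100):
        power = power_two_proportions(usual_care, qca, n_total)
        print(f"QCA rate {qca:.4f}, total N {n_total}: power ~ {power:.2f}")
```

The loop covers the two effect sizes and three total sample sizes proposed for the study, so the effect of N on power is visible at a glance.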

19
Beyond α and β
  • What are crucial Type I and Type II error rates?

20
Crucial error rates
  • Type I: If the test yields traditional
    statistical significance (p ≤ α), what is the
    chance this will be an incorrect inference?
  • Type II: If the test does not yield traditional
    statistical significance (p > α), what is the
    chance this will be an incorrect inference?

21
  • How does this relate to statistical power?

22
Which study has the strongest evidence that QCA
is effective?
Note: With UCO mortality of 0.15, QCA relative
risk of 0.67, and α = 0.05: Ntotal = 450 ⇒ power =
0.33; Ntotal = 2100 ⇒ power = 0.90
23
Suppose you believe that
  • Only 30% of hypotheses being investigated are
    actually non-null.

24
Lee and Zelen (2000)
Same logic as in O'Brien and Castelloe
(2006), but we switched the α* and β* notation (on
purpose).
25
Lee and Zelen (2000)
27
Of course, this is an example of Bayes' Theorem,
but you may want to say that quietly due to
Bayesphobia, a disease with still-high prevalence
in the scientific community (including
statistical scientists).
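Written out, the Bayes' Theorem step looks like the following. The symbols are assumptions made for this sketch: γ is the prior probability that the hypothesis under study is non-null, α* is the crucial Type I error rate, and β* is the crucial Type II error rate.

```latex
\begin{align*}
\alpha^{*} &= \Pr[H_0 \text{ true} \mid p \le \alpha]
  = \frac{\alpha\,(1-\gamma)}{\alpha\,(1-\gamma) + (1-\beta)\,\gamma}, \\[4pt]
\beta^{*} &= \Pr[H_0 \text{ false} \mid p > \alpha]
  = \frac{\beta\,\gamma}{\beta\,\gamma + (1-\alpha)\,(1-\gamma)}.
\end{align*}
```

Each denominator is the total probability of the observed test outcome (significant or not), and each numerator is the part of that probability coming from the wrong state of nature.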
28
Greater power reduces both crucial error rates.
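A minimal numeric sketch of this claim, using the figures already given in the deck (30% of hypotheses non-null, α = 0.05, power 0.33 versus 0.90). The function name `crucial_error_rates` and its argument names are illustrative, not taken from the Excel program.

```python
def crucial_error_rates(alpha, power, prior_nonnull):
    """Return (crucial Type I rate, crucial Type II rate) via Bayes' Theorem.

    prior_nonnull is the assumed probability that the hypothesis is non-null.
    """
    false_pos = alpha * (1 - prior_nonnull)       # null true, test significant
    true_pos = power * prior_nonnull              # null false, test significant
    false_neg = (1 - power) * prior_nonnull       # null false, test not significant
    true_neg = (1 - alpha) * (1 - prior_nonnull)  # null true, test not significant
    crucial_type1 = false_pos / (false_pos + true_pos)
    crucial_type2 = false_neg / (false_neg + true_neg)
    return crucial_type1, crucial_type2


for power in (0.33, 0.90):
    type1, type2 = crucial_error_rates(alpha=0.05, power=power, prior_nonnull=0.30)
    print(f"power {power:.2f}: crucial Type I ~ {type1:.3f}, crucial Type II ~ {type2:.3f}")
```

Under these assumptions, moving from power 0.33 to 0.90 lowers both rates, even though α itself never changes.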
29
  • Study crucial error rates with Excel program.

30
  • Fighting the Common Cold

31
Does zinc gluconate glycine reduce the duration
of the common cold?
Dr. Macknin's study
35
Mossad, Macknin, et al. (1996)
36
Mossad, Macknin, et al. (1996)
  • sample size of 100 patients
  • detect a difference in number of days with cold
    symptoms; means:
  • 8 days in the placebo group
  • 4 days in the zinc group
  • standard deviation of 6 days
  • two-sided P-value of 0.05
  • approximate power of 90%

It looks like they assumed Normality.
37
But wait! Does this scenario make sense?
38
Checking (with SAS)
  proc power;
    TwoSampleMeans
      GroupMeans = 4 | 8
      StdDev = 6
      alpha = 0.05
      NPerGroup = 50
      power = .;   /* solve for power */
  run;

39
  The POWER Procedure
  Two-sample t Test for Mean Difference

  Distribution             Normal
  Method                   Exact
  Alpha                    0.05
  Group 1 Mean             4
  Group 2 Mean             8
  Standard Deviation       6
  Sample Size Per Group    50
  Number of Sides          2
  Null Difference          0

  Computed Power           0.910
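The same check can be approximated without SAS. The sketch below swaps in a normal (z) approximation for the exact noncentral-t calculation that PROC POWER performs, so it should come out slightly above the exact figure; the function name `power_two_means` is illustrative.

```python
from math import sqrt
from statistics import NormalDist


def power_two_means(mean1, mean2, sd, n_per_group, alpha=0.05):
    """Approximate two-sided power for comparing two means with a common SD.

    Assumption: z approximation to the two-sample t test.
    """
    z = NormalDist()
    # Standardized distance between the group means, in standard-error units.
    ncp = abs(mean1 - mean2) / (sd * sqrt(2 / n_per_group))
    return z.cdf(ncp - z.inv_cdf(1 - alpha / 2))


print(f"approximate power: {power_two_means(4, 8, sd=6, n_per_group=50):.3f}")
```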

40
Better way: assume logNormal
  proc power;
    TwoSampleMeans
      test = ratio
      dist = logNormal
      MeanRatio = 0.5   /* 4/8 */
      /* Coef_Var = SD/mean: 6/4, 6/6, 6/8 */
      CV = 1.5 1.0 0.75
      alpha = 0.05
      NPerGroup = 50
      power = .;
  run;

41
Powers for logNormal
  The POWER Procedure
  Two-sample t Test for Mean Ratio

  Distribution                 Lognormal
  Method                       Exact
  Alpha                        0.05
  Geometric Mean Ratio         0.5
  Sample Size Per Group        50
  Number of Sides              2
  Null Geometric Mean Ratio    1

  Computed Power
  Index    CV      Power
  1        1.50    0.885
  2        1.00    0.985
  3        0.75    >.999
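To see where these lognormal powers come from, the calculation can be sketched on the log scale, where a coefficient of variation CV corresponds to a log-scale standard deviation of sqrt(ln(1 + CV²)). This is a z approximation, not the exact method SAS reports, and `power_lognormal_ratio` is an illustrative name; the results should track the table closely without matching it digit for digit.

```python
from math import log, sqrt
from statistics import NormalDist


def power_lognormal_ratio(gm_ratio, cv, n_per_group, alpha=0.05):
    """Approximate two-sided power for a two-group test of a geometric mean ratio.

    Assumptions: both groups share the same CV, and a z approximation
    stands in for the exact t-based calculation.
    """
    z = NormalDist()
    sd_log = sqrt(log(1 + cv ** 2))  # SD of log(outcome) implied by the CV
    ncp = abs(log(gm_ratio)) / (sd_log * sqrt(2 / n_per_group))
    return z.cdf(ncp - z.inv_cdf(1 - alpha / 2))


for cv in (1.5, 1.0, 0.75):
    print(f"CV {cv:.2f}: power ~ {power_lognormal_ratio(0.5, cv, n_per_group=50):.3f}")
```

The three CVs correspond to holding the standard deviation at 6 days while the mean is taken as 4, 6, or 8 days.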

42
As-analyzed way: log-rank test
  proc power;
    TwoSampleSurvival
      test = logrank
      alpha = 0.05
      AccrualTime = 30
      TotalTime = 90
      GroupMedSurvTimes = 4 | 8
      NPerGroup = 50
      power = .;
  run;

43
  The POWER Procedure
  Log-Rank Test for Two Survival Curves

  Method                          Lakatos normal approximation
  Form of Survival Curve 1        Exponential
  Form of Survival Curve 2        Exponential
  Accrual Time                    30
  Total Time                      90
  Alpha                           0.05
  Group 1 Median Survival Time    4
  Group 2 Median Survival Time    8
  Sample Size Per Group           50

  Computed Power                  0.917
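For a rough independent check of the log-rank power, the sketch below uses Schoenfeld's events-based approximation in place of the Lakatos method that SAS applies, so some difference from the SAS figure is expected. It assumes exponential survival in both groups, uniform accrual, and no dropout; the function names are illustrative.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def event_probability(median, accrual_time, total_time):
    """Chance a subject has the event before the study ends.

    Assumptions: exponential survival, uniform accrual, no dropout.
    """
    hazard = log(2) / median
    min_follow = total_time - accrual_time
    surviving = (exp(-hazard * min_follow) - exp(-hazard * total_time)) / (hazard * accrual_time)
    return 1 - surviving


def logrank_power_schoenfeld(median1, median2, n_per_group,
                             accrual_time, total_time, alpha=0.05):
    """Approximate two-sided log-rank power from the expected number of events."""
    z = NormalDist()
    events = n_per_group * (event_probability(median1, accrual_time, total_time)
                            + event_probability(median2, accrual_time, total_time))
    log_hazard_ratio = abs(log(median2 / median1))  # exponential: hazard is ln(2)/median
    # With equal allocation, the log hazard ratio estimate has variance of about 4/events.
    return z.cdf(sqrt(events / 4) * log_hazard_ratio - z.inv_cdf(1 - alpha / 2))


power = logrank_power_schoenfeld(4, 8, n_per_group=50, accrual_time=30, total_time=90)
print(f"approximate log-rank power: {power:.3f}")
```

Because follow-up is long relative to the assumed medians, nearly every subject is expected to have the event, which is why the power is driven almost entirely by the sample size of 100.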

44
Results
P < 0.001, log-rank test
Placebo (n = 50)
Cold-Eeze (n = 49)
45
(No Transcript)
46
Dr. Macknin
  • I got goosebumps when we broke the code. I
    didn't think it was going to work.
  • here was something that actually looked like
    it was helping the common cold.
  • nothing had really worked like this before.

47
What was that again?
  • I didn't think it was going to work.

48
So, what do you think?
49
  • Quickly Reducing Atherosclerosis

51
Atheroma Volume
Atheroma Area / EEM Area = 8.1 mm² / 14.37 mm² = 56%
52
Does SuperHDL reduce atheroma volume in
patients with atherosclerosis?
Dr. Nissen's study
53
One critical thing
November 5, 2003: "Cholesterol Study Offers Hope
for a Bold Therapy," by Gina Kolata
  • A small study of heart disease patients testing a
    hypothesis so improbable its principal
    investigator says he gave it a one-in-10,000
    chance of succeeding

54
What was that again?
  • a one-in-10,000 chance of succeeding

56
Power = 75%
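Taking the investigator's "one-in-10,000" remark literally as the prior probability of a true effect, and combining it with the 75% power, gives a sense of how little a significant result could establish on its own. The sketch assumes α = 0.05, which the slides do not state for this study, and the function name is illustrative.

```python
def prob_true_effect_given_significant(alpha, power, prior_nonnull):
    """Posterior chance the effect is real after a significant test (Bayes' Theorem)."""
    true_pos = power * prior_nonnull
    false_pos = alpha * (1 - prior_nonnull)
    return true_pos / (true_pos + false_pos)


# Assumed inputs: alpha = 0.05 (not stated on the slides), power = 0.75,
# prior = 1/10,000 from the investigator's own remark.
posterior = prob_true_effect_given_significant(alpha=0.05, power=0.75, prior_nonnull=1 / 10_000)
print(f"chance the effect is real given p <= alpha: {posterior:.4f}")
print(f"crucial Type I error rate: {1 - posterior:.4f}")
```

With a prior that small, almost all significant results would be false positives, whatever the power.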
57
Change in Atheroma Volume
Baseline: 56
After 5 weeks: 46
58
Primary analysis
SuperHDL vs. Placebo: p = 0.29
59
ABC News World News Tonight, November 4, 2003
60
Quotes from ABC News story
  • Peter Jennings
  • enormously promising treatment to stave off
    heart disease.
  • very much appears to be a real breakthrough

61
Quotes from ABC News story
  • Dr. Rader (wrote JAMA editorial)
  • unprecedented. This study shows that plaque
    regression occurred much faster and to a much
    greater extent than we've ever seen

62
Quotes from ABC News story
  • ABC's John McKenzie
  • After just five weekly treatments, researchers
    saw an average of 4% reduction in the amount of
    plaque on artery walls.

63
Quotes from ABC News story
  • Dr. Rader (wrote JAMA editorial)
  • A regression of 4% actually represents several
    years' worth of plaque build-up in the coronary
    arteries.

64
Dr. Nissen (Cleveland Clinic website, 11
November 2006)
  • "This is an extraordinary and unprecedented
    finding," said Cleveland Clinic cardiologist
    Steven E. Nissen, M.D., who directed the
    10-center nationwide study. "This is the first
    convincing demonstration that targeting HDL, good
    cholesterol, can benefit patients with heart
    disease, the leading cause of death in the United
    States."

65
So, what do you think?
66
Why is this statement wrong?
  • As a result of this logic, if we are willing to
    assert a difference when P < .05, we are tacitly
    agreeing to accept the fact that, over the long
    run, we expect 1 assertion of a difference in 20
    to be wrong.
  • From Glantz, SA (2002), Primer of Biostatistics,
    p. 108

67
This is a common misunderstanding of p-values.
  • But the error rate it describes is what we should
    care about.
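A small simulation makes the distinction concrete: the share of wrong assertions among significant results depends on how often the tested hypotheses are truly non-null and on the power, not only on the 0.05 threshold. The prior of 0.30 and the powers 0.33 and 0.90 are the figures used earlier in the deck; the function name, the number of simulated studies, and the seed are arbitrary choices for this sketch.

```python
import random


def simulate_wrong_assertions(prior_nonnull, power, alpha=0.05,
                              n_studies=200_000, seed=1):
    """Simulate many studies; return the fraction of significant results that are wrong."""
    rng = random.Random(seed)
    significant = 0
    wrong = 0
    for _ in range(n_studies):
        nonnull = rng.random() < prior_nonnull
        # A non-null study rejects with probability `power`, a null one with `alpha`.
        reject = rng.random() < (power if nonnull else alpha)
        if reject:
            significant += 1
            if not nonnull:
                wrong += 1
    return wrong / significant


for power in (0.33, 0.90):
    share = simulate_wrong_assertions(prior_nonnull=0.30, power=power)
    print(f"power {power:.2f}: {share:.1%} of asserted differences are wrong")
```

In neither case does the long-run share come out at 1 in 20: α fixes the chance of a significant result when the null is true, while the quantity in the quoted statement is the chance the null is true when the result is significant.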

68
The crucial error rates
  • Crucial False Positive Rate (α*): If a test
    turns out to be significant (p ≤ α), what is the
    chance it is a (Type I) error?
  • Crucial False Negative Rate (β*): If a test turns
    out to be non-significant (p > α), what is the
    chance it is a (Type II) error?

69
  • Not a new idea!

70
Same logic as common statistical methodology in
diagnostic testing
  • sensitivity = power = 1 - β
  • specificity = 1 - α
  • positive predictive value = 1 - α*
  • negative predictive value = 1 - β*

71
Wacholder, et al. (2004)
  • FPRP: False Positive Report Probability

Same logic as Lee and Zelen (2000).
72
How many candidate drugs are studied for every
one that finally gets FDA approval?
  • Thousands!

73
  • What about Cold-Eeze?

74
JAMA, 24 June 1998
77
Needed!
  • Sound way to use the data to compute the
    posterior March of Science position.
  • Stay within frequentist testing. (Sorry, even as
    cool as they are, regular Bayesian methods
    require a much fuller specification of prior
    information than just position of March of
    Science.)
  • Stay tuned.

78
Thanks!
"Don't make the wrong mistake." - Yogi Berra