Sample-Size Analysis: Considering Traditional and Crucial Type I and Type II Error Rates

Transcript and Presenter's Notes

1
Sample-Size Analysis
Considering Traditional and Crucial Type I and Type II Error Rates

Ralph O'Brien, PhD
Center for Clinical Investigation
Case Western Reserve University

Edvard Munch, The Scream, 1893
2
Objectives

Explain Type I and II errors and error rates,
both classical and crucial.
Understand what factors affect these error rates.
Determine the sample size to achieve given
objectives.
Use the software PASS and O'Brien's Excel program
to analyze power and sample size.

3
Eric Topol, et al. (1997)

The calculation and justification of sample size
is at the crux of the design of a trial. Ideally,
clinical trials should have adequate power, 90%,
to detect a clinically relevant difference
between the experimental and control therapies.
Unfortunately, the power of clinical trials is
frequently influenced by budgetary concerns as
well as pure biostatistical principles.

4
Eric Topol, et al. (1997)

Yet an underpowered trial is, by definition,
unlikely to demonstrate a difference between the
interventions assessed and may ultimately be
considered of little or no clinical value. From
an ethical standpoint, an underpowered trial may
put patients needlessly at risk of a new therapy
without being able to come to a clear conclusion.

5
Richard Feynman
Richard Feynman, 1918-1988
Nobel Laureate in Physics
Adventuresome, joking, ever the curious character
6
The March of Science

Scientific knowledge is a body of statements of
varying degrees of uncertainty,
some mostly unsure,
some nearly sure,
none absolutely certain

Richard Feynman, 1918-1988
Nobel Laureate in Physics
Adventuresome, joking, ever the curious character
7
March of Science in clinical research
8
Peter Stacpoole

Does DCA decrease mortality in children
with severe malaria?

9
Sol Capote

Does QCA decrease mortality in children
with severe malaria?

Pablo Picasso, Portrait of Max Jacob, 1907
10
Capote's proposed study?

Design? Allocation ratio?
Subjects?
Primary efficacy outcome measure?
Primary analysis? One- or two-sided test?
Scenario for the infinite data set?

11
Design? Allocation ratio?

Two groups
Randomized, double blind.
Allocation ratio 1:2: for every 1 patient who gets Usual Care, 2 patients get QCA.

12
Subjects?

Set inclusion and exclusion criteria.
Total N? Consider 700, 1400, 2100.

13
Primary efficacy outcome measure?

Death before 10 days (0 vs. 1)
No censoring
Disregard exact survival times

14
Primary analysis? One- or two-sided test?

Compare deaths.
Likelihood ratio test of 2 independent
proportions. (Others would be OK, too.)
Two-sided. (This is arguable.)

15
Scenario for the infinite data set?

Usual Care mortality rate: 0.15
QCA cuts mortality by 25% or 33.33%
⇒ QCA mortality rate: 0.1125 or 0.10.

18

Do classical power analysis with PASS.
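As a rough cross-check on the numbers a tool such as PASS reports for this scenario, the calculation can be sketched in Python. This is only a normal (Wald-type) approximation, not the likelihood ratio test named as the primary analysis, so its values may differ slightly from the software's. The function name and its parameters are illustrative assumptions, not part of PASS or of the original slides.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_proportions(p_control, p_treated, n_total,
                                 treated_per_control=2.0, alpha=0.05):
    """Approximate two-sided power for comparing two independent proportions.

    Assumption: unpooled normal approximation, chosen here for simplicity.
    It is not the likelihood ratio test used in the slides.
    """
    # Split the total sample according to the allocation ratio (1 : 2 here).
    n_control = n_total / (1.0 + treated_per_control)
    n_treated = n_total - n_control
    # Standard error of the difference in proportions under the scenario.
    se = sqrt(p_control * (1 - p_control) / n_control
              + p_treated * (1 - p_treated) / n_treated)
    z_effect = abs(p_control - p_treated) / se
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of landing in either rejection region.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    for p_qca in (0.1125, 0.10):
        for n_total in (700, 1400, 2100):
            power = approx_power_two_proportions(0.15, p_qca, n_total)
            print(f"QCA rate {p_qca:.4f}, N total {n_total}: power ~ {power:.2f}")
```

Setting both mortality rates equal returns alpha itself, which is a quick sanity check on the approximation.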

19
Beyond α and β

What are crucial Type I and Type II error rates?

20
Crucial error rates

Type I: If the test yields traditional
statistical significance (p ≤ α), what is the
chance this will be an incorrect inference?
Type II: If the test does not yield traditional
statistical significance (p > α), what is the
chance this will be an incorrect inference?

How does this relate to statistical power?

22
Which study has the strongest evidence that QCA
is effective?
Note: With UCO mortality of 0.15, QCA relative
risk of 0.67, and α = 0.05:
Ntotal = 450 ⇒ power = 0.33
Ntotal = 2100 ⇒ power = 0.90
23
Suppose you believe that

Only 30% of hypotheses being investigated are
actually non-null.

24
Lee and Zelen (2000)
Same logic as in O'Brien and Castelloe
(2006), but with the two symbols swapped in our
notation (on purpose).
25
Lee and Zelen (2000)
27
Of course, this is an example of Bayes' Theorem,
but you may want to say that quietly due to
Bayesphobia, a disease with still-high prevalence
in the scientific community (including
statistical scientists).
28
Greater power reduces both crucial error rates.
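A minimal sketch of that claim, using Bayes' Theorem with the figures from the earlier slides: alpha = 0.05, a 30% prior chance that a hypothesis is non-null, and power of 0.33 versus 0.90. The function and variable names are assumptions made for illustration; they are not taken from the Excel program.

```python
def crucial_error_rates(alpha, power, prob_non_null):
    """Return (crucial Type I rate, crucial Type II rate).

    crucial Type I  = P(null is true  | test is significant)
    crucial Type II = P(null is false | test is not significant)
    """
    prob_null = 1.0 - prob_non_null
    # Overall chance of a significant result: false alarms plus true detections.
    p_significant = alpha * prob_null + power * prob_non_null
    crucial_type1 = alpha * prob_null / p_significant
    # Overall chance of a non-significant result: true negatives plus misses.
    p_not_significant = (1 - alpha) * prob_null + (1 - power) * prob_non_null
    crucial_type2 = (1 - power) * prob_non_null / p_not_significant
    return crucial_type1, crucial_type2


if __name__ == "__main__":
    for power in (0.33, 0.90):
        type1, type2 = crucial_error_rates(alpha=0.05, power=power,
                                           prob_non_null=0.30)
        print(f"power {power:.2f}: crucial Type I ~ {type1:.3f}, "
              f"crucial Type II ~ {type2:.3f}")
```

Under these inputs both crucial rates come out lower at power 0.90 than at power 0.33, which is the pattern the slide describes.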
29

Study crucial error rates with Excel program.

Fighting the Common Cold

31
Does zinc gluconate glycine reduce the duration
of the common cold?
Dr. Macknin's study
35
Mossad, Macknin, et al. (1996)
36
Mossad, Macknin, et al. (1996)

sample size of 100 patients
detect a difference in number of days with cold
symptoms; means:
8 days in the placebo group
4 days in the zinc group
standard deviation of 6 days
two-sided P-value of 0.05
approximate power of 90%

It looks like they assumed Normality.
37
But wait! Does this scenario make sense?
38
Checking (with SAS)

proc power;
  TwoSampleMeans
    GroupMeans = 4 | 8
    StdDev = 6
    alpha = 0.05
    NPerGroup = 50
    power = .;   /* solve for power */
run;

The POWER Procedure
Two-sample t Test for Mean Difference

Distribution             Normal
Method                   Exact
Alpha                    0.05
Group 1 Mean             4
Group 2 Mean             8
Standard Deviation       6
Sample Size Per Group    50
Number of Sides          2
Null Difference          0

Computed Power
0.910
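The exact power of 0.910 can be roughly reproduced by hand. The sketch below uses a normal approximation to the two-sample t test rather than the exact noncentral-t calculation that PROC POWER performs, so a small discrepancy is expected. The function name is a hypothetical one chosen for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def approx_power_two_sample_means(mean1, mean2, sd, n_per_group, alpha=0.05):
    """Normal-approximation power for a two-sided, two-sample comparison of means.

    Assumption: common standard deviation and equal group sizes.
    """
    # Standard error of the difference between two group means.
    se = sd * sqrt(2.0 / n_per_group)
    z_effect = abs(mean1 - mean2) / se
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


if __name__ == "__main__":
    power = approx_power_two_sample_means(4, 8, sd=6, n_per_group=50)
    print(f"approximate power: {power:.3f}")
```

The approximation runs slightly optimistic because it ignores the extra uncertainty from estimating the standard deviation.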

40
Better way: assume logNormal

proc power;
  TwoSampleMeans
    test = ratio
    dist = logNormal
    MeanRatio = 0.5   /* 4/8 */
    /* Coef_Var = SD/mean: 6/4, 6/6, 6/8 */
    CV = 1.5 1.0 0.75
    alpha = 0.05
    NPerGroup = 50
    power = .;
run;

41
Powers for logNormal

The POWER Procedure
Two-sample t Test for Mean Ratio

Distribution                 Lognormal
Method                       Exact
Alpha                        0.05
Geometric Mean Ratio         0.5
Sample Size Per Group        50
Number of Sides              2
Null Geometric Mean Ratio    1

Computed Power
Index    CV      Power
1        1.50    0.885
2        1.00    0.985
3        0.75    >.999
42
As-analyzed way: log-rank test

proc power;
  TwoSampleSurvival
    test = logrank
    alpha = 0.05
    AccrualTime = 30
    TotalTime = 90
    GroupMedSurvTimes = 4 | 8
    NPerGroup = 50
    power = .;
run;

The POWER Procedure
Log-Rank Test for Two Survival Curves

Method                          Lakatos normal approximation
Form of Survival Curve 1        Exponential
Form of Survival Curve 2        Exponential
Accrual Time                    30
Total Time                      90
Alpha                           0.05
Group 1 Median Survival Time    4
Group 2 Median Survival Time    8
Sample Size Per Group           50

Computed Power
0.917
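For a rough sense of this result, the sketch below swaps in the simpler Schoenfeld approximation for the log-rank test. It is not the Lakatos method reported in the output, so the two figures are expected to differ somewhat. It assumes exponential survival, uniform accrual, and equal allocation; all function names are hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def prob_event_exponential(median, accrual_time, total_time):
    """Chance a subject has the event before the study ends.

    Assumptions: exponential event times, entry uniform over the accrual period.
    """
    hazard = log(2) / median
    min_follow_up = total_time - accrual_time
    # Survival probability averaged over follow-up lengths from
    # min_follow_up to total_time.
    mean_survival = ((exp(-hazard * min_follow_up) - exp(-hazard * total_time))
                     / (hazard * accrual_time))
    return 1.0 - mean_survival


def schoenfeld_power(median1, median2, n_per_group,
                     accrual_time, total_time, alpha=0.05):
    """Return (approximate power, expected total events) for a two-sided log-rank test."""
    expected_events = n_per_group * (
        prob_event_exponential(median1, accrual_time, total_time)
        + prob_event_exponential(median2, accrual_time, total_time))
    # Under exponential survival the hazard ratio is the inverse ratio of medians.
    log_hazard_ratio = log(median2 / median1)
    z_effect = sqrt(expected_events) / 2.0 * abs(log_hazard_ratio)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return NormalDist().cdf(z_effect - z_crit), expected_events


if __name__ == "__main__":
    power, events = schoenfeld_power(4, 8, 50, accrual_time=30, total_time=90)
    print(f"expected events ~ {events:.1f}, approximate power ~ {power:.2f}")
```

With medians of 4 and 8 days and at least 60 days of follow-up, nearly every subject is expected to have the event, so power here is driven almost entirely by the sample size and the hazard ratio.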

44
Results
P < 0.001, log-rank test
Placebo (n = 50)
Cold-Eeze (n = 49)
46
Dr. Macknin

I got goosebumps when we broke the code. I
didn't think it was going to work.
... here was something that actually looked like
it was helping the common cold.
... nothing had really worked like this before.

47
What was that again?

I didn't think it was going to work.

48
So, what do you think?
49

Quickly Reducing Atherosclerosis

51
[Figure: Atheroma Volume. Atheroma Area 8.1 mm², EEM Area 14.37 mm², 56%]
52
Does SuperHDL reduce atheroma volume in
patients with atherosclerosis?
Dr. Nissen's study
53
One critical thing
November 5, 2003
Cholesterol Study Offers Hope for a Bold Therapy
By GINA KOLATA

A small study of heart disease patients testing a
hypothesis so improbable its principal
investigator says he gave it a one-in-10,000
chance of succeeding

54
What was that again?

a one-in-10,000 chance of succeeding

56
Power = 75%
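A small sketch of what the investigator's prior and this power imply together, via Bayes' Theorem. It takes the one-in-10,000 figure at face value as the prior chance that the hypothesis is non-null, and it assumes alpha = 0.05, which the slide does not state. The function name is hypothetical.

```python
def chance_significant_result_is_false(alpha, power, prior_non_null):
    """P(null is true | p <= alpha): the crucial Type I error rate."""
    false_alarms = alpha * (1.0 - prior_non_null)   # null true, test significant
    true_detections = power * prior_non_null        # null false, test significant
    return false_alarms / (false_alarms + true_detections)


if __name__ == "__main__":
    # alpha = 0.05 is an assumption; 0.75 and 1/10,000 come from the slides.
    for prior in (1 / 10_000, 0.30):
        rate = chance_significant_result_is_false(0.05, 0.75, prior)
        print(f"prior {prior:.4f}: chance a significant result is false ~ {rate:.4f}")
```

The 0.30 prior from the malaria example is included only for contrast: the more improbable the hypothesis going in, the more likely a significant result is a false positive.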
57
[Figure: Change in Atheroma Volume. Baseline: 56; after 5 weeks: 46]
58
Primary analysis
SuperHDL vs. Placebo: p = 0.29
59
ABC News, World News Tonight, November 4, 2003
60
Quotes from ABC News story

Peter Jennings
enormously promising treatment to stave off
heart disease.
very much appears to be a real breakthrough

61
Quotes from ABC News story

Dr. Rader (wrote JAMA editorial)
unprecedented. This study shows that plaque
regression occurred much faster and to a much
greater extent than we've ever seen

62
Quotes from ABC News story

ABC's John McKenzie
After just five weekly treatments, researchers
saw an average of 4% reduction in the amount of
plaque on artery walls.

63
Quotes from ABC News story

Dr. Rader (wrote JAMA editorial)
A regression of 4% actually represents several
years' worth of plaque build-up in the coronary
arteries.

64
Dr. Nissen (Cleveland Clinic website, 11
November 2006)

This is an extraordinary and unprecedented
finding, said Cleveland Clinic cardiologist
Steven E. Nissen, M.D., who directed the
10-center nationwide study. This is the first
convincing demonstration that targeting HDL, good
cholesterol, can benefit patients with heart
disease, the leading cause of death in the United
States.

65
So, what do you think?
66
Why is this statement wrong?

As a result of this logic, if we are willing to
assert a difference when P < .05, we are tacitly
agreeing to accept the fact that, over the long
run, we expect 1 assertion of a difference in 20
to be wrong.
From Glantz, SA (2002), Primer of Biostatistics,
p. 108

67
This is a common misunderstanding of p-values.

But the error rate it describes is what we should
care about.

68
The crucial error rates

Crucial False Positive Rate: If a test
turns out to be significant (p ≤ α), what is the
chance it is a (Type I) error?
Crucial False Negative Rate: If a test turns
out to be non-significant (p > α), what is the
chance it is a (Type II) error?

Not a new idea!

70
Same logic as common statistical methodology in
diagnostic testing

sensitivity: power = 1 - β
specificity: 1 - α
positive predictive value: 1 - crucial false positive rate
negative predictive value: 1 - crucial false negative rate
71
Wacholder, et al. (2004)

FPRP: False Positive Report Probability

Same logic as Lee and Zelen (2000).
72
How many candidate drugs are studied for every
one that finally gets FDA approval?

Thousands!

What about Cold-Eeze?

74
JAMA, 24 June 1998
77
Needed!

Sound way to use the data to compute the
posterior March of Science position.
Stay within frequentist testing. (Sorry, even as
cool as they are, regular Bayesian methods
require a much fuller specification of prior
information than just position of March of
Science.)
Stay tuned.

78
Thanks!
Don't make the wrong mistake. - Yogi Berra
