Biostatistics in Practice - PowerPoint PPT Presentation

About This Presentation
Title:

Biostatistics in Practice

Description:

Title: PowerPoint Presentation Author: Biostatistics Last modified by: Peter Christenson Created Date: 10/1/2004 9:05:25 PM Document presentation format – PowerPoint PPT presentation

Slides: 32
Provided by: Biostatistics

Transcript and Presenter's Notes

Title: Biostatistics in Practice


1
Biostatistics in Practice
Session 4 Study Size for Precision or Power
Peter D. Christenson, Biostatistician
http://research.LABioMed.org/Biostat
2
Session 4 Issue
How many subjects?
3
Session 4 Preparation
  • We have been using a recent study on
    hyperactivity in children under diets with
    various amounts of food additives for the
    concepts in this course. The questions below
    based on this paper are intended to prepare you
    for session 4, which is on determining the size
    of a study.
  1. How many children were deemed necessary to
    complete the entire study? Use the second column
    on the 4th page of the paper.

4
Session 4 Preparation 1
5
Session 4 Preparation 2
2. The authors allowed for some children to
start but not complete the study. What
percentage of "dropouts" did they build into
their calculations?
The statistical requirements are for 80
evaluable subjects. They decided on a study
size of 120, so they were allowing up to 40/120
= 33% of subjects to not complete.
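The dropout arithmetic on this slide can be sketched in a few lines. The function names are illustrative, not taken from the paper, and the second function assumes dropouts are simply a fixed fraction of those enrolled.

```python
import math

def dropout_allowance(n_enrolled, n_evaluable):
    """Fraction of enrolled subjects who may drop out while still leaving n_evaluable."""
    return (n_enrolled - n_evaluable) / n_enrolled

def enrollment_needed(n_evaluable, dropout_rate):
    """Enrollment needed so that n_evaluable remain after the assumed dropout rate."""
    return math.ceil(n_evaluable / (1 - dropout_rate))

# Values from the slide: 80 evaluable subjects required, 120 enrolled.
print(f"{dropout_allowance(120, 80):.0%}")   # share of enrolled subjects allowed to drop out
# Hypothetical alternative: a 25% dropout rate with the same 80 evaluable subjects.
print(enrollment_needed(80, 0.25))
```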
6
Session 4 Preparation 3
3. The authors will perform a test similar to the
t-test we discussed last week, to conclude
whether there is evidence that hyperactivity
differs between Mix A and placebo. There are two
mistakes that they may make in this decision.
What are they?
  1. Conclude Mix A ≠ Placebo, but Mix A = Placebo
  2. Conclude Mix A = Placebo, but Mix A ≠ Placebo

7
Session 4 Preparation 4 and 5
4. How large a difference between Mix A and
placebo do they want to detect?
5. Does the value of 0.32 in the study size
description (second column on the 4th page) refer
to a difference? They seem to imply it is a SD.
Based on what we have said about tests comparing
"signal" to "noise", do you think both a
difference and SD are relevant for determining
the study size?
8
Session 4 Preparation 4 and 5
9
Session 4 Preparation 4 and 5
They want to detect a difference Δ of 0.32 in
GHA. Smallest clinically relevant Δ?
Both the Δ and SD need to be accounted for.
Effect size = Δ / SD = number of SDs. Remember,
reference range ≈ 4 to 6 SDs.
For this study (unusual), GHA is scaled to have
a SD of 1, so Δ = effect size = 0.32.
10
Session 4 Goals
  • Review estimating and testing.
  • Δ, SD and N in estimating and testing.
  • False positive and false negative conclusions
    from tests.
  • What is needed to determine study size.
  • Software for study size.
11
Review Estimation
  • Typically:
  • Have sample of N representing all.
  • Find mean and SD from the N units.
  • Expect new unit to be within mean ± 2SD.
  • Confident (95%) that mean of all is in
    mean ± 2SD/√N.
  • May have this info for one or multiple groups.
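The two ranges on this slide can be computed directly. This is a minimal sketch using Python's standard library; the data values and the function name `summarize` are hypothetical, and the factor 2 is the slide's rounded stand-in for 1.96.

```python
import math
import statistics

def summarize(sample):
    """Return mean, SD, the mean ± 2SD range, and the approximate 95% CI for the mean."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample SD (n - 1 denominator)
    half_width = 2 * sd / math.sqrt(n)   # margin of error for the mean
    return {
        "mean": mean,
        "sd": sd,
        "unit_range": (mean - 2 * sd, mean + 2 * sd),       # where a new unit is expected
        "mean_ci": (mean - half_width, mean + half_width),  # where the mean of all is expected
    }

values = [4, 6, 8, 10, 12]  # hypothetical measurements
print(summarize(values))
```

The interval for the mean is narrower than the range for a new unit by a factor of √N, which is why larger studies estimate means more precisely.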

12
Study Size to Achieve Precision
Precision refers to how well a measure is
estimated. Margin of error = the ± value
(half-width) of the 95% confidence
interval. Lower margin of error means greater
precision.
To achieve a specified margin of error, say d,
solve the CI formula for N.
For a mean, d = 2SD/√N, so N = (2SD/d)².
For a proportion p, d = 2[p(1−p)/N]^(1/2) ≤ 1/√N.
Most polls use N ≈ 1000, so margin of error on
% ≈ 3%.
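The formulas above can be checked numerically. This is a rough sketch; the function names and the SD and d values in the first call are hypothetical, while the poll example uses the slide's N of 1000.

```python
import math

def n_for_mean(sd, d):
    """Smallest N for which the 95% margin of error 2*SD/sqrt(N) is at most d."""
    return math.ceil((2 * sd / d) ** 2)

def margin_for_proportion(p, n):
    """Approximate 95% margin of error for a proportion p estimated from n units."""
    return 2 * math.sqrt(p * (1 - p) / n)

# Hypothetical: SD = 10, desired margin of error d = 2.
print(n_for_mean(sd=10, d=2))

# Poll with N = 1000; p = 0.5 is the worst case, where the margin equals 1/sqrt(N).
print(round(margin_for_proportion(0.5, 1000), 3))
```

Because p(1−p) is largest at p = 0.5, the bound 1/√N holds for any proportion, which is why a poll can quote one margin of error for all of its percentages.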
13
Review Statistical Tests
  • Calculate a standardized quantity for the
    particular test, a "test statistic".
  • Often t = (Mean − Expected) / SE(Mean).
  • If 1 group, Mean may be a change score.
  • If 2 groups, Mean may be the difference between
    means for two groups.
  • Expected = 0 if no effect.
  • Looking for evidence to contradict "no effect".

14
Review Statistical Tests
  • Compare the test statistic to the range of values
    it should take if expectations are correct.
  • Often: the range follows an approximately normal
    bell curve.
  • Declare effect if test statistic is too
    extreme, relative to this range.
  • Often: |test statistic| > 2 → declare effect.
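As an illustration of the rule of thumb above, here is a minimal one-group sketch. The change scores are made up, and the cutoff of 2 is the slide's approximation rather than an exact t critical value.

```python
import math
import statistics

def t_statistic(sample, expected=0.0):
    """t = (mean - expected) / SE(mean), with SE(mean) = SD / sqrt(N)."""
    n = len(sample)
    se = statistics.stdev(sample) / math.sqrt(n)
    return (statistics.mean(sample) - expected) / se

# Hypothetical change scores for one group; expected = 0 if there is no effect.
changes = [1.2, 0.4, 2.1, -0.3, 1.6, 0.9, 1.1, 0.5]
t = t_statistic(changes)
print("declare effect" if abs(t) > 2 else "do not declare effect")
```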

15
t-Test
Declare effect if test statistic is too extreme.
How extreme? Convention: "too extreme" means
< 5% chance of wrongly declaring an effect.
t = (mean − expected) / (SD/√N)
[Figure: bell curve of the t values to expect if there is no effect. Central region, 95% chance: declare No Effect. Each tail, 2.5%: declare Effect.]
16
t-Test
Declare effect if test statistic is too extreme.
Convention: "too extreme" means < 5% chance of
wrongly declaring an effect. But, what are the
chances of wrongly declaring no effect?
[Figure: bell curve of the t values to expect if there is no effect. Central region, 95% chance: declare No Effect. Each tail, 2.5%: declare Effect.]
17
t-Test
Declare effect if test statistic is too extreme.
But, what are the chances of wrongly declaring no
effect? To answer, we need a similar curve for
the range of values expected when there is an
effect.
[Figure: bell curve of the t values to expect if there is no effect. Central region, 95% chance: declare No Effect. Each tail, 2.5%: declare Effect.]
18
Two Possible Errors from t-test
Consider just one possible real effect, the value
3.
[Figure: "No Effect" and "Real Effect" curves over the axis Δ = effect (difference between group means). The axis is just Δ, not t = Δ/SE(Δ). Legend (red, blue, green): no real effect (0); real effect = 3; effect in study = 1.13. Region beyond the cutoff: conclude effect.]
\\\ Probability: conclude effect, but no real
effect (5%).
/// Probability: conclude no effect, but real
effect (41%).
19
Graphical Representation of t-test
Suppose we need stronger proof, i.e., shift
cutoff to right.
[Figure: "No Effect" and "Real Effect" curves over the axis Δ = effect (difference between group means). The axis is just Δ, not t = Δ/SE(Δ). Legend (red, blue, green): no real effect (0); real effect = 3; effect in study = 1.13. Region beyond the cutoff: conclude effect.]
Then, chance of false positive is reduced to 1%,
but false negative is increased to 60%.
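The trade-off on this slide can be sketched with two normal curves. This is a simplified one-sided illustration. The real effect of 3 comes from the slide, but the standard error of 1.5 is a hypothetical value chosen for the sketch, so the printed rates will not match the slide's 41% and 60% exactly.

```python
from statistics import NormalDist

def error_rates(true_effect, se, cutoff):
    """Error rates when 'effect' is concluded if the observed difference exceeds cutoff."""
    false_positive = 1 - NormalDist(0, se).cdf(cutoff)        # no real effect, but conclude effect
    false_negative = NormalDist(true_effect, se).cdf(cutoff)  # real effect, but conclude none
    return false_positive, false_negative

SE = 1.5  # hypothetical standard error of the difference (not given in the slides)
for alpha in (0.05, 0.01):
    cutoff = NormalDist(0, SE).inv_cdf(1 - alpha)  # cutoff giving this false positive rate
    fp, fn = error_rates(3, SE, cutoff)
    print(f"cutoff={cutoff:.2f}  false positive={fp:.1%}  false negative={fn:.1%}")
```

Moving the cutoff to the right lowers the false positive rate and raises the false negative rate, which is the pattern the slide describes.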
20
Power of a Study
Statistical power is the sensitivity of a study
to detect real effects, if they exist. It is
100% − 41% = 59% two slides back.
21
Two Possible Errors in a Diagnostic Test
                          Truth
Diagnosis      No Disease       Disease
No Disease     Correct          Error
Disease        Error            Correct
               Specificity      Sensitivity
Specificity: need high in follow-up test.
Sensitivity: want high for a screening test.
Specificity ↓ as sensitivity ↑.
22
Analogy with Diagnostic Testing
                          Truth
Study Claims   No Effect          Effect
No Effect      Correct            Error (Type II)
Effect         Error (Type I)     Correct
               Specificity        Sensitivity
Specificity: set α = 0.05, so specificity = 95%.
Sensitivity is power: maximize. Choose N for 80%.
Both values are the typical choices.
23
Summary Factors Related to Study Size
  • Five factors are inter-related. Fixing four of
    these specifies the fifth:
  • Study size, N.
  • Power (often 80% is desirable).
  • p-value cutoff (level of significance, e.g.,
    0.05).
  • Magnitude of the effect to be detected (Δ).
  • Heterogeneity among subjects (SD).

The next slide shows how these factors (except
SD) are typically presented in a study protocol.
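To illustrate how fixing four factors determines the fifth, the sketch below fixes N, the p-value cutoff, Δ and SD, and solves for power in a one-sample or paired design. It uses a normal approximation, which is an assumption of this sketch and not necessarily what study size software does. The inputs are the hyperactivity study values quoted earlier (Δ = 0.32, SD = 1, 80 evaluable subjects).

```python
import math
from statistics import NormalDist

def power_one_sample(n, delta, sd, alpha=0.05):
    """Approximate power of a two-sided one-sample (or paired) test, normal approximation.

    Ignores the negligible chance of rejecting in the wrong direction.
    """
    z = NormalDist()
    se = sd / math.sqrt(n)               # standard error of the mean
    z_crit = z.inv_cdf(1 - alpha / 2)    # about 1.96 for alpha = 0.05
    return z.cdf(delta / se - z_crit)

print(f"{power_one_sample(n=80, delta=0.32, sd=1.0):.0%}")
```

Changing any one input moves the power: a larger N or Δ raises it, a larger SD or a stricter cutoff lowers it.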
24
Quote from Local Protocol Example
Thus, with a total of the planned 80 subjects, we
are 80% sure to detect (p<0.05) group differences
if treatments actually differ by at least 5.2 mm
Hg in MAP change, or by a mean 0.34 change in
number of vasopressors.
25
Comments on the Previous Table
  • Typically power = 80% and almost always p<0.05.
  • SD was not mentioned. There may be several
    estimates from other studies (different
    populations, intervention characteristics such as
    dosage, time, etc). Here, a pilot study exactly
    like the trial was performed by the same
    investigators.
  • Detectable difference refers to the unknown true
    difference for all, not the difference that
    will be seen eventually in the N study subjects.
  • N ↑ as detectable difference ↓.
  • So, the major consideration is usually a
    tradeoff between N and the detectable difference.

26
Free Study Size Software
www.stat.uiowa.edu/~rlenth/Power
27
Local Protocol Example Calculations
Pilot data: SD = 8.16 for ΔMAP in 36 subjects. For
p-value < 0.05, power = 80%, N = 40/group, the
detectable Δ of 5.2 in the previous table is
found as
[Calculation shown on the slide; not captured in this transcript.]
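The slide's own calculation is not in the transcript. As a rough check, the sketch below solves for the detectable difference in a two-group comparison under a normal approximation. That approximation and the function name are assumptions of this sketch. It gives about 5.1, slightly below the 5.2 quoted above, which presumably came from t-based software.

```python
import math
from statistics import NormalDist

def detectable_difference(n_per_group, sd, alpha=0.05, power=0.80):
    """Smallest true difference between two group means detectable with the given power."""
    z = NormalDist()
    z_sum = z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)   # about 1.96 + 0.84
    se_diff = sd * math.sqrt(2 / n_per_group)             # SE of a difference of two means
    return z_sum * se_diff

# Slide values: pilot SD = 8.16, N = 40 per group, p < 0.05, power = 80%.
print(round(detectable_difference(n_per_group=40, sd=8.16), 1))
```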
28
Hyperactivity Study Size
Study is 1-sample or paired (for each age
group). SD = 1, Δ = 0.32. Use p-value < 0.05. Want
power = 80%. Solve for N in software to get N = 79.
29
Study Size for Some Other Study Types
  • Phase I: Dose escalation. Safety, not efficacy.
    No power. Use N=3 at a low dose; if safe, N=3 at
    a higher dose, etc.
  • Phase II: Small, primarily safety; look for
    enough evidence of efficacy to go on to Phase
    III. Often staged, e.g., if 3/10 respond, test 10
    more, etc.
  • Mortality studies: Patterns of deaths over time
    can be used in sample size calculations. Software
    not in the online package.

30
Approximate Formulas for Study Size
  • Two-sample t-test:
  • Total N = 4 × 7.85 × (SD/Δ)²
  • MAP Example: 4 × 7.85 × (8.16/5.2)² = 77 ≈ 80
  • Paired t-test:
  • N = 7.85 × (SD/Δ)²
  • Hyperactivity Example:
  • 7.85 × (1/0.32)² = 77 ≈ 80
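The constant 7.85 in these formulas is approximately (1.96 + 0.84)², the squared sum of the normal quantiles for a two-sided 0.05 cutoff and 80% power. A short sketch reproducing both examples follows. The function names are illustrative, and the results are normal-approximation values, so they fall slightly below what t-based software reports.

```python
from statistics import NormalDist

def z_factor(alpha=0.05, power=0.80):
    """(z for 1 - alpha/2 plus z for power), squared; about 7.85 for the defaults."""
    z = NormalDist()
    return (z.inv_cdf(1 - alpha / 2) + z.inv_cdf(power)) ** 2

def total_n_two_sample(sd, delta, alpha=0.05, power=0.80):
    """Approximate total N (both groups combined) for a two-sample t-test."""
    return 4 * z_factor(alpha, power) * (sd / delta) ** 2

def n_paired(sd, delta, alpha=0.05, power=0.80):
    """Approximate N for a one-sample or paired t-test."""
    return z_factor(alpha, power) * (sd / delta) ** 2

print(round(z_factor(), 2))                          # the constant in the formulas
print(round(total_n_two_sample(sd=8.16, delta=5.2))) # MAP example
print(round(n_paired(sd=1.0, delta=0.32)))           # hyperactivity example
```

Halving the detectable difference Δ quadruples the required N, which is why the tradeoff between N and Δ dominates study planning.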

31
Summary Study Size and Power
  • Power analysis assures that effects of a
    specified magnitude can be detected.
  • Five factors including power are inter-related.
    Fixing four of these specifies the fifth.
  • For comparing means, need pilot or data from
    other studies to estimate SD for the outcome
    measure. Comparing %s does not require SD.
  • Helps support the believability of studies if the
    conclusions turn out to be negative.