Slides: 56

Provided by: Kip65

Learn more at: https://sites.pitt.edu

1
Statistics for clinicians

Biostatistics course by Kevin E. Kip, Ph.D., FAHA
Professor and Executive Director, Research Center
University of South Florida, College of Nursing
Professor, College of Public Health
Department of Epidemiology and Biostatistics
Associate Member, Byrd Alzheimer's Institute
Morsani College of Medicine
Tampa, FL, USA

2
SECTION 3.1 Module Overview and Introduction
Confidence intervals, estimation of parameters, and hypothesis testing.
3

Module 3 Learning Objectives
Describe the concepts of parameter estimation and confidence intervals
Apply use of the z and t distribution for calculation of confidence intervals based on sample size
Select appropriate z and t values based on the width of a desired confidence interval
Calculate and interpret confidence intervals for means, proportions, and relative risk for one and two sample designs including matched design
Use SPSS to calculate confidence intervals
Distinguish the theoretical relationship between the risk ratio and odds ratio

Module 3 Learning Objectives
List the concept, guidelines, and primary steps involved in hypothesis testing
Differentiate between the null and alternative hypothesis
Understand and interpret parameters used in hypothesis testing (level of significance, p-value)
Differentiate type I and type II error and factors that impact statistical power
Calculate and interpret sample hypotheses:
a) One-sample - continuous outcome
b) One-sample - dichotomous outcome
c) One-sample - categorical/ordinal outcome
d) Matched design - continuous outcome

5
Assigned Reading
Textbook: Essentials of Biostatistics in Public Health, Chapters 6 and 7
6
Key terms
Estimation: Process of determining a likely value for a population parameter (e.g. mean or proportion) based on a sample.
Point Estimate: Single-valued estimate of a population parameter, such as a mean or a proportion.
Confidence Interval (CI): Range of likely values for a population parameter with a level of confidence attached (e.g. 95% confidence that the interval contains the unknown parameter).
General form for a CI is: point estimate ± margin of error
Common confidence levels are 90%, 95%, and 99% but, theoretically, any level between 0% and 100% can be selected.
7
SECTION 3.2 Use of the z and t distributions for
calculation of confidence intervals
8
For the standard normal distribution, the following is true:
P(-1.96 < z < 1.96) = 0.95
i.e. there is a 95% probability that a standard normal variable, denoted z, will fall between -1.96 and 1.96.
Using the Central Limit Theorem, and some algebra, the 95% confidence interval (CI) for the population mean is:
X̄ ± 1.96 x (σ / sqrt(n))
General form for a CI can be written as:
point estimate ± z x SE(point estimate)
where z is the value from the standard normal distribution reflecting the desired confidence level, and SE = standard error of the point estimate.
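The z values that go with common confidence levels can be checked numerically. The following is a minimal sketch using only the Python standard library; the function name `z_critical` is an illustrative choice, not something from the course materials.

```python
from statistics import NormalDist


def z_critical(confidence: float) -> float:
    """Two-sided z value for a confidence level given as a fraction (e.g. 0.95)."""
    alpha = 1.0 - confidence
    # Half of alpha sits in each tail, so look up the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: z = {z_critical(level):.3f}")
```

A higher confidence level gives a larger z value and therefore a wider interval for the same data.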
9
For the formula for the mean (or any other parameter), we often do not know the true value of the population standard deviation (σ).

For large sample sizes (n > 30), σ can be estimated from the sample standard deviation (s) based on the Central Limit Theorem.
For small sample sizes (n < 30), the Central Limit Theorem does not apply, and instead, the t distribution is used (Table 2 of Appendix):
t values depend on n
small samples have larger t values (less precision)
values are indexed by degrees of freedom (df = n-1)

10
Listing of Selected t Values for Confidence Intervals
Example: For a confidence interval of a mean with n < 30, use t

      Confidence Level
df    80%     90%     95%     98%     99%
4     1.533   2.132   2.776   3.747   4.604
5     1.476   2.015   2.571   3.365   4.032
6     1.440   1.943   2.447   3.143   3.707
7     1.415   1.895   2.365   2.998   3.499
8     1.397   1.860   2.306   2.896   3.355
9     1.383   1.833   2.262   2.821   3.250
11

SECTION 3.3
Calculation and interpretation of confidence
intervals
One Sample
Continuous outcome
Dichotomous outcome

12
CI for One Sample: Continuous Outcome
Parameter: Mean Body Mass Index (BMI)
Sample N = 180 (n > 30, so use large sample z value)
Sample Mean = 28.2
Sample SD = 5.4
Confidence Level = 95%
Z value = 1.96
95% Confidence Interval for µ:
28.2 ± 1.96 x (5.4 / sqrt(180)) = 28.2 ± 0.79 = (27.4, 29.0)
13
28.2 ± 1.96 x (5.4 / sqrt(180)) = 28.2 ± 0.79 = (27.4, 29.0)
95% C.I.: sample mean = 28.2, lower limit = 27.4, upper limit = 29.0
From the sample, we estimate the mean BMI as 28.2, and are 95% confident that the true population mean lies between 27.4 and 29.0.
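The BMI calculation can be reproduced in a few lines. This is a sketch under the large-sample assumption stated on the slide (n > 30, so a z value is used); the function name `mean_ci_z` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def mean_ci_z(mean: float, sd: float, n: int, confidence: float = 0.95):
    """Large-sample CI for a mean: mean ± z * sd / sqrt(n)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sd / sqrt(n)
    return mean - margin, mean + margin


# BMI example: sample mean 28.2, sample SD 5.4, n = 180.
lower, upper = mean_ci_z(28.2, 5.4, 180)
print(f"95% CI for mean BMI: ({lower:.1f}, {upper:.1f})")  # prints (27.4, 29.0)
```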
14
CI for One Sample: Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure
Sample N = 503
Sample Mean = 80.69
Sample SD = 10.176
Confidence Level = 95%
Z value = ___ or t value = ____
95% Confidence Interval for µ:
15
CI for One Sample: Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure
Sample N = 503 (large sample, n > 30)
Sample Mean = 80.69
Sample SD = 10.176
Confidence Level = 95%
Z value = 1.96
95% Confidence Interval for µ:
80.69 ± 1.96 x (10.176 / sqrt(503)) = 80.69 ± 0.889 = (79.8, 81.6)
16
CI for One Sample: Continuous Outcome (Practice)
Parameter: Mean diastolic blood pressure (Sample N = 503, Sample Mean = 80.69, Sample SD = 10.176)
80.69 ± 1.96 x (10.176 / sqrt(503)) = 80.69 ± 0.889 = (79.8, 81.6)
SPSS: Analyze > Compare Means > One-Sample T Test > Options > 95% confidence interval
17
CI for One Sample: Continuous Outcome (Practice)
Parameter: Mean resting pulse (beats per minute)
Sample N = 14
Sample Mean = 63.3
Sample SD = 9.5
Confidence Level = 95%
Z value = ___ or t value = ____
95% Confidence Interval for µ:
18
CI for One Sample: Continuous Outcome (Practice)
Parameter: Mean resting pulse (beats per minute)
Sample N = 14 (small sample, n < 30)
Sample Mean = 63.3
Sample SD = 9.5
Confidence Level = 95%
t value = 2.16 (df = n-1 = 13)
95% Confidence Interval for µ:
63.3 ± 2.16 x (9.5 / sqrt(14)) = 63.3 ± 5.484 = (57.8, 68.8)
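For a small sample the only change is the critical value. The sketch below takes the critical value as an argument (here t = 2.160 for df = 13, read from a t table) because the Python standard library has no t distribution; a statistics package such as SciPy could supply it instead. The function name `mean_ci` is an assumption made for illustration.

```python
from math import sqrt


def mean_ci(mean: float, sd: float, n: int, critical_value: float):
    """CI for a mean: mean ± critical_value * sd / sqrt(n)."""
    margin = critical_value * sd / sqrt(n)
    return mean - margin, mean + margin


# Resting pulse example: n = 14, so df = 13 and t = 2.160 at 95% confidence.
t_lower, t_upper = mean_ci(63.3, 9.5, 14, critical_value=2.160)
# Same data with z = 1.96, shown only to illustrate how much narrower it would be.
z_lower, z_upper = mean_ci(63.3, 9.5, 14, critical_value=1.96)

print(f"t interval: ({t_lower:.1f}, {t_upper:.1f})")
print(f"z interval: ({z_lower:.1f}, {z_upper:.1f})")
```

Using z with only 14 observations would understate the uncertainty, which is why the t value is used for small samples.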
19
CI for One Sample: Dichotomous Outcome
Parameter: Proportion of population treated for hypertension
Sample N = 3,532 (large sample, so use z value)
Sample Proportion = 0.345 (i.e. 1,219 / 3,532)
Confidence Level = 95%
Z value = 1.96
95% Confidence Interval for p:
p̂ ± z x sqrt(p̂(1 - p̂) / n)
0.345 ± 1.96 x sqrt(0.345(1 - 0.345) / 3532) = 0.345 ± 0.016 = (0.329, 0.361)
From the sample, we estimate the proportion of persons treated for hypertension to be 0.345, and we are 95% confident that the true proportion lies between 0.329 and 0.361.
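The hypertension example can be checked with a short sketch of the large-sample interval for a proportion, p̂ ± z x sqrt(p̂(1 - p̂) / n). The function name `proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(x: int, n: int, confidence: float = 0.95):
    """Large-sample CI for a proportion from x events among n subjects."""
    p_hat = x / n
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat, p_hat - margin, p_hat + margin


# Hypertension example: 1,219 treated out of 3,532 sampled.
p_hat, lower, upper = proportion_ci(1219, 3532)
print(f"p = {p_hat:.3f}, 95% CI = ({lower:.3f}, {upper:.3f})")
```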
20
CI for One Sample: Dichotomous Outcome (Practice)
Parameter: Proportion of population with diabetes
Sample N = 501
Sample Proportion (91 / 501) = _______
Confidence Level = 95%
Z value = _______
95% Confidence Interval for p:
21
CI for One Sample: Dichotomous Outcome (Practice)
Parameter: Proportion of population with diabetes
Sample N = 501 (large sample, so use z value)
Sample Proportion (91 / 501) = 0.1816
Confidence Level = 95%
Z value = 1.96
95% Confidence Interval for p:
0.1816 ± 1.96 x sqrt(0.1816(1 - 0.1816) / 501) = 0.1816 ± 0.0338 = (0.148, 0.215)
From the sample, we estimate the proportion of persons with diabetes to be 0.1816, and we are 95% confident that the true proportion lies between 0.148 and 0.215.
22

SECTION 3.4
Calculation and interpretation of confidence
intervals
Two Samples Matched
Continuous outcome

CI for Two Samples: Matched Continuous Outcome
Often used for intervention studies with a pre- and post-measurement design (e.g. before and after treatment)
Goal is to compare the mean score before and after the intervention
Because the sample is matched (same persons completing pre- and post-measurements), cannot use aggregate means; a difference score is computed for each subject (see below)

Subject ID   Pre   Post   Difference
1            158   132    -26
2            148   138    -10
3            152   158      6
4            155   131    -24

Parameter of interest is the mean difference, denoted µd
The SD of the difference scores is denoted sd

24
CI for Two Samples: Matched Continuous Outcome
Parameter: Mean difference in depressive symptom scores after taking a new drug
X̄d = -12.7
Sample N = 100 (number of persons, not measurements)
Sample SD = SD of difference scores, sd = 8.9
Confidence Level = 95%
Z value = 1.96
-12.7 ± 1.96 x (8.9 / sqrt(100)) = -12.7 ± 1.74 = (-14.4, -11.0)
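In a matched design the analysis is carried out on the per-subject difference scores. The sketch below first derives the differences from the four-subject pre/post table, then reproduces the depressive-symptom interval from its summary statistics. The function name `paired_ci_z` is an illustrative assumption, and a z value is used because n = 100.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

# Four-subject illustration: difference = post - pre for each person.
pre = [158, 148, 152, 155]
post = [132, 138, 158, 131]
diffs = [after - before for before, after in zip(pre, post)]
print("differences:", diffs)
print("mean difference:", mean(diffs), "SD of differences:", round(stdev(diffs), 2))


def paired_ci_z(mean_diff: float, sd_diff: float, n: int, confidence: float = 0.95):
    """Large-sample CI for a mean difference; n is the number of persons (pairs)."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sd_diff / sqrt(n)
    return mean_diff - margin, mean_diff + margin


# Depressive symptom example: mean difference -12.7, SD of differences 8.9, n = 100.
lower, upper = paired_ci_z(-12.7, 8.9, 100)
print(f"95% CI for mean difference: ({lower:.1f}, {upper:.1f})")
```

With only four subjects a t value would be needed, so the small table is used here only to show how difference scores are formed.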
25
CI for Two Samples: Matched Continuous Outcome (Practice)
Parameter: Mean difference in anxiety symptom scores after psychotherapy
X̄d = -14.8
Sample N = 52 (number of persons, not measurements)
Sample SD = SD of difference scores, sd = 9.6
Confidence Level = 90%
Z value = ______
26
CI for Two Samples: Matched Continuous Outcome (Practice)
Parameter: Mean difference in anxiety symptom scores after psychotherapy
X̄d = -14.8
Sample N = 52 (number of persons, not measurements)
Sample SD = SD of difference scores, sd = 9.6
Confidence Level = 90%
Z value = 1.645
-14.8 ± 1.645 x (9.6 / sqrt(52)) = -14.8 ± 2.19 = (-17.0, -12.6)
From the sample, we estimate a mean difference in anxiety scores of -14.8 after undergoing psychotherapy, and we are 90% confident that the true mean difference lies between -17.0 and -12.6.
27

SECTION 3.5
Calculation and interpretation of confidence
intervals
Two Samples - Independent
Continuous mean difference
Dichotomous risk difference
Dichotomous risk ratio
Dichotomous odds ratio

CI for Two Samples: Independent Continuous Outcome
Common parameter of interest is the difference in means between the two groups, estimated by X̄1 - X̄2 and denoted for the population as µ1 - µ2
Since there are 2 independent groups, we also have n1 and n2, and s1 and s2
If the sample variances are approximately equal, then we can pool the standard deviations, s1 and s2. A typical rule of thumb to pool is:
s1² / s2² > 0.5 and s1² / s2² < 2.0
The pooled (common) standard deviation is a weighted average:
Sp = sqrt(((n1 - 1) x s1² + (n2 - 1) x s2²) / (n1 + n2 - 2))
29
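CI for Two Samples: Independent Continuous Outcome
Parameter: Mean difference in systolic blood pressure between a sample of men and a sample of women
X̄men = 128.2, n1 = 1623, s1 = 17.5
X̄women = 126.5, n2 = 1911, s2 = 20.1
Note: s1² / s2² = 0.76, so can use pooled SD (Sp)
Confidence Level = 95%, Z value = 1.96
Sp = sqrt(((1623 - 1) x 17.5² + (1911 - 1) x 20.1²) / (1623 + 1911 - 2)) = sqrt(359.12) = 19.0
Formula: (X̄1 - X̄2) ± z x Sp x sqrt(1/n1 + 1/n2)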
Formula
30
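CI for Two Samples: Independent Continuous Outcome
Parameter: Mean difference in systolic blood pressure between a sample of men and women
X̄men = 128.2, n1 = 1623, s1 = 17.5
X̄women = 126.5, n2 = 1911, s2 = 20.1
Formula: (X̄1 - X̄2) ± z x Sp x sqrt(1/n1 + 1/n2)
(128.2 - 126.5) ± 1.96 x 19.0 x sqrt(1/1623 + 1/1911) = 1.7 ± 1.26 = (0.44, 2.96)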
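The pooled calculation for the blood pressure example can be sketched as follows. The function names are hypothetical, and the variance-ratio check mirrors the rule of thumb (ratio between 0.5 and 2.0). Because the slide rounds Sp to 19.0 before computing the margin, the unrounded result differs from (0.44, 2.96) by about 0.01.

```python
from math import sqrt
from statistics import NormalDist


def pooled_sd(s1: float, n1: int, s2: float, n2: int) -> float:
    """Pooled SD: weighted average of the two sample variances, then square root."""
    return sqrt(((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2))


def mean_diff_ci(m1, s1, n1, m2, s2, n2, confidence=0.95):
    """Large-sample CI for mu1 - mu2 using the pooled SD."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    sp = pooled_sd(s1, n1, s2, n2)
    margin = z * sp * sqrt(1.0 / n1 + 1.0 / n2)
    diff = m1 - m2
    return diff, diff - margin, diff + margin


# Systolic blood pressure: men (128.2, SD 17.5, n 1623) vs women (126.5, SD 20.1, n 1911).
ratio = 17.5**2 / 20.1**2
print(f"variance ratio = {ratio:.2f} (pooling is reasonable if between 0.5 and 2.0)")
sp = pooled_sd(17.5, 1623, 20.1, 1911)
diff, lower, upper = mean_diff_ci(128.2, 17.5, 1623, 126.5, 20.1, 1911)
print(f"Sp = {sp:.2f}, difference = {diff:.1f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```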
31
CI for Two Samples: Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77, n1 = 163, s1 = 7.674
X̄women = 6.86, n2 = 333, s2 = 8.714
Note: s1² / s2² = _________
Assume calculation of a 95% confidence interval
32
CI for Two Samples: Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77, n1 = 163, s1 = 7.674
X̄women = 6.86, n2 = 333, s2 = 8.714
Note: s1² / s2² = 0.78, so can use pooled SD (Sp)
Sp = sqrt((9540 + 25210) / 494) = 8.39
(5.77 - 6.86) ± 1.96(8.39) x sqrt(1/163 + 1/333)
Point estimate: -1.09
95% CI: (-2.66, 0.49)
33
CI for Two Samples: Independent Continuous Outcome (Practice)
Parameter: Mean difference in depression scores between a sample of men and women
X̄men = 5.77, n1 = 163, s1 = 7.674
X̄women = 6.86, n2 = 333, s2 = 8.714
From the sample, we estimate a mean difference in depression scores between men and women of -1.09, and we are 95% confident that the true mean difference lies between -2.66 and 0.49.
SPSS: Analyze > Compare Means > Independent Samples T Test > Test Variable > Grouping Variable > Options > CI percentage
34

CI for Two Samples: Independent Risk Difference
Parameter of interest is the risk difference for the incidence proportions in the population, denoted as RD = p1 - p2
For a sample, the point estimate for the risk difference is denoted as RD = p̂1 - p̂2

Formula: (p̂1 - p̂2) ± z x sqrt(p̂1(1 - p̂1) / n1 + p̂2(1 - p̂2) / n2)
Example: Incidence of CVD in Smokers and Non-Smokers

                 No CVD   CVD        Total   Incidence
Current smoker   663      81 (x1)    744     p̂1 = 81 / 744 = 0.1089
Non-smoker       2757     298 (x2)   3055    p̂2 = 298 / 3055 = 0.0975
Total            3420     379        3799
35

CI for Two Samples: Independent Risk Difference
Example: Compare the incidence proportion of CHD among smokers (exposed) and non-smokers (not exposed)
Smokers: n1 = 744, w/CHD (x1) = 81, p̂1 = 0.1089
Non-smokers: n2 = 3055, w/CHD (x2) = 298, p̂2 = 0.0975
Confidence Level = 95%, Z value = 1.96

(0.1089 - 0.0975) ± 1.96 x sqrt(0.1089(1 - 0.1089) / 744 + 0.0975(1 - 0.0975) / 3055) = 0.0114 ± 0.0247 = (-0.0133, 0.0361)
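The risk difference interval for the smoking example can be reproduced from the raw counts. This is a sketch with a hypothetical function name; because it works from unrounded proportions, the results agree with the slide to about three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def risk_difference_ci(x1: int, n1: int, x2: int, n2: int, confidence: float = 0.95):
    """Large-sample CI for p1 - p2 from event counts x and group sizes n."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    se = sqrt(p1 * (1.0 - p1) / n1 + p2 * (1.0 - p2) / n2)
    rd = p1 - p2
    return rd, rd - z * se, rd + z * se


# Smokers: 81 events among 744; non-smokers: 298 events among 3055.
rd, lower, upper = risk_difference_ci(81, 744, 298, 3055)
print(f"RD = {rd:.4f}, 95% CI = ({lower:.4f}, {upper:.4f})")
```

The interval includes 0, the null value for a difference, so the data are compatible with no difference in risk.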
36

CI for Two Samples: Independent Risk Difference (Practice)
Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level = 95%, Z value = _______

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p̂1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p̂2 = 28 / 397 = 0.0705
Total             460        42         502
37

CI for Two Samples: Independent Risk Difference (Practice)
Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level = 95%, Z value = 1.96

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p̂1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p̂2 = 28 / 397 = 0.0705
Total             460        42         502

(0.1333 - 0.0705) ± 1.96 x sqrt(0.1333(1 - 0.1333) / 105 + 0.0705(1 - 0.0705) / 397) = 0.063 ± 0.0697 = (-0.007, 0.133)
38

CI for Two Samples: Independent Risk Difference (Practice)
Example: Compare the incidence proportion of sleep disorder among persons on statins (exposed) and not on statins (not exposed)
Confidence Level = 95%, Z value = 1.96

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p̂1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p̂2 = 28 / 397 = 0.0705
Total             460        42         502

0.063 ± 0.0697 = (-0.007, 0.133)
From the sample, we estimate that the absolute risk of sleep disorder is 0.063 higher in statin users compared to non-users, and we are 95% confident that the true risk difference lies between -0.007 and 0.133.
39

CI for Two Samples: Independent Risk Ratio
Parameter of interest is the ratio of the incidence proportions for the population, denoted as RR = p1 / p2
For a sample, the point estimate for the risk ratio (RR) is denoted as RR = p̂1 / p̂2
Note that the RR does not follow a normal distribution, but the natural log (ln) of the RR is approximately normally distributed and is used to calculate the confidence interval. This entails 2 steps:
--- Calculate CI for ln(RR)
--- Calculate CI for RR (i.e. transform)

CI for ln(RR): ln(RR) ± z x sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: (exp(Lower limit), exp(Upper limit))
40
CI for Two Samples: Independent Risk Ratio
RR = p̂1 / p̂2
CI for ln(RR): ln(RR) ± z x sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: (exp(Lower limit), exp(Upper limit))

Example: Compare future risk of CHD among smokers (exposed) and non-smokers (not exposed)
Smokers: n1 = 744, w/CHD (x1) = 81, p̂1 = 0.1089
Non-smokers: n2 = 3055, w/CHD (x2) = 298, p̂2 = 0.0975
Confidence Level = 95%, Z value = 1.96

RR = p̂1 / p̂2 = 0.1089 / 0.0975 = 1.12
CI for ln(RR): ln(1.12) ± 1.96 x sqrt((663 / 81) / 744 + (2757 / 298) / 3055) = 0.113 ± 0.232 = (-0.119, 0.345)
CI for RR: (exp(-0.119), exp(0.345)) = (0.89, 1.41)
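The two-step procedure (interval on the log scale, then exponentiate) can be sketched as below for the smoking example. The function name is hypothetical. The code keeps full precision throughout, so the limits may differ from the slide in the second decimal place, since the slide rounds RR to 1.12 before taking the log.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def risk_ratio_ci(x1: int, n1: int, x2: int, n2: int, confidence: float = 0.95):
    """CI for the risk ratio, computed on the ln scale and transformed back."""
    rr = (x1 / n1) / (x2 / n2)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    # Standard error of ln(RR) from counts of events (x) and group sizes (n).
    se_log = sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
    log_lower = log(rr) - z * se_log
    log_upper = log(rr) + z * se_log
    return rr, exp(log_lower), exp(log_upper)


# Smokers: 81 events among 744; non-smokers: 298 events among 3055.
rr, lower, upper = risk_ratio_ci(81, 744, 298, 3055)
print(f"RR = {rr:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```

The interval includes 1.0, the null value for a ratio, and it is not symmetric around the point estimate because of the back-transformation.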
41
CI for Two Samples: Independent Risk Ratio (Practice)
RR = p̂1 / p̂2
CI for ln(RR): ln(RR) ± z x sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: (exp(Lower limit), exp(Upper limit))

Example: Compare the future risk of sleep disorder among statin users (exposed) versus non-statin users (not exposed)
Confidence Level = 95%, Z value = _______

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p̂1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p̂2 = 28 / 397 = 0.0705
Total             460        42         502

RR = p̂1 / p̂2 = _______
CI for ln(RR): _______
42
CI for Two Samples: Independent Risk Ratio (Practice)
RR = p̂1 / p̂2
CI for ln(RR): ln(RR) ± z x sqrt(((n1 - x1) / x1) / n1 + ((n2 - x2) / x2) / n2)
CI for RR: (exp(Lower limit), exp(Upper limit))

Example: Compare the future risk of sleep disorder among statin users (exposed) versus non-statin users (not exposed)
Confidence Level = 95%, Z value = 1.96

                  Sleep OK   Sleep Dx   Total   Incidence
Statin user       91         14 (x1)    105     p̂1 = 14 / 105 = 0.1333
Non-statin user   369        28 (x2)    397     p̂2 = 28 / 397 = 0.0705
Total             460        42         502

RR = p̂1 / p̂2 = 0.1333 / 0.0705 = 1.89
CI for ln(RR): 0.6366 ± 0.6044 = (0.0322, 1.2410)
CI for RR: (exp(0.0322), exp(1.2410)) = (1.03, 3.46)
From the sample, we estimate that the risk of sleep disorder is 1.89 times as high in statin users as in non-users, and we are 95% confident that the true risk ratio lies between 1.03 and 3.46.
44

CI for Two Samples: Independent Odds Ratio
Conceptually similar to risk ratio, yet the parameter of interest is the odds ratio (OR), defined as:
Odds of exposure among cases / Odds of exposure among controls

              Cases   Controls
Exposed       a       b
Not exposed   c       d

Example: Prevalence of CVD in Smokers and Non-Smokers (95% C.I.)

                      CVD (D+)   No CVD (D-)
Current smoker (E+)   81         663
Non-smoker (E-)       298        2757

OR = (a / c) / (b / d) = (81 / 298) / (663 / 2757) = 1.13, Z = 1.96
CI for ln(OR): ln(OR) ± z x sqrt(1/a + 1/b + 1/c + 1/d) = 0.122 ± 0.260 = (-0.138, 0.382)
CI for OR: (exp(-0.138), exp(0.382)) = (0.87, 1.47)
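The odds ratio interval follows the same log-then-exponentiate pattern, with the standard error of ln(OR) taken as sqrt(1/a + 1/b + 1/c + 1/d). A minimal sketch for the CVD and smoking table, with a hypothetical function name:

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(a: int, b: int, c: int, d: int, confidence: float = 0.95):
    """CI for the odds ratio of a 2x2 table laid out as [[a, b], [c, d]]."""
    odds_ratio = (a / c) / (b / d)  # algebraically the same as (a * d) / (b * c)
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    se_log = sqrt(1.0 / a + 1.0 / b + 1.0 / c + 1.0 / d)
    log_lower = log(odds_ratio) - z * se_log
    log_upper = log(odds_ratio) + z * se_log
    return odds_ratio, exp(log_lower), exp(log_upper)


# a = exposed cases, b = exposed controls, c = unexposed cases, d = unexposed controls.
odds_ratio, lower, upper = odds_ratio_ci(81, 663, 298, 2757)
print(f"OR = {odds_ratio:.2f}, 95% CI = ({lower:.2f}, {upper:.2f})")
```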
45
CI for Two Samples: Independent Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Prevalence of Sleep Disorder Among Statin and Non-Statin Users (95% C.I.)

              Cases   Controls
Exposed       a       b
Not exposed   c       d

                       Sleep Dx   Sleep OK
Statin user (E+)       14         91
Non-statin user (E-)   28         369

OR = (a / c) / (b / d) = _________, Z = ___________
CI for ln(OR): _________
CI for OR: (exp(Lower limit), exp(Upper limit))
46
CI for Two Samples: Independent Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

              Cases   Controls
Exposed       a       b
Not exposed   c       d

                       Sleep Dx   Sleep OK
Statin user (E+)       14         91
Non-statin user (E-)   28         369

OR = (14 / 28) / (91 / 369) = 2.027, Z = 1.96
CI for ln(OR): 0.7066 ± 0.6813 = (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)
47
CI for Two Samples: Independent Odds Ratio (Practice)
OR = Odds of exposure among cases / Odds of exposure among controls
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

                       Sleep Dx   Sleep OK
Statin user (E+)       14         91
Non-statin user (E-)   28         369

OR = (14 / 28) / (91 / 369) = 2.027
CI for ln(OR): 0.7066 ± 0.6813 = (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)
From the sample, we estimate that the odds of statin use among persons with sleep disorder are 2.03 times the odds of statin use among persons without sleep disorder, and we are 95% confident that the true odds ratio lies between 1.03 and 4.01.
48
SECTION 3.6 Use of SPSS to calculate confidence
intervals
49
CI for Two Samples: Independent Odds Ratio (Practice)
Example: Prevalence of Sleep Disorder Among Statin and Non-Statin Users

                       Sleep Dx   Sleep OK
Statin user (E+)       14         91
Non-statin user (E-)   28         369

OR = (14 / 28) / (91 / 369) = 2.027
CI for ln(OR): 0.7066 ± 0.6813 = (0.0253, 1.3879)
CI for OR: (exp(0.0253), exp(1.3879)) = (1.03, 4.01)
SPSS: Analyze > Descriptive Statistics > Crosstabs > Row and Column Variable > Statistics (check Risk)
50
                       Sleep Dx   Sleep OK
Statin user (E+)       14         91
Non-statin user (E-)   28         369

OR = 2.027, 95% C.I. = (1.03, 4.01)
[Figure: number line for the odds ratio, bounded at 0 and unbounded above, marking the null value (1.0), the lower limit (1.03), the point estimate (OR = 2.03), and the upper limit (4.01)]

Note:
The confidence interval for a continuous variable such as a mean or difference in means is symmetric around the point estimate.
In contrast, for the risk ratio and odds ratio, the confidence interval is skewed to the right of the point estimate. This is because:
Values for RR and OR have a lower bound of 0 yet no upper bound
The C.I. formulas are based on an exponential function

51
SECTION 3.7 Relationship between the risk ratio
and the odds ratio
52
Odds Ratio Risk Ratio

Relationship between RR and OR
The odds ratio will provide a good estimate of the risk ratio when:
1. The outcome (disease) is rare
OR
2. The effect size is small or modest

53
Odds Ratio Risk Ratio

The odds ratio will provide a good estimate of the risk ratio when:
1. The outcome (disease) is rare

RR = [a / (a + b)] / [c / (c + d)]

      D+   D-
E+    a    b
E-    c    d

If the disease is rare, then cells (a) and (c) will be small, so a + b ≈ b and c + d ≈ d:
RR = [a / (a + b)] / [c / (c + d)] ≈ (a / b) / (c / d) = ad / bc = OR
OR = (a / c) / (b / d)
OR = (ad) / (bc)
54
Odds Ratio Risk Ratio
The odds ratio will provide a good estimate of the risk ratio when:
2. The effect size is small or modest.

      D+    D-
E+    40    60
E-    120   180

OR = (40 / 120) / (60 / 180) = 0.333 / 0.333 = 1.0
RR = [40 / (40 + 60)] / [120 / (120 + 180)] = 0.40 / 0.40 = 1.0
55
Odds Ratio Risk Ratio
Finally, we expect the risk ratio to be closer to the null value of 1.0 than the odds ratio. Therefore, be especially cautious when interpreting the odds ratio as a measure of relative risk when the outcome is not rare and the effect size is large.

      D+   D-
E+    20   30
E-    10   90

OR = (20 / 10) / (30 / 90) = 2.0 / 0.333 = 6.0
RR = (20 / 50) / (10 / 100) = 0.40 / 0.10 = 4.0
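The relationship between the two measures can be explored by computing both from the same 2x2 table. The sketch below uses the two tables from the slides plus one rare-outcome table whose counts are hypothetical, made up here purely to illustrate the first condition. The function names are also illustrative.

```python
def risk_ratio(a: int, b: int, c: int, d: int) -> float:
    """RR = [a / (a + b)] / [c / (c + d)] for rows E+ (a, b) and E- (c, d)."""
    return (a / (a + b)) / (c / (c + d))


def odds_ratio(a: int, b: int, c: int, d: int) -> float:
    """OR = (a * d) / (b * c) for the same table layout."""
    return (a * d) / (b * c)


tables = {
    "modest effect (40, 60, 120, 180)": (40, 60, 120, 180),
    "common outcome, large effect (20, 30, 10, 90)": (20, 30, 10, 90),
    # Hypothetical counts, not from the slides: 1% vs 0.5% risk.
    "rare outcome, hypothetical (10, 990, 5, 995)": (10, 990, 5, 995),
}

for label, cells in tables.items():
    print(f"{label}: RR = {risk_ratio(*cells):.2f}, OR = {odds_ratio(*cells):.2f}")
```

In the rare-outcome table the two measures nearly coincide, while in the common-outcome table with a large effect the odds ratio is noticeably further from 1.0 than the risk ratio.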
