Review of observational study design and basic statistics for contingency tables

Slides: 71

Provided by: Joh74

Learn more at: https://web.stanford.edu

1
Review of observational study design and basic
statistics for contingency tables
4

Coffee Chronicles
BY MELISSA AUGUST, ANN MARIE BONARDI, VAL
CASTRONOVO, MATTHEW
JOE'S BLOWS Last week researchers reported that
coffee might help prevent Parkinson's disease. So
is the caffeine bean good for you or not? Over
the years, studies haven't exactly been clear

According to scientists, too much coffee may
cause...
1986 --phobias, --panic attacks
1990 --heart attacks, --stress, --osteoporosis
1991 --underweight babies, --hypertension
1992 --higher cholesterol
1993, 08 --miscarriages
1994 --intensified stress
1995 --delayed conception
But scientists say coffee also may help
prevent...
1988 --asthma
1990 --colon and rectal cancer,...
2004 --Type II diabetes (6 cups per day!)
2006 --alcohol-induced liver damage
2007 --skin cancer

5
Medical Studies
The General Idea
Evaluate whether a risk factor (or preventative
factor) increases (or decreases) your risk for an
outcome (usually disease, death or intermediary
to disease).
6
Observational vs. Experimental Studies
Observational studies: the population is
observed without any interference by the
investigator.
Experimental studies: the investigator tries to
control the environment in which the hypothesis
is tested (the randomized, double-blind clinical
trial is the gold standard).
7
Limitation of observational research: confounding

Confounding: risk factors don't happen in
isolation, except in a controlled experiment.
Example: In a case-control study of a salmonella
outbreak, tomatoes were identified as the source
of the infection. But the association was
spurious. Tomatoes are often eaten with serrano
and jalapeno peppers, which turned out to be the
true source of infection.
Example: Breastfeeding has been linked to higher
IQ in infants, but the association could be due
to confounding by socioeconomic status. Women who
breastfeed tend to be better educated and have
better prenatal care, which may explain the
higher IQ in their infants.

8
Confounding: A major problem for observational
studies
9
Why Observational Studies?

Cheaper
Faster
Can examine long-term effects
Hypothesis-generating
Sometimes, experimental studies are not ethical
(e.g., randomizing subjects to smoke)

10
Possible Observational Study Designs

Cross-sectional studies
Cohort studies
Case-control studies

11
Cross-Sectional (Prevalence) Studies

Measure disease and exposure on a random sample
of the population of interest. Are they
associated?
Marginal probabilities of exposure AND disease
are valid, but the design only measures
association at a single time point.

12
The 2x2 Table
N
13
Example: cross-sectional study

Relationship between atherosclerosis and
late-life depression (Tiemeier et al. Arch Gen
Psychiatry, 2004).
Methods: Researchers measured the prevalence of
coronary artery calcification (atherosclerosis)
and the prevalence of depressive symptoms in a
large cohort of elderly men and women in
Rotterdam (n = 1920).

14
Example: cross-sectional study
P(D) = Prevalence of depression (sub-threshold
or depressive disorder) = (20 + 13 + 12 + 9 + 11 + 16)/1920
= 4.2%
P(E) = Prevalence of atherosclerosis (coronary
calcification >500) = (511 + 12 + 16)/1920 = 28.1%
15
The 2x2 table
P(depression) = 81/1920 = 4.2%
P(atherosclerosis) = 539/1920 = 28.1%
P(depression | atherosclerosis) = 28/539 = 5.2%
16
Difference of proportions Z-test
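A minimal sketch of a two-proportion Z-test on the Rotterdam counts. It assumes the pooled standard error, which is one common form of the test and may not be the exact form on the slide. The counts come from the calcification table in this deck: 28 of 539 people with calcification >500 had depressive symptoms or disorder, versus 20 + 9 + 13 + 11 = 53 of 894 + 487 = 1381 people in the two lower categories. The function name is illustrative only.

```python
import math

def two_proportion_z(x1, n1, x2, n2):
    """Z statistic and two-sided p-value for H0: p1 == p2, using a pooled SE."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)                      # overall proportion under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))          # two-sided standard normal tail area
    return z, p_value

# Depressed / total: calcification >500 vs. calcification <=500
z, p_value = two_proportion_z(28, 539, 53, 1381)
print(f"Z = {z:.2f}, two-sided p = {p_value:.3f}")
```

With these counts Z comes out near 1.33, the value that is squared in the chi-square note later in the deck.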
17
Or, use relative risk (risk ratio)
Interpretation: those with coronary calcification
are 37% more likely to have depression (not
significant).
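A minimal sketch of the risk ratio for the same comparison, with a 95% confidence interval built on the log scale. The log-scale interval is a standard textbook method assumed here, not necessarily the one used on the slide, and the function name is illustrative. Note that the raw counts give a ratio of about 1.35, while dividing the rounded percentages (5.2% / 3.8%) gives 1.37.

```python
import math

def risk_ratio_with_ci(x1, n1, x0, n0, z=1.96):
    """Risk ratio (exposed vs. unexposed) and an approximate 95% CI on the log scale."""
    rr = (x1 / n1) / (x0 / n0)
    se_log = math.sqrt(1 / x1 - 1 / n1 + 1 / x0 - 1 / n0)   # SE of ln(RR)
    low = math.exp(math.log(rr) - z * se_log)
    high = math.exp(math.log(rr) + z * se_log)
    return rr, low, high

# 28/539 depressed with calcification >500; 53/1381 depressed otherwise
rr, ci_low, ci_high = risk_ratio_with_ci(28, 539, 53, 1381)
print(f"RR = {rr:.2f}, 95% CI ({ci_low:.2f}, {ci_high:.2f})")
```

The interval spans 1.0, which agrees with the "not significant" remark.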
18
Or, use chi-square test
Observed
Expected
19
Chi-square test
Note: 1.77 = 1.33^2
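A minimal sketch of the Pearson chi-square statistic for the 2x2 table, to illustrate why it equals the square of the Z statistic. The cell counts are derived from the calcification table in this deck (calcification >500: 28 depressed, 511 not; calcification <=500: 53 depressed, 1328 not). The function name is illustrative, and no continuity correction is applied, which is an assumption.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for the table [[a, b], [c, d]]."""
    observed = ((a, b), (c, d))
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    n = a + b + c + d
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n   # expected count under independence
            stat += (observed[i][j] - expected) ** 2 / expected
    return stat

chi2 = chi_square_2x2(28, 511, 53, 1328)
print(f"chi-square = {chi2:.2f}")
```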
20
Chi-square test also works for bigger contingency
tables (RxC)
21
Chi-square test also works for bigger contingency
tables (RxC)
Coronary calcification   No depression   Sub-threshold depressive symptoms   Clinical depressive disorder
0-100                    865             20                                  9
101-500                  463             13                                  11
>500                     511             12                                  16
22
Observed
Coronary calcification   No depression   Sub-threshold depressive symptoms   Clinical depressive disorder   Total
0-100                    865             20                                  9                              894
101-500                  463             13                                  11                             487
>500                     511             12                                  16                             539
Total                    1839            45                                  36                             1920
Expected
Coronary calcification   No depression                      Sub-threshold depressive symptoms   Clinical depressive disorder
0-100                    894 × 1839/1920 = 856.3            894 × 45/1920 = 21                  894 - (21 + 856.3) = 16.7
101-500                  487 × 1839/1920 = 466.5            487 × 45/1920 = 11.4                487 - (466.5 + 11.4) = 9.1
>500                     1839 - (856.3 + 466.5) = 516.2     45 - (21 + 11.4) = 12.6             36 - (16.7 + 9.1) = 10.2
23
Chi-square test
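A minimal sketch of the chi-square test for the 3x3 calcification-by-depression table, computing expected counts as (row total × column total) / N. The function name is illustrative. The p-value line uses a closed form that holds only for 4 degrees of freedom, chosen here to avoid a statistics library; that shortcut is an assumption of this sketch, not something taken from the slides.

```python
import math

def chi_square_rxc(table):
    """Pearson chi-square statistic and degrees of freedom for an R x C table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Rows: calcification 0-100, 101-500, >500
# Columns: no depression, sub-threshold symptoms, clinical depressive disorder
observed = [
    [865, 20, 9],
    [463, 13, 11],
    [511, 12, 16],
]
chi2, df = chi_square_rxc(observed)

# Upper-tail probability of a chi-square with exactly 4 df: exp(-x/2) * (1 + x/2)
p_value = math.exp(-chi2 / 2) * (1 + chi2 / 2)
print(f"chi-square = {chi2:.2f} on {df} df, p = {p_value:.3f}")
```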
24
Cause and effect?
depression in elderly
atherosclerosis
25
Confounding?
depression in elderly
atherosclerosis
26
Cross-Sectional Studies

Advantages:
cheap and easy
generalizable
good for characteristics that (generally) don't
change, like genes or gender
Disadvantages:
difficult to determine cause and effect
problematic for rare diseases and exposures

27
2. Cohort studies

Sample on exposure status and track disease
development (for rare exposures)
Marginal probabilities (and rates) of developing
disease for exposure groups are valid.

28
Example: The Framingham Heart Study

The Framingham Heart Study was established in
1948, when 5209 residents of Framingham, Mass,
aged 28 to 62 years, were enrolled in a
prospective epidemiologic cohort study.
Health and lifestyle factors were measured (blood
pressure, weight, exercise, etc.).
Interim cardiovascular events were ascertained
from medical histories, physical examinations,
ECGs, and review of interim medical records.

29
Example 2: Johns Hopkins Precursors
Study (medical students, 1948 through 1964)
http://www.jhu.edu/jhumag/0601web/study.html
From the Johns Hopkins Magazine website (URL
above).
30
Cohort Studies
[Diagram: groups from the target population are
followed over TIME; in each group some develop
disease and some remain disease-free.]
31
The Risk Ratio, or Relative Risk (RR)
32
Hypothetical Data

33
Advantages/Limitations: Cohort Studies

Advantages:
Allows you to measure true rates and risks of
disease for the exposed and the unexposed groups.
Temporality is correct (easier to infer cause and
effect).
Can be used to study multiple outcomes.
Prevents bias in the ascertainment of exposure
that may occur after a person develops a disease.
Disadvantages:
Can be lengthy and costly! 60 years for
Framingham.
Loss to follow-up is a problem (especially if
non-random).
Selection bias: participation may be associated
with exposure status for some exposures.

34
Case-Control Studies

Sample on disease status and ask retrospectively
about exposures (for rare diseases).
Marginal probabilities of exposure for cases and
controls are valid.
Doesn't require knowledge of the absolute risks
of disease.
For rare diseases, can approximate relative risk.

35
Case-Control Studies
[Diagram: from the target population, people with
disease (cases) and people with no disease
(controls) are sampled, and each group is
classified as exposed in the past or not exposed.]
36
Example: the AIDS epidemic in the early 1980s

Early case-control studies among AIDS cases and
matched controls indicated that AIDS was
transmitted by sexual contact or blood products.
In 1982, an early case-control study matched AIDS
cases to controls and found a positive
association between amyl nitrites (poppers) and
AIDS: odds ratio of 8.6 (Marmor et al. 1982).
This is an example of confounding.

37
Case-Control Studies in History

In 1843, Guy compared occupations of men with
pulmonary consumption to those of men with other
diseases (Lilienfeld and Lilienfeld 1979).
Case-control studies identified associations
between lip cancer and pipe smoking (Broders
1920), breast cancer and reproductive history
(Lane-Claypon 1926) and between oral cancer and
pipe smoking (Lombard and Doering 1928). All
rare diseases.
Case-control studies identified an association
between smoking and lung cancer in the 1950s.

38
Case-control example

A study of the relation between body mass index
and the incidence of age-related macular
degeneration (Moeini et al. Br. J. Ophthalmol,
2005).
Methods: Researchers compared 50 Iranian patients
with confirmed age-related macular degeneration
and 80 control subjects with respect to BMI,
smoking habits, hypertension, and diabetes. The
researchers were specifically interested in the
relationship of BMI to age-related macular
degeneration.

39
Results
Table 2 Comparison of body mass index (BMI) in
case and control groups

40
Corresponding 2x2 Table
(Totals: 50 cases, 80 controls)
What is the risk ratio here? Tricky: There is no
risk ratio, because we cannot calculate the risk
of disease!
41
The odds ratio

We cannot calculate a risk ratio from a
case-control study.
BUT, we can calculate a measure called the odds
ratio

42
Odds vs. Risk
If the risk is    Then the odds are
½ (50%)           1:1
¾ (75%)           3:1
1/10 (10%)        1:9
1/100 (1%)        1:99
Note: An odds is always higher than its
corresponding probability, unless the probability
is 0. (At a probability of 100%, the odds are
undefined.)
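A minimal sketch of the risk-to-odds conversion behind the table above, odds = risk / (1 - risk), using exact fractions so the odds print as ratios. The function name is illustrative.

```python
from fractions import Fraction

def odds_from_risk(risk):
    """Convert a probability (0 <= risk < 1) into odds = risk / (1 - risk)."""
    return risk / (1 - risk)

for risk in (Fraction(1, 2), Fraction(3, 4), Fraction(1, 10), Fraction(1, 100)):
    odds = odds_from_risk(risk)
    print(f"risk {risk} -> odds {odds.numerator}:{odds.denominator}")
```

The smaller the risk, the closer the odds sit to it: 1/100 becomes 1:99, while 3/4 becomes 3:1.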
43
The Odds Ratio (OR)
a + b = cases
c + d = controls
44
The Odds Ratio (OR)
45
Proof via Bayes Rule (optional)

46
The Odds Ratio (OR)
47
The Odds Ratio (OR)
48
The Odds Ratio (OR)
Can be interpreted as: Overweight people have a
43% decrease in their ODDS of age-related macular
degeneration (not statistically significant
here).
49
The odds ratio is a good approximation of the
risk ratio if the disease is rare.
If the disease is rare (affecting <10% of the
population), then OR ≈ RR.
WHY? If the disease is rare, the probability of
it NOT happening is close to 1, and the odds is
close to the risk. E.g., a risk of 1/100
corresponds to odds of 1:99.
50
The rare disease assumption
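A minimal numeric sketch of the rare disease assumption. The risks below are hypothetical values chosen for illustration (they are not from any study in this deck): the exposed group always has twice the risk of the unexposed group, so the true RR is 2, and the odds ratio is computed for increasingly common outcomes. The function name is illustrative.

```python
def odds_ratio_from_risks(risk_exposed, risk_unexposed):
    """Odds ratio implied by two risks (each strictly between 0 and 1)."""
    odds_exposed = risk_exposed / (1 - risk_exposed)
    odds_unexposed = risk_unexposed / (1 - risk_unexposed)
    return odds_exposed / odds_unexposed

TRUE_RR = 2.0  # hypothetical: exposure doubles the risk
for risk_unexposed in (0.001, 0.01, 0.10, 0.30):
    risk_exposed = TRUE_RR * risk_unexposed
    or_value = odds_ratio_from_risks(risk_exposed, risk_unexposed)
    print(f"baseline risk {risk_unexposed:.3f}: RR = {TRUE_RR:.2f}, OR = {or_value:.2f}")
```

At a baseline risk of 0.1% the OR is almost exactly 2; at a baseline risk of 30% it has grown to 3.5, so the OR overstates the RR once the outcome is common.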
51
The odds ratio vs. the risk ratio
Rare Outcome
1.0 (null)
Common Outcome
1.0 (null)
52
When is the OR a good approximation of the RR?
53
Advantages/Limitations: Case-control studies

Advantages:
Cheap and fast
Efficient for rare diseases
Disadvantages:
Getting comparable controls is often tricky
Temporality is a problem (did risk factor cause
disease or disease cause risk factor?)
Recall bias

54
Inferences about the odds ratio
55
Properties of the OR (simulation)
(50 cases/50 controls/20 exposed)
If the true odds ratio = 1.0, then with 50 cases
and 50 controls, of whom 20 are exposed, this is
the expected variability of the sample OR. Note
the right skew.
56
Properties of the lnOR
57
Hypothetical Data
30
30
58
When can the OR mislead?
59
Example: Does dementia predict death?

Dementia: The leading predictor of death in a
defined elderly population. Neurology 2004; 62:
1156-1162
Among patients with dementia: 291/355 (82%) died
Among patients without dementia: 947/4328 (22%)
died

60
Dementia study

Authors report OR = 16.23 (12.27, 21.48)
But the RR = 3.72
Fortunately, they do not dwell on the OR, but it
could mislead if not interpreted correctly
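A minimal sketch that recomputes both measures from the counts quoted for this study (291 of 355 with dementia died; 947 of 4328 without dementia died). The confidence interval uses the usual log-odds (Woolf) standard error, which is an assumption about the authors' method, although it does reproduce their reported interval. The function and variable names are illustrative. From the raw counts the RR comes out near 3.75; dividing the rounded percentages (82% / 22%) gives the 3.72 quoted above.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc and a 95% CI from the standard error of ln(OR).

    a, b = outcome yes / no among the exposed
    c, d = outcome yes / no among the unexposed
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - z * se_log)
    high = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, low, high

died_dementia, n_dementia = 291, 355
died_other, n_other = 947, 4328

odds_ratio, ci_low, ci_high = odds_ratio_with_ci(
    died_dementia, n_dementia - died_dementia,
    died_other, n_other - died_other,
)
rr = (died_dementia / n_dementia) / (died_other / n_other)

print(f"OR = {odds_ratio:.2f} ({ci_low:.2f}, {ci_high:.2f})")
print(f"RR = {rr:.2f}")
```

Because death is a common outcome here (82% and 22%), the OR is more than four times the RR, which is the sense in which it could mislead.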

61
Better to give OR or RR?
From an RCT (prospective!) of a new diet drug,
the authors showed the following table
62
Better to give OR or RR?
63
Summary of statistical tests for contingency
tables
Table Size   Test or measures of association
2x2          risk ratio (cohort or cross-sectional studies)
             odds ratio (case-control studies)
             Chi-square
             difference in proportions
             Fisher's Exact test (cell size less than 5)
RxC          Chi-square
             Fisher's Exact test (expected cell size <5)
64
Fisher's Exact Test
65
Fisher's Tea-tasting experiment
Claim: Fisher's colleague (call her Cathy)
claimed that, when drinking tea, she could
distinguish whether milk or tea was added to the
cup first. To test her claim, Fisher designed
an experiment in which she tasted 8 cups of tea
(4 cups had milk poured first, 4 had tea poured
first).
Null hypothesis: Cathy's guessing abilities are
no better than chance.
Alternative hypotheses:
Right-tail: She guesses right more than expected
by chance.
Left-tail: She guesses wrong more than expected
by chance.
66
Fisher's Tea-tasting experiment
Experimental Results
67
Fisher's Exact Test
Step 1: Identify tables that are as extreme or
more extreme than what actually happened. Here
she identified 3 out of 4 of the
milk-poured-first teas correctly. Is that good
luck or real talent? The only way she could have
done better is if she identified 4 of 4 correctly.
68
Fisher's Exact Test
Step 2: Calculate the probability of the tables
(assuming fixed marginals)
69
Step 3: To get the left-tail and right-tail
p-values, consider the probability mass
function of X, where X = the number of correct
identifications of the cups with
milk-poured-first.
SAS also gives a two-sided p-value, which is
calculated by adding up all probabilities in the
distribution that are less than or equal to the
probability of the observed table (equal or more
extreme). Here: 0.229 + 0.014 + 0.229 + 0.014 = 0.4857
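A minimal sketch of the three steps for the tea-tasting experiment. With the margins fixed (8 cups, 4 milk-first, 4 cups labelled milk-first by the taster), the number of correct milk-first identifications X follows a hypergeometric distribution. The function and variable names are illustrative, and the two-sided rule coded here is the one described above (sum every probability no larger than that of the observed table).

```python
from math import comb

def tea_pmf(k, cups=8, milk_first=4, picked=4):
    """P(X = k): exactly k of the cups picked as milk-first really were milk-first."""
    return comb(milk_first, k) * comb(cups - milk_first, picked - k) / comb(cups, picked)

pmf = {k: tea_pmf(k) for k in range(5)}   # X can be 0, 1, 2, 3 or 4
observed = 3                              # she got 3 of the 4 milk-first cups right

right_tail = sum(p for k, p in pmf.items() if k >= observed)   # as good or better
left_tail = sum(p for k, p in pmf.items() if k <= observed)    # as good or worse
# Small tolerance so that tables tied with the observed one are included
two_sided = sum(p for p in pmf.values() if p <= pmf[observed] + 1e-12)

for k, p in pmf.items():
    print(f"P(X = {k}) = {p:.3f}")
print(f"right-tail p = {right_tail:.4f}")
print(f"left-tail p  = {left_tail:.4f}")
print(f"two-sided p  = {two_sided:.4f}")
```

The two-sided sum is 34/70, which is the 0.4857 shown above; the right-tail value is 17/70.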
70
Summary of statistical tests for contingency
tables
Table Size   Test or measures of association
2x2          risk ratio (cohort or cross-sectional study)
             odds ratio (case-control study)
             Chi-square
             difference in proportions
             Fisher's Exact test (cell size less than 5)
RxC          Chi-square
             Fisher's Exact test (expected cell size <5)
