# Bias - PowerPoint PPT Presentation

Title: Bias

1
"There are known knowns. These are things we know that we know. There are known unknowns. That is to say, there are things that we know we don't know. But there are also unknown unknowns. There are things we don't know we don't know." (Donald Rumsfeld)
2
Bias
3
Bias
A systematic error (caused by the investigator or
the subjects) that causes an incorrect (over- or
under-) estimate of an association.
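[Figure: relative risk scale running from 0 through 1.0 to 10. Values below 1.0 indicate a protective effect, 1.0 indicates no difference, and values above 1.0 indicate increased risk. The true effect is marked on the scale.]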
4
Suppose a study was conducted multiple times in
an identical way.
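[Figure: estimates from repeated studies shown relative to the true value and the null, contrasting precise results with accurate results.]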
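The idea of repeating a study many times can be sketched with a small simulation. All numbers below (a true relative risk of 2.0, a chance spread of 0.2, and a systematic shift of 0.5) are hypothetical values chosen for illustration; none of them come from the slides.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_RR = 2.0           # hypothetical true relative risk
RANDOM_SD = 0.2         # hypothetical spread due to chance (sampling error)
SYSTEMATIC_SHIFT = 0.5  # hypothetical bias added to every estimate


def repeat_study(n_repeats, shift):
    """Simulate the estimates obtained by repeating one study n_repeats times."""
    return [random.gauss(TRUE_RR + shift, RANDOM_SD) for _ in range(n_repeats)]


unbiased = repeat_study(1000, shift=0.0)
biased = repeat_study(1000, shift=SYSTEMATIC_SHIFT)

print("mean estimate without bias:", round(statistics.mean(unbiased), 2))
print("mean estimate with bias:   ", round(statistics.mean(biased), 2))
print("spread without bias:", round(statistics.stdev(unbiased), 2))
print("spread with bias:   ", round(statistics.stdev(biased), 2))
```

Both sets of estimates have about the same spread, so they are equally precise, but only the first set centers on the true value. Averaging over repetitions reduces chance error and leaves the systematic shift untouched.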
5
Errors Affecting Validity
Consider:
• Chance (random error; sampling error)
• Bias (systematic errors; inaccuracies)
  • Selection bias
  • Loss to follow-up bias
  • Information bias
    • Nondifferential (e.g., simple misclassification)
    • Differential biases (e.g., recall bias, interviewer bias)
• Confounding (imbalance in other factors)

6
Selection Bias
Occurs when selection, enrollment, or continued
participation in a study is somehow dependent on
the likelihood of having the exposure of interest
or the outcome of interest.
Selection bias can cause an overestimate or
underestimate of the association.
7
Selection bias can occur in several ways:
• Selection of a comparison group ("controls") that
is not representative of the population that
produced the cases in a case-control study.
(Control selection bias)
• Differential loss to follow up in a cohort study,
such that the likelihood of being lost to follow
up is related to outcome status and exposure
status. (Loss to follow-up bias)
• Refusal, non-response, or agreement to
participate that is related to the exposure and
disease (Self-selection bias)
• Using the general population as a comparison
group for an occupational cohort study ("Healthy
worker" effect)
• Differential referral or diagnosis of subjects

8
Selection Bias in a Case-Control Study
Selection bias can occur in a case-control study
if controls are more (or less) likely to be
selected if they have the exposure.
Do women of low SES have a higher risk of cervical cancer?
Cases: 100 hospital cases at MGH.
Controls: 200, from a door-to-door survey of the neighborhood around the hospital during the work day.
9
Controls: 200, from a door-to-door survey of the neighborhood around the hospital during the work day.
Problems:
• The socioeconomic (SE) status of people living around the hospital may generally be different from that of the population that produced the cases.
• The door-to-door method of selecting controls may tend to select people of lower (or higher) SE status.

10
Selection bias can occur in a case-control study
if controls are more (or less) likely to be
selected if they have the exposure.
Selection bias is not caused by differences in
other potential risk factors (confounding). It is
caused by selecting controls who are more (or
less) likely to have the exposure of interest.
11
Selection Bias in a Case-Control Study
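True (OR = 3.0):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 75 | 100 |
| Exp. N | 25 | 100 |

With control selection bias (OR = 2.0):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 75 | 120 |
| Exp. N | 25 | 80 |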
12
Control Selection Bias: The "Would" Criterion
• Are the controls a representative sample of the population that produced the cases?
• If a control had developed cervical cancer, would she have been included in the case group? ("Would" criterion)

You should try to fulfill the "would" criterion: if a control had developed the disease being studied, is it likely that they would have ended up in the case group? If the answer is "not necessarily," then there is likely to be a problem with selection bias.
13
There are 2,000,000 women over age 20 in MA, with about 200 cases of cervical cancer per year.
If low SES were associated with cervical cancer with OR = 3.0, MA would look like this:

| Entire population | Cancer cases | Normal |
|---|---|---|
| Low SES (< median) | 150 | 1,000,000 |
| High SES (> median) | 50 | 1,000,000 |

| Sample | Cancer cases | Normal |
|---|---|---|
| Low SES | 75 | 120 |
| High SES | 25 | 80 |

OR = (75/25) / (120/80) = 2.0 (biased)
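The odds ratios in the cervical cancer example can be checked with a few lines of Python. This is a minimal sketch: the function name `odds_ratio` and its argument names are illustrative choices, and the counts are the ones quoted on the slides (150/50 cases and 1,000,000/1,000,000 non-cases in the population; 75/25 cases and 120/80 controls in the sample).

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio for a 2x2 table: (a/c) / (b/d), which equals (a*d) / (b*c)."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# Source population: low SES is the exposure.
true_or = odds_ratio(150, 1_000_000, 50, 1_000_000)

# Study sample: the door-to-door controls over-represent low SES (120 vs. 80).
biased_or = odds_ratio(75, 120, 25, 80)

print("OR in the source population:", true_or)
print("OR in the study sample:     ", biased_or)
```

The cases are split 3:1 in both tables. Only the exposure distribution among controls changes (1:1 in the population, 1.5:1 in the sample), and that alone moves the estimate from 3.0 to 2.0.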
14
Are mothers of children with hemifacial
microsomia more often diabetic?
Cases are referred, but what if controls are
selected from the general pediatrics ward at MGH?
The referral mechanism of controls might be very different from that of the referred cases with microsomia.
Could mothers of controls be more or less likely
to be diabetic than the cases (regardless of any
association between diabetes and microsomia)?
How would you select controls for this study?
15
Self-Selection Bias in a Case-Control Study
Selection bias can be introduced into case-control studies with low response or participation rates if the likelihood of responding or participating is related to both the exposure and the outcome.
Example: A case-control study explored an association between family history of heart disease (exposure) and the presence of heart disease in subjects. Volunteers are recruited from an HMO. Subjects with heart disease may be more likely to participate if they have a family history of disease.
16
Self-Selection Bias in a Case-Control Study
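True (OR = 2.25):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 300 | 200 |
| Exp. N | 200 | 300 |

With self-selection bias (OR = 3.0), participation rate in parentheses:

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 240 (80%) | 120 (60%) |
| Exp. N | 120 (60%) | 180 (60%) |

The best solution is to work toward high participation (> 80%) in all groups.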
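The effect of unequal participation can be reproduced by applying a participation rate to each cell of the true table. This is a sketch under the slide's numbers (80% participation among exposed cases, 60% in the other three cells); the dictionary keys and function name are illustrative.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio for a 2x2 table: (a*d) / (b*c)."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


# True counts in the source population (family history is the exposure).
true_table = {
    "exp_cases": 300, "exp_controls": 200,
    "unexp_cases": 200, "unexp_controls": 300,
}

# Participation in percent; only exposed cases respond at a higher rate.
participation_pct = {
    "exp_cases": 80, "exp_controls": 60,
    "unexp_cases": 60, "unexp_controls": 60,
}

# Integer arithmetic keeps the participant counts exact.
participants = {
    cell: count * participation_pct[cell] // 100
    for cell, count in true_table.items()
}

true_or = odds_ratio(**true_table)
observed_or = odds_ratio(**participants)

print("participants:", participants)
print("true OR:", true_or, "observed OR:", observed_or)
```

If all four cells participated at the same rate, the rate would cancel out of the odds ratio. The bias arises because participation depends on exposure and outcome together.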
17
Selection Bias in a Retrospective Cohort Study
In a retrospective cohort study, selection bias occurs if selection of exposed and non-exposed subjects is somehow related to the outcome.
What will be the result if the investigators are
more likely to select an exposed person if they
have the outcome of interest?
18
Selection Bias in a Retrospective Cohort Study
Example: Investigating an occupational exposure (an organic solvent) that occurred 15-20 years ago in a factory. Exposed and unexposed subjects are enrolled based on employment records, but some records were lost.
Suppose there was a greater likelihood of retaining records of those who were exposed and got disease.
19
Selection Bias in a Retrospective Cohort Study
Differential referral or diagnosis of subjects
True (RR = 2.0):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 100 | 900 |
| Exp. N | 50 | 950 |

20% of employee health records were lost or discarded, except in solvent workers who reported illness (1% loss).

Observed (RR = 2.42):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 99 | 720 |
| Exp. N | 40 | 760 |

Workers in the exposed group were more likely to be included if they had the outcome of interest.
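The relative risks for the solvent example follow directly from the record-retention rates. The sketch below assumes, as the slide states, that 99% of records were retained for exposed workers with disease and 80% for everyone else; the helper name `risk_ratio` is illustrative.

```python
def risk_ratio(exp_dis, exp_no_dis, unexp_dis, unexp_no_dis):
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exp_dis / (exp_dis + exp_no_dis)
    risk_unexposed = unexp_dis / (unexp_dis + unexp_no_dis)
    return risk_exposed / risk_unexposed


# Order: exposed diseased, exposed not diseased, unexposed diseased, unexposed not diseased.
true_counts = (100, 900, 50, 950)
retained_pct = (99, 80, 80, 80)  # percent of records that survived, per cell

observed_counts = tuple(n * pct // 100 for n, pct in zip(true_counts, retained_pct))

true_rr = risk_ratio(*true_counts)
observed_rr = risk_ratio(*observed_counts)

print("records available:", observed_counts)
print("true RR:", round(true_rr, 2), "observed RR:", round(observed_rr, 2))
```

Because diseased exposed workers are over-represented among the surviving records, the risk in the exposed group is inflated (99/819 instead of 100/1000) while the risk in the unexposed group is unchanged (40/800 = 50/1000).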
20
The Healthy Worker Effect
Can be considered a form of selection bias
because the general population controls have a
higher probability of getting the outcome (death).
The general population is often used in
occupational studies of mortality, since data is
readily available, and they are mostly unexposed.
[Figure: mortality rates in the employed work force vs. the general population.]
The main disadvantage is bias by the healthy worker effect. The employed work force (mostly healthy) generally has lower rates of mortality and disease than the general population (with healthy and ill people).
21
Differential Retention (Loss to Follow Up) in
Prospective Cohort Studies
Enrollment into a prospective cohort study will
not be biased by the outcome, because the outcome
has not occurred at enrollment. However,
prospective cohort studies can have selection
bias if the exposure groups have differential
retention of subjects with the outcomes of
interest. This can cause either an over- or
underestimate of the association.
22
Selection Bias in a Prospective Cohort Study
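More events lost in one exposure group.

True (RR = 2.0):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 20 | 9980 |
| Exp. N | 10 | 9990 |

After losses (RR = 1.0):

| | Dis. Y | Dis. N |
|---|---|---|
| Exp. Y | 8 | 5980 |
| Exp. N | 8 | 5990 |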
23
Differential loss to follow-up in a prospective cohort study on oral contraceptives (OC) and thromboembolism (TE).
If OC were associated with TE with RR = 2.0 (TRUTH), the 2x2 for all of MA would look like this:

| Without losses | TE | Normal |
|---|---|---|
| OC+ | 20 | 9,980 |
| OC- | 10 | 9,990 |

If OC users with TE are more likely to be lost than non-OC-users with TE:
There is 40% loss to follow-up overall, but a greater tendency to lose OC users with TE results in a de facto selection.

| Final sample | TE | Normal |
|---|---|---|
| OC+ | 8 | 5,980 |
| OC- | 8 | 5,990 |

RR = (8/5988) / (8/5998) = 1.0 (biased)
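A short sketch makes the differential loss in the OC example explicit by computing the fraction retained in each cell alongside the relative risk before and after losses. The counts are those on the slide; the variable and function names are illustrative.

```python
def risk_ratio(exp_dis, exp_no_dis, unexp_dis, unexp_no_dis):
    """Relative risk: risk in the exposed divided by risk in the unexposed."""
    risk_exposed = exp_dis / (exp_dis + exp_no_dis)
    risk_unexposed = unexp_dis / (unexp_dis + unexp_no_dis)
    return risk_exposed / risk_unexposed


cells = ("OC users with TE", "OC users without TE",
         "non-users with TE", "non-users without TE")
enrolled = (20, 9980, 10, 9990)   # cohort without losses
remaining = (8, 5980, 8, 5990)    # final sample after loss to follow-up

# Fraction of each cell still under observation at the end of the study.
retained = {cell: kept / start for cell, start, kept in zip(cells, enrolled, remaining)}
overall_retained = sum(remaining) / sum(enrolled)

rr_before = risk_ratio(*enrolled)
rr_after = risk_ratio(*remaining)

for cell, fraction in retained.items():
    print(f"{cell}: {fraction:.0%} retained")
print(f"overall: {overall_retained:.0%} retained")
print("RR without losses:", round(rr_before, 2), "RR after losses:", round(rr_after, 2))
```

Overall retention is about 60%, but only 40% of OC users with TE remain compared with 80% of non-users with TE. That imbalance in the diseased cells is what erases the association.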
24
Observation Bias (Information Bias)
Systematic errors due to incorrect categorization.
[Figure: 2x2 table (Exposed / Not Exposed by Diseased / Not Diseased) showing the correct classification.]
25
Misclassification Bias
Subjects are misclassified with respect to their
risk factor status or their outcome, i.e., errors
in classification.
• Non-differential misclassification (random): If errors are about the same in both groups, it tends to minimize any true difference between the groups (bias toward the null).
• Differential misclassification (non-random): If information is better in one group than another, the association may be over- or underestimated.

26
Non-Differential Misclassification
When errors in exposure or outcome status occur
with approximately equal frequency in groups
being compared.
• Difficulty remembering exposures (equal in both groups)
  • Example: Case-control study of heart disease and past activity; difficulty remembering your specific exercise frequency, duration, and intensity over many years.
• Recording and coding errors in records and databases.
  • Example: ICD-9 codes in hospital discharge summaries.
• Using surrogate measures of exposure
  • Example: Using prescriptions for anti-hypertensive medications as an indication of treatment.
• Non-specific or broad definitions of exposure or outcome.
  • Example: "Do you smoke?" to define exposure to tobacco smoke.

27
Non-Differential Misclassification
Random errors in classification of risk factors
or outcome (i.e., error rate about the same in
all groups).
• Example:
  • When patients are discharged, the MD dictates a summary which is transcribed. Diagnoses and procedures noted on the summary are encoded (ICD-9 codes) and sent to the MA Health Data Consortium.
  • MDs don't list all relevant diagnoses.
  • Errors occur in 25-30% of records.

28
Non-Differential Misclassification
Random errors in classification of risk factors
or outcome (i.e., error rate about the same in
all groups).
Effect: Tends to minimize differences, generally causing an underestimate of effect.
Example: A case-control study comparing CAD cases and controls for history of diabetes. Only half of the diabetics are correctly recorded as such in cases and controls.

True relationship:

| | CAD | Controls |
|---|---|---|
| Diabetes | 40 | 10 |
| No diabetes | 60 | 90 |

OR = (40 x 90) / (10 x 60) = 6.0

With nondifferential misclassification:

| | CAD | Controls |
|---|---|---|
| Diabetes | 20 | 5 |
| No diabetes | 80 | 95 |

OR = (20 x 95) / (5 x 80) = 4.75
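The CAD and diabetes example can be reproduced by applying the same classification error to cases and controls. The sketch below reads "only half of the diabetics are correctly recorded" as a sensitivity of 0.5 and assumes no non-diabetic is recorded as diabetic (specificity of 1.0); that second assumption is not stated on the slide, and the function names are illustrative.

```python
def odds_ratio(exp_cases, exp_controls, unexp_cases, unexp_controls):
    """Odds ratio for a 2x2 table: (a*d) / (b*c)."""
    return (exp_cases * unexp_controls) / (exp_controls * unexp_cases)


def apply_misclassification(exposed, unexposed, sensitivity, specificity):
    """Return the (recorded exposed, recorded unexposed) counts for one group."""
    recorded_exposed = sensitivity * exposed + (1 - specificity) * unexposed
    recorded_unexposed = (exposed + unexposed) - recorded_exposed
    return recorded_exposed, recorded_unexposed


SENSITIVITY = 0.5  # half of true diabetics are recorded as diabetic
SPECIFICITY = 1.0  # assumed: no non-diabetic is recorded as diabetic

# True counts: (diabetes, no diabetes) in each group.
cases_true = (40, 60)
controls_true = (10, 90)

# The same error rates are applied to both groups: nondifferential.
cases_rec = apply_misclassification(*cases_true, SENSITIVITY, SPECIFICITY)
controls_rec = apply_misclassification(*controls_true, SENSITIVITY, SPECIFICITY)

true_or = odds_ratio(cases_true[0], controls_true[0], cases_true[1], controls_true[1])
observed_or = odds_ratio(cases_rec[0], controls_rec[0], cases_rec[1], controls_rec[1])

print("recorded cases:", cases_rec, "recorded controls:", controls_rec)
print("true OR:", true_or, "observed OR:", observed_or)
```

The observed odds ratio of 4.75 is closer to 1.0 than the true value of 6.0, which is the bias toward the null described on the slide.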
29
Non-Differential Misclassification
When there are random errors in classification of
risk or outcome, i.e. errors occur with equal
frequency in both groups.
Effect: With a dichotomous exposure, it minimizes differences and causes an underestimate of effect, i.e., bias toward the null.
30
[Figure: 2x2 table (Exposed / Not Exposed by Diseased / Not Diseased) illustrating nondifferential misclassification of exposure, part 1.]
31
[Figure: 2x2 table (Exposed / Not Exposed by Diseased / Not Diseased) illustrating nondifferential misclassification of exposure, part 2.]
32
Validation to Identify Random Misclassification
in a Prospective Cohort Study
• Obesity and heart disease in women (questionnaires)
• Guessing at weight?

Self-reported weights were validated in a subsample of 184 NHS participants living in the Boston, MA area and were highly correlated with actual measured weights (r = 0.96).
Cho E, Manson JE, et al. A Prospective Study of Obesity and Risk of Coronary Heart Disease Among Diabetic Women. Diabetes Care 25:1142-1148, 2002.
33
Differential Misclassification
When there are more frequent errors in exposure
or outcome classification in one of the groups.
• Differences in accurately remembering exposures (unequal)
  • Example: Mothers of children with birth defects will remember the drugs they took during pregnancy better than mothers of normal children (maternal recall bias).
• Interviewer or recorder bias.
  • Example: The interviewer has a subconscious belief about the hypothesis.
• More accurate information in one of the groups.
  • Example: Case-control study with cases from one facility and controls from another, with differences in record keeping.

34
Recall Bias
(Differential)
(If the groups have the same percentage of errors based on faulty memory, that's non-differential misclassification.)
• People with disease may remember exposures
differently (more or less accurately) than those
without disease.
• To Minimize
• Use a control group that has a different disease
(unrelated to the disease under study).
• Use questionnaires that are constructed to
maximize accuracy and completeness. Ask specific
questions. More accuracy means fewer differences.
• For socially sensitive questions, such as alcohol
and drug use or sexual behaviors, use a
interviewer.
• If possible, assess past exposures from
biomarkers or from pre-existing records.

35
Interviewer Bias (Recorder Bias in Chart Reviews)
(Differential)
Systematic difference in soliciting, recording,
or interpreting information.
• Minimized by
• Blinding the interviewers if possible.
• Using standardized questionnaires consisting of
closed-end, easy to understand questions with
appropriate response options.
• Training all interviewers to adhere to the
question and answer format strictly, with the
same degree of questioning for both cases and
controls.
• Obtaining data or verifying data by examining
pre-existing records (e.g., medical records or
employment records) or assessing biomarkers.

36
Effects of Bias
• Non-differential misclassification: bias toward the null.
• Selection bias, interviewer bias, recall bias, and differential misclassification: these are differential and can bias toward or away from the null.
37
Misclassification of Outcome Can Also Introduce
Bias
but it usually has much less of an impact than misclassification of exposure, because:
• Most of the problems with misclassification occur
with respect to exposure status, not outcome.
• There are a number of mechanisms by which
misclassification of exposure can be introduced,
but most outcomes are more definitive and there
are few mechanisms that introduce errors in
outcome.
• Most outcomes are relatively uncommon.
• Misclassification of outcome will generally bias
toward the null, so if an association is
demonstrated, if anything the true effect might
be slightly greater.

38
Any concerns?
A study is conducted to see if serum cholesterol
screening reduces the rate of heart attacks.
1,500 members of an HMO are offered the
opportunity to participate in the screening
program; 600 volunteer to be screened. Their
rates of MI are compared to those of randomly
selected members who were not invited to be
screened. After 3 years of follow-up, rates of MI
are found to be significantly less in the
screened group.
1. No
2. Differential misclassification
3. Interviewer bias
4. Recall bias
5. Selection bias

39
Background Information on Abdominal Aortic
Aneurysms
40
Diagnosis of AAA
• Usually asymptomatic (surgery if > 5 cm)
• Discovered during routine abdominal exam by palpation, or
• Seen on x-ray or ultrasound of abdomen (done for other reasons).
• Known risk factors
• Age
• Male gender
• Smoking
• Hypertension

41
Costa and Robbs, Br. J. Surg. 1986: Abdominal Aneurysms.
• A vascular surgery (referral) service in So. Africa reviewed records of elective peripheral vascular surgery.

Other: a variety of readily apparent conditions.
320 1,862
Conclusion: AAA is uncommon in Blacks and more often due to infections.
OR = 0.12 (0.09-0.15)
42
Was there selection bias?
Other: a variety of readily apparent conditions.
1. Yes
2. No

43
Was there selection bias?
Other: a variety of readily apparent conditions.
44
A possibility of misclassification?
All black patients were screened for TB and for syphilis.

| | Blacks | Whites |
|---|---|---|
| Atherosclerotic | 34 | 99 |
| Inflammatory or infectious | 47 | 0.5 |
| Uncertain etiology | 19 | 0. 0 |

AAA in blacks are more often due to infectious causes.
1. No
2. Yes, random.
3. Yes, differential.

45
| | White | Black |
|---|---|---|
| Male:Female | 2:1 | 1:1 |
| Mean age | 49.4 | 67.1 |
| Admitted for uncontrolled HBP | 0 | 17 |
| Smoking | 76 | 48 |

• (Known risk factors)
• Age
• Male gender
• Smoking
• Hypertension

46
Environmental tobacco smoke and tobacco
related mortality in a prospective study of
Californians, 1960-98. James E. Enstrom, Geoffrey
C. Kabat. BMJ 2003;326:1057
118,094 adults enrolled in an ACS cancer study in 1959 were followed until 1998. For never smokers married to ever smokers compared with never smokers married to never smokers:

| | RR in males | RR in females |
|---|---|---|
| Heart disease | 0.94 (0.85 - 1.05) | 1.01 (0.94 - 1.08) |
| Lung cancer | 0.75 (0.42 - 1.35) | 0.99 (0.72 - 1.37) |
| Chr. pulm. dis. | 1.27 (0.78 - 2.08) | 1.13 (0.80 - 1.58) |

Conclusions: The results do not support a causal relation between environmental tobacco smoke and tobacco related mortality, although they do not rule out a small effect.
47
Environmental tobacco smoke and tobacco
related mortality in a prospective study of
Californians, 1960-98. James E. Enstrom, Geoffrey
C. Kabat. BMJ 2003;326:1057
The independent variable was exposure to environmental tobacco smoke based on smoking status of the spouse in 1959, 1965, and 1972. Never smokers married to a current smoker were subdivided into categories according to the smoking status of their spouse: 1-9, 10-19, 20, 21-39, 40 cigarettes consumed per day for men and women, with the addition of pipe or cigar usage for women. Former smokers were
48
Any potential selection bias in the ETS study?
1. I don't think so.
2. Yes, there was a potential for it.

49
Any potential information bias in the ETS study?
1. I don't think so.
2. Non-differential misclassification.
3. Differential misclassification.
4. Interviewer bias.
5. Recall bias.

50
Are Analgesic Drugs Associated with Increased
Risk of Renal Failure?
• Case-control study in Maryland, Virginia, West Virginia, and D.C.
• Cases: found with renal dialysis registry.
• Controls: random digit dial.
• Data: estimated lifetime analgesic use based on phone interview.

51
Case-Control Study: Analgesic Use and Renal Failure

| Drug | Lifetime use | OR | 95% CI |
|---|---|---|---|
| Acetaminophen | 0-999 | 1.0 | - |
| | 1000-4999 | 2.0 | 1.3-3.2 |
| | > 5000 | 2.4 | 1.2-4.8 |
| Aspirin | 0-999 | 1.0 | - |
| | 1000-4999 | 0.5 | 0.4-0.7 |
| | > 5000 | 1.0 | 0.6-1.8 |
| NSAIDs | 0-999 | 1.0 | - |
| | 1000-4999 | 0.6 | 0.3-1.1 |
| | > 5000 | 8.8 | 1.1-71.8 |

Conclusion: Acetaminophen and NSAIDs increase risk of renal failure, but aspirin does not.
Could any biases have influenced the conclusion?
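One way to read the analgesic table is to check which 95% confidence intervals exclude the null value of 1.0. The sketch below does only that; the odds ratios and intervals are copied from the slide, while the `classify` helper and its labels are illustrative and say nothing about whether bias produced the estimates.

```python
# (drug, lifetime use category, odds ratio, CI lower, CI upper), as listed on the slide.
results = [
    ("Acetaminophen", "1000-4999", 2.0, 1.3, 3.2),
    ("Acetaminophen", ">5000", 2.4, 1.2, 4.8),
    ("Aspirin", "1000-4999", 0.5, 0.4, 0.7),
    ("Aspirin", ">5000", 1.0, 0.6, 1.8),
    ("NSAIDs", "1000-4999", 0.6, 0.3, 1.1),
    ("NSAIDs", ">5000", 8.8, 1.1, 71.8),
]


def classify(lower, upper, null=1.0):
    """Label an interval by where it sits relative to the null value."""
    if lower > null:
        return "increased"
    if upper < null:
        return "decreased"
    return "includes null"


summary = {(drug, use): classify(lower, upper) for drug, use, _, lower, upper in results}

for (drug, use), label in summary.items():
    print(f"{drug:14s} {use:10s} {label}")
```

The NSAID interval for the highest category (1.1-71.8) excludes 1.0 but is very wide, so that estimate is imprecise. The aspirin interval for 1000-4999 lies entirely below 1.0, which is worth keeping in mind when considering the slide's question about bias.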
52
Could interviewer bias have affected results?
1. Highly unlikely.
2. Definitely a possibility.

53
Could recall bias have affected results?
1. Highly unlikely.
2. Definitely a possibility.

54
Reverse Causation
Example: Chronic diabetes is a common cause of
renal failure. Suppose diabetics more frequently
have conditions that require analgesics.
In this case, it may appear that analgesic use
that is greater than in controls is associated
with a greater risk of renal failure.
55
Avoiding Bias
Once it's in the study, you can't fix it.
• Select subjects by similar mechanism.
• Blind interviewers.
• Get subjects with equal tendency to remember.
• Use clear, homogeneous definitions of disease and exposure.
• Get accurate data collected in a similar way.
• Confirm data; use error trapping during data entry.
• Use procedures to minimize loss to follow-up.

57
Confounding By Indication
• A bias that occurs in observational studies of
drug effects. Allocation is not randomized and
drug selection may be influenced by pre-existing
disease.

Example: Physicians might advise their patients
with renal failure not to take aspirin.
58
JK Allen, et al. Disparities in Women's Referral to and Enrollment in Outpatient Cardiac Rehabilitation. J. Gen. Intern. Med. 2004;19:747-753.
253 women (108 African American, 145 white) were surveyed within the first month of discharge from the hospital for a PTCA, CABG, or MI. 234 (99 African American, 135 white) completed the 6-month follow-up.
RESULTS: The rate of referral to outpatient phase 2 cardiac rehabilitation was significantly lower for African-American women compared with white women, 12 (12%) vs. 33 (24%) (P = .03). Only 35 (15%) of women in the study reported enrollment in phase 2 cardiac rehabilitation programs, with fewer African-American women reporting enrollment compared with white women, 9 (9%) versus 26 (19%) (P = .03). Controlling for age, education, angina class, and co-morbidities, women with annual incomes < $20,000 were 66% less likely to be referred to cardiac rehabilitation (P = .01) and 60% less likely to enroll compared to women with incomes > $20,000 (P = .01). Although borderline significant, African-American women were 55% less likely to be referred (P = .059) and 58% less likely to enroll (P = .059) than white women.
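The figures in parentheses in the results appear to be percentages of the women who completed follow-up. A quick arithmetic check, assuming the denominators are the 99 African-American and 135 white women with 6-month data (234 in total), reproduces them; the variable names are illustrative.

```python
# Women who completed the 6-month follow-up, by group (from the abstract).
completed = {"African American": 99, "white": 135}

# Counts reported for referral and enrollment.
referred = {"African American": 12, "white": 33}
enrolled = {"African American": 9, "white": 26}


def percent(count, total):
    """Percentage rounded to the nearest whole number."""
    return round(100 * count / total)


referral_pct = {group: percent(referred[group], completed[group]) for group in completed}
enrollment_pct = {group: percent(enrolled[group], completed[group]) for group in completed}
overall_enrollment_pct = percent(sum(enrolled.values()), sum(completed.values()))

print("referral %:", referral_pct)
print("enrollment %:", enrollment_pct)
print("overall enrollment %:", overall_enrollment_pct)
```

Note that both referral and enrollment were measured by patient self-report at the 6-month interview, so these percentages are only as accurate as the women's recall.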
59
Methods: Women were identified at the time of hospitalization for a coronary event. They were interviewed by telephone within the first 4 weeks following their hospital discharge to collect baseline socio-demographic and clinical data. They were interviewed again 6 months later by telephone to obtain information on referral to and enrollment in cardiac rehabilitation programs, and information on psychosocial and behavioral factors that may be associated with rehabilitation utilization. Interviews were conducted by three trained research assistants.
The 6-month interview assessed the receipt of a referral from self-report of the patient, including the patient's recall of having received a verbal or written referral by a health professional at any time since being hospitalized. For those who reported receiving a referral, the reinforcing factors of the patient's perception of the strength of the health professional's and family/significant others' encouragement to participate in cardiac rehabilitation was measured using a scale of 1 (little or no encouragement) to 10 (strongly encouraged). Enabling factors such as the accessibility, availability, and acceptability of cardiac rehabilitation services were assessed.
60
[Figure: 2x2 table (Exposed / Not Exposed by Diseased / Not Diseased) illustrating differential misclassification of outcome.]
61
[Figure: 2x2 table (Exposed / Not Exposed by Diseased / Not Diseased) illustrating nondifferential misclassification of outcome.]
### Title: Bias Author: LaMorte Last modified by: Lamorte, Wayne W Created Date: 1/23/2006 3:17:34 PM Document presentation format: On-screen Show (4:3) – PowerPoint PPT presentation

Slides: 62
Provided by: LaMo2
Category:
Tags:
Transcript and Presenter's Notes

Title: Bias

1
There are known knowns. These are things we know
that we know. There are known unknowns. That is
to say, there are things that we know we don't
know. But there are also unknown unknowns. There
are things we don't know we don't know. Donald
Rumsfeld
2
Bias
3
Bias
A systematic error (caused by the investigator or
the subjects) that causes an incorrect (over- or
under-) estimate of an association.
True Effect
Relative Risk
0
1.0
10
Protective effect No Difference
Increased risk
4
Suppose a study was conducted multiple times in
an identical way.
True value
Null
Precise Accurate
5
Errors Affecting Validity
Consider
• Chance (Random Error Sampling Error)
• Bias (Systematic Errors inaccuracies)
• Selection bias
• Loss to follow-up bias
• Information bias
• Nondifferential (e.g. simple misclassification)
• Differential Biases (e.g., recall bias,
interviewer bias)
• Confounding (Imbalance in Other Factors)

6
Selection Bias
Occurs when selection, enrollment, or continued
participation in a study is somehow dependent on
the likelihood of having the exposure of interest
or the outcome of interest.
Selection bias can cause an overestimate or
underestimate of the association.
7
• Selection bias can occur in several ways
• Selection of a comparison group ("controls") that
is not representative of the population that
produced the cases in a case-control study.
(Control selection bias)
• Differential loss to follow up in a cohort study,
such that the likelihood of being lost to follow
up is related to outcome status and exposure
status. (Loss to follow-up bias)
• Refusal, non-response, or agreement to
participate that is related to the exposure and
disease (Self-selection bias)
• Using the general population as a comparison
group for an occupational cohort study ("Healthy
worker" effect)
• Differential referral or diagnosis of subjects

8
Selection Bias in a Case-Control Study
Selection bias can occur in a case-control study
if controls are more (or less) likely to be
selected if they have the exposure.
Do women of low SES have higher risk of cervical
cancer?
MGH 100 Hospital Cases
200 Controls Door-to-door survey of
neighborhood around the hospital during work day.
9
200 Controls Door-to-door survey of
neighborhood around the hospital during work day.
• Problems
• SE status of people living around the hospital
may generally be different from that of the
population that produced the cases.
• The door-to-door method of selecting controls may
tend to select people of lower (or higher) SE
status.

10
Selection bias can occur in a case-control study
if controls are more (or less) likely to be
selected if they have the exposure.
Selection bias is not caused by differences in
other potential risk factors (confounding). It is
caused by selecting controls who are more (or
less) likely to have the exposure of interest.
11
Selection Bias in a Case-Control Study
Dis.
Dis.
Y N
Y N
75 100
25 100
75 120
25 80
Y N
Y N
Exp.
Exp.
Control Selection Bias
True
OR3.0
OR2.0
12
Control Selection BiasThe Would Criterion
• Are the controls a representative sample of the
population that produced the cases?
• If a control had developed cervical cancer,
would she have been included in the case group?
(Would criterion)

You should try to fulfill the would criterion
studied, is it likely that they would have ended
up in the case group? If the answer is not
necessarily, then there is likely to be a
problem with selection bias.
13
2,000,000 women gt age 20 in MA, about 200 cases
of cervical cancer per year.
If low SES were associated with cervical cancer
with OR3.0, MA would look like this.
Entire Population Cancer Cases Normal
Low SES (ltmedian) 150 1,000,000
High SES (gtmedian) 50 1,000,000
Sample Cancer Cases Normal
Low SES 75
High SES 25
OR (75/25) 2.0 (120/80)
(Biased)
14
Are mothers of children with hemifacial
microsomia more often diabetic?
Cases are referred, but what if controls are
selected from the general pediatrics ward at MGH?
Referred Cases
Referral mechanism of controls might be very
different from that of the cases with microsomia.
Could mothers of controls be more or less likely
to be diabetic than the cases (regardless of any
association between diabetes and microsomia)?
How would you select controls for this study?
15
Self- Selection Bias in a Case-Control Study
Selection bias can be introduced into
case-control studies with low response or
participation rates if the likelihood of
responding or participating is related to both
the exposure and outcome. Example A
case-control study explored an association
between family history of heart disease
(exposure) and the presence of heart disease in
subjects. Volunteers are recruited from an HMO.
Subjects with heart disease may be more likely to
participate if they have a family history of
disease.
16
Self-Selection Bias in a Case-Control Study
Dis.
Dis.
Y N
Y N
300 200
200 300
240 (80) 120 (60)
120 (60) 180 (60)
Y N
Y N
Exp.
Exp.
Self-Selection Bias
True
OR2.25
OR3.0
Best solution is to work toward high
participation (gt80) in all groups.
17
Selection Bias in a Retrospective Cohort Study
In a retrospective cohort study selection bias
occurs if selection of exposed non-exposed
subjects is somehow related to the outcome.
What will be the result if the investigators are
more likely to select an exposed person if they
have the outcome of interest?
18
Selection Bias in a Retrospective Cohort Study
Example Investigating occupational exposure (an
organic solvent) occurring 15-20 yrs. ago in a
factory. Exposed unexposed subjects are
enrolled based on employment records, but some
records were lost.
Suppose there was a greater likelihood of
retaining records of those who were exposed got
disease.
19
Selection Bias in a Retrospective Cohort Study
Differential referral or diagnosis of subjects
Dis.
Dis.
Y N
Y N
100 900
50 950
99 720
40 760
Y N
Y N
Exp.
Exp.
20 of employee health records were lost or
discarded, except in solvent workers who
reported illness (1 loss).
True
RR2.0
RR2.42
Workers in the exposed group were more likely to
be included if they had the outcome of interest.
20
The Healthy Worker Effect
Can be considered a form of selection bias
because the general population controls have a
higher probability of getting the outcome (death).
The general population is often used in
occupational studies of mortality, since data is
readily available, and they are mostly unexposed.
vs.
Mortality Rates?
The main disadvantage is bias by the healthy
worker effect. The employed work force (mostly
healthy) generally has lower rates of mortality
and disease than the general population (with
healthy ill people).
21
Differential Retention (Loss to Follow Up) in
Prospective Cohort Studies
Enrollment into a prospective cohort study will
not be biased by the outcome, because the outcome
has not occurred at enrollment. However,
prospective cohort studies can have selection
bias if the exposure groups have differential
retention of subjects with the outcomes of
interest. This can cause either an over- or
under- estimate of association
22
Selection Bias in a Prospective Cohort Study
More events lost in one exposure group
Dis.
Dis.
Y N
Y N
20 9980
10 9990
8 5980
8 5990
Y N
Y N
Exp.
Exp.
True
OR2.0
RR1.0
23
Differential loss to follow up in a prospective
cohort study on oral contraceptives (OC)
thromboembolism (TE).
If OC were associated with TE with RR2.0
(TRUTH), the 2x2 for all of MA would look like
this.
Without Losses TE Normal
OC 20 9,980
OC- 10 9,990
If OC users with TE are more likely to be lost
than non-OC-users with TE
There is 40 loss to follow up overall, but a
greater tendency to loose OC users with TE
results in a de facto selection.
Final Sample TE Normal
OC 8 5,980
OC- 8 5,990
(Biased)
RR (8/5988) 1.0 (8/5998)
24
Observation Bias (Information Bias)
Systematic errors due to incorrect categorization.
Diseased Not Diseased
Exposed
Not Exposed
The Correct Classification
25
Misclassification Bias
Subjects are misclassified with respect to their
risk factor status or their outcome, i.e., errors
in classification.
Non-differential Misclassification (random) If
errors are about the same in both groups, it
tends to minimize any true difference between the
groups (bias toward the null).
• Differential Misclassification (non-random)
If information is better in one group than
another, the association maybe over- or
underestimated.

26
Non-Differential Misclassification
When errors in exposure or outcome status occur
with approximately equal frequency in groups
being compared.
• Difficulty remembering exposures (equal in both
groups)
• Example Case-control study of heart disease and
past activity difficulty remembering your
specific exercise frequency, duration, intensity
over many years
• Recording and coding errors in records and
databases.
• Example ICD-9 codes in hospital discharge
summaries.
• Using surrogate measures of exposure
• Example Using prescriptions for
anti-hypertensive medications as an indication of
treatment
• Non-specific or broad definitions of exposure or
outcome.
• Example Do you smoke? to define exposure to
tobacco smoke.

27
Non-Differential Misclassification
Random errors in classification of risk factors
or outcome (i.e., error rate about the same in
all groups).
• Example
• When patients are discharged, the MD dictates a
summary which is transcribed. Diagnoses and
procedures noted on the summary are encoded
(ICD-9 codes) and sent to the MA Health Data
Consortium.
• MDs dont list all relevant diagnoses.
• Errors occur in 25-30 of records.

28
Non-Differential Misclassification
Random errors in classification of risk factors
or outcome (i.e., error rate about the same in
all groups).
Tends to minimize differences, generally causing
an underestimate of effect.
Effect
Example A case-control study comparing CAD cases
controls for history of diabetes. Only half of
the diabetics are correctly recorded as such in
cases and controls.
With Nondifferential Misclassification
True Relationship
CAD Controls Diabetes 40 10 No
diabetes 60 90 OR 40x90 6.0 10x60
5 No diabetes 80 95 OR 20x95 4.75
5x80
29
Non-Differential Misclassification
When there are random errors in classification of
risk or outcome, i.e. errors occur with equal
frequency in both groups.
Effect: With a dichotomous exposure, it
minimizes differences and causes an underestimate
of effect, i.e., bias toward the null.
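A small sketch can illustrate why equal error rates pull a dichotomous exposure toward the null. It is an illustration under stated assumptions, not material from the slides: the default counts are borrowed from the CAD and diabetes example, the function name `observed_odds_ratio` is hypothetical, and the same sensitivity and specificity are applied to cases and controls, with their sum greater than 1.

```python
def observed_odds_ratio(sensitivity, specificity, a=40, b=60, c=10, d=90):
    """Odds ratio after nondifferential exposure misclassification.

    a, b = truly exposed / unexposed cases
    c, d = truly exposed / unexposed controls
    The same sensitivity and specificity apply to both groups.
    """
    a_obs = a * sensitivity + b * (1 - specificity)   # cases recorded exposed
    b_obs = a * (1 - sensitivity) + b * specificity   # cases recorded unexposed
    c_obs = c * sensitivity + d * (1 - specificity)   # controls recorded exposed
    d_obs = c * (1 - sensitivity) + d * specificity   # controls recorded unexposed
    return (a_obs * d_obs) / (c_obs * b_obs)


true_or = observed_odds_ratio(1.0, 1.0)  # perfect classification

for sens in (0.9, 0.8, 0.7, 0.6):
    for spec in (0.9, 0.8, 0.7, 0.6):
        biased = observed_odds_ratio(sens, spec)
        print(f"sens={sens:.1f} spec={spec:.1f} observed OR={biased:.2f}")
```

Every observed odds ratio in this grid falls between 1.0 and the true value, so the association is understated but never reversed.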
30
Nondifferential Misclassification of Exposure 1
[Figure: 2x2 table, rows Exposed / Not Exposed, columns Diseased / Not Diseased]
31
Nondifferential Misclassification of Exposure 2
[Figure: 2x2 table, rows Exposed / Not Exposed, columns Diseased / Not Diseased]
32
Validation to Identify Random Misclassification
in a Prospective Cohort Study
• Obesity and heart disease in women
(questionnaires)
• Guessing at weight?

Self-reported weights were validated in a
subsample of 184 NHS participants living in the
Boston, MA area and were highly correlated with
actual measured weights (r = 0.96).
Cho E, Manson JE, et al. A Prospective Study of
Obesity and Risk of Coronary Heart Disease Among
Diabetic Women. Diabetes Care 25:1142-1148, 2002.
33
Differential Misclassification
When there are more frequent errors in exposure
or outcome classification in one of the groups.
• Differences in accurately remembering exposures
(unequal)
• Example: Mothers of children with birth defects
will remember the drugs they took during
pregnancy better than mothers of normal children
(maternal recall bias).
• Interviewer or recorder bias.
• Example: Interviewer has a subconscious belief
about the hypothesis.
• More accurate information in one of the groups.
• Example: Case-control study with cases from one
facility and controls from another with
differences in record keeping.

34
Recall Bias
(Differential)
(If the groups have the same % of errors based on
faulty memory, that's non-differential
misclassification.)
• People with disease may remember exposures
differently (more or less accurately) than those
without disease.
• To Minimize:
• Use a control group that has a different disease
(unrelated to the disease under study).
• Use questionnaires that are constructed to
maximize accuracy and completeness. Ask specific
questions. More accuracy means fewer differences.
• For socially sensitive questions, such as alcohol
and drug use or sexual behaviors, use a
self-administered questionnaire instead of an
interviewer.
• If possible, assess past exposures from
biomarkers or from pre-existing records.

35
Interviewer Bias (Recorder Bias in Chart
Reviews)
(Differential)
Systematic difference in soliciting, recording,
or interpreting information.
• Minimized by:
• Blinding the interviewers if possible.
• Using standardized questionnaires consisting of
closed-ended, easy-to-understand questions with
appropriate response options.
• Training all interviewers to adhere to the
question and answer format strictly, with the
same degree of questioning for both cases and
controls.
• Obtaining data or verifying data by examining
pre-existing records (e.g., medical records or
employment records) or assessing biomarkers.

36
Effects of Bias
• Non-differential misclassification: bias toward
the null.
• Selection bias, interviewer bias, recall bias,
and differential misclassification: these are
differential and can bias toward or away from
the null.
37
Misclassification of Outcome Can Also Introduce
Bias
• but it usually has much less of an impact than
misclassification of exposure, because
• Most of the problems with misclassification occur
with respect to exposure status, not outcome.
• There are a number of mechanisms by which
misclassification of exposure can be introduced,
but most outcomes are more definitive and there
are few mechanisms that introduce errors in
outcome.
• Most outcomes are relatively uncommon.
• Misclassification of outcome will generally bias
toward the null, so if an association is
demonstrated, if anything the true effect might
be slightly greater.

38
Any concerns?
A study is conducted to see if serum cholesterol
screening reduces the rate of heart attacks.
1,500 members of an HMO are offered the
opportunity to participate in the screening
program, and 600 volunteer to be screened. Their
rates of MI are compared to those of randomly
selected members who were not invited to be
screened. After 3 years of follow-up, rates of MI
are found to be significantly lower in the
screened group.
1. No
2. Differential misclassification
3. Interviewer bias
4. Recall bias
5. Selection bias

39
Background Information on Abdominal Aortic
Aneurysms
40
Diagnosis of AAA
• Usually asymptomatic (surgery if > 5 cm)
• Discovered during routine abdominal exam by
palpation, or
• Seen on x-ray or ultrasound of abdomen (done for
other reasons).
• Known risk factors
• Age
• Male gender
• Smoking
• Hypertension

41
Costa & Robbs, Br. J. Surg. 1986: Abdominal
Aneurysms.
• A vascular surgery (referral) service in South
Africa reviewed records of elective peripheral
vascular surgery.

Other: a variety of readily apparent
conditions.
320 1,862
Conclusion: AAA uncommon in Blacks and more
often due to infections.
OR = 0.12 (0.09 - 0.15)
42
Was there selection bias?
Other: variety of readily apparent conditions.
1. Yes
2. No

43
Was there selection bias?
Other: variety of readily apparent conditions.
44
A possibility of misclassification?
All black patients were screened for TB and
for syphilis.
                              Blacks   Whites
Atherosclerotic                 34       99
Inflammatory or Infectious      47       0.5
Uncertain etiology              19       0. 0
AAA in blacks are more often due to infectious
causes.
1. No
2. Yes, random.
3. Yes, differential.

45
                                White   Black
Male:Female                      2:1     1:1
Mean age                         49.4    67.1
Admitted for Uncontrolled HBP    0       17
Smoking                          76      48
• (Known risk factors)
• Age
• Male gender
• Smoking
• Hypertension

46
Environmental tobacco smoke and tobacco
related mortality in a prospective study of
Californians, 1960-98. James E. Enstrom, Geoffrey
C. Kabat. BMJ 2003;326:1057
118,094 adults enrolled in an ACS cancer study in
1959 were followed until 1998. For never smokers
married to ever smokers compared with never
smokers married to never smokers:
                  RR in Males          RR in Females
Heart disease     0.94 (0.85 - 1.05)   1.01 (0.94 - 1.08)
Lung cancer       0.75 (0.42 - 1.35)   0.99 (0.72 - 1.37)
Chr. Pulm. Dis.   1.27 (0.78 - 2.08)   1.13 (0.80 - 1.58)
Conclusions: The results do not support a causal
relation between environmental tobacco smoke and
tobacco related mortality, although they do not
rule out a small effect.
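One way to read the table is to check each interval against the null value of 1.0. The sketch below does only that, using the relative risks and limits quoted on the slide; the helper name `includes_null` and the dictionary layout are illustrative choices, not part of the study.

```python
# (RR, lower limit, upper limit) as quoted on the slide.
results = {
    ("Heart disease", "males"): (0.94, 0.85, 1.05),
    ("Heart disease", "females"): (1.01, 0.94, 1.08),
    ("Lung cancer", "males"): (0.75, 0.42, 1.35),
    ("Lung cancer", "females"): (0.99, 0.72, 1.37),
    ("Chr. Pulm. Dis.", "males"): (1.27, 0.78, 2.08),
    ("Chr. Pulm. Dis.", "females"): (1.13, 0.80, 1.58),
}


def includes_null(lower, upper, null=1.0):
    """True when the interval contains the null value (no association)."""
    return lower <= null <= upper


for (outcome, sex), (rr, lower, upper) in results.items():
    flag = "includes 1.0" if includes_null(lower, upper) else "excludes 1.0"
    print(f"{outcome:16s} {sex:8s} RR={rr:.2f} ({lower:.2f}-{upper:.2f}) {flag}")

all_include_null = all(includes_null(lo, hi) for _, lo, hi in results.values())
```

Every interval contains 1.0, and every upper limit is above 1.0, which matches the wording of the conclusion: no demonstrated association, but a small effect is not ruled out.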
47
Environmental tobacco smoke and tobacco
related mortality in a prospective study of
Californians, 1960-98. James E. Enstrom, Geoffrey
C. Kabat. BMJ 2003;326:1057
The independent variable was exposure to
environmental tobacco smoke based on smoking
status of the spouse in 1959, 1965, and
1972. Never smokers married to a current
smoker were subdivided into categories according
to the smoking status of their spouse: 1-9,
10-19, 20, 21-39, ≥40 cigarettes consumed per day
for men and women, with the addition of pipe or
cigar usage for women. Former smokers were
48
Any potential selection bias in the ETS study?
1. I don't think so.
2. Yes, there was a potential for it.

49
Any potential information bias in the ETS study?
1. I don't think so.
2. Non-differential misclassification.
3. Differential misclassification.
4. Interviewer bias.
5. Recall bias.

50
Are Analgesic Drugs Associated with Increased
Risk of Renal Failure?
• Case-control study in Maryland, Virginia, West
Virginia, D.C.
• Cases: found with renal dialysis registry.
• Controls: random digit dialing.
• Data: Estimated lifetime analgesic use based on
phone interview.

51
Case-Control Study: Analgesic Use and Renal Failure
                 OR     95% CI
Acetaminophen
  0-999          1.0    -
  1000-4999      2.0    1.3-3.2
  >5000          2.4    1.2-4.8
Aspirin
  0-999          1.0    -
  1000-4999      0.5    0.4-0.7
  >5000          1.0    0.6-1.8
NSAIDs
  0-999          1.0    -
  1000-4999      0.6    0.3-1.1
  >5000          8.8    1.1-71.8
Conclusion: Acetaminophen and NSAIDs increase risk
of renal failure, but not aspirin.
Could any biases have influenced the conclusion?
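Before turning to bias, it can help to see how much chance alone could explain. The sketch below is an illustration, not part of the study: it flags which intervals exclude 1.0 and uses the ratio of the upper to the lower limit as a rough index of precision. The helper names `excludes_null` and `ci_ratio` are hypothetical.

```python
# (drug, lifetime-use category, OR, lower limit, upper limit) from the slide.
rows = [
    ("Acetaminophen", "1000-4999", 2.0, 1.3, 3.2),
    ("Acetaminophen", ">5000", 2.4, 1.2, 4.8),
    ("Aspirin", "1000-4999", 0.5, 0.4, 0.7),
    ("Aspirin", ">5000", 1.0, 0.6, 1.8),
    ("NSAIDs", "1000-4999", 0.6, 0.3, 1.1),
    ("NSAIDs", ">5000", 8.8, 1.1, 71.8),
]


def excludes_null(lower, upper, null=1.0):
    """True when the whole interval lies above or below the null value."""
    return lower > null or upper < null


def ci_ratio(lower, upper):
    """Upper/lower limit ratio: larger values mean a less precise estimate."""
    return upper / lower


summary = [
    (drug, dose, excludes_null(lo, hi), ci_ratio(lo, hi))
    for drug, dose, _, lo, hi in rows
]

for drug, dose, significant, ratio in summary:
    print(f"{drug:14s} {dose:10s} excludes 1.0: {significant!s:5s} CI ratio: {ratio:.1f}")

least_precise = max(summary, key=lambda row: row[3])
```

The highest NSAID category has by far the widest interval, so that estimate is imprecise even before any selection or information bias is considered.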
52
Could interviewer bias have affected results?
1. Highly unlikely.
2. Definitely a possibility.

53
Could recall bias have affected results?
1. Highly unlikely.
2. Definitely a possibility.

54
Reverse Causation
Example: Chronic diabetes is a common cause of
renal failure. Suppose diabetics more frequently
have conditions that require analgesics.
In this case, analgesic use that is greater than
in controls may appear to be associated with a
greater risk of renal failure.
55
Avoiding Bias
Once it's in the study, you can't fix it.
• Select subjects by similar mechanism.
• Blind interviewers.
• Get subjects with equal tendency to remember.
• Use clear, homogeneous definitions of disease
and exposure.
• Get accurate data collected in a similar way.
• Confirm data; use error trapping during data
entry.
• Use procedures to minimize loss to follow-up.

57
Confounding By Indication
• A bias that occurs in observational studies of
drug effects. Allocation is not randomized and
drug selection may be influenced by pre-existing
disease.

Example: Physicians might advise their patients
with renal failure not to take aspirin.
58
JK Allen, et al. Disparities in Women's Referral
to and Enrollment in Outpatient Cardiac
Rehabilitation. J. Gen. Intern. Med.
2004;19:747-753.
253 women (108 African American, 145 white) were
surveyed within the first month of discharge from
the hospital for a PTCA, CABG, or MI. 234 (99
African American, 135 white) completed the
6-month follow-up. RESULTS: The rate of referral
to outpatient phase 2 cardiac rehabilitation was
significantly lower for African-American women
compared with white women, 12 (12%) vs. 33 (24%)
(P = .03). Only 35 (15%) of women in the study
reported enrollment in phase 2 cardiac
rehabilitation programs, with fewer
African-American women reporting enrollment
compared with white women, 9 (9%) versus 26 (19%)
(P = .03). Controlling for age, education, angina
class, and co-morbidities, women with annual
incomes <$20,000 were 66% less likely to be
referred to cardiac rehabilitation (P = .01) and
60% less likely to enroll compared to women with
incomes >$20,000 (P = .01). Although borderline
significant, African-American women were 55% less
likely to be referred (P = .059) and 58% less
likely to enroll (P = .059) than white women.
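The percentages in the abstract can be reproduced from the counts it quotes. This is a minimal check under one assumption that the slide does not state explicitly: the denominators are the women who completed the 6-month follow-up (99 African American, 135 white). The variable and function names are illustrative.

```python
# Counts quoted in the abstract. Using the 6-month completers as
# denominators is an assumption made for this check.
completed = {"African American": 99, "white": 135}
referred = {"African American": 12, "white": 33}
enrolled = {"African American": 9, "white": 26}


def percent(count, total):
    """Crude (unadjusted) percentage."""
    return 100 * count / total


referral_pct = {g: percent(referred[g], completed[g]) for g in completed}
enrollment_pct = {g: percent(enrolled[g], completed[g]) for g in completed}
overall_enrollment_pct = percent(sum(enrolled.values()), sum(completed.values()))

for group in completed:
    print(
        f"{group:17s} referred {referral_pct[group]:.1f}%  "
        f"enrolled {enrollment_pct[group]:.1f}%"
    )
print(f"overall enrollment {overall_enrollment_pct:.1f}%")
```

These are crude proportions. The "55% less likely" and "58% less likely" figures in the abstract are adjusted for other factors, so they are not expected to match a simple ratio of these percentages.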
59
Methods: Women were identified at the time of
hospitalization for a coronary event. They were
interviewed by telephone within the first 4 weeks
following their hospital discharge to collect
baseline socio-demographic and clinical data.
They were interviewed again 6 months later by
telephone to obtain information on referral to
and enrollment in cardiac rehabilitation
programs, and information on psychosocial and
behavioral factors that may be associated with
rehabilitation utilization. Interviews were
conducted by three trained research assistants.
The 6-month interview assessed the receipt of
a referral from self-report of the patient,
including the patient's recall of having received
a verbal or written referral by a health
professional at any time since being
hospitalized. For those who reported receiving a
referral, the reinforcing factors of the
patient's perception of the strength of the
health professional's and family/significant
other's encouragement to participate in cardiac
rehabilitation were measured using a scale of 1
(little or no encouragement) to 10 (strongly
encouraged). Enabling factors such as the
accessibility, availability, and acceptability of
cardiac rehabilitation services were assessed.
60
Differential Misclassification of Outcome
[Figure: 2x2 table, rows Exposed / Not Exposed, columns Diseased / Not Diseased]
61
Nondifferential Misclassification of Outcome
[Figure: 2x2 table, rows Exposed / Not Exposed, columns Diseased / Not Diseased]