Transcript and Presenter's Notes

Title: Bias and Confounding / Play or Chance / Measure of Association


1
Bias and Confounding
Play or Chance
Measure of Association
  • Introduction to Epidemiology
  • Fall, 2001

2
Objectives - Bias
  • Define the following
  • Bias (precision / accuracy)
  • Selection bias
  • Information bias
  • Recall bias
  • Interviewer bias

3
Definition
  • Bias is introduced by any systematic error in the
    design, conduct, or analysis of a study that
    results in a mistaken estimate of the exposure's
    effect on the risk of disease.
  • Schlesselman and Stolley, 1982

4
Are these results valid?
  • We know how to measure associations (RR, OR, AR,
    EF)
  • We can explain these associations with words
  • However, are the results valid?
  • Are they a true representation of the truth?
    That is, are they unbiased?

5
Random or Systematic Errors
  • Random error refers to imprecision
  • Governed by chance
  • Systematic error refers to mistakes
  • Also called bias

6
Random Errors
  • Random error is governed by chance
  • small sample size
  • biological variability
  • instrument variability
  • chance variation
  • Can often be fixed by increasing the number of
    study subjects
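A minimal simulation, in Python with NumPy and entirely hypothetical numbers, can make the distinction concrete: enlarging the sample shrinks the random scatter around the estimate, but a systematic offset (bias) stays the same no matter how many subjects are added.

```python
import numpy as np

rng = np.random.default_rng(0)
true_mean = 120.0        # hypothetical true population value
offset = 5.0             # systematic error, e.g. a miscalibrated instrument

for n in (25, 2500, 250000):
    # each measurement = truth + random variation + systematic offset
    sample = true_mean + rng.normal(0, 10, size=n) + offset
    print(f"n={n:>6}  estimate={sample.mean():7.2f}  "
          f"error vs truth={sample.mean() - true_mean:+.2f}")
# The random scatter shrinks as n grows, but the ~+5 offset (bias) remains.
```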

7
BIAS
  • Two major types to consider
  • selection bias: non-comparable criteria used to
    enroll participants
  • information bias: non-comparable information
    obtained, due to interviewer or recall bias

8
BIAS
  • Bias has to do with research design
  • Bias results from systematic flaws in
  • study design
  • data collection
  • analysis
  • interpretation

9
BIAS
  • Bias is the difference between the expected value
    of an estimate and the real population
    parameter it purports to estimate
  • Bias is an attribute of methodology

10
Selection Bias
  • If the study population is selected in a way that
    represents the target population in terms of the
    distribution of the variables of interest, and
    the data are collected in a way that reflects the
    real status of the individual in terms of the
    presence or absence of the variables of interest,
    then bias is minimized in the study.

11
Selection Bias
  • Telephone survey at 10 a.m. Monday morning.
  • Interview the first 100 people who answer the
    phone.
  • Is this a representative sample?
  • What groups would be systematically excluded from
    the sample?

12
Selection Bias
  • A distortion in a measure of disease frequency or
    association resulting from the manner in which
    subjects are selected for the study

13
Selection Bias
  • When the sample is not representative of the
    target population
  • When selection was related to either exposure or
    disease

14
Selection Bias
  • Alf Landon was predicted to win the election
    against Franklin Roosevelt
  • Interviews by phone
  • Few people had phones
  • Rich people had phones
  • Rich people were more likely to be Republicans

15
Selection Bias
  • How to minimize selection bias
  • always try to avoid human choice in the selection
    of a sample
  • (depends on your study design)
  • whenever possible, use random sampling mechanisms

16
Selection Bias
  • Survival bias
  • Exposed cases do not have the same survival as
    non-exposed cases
  • Non-response bias
  • Participants are different from non-participants
  • Publicity bias
  • News media may affect behavior

17
Selection Bias
  • Healthy worker effect
  • ill and chronically disabled people are excluded
    from the work-force
  • Time or place bias
  • health events or exposures may not occur
    symmetrically over time

18
Selection Bias
  • Selection Bias involves errors in determining
  • who to select and
  • how they will be selected

19
Selection Bias
  • On WHO to select
  • We want to select groups from the diseased and
    non-diseased populations whose distribution of
    exposure does not differ from that in the target
    population.

20
Selection Bias
  • On HOW to select
  • The choice of the study population might be
    valid, but the way we choose to sample from the
    study population might introduce bias

21
Selection Bias - WHO
  • In a hospital-based case-comparison study of the
    association of CHD and alcohol consumption, the
    comparison group should consist of those without
    CHD. A poor choice of a non-CHD group would be
    patients admitted for liver disease, because of
    the known high alcohol intake level among that
    group.

22
Selection Bias - WHO
  • The results from such a study will tend to show
    no association between alcohol intake and CHD
    simply because the comparison group that was
    chosen had a high exposure level.

23
Selection Bias - HOW
  • In a community-based case-comparison study of
    the association of CHD and alcohol consumption,
    comparisons are recruited by placing ads in all
    the local community papers, including the Baptist
    Weekly Crier and the Women's Temperance
    Newsletter.

24
Selection Bias - HOW
  • The problem with such a sampling method is that
    the volunteers who respond to the ads might have
    a different prevalence of exposure than that of
    the general population
  • This type of bias is referred to as "volunteer
    bias"

25
Selection Bias
  • Etiology of homosexuality (1962)
  • Three questionnaires sent to members of the New
    York-based psychoanalytic society
  • Psychiatrists completed forms on homosexual
    patients
  • If fewer than 3, use remaining forms for male
    heterosexuals as controls

26
Information Bias
  • Assume your initial decision on whom to select as
    diseased and non-diseased individuals is correct
  • (i.e. your non-diseased individuals really do
    represent all non-diseased individuals in regard
    to exposure).
  • However, you incorrectly divide them into exposed
    or non-exposed
  • because you do not accurately measure the
    exposure (e.g. your information on exposure is
    faulty).

27
Information Bias
  • If this happened to a different extent in the
    diseased and non-diseased groups then bias is
    introduced
  • Misclassification
  • Interviewer
  • Recall

28
Information Bias
  • Reproducibility or precision
  • The probability that multiple measurements of
    exposure or outcome will yield the same results

29
Information Bias
  • An association between cervical cancer and
    circumcision of primary sexual partner was
    described in 1954

30
Information Bias
  • The first study also asked women about the
    circumcision status of their sexual partners
  • A second study was conducted and information was
    collected on religion, and on circumcision from
    females and from their sexual partners

31
Information Bias
  • Also - men were asked to confirm their
    circumcision status with a physical examination

32
Information Bias
  • The original study was criticized because it did
    not take into account religion.
  • Jewish and Muslim men are more likely to be
    circumcised - and their religious beliefs may
    influence their sexual practices

33
Interviewer Bias
  • Interviewer Bias Example
  • An interviewer might ask the comparisons
  • INTERVIEWER: On the average, how many cups of
    coffee did you drink per day when you were 25
    years old?
  • COMPARISON: About two.
  • INTERVIEWER: Thank you.

34
Interviewer Bias
  • INTERVIEWER: On the average, how many cups of
    coffee did you drink per day when you were 25
    years old?
  • CASE: About two.
  • INTERVIEWER: Are you including coffee from
    coffee breaks? What about decaffeinated coffee,
    is that included?
  • CASE: OH! OOPS, no, no, no, not two. Three
    cups, all decaffeinated.

35
Interviewer Bias
  • Obviously information about exposure was
    prompted more thoroughly from the case than from
    the control, possibly leading to
    misclassification of exposure more among the
    controls than among the cases.

36
Recall Bias
  • Additionally, if there has been publicity on the
    adverse effects of coffee, particularly in regard
    to cancer, who do you think is more likely to
    overestimate, or perhaps recall more accurately,
    their past coffee consumption?
  • Cases or controls?
  • Why?

37
Bias and Measures of Association
  • Depending on how they operate in specific
    circumstances, selection bias and information
    bias can distort the true association in every
    conceivable way. They can:
  • Create a positive or negative association where
    none exists.
  • Change an association from positive to negative,
    or vice versa.

38
Bias and Measures of Association
  • Make an association appear stronger than it truly
    is.
  • Make an association appear weaker than it truly
    is, or eliminate it entirely.

39
Controlling BIAS
  • CONTROL IN DESIGN PHASE!
  • Prepare a manual that describes in detail the
    procedures for selecting participants.
  • Thoroughly train study personnel in these
    procedures
  • Standardization of procedures, including tight
    control over the conduct of these procedures

40
Controlling BIAS
  • In a hospital-based study, consider the
    possibility of obtaining a second control from
    the general population
  • Select a population that can be followed with
    little or no loss to follow-up
  • Choose study groups to be representative of the
    target groups

41
Controlling BIAS
  • Prepare a detailed manual of operations that
    covers all aspects of data collection. Allow no
    room for individual interpretation of procedures.
  • Train all study personnel. Establish minimum
    criteria for performance.

42
Controlling BIAS
  • In multi-center projects, use central facilities
    for interpreting and analyzing data, e.g., a
    central laboratory for doing blood chemistries

43
Controlling BIAS
  • Maintain tight quality control, e.g., by sending
    blind replicates to your laboratory, retesting
    technicians, holding retraining sessions, and
    collecting data on reliability and validity
  • Keep morale high for participants and study
    personnel

44
Controlling BIAS
  • Sources of data and methods for collecting data
    should be the same for all participants
    regardless of exposure status or disease

45
Controlling BIAS
  • Whenever feasible, data on exposure should be
    obtained by study personnel who are unaware of a
    participant's outcome status
  • Data on occurrence of outcomes should be obtained
    and evaluated without knowledge of exposure
    status.

46
BIAS
  • Bias occurs due to errors in the design and
    execution of the study.
  • Bias MUST be addressed before the study is
    conducted.
  • IT IS VERY DIFFICULT to correct data that was
    collected with bias.

47
Confounding
  • Confounding is revealed during the analysis of
    the study
  • Confounding is not an error in design or
    execution
  • Confounding can be assessed during the analysis
    stage of the study

48
Confounding
  • a mixing of effects
  • between the exposure, the disease, and other
    factors associated with both the exposure and the
    disease
  • such that the effects of the two processes are
    not separated.

49
Confounding
  • A bias due to the association of a third variable
    with both the exposure and the disease
    independently and the failure to disassociate the
    third variable from the association under study

50
Confounding
  • What is a confounding variable?
  • A variable which distorts an association wholly
    or partially due to its association with both the
    outcome (disease) and the exposure under study
    independently.

51
Confounding
  • the variable must be associated with the disease
    (i.e., the confounder itself may be a risk
    factor).
  • the variable is associated with the exposure
    independently of the disease
  • the results of the association under study must
    be confounded (i.e., the result achieved is
    false)

52
Confounding
  • It is not necessary that the confounding variable
    be causally or significantly associated with the
    disease or exposure

53
Confounding
Diagram: observed association between Coffee (exposure) and Cancer
(presumed causation), with Smoking, Alcohol, and other factors as
potential confounders associated with both.
54
Confounding
Diagram: observed association between Low SES (exposure) and
Hypertension (outcome), with Race/Ethnicity as a potential confounder.
55
Confounding
Diagram: observed association between Obesity (exposure) and
Hypertension (outcome), with Age as a potential confounder.
56
Confounding
Diagram: observed association between Gambling (exposure) and Cancer
(outcome), with Smoking, Alcohol, and other factors as potential
confounders.
57
Confounding
  • HYPOTHESIS: Is the incidence of coronary heart
    disease greater among men who drink coffee than
    among men who do not drink coffee?
  • DISEASE: Coronary heart disease
  • EXPOSURE: History of coffee drinking
  • POTENTIAL CONFOUNDER: Smoking

58
Confounding
  • To assess whether or not smoking confounds the
    association between coronary heart disease and
    coffee drinking, three questions must be answered.
  • What are these three questions?

59
Confounding
  • 1) Is smoking associated with coffee drinking
    (the exposure)?
  • 2) Is smoking associated with coronary heart
    disease (the disease)?
  • 3) Does the odds ratio for the association
    between the exposure and the disease differ when
    you consider the confounding variable? (See the
    sketch below.)
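One way to answer question 3 is to compare the crude odds ratio with odds ratios computed within strata of the suspected confounder. The sketch below uses made-up counts, not data from any study; the pattern it produces (stratum-specific ORs near 1 but a crude OR near 4) is the classic signature of confounding by smoking.

```python
def odds_ratio(a, b, c, d):
    """OR for a 2x2 table: a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls."""
    return (a * d) / (b * c)

# Hypothetical counts: coffee drinking (exposure) vs CHD (disease),
# stratified by smoking (the potential confounder)
smokers     = dict(a=80, b=40, c=20, d=10)
non_smokers = dict(a=10, b=90, c=30, d=270)
crude       = dict(a=90, b=130, c=50, d=280)   # the two strata collapsed

print("crude OR      :", round(odds_ratio(**crude), 2))        # ~3.88
print("smokers OR    :", round(odds_ratio(**smokers), 2))      # 1.0
print("non-smokers OR:", round(odds_ratio(**non_smokers), 2))  # 1.0
# Stratum-specific ORs agree with each other but differ from the crude OR,
# so smoking is distorting the crude coffee-CHD association.
```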

60
Confounding
  • Detecting and removing spurious associations from
    related variables can be done at
  • the design stage, and/or
  • the analysis stage

61
Control of Confounding
  • Design stage
  • restriction
  • matching
  • Analysis stage
  • stratification
  • multivariate techniques

62
Restriction
  • Confounding cannot occur if the factor does not
    vary.
  • For example if the study is limited to black
    women, race and gender cannot be confounding
    variables.
  • However, if restriction is carried to extremes,
    the study may have a limited number of eligible
    participants

63
Restriction
  • Restriction also limits the interpretation of the
    study.
  • Often partial restriction is used.

64
Matching
  • Matching is used mainly in case-comparison
    studies.
  • Application of constraints to the comparison
    group to make it more similar to the case group
    with respect to one or more potential confounding
    variables.

65
Matching
  • How close should matching be? Matching may be
    done on an individual basis (pair-matching) or on
    a group basis (frequency matching)
  • If a pair-matched design is used, then matching
    must be taken into account in the analysis.

66
Randomization
  • Randomization is used in experimental studies to
    allocate individuals to treatment groups by
    chance with the purpose of ensuring that all
    potential confounders are equally distributed
    among the groups. It is not haphazard
    assignment. Randomization does not always
    achieve its purpose.
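A quick sketch, again with invented data in Python/NumPy, of why chance allocation tends to balance a confounder across treatment groups, and why that balance is not guaranteed in any single study:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 200
smoker = rng.random(n) < 0.3        # hypothetical confounder: ~30% smokers

# Randomize: shuffle subject indices and split into two equal groups
order = rng.permutation(n)
treatment, control = order[: n // 2], order[n // 2:]

print("proportion of smokers, treatment:", smoker[treatment].mean())
print("proportion of smokers, control  :", smoker[control].mean())
# The two proportions are usually close but rarely identical; with a small
# n the imbalance can be substantial, which is why randomization does not
# always achieve its purpose.
```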

67
Common Pitfalls in Research
  • Failing to evaluate accuracy
  • Drawing spurious conclusions
  • Generalizing to inappropriate populations
  • Failing to evaluate the role of chance
  • Assuming causality based only on statistical
    significance

68
Bias in a Case Series
  • no comparison group
  • selection of study group cannot be described
  • no way of ascertaining confounding

69
Bias in a Case Control Study
  • do the controls represent the population from
    which the cases were drawn?
  • are controls at similar risk of being exposed?
  • is case status / control status similar?
  • survival bias
  • volunteer bias
  • information bias

70
Bias in a cross sectional study
  • survival bias
  • migration out of exposure
  • cart before the horse bias
  • confounding

71
Bias in a cohort study
  • exposed and non-exposed from same base population
  • Internal comparisons: start with a
    cross-sectional study of a population sample
  • External comparisons: try to ensure that the
    non-exposed are similar in all ways to the
    exposed group.

72
Statistical Issues
  • What do we mean by chance, and how does this
    relate to determining a true association?
  • Where do we start?

73
Statistical Issues
  • The evaluation of the role of chance is done in 2
    steps
  • Estimate the magnitude of the association.
  • Hypothesis testing
  • Calculate a test statistic,
  • obtain a p value or confidence interval

74
Measures of association
  • relative risk
  • odds ratio
  • attributable risk
  • also called risk difference
  • attributable risk percent
  • Also called etiologic fraction
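For a cohort-style 2x2 table these four measures reduce to simple arithmetic. The sketch below uses invented counts only to show the calculations (exposed group: a cases out of a + b; unexposed group: c cases out of c + d).

```python
# Hypothetical cohort counts
a, b = 40, 60    # exposed: cases, non-cases
c, d = 10, 90    # unexposed: cases, non-cases

risk_exposed = a / (a + b)
risk_unexposed = c / (c + d)

relative_risk = risk_exposed / risk_unexposed            # RR
odds_ratio = (a * d) / (b * c)                           # OR
attributable_risk = risk_exposed - risk_unexposed        # AR (risk difference)
ar_percent = attributable_risk / risk_exposed * 100      # AR% (etiologic fraction)

print(f"RR  = {relative_risk:.2f}")      # 4.00
print(f"OR  = {odds_ratio:.2f}")         # 6.00
print(f"AR  = {attributable_risk:.2f}")  # 0.30
print(f"AR% = {ar_percent:.1f}%")        # 75.0%
```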

75
Statistical Issues
  • p-value: the probability of obtaining a sample
    showing an association of the observed size or
    larger by chance alone, under the hypothesis that
    no association exists.
  • Confidence interval: a range of values that one
    can say, with a specific degree of confidence,
    contains the true population value.
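As an illustration of how the two quantities connect, the sketch below computes an approximate 95% confidence interval for a relative risk on the log scale (the Katz method), using the same invented counts as the previous sketch; it is only one of several ways such an interval can be obtained.

```python
import math

# Hypothetical cohort counts: exposed a/(a+b), unexposed c/(c+d)
a, b, c, d = 40, 60, 10, 90

rr = (a / (a + b)) / (c / (c + d))

# Approximate standard error of ln(RR), then a 95% interval on the log scale
se_log_rr = math.sqrt(1/a - 1/(a + b) + 1/c - 1/(c + d))
low = math.exp(math.log(rr) - 1.96 * se_log_rr)
high = math.exp(math.log(rr) + 1.96 * se_log_rr)

print(f"RR = {rr:.2f}, 95% CI {low:.2f} to {high:.2f}")
# If this interval excludes the null value (RR = 1), the corresponding
# two-sided p-value is below 0.05; if it includes 1, p is above 0.05.
```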

76
Statistical Issues
  • A statistically significant finding does not mean
    that the results DID NOT occur by chance
  • - only that it is unlikely that they occurred by
    chance.
  • A non-significant finding does not mean that the
    results DID occur by chance.

77
Statistical Issues
  • All tests of statistical significance lead to a
  • probability statement
  • usually expressed as a p value

78
Statistical Issues
  • A probability of 0.05 is the usual (arbitrary)
    cut-off level for statistical significance
  • If p < 0.05, we conclude that chance is an
    unlikely explanation for the finding.
  • The null hypothesis is rejected, and the
    statistical association is said to be significant

79
Statistical Issues
  • No p value, however small, completely excludes
    chance
  • No p value, however large, completely rules out
    a true association

80
Statistical Issues
  • p values only evaluate the role of chance
  • they say nothing about other alternative
    explanations or about causality
  • p values reflect the strength of the association
    and the study sample size

81
Statistical Issues
  • A small difference may achieve statistical
    significance if the sample size is large
  • A large difference may not achieve statistical
    significance if the sample size is too small
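A brief numerical illustration of both points, using invented counts and assuming SciPy is available: the same small difference in proportions is tested at two sample sizes, and a much larger difference is tested in a tiny study.

```python
from scipy.stats import chi2_contingency

def p_value(table):
    # chi-square test without continuity correction
    return chi2_contingency(table, correction=False)[1]

# The same small difference (52% vs 48%) at two sample sizes
small_n = [[52, 48], [48, 52]]           # 100 per group
large_n = [[5200, 4800], [4800, 5200]]   # 10,000 per group

# A large difference (70% vs 40%) with only 10 subjects per group
tiny_study = [[7, 3], [4, 6]]

print("small n, small difference:", round(p_value(small_n), 3))     # ~0.57, not significant
print("large n, small difference:", round(p_value(large_n), 8))     # <0.0001, significant
print("tiny n, large difference :", round(p_value(tiny_study), 3))  # ~0.18, not significant
```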

82
Statistical Issues
  • We address these problems by calculating
    confidence intervals (CI)
  • The CI gives all the information of a p value
    PLUS the expected range of effect sizes.

83
Statistical Issues
  • CI indicates the range within which the true
    magnitude of effect lies with a certain degree of
    assurance. The degree of assurance is defined by
    the p value you assign.

84
Statistical Issues
  • If the null value is included in a 95% confidence
    interval, then the corresponding p value is, by
    definition, greater than 0.05.

85
Statistical Issues
  • Is selection bias a likely explanation for the
    results?
  • Is information bias a likely explanation for the
    results?
  • Is chance a likely explanation for the results?
  • Are the authors' conclusions reasonable in terms
    of the information presented?

86
Evaluating the Role of Chance
  • Population parameter
  • A number which describes some aspect of a
    population.
  • Sample statistic
  • A number which describes some aspect of a sample
  • Sampling error
  • The error that arises from measuring a sample
    rather than the population

87
Hypothesis Testing
  • Type I and Type II Error

Type I error: reject H0 when it is true (false
positive). Type II error: accept H0 when it is
false (false negative).
88
Hypothesis Testing
  • H0 Pet owners who are asthmatic are no more
    likely to have asthma attacks than non pet owners.

RR (risk ratio) = (33/78) ÷ (30/75) = 1.06; χ² = 0.08;
p = 0.77; 95% C.I. 0.72 to 1.55
H0 not rejected; pet ownership not associated
with asthma attacks
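These figures can be reproduced from the 2x2 table implied by the risk ratio (33 of 78 pet owners and 30 of 75 non-owners had attacks); a minimal sketch, assuming SciPy is available:

```python
from scipy.stats import chi2_contingency

# Pet owners: 33 with attacks out of 78; non-owners: 30 out of 75
table = [[33, 78 - 33],
         [30, 75 - 30]]

rr = (33 / 78) / (30 / 75)
chi2, p, _, _ = chi2_contingency(table, correction=False)

print(f"RR = {rr:.2f}, chi-square = {chi2:.2f}, p = {p:.2f}")
# Prints approximately RR = 1.06, chi-square = 0.08, p = 0.77, matching the
# slide; the 95% C.I. (0.72 to 1.55) includes the null value of 1.
```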
89
Hypothesis Testing
  • H0 Pet owners who are asthmatic are no more
    likely to have asthma attacks than non pet owners.

RR (risk ratio) = (200/210) ÷ (300/450) = 1.43; χ² = 63.64;
p < 0.0001; 95% C.I. 1.33 to 1.54
H0 rejected; pet ownership associated with asthma
attacks
90
Hypothesis Testing
  • H0 There is no association between
    post-menopausal estrogen replacement therapy
    (ERT) and risk for developing AD

RR = (30/54,000) ÷ (60/54,000) = 0.5; χ² = 77.43;
p < 0.0001; 95% C.I. 0.02 < RR < 0.88
H0 rejected; ERT is associated with a reduction
in risk for developing AD
91
Hypothesis Testing
92
Hypothesis Testing
  • In determining statistical significance, the
    precision of the estimate should be based on both
    the p-value and the confidence interval
  • p-value: susceptible to variability and sample
    size; a large sample can detect a statistically
    significant difference that may not be important,
    and vice versa
  • confidence interval: width depends on variability
    in the data; the location of the no-effect value
    (in or out of the interval) and the width are both
    informative

93
Hypothesis Testing
  • Statistical significance vs clinical importance
  • although the p-value and confidence interval may
    lead to the conclusion that chance is an unlikely
    explanation for the findings,
  • they provide no information regarding the effects
    of uncontrolled bias or confounding
  • all factors must be weighed in light of clinical
    importance