Lecture 3: Reliability and validity of scales

Provided by: JaneMc65
Learn more at: http://www.medicine.mcgill.ca

1
Lecture 3 Reliability and validity of scales
  • Reliability
  • internal consistency
  • test-retest
  • inter- and intra-rater
  • alternate form
  • Validity
  • content, criterion, and construct validity
  • responsiveness

2
Multi-item scales
  • Measure constructs without a gold standard
  • e.g., depression, satisfaction, quality of life
  • Items are intended to sample the content of the
    underlying construct
  • Items summarized in various ways
  • sum or average of responses to individual items
  • item weighting or other algorithm
  • profiles/sub-scale scores

3
Example: Reliability and validity of a measure of
severity of delirium
Source: McCusker et al, Internat Psychogeriatrics 1998; 10:421-33
  • Delirium - acute confusion
  • Common in older hospitalized patients
  • Diagnosis of delirium is based on the following
    symptoms
  • acute onset, fluctuations
  • inattention, disorganized thinking
  • altered consciousness, disorientation
  • memory impairment, perceptual disturbances
  • psychomotor agitation or retardation

4
Requirements of new scale
  • Administered by interviewer at bedside
  • Not using patient chart (to maintain blinding)
  • Brief (avoid patient burden)
  • Responsive to within-patient changes over time

5
Delirium Index (DI)
  • Assesses severity of 7 symptoms of delirium
    (excl. acute onset, fluctuations, sleep
    disorder)
  • inattention, disorganized thinking
  • altered consciousness, disorientation
  • memory impairment, perceptual disturbances
  • psychomotor agitation or retardation

6
Administration and scoring
  • Administered in conjunction with first 5
    questions of Mini-Mental State Exam (MMSE)
  • Each symptom rated on 4-point scale
  • 0 = absent
  • 1 = mild
  • 2 = moderate
  • 3 = severe
  • Operational definition of each symptom

7
Scoring
  • Score is sum of 7 item scores
  • Scoring of symptoms that could not be assessed
  • patient non-responsive: coded as severe for
    items 1, 2, 4, 5
  • coding instructions provided for items 3, 6,
    7
  • patient refuses: scores for items 1, 2, 4, 5
    replaced by score of item 3
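
The scoring rules on this slide can be sketched in code. This is only an illustration, not the published scoring algorithm: the function name `score_di` and the `status` flag are hypothetical, and the coding instructions for items 3, 6, and 7 are not reproduced in the slides, so the sketch assumes the rater supplies those ratings directly.

```python
from typing import Dict, Optional

SEVERE = 3
# Items whose scores are imputed under the two rules on the slide.
IMPUTED_ITEMS = (1, 2, 4, 5)


def score_di(items: Dict[int, Optional[int]], status: str = "assessed") -> int:
    """Return the total score (0-21) as the sum of 7 item scores.

    items maps item number (1-7) to a rating 0-3, or None if not assessed.
    status is a hypothetical flag: "assessed", "non_responsive", or "refused".
    """
    scores = dict(items)
    if status == "non_responsive":
        # Slide rule: items 1, 2, 4, 5 are coded as severe.
        for i in IMPUTED_ITEMS:
            scores[i] = SEVERE
    elif status == "refused":
        # Slide rule: items 1, 2, 4, 5 take the score of item 3.
        if scores.get(3) is None:
            raise ValueError("item 3 must be rated to apply the refusal rule")
        for i in IMPUTED_ITEMS:
            scores[i] = scores[3]
    elif status != "assessed":
        raise ValueError(f"unknown status: {status}")

    total = 0
    for i in range(1, 8):
        value = scores.get(i)
        if value is None or not 0 <= value <= 3:
            raise ValueError(f"item {i} needs a rating from 0 to 3")
        total += value
    return total
```

For example, a non-responsive patient rated 2, 1, and 0 on items 3, 6, and 7 would receive 4 x 3 + 3 = 15 under these assumptions.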

8
Reliability
  • Internal consistency
  • Test-retest reliability
  • Inter-rater and intra-rater reliability

9
Internal consistency
  • Relevant to additive scales (that sum or average
    items)
  • Split-half reliability
  • correlation between scores on an arbitrary half of
    the measure and scores on the other half
  • Coefficient alpha (Cronbach)
  • estimates the average split-half correlation over
    all possible ways of dividing the scale
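
As a minimal sketch, coefficient alpha can be computed with the usual variance-based formula, alpha = k/(k-1) x (1 - sum of item variances / variance of total score), where k is the number of items. The function name `cronbach_alpha` and the toy data are illustrative assumptions, not data from the lecture.

```python
from statistics import variance
from typing import Sequence


def cronbach_alpha(rows: Sequence[Sequence[float]]) -> float:
    """Cronbach's alpha for an additive scale.

    rows holds one sequence of item scores per respondent.
    """
    if len(rows) < 2 or len(rows[0]) < 2:
        raise ValueError("need at least 2 respondents and 2 items")
    k = len(rows[0])
    # Transpose so that each entry holds all responses to one item.
    items = list(zip(*rows))
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in rows])
    if total_variance == 0:
        raise ValueError("total scores have zero variance")
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Three respondents, two items (made-up numbers).
alpha = cronbach_alpha([[2, 3], [4, 4], [6, 5]])
```

In this toy example the item variances are 4 and 1 and the variance of the total score is 9, so alpha = 2 x (1 - 5/9) = 8/9.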

10
Internal consistency of DI
  • Cronbach's alpha (overall): 0.74
  • After exclusion of perceptual disturbance: 0.82
  • In sub-groups of patients
  • delirium and dementia: 0.69, 0.79
  • delirium alone: 0.67, 0.79
  • dementia alone: 0.55, 0.59
  • neither: 0.44, 0.52

11
Test-retest reliability (stability)
  • Scale is repeated
  • short-term
  • for constructs that fluctuate, 2 weeks often used
    to reduce effects of memory and true change
  • long-term
  • for constructs that should not fluctuate (e.g.,
    personality traits)
  • Correlation between 2 scores is computed
  • Also important to look at systematic increase or
    decrease in score
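
The last two points can be illustrated with a short sketch: a correlation between the two administrations, reported alongside the mean change. The function names and the scores are hypothetical; the example is constructed so that every score rises by 2 points at retest.

```python
from math import sqrt
from statistics import mean
from typing import Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant scores")
    return sxy / sqrt(sxx * syy)


def retest_summary(first: Sequence[float], second: Sequence[float]) -> Tuple[float, float]:
    """Return (correlation, mean change) between two administrations."""
    mean_change = mean(b - a for a, b in zip(first, second))
    return pearson_r(first, second), mean_change


r, change = retest_summary([10, 12, 14, 16], [12, 14, 16, 18])
```

Here the correlation is 1.0 even though every score shifted by 2 points, which is why the systematic change needs to be examined separately.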

12
Test-retest reliability of DI
  • Delirium is marked by fluctuations
  • Variability over time is expected

13
Mean within-patient standard deviation in DI
score during 1st week in hospital
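
A minimal sketch of how a mean within-patient standard deviation could be computed from repeated scores. The data layout, the function name, and the numbers are assumptions for illustration; the values reported in the original slide are not available in this transcript.

```python
from statistics import mean, stdev
from typing import Dict, Sequence


def mean_within_patient_sd(scores_by_patient: Dict[str, Sequence[float]]) -> float:
    """Average, over patients, of each patient's SD across repeated scores.

    Patients with fewer than 2 scores are skipped because an SD
    cannot be computed for them.
    """
    sds = [stdev(s) for s in scores_by_patient.values() if len(s) >= 2]
    if not sds:
        raise ValueError("no patient has 2 or more scores")
    return mean(sds)


# Made-up repeated scores for three patients.
example = {"A": [4, 6, 8], "B": [5, 5, 5], "C": [7]}
result = mean_within_patient_sd(example)  # SDs are 2.0 and 0.0, so the mean is 1.0
```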
14
Inter- and intra-rater reliability
  • Inter-rater reliability
  • For scales requiring rater skill, judgment
  • 2 or more independent raters of same event
  • Intra-rater reliability
  • Independent rating by same observer of same event

15
Measures of inter- and intra-rater reliability:
categorical data
  • Percent agreement
  • can be used for di- and polychotomous scales
  • limitation: value is affected by prevalence
    (higher if prevalence is very low or very high)
  • Kappa statistic
  • takes chance agreement into account
  • defined as fraction of observed agreement not due
    to chance

16
Kappa statistic
  • Kappa = [p(obs) - p(exp)] / [1 - p(exp)]
  • p(obs) = proportion of observed agreement
  • p(exp) = proportion of agreement expected by
    chance
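
A short sketch of the calculation for two raters, starting from a square cross-classification table (rows are rater 1's categories, columns rater 2's). The function names and the counts are illustrative assumptions.

```python
from typing import Sequence


def percent_agreement(table: Sequence[Sequence[int]]) -> float:
    """Proportion of observations on the diagonal of the table."""
    n = sum(sum(row) for row in table)
    return sum(table[i][i] for i in range(len(table))) / n


def kappa(table: Sequence[Sequence[int]]) -> float:
    """Kappa = (p_obs - p_exp) / (1 - p_exp) for a square table."""
    n = sum(sum(row) for row in table)
    p_obs = percent_agreement(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Chance agreement: product of the two raters' marginal proportions,
    # summed over categories.
    p_exp = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    if p_exp == 1:
        raise ValueError("kappa undefined when expected agreement is 1")
    return (p_obs - p_exp) / (1 - p_exp)


# Made-up counts: 100 patients rated present/absent by two raters.
ratings = [[40, 10],
           [5, 45]]
```

With these counts p(obs) = 0.85 and p(exp) = (50 x 45 + 50 x 55) / 100² = 0.50, so kappa = 0.35 / 0.50 = 0.70.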

17
(No Transcript)
18
Interpretation of kappa
  • Various suggested interpretations
  • Example: Fleiss (1981)
  • excellent: 0.75 and above
  • fair to good: 0.40 - 0.74
  • poor: less than 0.40
  • Limitations
  • depends on prevalence (see Szklo & Nieto)
  • do not use as only measure of agreement
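
The Fleiss cut-offs and the prevalence limitation can be sketched together. The helper names and both 2x2 tables are made up for illustration; the two tables have the same 90% observed agreement but very different prevalence.

```python
def kappa_2x2(a: int, b: int, c: int, d: int) -> float:
    """Kappa for a 2x2 table: a and d are agreements, b and c disagreements."""
    n = a + b + c + d
    p_obs = (a + d) / n
    p_exp = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)


def fleiss_label(k: float) -> str:
    """Label a kappa value using the Fleiss (1981) cut-offs from the slide."""
    if k >= 0.75:
        return "excellent"
    if k >= 0.40:
        return "fair to good"
    return "poor"


balanced = kappa_2x2(45, 5, 5, 45)  # about half the ratings are positive
skewed = kappa_2x2(90, 5, 5, 0)     # almost all ratings are positive
```

The balanced table gives kappa = 0.8 ("excellent"), while the skewed table gives a slightly negative kappa ("poor") despite identical percent agreement, which is one reason not to rely on kappa alone.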

19
Measures of inter- and intra-rater reliability:
continuous data
  • Measures of correlation
  • Correlation graph (scatter diagram)
  • Correlation coefficients
  • Measures of pairwise comparison

20
Correlation coefficients
  • Pearson's r
  • assesses linear association, not systematic
    differences between 2 sets of observations
  • sensitive to range of values, especially outliers
  • Spearman r
  • ordinal or rank order correlation
  • less influenced by outliers
  • does not assess systematic differences
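
A small sketch contrasting the two coefficients on made-up data with a single outlier. Spearman's coefficient is computed here as Pearson's r on ranks, with tied values given their average rank; all function names are illustrative.

```python
from math import sqrt
from statistics import mean
from typing import List, Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant values")
    return sxy / sqrt(sxx * syy)


def ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            result[order[pos]] = average_rank
        i = j + 1
    return result


def spearman_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson's r applied to the ranks."""
    return pearson_r(ranks(x), ranks(y))


x = [1, 2, 3, 4, 5]
y = [1, 2, 3, 4, 50]  # last value is an outlier
```

For these data the rank order is identical, so Spearman's coefficient is 1.0, while Pearson's r drops to about 0.74 because of the outlier.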

21
Correlation coefficients
  • Intra-class correlation coefficient (ICC)
  • Estimates the proportion of total measurement
    variability due to differences between individuals
    (vs error variance)
  • Equivalent to kappa, with the same range of values
  • Reflects true agreement, including systematic
    differences
  • Affected by range of values - if less variation
    between individuals, ICC will be lower
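
As a hedged illustration, the sketch below computes a one-way random-effects ICC from a subjects-by-raters table. The slides do not say which form of the ICC was used, so the one-way form, the function name, and the ratings are all assumptions.

```python
from statistics import mean
from typing import Sequence


def icc_oneway(ratings: Sequence[Sequence[float]]) -> float:
    """One-way random-effects ICC for single ratings.

    ratings holds one row per subject, each with the same number of ratings.
    ICC = (MSB - MSW) / (MSB + (k - 1) * MSW), where MSB and MSW are the
    between-subject and within-subject mean squares.
    """
    n = len(ratings)
    k = len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 ratings each")
    grand_mean = mean(v for row in ratings for v in row)
    subject_means = [mean(row) for row in ratings]
    ss_between = k * sum((m - grand_mean) ** 2 for m in subject_means)
    ss_within = sum((v - m) ** 2
                    for row, m in zip(ratings, subject_means)
                    for v in row)
    ms_between = ss_between / (n - 1)
    ms_within = ss_within / (n * (k - 1))
    denominator = ms_between + (k - 1) * ms_within
    if denominator == 0:
        raise ValueError("ICC undefined when all ratings are identical")
    return (ms_between - ms_within) / denominator


# Made-up data: the second rater scores every patient 2 points higher.
shifted = [[10, 12], [12, 14], [14, 16], [16, 18]]
```

For these ratings Pearson's r would be 1.0, but the ICC is 17/23 (about 0.74) because the systematic 2-point difference counts as disagreement.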

22
Inter-rater reliability of DI
  • Intraclass correlation coefficient (ICC)
  • n = 26 patients (39 pairs of ratings)
  • ICC = 0.98 (SD 0.06)

23
Alternate form reliability
  • Agreement between alternate forms of same
    instrument
  • longer vs shorter version
  • alternate method of administration
  • face-to-face vs telephone
  • subject vs proxy (see Magaziner paper)

24
Validity
  • Content and face validity
  • Criterion validity: concurrent and predictive
  • Construct validity

25
Validity
  • Depends on purpose
  • screening: discrimination
  • outcome of treatment: responsiveness, sensitivity
    to change
  • prognosis: predictive validity

26
Content and face validity
  • Judgment of experts and/or members of target
    population
  • Does measure adequately sample domain being
    measured?
  • Does it appear to measure what it is intended to
    measure? (eyeball test)

27
Content validity of DI
  • Based on Confusion Assessment Method (CAM)
  • based on accepted diagnostic criteria (DSM)
  • widely used

28
Criterion validity
  • Criterion (gold standard)
  • Concurrent criterion validity
  • e.g., screening test vs diagnostic test
  • Predictive criterion validity
  • e.g., cancer staging test vs 5-year survival

29
Criterion validity of DI
  • Correlation between psychiatrist-scored DI (based
    only on patient observation) and Delirium Rating
    Scale (using all available information)
  • original scale
  • adjusted scale, omitting 4 items not assessed by
    DI

30
Criterion validity of DI: results
  • Spearman correlation coefficient (and 95% CI)
    between DI and adjusted DRS (using multiple
    observations)
  • at one point in time: 0.84 (0.75, 0.89)
  • within-subject change over time: 0.71 (0.53, 0.82)

31
Delirium severity and survival
  • Proportional hazards regression of delirium
    severity in delirium cohort
  • Mean of 1st 2 DI scores
  • Results
  • significant interaction: DI predicted survival
    in patients with delirium alone, not in those
    with dementia

32
Construct validity
  • Is the theoretical construct underlying the
    measure valid?
  • Development and testing of hypotheses
  • Requires multiple data sources and
    investigations
  • Convergent validity: measure is correlated with
    other measures of similar constructs
  • Discriminant validity: measure is not correlated
    with measures of different constructs

33
Construct validity (cont)
  • Multitrait-multimethod
  • Convergent validity: measure is correlated with
    other measures of similar constructs
  • Discriminant validity: measure is not correlated
    with measures of different constructs
  • Factorial method
  • factor analysis or principal components analysis
    to identify underlying dimensions

34
Spearman correlation coefficients between
Delirium Index and 3 baseline measures of current
status
35
Spearman correlation coefficients between
Delirium Index and 3 baseline measures of prior
status
36
Responsiveness of measures
  • Ability to detect clinically important change
    over time or differences between treatments
  • Requirement of evaluative measures

37
Some sources of bias in scales
  • Response sets
  • Social desirability
  • Acquiescent

38
Social desirability
  • Tendency to give answers to questions that are
    perceived to be more socially desirable than the
    true answer
  • Different from deliberate distortion (faking
    good)
  • Depends on
  • Individual characteristics (age, sex, cultural
    background)
  • Specific question

39
Social desirability
  • Measures of social desirability (SD)
  • SD scales (e.g., Jackson SD scale, Crowne-Marlowe
    SD scale)
  • individual tendency to SD bias
  • Prevention
  • phrasing of questions
  • questionnaire mode
  • training of interviewers

40
Acquiescent response set
  • Tendency to agree with Likert-type questions
  • Can be prevented by mix of positively and
    negatively-phrased questions, e.g.
  • My health care is just about perfect
  • There are serious problems with my health care
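
When positively and negatively phrased items are mixed, the negatively phrased ones are typically reverse-scored before summing. The sketch below assumes a 1-5 agreement scale (1 = strongly disagree, 5 = strongly agree); the scale range, function names, and item keys are illustrative and not taken from the lecture.

```python
from typing import Dict, Set


def reverse_score(score: int, low: int = 1, high: int = 5) -> int:
    """Flip a Likert response so that high always means 'more satisfied'."""
    if not low <= score <= high:
        raise ValueError(f"score must be between {low} and {high}")
    return low + high - score


def scale_score(responses: Dict[str, int], negative_items: Set[str],
                low: int = 1, high: int = 5) -> int:
    """Sum the responses after reverse-scoring negatively phrased items."""
    total = 0
    for item, score in responses.items():
        if item in negative_items:
            score = reverse_score(score, low, high)
        total += score
    return total


negative = {"serious_problems"}
# A consistent respondent: agrees care is perfect, disagrees there are problems.
consistent = scale_score({"just_about_perfect": 5, "serious_problems": 1}, negative)
# An acquiescent respondent: strongly agrees with both statements.
acquiescent = scale_score({"just_about_perfect": 5, "serious_problems": 5}, negative)
```

Under these assumptions the consistent respondent scores 10 and the acquiescent respondent scores 6, so agreeing with everything no longer inflates the total.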

41
Measurement of Quality of life (QoL)
  • Definition
  • individuals' perception of their position in life
    in the context of the culture and value systems
    in which they live and in relation to their
    goals, expectations, standards, and concerns
    (WHO QOL group, 1995)
  • Domains
  • physical, psychological, level of independence,
    social relationships, environment, and
    spirituality/religion/personal beliefs

42
Health-related quality of life (HRQoL)
  • Dimensions of QoL related to health
  • Related terms
  • health status
  • functional status
  • Usually includes
  • physical health/function
  • mental health/function
  • social health/function

43
Evaluative HRQoL instruments
  • Purpose
  • evaluate within-individual change over time
  • Reliability
  • responsiveness
  • Construct validity
  • correlations of changes in measures during period
    of time, consistent with theoretically derived
    predictions

44
Discriminative HRQoL instruments
  • Purpose
  • evaluate differences between individuals at point
    in time
  • Reliability
  • reproducibility
  • Construct validity
  • correlations between measures at point in time,
    consistent with theoretically derived predictions

45
How is HRQoL measured?
  • Mode
  • Interviewer
  • face-to-face
  • Telephone
  • Self-completed
  • Completed by
  • self
  • proxy/surrogate

46
Types of HRQoL measures
  • Generic (global)
  • Health profiles
  • Utility measures
  • Specific

47
Generic vs specific
  • Generic
  • comparisons across populations and problems
  • robust and generalizable
  • measurement properties better understood
  • Disease-specific
  • shorter
  • more relevant and appropriate
  • sensitive to change

48
Appropriateness
  • Purpose
  • describe health of population
  • evaluate effects of interventions (change over
    time)
  • compare groups at point in time
  • predict outcomes
  • Areas of function covered
  • Level of health
  • Generic/global or specific