Class 3: Methods of Developing New Measures and How to Select Measures for Your Study (October 8, 2009)

Slides: 89
Provided by: ucsf4
Learn more at: http://rds.epi-ucsf.org
Transcript and Presenter's Notes


1
Class 3: Methods of Developing New Measures
and How to Select Measures for Your Study
October 8, 2009
  • Anita L. Stewart
  • Institute for Health & Aging
  • University of California, San Francisco

2
Overview of Class 3
  • Overview sequence of developing new measures
  • Rationale for multi-item measures
  • Scale construction methods
  • Steps in choosing appropriate measures for your
    study

3
Typical Sequence of Developing New Self-Report
Measures
Develop/define concept
Create item pool
Pretest/revise
Field survey
Psychometric analyses
Final measures
4
Sequence: Develop Item Pool
  • Generate a large set of items that reflect the
    concept definition
  • For multidimensional concepts, items for each
    dimension
  • Item sources
  • Other measures of similar concepts
  • Qualitative research such as focus groups
  • Researchers' ideas about concept

5
Considerations in Writing Item Pool
  • Items from various sources will have different
    formats, response choices, and instructions
  • Have to determine consistent approach

6
Reduce Item Pool to Manageable Number
  • Review items against concept until best ones
    remain for pretesting
  • Judgment of investigators
  • Expert panels
  • Achieve good representation of all dimensions
  • Have more items than final goal

7
Revised Interpersonal Processes of Care Concepts
and Item Pool
IPC Version I framework in Milbank Quarterly
Draft IPC II conceptual framework
19 focus groups - African American, Latino, and
White adults
Literature review of quality of care in diverse
groups
8
IPC Item Pool
Original IPC
Draft IPC II conceptual framework
IPC II Item pool (1,006 items)
19 focus groups
Literature review
160 items selected for pre-testing
9
Sample Item
  • These questions are about your experiences
    talking with your doctors at ___ over the past 12
    months
  • 1. How often did doctors use words that were
    hard to understand?
  • -- never
  • -- rarely
  • -- sometimes
  • -- usually
  • -- always

10
Sequence: Pretest/Revise
  • Pretest, pretest, pretest
  • Numerous methods
  • For new measures, pretesting essential
  • Obtain reactions and comments of individuals
    targeted for study
  • Results in revisions of items and response choices

11
Pretest in Target Population
  • Pretesting essential for measures being applied
    to any new population group
  • Especially priority measures (e.g., outcomes)
  • Pretest is to identify
  • problems with procedures
  • method of administration, respondent burden
  • problems with questions
  • Item stems, response choices, and instructions

12
Problems with Questions or Response Choices
  • Are all words/phrases understood as intended?
  • Are questions interpreted similarly by all
    respondents?
  • Are some questions not answered?
  • Are any questions offensive or irrelevant?
  • Does each closed-ended question have an answer
    that applies to each respondent?
  • Are the response choices adequate?

13
Types of Pretests
  • General debriefing pretest (N = 10)
  • In-depth cognitive interviewing pretest (N = 5-10
    each group)

14
Sequence: Field Survey/Questionnaire
  • Administer survey to large enough sample to test
    psychometric characteristics
  • Two approaches
  • Preliminary field test (N about 100)
  • Administer in main study, then conduct psychometric
    studies on study data
  • Some items may not be used in final scales

15
Sequence: Psychometric Analyses
  • Evaluate items - variability, missing
  • Create multi-item scales according to scale
    construction criteria
  • Evaluate scale characteristics
  • Variability, reliability
  • Validity

16
Sequence: Final Measures
  • When to publish depends on
  • authors' standards
  • sample size of psychometric analyses
  • Many measures published with very little
    iterative work
  • Single sample testing

17
Overview of Class 3
  • Overview sequence of developing new measures
  • Rationale for multi-item measures
  • Scale construction methods
  • Steps in choosing appropriate measures for your
    study

18
Single-item Measures - Usually Ordinal
  • Advantages
  • Response choices interpretable
  • Disadvantages
  • Impossible to assess complex concept
  • Very limited variability, often skewed
  • Reliability usually low

19
Multi-Item Measures or Scales
  • Multi-item scales are created by combining two or
    more items into an overall measure or scale score
  • Sometimes called summated ratings scales

20
Advantages of Multi-item Measures (Over Single
Items)
  • More scale values (improves score distribution)
  • Reduces number of scores needed to measure a concept
  • Improves reliability (reduces random error)
  • Reduces number with missing data (can estimate score
    if items are missing)
  • More likely to reflect concept (content validity)
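
The point about estimating a score when some items are missing can be sketched in Python. This is only an illustration: the function name `scale_score` and the `min_answered` cutoff are assumptions, since the slide does not say how many answered items are required.

```python
from typing import Optional, Sequence

def scale_score(responses: Sequence[Optional[float]],
                min_answered: int) -> Optional[float]:
    """Average the answered items; return None if too few were answered.

    Averaging the answered items is equivalent to replacing each missing
    item with the respondent's own mean on the items that were answered.
    """
    answered = [r for r in responses if r is not None]
    if len(answered) < min_answered:
        return None  # too little information to estimate a score
    return sum(answered) / len(answered)

# Hypothetical respondents on a 4-item scale (cutoff of 2 is an assumption)
print(scale_score([4, None, 5, 3], min_answered=2))        # 4.0
print(scale_score([None, None, None, 2], min_answered=2))  # None
```

A single-item measure has no such fallback: if the one item is skipped, the respondent has no score.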

21
One Major Exception: Self-rated Health
22
Review of 27 Studies of Self-rated Health and
Mortality
  • Independently predicted mortality in nearly all
    studies
  • Despite controlling for numerous specific health
    indicators and other predictors of mortality

Idler EL et al. J Health Soc Behav, 1997;38:21-37
23
Overview of Class 3
  • Overview sequence of developing new measures
  • Rationale for multi-item measures
  • Scale construction methods
  • Steps in choosing appropriate measures for your
    study

24
Methods for Creating Multi-item Scales
  • Two Basic Scale Construction Approaches
  • Multitrait scaling
  • Factor analysis
  • Classical test theory approaches

25
Example of a 2-item Summated Ratings Scale
  • How much of the time .... tired?
  • 1 - All of the time
  • 2 - Most of the time
  • 3 - Some of the time
  • 4 - A little of the time
  • 5 - None of the time
  • How much of the time .... full of energy?
  • 1 - All of the time
  • 2 - Most of the time
  • 3 - Some of the time
  • 4 - A little of the time
  • 5 - None of the time

26
Step 1: Reverse One Item So They Are in the Same
Direction
  • How much of the time .... tired?
  • 1 - All of the time
  • 2 - Most of the time
  • 3 - Some of the time
  • 4 - A little of the time
  • 5 - None of the time
  • How much of the time .... full of energy?
  • 1 → 5 - All of the time
  • 2 → 4 - Most of the time
  • 3 → 3 - Some of the time
  • 4 → 2 - A little of the time
  • 5 → 1 - None of the time

Reverse energy item so high score = more energy
27
Step 2: Sum the Items
  • How much of the time .... tired?
  • 1 - All of the time
  • 2 - Most of the time
  • 3 - Some of the time
  • 4 - A little of the time
  • 5 - None of the time
  • How much of the time .... full of energy?
  • 5 - All of the time
  • 4 - Most of the time
  • 3 - Some of the time
  • 2 - A little of the time
  • 1 - None of the time

Lowest = 2 (tired all of the time, full of energy
none of the time); Highest = 10 (tired none of the
time, full of energy all of the time)
28
Step 2: Can Also Average the Two Items
  • How much of the time .... tired?
  • 1 - All of the time
  • 2 - Most of the time
  • 3 - Some of the time
  • 4 - A little of the time
  • 5 - None of the time
  • How much of the time .... full of energy?
  • 5 - All of the time
  • 4 - Most of the time
  • 3 - Some of the time
  • 2 - A little of the time
  • 1 - None of the time

Lowest = 1.0 (tired all of the time, full of
energy none of the time); Highest = 5.0 (tired
none of the time, full of energy all of the time)
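
The two scoring steps (reverse one item, then sum or average) can be sketched in Python. The function names are hypothetical; the 1-5 coding and the direction of scoring follow the two items shown on the slides.

```python
from typing import Tuple

def reverse(score: int, low: int = 1, high: int = 5) -> int:
    """Reverse a response on a low..high scale (1 -> 5, 2 -> 4, ...)."""
    if not low <= score <= high:
        raise ValueError(f"score {score} is outside {low}-{high}")
    return low + high - score

def energy_scale(tired: int, full_of_energy: int) -> Tuple[int, float]:
    """Return (sum, average), scored so that high = more energy.

    Both arguments are the raw 1-5 responses (1 = all of the time,
    5 = none of the time). Only the energy item needs reversing.
    """
    items = [tired, reverse(full_of_energy)]
    total = sum(items)
    return total, total / len(items)

print(energy_scale(tired=1, full_of_energy=5))  # (2, 1.0)   lowest
print(energy_scale(tired=5, full_of_energy=1))  # (10, 5.0)  highest
```

Summing and averaging rank respondents identically; the average simply keeps the score on the original 1-5 metric.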
29
Summed or Averaged Increases Number of Levels
from 5 (per item) to 9
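
The count of 9 levels can be checked by enumerating every combination of two 5-point items. This is a small Python sketch; the variable names are illustrative.

```python
from itertools import product

item_values = range(1, 6)  # five response choices per item

# Every distinct sum of two items, then the corresponding averages
sums = sorted({a + b for a, b in product(item_values, repeat=2)})
means = [s / 2 for s in sums]

print(len(sums), sums)  # 9 [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(means)            # [1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
```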
30
Summated Rating Scales: Scaling Analyses
  • To create a summated rating scale, a set of items
    needs to meet several criteria
  • Need to test whether the items hypothesized to
    measure a concept can be combined
  • i.e., that items form a single concept

31
Five Criteria to Qualify as a Summated Ratings
Scale
  • Item convergence
  • Item discrimination
  • No unhypothesized dimensions
  • Items contribute similar proportion of
    information to score
  • Items have equal variances

32
First Criterion: Item Convergence
  • Each item correlates substantially with the total
    score of all items
  • with the item taken out or corrected for
    overlap
  • Typical criterion is > .30
  • for well-developed scales, often > .40
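
A minimal Python sketch of an item-scale correlation corrected for overlap: each item is correlated with the sum of the other items, so the item does not inflate its own correlation. The data and function names are hypothetical, and items are assumed to be already scored in the same direction.

```python
from math import sqrt
from typing import List, Sequence

def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def corrected_item_total(data: List[List[float]]) -> List[float]:
    """data[r][i] is the response of respondent r to item i."""
    k = len(data[0])
    result = []
    for i in range(k):
        item = [row[i] for row in data]
        rest = [sum(row) - row[i] for row in data]  # total without item i
        result.append(pearson(item, rest))
    return result

# Hypothetical responses: 6 respondents x 3 items
data = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 2],
    [4, 3, 4],
    [4, 5, 3],
    [5, 4, 5],
]
for i, r in enumerate(corrected_item_total(data), start=1):
    flag = "" if r > 0.30 else "  (below .30)"
    print(f"item {i}: {r:.2f}{flag}")
```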

33
Example: Analyzing Item Convergence for Adaptive
Coping Scale
Adaptive coping (alpha = .70)
Item                                    Item-scale correlation
 5  Get emotional support from others   .49
11  See it in a different light         .62
18  Accept the reality of it            .25
20  Find comfort in religion            .58
13  Get comfort from someone            .45
21  Learn to live with it               .21
23  Pray or meditate                    .39
Moody-Ayers SY et al. J Amer Geriatr Soc,
2005;53:2202-08.
34
Example: Analyzing Item Convergence for Adaptive
Coping Scale

Adaptive coping (alpha = .70)
Item                                    Item-scale correlation
 5  Get emotional support from others   .49
11  See it in a different light         .62
18  Accept the reality of it            .25   (< .30)
20  Find comfort in religion            .58
13  Get comfort from someone            .45
21  Learn to live with it               .21   (< .30)
23  Pray or meditate                    .39
35
Example: Split Into Two Scales
Item                                    Item-scale correlation
Adaptive coping (alpha = .76)
 5  Get emotional support from others   .45
11  See it in a different light         .59
20  Find comfort in religion            .73
13  Get comfort from someone            .45
23  Pray or meditate                    .51
Acceptance (alpha = .67)
21  Learn to live with it               .50
18  Accept the reality of it            .50

36
Can Examine Item Convergence Using Any
Statistical Software
  • Programs to calculate internal consistency
    reliability
  • Provide estimated coefficient alpha
  • Produce item-scale correlations corrected for
    overlap
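
For readers without a statistics package at hand, coefficient alpha can be computed directly from item variances. The sketch below uses the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the function name and data are hypothetical.

```python
from statistics import variance
from typing import List

def coefficient_alpha(data: List[List[float]]) -> float:
    """Cronbach's coefficient alpha; data[r][i] = respondent r, item i."""
    k = len(data[0])
    item_vars = [variance([row[i] for row in data]) for i in range(k)]
    total_var = variance([sum(row) for row in data])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)

# Hypothetical responses: 6 respondents x 3 items, same direction
data = [
    [1, 2, 1],
    [2, 2, 3],
    [3, 3, 2],
    [4, 3, 4],
    [4, 5, 3],
    [5, 4, 5],
]
print(f"alpha = {coefficient_alpha(data):.2f}")
```

Standard packages report the same coefficient together with the overlap-corrected item-scale correlations, so this is mainly useful for seeing what the number is made of.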

37
Second Criterion: Item Discrimination
  • Each item correlates significantly higher with
    the construct it is hypothesized to measure than
    with other constructs
  • Item discrimination
  • Statistical significance is determined by
    standard error of the correlation
  • Determined by sample size
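
A rough Python sketch of how sample size enters the discrimination test. Two things here are assumptions rather than statements from the slide: the standard error is approximated as 1/sqrt(N), and an item is counted as discriminating when its correlation with its own scale exceeds its correlation with the other scale by more than two standard errors. The sample size and correlations are illustrative.

```python
from math import sqrt

def se_of_correlation(n: int) -> float:
    """Approximate standard error of a correlation: 1 / sqrt(n) (assumed)."""
    return 1 / sqrt(n)

def discriminates(r_own: float, r_other: float, n: int,
                  n_se: float = 2.0) -> bool:
    """True if the item correlates with its own scale more than n_se
    standard errors higher than with the competing scale."""
    return (r_own - r_other) > n_se * se_of_correlation(n)

n = 400  # hypothetical sample size, giving a standard error of 0.05
print(discriminates(0.80, 0.65, n))  # True: .15 difference exceeds 2 x .05
print(discriminates(0.78, 0.78, n))  # False: no difference at all
```

With a smaller sample the standard error grows, so the same difference between correlations may no longer count as significant.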

38
Example: Two Subscales Being Developed Using
Multitrait Scaling
  • Depression and Anxiety subscales of MOS
    Psychological Distress measure

39
Example of Multitrait Scaling Matrix
                        Hypothesized Scales
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80        .65
  Tense, high strung      .83        .70
  Anxious, worried        .78        .78
  Restless, fidgety       .76        .68
DEPRESSION
  Low spirits             .75        .89
  Downhearted             .74        .88
  Depressed               .76        .90
  Moody                   .77        .82
40
Example of Multitrait Scaling Matrix: Item
Convergence
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80        .65
  Tense, high strung      .83        .70
  Anxious, worried        .78        .78
  Restless, fidgety       .76        .68
DEPRESSION
  Low spirits             .75        .89
  Downhearted             .74        .88
  Depressed               .76        .90
  Moody                   .77        .82
41
Example of Multitrait Scaling Matrix: Item
Discrimination
                        ANXIETY   DEPRESSION
ANXIETY
  Nervous person          .80        .65
  Tense, high strung      .83        .70
  Anxious, worried        .78        .78
  Restless, fidgety       .76        .68
DEPRESSION
  Low spirits             .75        .89
  Downhearted             .74        .88
  Depressed               .76        .90
  Moody                   .77        .82
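
The matrix above can be tabulated with a short Python sketch. The correlations are transcribed from the slide. The .40 convergence cutoff is the stricter of the two thresholds mentioned earlier, and discrimination is checked here by simple comparison only, without the significance test, which would need the sample size.

```python
# (item, hypothesized scale, r with ANXIETY, r with DEPRESSION)
matrix = [
    ("Nervous person",     "ANXIETY",    0.80, 0.65),
    ("Tense, high strung", "ANXIETY",    0.83, 0.70),
    ("Anxious, worried",   "ANXIETY",    0.78, 0.78),
    ("Restless, fidgety",  "ANXIETY",    0.76, 0.68),
    ("Low spirits",        "DEPRESSION", 0.75, 0.89),
    ("Downhearted",        "DEPRESSION", 0.74, 0.88),
    ("Depressed",          "DEPRESSION", 0.76, 0.90),
    ("Moody",              "DEPRESSION", 0.77, 0.82),
]
CONVERGENCE_MIN = 0.40

def own_and_other(row):
    """Return (r with hypothesized scale, r with the other scale)."""
    _, scale, r_anx, r_dep = row
    return (r_anx, r_dep) if scale == "ANXIETY" else (r_dep, r_anx)

converged = [row[0] for row in matrix
             if own_and_other(row)[0] > CONVERGENCE_MIN]
not_discriminating = [row[0] for row in matrix
                      if own_and_other(row)[0] <= own_and_other(row)[1]]

print(len(converged), "of", len(matrix), "items meet the convergence criterion")
print("Not higher on own scale:", not_discriminating)
# 8 of 8 items meet the convergence criterion
# Not higher on own scale: ['Anxious, worried']
```

Every item converges, but "Anxious, worried" correlates equally with both scales, so it fails discrimination even before any significance test is applied.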
42
Multitrait Scaling to Develop New Expectations
of Aging Measure
  • Pretested initial 94-item version (N = 58)
  • Eliminated items with
  • Missing data
  • Poor distributions
  • Low item-scale correlations
  • Field tested 56-item version (N = 588)
  • Eliminated more items
  • Low item-scale correlations
  • Weak item discriminant validity
  • Field tested again (N = 429)
  • 38 items, final scales

Sarkisian CA et al. Gerontologist 2002;42:534-542
43
Multitrait Scaling - An Approach to Constructing
Summated Rating Scales
  • Confirms whether hypothesized item groupings can
    be summed into a scale score
  • Examines extent to which all five criteria are
    met
  • Reports characteristics of resulting scales
  • A confirmatory method
  • Requires strong conceptual basis for hypothesized
    scales
  • Typically used for scales well along in testing

44
Multitrait Scaling Methods
  • Used at RAND in all health measurement
    development (e.g., MOS measures)
  • Method described in reading 1 for class 3
  • Stewart and Ware, 1992, pp 67-80

45
Multitrait Scaling Analysis Described by Ron Hays
(UCLA/RAND)
  • Hays RD, Wang E. (1992, April). Multitrait
    Scaling Program: MULTI. Proceedings of the
    Seventeenth Annual SAS Users Group International
    Conference, 1151-1156.
  • Hays RD et al. Behavior Research Methods,
    Instruments, and Computers, 1990;22:167-175

46
SAS Macro Available
  • Ron Hays also makes available a SAS macro for
    conducting multitrait scaling
  • You don't have to purchase software

http://gim.med.ucla.edu/FacultyPages/Hays/util.htm
Go to MULTI: sample program including macro
call (MULTI.sas) and its output (MULTI.out)
47
Using Factor Analysis to Develop Multi-Item Scales
  • For new measures in early developmental stages
  • Exploratory factor analysis of items can identify
    possible dimensions
  • Useful when starting with item pool with
    uncertainty about subdimensions

48
Patient Satisfaction with Pharmacy Services
  • No measures: started from scratch
  • Phase 1: pretested 44 items (N = 30)
  • Revised items
  • Phase 2: field tested 45 items (N = 313)
  • Exploratory factor analysis - 7 factors
  • Revised items

MacKeigan LD et al. Med Care 1989;27:522
49
Patient Satisfaction with Pharmacy Services
  • Phase 3: field tested 44 items (N = 389)
  • Exploratory factor analysis - 8 factors (56% of
    variance)
  • Items retained with factor loadings > 0.40

MacKeigan LD et al. Med Care 1989;27:522
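
The retention rule (keep items whose factor loading exceeds 0.40) can be sketched in Python. The loadings below are invented for illustration and are not from the pharmacy study. Assigning each item to its highest-loading factor and comparing absolute values are also assumptions.

```python
from typing import Dict, List

THRESHOLD = 0.40

# Hypothetical rotated loadings: item -> loading on each of three factors
loadings: Dict[str, List[float]] = {
    "item_a": [0.72, 0.10, 0.05],
    "item_b": [0.38, 0.35, 0.12],
    "item_c": [0.08, 0.55, 0.21],
    "item_d": [0.15, 0.22, 0.41],
}

def assign_items(loadings: Dict[str, List[float]],
                 threshold: float = THRESHOLD) -> Dict[str, int]:
    """Map each retained item to the index of its highest-loading factor."""
    retained = {}
    for item, values in loadings.items():
        best = max(range(len(values)), key=lambda j: abs(values[j]))
        if abs(values[best]) > threshold:
            retained[item] = best
    return retained

print(assign_items(loadings))  # {'item_a': 0, 'item_c': 1, 'item_d': 2}
```

Here `item_b` is dropped because it loads above 0.40 on no factor. The loadings themselves would come from an exploratory factor analysis run in a statistics package.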
50
Item Reduction by Analysis
52
Iterations of Pharmacy Services Measure
53
Iterative Approach Based on Exploratory Factor
Analysis (EFA)
  • Use EFA to identify possible subscales
  • Must make conceptual sense
  • Create summated scales with items in final
    factors
  • Test these using Multitrait Scaling analysis
  • Final scales a blend of the two methods

54
Overview of Class 3
  • Overview sequence of developing new measures
  • Rationale for multi-item measures
  • Scale construction methods
  • Steps in choosing appropriate measures for your
    study

55
Selecting Measures for Your Research
  • Goal: find a measure of your concept that has
    been developed using stringent measurement
    development methods
  • Your task: find measures and review them for all
    steps in measurement development process

56
Process of Selecting Good Measures for Your
Studies
Define concept (variable)
Identify potential measures
Review measures' properties
  -- conceptual adequacy
  -- psychometric adequacy
Pretest best 1-2 measures
Select final measure
57
Process of Selecting Good Measures for Your
Studies
Define concept (variable)
Identify potential measures
Review measures for
  -- conceptual adequacy
  -- psychometric adequacy
Pretest best 1-2 measures
Select final measure
59
Review Potential Measures for
  • Conceptual adequacy for your study
  • Psychometric adequacy in target group(s)
  • Practicality, acceptability in your study
  • Translations available if needed

60
Review Potential Measures for
  • Conceptual adequacy for your study
  • Psychometric adequacy in target group(s)
  • Practicality, acceptability in your study
  • Translations available if needed

Matrix provides template for reviewing measures
61
Conceptual Adequacy for Your Study
  • Concept being measured matches the concept you
    defined
  • Sometimes can only be determined by reviewing
    items
  • If not a perfect match
  • How close is it to your concept?
  • Can it be modified to get at missing components?

62
Conceptual Adequacy
  • You are interested in reports of perceived
    discrimination in health care setting
  • Measures of discrimination pertain to
  • Discrimination over the lifecourse
  • Discrimination in various settings (work, school)
  • Not adequate for your purpose

63
Psychometric Adequacy for Your Study
  • In samples similar to yours
  • good variability
  • good reliability
  • good validity
  • As an outcome for an intervention
  • responsiveness, sensitivity to change in similar
    populations

64
Specify Context
  • Study characteristics affecting choices of
    measures
  • Nature of target population
  • Practical constraints

65
Context: Nature of Population
  • Age (range, mean)
  • Health states
  • chronic conditions
  • frail, cognitively impaired
  • Socioeconomic status
  • low education
  • limited literacy
  • Race/ethnicity diversity
  • Language
  • limited English proficiency

66
Context: Practical Constraints
  • Time frame for completing study
  • Personnel
  • RAs, interviewers
  • Budget/funding
  • Data entry, mailings, follow-up non-responders
  • Preferred method of administration
  • Acceptable respondent burden

67
Practical Considerations: Match to Your Context
  • Permission needed to use
  • Cost of using
  • Method of administration
  • Data collection issues
  • Short forms if needed
  • Reading level
  • Translations if needed
  • Acceptability, respondent burden

68
Practical - Obtaining Permission
  • Need permission to use or to adapt?
  • Public domain
  • If items are published or in the public domain,
    usually don't need permission
  • Private or proprietary
  • Need to write to author or distributor
  • Allow 4-6 weeks to obtain measure and/or
    permission

69
Permission Statements: RAND
  • About Our Surveys/Permissions
  • All of the surveys from RAND Health are public
    documents, available without charge (for
    non-commercial purposes).
  • Please provide an appropriate citation when using
    these products. In some cases, the materials
    themselves include specific instructions for
    citation.

70
Permission Statements: RAND
  • Pediatric Quality of Life Inventory (PedsQL)
  • The PedsQL™ 4.0 Measurement Model is a modular
    approach to measuring health-related quality of
    life in both healthy children and adolescents and
    in those with acute and chronic health
    conditions. The survey integrates generic core
    scales and disease-specific modules
  • The survey instrument, guidelines for
    administering it, and scoring instructions are
    available at http://www.pedsql.org

71
Visual Function Questionnaire (VFQ)
  • The VFQ-25 is a public document available without
    charge to all researchers provided that they
    identify the measure as such in all publications.
    Users should also cite the following article:
  • Mangione, C. M., Lee, P. P., Gutierrez, P. R.,
    Spritzer, K., Berry, S., Hays, R. D. (2001).
    Development of the 25-item National Eye Institute
    Visual Function Questionnaire (VFQ-25). Archives
    of Ophthalmology, 119, 1050-1058.
  • Specific permissions for using the VFQ-25 are
    detailed on the cover page of the questionnaire
    itself.

72
Permission to Modify Measures
  • Permission to modify or adapt measure
  • Especially if you think you might need to
  • Modifications: future class

73
Practical - Cost to Use or to Score Measures
  • Cost of administering and scoring
  • Fee per person?
  • Cost of any needed scoring software?
  • Cost to have it scored by source?

74
SF-36 Example of Proprietary Measure
  • http://www.sf-36.org
  • Information on SF-36 and SF-36v2
  • Can obtain permission to use
  • Need to purchase manuals and scoring materials

75
Scoring SF-36 and the Two Summary Scales (PCS and
MCS)
Was 108
Was 108
76
Block Nutrition Questionnaires
  • Extremely complicated to score nutritional intake
  • Requires professional scoring
  • Pricing example: Block 2005 FFQ
  • Purchase cost per booklet: $1.00
  • Processing cost per booklet
  • $7.50 for batches of 20-99
  • $6.25 for batches of 100-499
  • $5.25 for batches of 500 or more

http://www.nutritionquest.com/products/pricing.htm
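
For budgeting, the tiered pricing above can be turned into a small calculator. This Python sketch uses the figures from the slide (2009-era prices, taken here to be US dollars). It assumes all booklets are processed as a single batch, and it raises an error for fewer than 20 booklets because the slide lists no price for that case.

```python
PURCHASE_PER_BOOKLET = 1.00

# (minimum batch size, processing cost per booklet), largest tier first
PROCESSING_TIERS = [(500, 5.25), (100, 6.25), (20, 7.50)]

def block_ffq_cost(n_booklets: int) -> float:
    """Total purchase plus processing cost for one batch of booklets."""
    for minimum, per_booklet in PROCESSING_TIERS:
        if n_booklets >= minimum:
            return n_booklets * (PURCHASE_PER_BOOKLET + per_booklet)
    raise ValueError("no price listed for batches under 20 booklets")

print(block_ffq_cost(50))   # 425.0
print(block_ffq_cost(150))  # 1087.5
print(block_ffq_cost(500))  # 3125.0
```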
77
Practical - Method of Administration
  • Face-to-face interview
  • Telephone interview
  • Self-administered questionnaire
  • A combination
  • Proxy (any of the above)
  • Each method introduces some type of bias
  • The first two methods require interviewer training

78
Practical - Data Collection Issues
  • How will you collect data?
  • Paper?
  • Directly into computer (CATI)
  • Will data entry be necessary?
  • If so, what data entry program will you use?
  • Survey format depends on data entry plans
  • Teleforms are faxed

79
Practical - Scoring
  • Are scoring instructions clearly documented?
  • Do you have a scoring codebook?
  • Are computer scoring programs available?
  • (Cost of scoring)

80
Practical - Short Forms?
  • Are there reliable and valid short forms
    available if you need them?
  • Many measures have short forms, but they
    typically have not been tested as thoroughly
  • Shorter forms can have lower variability,
    reliability, validity, and sensitivity to change

81
Practical - Reading Level
  • Is reading level appropriate for your target
    population?
  • Special concern in lower SES, limited English
    proficiency groups
  • If reading level not known
  • Make your own judgment
  • Pretest with target population

82
Acceptability
  • Ease with which measure can be used in your
    setting and population
  • Acceptability to target population
  • respondent burden
  • culturally sensitive
  • Acceptability to interviewers
  • amount of training needed

83
Respondent Burden
  • Real burden
  • Length, convenience, time needed to complete
  • Perceived burden
  • a function of item difficulty, distress due to
    content, perceived value of survey, expectations
    of length
  • Some population subgroups may have more
    difficulty, take longer to complete

84
Availability of Translations if Needed
  • If you need measure in another language, are
    there translations available?
  • Official (published and tested)
  • Unofficial (by some other researcher)

85
Translation Availability and Quality
  • Is the measure available in the language of your
    target populations?

No
Yes
  • Know method of translation
  • Assess adequacy or quality of translation
  • Perform translation using state-of-the-art
    methods
  • A resource issue

86
Homework: Download matrix
  • Select two of the measures from those you found
    that are most likely to meet your needs
  • You will review these for the remainder of the
    class
  • Complete rows 1-12 on matrix for the two measures
    you selected
  • Overview, definition of concept, method of
    administration, scale construction methods,
    description of measure

87
Handout: Outline of Final Paper
  • May help in planning your review with the matrix

88
NOTE: Class Schedule Change
  • No lecture November 19
  • Class extends to December 10
  • Final due December 17