Advising on test validity

Slides: 54
Provided by: dennybo

Transcript and Presenter's Notes


1
Advising on test validity
Denny Borsboom
University of Amsterdam
2
or
3
Things that keep me awake at night
4
  • Overview
  • Rocks and hard places
  • The psychometric orthodoxy
  • The validity problem
  • What I think of validity
  • What I advise on validity
  • Even more miscellaneous issues

5
(No Transcript)
6
(Figure labels: flying, litter, environmentalism, attitude, relevance, others)
7
Tell the researcher to do a PCA and be done with it!
Do what you can to further real scientific progress!
8
The Psychometric Orthodoxy
  • Make up a number of items you think are related
    to a construct
  • Compute Cronbach's α
  • Run a principal components analysis
  • If the scree plot drops steeply, and α > .75, use
    the sum score for research
  • Plug the sum score into experimental designs, ANOVAs,
    behavior genetic analyses, fMRI studies, etc.
  • Publish results
  • Worry about validity
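
The Cronbach's alpha and sum score steps of this recipe can be sketched in a few lines of Python. This is a minimal sketch, not the presenter's code. The data are simulated and hypothetical: five items that each equal one shared factor plus independent noise. The function and variable names are illustrative, and the scree-plot check from the PCA step is left out.

```python
import random
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items; `items` is a list of k equal-length score lists."""
    k = len(items)
    # Sum score per respondent: add that respondent's scores across all items.
    sum_scores = [sum(person) for person in zip(*items)]
    item_variance_total = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance_total / variance(sum_scores))


# Hypothetical data: 5 items, each one shared factor plus independent noise.
rng = random.Random(0)
n_respondents, n_items = 500, 5
factor = [rng.gauss(0, 1) for _ in range(n_respondents)]
items = [[f + rng.gauss(0, 1) for f in factor] for _ in range(n_items)]

alpha = cronbach_alpha(items)
sum_scores = [sum(person) for person in zip(*items)]
use_sum_score = alpha > 0.75  # the cutoff from the slide; scree check not shown
print(f"alpha = {alpha:.2f}, use sum score: {use_sum_score}")
```

Because the simulated items are built from a single shared factor, a high alpha here is true by construction. The number on its own says nothing about whether real items share a common cause.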

9
Disclaimer
  • The psychometric orthodoxy works perfectly for
    mundane goals, like
  • getting publishable results
  • predicting all sorts of things
  • building careers in psychology
  • That is not what I am concerned about

10
(No Transcript)
11
validity: does the test really measure environmentalism?
12
The construct validity doctrine
  • To study validity, one should
    - compute correlations with similar variables
    - compute correlations with dissimilar variables
    - examine group differences
    - etc.
  • Results will typically be inconclusive

13
The question of validity
  • What does it mean to really measure something?
  • Does it mean more than to just measure
    something?
  • And who is taking care of the measurement
    problem in the first place?

14
we assume tests are valid and take it from there
methodology mountain
validity? why don't we ask the methodologist?!
substantive psychology ville
15
(No Transcript)
16
Four questions
  • what do our models assume?
  • do these assumptions make sense in psychology?
  • what are we really doing?
  • should this keep me awake at night?

17
Four questions
  • what do our models assume? <- common causes
  • do these assumptions make sense in psychology? <-
    no
  • what are we really doing? <- something else
  • should this keep me awake at night? <- ?

18
Measurement models
19
(No Transcript)
20
(No Transcript)
21
(No Transcript)
22
Correlation
Correlation
23
Correlation
Correlation
Size of fire
24
No correlation
Size of fire
25
Local Independence
Size of fire
26
Correlation
Correlation
I feel comfortable around people
I am the life of the party
27
Reflective measurement model
I feel comfortable around people
I am the life of the party
28
Reflective measurement models
  • Are an instantiation of a common cause structure
  • So what causal process links environmentalism
    to my decision to fly or not to fly?
  • And what element of that process is the same one
    that causes me to throw litter in the trashcan?
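
What a common cause structure implies statistically can be illustrated with a small simulation. This is a sketch under stated assumptions: `cause` stands in for the common cause (the size of the fire in the earlier slides), and `indicator_a` and `indicator_b` are hypothetical indicators that each depend on it plus independent noise. The two indicators correlate, but the partial correlation given the cause is close to zero, which is what local independence means.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def partial_corr(x, y, z):
    """Correlation between x and y after controlling for z."""
    rxy, rxz, ryz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (rxy - rxz * ryz) / sqrt((1 - rxz ** 2) * (1 - ryz ** 2))


# Hypothetical common cause and two indicators that depend only on it.
rng = random.Random(1)
n = 2000
cause = [rng.gauss(0, 1) for _ in range(n)]
indicator_a = [c + rng.gauss(0, 1) for c in cause]
indicator_b = [c + rng.gauss(0, 1) for c in cause]

marginal = pearson(indicator_a, indicator_b)
conditional = partial_corr(indicator_a, indicator_b, cause)
print(f"marginal r = {marginal:.2f}, r given the cause = {conditional:.2f}")
```

In real testing the common cause is unobserved, so this check cannot be run directly. The simulation only shows what the reflective model assumes.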

29
Reflective measurement
Temperature
30
Reflective measurement with one item
  • What makes one thermometer a valid measurement
    instrument for temperature?
  • Its outcomes causally depend on temperature
  • The specification of this causal link is the most
    important problem in assessing validity

31
Essence
attribute
test score
causal process
32
How plausible is this...
  • ...for environmentalism and flying?
  • ...for intelligence and IQ-scores?
  • ...for personality and the Big Five?
  • ...for depression and DSM-diagnoses?
  • ...

33
The Psychometric Orthodoxy
  • Make up a number of items you think are related
    to a construct
  • Compute Cronbach's α
  • Run a principal components analysis
  • If the scree plot drops steeply, and α > .75, use
    the sum score for research
  • Plug the sum score into experimental designs, ANOVAs,
    behavior genetic analyses, fMRI studies, etc.
  • Publish results
  • Worry about validity

34
So what are we really doing?
35
(Figure labels: significant others, KLM attitude, flying, self-efficacy, litter, educational level, annual income, job performance, sex, numerical ability, SES, physique, genetic differences, length)
36
(Figure labels: significant others, KLM attitude, flying, self-efficacy, litter, educational level, annual income, job performance, sex, numerical ability, SES, shower, genetic differences, length)
37
(Figure labels: environmentalism, significant others, KLM attitude, flying, self-efficacy, litter, educational level, annual income, job performance, sex, numerical ability, SES, shower, genetic differences, length)
38
We are constructing variables out of other
variables, and labeling them as constructs
39
Advice implications?
  • So I think that psychology's measurement story
    is implausible in many cases
  • I do not believe that it is true for
    environmentalism and flying
  • Should this play a role in my methodological
    advice?

40
NO
41
Reasons
  • I do not represent a majority position
  • I do not know for sure that I'm right
  • I am uncertain what the alternative should be
  • This is not the researcher's problem until the
    scientific community makes it his or her problem

42
Catharsis
  • So what I do instead is try to solve the
    researcher's problem (not mine)
  • Try to push the scientific and methodological
    literature in the direction I think should be
    labelled forward
  • Wait for alternative ideas to catch on, and the
    consensus to change

43
Message
  • When you are advising, you are a window between
    the methodological literature and your client
  • If the methodological literature thinks that
    constructs are o.k., and your client agrees, then
    you are not in a position to advertise your
    hangups
  • Researchers should not suffer from your problems

44
But...
45
(No Transcript)
46
Example 1
  • A researcher wants to do an ANOVA to see whether
    people score higher on optimism than they do on
    extraversion
  • Two different scales, used to measure two
    different attributes, thrown into a
    repeated-measures ANOVA
  • This is nonsense and will always be nonsense

47
(No Transcript)
48
Example 2
  • An organization wants to estimate the proportion
    of alternative healers that are involved in
    malpractice
  • They have a very small, very biased sample
  • This is not a responsible course of action

49
(No Transcript)
50
Example 3
  • An fMRI researcher wants to interpret
    correlations in very small subgroups (n = 8)
  • She wants to satisfy a reviewer and conclude that
    the correlation is higher in group A than in
    group B
  • Pragmatically, I understand; scientifically, I
    think it's nonsense
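
A rough calculation shows how little two tiny subgroups can say about a difference in correlations. This is a sketch under stated assumptions: two independent groups of 8 each, and made-up correlations of .70 and .20. The Fisher z-test is a standard choice for this comparison, not one the slides name.

```python
from math import atanh, sqrt, tanh
from statistics import NormalDist


def compare_correlations(r_a, n_a, r_b, n_b):
    """Two-sided Fisher z-test for two independent correlations."""
    se = sqrt(1 / (n_a - 3) + 1 / (n_b - 3))
    z = (atanh(r_a) - atanh(r_b)) / se
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p


def correlation_ci(r, n, level=0.95):
    """Approximate confidence interval for one correlation via Fisher's z."""
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = crit / sqrt(n - 3)
    return tanh(atanh(r) - half_width), tanh(atanh(r) + half_width)


# Hypothetical values: r = .70 in group A, r = .20 in group B, 8 people each.
z, p = compare_correlations(0.70, 8, 0.20, 8)
low, high = correlation_ci(0.70, 8)
print(f"z = {z:.2f}, p = {p:.2f}")
print(f"95% CI for r = .70 with n = 8: [{low:.2f}, {high:.2f}]")
```

Under these assumptions even a gap of .50 between the two correlations is not statistically distinguishable from no difference, and the interval around the larger correlation covers nearly the whole positive range.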

51
(No Transcript)
52
Conclusion
  • In my experience, most methodologists do have
    conflicts now and then
  • I think that's part of methodological life
  • We should not burden clients with our personal
    hangups
  • However, neither do we have the responsibility to
    always satisfy our clients

53
Science