Title: General Latent Variable Modeling Approaches to Measurement Issues using Mplus
1General Latent Variable Modeling Approaches to
Measurement Issues using Mplus
- Rich Jones jones_at_mail.hrca.harvard.edu
- Psychometrics Workshop
- Friday Harbor, San Juan Island, WA
- August 24, 2005
2Overview
- Part 1
- IRT overview
- DIF overview
- Part 2
- IRT via Factor Analysis
- Factor analysis and general latent variable models for measurement issues using Mplus
- Limitations of Mplus approach
- Part 3
- Applied Example
- Part 4 (time permitting)
- Bells and Whistles
- Discussion
3Part 1a
4Semantics
- Multiple Fields, Conflicting Language
- Educational Testing, Psychological Measurement, Epidemiology & Biostatistics, Psychometrics & Structural Equation Modeling
- Characteristics of People
- ability, trait, state, construct, factor level, item response
- Characteristics of Items
- difficulty, severity, threshold, location
- discrimination, sensitivity, factor loading, measurement slope
5Key Ideas of IRT
- Persons have a certain ability or trait
- Items have characteristics
- difficulty (how hard the item is)
- discrimination (how well the item measures the ability)
- (I won't talk about guessing)
- Person ability and item characteristics are estimated simultaneously and expressed on a unified metric
- Interval-level measure of ability or trait
- Used to be hard to do
6Some Things You Can Do with IRT
- Refine measures
- Identify biased test items
- Adaptive testing
- Handle missing data at the item level
- Equate measures
7Latent Ability / Trait
- Symbolized with $\theta_i$ or $\eta_i$
- Assumed to be continuously, and often normally, distributed in the population
- The more of the trait a person has, the more likely they are to ...whatever... (endorse the symptom, get the answer right, etc.)
- The latent trait is that unobservable, hypothetical construct presumed to be measured by the test (assumed to cause item responses)
8Item Characteristic Curve
- The fundamental conceptual unit of IRT
- Relates item responses to the ability presumed to cause them
- Represented with cumulative logistic or cumulative normal forms
9Item Response Function
$P(y_{ij} = 1 \mid \theta_i) = F[a_j(\theta_i - b_j)]$
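The item response function on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code: the function names and the parameter values are hypothetical, and F is taken to be either the cumulative logistic or the cumulative normal, the two forms named on the previous slide.

```python
import math

def icc_logistic(theta, a, b):
    """P(y = 1 | theta) with F = cumulative logistic."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def icc_normal(theta, a, b):
    """P(y = 1 | theta) with F = cumulative standard normal."""
    z = a * (theta - b)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical item: discrimination a = 1.0, difficulty b = 0.0
for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(icc_logistic(theta, 1.0, 0.0), 3),
          round(icc_normal(theta, 1.0, 0.0), 3))
```

In both forms, a person whose ability equals the item difficulty (theta = b) has a probability of 0.5 of a positive response, and a larger a makes the curve steeper around b.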
11Example of an Item Characteristic Curve: High Ability
12Example of an Item Characteristic Curve: Low Ability
13Example of an Item Characteristic Curve: Item Difficulty
14Example of two ICCs that Differ in Difficulty
15Example of an Item Characteristic Curve: Item Discrimination
16Example of two ICCs that Differ in Discrimination
17Item Response Function
18Extra Credit: one way to get estimates of underlying ability
Remember Bayes' Theorem
19Extra Credit: one way to get estimates of underlying ability
Bayes modal estimates of latent ability ($\eta$) (modal a posteriori, MAP, estimates)
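As a rough sketch of the idea, a MAP estimate is the ability value that maximizes the posterior, which by Bayes' theorem is proportional to the likelihood of the observed responses times the prior on ability. The example below assumes a standard normal prior, a logistic item response function, known item parameters, and a simple grid search; all function names and parameter values are hypothetical and chosen only for illustration.

```python
import math

def p_positive(ability, a, b):
    """Logistic item response function (assumed form)."""
    return 1.0 / (1.0 + math.exp(-a * (ability - b)))

def log_posterior(ability, responses, items):
    """Log prior N(0, 1) plus log likelihood, up to an additive constant."""
    lp = -0.5 * ability * ability
    for y, (a, b) in zip(responses, items):
        p = p_positive(ability, a, b)
        lp += math.log(p) if y == 1 else math.log(1.0 - p)
    return lp

def map_estimate(responses, items, lo=-4.0, hi=4.0, steps=801):
    """Grid search for the posterior mode on [lo, hi]."""
    grid = [lo + i * (hi - lo) / (steps - 1) for i in range(steps)]
    return max(grid, key=lambda t: log_posterior(t, responses, items))

# Hypothetical (a, b) pairs for four items
items = [(1.0, -1.0), (1.2, 0.0), (0.8, 1.0), (1.5, 0.5)]
high = map_estimate([1, 1, 1, 1], items)
mixed = map_estimate([1, 1, 0, 0], items)
low = map_estimate([0, 0, 0, 0], items)
print(low, mixed, high)
```

Because the prior pulls estimates toward zero, even all-correct or all-incorrect response patterns yield finite estimates, which is one practical reason to prefer MAP over plain maximum likelihood for short tests.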
20Part 1b
21Identify Biased Test Items: Differential Item Functioning (DIF)
- Differences in likelihood of error on a given item may be due to
- group differences in ability
- item bias
- both
- IRT can parse this out
- Item Bias = Differential Item Functioning + Rationale
- Most workers in IRT identify DIF when two groups do not have the same ICC
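To make "two groups do not have the same ICC" concrete, the sketch below compares a hypothetical item in a reference and a focal group. The parameter values are invented for illustration: the groups share the same discrimination but the item is more difficult in the focal group, so people of equal ability differ in their probability of a positive response.

```python
import math

def icc(theta, a, b):
    """Logistic item characteristic curve (assumed form)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical group-specific item parameters
reference = {"a": 1.2, "b": 0.0}
focal = {"a": 1.2, "b": 0.5}  # same slope, higher difficulty

for theta in (-1.0, 0.0, 1.0):
    gap = icc(theta, **reference) - icc(theta, **focal)
    print(f"theta={theta:+.1f}  reference-minus-focal gap={gap:.3f}")
```

The gap is positive at every ability level here, which is the pattern usually called uniform DIF; group-specific discriminations would instead let the curves cross.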
25Part 2
26IRT and Factor Analysis
- IRT describes a class of statistical models
- IRT models can be estimated using factor analysis
- Appropriate routines for ordinal dependent variables (tetrachoric/polychoric correlation coefficients)
- Factor analysis models can be extended in very general ways using structural equation modeling techniques / software
27- www.statmodel.com
- Used to be LISCOMP, owes lineage to LISREL
- Does just about everything that other continuous latent variable / structural equation software implements (LISREL, EQS, AMOS, CALIS)
- Plus, very general latent variable modeling
- Continuous latent variables (latent traits)
- Categorical latent variables (latent classes, mixtures)
- Missing data
- Estimation with data from complex designs
- Expensive, demo version available
28Mplus approach to IRT Model
- One- or two-parameter IRT models (not explicit)
- Discrimination: factor loadings/slopes
- Difficulty: item thresholds
- Two estimation methods
- Weighted Least Squares (WLS)
- Limited information
- Multivariate probit (theta or delta parameterization)
- Latent response variable formulation (assume underlying continuous variables)
- Maximum Likelihood (ML)
- Full information
- Multivariate logistic
- Conditional probability formulation
- More experience, fit statistics with WLS
- Some model types require ML, others WLS
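The correspondence between factor-analytic and IRT parameters can be sketched as follows. This is an illustration under stated assumptions, not output from Mplus: a single factor standardized to mean 0 and variance 1, a probit (normal ogive) link, and the usual convention that the delta parameterization fixes the variance of the latent response at 1 while the theta parameterization fixes the residual variance at 1. The function name and the example values are hypothetical.

```python
import math

def irt_from_factor(loading, threshold, parameterization="delta"):
    """Convert a loading and threshold to normal-ogive a and b.

    Assumes one factor with mean 0 and variance 1.
    """
    if parameterization == "delta":
        # Latent response variance fixed at 1, so residual variance is 1 - loading^2
        resid_sd = math.sqrt(1.0 - loading ** 2)
    elif parameterization == "theta":
        # Residual variance fixed at 1
        resid_sd = 1.0
    else:
        raise ValueError("parameterization must be 'delta' or 'theta'")
    a = loading / resid_sd      # discrimination
    b = threshold / loading     # difficulty, in the metric of the factor
    return a, b

a, b = irt_from_factor(0.6, 0.3, "delta")
print(round(a, 3), round(b, 3))
```

Both conversions follow from writing the probability that the latent response exceeds the threshold as a normal CDF in a(ability - b); standard errors for the converted values need extra work (for example the delta method) and are not covered here.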
29Latent Response Variable Formulation (picture)
30Latent Response Variable Formulation (words)
- Assume the observed ordinal (dichotomous) $y$ has a corresponding underlying continuous, normal, but unobservable (latent) form ($y^*$)
- When a person's value for $y^*$ exceeds some threshold ($\tau$), $y = 1$ is observed; otherwise, $y = 0$ is observed
- Analysis is focused on the relationships among the $y^*$ and on estimating the thresholds ($\tau$)
31Latent Response Variable Formulation (equation)
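One common way to write the latent response variable formulation for a single factor is sketched below. The notation is an assumption rather than taken from the slide: $\lambda_j$ is the factor loading of item $j$, $\tau_j$ its threshold, $\varepsilon_{ij}$ a normally distributed residual with standard deviation $\sigma_{\varepsilon_j}$, and $\Phi$ the standard normal CDF.

```latex
y^*_{ij} = \lambda_j \eta_i + \varepsilon_{ij}, \qquad \varepsilon_{ij} \sim N(0, \sigma^2_{\varepsilon_j})

y_{ij} =
\begin{cases}
1 & \text{if } y^*_{ij} > \tau_j \\
0 & \text{otherwise}
\end{cases}

P(y_{ij} = 1 \mid \eta_i)
  = P(\varepsilon_{ij} > \tau_j - \lambda_j \eta_i)
  = \Phi\!\left( \frac{\lambda_j \eta_i - \tau_j}{\sigma_{\varepsilon_j}} \right)
```

Written this way, the model is a normal-ogive item response function with discrimination $\lambda_j / \sigma_{\varepsilon_j}$ and difficulty $\tau_j / \lambda_j$.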
32Conditional Probability Formulation
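A hedged sketch of the conditional probability formulation for a dichotomous item with a logistic link follows. The symbols are assumptions, not taken from the slide: $\lambda_j$ is the slope (loading) and $\tau_j$ the threshold of item $j$. No underlying continuous response is assumed; the probability of the observed response is modeled directly.

```latex
P(y_{ij} = 1 \mid \eta_i)
  = \frac{1}{1 + \exp\!\left[ -(\lambda_j \eta_i - \tau_j) \right]}
  = \frac{1}{1 + \exp\!\left[ -a_j (\eta_i - b_j) \right]},
\qquad a_j = \lambda_j, \quad b_j = \frac{\tau_j}{\lambda_j}
```

The second form shows the link to the two-parameter logistic IRT model when the factor is scaled to mean 0 and variance 1.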
33Factor Analysis Model
34Factor Analysis Model
35Factor Analysis with Covariates
36Multiple Group CFA
37Multiple Group (MG) MIMIC
38MIMIC and MG-MIMIC Model
- Disadvantages
- Not so good for factor score generation
- Not exactly the IRT model
- different conceptualization of NU-DIF
- Some work to get a's, b's, and standard errors
- Relatively little experience / literature in field
- Confusing / overlapping measurement noninvariance literature from SEM field
39MIMIC and MG-MIMIC Model
- Advantages
- Can be easy to estimate, good for modeling
- No need to equate parameters
- No data re-arrangements required, missing data tricks
- Simultaneous analysis/evaluation of all items and possible sources of model mis-fit (including potential DIF or bias)
- Multiple independent variables (with DIF)
- Ys and Xs can be categorical or continuous
- Anchor items not necessary, but...
- Embed in more complex models
- Complementary measurement noninvariance literature from SEM field
40MIMIC Model how to do it
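From within Stata using runmplus.ado:
runmplus y1-y4 x1, categorical(y1-y4) type(meanstructure) model(eta by y1-y4; eta@1; eta on x1; y1 on x1;)
Mplus syntax file:
Title: MIMIC model
Data: File is __000001.dat;
Variable: Names are y1 y2 y3 y4 x1;
  categorical = y1-y4;
Analysis: type = meanstructure;
MODEL: eta by y1-y4;   ! measurement model: eta measured by y1-y4
  eta@1;               ! fix the (residual) variance of eta at 1
  eta on x1;           ! group difference in the latent trait
  y1 on x1;            ! direct effect of x1 on y1 (DIF in item y1)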
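To see what this MIMIC specification implies for data, the sketch below simulates responses from a model of the same shape: a latent trait regressed on x1, four dichotomous indicators, and a direct effect of x1 on y1 only. It is an illustration under assumptions, not part of the original analysis: the latent response residuals are standard normal (probit-style), and every parameter value and name is hypothetical.

```python
import random

def simulate_mimic(n, loadings, thresholds, gamma, direct, seed=0):
    """Simulate rows [y1, ..., yk, x1] from a one-factor MIMIC model."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1 = rng.randint(0, 1)                  # binary covariate (group)
        eta = gamma * x1 + rng.gauss(0.0, 1.0)  # "eta on x1", residual variance 1
        ys = []
        for lam, tau, beta in zip(loadings, thresholds, direct):
            # Latent response: loading * eta + direct effect of x1 + residual
            y_star = lam * eta + beta * x1 + rng.gauss(0.0, 1.0)
            ys.append(1 if y_star > tau else 0)
        rows.append(ys + [x1])
    return rows

rows = simulate_mimic(
    n=2000,
    loadings=[1.0, 0.8, 1.2, 0.9],
    thresholds=[0.0, 0.2, -0.3, 0.5],
    gamma=0.5,                     # x1 = 1 group is higher on eta
    direct=[-0.7, 0.0, 0.0, 0.0],  # only y1 has a direct effect (DIF)
)
print(rows[:3])
```

A nonzero direct effect means that, at the same level of eta, the two groups differ in their probability of a positive response on y1, which is how DIF appears in this framework.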
41Some Applied Examples and Technical Articles
- Muthén, B. O. (1989). Latent variable modeling in heterogeneous populations. Meetings of Psychometric Society (1989, Los Angeles, California and Leuven, Belgium). Psychometrika, 54(4), 557-585.
- McArdle, J., & Prescott, C. (1992). Age-based construct validation using structural equation modeling. Experimental Aging Research, 18(3), 87-116.
- Gallo, J. J., Anthony, J. C., & Muthén, B. O. (1994). Age differences in the symptoms of depression: a latent trait analysis. Journals of Gerontology, 49(6), 251-264.
- Salthouse, T., Hancock, H., Meinz, E., & Hambrick, D. (1996). Interrelations of age, visual acuity, and cognitive functioning. Journal of Gerontology: Psychological Sciences, 51B(6), P317-P330.
- Grayson, D. A., Mackinnon, A., Jorm, A. F., Creasey, H., & Broe, G. A. (2000). Item bias in the Center for Epidemiologic Studies Depression Scale: effects of physical disorders and disability in an elderly community sample. The Journals of Gerontology. Series B, Psychological Sciences and Social Sciences, 55(5), 273-282.
- Jones, R. N., & Gallo, J. J. (2002). Education and sex differences in the Mini Mental State Examination: Effects of differential item functioning. The Journals of Gerontology. Series B, Psychological Sciences and Social Sciences, 57B(6), P548-558.
- Macintosh, R., & Hashim, S. (2003). Variance Estimation for Converting MIMIC Model Parameters to IRT Parameters in DIF Analysis. Applied Psychological Measurement, 27(5), 372-379.
- Rubio, D.-M., Berg-Weger, M., Tebb, S.-S., & Rauch, S.-M. (2003). Validating a measure across groups: The use of MIMIC models in scale development. Journal of Social Service Research, 29(3), 53-68.
- Fleishman, J. A., & Lawrence, W. F. (2003). Demographic variation in SF-12 scores: true differences or differential item functioning? Med Care, 41(7 Suppl), III75-III86.
- Jones, R. N. (2003). Racial bias in the assessment of cognitive functioning of older adults. Aging & Mental Health, 7(2), 83-102.
42Part 3
Jones, R. N. (2003). Racial bias in the assessment of cognitive functioning of older adults. Aging & Mental Health, 7(2), 83-102.
Acknowledgement: R03 AG017680
43Example: Racial bias in TICS (HRS/AHEAD)
- Nationally representative, very large sample (N = 15,257)
- Over-sample of Black or African-Americans (N = 2,090)
- Assessment of cognition
- Very adequate assessment of SES (education, income, occupation)
44Objective
- Evaluate the extent to which item-level performance reflects test-irrelevant variance due to race (White, non-Hispanic vs. Black or African-American participants)
- Control for main and potentially differential effects of background variables
- Sex, Age
- Educational attainment
- Household income, occupation groups
- Health Conditions and Health Behaviors
45TICS/AHEAD Measure of Cognitive Function (Herzog 1997)
- Item: Points
- Orientation to time (weekday, day, month, year): 4
- Name President, Vice-President: 2
- Name two objects (cactus, scissors): 2
- Count Backwards from 20: 1
- Serial Sevens: 5
- Immediate recall (10 nouns): 10
- Delayed free-recall (10 nouns, 5 min delay): 10
46Background Variables
- Sex
- Age (9 groups)
- Education (6 groups)
- Household Income (5 groups)
- Highest household occupation (8 groups)
- Health Conditions (HBP, DM, heart, stroke, arthritis, pulmonary, cancer)
- Health Behaviors (current smoking, drinking: three groups)
49Results
- All items show DIF by race, some by sex, age, education
- Effects of covariates (age, occupation, income, smoking status) differ significantly across racial groups
- Greater variance in latent cognitive function for Black or African-American participants
- No significant race difference in mean latent cognition after adjusting for measurement differences
Jones. Aging Ment Health, 2003;7:83-102.
50Differences in Underlying Ability between Whites
and African Americans
- 60% is due to measurement differences (DIF, item bias)
- 12% is due to main effects of background variables
- 7% is due to structural differences (i.e., interactions of group and background variables)
- What remains (about .2 SD) is not significantly different from no difference
Jones. Aging Ment Health, 2003;7:83-102.
51Differences in Underlying Ability ignoring measurement bias
Jones. Aging Ment Health, 2003;7:83-102.
52Differences in Underlying Ability after controlling for measurement bias
Jones. Aging Ment Health, 2003;7:83-102.
53Differences in Underlying Ability after controlling for measurement bias: interaction with age group
Jones. Aging Ment Health, 2003;7:83-102.
57Model Fit / Parsimony
- Model fitting accomplished more than shifting group differences in mental status to the item level
- New model provides greater fit to observed data using fit statistics that reward model parsimony
58Part 4
- Bells and Whistles
- Discussion
59Latent Growth Model
60Multiple Indicator Latent Growth Model
61Measurement Mixture Models
64Part 4b