Classical and Bayesian Computerized Adaptive Testing Algorithms

Slides: 47
Provided by: rjs4
Learn more at: http://www.samsi.info
Transcript and Presenter's Notes



1
Classical and Bayesian Computerized Adaptive
Testing Algorithms
  • Richard J. Swartz
  • Department of Biostatistics (rswartz_at_mdanderson.org)

2
Outline
  • Principle of computerized adaptive testing
  • Basic statistical concepts and notation
  • Trait estimation methods
  • Item selection methods
  • Comparisons between methods
  • Current CAT Research Topics

3
Computerized Adaptive Tests (CAT)
  • First developed for assessment testing
  • Test tailored to an individual
  • Only questions relevant to individual trait level
  • Shorter tests
  • Sequential adaptive selection problem
  • Requires item bank
  • Fit with IRT models
  • Extensive initial development before CAT
    implementation

4
Item Bank Development I
  • Qualitative item development
  • Content experts
  • Response categories
  • Test model fit
  • Likelihood ratio based methods
  • Model fit indices

5
Item Bank Development II
  • Test assumption: unidimensionality
  • Factor analysis
  • Confirmatory factor analysis
  • Multidimensional IRT models
  • Test assumption: local independence (check for local dependence)
  • Residual correlation after 1st factor removed
  • Multidimensional IRT models

6
Item Bank Development III
  • Test assumption: invariance
  • DIF: differential item functioning
  • Over time and across groups (e.g., men vs. women)
  • Across groups
  • Many different methods (Logistic Regression
    method, Area between response curves, and others)

7
CAT Implementation
[Figure: items from the item bank, numbered 1 to 15, arranged along a scale from low depression (Lo Depression) to high depression (Hi Depression)]
8
CAT Item Selection
9
Basic Concepts/ Notation
10
Basic Concepts/ Notation II
11
Trait Estimation
12
Estimating Traits
  • Assumes item parameters are known
  • Represents the individual's ability
  • Done sequentially in CAT
  • Estimate is updated after each additional
    response
  • Maximum Likelihood Estimator
  • Bayesian Estimators

13
Likelihood
  • Model describing a person's response pattern
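
The likelihood formula from this slide did not survive in the transcript. As a rough illustration only, the sketch below assumes a dichotomous two-parameter logistic (2PL) model and local independence. The item parameters `(a, b)` and the response pattern are hypothetical values, not taken from the presentation.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model, not the deck's)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def likelihood(theta, items, responses):
    """Product of item response probabilities, assuming local independence."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        p = p_endorse(theta, a, b)
        value *= p if u == 1 else 1.0 - p
    return value

items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]  # hypothetical (a, b) pairs
responses = [1, 1, 0]                          # hypothetical response pattern
for theta in (-2.0, 0.0, 2.0):
    print(theta, round(likelihood(theta, items, responses), 4))
```

Evaluating the function at several trait values shows how one response pattern is more or less plausible depending on the trait level.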

14
Maximum Likelihood Estimate
  • Frequentist: the trait value most likely to have
    generated the observed responses
  • Consistency and efficiency depend on the selection
    method and the item bank used.
  • Does not always exist
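
A minimal sketch of a grid-search MLE, assuming a dichotomous 2PL model with hypothetical item parameters; the grid bounds and step are arbitrary choices. It also illustrates the last point: when every item is endorsed, the likelihood keeps increasing and the search simply runs to the edge of the grid.

```python
import math

def log_likelihood(theta, items, responses):
    """2PL log-likelihood (assumed model); items are hypothetical (a, b) pairs."""
    total = 0.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        total += math.log(p) if u == 1 else math.log(1.0 - p)
    return total

def grid_mle(items, responses, lo=-4.0, hi=4.0, step=0.01):
    """Return the grid point with the highest log-likelihood."""
    n = int(round((hi - lo) / step))
    grid = [lo + i * step for i in range(n + 1)]
    return max(grid, key=lambda t: log_likelihood(t, items, responses))

items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
print(grid_mle(items, [1, 1, 0]))  # mixed pattern: interior maximum
print(grid_mle(items, [1, 1, 1]))  # all endorsed: runs to the upper grid bound
```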

15
Bayesian Framework
  • θ is a random variable
  • A distribution on θ describes knowledge prior to
    data collection (prior distribution)
  • Update information about θ (the trait) as data are
    collected (posterior distribution)
  • Describes the distribution of θ values instead of a
    point estimate

16
Bayes Rule
  • Combines information about θ (prior) with
    information from the data (likelihood)
  • Posterior ∝ Likelihood × Prior
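
One way to make the proportionality concrete is to evaluate likelihood times prior on a grid of trait values and normalize. This sketch assumes a standard normal prior and a dichotomous 2PL likelihood; both are illustrative assumptions, and the item parameters are hypothetical.

```python
import math

def normal_pdf(x, mean=0.0, sd=1.0):
    """Density of a normal distribution (used here as the assumed prior)."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def likelihood(theta, items, responses):
    """2PL likelihood (assumed model); items are hypothetical (a, b) pairs."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        value *= p if u == 1 else 1.0 - p
    return value

def grid_posterior(items, responses, grid):
    """Posterior weights on a grid: likelihood times prior, then normalized."""
    unnorm = [likelihood(t, items, responses) * normal_pdf(t) for t in grid]
    total = sum(unnorm)
    return [w / total for w in unnorm]

grid = [-4.0 + 0.1 * i for i in range(81)]
post = grid_posterior([(1.0, -1.0), (1.5, 0.0)], [1, 1], grid)
print(grid[post.index(max(post))])  # grid point with the largest posterior weight
```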

17
Maximum A Posteriori (MAP) Estimate
  • Properties
  • With a uniform prior, equivalent to the MLE over the
    support of the prior
  • For some prior/likelihood combinations, the posterior
    can be multimodal
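
The first property can be checked numerically. The sketch below assumes a 2PL likelihood with hypothetical item parameters and compares the grid maximizer of the likelihood with the MAP under a Uniform(-4, 4) prior and under a standard normal prior; the priors and grid are illustrative choices.

```python
import math

def likelihood(theta, items, responses):
    """2PL likelihood (assumed model); items are hypothetical (a, b) pairs."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        value *= p if u == 1 else 1.0 - p
    return value

def argmax_on_grid(values, grid):
    """Grid point at which `values` is largest."""
    best = max(range(len(grid)), key=lambda i: values[i])
    return grid[best]

grid = [-4.0 + 0.01 * i for i in range(801)]
items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
responses = [1, 1, 0]
like = [likelihood(t, items, responses) for t in grid]

uniform_prior = [0.125 for _ in grid]                  # Uniform(-4, 4) density
normal_prior = [math.exp(-0.5 * t * t) for t in grid]  # N(0, 1), unnormalized

mle = argmax_on_grid(like, grid)
map_uniform = argmax_on_grid([l * p for l, p in zip(like, uniform_prior)], grid)
map_normal = argmax_on_grid([l * p for l, p in zip(like, normal_prior)], grid)
print(mle, map_uniform, map_normal)
```

Under the uniform prior the MAP coincides with the grid MLE, while the normal prior pulls the estimate toward the prior mean of zero.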

18
Expected A Posteriori (EAP) Estimate
  • Properties
  • Always exists for a proper prior
  • Easy to calculate with numerical integration
    techniques
  • Prior influences estimate
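
A minimal sketch of an EAP estimate computed by simple numerical integration (a rectangle rule on an evenly spaced grid). The standard normal prior, the 2PL likelihood, the grid limits, and the item parameters are all assumptions made for illustration.

```python
import math

def likelihood(theta, items, responses):
    """2PL likelihood (assumed model); items are hypothetical (a, b) pairs."""
    value = 1.0
    for (a, b), u in zip(items, responses):
        p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
        value *= p if u == 1 else 1.0 - p
    return value

def eap(items, responses, lo=-6.0, hi=6.0, n_points=121):
    """Posterior mean under an assumed N(0, 1) prior, by a rectangle rule."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = [likelihood(t, items, responses) * math.exp(-0.5 * t * t)
               for t in grid]
    return sum(t * w for t, w in zip(grid, weights)) / sum(weights)

items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]
print(eap(items, [1, 1, 0]))
print(eap(items, [1, 1, 1]))  # stays finite even when every item is endorsed
```

With no responses at all the estimate falls back to the prior mean, which is one way to see how the prior influences the estimate on short tests.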

19
Posterior Variance
  • Describes the variability of θ
  • Its square root can be used as the conditional standard
    error of measurement (SEM) for a given response pattern.
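
As a sketch of how the posterior variance might be obtained in practice, the code below computes the posterior mean and variance on a grid and reports the posterior standard deviation as the SEM. It assumes a standard normal prior and a 2PL likelihood with hypothetical item parameters.

```python
import math

def posterior_mean_and_variance(items, responses, lo=-6.0, hi=6.0, n_points=121):
    """Grid approximation of the posterior mean and variance, N(0, 1) prior."""
    step = (hi - lo) / (n_points - 1)
    grid = [lo + i * step for i in range(n_points)]
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)  # unnormalized prior density
        for (a, b), u in zip(items, responses):
            p = 1.0 / (1.0 + math.exp(-a * (t - b)))  # assumed 2PL model
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    mean = sum(t * w for t, w in zip(grid, weights)) / total
    var = sum((t - mean) ** 2 * w for t, w in zip(grid, weights)) / total
    return mean, var

items = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0)]  # hypothetical (a, b) pairs
mean, var = posterior_mean_and_variance(items, [1, 1, 0])
sem = math.sqrt(var)  # posterior standard deviation used as the SEM
print(round(mean, 3), round(var, 3), round(sem, 3))
```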

20
ITEM SELECTION
21
Item Selection Algorithms
  • Choose the item that is best for the individual
    being tested
  • Define "best":
  • Most information about trait estimate
  • Greatest reduction in expected variability of
    trait estimate

22
Fisher's Information
  • Information of a given item at a trait value
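
The information formula itself is missing from the transcript. For a dichotomous 2PL item, the standard result is I(θ) = a² P(θ) (1 − P(θ)). The sketch below uses that form as an assumption; the study described later in the deck uses polytomous graded response items, whose information function is different.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item at theta: a^2 * P * (1 - P)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

# Hypothetical item with discrimination 1.5 and location 0.0.
for theta in (-1.0, 0.0, 1.0):
    print(theta, round(item_information(theta, a=1.5, b=0.0), 4))
```

Under this model an item is most informative where the trait value equals its location parameter `b`.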

23
Maximum Fisher's Information
  • Myopic algorithm
  • Pick the item i_k at stage k (i_k ∈ R_k, the items
    remaining in the bank) that maximizes Fisher's
    information at the current trait estimate
    (classically the MLE)
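
A minimal sketch of the selection step, assuming 2PL item information and a small hypothetical bank. The function name `select_mfi` and the representation of the remaining set as "all indices not yet administered" are illustrative choices.

```python
import math

def item_information(theta, a, b):
    """Fisher information of a 2PL item (assumed model)."""
    p = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return a * a * p * (1.0 - p)

def select_mfi(theta_hat, bank, administered):
    """Index of the remaining item with maximum information at theta_hat."""
    remaining = [i for i in range(len(bank)) if i not in administered]
    return max(remaining, key=lambda i: item_information(theta_hat, *bank[i]))

bank = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0), (2.0, 0.9)]  # hypothetical (a, b)
first = select_mfi(0.9, bank, administered=set())
second = select_mfi(0.9, bank, administered={first})
print(first, second)
```

The rule is myopic because it looks only one item ahead and treats the current point estimate as if it were the true trait value.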

24
MFI - Selection
25
Minimum Expected Posterior Variance (MEPV)
  • Selects the item that yields the minimum predicted
    posterior variance given previous responses
  • Uses predictive distribution
  • Is a myopic Bayesian decision theoretic approach
    (minimizes Bayes risk)
  • First described by Owen (1969, 1975)

26
Predictive Distribution
  • Predict the probability of a response to an item
    given previous responses
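
The predictive probability can be sketched as the item response probability averaged over the current posterior. The code assumes a 2PL model, a standard normal prior, and a grid approximation of the posterior; the items are hypothetical.

```python
import math

def p_endorse(theta, a, b):
    """2PL probability of endorsing an item (assumed model)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def grid_posterior(grid, items, responses):
    """Normalized grid posterior under an assumed N(0, 1) prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in zip(items, responses):
            p = p_endorse(t, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def predictive_prob(grid, items, responses, candidate):
    """P(next response = 1 | previous responses), averaged over the posterior."""
    post = grid_posterior(grid, items, responses)
    a, b = candidate
    return sum(p_endorse(t, a, b) * w for t, w in zip(grid, post))

grid = [-6.0 + 0.1 * i for i in range(121)]
candidate = (1.2, 0.0)  # hypothetical next item
before = predictive_prob(grid, [], [], candidate)
after = predictive_prob(grid, [(1.0, -1.0), (1.5, 0.5)], [1, 1], candidate)
print(round(before, 3), round(after, 3))
```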

27
Bayesian Decision Theory
  • Dictates optimal (sequential adaptive) decisions
  • In addition to prior and Likelihood, specify a
    loss function (squared error loss)

28
Bayesian Decision Theory Item Selection
  • Optimal estimator for Squared-error loss is
    posterior mean (EAP)
  • Select item that minimizes Bayes risk

29
Minimum Expected Posterior Variance (MEPV)
  • Pick the item i_k remaining in the bank at stage
    k (i_k ∈ R_k) that minimizes the expected
    posterior variance (with respect to the
    predictive distribution)
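
A sketch of the MEPV step for dichotomous items: for each candidate, the posterior variance that would result from each possible response is weighted by the predictive probability of that response. The 2PL model, the standard normal prior, the grid, and the bank are assumptions for illustration and are not the deck's implementation.

```python
import math

def p_endorse(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def posterior(grid, items, responses):
    """Normalized grid posterior under an assumed N(0, 1) prior."""
    weights = []
    for t in grid:
        w = math.exp(-0.5 * t * t)
        for (a, b), u in zip(items, responses):
            p = p_endorse(t, a, b)
            w *= p if u == 1 else 1.0 - p
        weights.append(w)
    total = sum(weights)
    return [w / total for w in weights]

def variance(grid, post):
    mean = sum(t * w for t, w in zip(grid, post))
    return sum((t - mean) ** 2 * w for t, w in zip(grid, post))

def expected_posterior_variance(grid, given, responses, candidate):
    """Posterior variance after the candidate, averaged over its responses."""
    post = posterior(grid, given, responses)
    a, b = candidate
    p1 = sum(p_endorse(t, a, b) * w for t, w in zip(grid, post))  # predictive
    total = 0.0
    for u, prob in ((1, p1), (0, 1.0 - p1)):
        new_post = posterior(grid, given + [candidate], responses + [u])
        total += prob * variance(grid, new_post)
    return total

def select_mepv(grid, bank, administered, responses):
    """Index of the remaining item with minimum expected posterior variance."""
    given = [bank[i] for i in administered]
    remaining = [i for i in range(len(bank)) if i not in administered]
    return min(remaining, key=lambda i: expected_posterior_variance(
        grid, given, responses, bank[i]))

grid = [-6.0 + 0.1 * i for i in range(121)]
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0), (2.0, 0.5)]  # hypothetical (a, b)
administered, responses = [0], [1]
chosen = select_mepv(grid, bank, administered, responses)
print(chosen)
```

Unlike MFI, this criterion uses the whole posterior rather than a single point estimate, at the cost of one posterior update per candidate item and possible response.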

30
Other Information Measures
  • Weighted Measures
  • Maximum Likelihood Weighted Fisher's
    Information (MLWI)
  • Maximum Posterior Weighted Fisher's Information
    (MPWI)
  • Kullback-Leibler information: global information
    measure
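
The weighted criteria can be sketched as item information averaged over a weight function: the likelihood for MLWI, or the likelihood times the prior (the posterior) for MPWI. The code assumes 2PL items, a standard normal prior, and a grid sum in place of the integral. The weights are left unnormalized because the normalizing constant is the same for every candidate and does not change the ranking.

```python
import math

def p_endorse(theta, a, b):
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of a 2PL item (assumed model)."""
    p = p_endorse(theta, a, b)
    return a * a * p * (1.0 - p)

def weighted_information(grid, given, responses, candidate, use_prior):
    """Information summed over likelihood (MLWI) or posterior (MPWI) weights."""
    total = 0.0
    for t in grid:
        w = math.exp(-0.5 * t * t) if use_prior else 1.0
        for (a, b), u in zip(given, responses):
            p = p_endorse(t, a, b)
            w *= p if u == 1 else 1.0 - p
        total += w * item_information(t, *candidate)
    return total

def select_weighted(grid, bank, administered, responses, use_prior):
    """Index of the remaining item with the largest weighted information."""
    given = [bank[i] for i in administered]
    remaining = [i for i in range(len(bank)) if i not in administered]
    return max(remaining, key=lambda i: weighted_information(
        grid, given, responses, bank[i], use_prior))

grid = [-4.0 + 0.1 * i for i in range(81)]
bank = [(1.0, -1.0), (1.5, 0.0), (0.8, 1.0), (2.0, 0.5)]  # hypothetical (a, b)
mlwi_choice = select_weighted(grid, bank, [0], [1], use_prior=False)
mpwi_choice = select_weighted(grid, bank, [0], [1], use_prior=True)
print(mlwi_choice, mpwi_choice)
```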

31
Hybrid Algorithms
  • Maximum Expected Information (MEI)
  • Use observed information
  • Predict information for next item
  • Maximum Expected Posterior Weighted Information
    (MEPWI)
  • Use observed information
  • Predict information for next item
  • Weight with Posterior
  • MEPWI ≈ MPWI

32
Mix N Match
  • MAP with uniform prior to approximate MLE
  • MFI using EAP instead of MLE (any point
    information function)
  • Use EAP for item selection, but MLE for final
    trait estimate

33
COMPARISONS
34
Study Design
  • Real Item Bank
  • Depressive symptom items (62)
  • 4 categories (fit with Graded Response IRT Model)
  • Peaked bank: items have narrow coverage
  • Flat bank: items have wider coverage
  • Fixed length: 5-, 10-, and 20-item CATs

35
Datasets Used
  • Post hoc simulation using real data
  • 730 patients and caregivers at MDA
  • Real bank only
  • Simulated data
  • θ grid: -3 to 3 by 0.5
  • 500 simulees per θ
  • Simulated and Real banks
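
A sketch of how the simulated data set described above could be generated: 500 simulees at each of the 13 θ values from -3 to 3, responding to four-category graded response items. The item parameters, the two-item bank, and the random seed are hypothetical; the actual banks are not given in the transcript.

```python
import math
import random

def simulate_grm_response(theta, a, thresholds, rng):
    """Draw one graded-response category (1..m) for a simulee at theta."""
    u = rng.random()
    # cumulative[k] = P(response > k + 1); thresholds must be increasing.
    cumulative = [1.0 / (1.0 + math.exp(-a * (theta - b))) for b in thresholds]
    return 1 + sum(u < p for p in cumulative)

rng = random.Random(2024)                         # fixed seed for repeatability
theta_grid = [-3.0 + 0.5 * i for i in range(13)]  # -3 to 3 by 0.5
bank = [(1.5, [-1.0, 0.0, 1.0]),                  # hypothetical (a, thresholds)
        (1.0, [-0.5, 0.5, 1.5])]
n_simulees = 500

data = []  # one row per simulee: (true theta, list of item responses)
for theta in theta_grid:
    for _ in range(n_simulees):
        row = [simulate_grm_response(theta, a, bs, rng) for a, bs in bank]
        data.append((theta, row))
print(len(theta_grid), len(data))
```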

36
Real Item Bank Characteristics
37
Real Bank, Real Data, 5 Items
38
Real Bank, Real Data, 5 items
Selection Criterion   Mean SE²   RMSD     CORR
MFI                   0.1463     0.3763   0.9069
MLWI                  0.1432     0.3736   0.9094
MPWI                  0.1396     0.3738   0.9080
MEPV                  0.1388     0.3598   0.9149
MEI (Fisher's)        0.1388     0.3632   0.9134
MEI (Observed)        0.1388     0.3616   0.9139
Random                0.2369     0.4567   0.8565
39
Peaked Bank, Sim. Data, 5 Items
40
Peaked Bank, Sim. Data, 5 Items
Selection Criterion   BIAS     RMSE     CORR
MFI                   0.0283   0.3923   0.9822
MLWI                  0.0678   0.4798   0.9724
MPWI                  0.0261   0.3898   0.9822
MEPV                  0.0232   0.3871   0.9822
MEI (Fisher's)        0.0299   0.3903   0.9824
MEI (Observed)        0.0283   0.3911   0.9823
Random                0.0095   0.8378   0.9233
41
Summary
  • Polytomous items
  • (Choi and Swartz, in press)
  • Classic MFI with MLE and MLWI did not perform as
    well as the others
  • MFI with EAP and all other criteria performed
    essentially the same
  • Dichotomous items
  • (van der Linden, 1998)
  • MFI with MLE not as good as all others
  • Difference more pronounced for shorter tests

42
Adaptations/ Active Research Areas
  • Constrained adaptive tests/ content balancing
  • Exposure Control
  • A-stratified adaptive testing
  • Item selection including burden
  • Cheating detection
  • Response times

44
References and Further Reading
  • Choi SW, Swartz RJ. (in press). Comparison of CAT
    item selection criteria for polytomous items.
    Applied Psychological Measurement.
  • Owen RJ. (1969). A Bayesian approach to tailored
    testing (Research Report 69-92). Princeton, NJ:
    Educational Testing Service.
  • Owen RJ. (1975). A Bayesian sequential procedure
    for quantal response in the context of adaptive
    mental testing. Journal of the American
    Statistical Association, 70, 351-356.
  • van der Linden WJ. (1998). Bayesian item
    selection criteria for adaptive testing.
    Psychometrika, 63(2), 201-216.
  • van der Linden WJ, Glas CAW. (Eds.). (2000).
    Computerized Adaptive Testing: Theory and
    Practice. Dordrecht; Boston: Kluwer Academic.

46
MLE Properties
  • Usually has desirable asymptotic properties
  • Consistency and efficiency depend on selection
    criteria and item bank
  • A finite estimate does not exist when all responses
    fall in the lowest (1) or highest (m) category