Computerized Adaptive Testing

1
Reducing the duration and cost of assessment with
the GAIN: Computer Adaptive Testing
2
Evidence-Based Practice
  • Requires accurate diagnosis, treatment placement,
    and outcomes monitoring
  • Assessment over a wide range of domains
  • The cost of evidence-based assessment is:
    • Time
    • Respondent Burden
    • Increased staff resources (including training)

3
Improving Efficiency
  • The use of screeners and short-form instruments
    has significantly improved the efficiency of the
    assessment process
  • Can help determine whether a full assessment is
    warranted
  • But not a substitute for a full assessment:
    • Lack of precision
    • Floor and ceiling effects
    • Limited content validity

4
Computerized Adaptive Testing
  • Selects items from a large bank of items based on
    the responses made to previous items.
  • Continues to select and administer items until
    sufficient measurement precision is obtained.
  • Combines the precision and comprehensiveness of a
    full assessment with the efficiency of a screener.

5
CAT Process
  • Score is calculated and the next best item is
    selected based on item difficulty

[Figure: Typical Pattern of Responses. Labels:
Increased Difficulty, Middle Difficulty, Decreased
Difficulty; Correct, Incorrect; +/- 1 Std. Error]
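
The process on this slide can be sketched in code. This is only a minimal illustration, assuming a dichotomous Rasch model, maximum-likelihood scoring, and a fixed standard-error stop rule; the names rasch_prob, estimate_theta, run_cat, and the se_target default are hypothetical and do not come from the presentation.

```python
import math

def rasch_prob(theta, b):
    """Rasch probability of a positive response to an item of difficulty b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, theta=0.0, n_iter=25):
    """Maximum-likelihood measure from (difficulty, 0/1) pairs.

    Returns (theta, standard_error). Steps are capped and theta is clamped
    to [-6, 6] so all-0 or all-1 patterns do not diverge (an assumption
    made for this sketch).
    """
    for _ in range(n_iter):
        probs = [rasch_prob(theta, b) for b, _ in responses]
        info = sum(p * (1.0 - p) for p in probs)
        resid = sum(x - p for (_, x), p in zip(responses, probs))
        step = max(-1.0, min(1.0, resid / info))
        theta = max(-6.0, min(6.0, theta + step))
    info = sum(rasch_prob(theta, b) * (1.0 - rasch_prob(theta, b))
               for b, _ in responses)
    return theta, 1.0 / math.sqrt(info)

def run_cat(bank, answer, se_target=0.5, max_items=None):
    """Administer items until the standard error reaches se_target.

    bank maps item id -> difficulty; answer(item_id) returns 0 or 1.
    Under the Rasch model the most informative item is the one whose
    difficulty is closest to the current measure.
    """
    remaining = dict(bank)
    responses, administered = [], []
    theta, se = 0.0, float("inf")   # start at middle difficulty
    max_items = max_items or len(bank)
    while remaining and len(administered) < max_items and se > se_target:
        item = min(remaining, key=lambda i: abs(remaining[i] - theta))
        b = remaining.pop(item)
        responses.append((b, answer(item)))
        administered.append(item)
        theta, se = estimate_theta(responses, theta)
    return theta, se, administered
```

After a positive response the estimate rises and a harder item is chosen; after a negative response it falls and an easier item is chosen, which is the zigzag pattern the figure describes.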
6
CAT in Clinical Assessment
7
CAT in Clinical Assessment: Issues
  • Triage of individuals to support clinical
    decision making
  • Measurement of multiple clinical dimensions and
    subdimensions
  • Persons with atypical presentation of symptoms
  • Generalizability of assessment to various groups

8
Clinical Decision Making
  • How severe are the symptoms?
  • What type of treatment is most appropriate?
  • Can CAT be used to answer these questions more
    efficiently?

9
Strategy
  • Use CAT to place persons into low, moderate and
    high levels of substance abuse and dependency.
  • Starting Rules
    • Using screener measures to set the initial
      measure and select the first item
  • Variable Stop Rules
    • Tight precision around cut points
    • Less precision away from cut points
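
One way to read the starting and stop rules above is sketched below. It is an assumption-laden illustration, not the authors' implementation: the linear screener mapping, the 1.96 multiplier, the min_se floor, and the function names starting_theta, classify, and should_stop are all hypothetical.

```python
def starting_theta(screener_score, screener_max, theta_range=(-3.0, 3.0)):
    """Map a screener raw score onto the measure scale (assumed linear)."""
    lo, hi = theta_range
    return lo + (hi - lo) * screener_score / screener_max

def classify(theta, cut_points):
    """Return 0 (low), 1 (moderate), 2 (high) for two cut points."""
    return sum(theta >= c for c in sorted(cut_points))

def should_stop(theta, se, cut_points, z=1.96, min_se=0.25):
    """Variable stop rule.

    Stop once the interval theta +/- z*se contains no cut point (the
    classification is settled), or once se is already tight. Persons far
    from every cut point therefore stop early with a larger se, while
    persons near a cut point keep receiving items.
    """
    if se <= min_se:
        return True
    lo, hi = theta - z * se, theta + z * se
    return all(not (lo <= c <= hi) for c in cut_points)
```

With cut points at -1 and 1, a person measured at 2.5 with a standard error of 0.6 would stop, while a person at 1.2 with the same standard error would continue.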

10
CAT Standard Error
11
Results
  • CAT to full-measure correlations ranged from .87
    to .99
  • Agreement between CAT and full-measure
    classification of persons into treatment groups
    (kappa coefficients) ranged from .66 to .71.
  • Screener starting rule improved CAT efficiency by
    7 percent
  • Variable stop rules improved efficiency by 15-38
    percent
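
For readers unfamiliar with the kappa coefficient reported above, here is a short sketch of Cohen's kappa for two sets of classifications. The function name and the example labels are illustrative only; no data from the study are reproduced.

```python
from collections import Counter

def cohen_kappa(a, b):
    """Chance-corrected agreement between two equal-length label lists.

    Undefined (division by zero) when both lists use a single identical
    category throughout.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    count_a, count_b = Counter(a), Counter(b)
    expected = sum(count_a[k] * count_b[k] for k in count_a) / (n * n)
    return (observed - expected) / (1.0 - expected)

# Hypothetical placements from a CAT and from the full measure.
cat_groups = ["low", "low", "mod", "mod", "high", "high"]
full_groups = ["low", "mod", "mod", "mod", "high", "high"]
kappa = cohen_kappa(cat_groups, full_groups)
```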

12
Measuring Multiple Dimensions
13
Assessment on Multiple Dimensions
  • Instruments often measure multiple domains
  • In CAT, treating a multi-domain measure as
    measuring one domain is problematic
  • Some subdimensions may not be adequately measured

14
Strategy: Content Balancing
  • Set an item quota for each subscale
    • Maximum number of subscale items to administer
      during the CAT
  • An item is selected if
    • Its subscale quota has not been met
    • It provides maximum information
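
The selection rule above might look like the following sketch. It assumes Rasch item information p(1 - p) and treats the quota as a per-subscale maximum, as the slide states; select_item and the data layout are hypothetical.

```python
import math

def item_information(theta, b):
    """Rasch item information p * (1 - p) at measure theta."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

def select_item(theta, bank, counts, quotas):
    """Pick the most informative item whose subscale quota is unmet.

    bank:   item id -> (subscale, difficulty), unadministered items only
    counts: subscale -> number of items administered so far
    quotas: subscale -> maximum number of items to administer
    Returns an item id, or None when every quota has been met.
    """
    eligible = [i for i, (scale, _) in bank.items()
                if counts.get(scale, 0) < quotas[scale]]
    if not eligible:
        return None
    return max(eligible, key=lambda i: item_information(theta, bank[i][1]))
```

Without the quota check, a short subscale whose items are rarely the most informative could be skipped entirely, which is the problem content balancing is meant to address.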

15
Content Balancing Procedures
Method     Screener   Content Balanced
None       No         No
Screener   Yes        No
Mixed      Yes        Yes
Full       No         Yes
16
Percentage of Items Administered by Subscale
IMDS Scale            N Items   None   Screener   Mixed   Full
Depression            1         99     100        100     100
Depression            3         79     77         100     100
Homicidal/Suicidal    1         21     100        100     100
Homicidal/Suicidal    3         8      8          100     100
Anxiety               1         100    100        100     100
Anxiety               3         100    100        100     100
Trauma                1         100    100        100     100
Trauma                3         100    100        100     100
17
Content Balancing: CAT to Full IMDS Correlations
IMDS Scales           None   Screener   Mixed   Full
IMDS                  0.98   0.98       0.98    0.97
Depression            0.96   0.94       0.96    0.96
Homicidal/Suicidal    0.60   0.83       0.96    0.95
Anxiety               0.96   0.95       0.96    0.96
Trauma                0.97   0.97       0.97    0.97
Average r             0.89   0.93       0.97    0.96
18
Identifying Persons with Atypical Presentation of
Symptoms
19
Overview
  • Implications: Clients sometimes endorse severe
    clinical symptoms that are not reflected by
    overall scores on standard assessments.
  • Statistics that can detect atypical presentation
    of symptoms have important clinical implications.
  • Strategy: Identify fit statistics sensitive to
    atypical presentation in a CAT context

20
Rasch Fit Statistics
  • Fit statistics are used to test particular
    hypotheses.
  • Atypicalness: Used to detect unexpected outlying,
    off-target responses. Outlier sensitive.
  • Example: A person with a high level on the
    measured trait misses an easy item.
  • Randomness: Used to detect unexpected inlying,
    targeted responses.
  • Both infit and outfit are chi-square-based
    mean-square statistics. An infit or outfit value
    of 1.0 indicates perfect fit to the Rasch model.
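
The two statistics can be computed as in the sketch below, which uses the standard Rasch mean-square definitions for a single person. Reading "atypicalness" as outfit and "randomness" as infit is an assumption based on the "outlier sensitive" wording above; fit_statistics is a hypothetical name.

```python
import math

def fit_statistics(theta, responses):
    """Return (infit, outfit) mean squares for one person.

    responses is a list of (item difficulty, 0/1) pairs.
    outfit: unweighted mean of squared standardized residuals, so a single
            surprising response to a far-off item dominates it.
    infit:  squared residuals summed and divided by the summed model
            variance, so well-targeted items carry most of the weight.
    """
    sq_resid, variance = [], []
    for b, x in responses:
        p = 1.0 / (1.0 + math.exp(-(theta - b)))
        sq_resid.append((x - p) ** 2)
        variance.append(p * (1.0 - p))
    outfit = sum(r / v for r, v in zip(sq_resid, variance)) / len(responses)
    infit = sum(sq_resid) / sum(variance)
    return infit, outfit

# Hypothetical person at theta = 2.0 who misses a very easy item (b = -3.0).
infit, outfit = fit_statistics(
    2.0, [(-3.0, 0), (1.5, 1), (2.0, 0), (2.5, 0)])
```

In this example the single miss on the easy item inflates outfit far more than infit, matching the slide's example of a high-trait person missing an easy item.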

21
Problems with Fit
Responses by Severity (Low to High)   Randomness   Atypicalness
111 11111100000 0000                  0.3          0.5
111 10101100010 0000                  0.6          1.0
111 11101010000 0000                  1.0          1.0
111 00001110000 0000                  0.9          1.3
011 11111110000 0000                  3.8          1.0
111 11111100000 0001                  3.8          1.0
101 01010101010 1010                  4.0          2.3
000 00000000011 1111                  12.6         4.3
22
Clinical Implications of Misfit
  • Our analyses indicate that there are subgroups
    who endorse severe symptoms without endorsement
    of milder symptoms.
  • Examples
  • Atypical suicide
  • Substance use withdrawal without dependence

23
Atypicalness by Number of Items
                  Atypicalness Categories
Number of Items   Uber Typical   Typical   Atypical
16                30.2           48.1      21.7
12                34.3           51.1      14.6
8                 38.4           53.2      8.4
4                 58.2           40.0      1.8
24
Content Balancing and Atypicalness
Atypicalness Category   None   Screener   Mixed   Full   Full IMDS
Proto Typical           26.7   34.6       48.3    50.5   49.2
Typical                 69.0   58.7       40.8    38.9   38.4
Atypical                4.3    6.5        10.9    10.6   12.4
Kappa                   .27    .32        .48     .50    --
25
Future Research
  • Identify alternative fit statistics that are more
    sensitive to atypical presentation of symptoms
  • Determine when it is likely that someone may
    present with atypical symptoms, and if so, select
    items to confirm atypicalness.

26
Generalizability of CAT to Various Groups
27
Overview
  • Persons at the same severity level may differ in
    their endorsement of specific items.
  • This is called differential item functioning
    (DIF)
  • On the GAIN, DIF has been detected by:
    • Age (adolescent vs. adult)
    • Gender
    • Ethnicity/Race
    • Drug of choice
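
The presentation does not say which DIF detection method was used. As one common illustration of the idea (comparing item endorsement between groups among persons matched on severity), here is a sketch of the Mantel-Haenszel common odds ratio. The function name, the "ref"/"focal" group labels, and the record layout are all assumptions.

```python
from collections import defaultdict

def mantel_haenszel_odds_ratio(records):
    """Common odds ratio for one item across severity strata.

    records: iterable of (stratum, group, response) where group is "ref"
    or "focal" and response is 0 or 1. Persons in the same stratum are
    treated as having the same severity level. A value near 1.0 suggests
    no DIF; values far from 1.0 suggest the item favors one group.
    """
    tables = defaultdict(lambda: {"ref": [0, 0], "focal": [0, 0]})
    for stratum, group, x in records:
        tables[stratum][group][x] += 1   # index 0 = not endorsed, 1 = endorsed
    num = den = 0.0
    for t in tables.values():
        a, b = t["ref"][1], t["ref"][0]       # reference: endorsed, not
        c, d = t["focal"][1], t["focal"][0]   # focal: endorsed, not
        n = a + b + c + d
        num += a * d / n
        den += b * c / n
    return num / den
```

For example, strata could be bands of the total scale score and the groups could be adolescents and adults; the function would then be applied to each item in turn.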

28
DIF By GAIN Scale
Scale                      Total   Age   Gender   Race   Prim. Drug
Internal Mental Distress   43      13    5        10     26
Crime Violence             31      11    14       22     27
Behavioral Complexity      33      12    12       17     22
Substance Problems         16      8     5        9      16
29
DIF and CAT
  • The presence of DIF can limit our ability to
    generalize measurement findings across different
    groups.
  • Controlling for DIF becomes complicated as the
    number of DIF items and groups/factors increases.
  • Currently exploring a number of methods for
    controlling DIF in CAT.

30
Potential of CAT in Clinical Practice
  • Reduce respondent burden
  • Reduce staff resources
  • Reduce data fragmentation
  • Streamline complex assessment procedures
  • Assist in clinical decision making
  • Identify persons with atypical profiles
  • Improve measurement generalizability

31
Future Research
  • How do we put it all together?
  • Much of the research in the area of CAT has used
    computer simulation. There is a need to test
    working CAT systems in clinical practice.

32
Contact Information
  • A copy of this presentation will be at
    www.chestnut.org/li/posters
  • For more information, please contact Barth Riley
    at bbriley_at_chestnut.org