Transcript and Presenter's Notes

Title: MEASUREMENT CHARACTERISTICS


1
MEASUREMENT CHARACTERISTICS
  • Error & Confidence
  • Reliability, Validity, & Usability

2
ERROR & CONFIDENCE
  • Reducing error
  • All assessment scores have error
  • Want to minimize error so scores are accurate
  • Protocols & periodic staff training/retraining
  • Increasing confidence
  • Results lead to correct placement
  • Assessments that produce valid, reliable, and
    usable results

3
ASSESSMENT RESULTS
  • Norm-referenced
  • Individual's score compared to others in their
    peer/norm group
  • School tests (e.g., scoring at the 95th percentile)
  • Norm group needs to be representative of the test
    takers the test was designed for

4
ASSESSMENT RESULTS
  • Criterion-referenced
  • Individual's score compared to a preset standard
    or criterion
  • Standard doesn't change based on the individual
    or group
  • e.g., A = 250-295 points

5
VALIDITY
  • Describes how well the assessment results match
    their intended purpose
  • Are you measuring what you think you are
    measuring?
  • Relationship between program & assessment content
  • Does not have validity for all purposes,
    populations, or times

6
VALIDITY
  • Depends on different types of evidence
  • Is a matter of degree (no tool is perfect)
  • Is a unitary concept
  • Change from the past
  • Former "types" of validity are now treated as
    forms of evidence
  • e.g., content validity is now content-related
    evidence

7
FACE VALIDITY
  • Not listed in text
  • Do the items seem to fit?

8
CONTENT VALIDITY (Content-related evidence)
  • How well does assessment measure subject or
    content?
  • Representative
  • Completeness: all major areas
  • Nonstatistical
  • Review of literature or expert opinion
  • Blueprint of major components
  • Per Austin (1991), minimum requirement for any
    assessment

9
CRITERION-RELATED VALIDITY (Criterion-related
evidence)
  • Comparison of results
  • Statistical
  • Reported as a validity or correlation coefficient
  • Ranges from +1 to -1 (±1 is a perfect
    relationship)
  • 0 = no relationship
  • r = .73 is better than r = .52
  • r = ±.40 to ±.70 is the acceptable range (see the
    sketch below)
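To make the coefficient concrete, below is a minimal sketch of computing a criterion-related validity coefficient with SciPy's pearsonr. All scores are invented for illustration; this is a sketch, not a prescribed procedure.

# Hypothetical sketch: correlate scores on a new assessment with an
# established criterion measure (all numbers invented).
from scipy.stats import pearsonr

assessment = [12, 15, 9, 20, 17, 11, 14, 18]   # new instrument
criterion = [34, 41, 28, 52, 47, 30, 38, 49]   # established measure

r, p = pearsonr(assessment, criterion)
print(f"validity coefficient r = {r:.2f} (p = {p:.3f})")
# r between +/-.40 and +/-.70 falls in the acceptable range noted
# above; +/-1 would be a perfect relationship, 0 no relationship.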

10
CRITERION-RELATED VALIDITY (Criterion-related
evidence)
  • May use .30 to .40 if statistically significant
  • If validity is reported, it is generally
    criterion-related validity
  • 2 types
  • Predictive
  • Concurrent

11
PREDICTIVE VALIDITY
  • The ability of an assessment to predict future
    behaviors or outcomes
  • Measures are taken at different times
  • ACT or SAT scores predicting success in college
  • Leisure satisfaction scores predicting discharge

12
CONCURRENT VALIDITY
  • More than one instrument measures the same
    content
  • Goal is to predict one set of scores from another
    set taken at the same (or nearly the same) time,
    both measuring the same variable

13
CONSTRUCT VALIDITY (Construct-related evidence)
  • Theoretical/conceptual
  • Content & criterion-related validity contribute
    to construct validity
  • Research concerning the conceptual framework on
    which an assessment is based contributes to
    construct validity
  • Not demonstrated in a single project or
    statistical measure
  • Few TR assessments have it; their focus is
    behavior, not constructs

14
CONSTRUCT VALIDITY (Construct-related evidence)
  • Factor analysis
  • Convergent validity (what it measures)
  • Divergent validity (what it doesn't measure)
  • Expert panels are used here too

15
THREATS TO VALIDITY
  • Assessment should be valid for its intended use
    (e.g., research instruments)
  • Unclear directions
  • Unclear or ambiguous terms
  • Items that are at inappropriate level for
    subjects
  • Items not related to construct being measured

16
THREATS TO VALIDITY
  • Too few items
  • Too many items
  • Items with an identifiable pattern of response
  • Method of administration
  • Testing conditions
  • Subjects' health, reluctance, attitudes
  • See Stumbo, 2002, pp. 41-42

17
VALIDITY
  • Can't get valid results without reliable results,
    but can get reliable results without valid
    results
  • Reliability is a necessary but not sufficient
    condition for validity
  • See Stumbo, 2002, p. 54

18
RELIABILITY
  • Accuracy or consistency of a measurement
  • Reproducible results
  • Statistical in nature
  • r between 0 and 1 (with 1 being perfect)
  • Should not be lower than .80
  • Tells what portion of variance is non-error
    variance
  • Increases with length of test & spread of scores

19
STABILITY (Test-retest)
  • How stable is the assessment?
  • Assessment not overly influenced by passage of
    time
  • Same group assessed twice with the same
    instrument; results of the two testings are
    correlated (see the sketch below)
  • Are the two sets of scores alike?
  • Time effects (longer, shorter intervals)
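Test-retest stability is estimated the same way as the coefficients above: correlate the two administrations. A small sketch with invented scores (SciPy's pearsonr assumed available):

# Hypothetical sketch: same group, same instrument, two testings.
from scipy.stats import pearsonr

time1 = [22, 30, 18, 25, 27, 20]   # first administration (invented)
time2 = [24, 29, 17, 26, 28, 21]   # second administration (invented)

r, _ = pearsonr(time1, time2)
print(f"test-retest reliability r = {r:.2f}")   # near 1 = stable over time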

20
EQUIVALENCY (Equivalent forms)
  • Also known as parallel-form or alternative-form
    reliability
  • How closely correlated are 2 or more forms of the
    same assessment?
  • 2 forms have been developed and demonstrated to
    measure the same construct
  • Forms have similar but not same items
  • e.g., NCTRC exam
  • Short & long forms are not equivalent

21
INTERNAL CONSISTENCY
  • How closely are items on the assessment related?
  • Split half
  • 1st half vs. 2nd half
  • Odd/even
  • Matched random subsets
  • If the test can't be divided:
  • Cronbach's alpha
  • Kuder-Richardson
  • Spearman-Brown formula
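A rough sketch of two of these statistics on an invented item-score matrix (rows = respondents, columns = items), using the standard formulas for Cronbach's alpha and for the split-half method with the Spearman-Brown correction:

import numpy as np

# Invented data: 5 respondents x 4 items, each scored 1-5.
scores = np.array([
    [4, 5, 4, 5],
    [3, 3, 2, 3],
    [5, 4, 5, 5],
    [2, 2, 3, 2],
    [4, 4, 4, 3],
])

# Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance
# of total scores).
k = scores.shape[1]
item_var = scores.var(axis=0, ddof=1).sum()
total_var = scores.sum(axis=1).var(ddof=1)
alpha = (k / (k - 1)) * (1 - item_var / total_var)
print(f"Cronbach's alpha = {alpha:.2f}")

# Split-half (odd vs. even items), stepped up to full-test length
# with the Spearman-Brown correction: r_full = 2r / (1 + r).
odd = scores[:, 0::2].sum(axis=1)
even = scores[:, 1::2].sum(axis=1)
r_half = np.corrcoef(odd, even)[0, 1]
r_full = (2 * r_half) / (1 + r_half)
print(f"split-half r = {r_half:.2f}, corrected r = {r_full:.2f}")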

22
INTERRATER RELIABILITY
  • Percentage of agreements out of the number of
    observations
  • Difference between agreement & accuracy
  • Raters compared to each other
  • 80% agreement

23
INTERRATER RELIABILITY
  • Simple agreement
  • Number of agreements & disagreements
  • Point-to-point agreement
  • Takes each data point into consideration
  • Percentage of agreement for the occurrence of
    the target behavior
  • Kappa index (see the sketch below)
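A sketch of simple percent agreement and the kappa index for two raters coding the same observation intervals. The codes are invented, and the kappa shown is Cohen's formula, which corrects percent agreement for chance:

# Two raters' codes for the same 10 intervals (1 = target behavior
# occurred, 0 = did not; data invented).
rater_a = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_b = [1, 0, 1, 0, 0, 1, 0, 1, 1, 1]

# Simple / point-to-point agreement: agreements / total observations.
agree = sum(a == b for a, b in zip(rater_a, rater_b))
pct = agree / len(rater_a)
print(f"percent agreement = {pct:.0%}")   # 80% is a common benchmark

# Cohen's kappa: (p_observed - p_chance) / (1 - p_chance), where
# p_chance is the agreement expected if raters coded independently.
p_a1 = sum(rater_a) / len(rater_a)
p_b1 = sum(rater_b) / len(rater_b)
p_chance = p_a1 * p_b1 + (1 - p_a1) * (1 - p_b1)
kappa = (pct - p_chance) / (1 - p_chance)
print(f"kappa = {kappa:.2f}")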

24
INTRARATER RELIABILITY
  • Not in text
  • Rater's scores compared with his or her own
    previous scores

25
RELIABILITY
  • Manuals often give this information
  • High reliability doesn't indicate validity
  • Generally a longer test has higher reliability
    (see the sketch below)
  • Lessens influence of chance or guessing
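The link between test length and reliability can be quantified with the Spearman-Brown prophecy formula; a short sketch with an invented starting reliability:

# Spearman-Brown prophecy formula: predicted reliability when a test
# is lengthened by a factor n: r_new = n*r / (1 + (n - 1)*r).
def spearman_brown(r: float, n: float) -> float:
    return (n * r) / (1 + (n - 1) * r)

r = 0.70                          # invented current reliability
print(spearman_brown(r, 2.0))     # doubling the length -> ~0.82
print(spearman_brown(r, 0.5))     # halving the length  -> ~0.54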

26
FAIRNESS
  • Reduction or elimination of undue bias
  • Language
  • Ethnic or racial backgrounds
  • Gender
  • Free of stereotypes & biases
  • Beginning to be a concern for TR

27
USABILITY & PRACTICALITY
  • Nonstatistical
  • Is this tool better than any other tool on the
    market or one I can design?
  • Time, cost, staff qualifications, ease of
    administration, scoring, etc.