RTI Measurement Overview: Measurement Concepts for RTI Decision Making

Description:

Title: PowerPoint Presentation Author: Peabody College Last modified by: stewart Created Date: 4/25/2002 4:07:35 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 60

Provided by: Peab8

Learn more at: http://www.cehd.umn.edu

Transcript and Presenter's Notes

1
RTI Measurement Overview Measurement Concepts
for RTI Decision Making

A module for pre-service and in-service
professional development
MN RTI Center
Author Lisa H. Stewart, PhD
Minnesota State University Moorhead
www.scred.k12.mn.us (click on RTI Center)

2
MN RTI Center Training Modules

This module was developed with funding from the
MN legislature
It is part of a series of modules available from
the MN RTI Center for use in preservice and
inservice training

3
Overview

Purpose(s) of assessment
Characteristics of effective measurement for RTI
Critical features of measurement and RTI in the
areas of screening, progress monitoring, and
diagnostic instructional planning
CBM/GOMs as a frequently used RTI measurement
tool
Multiple sources of information and convergence

4
Why Learn About Measurement?

"In God we trust.
All others must have data."
Dr. Stan Deno

5
Assessment: One of the Key Components in RTI

Curriculum and Instruction
Assessment
School Wide Organization and Problem Solving
Systems (Teams, Process, etc.)
Adapted from Logan City School District, 2002
6
Measurement and Assessment

Schools have to make many choices about
measurement tools and the process of gathering
information used to make decisions (assessment)
We need different measurement tools for different
purposes

7
Some Purposes of Assessment

Screening
Diagnostic - instructional planning
Monitoring student progress (formative)
Evaluation (summative)

8
Screening

Standardized measures given to all students to:
Help identify students at risk in a PROACTIVE way
Give feedback to the system about how students
progress throughout the year at a gross (e.g., 3x
per year) level
If students are on track in the fall, are they
still on track in the winter?
What is happening with students who started the
year below target? Are they catching up?
Give feedback to the system about changes from
year to year
Is our new reading curriculum having the impact
we were expecting?

DRAFT May 27, 2009
9
Diagnosis/Instructional Planning

Measures given to understand a student's skill
level (strengths and weaknesses) help guide:
Instructional grouping
Where to place the student in the curriculum and
curricular materials
What skills are missing or weak and may need to
be retaught or practiced and the level of support
and explicitness needed
Development or selection of curriculum and
targeted interventions

10
Monitoring Student Progress (Formative)

Informally this happens all the time and helps
teachers adjust their teaching on the spot
More formalized progress monitoring involves
standardized measures, tied to important
educational outcomes, and given frequently (e.g.,
weekly) to:
Prompt you to change what you are doing with a
student if it is not working (formative
assessment) so you are effective and efficient
with your time and instruction
Make decisions about instructional goals,
materials, levels, and groups
Aid in communication with parents
Document progress for special education students
as required for periodic and annual reviews

11
Evaluation (Summative)

Measures used to provide a snapshot or summary
of student skill at one particular point in time,
often at the end of the instructional year or
unit
E.g., state high-stakes tests
"When the cook tastes the soup, that's formative;
when the guests taste the soup, that's
summative."

12
One Test Can Serve More Than One Purpose

To the extent a test does more than one thing
well, it is a more efficient use of student time
and school resources
Example 1: Reading CBM measures of Oral Reading
Fluency can be used for screening and progress
monitoring
Example 2: The NWEA (MAP) test may be used for
screening and instructional planning

13
Activity

On the Measurement Overview "Purposes of Assessment"
Worksheet:
Make a list of all the tests you have learned
about or have seen used in the school setting (or
are currently in use in your school)
Try to decide what purpose(s) each test served

14
Assessment Tools and Purpose(s)

Name of Test | Purpose(s) (Screening, Instructional Planning, Progress Monitoring, Program Eval.)

15
Buyer Beware

Although it is good if a test can serve more than
one purpose, just because a test manual or
advertisement SAYS it is useful for multiple
purposes doesn't mean the test actually IS
useful for multiple purposes
Example: Many tests designed for diagnostic
purposes or for summative evaluation state they
are also useful for progress monitoring, but are
too time consuming, too costly, too unreliable,
or too insensitive to changes in student skills
to be of practical use for progress monitoring

16
Establishing a Measurement System

A core feature of RTI is identifying a
measurement system
Screen large numbers of students
Identify students in need of additional
intervention
Monitor students of concern more frequently
1 to 4x per month
Typically weekly
Diagnostic testing used for instructional
planning to help target interventions as needed

17
Characteristics of An Effective Measurement
System for RTI

Valid
Reliable
Simple
Quick
Inexpensive
Easily understood
Can be given often
Sensitive to growth over short periods of time
Credit: K. Gibbons, M. Shinn
18
Technical Characteristics of Measurement Tools

Reliability: the consistency of the measure
If tested again right away or by a different
person or with an alternate equivalent form of
the test, the score should be similar
Allows us to have confidence in the score and use
the score to generalize what we see today to
other times and situations
If a student knows how to decode simple words on
a sheet of paper at 8am this morning, we would
expect him to be able to decode similar simple
words at noon and the next day

19
Why is Reliability so Important?

Assume you have a test that decides whether or
not you need to take (and pay for) a remedial
math class in college that does not count toward
graduation.
The test average score is 50 points.
The test has a cut off score of 35, so students
who score below 35 have to take the remedial
class.

20
Why is Reliability so Important? (Cont'd)

If the test is reliable and you get a score of
30, then taking another version of the test, or
taking the test again a week later (without major
studying or changing what you know!), would
likely give you a score very close to 30.
If the test is not reliable and you get a score
of 30, you might be able to take the test again or
take another version of the test and get a score
of 40, or a score of 20!
If the test is unreliable we can't have much
faith in the score and it becomes difficult to
use the test to make decisions!
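
One way to put numbers on the remedial math example is the standard error of measurement from classical test theory, SEM = SD x sqrt(1 - reliability). The sketch below is illustrative only: the standard deviation of 10 points and the two reliability values are assumptions, since the slides give only the average (50) and the cut-off score (35).

```python
import math


def standard_error_of_measurement(sd: float, reliability: float) -> float:
    """Classical test theory: SEM = SD * sqrt(1 - reliability)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed: float, sd: float, reliability: float, z: float = 1.96):
    """Approximate 95% band (z = 1.96) around an observed score."""
    sem = standard_error_of_measurement(sd, reliability)
    return (observed - z * sem, observed + z * sem)


ASSUMED_SD = 10.0  # hypothetical value, not given in the slides
CUT_SCORE = 35     # cut-off score from the example

for r in (0.95, 0.50):  # hypothetical reliable and unreliable tests
    low, high = score_band(30, ASSUMED_SD, r)
    crosses_cut = low < CUT_SCORE < high
    print(f"r = {r:.2f}: band {low:.1f} to {high:.1f}, crosses cut-off: {crosses_cut}")
```

Under these assumptions, the band around a score of 30 stays below the cut-off for the more reliable test but stretches across it for the less reliable one, which is why the placement decision becomes hard to defend.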

21
Validity

But what if the test IS reliable and you get a
score of 30 but your math skills are much better
than the score implies? What if you get a score
of 30 but you don't really need a remedial math
class?
Then the test has an issue with VALIDITY
A test is valid only if the interpretation of the
test scores is supported
A common definition of validity is that the test
measures what it says it measures
Another definition is that a test is valid if it
helps you make better decisions or leads to
better outcomes than if you had never given the
test

22
Types of Validity

There are many ways to try to demonstrate
validity
Content validity
Criterion-related validity: concurrent and
predictive
Treatment Validity
Construct Validity

23
Types of Validity (Cont'd)

Content validity
The test content is reasonable
Criterion-related validity: two types
Concurrent: the scores from this test are similar
to scores from other tests that measure the
same/similar thing
Predictive: the test scores from this test do a
pretty good job of letting us know what score a
student will get on another test in the future

24
Types of Validity (Cont'd)

Treatment Validity
If you use this test to decide about some
treatment or intervention or instructional
approach:
Do you make better decisions?
Do you have better goals? Planning? Student
engagement?
Most importantly: Are the outcomes for your
students better?

25
Types of Validity (Cont'd)

Construct Validity
Does the test measure the theoretical trait or
characteristic?
E.g. If the theory says children need to have a
base of solid decoding skills before they will be
fast and fluent readers of new text, do the
scores on the reading test of decoding and
fluency support that?
All other ways to try to document validity are in
some way also addressing construct validity
(content, criterion, treatment, etc.)

26
The NOT Validity Kind of Validity

Face validity is NOT really validity
Positive: "It looks good"
Just because a test looks good or you (or your
colleague) like to give it does not mean it gives
you good information or is the best test to use
Negative: "I just don't like it"
Just because a test isn't set up exactly how you
like it does not mean it does NOT give you good
information
Look for EVIDENCE of reliability and validity;
don't rely on your reaction, or the reactions and
testimonials of colleagues, alone.

27
Reliability and Validity

Just because a test is reliable does not mean it
is valid
It may reliably give you an inaccurate score!
If a test is not reliable, it cannot be valid
No test or test score is perfectly reliable
We use test scores to help make a variety of
decisions-- some low stakes and some high
stakes decisions.
So how reliable is reliable enough?
It depends...

28
Measuring Reliability and Validity

Typically reliability and validity evidence
involves comparing the test to itself or to other
tests or outcomes
The statistic used to sum up that comparison is
often a correlation (r)
Reliability and validity correlations are usually
reported from r = 0.0 to 1.0 (in general, a
correlation can range from -1.0 to 1.0)
The closer a correlation is to 1.0 the stronger
the relationship or the better you can predict
one score or outcome if you know the other one
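
As a minimal sketch of the comparison described above, the function below computes a Pearson correlation between two sets of scores, for example the same students tested twice. The score lists are hypothetical and made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length, at least 2 scores each")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squared deviations from each mean
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical scores for five students on two administrations of one test
first_testing = [30, 42, 55, 61, 48]
second_testing = [32, 40, 57, 59, 50]

print(f"test-retest r = {pearson_r(first_testing, second_testing):.2f}")
```

The same function could be used for validity evidence by correlating scores on the test with scores on another test or outcome.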

29
How Reliable is Reliable Enough?

For important INDIVIDUAL decisions? r ≥ .90
For SCREENING decisions? r ≥ .80
(Salvia & Ysseldyke, 2006)
"Reliability is like money: as long as you have
it, it's not a problem, but if you don't, it's a
BIG problem!" Fred Kerlinger

30
How Valid is Valid Enough?

Ranges  | Interpretation
.00-.20 | Little/no validity
.21-.40 | Below average validity
.41-.55 | Average validity
.56-.80 | Above average validity
.80-.99 | Exceptional validity
Source: Webb, M. W. (1983). Journal of Reading, 26(5),
414-424
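
The table above can be turned into a small lookup, sketched below. Note that the table lists .80 in two ranges; placing exactly .80 in "Above average validity" is an assumption of this sketch, not something the source settles.

```python
# Upper bound of each range from the table above, paired with its label.
# Assumption: a coefficient of exactly .80 falls in "Above average validity".
VALIDITY_BANDS = [
    (0.20, "Little/no validity"),
    (0.40, "Below average validity"),
    (0.55, "Average validity"),
    (0.80, "Above average validity"),
    (0.99, "Exceptional validity"),
]


def interpret_validity(coefficient: float) -> str:
    """Return the table's interpretation for a validity coefficient."""
    if not 0.0 <= coefficient <= 0.99:
        raise ValueError("coefficient must be between .00 and .99")
    for upper_bound, label in VALIDITY_BANDS:
        if coefficient <= upper_bound:
            return label
    raise AssertionError("unreachable: every value up to .99 matches a band")


print(interpret_validity(0.45))
```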
31
Looking at Validity With a Purpose in Mind

Predictive Validity is really important if you
are using the test as a screening tool to predict
which students are at risk or not at risk of
reading difficulty
Treatment validity is really important if you are
using the test in an effort to lead to some sort
of improved outcome

32
Validity Isn't Just About the Test

Validity has to do with the test use and
interpretation, so even a valid test can be
used for the wrong reasons or misinterpreted or
misused
Example 1: A test score for an ELL student
should reflect the student's skills, not her
ability to understand the directions and what is
being asked
Example 2: see the next slide

33
Validity Isn't Just About the Test (Cont'd)

Example 2: Letter Naming Fluency (LNF)
LNF involves giving a student a page of
randomized upper and lower case letters and
having the student name as many letters as they
can in one minute.
As a test of early literacy, LNF has good
reliability and concurrent and predictive
validity, especially predictive validity
However, it can be easily MISUSED
If interpreted correctly, LNF can identify
students at risk for early reading difficulty and
get those students into well-rounded early
literacy instruction well suited to them,
BUT, if it is interpreted to mean that a student
low in LNF needs to just have a lot of
instructional time spent only learning letter
names (often taking time away from high quality
well-rounded early literacy instruction) it can
actually have a negative impact.

34
Test Utility

Is it easy to use, time efficient, and cheap?
Even if a test is reliable and valid, if it is
too difficult to use, too time consuming, or too
expensive it just won't get used
If a reliable and valid progress monitoring tool
took 30 minutes per child and you wanted to
monitor 10 students in your class every week
(300 minutes, or 5 hours, of testing per week),
would you use it?
However, if a test is easy and short and cheap
but isn't reliable or valid, it's still a waste
of time, no matter how short!

35
Test Utility (Cont'd)

Is it sensitive enough for the decisions you want
to make?
Can it detect the differences between groups of
kids or within an individual that you need to
help you make a decision?
If a progress monitoring tool can only show gains
of 1 point per month, is it sensitive enough to
help give you timely feedback on the student's
response to your instruction?

36
Activity

On the "Characteristics of Assessment Tools for RTI"
Worksheet:
Make a list of tests you have learned about or
have seen used in the school setting (or are
currently in use in your school)
Can use all or some of the tools from the
Purposes of Assessment Worksheet for your list
Is the test reliable and valid FOR THE PURPOSE IT
IS BEING USED?
Is it quick and simple?
Is it inexpensive?
Can it be given often (has alternate forms, etc)?
Is it sensitive?

37
Characteristics of Assessment Tools for RTI
Name of tool | Reliable | Valid | Quick and simple | Cheap | Can be given often | Sensitive to growth over short time

38
Some Help in Looking for Evidence

Measurement tools are reviewed at the following
sites
www.rti4success.org
www.studentprogress.org
These sites only review tests submitted; if a test is
not on the list it doesn't mean it is bad, just
that it wasn't reviewed
Be sure you know the purpose of assessment
(screening, progress monitoring, etc) to best
interpret the information

39
Critical Features of Measurement and RTI

Screening
Progress Monitoring
Diagnostic Instructional Planning

40
Measurement and RTI: Screening

Reliability coefficients of at least r = .80.
Higher is better, especially for screening
specificity.
Well documented predictive validity
Evidence the criterion (cut score) being used is
reasonable and does not create too many false
positives (students identified as at risk who
aren't) or false negatives (students who are at
risk who aren't identified as such)
Brief, easy to use, affordable, and
results/reports are accessible almost immediately
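
To check whether a cut score creates too many false positives or false negatives, screening results can be tallied against a later outcome. The sketch below assumes a student is flagged when the screening score falls below the cut score; the scores, outcomes, and cut score of 40 are all hypothetical.

```python
def screening_accuracy(screen_scores, later_at_risk, cut_score):
    """Compare screening flags (score below cut_score) with later outcomes."""
    if len(screen_scores) != len(later_at_risk):
        raise ValueError("need one outcome per screening score")
    counts = {"true_pos": 0, "false_pos": 0, "true_neg": 0, "false_neg": 0}
    for score, at_risk in zip(screen_scores, later_at_risk):
        flagged = score < cut_score
        if flagged and at_risk:
            counts["true_pos"] += 1
        elif flagged and not at_risk:
            counts["false_pos"] += 1   # identified as at risk but was not
        elif not flagged and at_risk:
            counts["false_neg"] += 1   # at risk but not identified
        else:
            counts["true_neg"] += 1
    at_risk_total = counts["true_pos"] + counts["false_neg"]
    not_at_risk_total = counts["true_neg"] + counts["false_pos"]
    # Sensitivity: share of truly at-risk students the screen caught
    counts["sensitivity"] = counts["true_pos"] / at_risk_total if at_risk_total else None
    # Specificity: share of not-at-risk students the screen correctly passed
    counts["specificity"] = counts["true_neg"] / not_at_risk_total if not_at_risk_total else None
    return counts


# Hypothetical fall screening scores and whether each student later struggled
fall_scores = [12, 25, 38, 41, 55, 60, 33, 70]
struggled_later = [True, True, False, True, False, False, True, False]

print(screening_accuracy(fall_scores, struggled_later, cut_score=40))
```

Rerunning the tally with different cut scores shows the trade-off: raising the cut score catches more at-risk students but flags more students who did not need help.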

41
Measurement and RTI: Progress Monitoring

Reliability coefficients of at least r = .90
Because you are looking at multiple data points
over time, it is possible to use a test with a
lower reliability (e.g. .80-.90), but wait until
you have several data points and use the combined
data to increase confidence in your decisions
Well documented treatment validity!

42
Measurement and RTI: Progress Monitoring (Cont'd)

Test and scores are very sensitive to increases
or decreases in student skills over time
Evidence of what slope of progress (how much
growth in a day, week or a month) is typical
under what conditions can greatly increase your
ability to make decisions
VERY brief, easy to use, affordable, alternate
forms, and results/reports are accessible
immediately
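
A slope of progress can be estimated by fitting a straight line through several weekly scores, which also reflects the advice to combine multiple data points before deciding. The sketch below uses an ordinary least-squares slope; the weekly scores and the goal of 1.5 points per week are hypothetical.

```python
def weekly_slope(scores):
    """Least-squares slope of scores against week number (0, 1, 2, ...)."""
    n = len(scores)
    if n < 2:
        raise ValueError("need at least two data points to estimate a slope")
    weeks = range(n)
    mean_week = (n - 1) / 2
    mean_score = sum(scores) / n
    numerator = sum((w - mean_week) * (s - mean_score) for w, s in zip(weeks, scores))
    denominator = sum((w - mean_week) ** 2 for w in weeks)
    return numerator / denominator


# Hypothetical weekly progress monitoring scores for one student
weekly_scores = [20, 22, 21, 25, 26, 28]
GOAL_SLOPE = 1.5  # hypothetical expected growth per week

slope = weekly_slope(weekly_scores)
print(f"growth per week: {slope:.2f}, meeting goal: {slope >= GOAL_SLOPE}")
```

Using the fitted slope rather than the difference between two single scores keeps one unusually high or low week from driving the decision.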

43
Measurement and RTI: Diagnostic Assessment for
Instructional Planning

Reliability coefficients of at least r = .80, ASSUMING you
are open to changing the instruction (formative
assessment) if your planning didn't work out as
you thought it might
Aligned with research on the development and
teaching of reading
Well documented treatment validity, utility for
instructional planning!
Time and cost efficient but specific enough to be
useful for designing effective interventions
Linked to standards and curriculum scope and
sequence

44
Measurement and RTI: Diagnostic Assessment for
Instructional Planning (Cont'd)

Many instructional planning tools have limited
information on reliability and validity. Look for
tools that do have data.
If creating your own tests, use best practices in
test construction.
Overall be sure you are doing standardized
frequent progress monitoring and looking at
student engaged time as other sources of
information to ensure instruction is well
planned.

45
RTI, General Outcome Measures and Curriculum
Based Measurement

Many schools use Curriculum Based Measurement
(CBM) general outcome measures for screening and
progress monitoring
You don't have to use CBM, but many schools do
Most common CBM tool in Grades 1-8 is Oral
Reading Fluency (ORF)
Measure of reading rate (number of words correct per
minute on a grade level passage) and a strong
indicator of overall reading skill, including
comprehension
Early Literacy Measures are also available such
as Nonsense Word Fluency (NWF), Phoneme
Segmentation Fluency (PSF), Letter Name Fluency
(LNF) and Letter Sound Fluency (LSF)

46
Why GOMs/CBM?

Typically meet the criteria needed for RTI
screening and progress monitoring
Reliable, valid, specific, sensitive, practical
Also, some utility for instructional planning
(e.g., grouping)
They are INDICATORS of whether there might be a
problem, not diagnostic!
Like taking your temperature or sticking a
toothpick into a cake
Oral reading fluency is a great INDICATOR of
reading decoding, fluency and reading
comprehension
Fluency based because automaticity helps
discriminate between students at different points
of learning a skill

47
[Diagram: GOM, CBM, DIBELS, AIMSweb]
48
CBM Oral Reading Fluency

Give 3 grade-level passages using standardized
administration and scoring; use the median (middle)
score
3-second rule (tell the student the word and point
to the next word)
Discontinue rule (stop after 0 correct in the first row;
if <10 correct on the 1st passage, do not give the other
passages)

Errors: hesitation for >3 seconds, incorrect pronunciation for context, omitted words, words out of order
Not errors: repeated sounds, self-corrects, skipped row, insertions, dialect/articulation
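
The scoring steps above can be sketched as follows. This is not an official scoring tool: the function names are made up for illustration, and recording the first passage score alone when the discontinue rule applies is an assumption of this sketch.

```python
import statistics


def words_correct_per_minute(words_attempted: int, errors: int, seconds: int = 60) -> float:
    """Words read correctly, scaled to one minute."""
    return (words_attempted - errors) * 60 / seconds


def orf_score(passage_scores):
    """Median of three passage scores, honoring the <10 discontinue rule."""
    if not passage_scores:
        raise ValueError("need at least the first passage score")
    first = passage_scores[0]
    if first < 10:
        # Discontinue rule: fewer than 10 correct on the 1st passage means
        # the other passages are not given. Assumption: report that score.
        return first
    if len(passage_scores) != 3:
        raise ValueError("three passage scores are needed to take the median")
    return statistics.median(passage_scores)


# Hypothetical student: 58 words attempted with 6 errors on the first passage
first_passage = words_correct_per_minute(58, 6)
print(orf_score([first_passage, 47, 60]))
```

Taking the median rather than the mean means one unusually easy or hard passage does not pull the score up or down.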
49
Fluency and Comprehension
The purpose of reading is comprehension
A good measure of overall reading proficiency is
reading fluency because of its strong correlation
to measures of comprehension.
50
The Importance of Multiple Sources of Information

No ONE test is going to serve all purposes or
give you all the information you need.
Use MULTIPLE sources of data to make the best
decisions
Screening, progress monitoring, diagnostic, and
evaluative data from multiple sources and/or
across time
Teacher observation and more formal observations
Other pieces of relevant information such as
behavior, attendance, health, the curriculum and
instructional environment, etc.
Look for CONVERGENCE of data: places where
several sources of data point to the same
decision or conclusion

51
Articles Available with this Module

Shoemaker, J. (2006). Reliability and Validity
Stats crib sheet from Heartland AEA (Iowa)
Traditional and Modern Concepts of Validity.
ERIC/AE Digest
Also see articles specific to particular uses of
measurement in benchmark and progress monitoring
modules

52
Recommended Resources

American Psychological Association, American
Educational Research Association, National
Council on Measurement in Education. (1985).
Standards for educational and psychological
testing. Washington, DC: American Psychological
Association.
Educational Measurement Text, e.g., texts by
Hogan, Marzano, or Salvia & Ysseldyke, or a good
Educational Psychology text that covers
reliability, validity and utility of measurement

53
Web Resource on Measurement

Heartland (Iowa) website link with powerpoints on
common myths and confusions about assessment
http://www.aea11.k12.ia.us/assessment/mythbuster.html

54
RTI Related Resources

National Center on RTI
http://www.rti4success.org/
RTI Action Network links for Assessment and
Universal Screening
http://www.rtinetwork.org
MN RTI Center
http://www.scred.k12.mn.us/ and click on link
National Center on Student Progress Monitoring
http://www.studentprogress.org/
Research Institute on Progress Monitoring
http://progressmonitoring.net/

55
RTI Related Resources (Cont'd)

National Association of School Psychologists
www.nasponline.org
National Association of State Directors of
Special Education (NASDSE)
www.nasdse.org
Council of Administrators of Special Education
www.casecec.org
Office of Special Education Programs (OSEP)
toolkit and RTI materials
http://www.osepideasthatwork.org/toolkit/ta_responsiveness_intervention.asp

56
Quiz

1. A purpose of assessment is what?
A.) Screening
B.) Diagnostic
C.) Progress Monitoring
D.) Evaluation
E.) All of the above
2. True or False? A test is useful for multiple
purposes as long as its manual or advertisement
says it is.

57
Quiz

3. The consistency of the measure is called its
what?
A.) Validity
B.) Reliability
C.) Criterion
D.) Sensitivity
4. If the test measures the construct it says it
measures it has?
A.) Validity
B.) Reliability
C.) Criterion
D.) Sensitivity

58
Quiz

True or False for each statement?
5.) Even if a test is not valid, it can still be
reliable.
6.) Even if a test is not reliable, it can still
be valid.
7.) Validity is not just about the test; it has
to do with the test use and interpretation, so
even a valid test can be used for the wrong
reasons, misinterpreted, or misused.

59
The End

Note: The MN RTI Center does not endorse any
particular product. Examples used are for
instructional purposes only.
Special Thanks
Thank you to Dr. Ann Casey, director of the MN
RTI Center, for her leadership
Thank you to Aimee Hochstein, Kristen Bouwman,
and Nathan Rowe, Minnesota State University
Moorhead graduate students, for editing, writing
quizzes, and enhancing the quality of these
training materials