RTI Measurement Overview: Measurement Concepts for RTI Decision Making


Transcript and Presenter's Notes

1
RTI Measurement Overview: Measurement Concepts
for RTI Decision Making
  • A module for pre-service and in-service
    professional development
  • MN RTI Center
  • Author: Lisa H. Stewart, PhD
  • Minnesota State University Moorhead
  • www.scred.k12.mn.us (click on RTI Center)

2
MN RTI Center Training Modules
  • This module was developed with funding from the
    MN legislature
  • It is part of a series of modules available from
    the MN RTI Center for use in preservice and
    inservice training

3
Overview
  • Purpose(s) of assessment
  • Characteristics of effective measurement for RTI
  • Critical features of measurement and RTI in the
    areas of screening, progress monitoring, and
    diagnostic instructional planning
  • CBM/GOMs as a frequently used RTI measurement
    tool
  • Multiple sources of information and convergence

4
Why Learn About Measurement?
  • "In God we trust; all others must have data."
  • Dr. Stan Deno

5
Assessment: One of the Key Components in RTI
  • Curriculum and Instruction
  • Assessment
  • Schoolwide Organization
  • Problem-Solving Systems (Teams, Process, etc.)
Adapted from Logan City School District, 2002
6
Measurement and Assessment
  • Schools have to make many choices about
    measurement tools and the process of gathering
    information used to make decisions (assessment)
  • We need different measurement tools for different
    purposes

7
Some Purposes of Assessment
  • Screening
  • Diagnostic / instructional planning
  • Monitoring student progress (formative)
  • Evaluation (summative)

8
Screening
  • Standardized measures given to all students to:
  • Help identify at-risk students in a PROACTIVE way
  • Give feedback to the system about how students
    progress throughout the year at a gross (e.g., 3x
    per year) level
  • If students are on track in the fall, are they
    still on track in the winter?
  • What is happening with students who started the
    year below target? Are they catching up?
  • Give feedback to the system about changes from
    year to year
  • Is our new reading curriculum having the impact
    we were expecting?

9
Diagnosis/Instructional Planning
  • Measures given to understand a student's skill
    level (strengths and weaknesses) to help guide:
  • Instructional grouping
  • Where to place the student in the curriculum and
    curricular materials
  • What skills are missing or weak and may need to
    be retaught or practiced, and the level of support
    and explicitness needed
  • Development or selection of curriculum and
    targeted interventions

10
Monitoring Student Progress (Formative)
  • Informally this happens all the time and helps
    teachers adjust their teaching on the spot
  • More formalized progress monitoring involves
    standardized measures, tied to important
    educational outcomes and given frequently (e.g.,
    weekly), to:
  • Prompt you to change what you are doing with a
    student if it is not working (formative
    assessment) so you are effective and efficient
    with your time and instruction
  • Make decisions about instructional goals,
    materials, levels, and groups
  • Aid in communication with parents
  • Document progress for special education students
    as required for periodic and annual reviews

11
Evaluation (Summative)
  • Measures used to provide a snapshot or summary
    of student skill at one particular point in time,
    often at the end of the instructional year or
    unit
  • E.g., state high-stakes tests
  • "When the cook tastes the soup, that's formative;
    when the guests taste the soup, that's
    summative." (Robert Stake)

12
One Test Can Serve More Than One Purpose
  • To the extent a test does more than one thing
    well, it is a more efficient use of student time
    and school resources
  • Example 1: Reading CBM measures of Oral Reading
    Fluency can be used for screening and progress
    monitoring
  • Example 2: the NWEA MAP test may be used for
    screening and instructional planning

13
Activity
  • On the Measurement Overview: Purposes of
    Assessment worksheet:
  • Make a list of all the tests you have learned
    about or have seen used in the school setting (or
    are currently in use in your school)
  • Try to decide what purpose(s) each test served

14
Assessment Tools and Purpose(s)
Name of Test | Purpose(s): Screening, Instructional Planning, Progress Monitoring, Program Evaluation







15
Buyer Beware
  • Although it is good if a test can serve more than
    one purpose, just because a test manual or
    advertisement SAYS it is useful for multiple
    purposes doesn't mean the test actually IS
    useful for multiple purposes
  • Example: Many tests designed for diagnostic
    purposes or for summative evaluation state they
    are also useful for progress monitoring, but they
    are too time consuming, too costly, too unreliable,
    or too insensitive to changes in student skills
    to be of practical use for progress monitoring

16
Establishing a Measurement System
  • A core feature of RTI is identifying a
    measurement system to:
  • Screen large numbers of students
  • Identify students in need of additional
    intervention
  • Monitor students of concern more frequently
  • 1 to 4x per month
  • Typically weekly
  • Diagnostic testing used for instructional
    planning to help target interventions as needed

17
Characteristics of an Effective Measurement
System for RTI
  • Valid
  • Reliable
  • Simple
  • Quick
  • Inexpensive
  • Easily understood
  • Can be given often
  • Sensitive to growth over short periods of time
Credit: K. Gibbons, M. Shinn
18
Technical Characteristics of Measurement Tools
  • Reliability: the consistency of the measure
  • If tested again right away, or by a different
    person, or with an alternate equivalent form of
    the test, the score should be similar
  • Allows us to have confidence in the score and to
    use it to generalize from what we see today to
    other times and situations
  • If a student knows how to decode simple words on
    a sheet of paper at 8am this morning, we would
    expect him to be able to decode similar simple
    words at noon and the next day

19
Why is Reliability so Important?
  • Assume you have a test that decides whether or
    not you need to take (and pay for) a remedial
    math class in college that does not count toward
    graduation.
  • The average score on the test is 50 points.
  • The test has a cut off score of 35, so students
    who score below 35 have to take the remedial
    class.

20
Why is Reliability so Important? (Contd)
  • If the test is reliable and you get a score of
    30, then if you take another version of the test
    or take the test again a week later (without major
    studying or changing what you know!), you would
    likely get a score very close to 30.
  • If the test is not reliable and you get a score
    of 30, you might be able to take the test again or
    take another version of the test and get a score
    of 40... or a score of 20!
  • If the test is unreliable, we can't have much
    faith in the score, and it becomes difficult to
    use the test to make decisions!

21
Validity
  • But what if the test IS reliable and you get a
    score of 30, but your math skills are much better
    than the score implies? What if you get a score
    of 30 but you don't really need a remedial math
    class?
  • Then the test has an issue with VALIDITY
  • A test is valid only if the interpretation of the
    test scores is supported
  • A common definition of validity is that the test
    measures what it says it measures
  • Another definition is that a test is valid if it
    helps you make better decisions or leads to
    better outcomes than if you had never given the
    test

22
Types of Validity
  • There are many ways to try to demonstrate
    validity:
  • Content validity
  • Criterion-related validity (concurrent and
    predictive)
  • Treatment validity
  • Construct validity

23
Types of Validity (Contd)
  • Content validity
  • The test content is reasonable
  • Criterion-related validity: two types
  • Concurrent: the scores from this test are similar
    to scores from other tests that measure the
    same/similar thing
  • Predictive: the scores from this test do a
    pretty good job of telling us what score a
    student will get on another test in the future

24
Types of Validity (Contd)
  • Treatment validity
  • If you use this test to decide on some treatment,
    intervention, or instructional approach:
  • Do you make better decisions?
  • Do you have better goals? Planning? Student
    engagement?
  • Most importantly: Are the outcomes for your
    students better?

25
Types of Validity (Contd)
  • Construct validity
  • Does the test measure the theoretical trait or
    characteristic?
  • E.g., if the theory says children need to have a
    base of solid decoding skills before they will be
    fast and fluent readers of new text, do the
    scores on the reading test of decoding and
    fluency support that?
  • All other ways to try to document validity are in
    some way also addressing construct validity
    (content, criterion, treatment, etc.)

26
The NOT Validity Kind of Validity
  • Face validity is NOT really validity
  • Positive: "It looks good"
  • Just because a test looks good, or you (or your
    colleague) like to give it, does not mean it gives
    you good information or is the best test to use
  • Negative: "I just don't like it"
  • Just because a test isn't set up exactly how you
    like it does not mean it does NOT give you good
    information
  • Look for EVIDENCE of reliability and validity;
    don't rely on your reaction, or the reactions and
    testimonials of colleagues, alone.

27
Reliability and Validity
  • Just because a test is reliable does not mean it
    is valid
  • It may reliably give you an inaccurate score!
  • If a test is not reliable, it cannot be valid
  • No test or test score is perfectly reliable
  • We use test scores to help make a variety of
    decisions: some low-stakes and some high-stakes
  • So how reliable is reliable enough?
  • It depends...

28
Measuring Reliability and Validity
  • Typically, reliability and validity evidence
    involves comparing the test to itself, to other
    tests, or to outcomes
  • The statistic used to sum up that comparison is
    often a correlation (r)
  • Correlations range from r = -1.0 to 1.0;
    reliability and validity coefficients typically
    fall between 0.0 and 1.0
  • The closer a correlation is to 1.0, the stronger
    the relationship, and the better you can predict
    one score or outcome if you know the other
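
To make this concrete, here is a minimal sketch (an illustration added to this transcript, not part of the original slides): a test-retest reliability coefficient is simply the Pearson correlation between two administrations of the same test. All student scores below are hypothetical.

```python
# Minimal sketch: test-retest reliability as a Pearson correlation.
# All scores are hypothetical, for illustration only.

def pearson_r(x, y):
    """Pearson correlation coefficient between two score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# The same 8 students tested twice, one week apart
week1 = [42, 55, 38, 61, 47, 52, 35, 58]
week2 = [45, 53, 40, 59, 46, 55, 33, 60]

print(f"Test-retest reliability: r = {pearson_r(week1, week2):.2f}")
```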

29
How Reliable is Reliable Enough?
  • For important INDIVIDUAL decisions? r ≥ .90
  • For SCREENING decisions? r ≥ .80
    (Salvia & Ysseldyke, 2006)
  • "Reliability is like money: as long as you have
    it, it's not a problem, but if you don't, it's a
    BIG problem!" (Fred Kerlinger)

30
How Valid is Valid Enough?
Ranges    Interpretation
.00-.20   Little/no validity
.21-.40   Below average validity
.41-.55   Average validity
.56-.80   Above average validity
.80-.99   Exceptional validity
Source: Webb, M. W. (1983). Journal of Reading, 26(5), 414-424.
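
As a small added illustration, a validity coefficient can be mapped to the interpretation ranges in the table above like this:

```python
# Sketch: map a validity coefficient to the Webb (1983) ranges above.

def interpret_validity(r):
    """Return the qualitative label for a validity coefficient r."""
    if r <= 0.20:
        return "Little/no validity"
    elif r <= 0.40:
        return "Below average validity"
    elif r <= 0.55:
        return "Average validity"
    elif r <= 0.80:
        return "Above average validity"
    else:
        return "Exceptional validity"

print(interpret_validity(0.62))  # Above average validity
```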
31
Looking at Validity With a Purpose in Mind
  • Predictive validity is really important if you
    are using the test as a screening tool to predict
    which students are or are not at risk of reading
    difficulty
  • Treatment validity is really important if you are
    using the test in an effort to lead to some sort
    of improved outcome

32
Validity Isn't Just About the Test
  • Validity has to do with the test's use and
    interpretation, so even a "valid" test can be
    used for the wrong reasons, or misinterpreted, or
    misused
  • Example 1: A test score for an ELL student
    should reflect the student's skills, not her
    ability to understand the directions and what is
    being asked
  • Example 2 is on the next slide

33
Validity Isn't Just About the Test (Cont'd)
  • Example 2: Letter Naming Fluency (LNF)
  • LNF involves giving a student a page of
    randomized upper- and lower-case letters and
    having the student name as many letters as they
    can in one minute.
  • As a test of early literacy, LNF has good
    reliability and concurrent and predictive
    validity, especially predictive validity
  • However, it can be easily MISUSED
  • If interpreted correctly, LNF can identify
    students at risk for early reading difficulty and
    get those students into well-rounded early
    literacy instruction well suited to them
  • BUT if it is interpreted to mean that a student
    low in LNF just needs a lot of instructional time
    spent only on learning letter names (often taking
    time away from high-quality, well-rounded early
    literacy instruction), it can actually have a
    negative impact.

34
Test Utility
  • Is it easy to use, time efficient, and cheap?
  • Even if a test is reliable and valid, if it is
    too difficult to use, too time consuming, or too
    expensive, it just won't get used
  • If a reliable and valid progress monitoring tool
    took 30 minutes per child and you wanted to
    monitor 10 students in your class every week,
    would you use it?
  • However, if a test is easy and short and cheap
    but isn't reliable or valid, it's still a waste
    of time, no matter how short!

35
Test Utility (Contd)
  • Is it sensitive enough for the decisions you want
    to make?
  • Can it detect the differences between groups of
    kids, or within an individual, that you need to
    help you make a decision?
  • If a progress monitoring tool can only show gains
    of 1 point per month, is it sensitive enough to
    give you timely feedback on a student's response
    to your instruction?

36
Activity
  • On the Characteristics of Assessment Tools for
    RTI worksheet:
  • Make a list of tests you have learned about or
    have seen used in the school setting (or that are
    currently in use in your school)
  • You can use all or some of the tools from the
    Purposes of Assessment worksheet for your list
  • Is the test reliable and valid FOR THE PURPOSE
    FOR WHICH IT IS BEING USED?
  • Is it quick and simple?
  • Is it inexpensive?
  • Can it be given often (has alternate forms, etc.)?
  • Is it sensitive?

37
Characteristics of Assessment Tools for RTI
Name of tool | Reliable | Valid | Quick & simple | Cheap | Can be given often | Sensitive to growth over short time





38
Some Help in Looking for Evidence
  • Measurement tools are reviewed at the following
    sites:
  • www.rti4success.org
  • www.studentprogress.org
  • These sites review only the tests that were
    submitted; if a tool is not on the list, that
    doesn't mean it is bad, just that it wasn't
    reviewed
  • Be sure you know the purpose of the assessment
    (screening, progress monitoring, etc.) to best
    interpret the information

39
Critical Features of Measurement and RTI
  • Screening
  • Progress Monitoring
  • Diagnostic Instructional Planning

40
Measurement and RTI: Screening
  • Reliability coefficients of at least r = .80.
    Higher is better, especially for screening
    specificity.
  • Well-documented predictive validity
  • Evidence that the criterion (cut score) being
    used is reasonable and creates neither too many
    false positives (students identified as at risk
    who aren't) nor too many false negatives
    (students who are at risk but aren't identified
    as such)
  • Brief, easy to use, affordable, and
    results/reports are accessible almost immediately
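
As a hedged illustration (added here, not from the original module), this is how a school might check the false-positive and false-negative behavior of a cut score against a later outcome. The student data are hypothetical.

```python
# Sketch: counting false positives/negatives for a screening cut score.
# Each tuple is (screening score, did the student actually struggle later?).
# All data are hypothetical.

students = [
    (28, True), (33, True), (36, True), (41, False),
    (30, False), (45, False), (38, False), (34, True),
]

CUT_SCORE = 35  # students scoring below this are flagged as at risk

false_pos = sum(1 for score, at_risk in students
                if score < CUT_SCORE and not at_risk)
false_neg = sum(1 for score, at_risk in students
                if score >= CUT_SCORE and at_risk)

print(f"False positives (flagged, but not at risk): {false_pos}")
print(f"False negatives (at risk, but not flagged): {false_neg}")
```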

41
Measurement and RTI: Progress Monitoring
  • Reliability coefficients of r ≥ .90
  • Because you are looking at multiple data points
    over time, it is possible to use a test with
    lower reliability (e.g., .80-.90), but wait until
    you have several data points and use the combined
    data to increase confidence in your decisions
  • Well-documented treatment validity!

42
Measurement and RTI: Progress Monitoring (Cont'd)
  • Test and scores are very sensitive to increases
    or decreases in student skills over time
  • Evidence of what slope of progress (how much
    growth in a day, a week, or a month) is typical
    under what conditions can greatly increase your
    ability to make decisions
  • VERY brief, easy to use, affordable, has
    alternate forms, and results/reports are
    accessible immediately
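
To show what a slope of progress looks like in practice, here is a minimal sketch (an added illustration with hypothetical weekly scores) that fits a least-squares line to progress monitoring data:

```python
# Sketch: estimating a student's growth rate (slope) from weekly
# progress monitoring scores. Scores are hypothetical words correct
# per minute (WCPM).

def ols_slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

weeks = [1, 2, 3, 4, 5, 6]
wcpm = [52, 54, 53, 57, 58, 61]

print(f"Estimated growth: {ols_slope(weeks, wcpm):.1f} WCPM per week")
```

In practice the slope is compared to the growth needed to reach the student's goal, and several data points are needed before the estimate is stable.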

43
Measurement and RTI: Diagnostic Assessment for
Instructional Planning
  • Reliability coefficients of r ≥ .80, ASSUMING you
    are open to changing the instruction (formative
    assessment) if your planning didn't work out as
    you thought it might
  • Aligned with research on the development and
    teaching of reading
  • Well-documented treatment validity and utility
    for instructional planning!
  • Time and cost efficient, but specific enough to
    be useful for designing effective interventions
  • Linked to standards and to curriculum scope and
    sequence

44
Measurement and RTI: Diagnostic Assessment for
Instructional Planning (Cont'd)
  • Many instructional planning tools have limited
    information on reliability and validity. Look for
    tools that do have data.
  • If creating your own tests, use best practices in
    test construction.
  • Overall, be sure you are doing standardized,
    frequent progress monitoring and looking at
    student engaged time as other sources of
    information to ensure instruction is well
    planned.

45
RTI, General Outcome Measures and Curriculum
Based Measurement
  • Many schools use Curriculum-Based Measurement
    (CBM) general outcome measures for screening and
    progress monitoring
  • You don't have to use CBM, but many schools do
  • The most common CBM tool in Grades 1-8 is Oral
    Reading Fluency (ORF)
  • A measure of reading rate (number of words
    correct per minute on a grade-level passage) and
    a strong indicator of overall reading skill,
    including comprehension
  • Early literacy measures are also available, such
    as Nonsense Word Fluency (NWF), Phoneme
    Segmentation Fluency (PSF), Letter Name Fluency
    (LNF), and Letter Sound Fluency (LSF)

46
Why GOMs/CBM?
  • Typically meet the criteria needed for RTI
    screening and progress monitoring
  • Reliable, valid, specific, sensitive, practical
  • Also, some utility for instructional planning
    (e.g., grouping)
  • They are INDICATORS of whether there might be a
    problem, not diagnostic!
  • Like taking your temperature or sticking a
    toothpick into a cake
  • Oral reading fluency is a great INDICATOR of
    reading decoding, fluency, and reading
    comprehension
  • Fluency based, because automaticity helps
    discriminate between students at different points
    of learning a skill

47
GOM/CBM examples: DIBELS, AIMSweb
48
CBM Oral Reading Fluency
  • Give 3 grade-level passages using standardized
    administration and scoring; use the median
    (middle) score
  • 3-second rule (tell the student the word and
    point to the next word)
  • Discontinue rule (stop if 0 correct in the first
    row; if <10 correct on the 1st passage, do not
    give the other passages)

Errors: Hesitation for >3 seconds; Incorrect pronunciation for context; Omitted words; Words out of order
Not errors: Repeated sounds; Self-corrects; Skipped row; Insertions; Dialect/articulation
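
As a brief added illustration of the scoring just described (the passage counts are hypothetical): words correct per minute is words read minus errors, and the median of the three passages is the score used for decisions.

```python
# Sketch: CBM-ORF scoring. Score each of the 3 grade-level passages
# as words correct per minute (words read minus errors), then use
# the median. Counts are hypothetical.

from statistics import median

passages = [(68, 5), (74, 3), (71, 6)]  # (words read in 1 min, errors)

wcpm = [read - errors for read, errors in passages]
print(f"Passage scores: {wcpm}")             # [63, 71, 65]
print(f"Median score used: {median(wcpm)}")  # 65
```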
49
Fluency and Comprehension
The purpose of reading is comprehension.
A good measure of overall reading proficiency is
reading fluency, because of its strong correlation
with measures of comprehension.
50
The Importance of Multiple Sources of Information
  • No ONE test is going to serve all purposes or
    give you all the information you need.
  • Use MULTIPLE sources of data to make the best
    decisions:
  • Screening, progress monitoring, diagnostic, and
    evaluative data from multiple sources and/or
    across time
  • Teacher observation and more formal observations
  • Other pieces of relevant information, such as
    behavior, attendance, health, the curriculum and
    instructional environment, etc.
  • Look for CONVERGENCE of data: places where
    several sources of data point to the same
    decision or conclusion

51
Articles Available with this Module
  • Shoemaker, J. (2006). Reliability and Validity
  • Stats crib sheet from Heartland AEA (Iowa)
  • Traditional and Modern Concepts of Validity.
    ERIC/AE Digest
  • Also see articles specific to particular uses of
    measurement in benchmark and progress monitoring
    modules

52
Recommended Resources
  • American Psychological Association, American
    Educational Research Association, & National
    Council on Measurement in Education. (1985).
    Standards for educational and psychological
    testing. Washington, DC: American Psychological
    Association.
  • An educational measurement text (e.g., texts by
    Hogan, Marzano, or Salvia & Ysseldyke), or a good
    educational psychology text that covers
    reliability, validity, and utility of measurement

53
Web Resource on Measurement
  • Heartland AEA (Iowa) website with PowerPoints on
    common myths and confusions about assessment
  • http://www.aea11.k12.ia.us/assessment/mythbuster.html

54
RTI Related Resources
  • National Center on RTI
  • http://www.rti4success.org/
  • RTI Action Network links for Assessment and
    Universal Screening
  • http://www.rtinetwork.org
  • MN RTI Center
  • http://www.scred.k12.mn.us/ (click on the RTI
    Center link)
  • National Center on Student Progress Monitoring
  • http://www.studentprogress.org/
  • Research Institute on Progress Monitoring
  • http://progressmonitoring.net/

55
RTI Related Resources (Contd)
  • National Association of School Psychologists
  • www.nasponline.org
  • National Association of State Directors of
    Special Education (NASDSE)
  • www.nasdse.org
  • Council of Administrators of Special Education
  • www.casecec.org
  • Office of Special Education Programs (OSEP)
    toolkit and RTI materials
  • http://www.osepideasthatwork.org/toolkit/ta_responsiveness_intervention.asp

56
Quiz
  • 1. A purpose of assessment is what?
  • A.) Screening
  • B.) Diagnostic
  • C.) Progress Monitoring
  • D.) Evaluation
  • E.) All of the above
  • 2. True or False? A test is useful for multiple
    purposes as long as its manual or advertisement
    says it is.

57
Quiz
  • 3. The consistency of the measure is called its
    what?
  • A.) Validity
  • B.) Reliability
  • C.) Criterion
  • D.) Sensitivity
  • 4. If a test measures the construct it says it
    measures, it has what?
  • A.) Validity
  • B.) Reliability
  • C.) Criterion
  • D.) Sensitivity

58
Quiz
  • True or False for each statement?
  • 5.) Even if a test is not valid, it can still be
    reliable.
  • 6.) Even if a test is not reliable, it can still
    be valid.
  • 7.) Validity is not just about the test; it has
    to do with the test's use and interpretation, so
    even a "valid" test can be used for the wrong
    reasons, misinterpreted, or misused.

59
The End
  • Note: The MN RTI Center does not endorse any
    particular product. Examples used are for
    instructional purposes only.
  • Special Thanks:
  • Thank you to Dr. Ann Casey, director of the MN
    RTI Center, for her leadership
  • Thank you to Aimee Hochstein, Kristen Bouwman,
    and Nathan Rowe, Minnesota State University
    Moorhead graduate students, for editing, writing
    quizzes, and enhancing the quality of these
    training materials