Evaluation, Measurement and Assessment - PowerPoint PPT Presentation

About This Presentation
Title:

Evaluation, Measurement and Assessment

Description:

Title: PowerPoint Presentation Author: Katrina Aldrich Last modified by. Created Date: 12/1/2003 2:22:47 AM Document presentation format: On-screen Show – PowerPoint PPT presentation

Slides: 26
Provided by: Katrina58
Learn more at: http://people.uncw.edu

Transcript and Presenter's Notes

Title: Evaluation, Measurement and Assessment


1
Evaluation, Measurement and Assessment Cluster 14
2
Basic Terminology
  • Evaluation: a judgment or decision about
    performance
  • Measurement: a number representing an evaluation
  • Assessment: a procedure for gathering
    information (there is a variety of them)
  • Norm-referenced testing: testing in which scores
    are compared with the average performance of
    others
  • Criterion-referenced testing: testing in which
    scores are compared to a fixed, set performance
    standard; measures the mastery of very specific
    objectives
  • Example: driver's license exam

3
Norm-Referenced Tests
  • Performance of others as the basis for interpreting a
    person's raw score (actual number of correct test
    items)
  • Three types: 1) Class 2) School District
    3) National
  • Score reflects general knowledge vs. mastery of
    specific skills and information
  • Uses: measuring overall achievement and
    selecting a few top candidates
  • Limitations:
      • no indication of whether prerequisite knowledge for
        more advanced material has been mastered
      • less appropriate for measuring affective and
        psychomotor objectives
      • encourages competition and comparison of scores

4
Criterion-Referenced Tests
  • Comparison with a fixed standard
  • Example: driver's license exam
  • Use: measure mastery of a very specific
    objective when the goal is to achieve a set standard
  • Limitations:
      • absolute standards are difficult to set in some areas
      • standards tend to be arbitrary
      • not appropriate when comparisons with others are
        valuable

5
Comparing Norm- and Criterion-Referenced Tests
  • Criterion-referenced:
      • Mastery
      • Basic skills
      • Prerequisites
      • Affective
      • Psychomotor
      • Grouping for instruction
  • Norm-referenced:
      • General ability
      • Range of ability
      • Large groups
      • Compares people to people (comparison groups)
      • Selecting top candidates

6
What do Test Scores Mean?
  • Basic Concepts
  • Standardized test: a test given under uniform
    conditions and scored and reported according to
    uniform procedures. Items and instructions have
    been tried out and administered to a norming sample
    group
  • Norming sample: a large sample of students serving
    as a comparison group for scoring standardized
    tests
  • Frequency distribution: a record showing how many
    scores fall into set groups, listing the number of
    people who obtained particular scores
  • Central tendency: the typical score for a group of
    scores. Three measures:
      • Mean: the average
      • Median: the middle score
      • Mode: the most frequent score (bimodal: two modes)
  • Variability: the degree of difference or deviation
    from the mean
      • Range: the difference between the highest and
        lowest score
      • Standard deviation: a measure of how widely the
        scores vary from the mean; the further scores
        fall from the mean, the greater the SD
  • Normal distribution: the bell-shaped curve is an
    example (Figure 39.2, p. 509)
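
The measures of central tendency and the range can be computed directly. The sketch below uses Python's standard `statistics` module; the score list is hypothetical and is not taken from the presentation.

```python
from statistics import mean, median, multimode

# Hypothetical raw scores for one class (illustrative only).
scores = [70, 75, 80, 80, 85, 90, 95]

def describe(scores):
    """Return the three central-tendency measures plus the range."""
    return {
        "mean": mean(scores),                 # arithmetic average
        "median": median(scores),             # middle score
        "modes": multimode(scores),           # most frequent score(s); two entries means bimodal
        "range": max(scores) - min(scores),   # highest minus lowest score
    }

summary = describe(scores)
for name, value in summary.items():
    print(name, value)
```

`multimode` returns a list so that a bimodal distribution shows both modes rather than only one.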

7
Frequency Distribution Histogram (bar graph of a
frequency distribution)
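
Since the slide's figure is not reproduced in this transcript, here is a minimal sketch of how a frequency distribution and a text histogram could be built. The scores and the helper name `frequency_distribution` are hypothetical.

```python
from collections import Counter

# Hypothetical raw scores (illustrative only).
scores = [70, 75, 80, 80, 85, 90, 95, 80, 75]

def frequency_distribution(scores):
    """Map each score to the number of people who obtained it, in score order."""
    return dict(sorted(Counter(scores).items()))

distribution = frequency_distribution(scores)

# One row per score; the bar length is the frequency.
for score, count in distribution.items():
    print(f"{score:>3} | {'#' * count}")
```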
8
Calculating the Standard Deviation
  • Calculate the mean: $\bar{X}$
  • Subtract the mean from each score: $(X - \bar{X})$
  • Square each difference: $(X - \bar{X})^2$
  • Add all the squared differences: $\sum (X - \bar{X})^2$
  • Divide by the number of scores: $\frac{\sum (X - \bar{X})^2}{N}$
  • Find the square root: $\sqrt{\frac{\sum (X - \bar{X})^2}{N}}$
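
The six steps above can be followed one at a time in code. This is a minimal sketch that divides by N, as the slide does (the population form); the sample scores are hypothetical.

```python
import math

def standard_deviation(scores):
    """Standard deviation computed step by step, dividing by N."""
    n = len(scores)
    mean = sum(scores) / n                       # 1. calculate the mean
    differences = [x - mean for x in scores]     # 2. subtract the mean from each score
    squared = [d ** 2 for d in differences]      # 3. square each difference
    total = sum(squared)                         # 4. add all the squared differences
    variance = total / n                         # 5. divide by the number of scores
    return math.sqrt(variance)                   # 6. find the square root

# Hypothetical scores: mean 5, squared differences sum to 32, 32 / 8 = 4.
print(standard_deviation([2, 4, 4, 4, 5, 5, 7, 9]))  # 2.0
```

The standard library's `statistics.pstdev` performs the same division by N, while `statistics.stdev` divides by N - 1 and would give a slightly larger value.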

9
Normal Distributions
  • The bell curve
  • Mean, median, mode all at the center of the curve
  • 50% of scores above the mean
  • 50% of scores below the mean
  • 68% of scores within one standard deviation from
    the mean
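
These proportions can be checked against the standard normal curve. A short sketch using `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Proportion of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

print(1 - standard_normal.cdf(0))   # share of scores above the mean: 0.5
print(round(share_within(1), 4))    # share within one SD: 0.6827
```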

10
Types of Scores
  • Percentile rank: percentage of those in the
    norming sample who scored at or below a raw score
  • Grade-equivalent score: tells whether students are
    performing at levels equivalent to those of other
    students at their own age/grade level
      • averages obtained from different norming samples
        for each grade
      • different forms of the test often used for
        different grades
      • a high score indicates superior mastery of
        material at that grade level rather than the
        capacity/ability for doing advanced work
      • often misleading
  • Standard scores: scores based on the standard
    deviation
      • z scores: standard score indicating the number
        of standard deviations a person is above or below
        the mean; negative for scores below the mean
      • T scores: standard score with a mean of 50 and a
        standard deviation of 10; no negative numbers
      • Stanine scores: whole-number scores from 1 to 9,
        each representing a wide range of raw scores
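
The score types above are simple transformations of a raw score. The sketch below is illustrative only: the function names and data are hypothetical, and the stanine rule (mean 5, standard deviation 2, rounded and limited to 1 through 9) is a common approximation assumed here rather than something stated on the slide.

```python
import math

def percentile_rank(raw, norming_sample):
    """Percentage of the norming sample scoring at or below the raw score."""
    at_or_below = sum(1 for score in norming_sample if score <= raw)
    return 100 * at_or_below / len(norming_sample)

def z_score(raw, mean, sd):
    """Number of standard deviations above (+) or below (-) the mean."""
    return (raw - mean) / sd

def t_score(z):
    """Standard score rescaled to a mean of 50 and a standard deviation of 10."""
    return 50 + 10 * z

def stanine(z):
    """Approximate stanine: mean 5, SD 2, rounded half up, limited to 1..9 (assumption)."""
    return min(9, max(1, math.floor(2 * z + 5.5)))

# Hypothetical test with mean 75 and standard deviation 10.
z = z_score(85, mean=75, sd=10)
print(z, t_score(z), stanine(z))                        # 1.0 60.0 7
print(percentile_rank(80, [60, 70, 80, 90, 100]))       # 60.0
```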

11
Interpreting Test Scores
  • No test provides a perfect picture of one's
    abilities
  • Reliability: consistency of test results
      • Test-retest reliability: consistency of scores on
        two separate administrations of the same test
      • Alternate-form reliability: consistency of scores
        on two equivalent versions of a test
      • Split-half reliability: degree to which all the
        test items measure the same abilities
  • True score: hypothetical mean of all of an
    individual's scores if testing were repeated under
    ideal conditions
  • Standard error of measurement: standard
    deviation of scores from the hypothetical true score;
    the smaller the standard error, the more reliable
    the test
  • Confidence interval: range of scores within
    which an individual's true score is
    likely to fall
  • Validity
      • Content-related: do test items reflect content
        addressed in class/texts?
      • Criterion-related (e.g., PSAT and SAT): predictor
        of performance based on a prior measure
      • Construct-related (e.g., IQ, motivation): evidence
        gathered over years
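
To make the link between reliability, the standard error of measurement, and confidence intervals concrete, here is a hedged sketch. The formula SEM = SD × sqrt(1 − reliability) comes from classical test theory and is not stated on the slide; the test's SD (15), its reliability (0.91), and the observed score (100) are hypothetical.

```python
import math

def standard_error_of_measurement(sd, reliability):
    """Classical test theory estimate (assumption): SEM = SD * sqrt(1 - reliability)."""
    return sd * math.sqrt(1 - reliability)

def confidence_band(observed, sem, width=1.0):
    """Range of observed score plus or minus `width` standard errors."""
    return observed - width * sem, observed + width * sem

sem = standard_error_of_measurement(sd=15, reliability=0.91)
low, high = confidence_band(observed=100, sem=sem)
print(f"SEM = {sem:.1f}, band = {low:.1f} to {high:.1f}")
```

With these numbers the SEM is about 4.5, so a band of one standard error around an observed score of 100 runs from roughly 95.5 to 104.5. A more reliable test gives a smaller SEM and a narrower band.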

See Guidelines, p. 514: Increasing Reliability
and Validity
12
Achievement Tests
  • Measure how much a student has learned in specific
    content areas
  • Frequently used achievement tests
      • Group tests: for identifying students who need
        more testing or for homogeneous ability grouping
      • Individual tests: for determining academic level
        or diagnosing learning problems
  • The standardized scores reported
      • NS: National Stanine Score
      • NCE: Normal Curve Equivalent
      • SS: Scale Score
      • NCR: Raw score
      • NP: National Percentile
      • Range
  • See Figure 40.1, p. 520-521
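
Two of the reported scores are directly related: a normal curve equivalent (NCE) is a rescaled z score chosen so that NCEs of 1, 50, and 99 line up with national percentiles of 1, 50, and 99. The sketch below uses the conventional scaling NCE = 50 + 21.06 z; this conversion is background knowledge rather than something taken from the slides, and the function name is hypothetical.

```python
from statistics import NormalDist

def nce_from_percentile(national_percentile):
    """Convert a national percentile (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(national_percentile / 100)  # z score at that percentile
    return 50 + 21.06 * z

for percentile in (1, 25, 50, 75, 99):
    print(percentile, round(nce_from_percentile(percentile), 1))
```

Unlike percentiles, NCEs are equal-interval units, which is why they are often used when scores need to be averaged.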

13
Diagnostic Tests
  • Identify strengths and weaknesses
  • Most often used by trained professionals
  • Elementary teachers may use for reading, math

Aptitude Tests
  • Measure abilities developed over years
  • Used to predict future performance
  • SAT/PSAT
  • ACT/SCAT
  • IQ and aptitude
  • Discussing test scores with families
  • Controversy continues over fairness, validity,
    and bias

14
Issues in Testing
  • Widespread testing (see Table 14.3, p. 534)
  • Accountability and high-stakes testing: misuses
    (Table 40.3, p. 526)
  • Testing teachers: accountability for student
    performance as well as teacher knowledge measured
    by teacher tests

See Point/Counterpoint, p. 525
Desired Characteristics of a Testing Program
1) Match the content standards of the district
2) Be part of a larger assessment plan
3) Test complex thinking
4) Provide alternative assessment strategies for students with disabilities
5) Provide opportunities for retesting
6) Include all students
7) Provide appropriate remediation
8) Make sure all students have had adequate opportunity to learn the material
9) Take into account the student's language
10) Use test results FOR children, not AGAINST them
15
New Directions in Standardized Testing
  • Authentic assessments
      • Problem of how to assess complex, important,
        real-life outcomes
      • some states are developing/have developed
        authentic assessment procedures
  • Constructed-response formats: have students
    create, rather than select, responses; demand
    more thoughtful scoring
  • Changes in the SAT: now has a writing component
  • Accommodating diversity in testing

16
Formative Assessments
  • Two basic purposes: 1) guide teachers in planning
    2) help to identify problem areas
  • Pretests
      • Aid the teacher in planning: what learners know
        and don't know
      • Identify weaknesses (diagnostic)
      • Are not graded

Summative Assessments
  • Occur at the end of instruction
  • Provide a summary of accomplishments
  • End of chapter, midterms, final exam
  • Purpose is to determine final achievement

17
Planning for Testing
  • Test frequently
  • Test soon after learning
  • Use cumulative questions
  • Preview ready-made tests

Objective Testing
  • Objective: not open to many interpretations
  • Measures a broad range of material
  • Multiple choice: most versatile
  • Lower and higher level items
  • Difficult to write well
  • Easy to score

18
Key Principles: Writing Multiple-Choice Questions
  • Clearly written stem
  • Present a single problem
  • Avoid unessential details
  • State the problem in positive terms
  • Use "not," "no," or "except" sparingly, or mark
    them clearly: NOT, NO, EXCEPT
  • Do not test extremely fine discriminations
  • Put most wording in the stem
  • Check for grammatical match between stem and
    alternatives
  • Avoid exclusive and inclusive words: all, every,
    only, never, none
  • Avoid two distracters with the same meaning
  • Avoid exact textbook language
  • Avoid overuse of "all of the above" or "none of
    the above"
  • Use plausible distracters
  • Vary the position of the correct answer
  • Vary the length of correct answers; long answers
    are often correct
  • Avoid obvious patterns in the position of your
    correct answer

19
Essay Testing
  • Requires students to create an answer
  • Most difficult part is judging quality of answers
  • Writing good, clear questions can be challenging
  • Essay tests focus on less material
  • Require a clear and precise task
  • Indicate the elements to be covered
  • Allow ample time for students to answer
  • Should be limited to complex learning objectives
  • Should include only a few questions

20
Evaluating Essays: Dangers
  • Problems with subjective testing
  • Individual standards of the grader
  • Unreliability of scoring procedures
  • Bias: wordy, neatly written essays with few
    grammatical errors often get more points even
    though they may be completely off point

Evaluating Essays: Methods
  • Construct a model answer
  • Give points for each part of the answer
  • Give points for organization
  • Compare answers on papers that you gave
    comparable grades
  • Grade all answers to one question before moving
    on to the next question/test
  • Have another teacher grade tests as a cross-check

21
Effects of Grades and Grading
  • Effects of failure: can be a positive or negative
    motivator
  • Effects of feedback
      • helpful if the reason for a mistake is clearly
        explained, in a positive, constructive format, so
        that the same mistake is not repeated
      • encouraging, personalized written comments are
        appropriate
      • oral feedback and brief written comments for
        younger students
  • Grades and motivation
      • grades can motivate real learning, but
        appropriate objectives are the key
      • grades should reflect meaningful learning
      • working for a grade and working for learning
        should be the same
  • Grading and reporting
      • Criterion-referenced vs. norm-referenced

22
Criterion-Referenced Grading
  • Mastery of objectives
  • Criteria for grades set in advance
  • Student determines what grade they want to
    receive
  • All students could receive an A

Norm-Referenced Grading
  • Grading on the curve
  • Students compared to other students
  • Average becomes the anchor for other grades
  • Fairness issue
  • Adjusting the curve

23
Point System and Percentage Grading
  • Point system for combining grades from many
    assignments
  • Points assigned according to each assignment's
    importance and the student's performance
  • Grades are influenced by level of difficulty of
    the test and concerns of the teacher
  • Percentage grading involves assigning grades
    based on how much knowledge each student has
    acquired
  • Grading symbols A-F commonly used to represent
    percentage categories
  • Grades are influenced by level of difficulty of
    tests/assignments and concerns of the individual
    teacher
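
As a small worked example of combining the two systems, the sketch below totals points across assignments, converts the total to a percentage, and maps it to a letter. The assignments, point values, and the 90/80/70/60 cutoffs are hypothetical.

```python
# (assignment, points earned, points possible); more important work carries more points.
gradebook = [
    ("quiz", 18, 20),
    ("project", 45, 50),
    ("final exam", 80, 100),
]

def percentage(gradebook):
    """Total points earned as a percentage of total points possible."""
    earned = sum(points for _, points, _ in gradebook)
    possible = sum(maximum for _, _, maximum in gradebook)
    return 100 * earned / possible

def letter_grade(percent, cutoffs=((90, "A"), (80, "B"), (70, "C"), (60, "D"))):
    """Map a percentage to a letter using assumed cutoffs."""
    for minimum, letter in cutoffs:
        if percent >= minimum:
            return letter
    return "F"

percent = percentage(gradebook)
print(f"{percent:.1f}% -> {letter_grade(percent)}")  # 84.1% -> B
```

Because the final exam is worth 100 of the 170 possible points, it dominates the result; changing the point values changes each assignment's weight.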

Contract System and Rubrics
  • Specific types, quantity and quality of work
    required for each grade
  • Students contract to work for a grade-great
    start over
  • Can overemphasize quantity of work at the expense
    of quality
  • Revise option: revise and improve work

24
Effort and Improvement Grades?
  • BIG question: Should grades be based on how much
    a student improves or on the final level of
    learning?
  • Using improvement as a standard penalizes the
    best students who naturally improve the least
  • Individual Learning Expectations (ILE) system
    allows everyone to earn improvement points based
    on personal averages
  • Dual Marking system is a way to include effort in
    grades

25
Parent/Teacher Conferences
  • Make plenty of deposits starting on week two!
  • Plan ahead
  • Start positive
  • Use active listening and problem solving
  • Establish a partnership
  • Plan follow-up contacts
  • Tell the truth!
  • Be prepared with samples
  • End positive