2
Designing an assessment system
  • Presentation to the Scottish Qualifications
    Authority, August 2007
  • Dylan Wiliam
  • Institute of Education, University of London
  • www.dylanwiliam.net

3
Overview
  • The purposes of assessment
  • The structure of the assessment system
  • The locus of assessment
  • The extensiveness of the assessment
  • Assessment format
  • Scoring models
  • Quality issues
  • The role of teachers
  • Contextual issues

4
Functions of assessment
  • Three functions of assessment:
    • For evaluating institutions (evaluative)
    • For describing individuals (summative)
    • For supporting learning:
      • Monitoring learning: whether learning is taking place
      • Diagnosing (informing) learning: what is not being learnt
      • Forming learning: what to do about it
  • No system can easily support all three functions
  • Traditionally, we have grouped the first two and ignored the third
  • Learning is sidelined; the summative and evaluative functions are weakened
  • Instead, we need to separate the first (evaluative) from the other two

5
The Lake Wobegon effect
"All the women are strong, all the men are good-looking, and all the children are above average." (Garrison Keillor)

[Chart: test scores rising over time]

6
Goodhart's law
  • All performance indicators lose their usefulness when used as objects of policy:
    • Privatization of British Rail
    • Targets in the Health Service
    • "Bubble" students in high-stakes settings

7
Reconciling different pressures
  • The high-stakes genie is out of the bottle, and
    we cannot put it back
  • The clearer you are about what you want, the more
    likely you are to get it, but the less likely it
    is to mean anything
  • The only thing left to us is to try to develop
    tests worth teaching to
  • This is fundamentally an issue of validity.

8
Validity
  • Validity is a property of inferences, not of
    assessments
  • "One validates, not a test, but an interpretation of data arising from a specified procedure" (Cronbach, 1971; emphasis in original)
  • No such thing as a valid (or indeed invalid)
    assessment
  • No such thing as a biased assessment
  • A pons asinorum for thinking about assessment

9
Threats to validity
  • Inadequate reliability
  • Construct-irrelevant variance
    • The assessment includes aspects that are irrelevant to the construct of interest
    • The assessment is "too big"
  • Construct under-representation
    • The assessment fails to include important aspects of the construct of interest
    • The assessment is "too small"
  • With clear construct definition, all of these are technical, not value, issues

10
Two key challenges
  • Construct-irrelevant variance
    • Sensitivity to instruction
  • Construct under-representation
    • Extensiveness of assessment

11
Sensitivity to instruction
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item highly sensitive to instruction]
12
Sensitivity to instruction (2)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item moderately sensitive to instruction]
13
Sensitivity to instruction (3)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item relatively insensitive to instruction]
14
Sensitivity to instruction (4)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item completely insensitive to instruction]
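Read each chart as a pair of attainment distributions for cohorts one year of instruction apart; sensitivity shows up as the size of the shift between the two curves. A minimal Python sketch of that reading, with illustrative effect sizes that are assumptions rather than values from the talk:

    import numpy as np

    rng = np.random.default_rng(0)
    # Mean shift (in SD units) produced by one year of instruction,
    # under four illustrative levels of item sensitivity (assumed values).
    shifts = {"highly sensitive": 1.5, "moderately sensitive": 0.7,
              "relatively insensitive": 0.2, "completely insensitive": 0.0}

    n = 2000
    for label, d in shifts.items():
        before = rng.normal(0.0, 1.0, n)  # cohort before the year
        after = rng.normal(d, 1.0, n)     # comparable cohort a year later
        # Chance that a random post-instruction student outscores a
        # random pre-instruction student: 0.5 means no detectable effect.
        p = (after[:, None] > before[None, :]).mean()
        print(f"{label:24s} gain {after.mean() - before.mean():+5.2f} SD, "
              f"P(after > before) = {p:.2f}")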
15
Consequences (1)
[Chart: no transcript]
16
Consequences (2)
[Chart: no transcript]
17
Consequences (3)
[Chart: no transcript]
18
Insensitivity to instruction
  • Primarily attributable to the fact that learning
    is slower than assumed
  • Exacerbated by the normal mechanisms of test
    development
  • Leads to erroneous attributions about the effects
    of schooling

19
A sensitivity to instruction index
  Test                          | Sensitivity index
  ------------------------------|------------------
  IQ-type test (insensitive)    |   0
  NAEP                          |   6
  TIMSS                         |   8
  ETS STEP tests (1957)         |   8
  ITBS                          |  10
  Completely sensitive test     | 100
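The talk does not give the formula behind this index. Purely as an illustration, one could read "sensitivity to instruction" as the percentage of score variance explained by amount of instruction; the sketch below (function name and definition are hypothetical) computes a 0-100 index on that assumption:

    import numpy as np

    def sensitivity_index(scores, years_of_instruction):
        # Hypothetical 0-100 index: squared correlation between score and
        # years of instruction, times 100. An illustrative assumption,
        # not the definition used in the presentation.
        r = np.corrcoef(scores, years_of_instruction)[0, 1]
        return 100.0 * r ** 2

    # Simulated example: scores that depend only weakly on schooling
    # yield a low index, as with the IQ-type tests in the table.
    rng = np.random.default_rng(1)
    years = rng.integers(6, 12, size=500)
    scores = 0.3 * years + rng.normal(0.0, 1.0, size=500)
    print(round(sensitivity_index(scores, years), 1))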

20
Extensiveness of assessment
  • Using teacher assessment in certification is attractive:
    • Increases reliability (increased test time; see the sketch after this list)
    • Increases validity (addresses aspects of construct under-representation)
  • But problematic:
    • Lack of trust (fox guarding the hen house)
    • Problems of biased inferences (construct-irrelevant variance)
    • Can introduce new kinds of construct under-representation
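A minimal sketch of the reliability point, using the standard Spearman-Brown prophecy formula, which predicts the reliability of an assessment when the amount of evidence is scaled by a factor k; the numbers are illustrative:

    def spearman_brown(reliability, k):
        # Predicted reliability when the amount of evidence (test length)
        # is scaled by a factor k. Standard psychometric formula.
        return k * reliability / (1 + (k - 1) * reliability)

    # Example: an exam with reliability 0.80; doubling the evidence base
    # with teacher assessment (k = 2) predicts reliability of about 0.89.
    print(round(spearman_brown(0.80, 2), 3))  # 0.889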

21
The challenge
  • To design an assessment system that is:
    • Distributed, so that evidence collection is not undertaken entirely at the end
    • Synoptic, so that learning has to accumulate

22
A possible model
  • All students are assessed at test time
  • Different students in the same class are assigned different tasks
  • The performance of the class defines an "envelope" of scores, e.g.:
    • Advanced: 5 students
    • Proficient: 8 students
    • Basic: 10 students
    • Below basic: 2 students
  • The teacher allocates levels to individual students on the basis of whole-year performance (a sketch of one possible mechanism follows)
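The talk does not specify a mechanism for the final step. A minimal sketch of one plausible reading, with hypothetical names throughout: the tested class yields a count of students per level (the envelope), and the teacher's whole-year rank ordering fills those counts from the top down:

    def allocate_levels(envelope, ranked_students):
        # envelope: (level, count) pairs, best level first, derived from
        # the class's performance on the sampled tasks.
        # ranked_students: names ordered by the teacher's judgment of
        # whole-year performance, best first.
        assert sum(count for _, count in envelope) == len(ranked_students)
        allocation, i = {}, 0
        for level, count in envelope:
            for student in ranked_students[i:i + count]:
                allocation[student] = level
            i += count
        return allocation

    # The envelope from the slide: 5 + 8 + 10 + 2 = 25 students.
    envelope = [("Advanced", 5), ("Proficient", 8),
                ("Basic", 10), ("Below basic", 2)]
    students = [f"student_{n:02d}" for n in range(1, 26)]  # teacher-ranked
    levels = allocate_levels(envelope, students)
    print(levels["student_01"], "/", levels["student_25"])  # Advanced / Below basic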

23
Benefits and problems
  • Benefits:
    • The only way to "teach to the test" is to improve everyone's performance on everything (which is what we want!)
    • Validity and reliability are enhanced
  • Problems:
    • Students' scores are not inspectable
    • Assumes student motivation

24
The effects of context
  • Beliefs about what constitutes learning
  • Beliefs in the reliability and validity of the
    results of various tools
  • A preference for, and trust in, numerical data, with a bias towards a single number
  • Trust in the judgments and integrity of the
    teaching profession
  • Belief in the value of competition between
    students
  • Belief in the value of competition between
    schools
  • Belief that test results measure school
    effectiveness
  • Fear of national economic decline and education's role in this
  • Belief that the key to schools' effectiveness is strong top-down management

25
Conclusion
  • There is no perfect assessment system anywhere. Each nation's assessment system is exquisitely tuned to local constraints and affordances.
  • Assessment practices have impacts on teaching and
    learning which may be strongly amplified or
    attenuated by the national context.
  • The overall impact of particular assessment
    practices and initiatives is determined at least
    as much by culture and politics as it is by
    educational evidence and values.

26
Conclusion (2)
  • It is probably idle to draw up maps for the ideal assessment policy for a country, even though the principles and the evidence to support such an ideal might be clearly agreed within the expert community.
  • Instead, focus on those arguments and initiatives
    which are least offensive to existing assumptions
    and beliefs, and which will nevertheless serve to
    catalyze a shift in them while at the same time
    improving some aspects of present practice.

27
Questions? Comments?
Institute of Education, University of London
20 Bedford Way, London WC1H 0AL
Tel: +44 (0)20 7612 6000  Fax: +44 (0)20 7612 6126
Email: info@ioe.ac.uk