2
Designing an assessment system
  • Presentation to the Scottish Qualifications
    Authority, August 2007
  • Dylan Wiliam
  • Institute of Education, University of London
  • www.dylanwiliam.net

3
Overview
  • The purposes of assessment
  • The structure of the assessment system
  • The locus of assessment
  • The extensiveness of the assessment
  • Assessment format
  • Scoring models
  • Quality issues
  • The role of teachers
  • Contextual issues

4
Functions of assessment
  • Three functions of assessment:
    • For evaluating institutions (evaluative)
    • For describing individuals (summative)
    • For supporting learning:
      • Monitoring learning: whether learning is taking place
      • Diagnosing (informing) learning: what is not being learnt
      • Forming learning: what to do about it
  • No system can easily support all three functions
  • Traditionally, we have grouped the first two and ignored the third
  • Learning is sidelined; the summative and evaluative functions are weakened
  • Instead, we need to separate the first (evaluative) from the other two

5
The Lake Wobegon effect
"All the women are strong, all the men are good-looking, and all the children are above average." (Garrison Keillor)

[Chart: test scores rising over time]

6
Goodhart's law
  • All performance indicators lose their usefulness when used as objects of policy:
    • Privatization of British Rail
    • Targets in the Health Service
    • "Bubble" students in high-stakes settings

7
Reconciling different pressures
  • The high-stakes genie is out of the bottle, and
    we cannot put it back
  • The clearer you are about what you want, the more
    likely you are to get it, but the less likely it
    is to mean anything
  • The only thing left to us is to try to develop
    tests worth teaching to
  • This is fundamentally an issue of validity.

8
Validity
  • Validity is a property of inferences, not of
    assessments
  • "One validates, not a test, but an interpretation of data arising from a specified procedure" (Cronbach, 1971; emphasis in original)
  • No such thing as a valid (or indeed invalid)
    assessment
  • No such thing as a biased assessment
  • A pons asinorum for thinking about assessment

9
Threats to validity
  • Inadequate reliability
  • Construct-irrelevant variance
    • The assessment includes aspects that are irrelevant to the construct of interest
    • The assessment is "too big"
  • Construct under-representation
    • The assessment fails to include important aspects of the construct of interest
    • The assessment is "too small"
  • With clear construct definition, all of these are technical, not value, issues

10
Two key challenges
  • Construct-irrelevant variance
    • Sensitivity to instruction
  • Construct under-representation
    • Extensiveness of assessment

11
Sensitivity to instruction
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item highly sensitive to instruction]
12
Sensitivity to instruction (2)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item moderately sensitive to instruction]
13
Sensitivity to instruction (3)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item relatively insensitive to instruction]
14
Sensitivity to instruction (4)
[Chart: distributions of attainment for cohorts one year of instruction apart, on an item completely insensitive to instruction]
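Read each chart as a pair of attainment distributions for cohorts one year of instruction apart; sensitivity shows up as the size of the shift between the two curves. A minimal Python sketch of that reading, with illustrative effect sizes that are assumptions rather than values from the talk:

    import numpy as np

    rng = np.random.default_rng(0)
    # Mean shift (in SD units) produced by one year of instruction,
    # under four illustrative levels of item sensitivity (assumed values).
    shifts = {"highly sensitive": 1.5, "moderately sensitive": 0.7,
              "relatively insensitive": 0.2, "completely insensitive": 0.0}

    n = 2000
    for label, d in shifts.items():
        before = rng.normal(0.0, 1.0, n)  # cohort before the year
        after = rng.normal(d, 1.0, n)     # comparable cohort a year later
        # Chance that a random post-instruction student outscores a
        # random pre-instruction student: 0.5 means no detectable effect.
        p = (after[:, None] > before[None, :]).mean()
        print(f"{label:24s} gain {after.mean() - before.mean():+5.2f} SD, "
              f"P(after > before) = {p:.2f}")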
15
Consequences (1)
[Chart: no transcript]
16
Consequences (2)
[Chart: no transcript]
17
Consequences (3)
[Chart: no transcript]
18
Insensitivity to instruction
  • Primarily attributable to the fact that learning
    is slower than assumed
  • Exacerbated by the normal mechanisms of test
    development
  • Leads to erroneous attributions about the effects
    of schooling

19
A sensitivity to instruction index
  Test                          | Sensitivity index
  ------------------------------|------------------
  IQ-type test (insensitive)    |   0
  NAEP                          |   6
  TIMSS                         |   8
  ETS STEP tests (1957)         |   8
  ITBS                          |  10
  Completely sensitive test     | 100
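The talk does not give the formula behind this index. Purely as an illustration, one could read "sensitivity to instruction" as the percentage of score variance explained by amount of instruction; the sketch below (function name and definition are hypothetical) computes a 0-100 index on that assumption:

    import numpy as np

    def sensitivity_index(scores, years_of_instruction):
        # Hypothetical 0-100 index: squared correlation between score and
        # years of instruction, times 100. An illustrative assumption,
        # not the definition used in the presentation.
        r = np.corrcoef(scores, years_of_instruction)[0, 1]
        return 100.0 * r ** 2

    # Simulated example: scores that depend only weakly on schooling
    # yield a low index, as with the IQ-type tests in the table.
    rng = np.random.default_rng(1)
    years = rng.integers(6, 12, size=500)
    scores = 0.3 * years + rng.normal(0.0, 1.0, size=500)
    print(round(sensitivity_index(scores, years), 1))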

20
Extensiveness of assessment
  • Using teacher assessment in certification is attractive:
    • Increases reliability (increased test time; see the sketch after this list)
    • Increases validity (addresses aspects of construct under-representation)
  • But problematic:
    • Lack of trust (fox guarding the hen house)
    • Problems of biased inferences (construct-irrelevant variance)
    • Can introduce new kinds of construct under-representation
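A minimal sketch of the reliability point, using the standard Spearman-Brown prophecy formula, which predicts the reliability of an assessment when the amount of evidence is scaled by a factor k; the numbers are illustrative:

    def spearman_brown(reliability, k):
        # Predicted reliability when the amount of evidence (test length)
        # is scaled by a factor k. Standard psychometric formula.
        return k * reliability / (1 + (k - 1) * reliability)

    # Example: an exam with reliability 0.80; doubling the evidence base
    # with teacher assessment (k = 2) predicts reliability of about 0.89.
    print(round(spearman_brown(0.80, 2), 3))  # 0.889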

21
The challenge
  • To design an assessment system that is:
    • Distributed, so that evidence collection is not undertaken entirely at the end
    • Synoptic, so that learning has to accumulate

22
A possible model
  • All students are assessed at test time
  • Different students in the same class are assigned different tasks
  • The performance of the class defines an "envelope" of scores, e.g.:
    • Advanced: 5 students
    • Proficient: 8 students
    • Basic: 10 students
    • Below basic: 2 students
  • The teacher allocates levels to individual students on the basis of whole-year performance (a sketch of one possible mechanism follows)
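The talk does not specify a mechanism for the final step. A minimal sketch of one plausible reading, with hypothetical names throughout: the tested class yields a count of students per level (the envelope), and the teacher's whole-year rank ordering fills those counts from the top down:

    def allocate_levels(envelope, ranked_students):
        # envelope: (level, count) pairs, best level first, derived from
        # the class's performance on the sampled tasks.
        # ranked_students: names ordered by the teacher's judgment of
        # whole-year performance, best first.
        assert sum(count for _, count in envelope) == len(ranked_students)
        allocation, i = {}, 0
        for level, count in envelope:
            for student in ranked_students[i:i + count]:
                allocation[student] = level
            i += count
        return allocation

    # The envelope from the slide: 5 + 8 + 10 + 2 = 25 students.
    envelope = [("Advanced", 5), ("Proficient", 8),
                ("Basic", 10), ("Below basic", 2)]
    students = [f"student_{n:02d}" for n in range(1, 26)]  # teacher-ranked
    levels = allocate_levels(envelope, students)
    print(levels["student_01"], "/", levels["student_25"])  # Advanced / Below basic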

23
Benefits and problems
  • Benefits:
    • The only way to "teach to the test" is to improve everyone's performance on everything (which is what we want!)
    • Validity and reliability are enhanced
  • Problems:
    • Students' scores are not inspectable
    • Assumes student motivation

24
The effects of context
  • Beliefs about what constitutes learning
  • Beliefs in the reliability and validity of the
    results of various tools
  • A preference for, and trust in, numerical data, with a bias towards a single number
  • Trust in the judgments and integrity of the
    teaching profession
  • Belief in the value of competition between
    students
  • Belief in the value of competition between
    schools
  • Belief that test results measure school
    effectiveness
  • Fear of national economic decline and education's role in this
  • Belief that the key to schools' effectiveness is strong top-down management

25
Conclusion
  • There is no perfect assessment system anywhere. Each nation's assessment system is exquisitely tuned to local constraints and affordances.
  • Assessment practices have impacts on teaching and
    learning which may be strongly amplified or
    attenuated by the national context.
  • The overall impact of particular assessment
    practices and initiatives is determined at least
    as much by culture and politics as it is by
    educational evidence and values.

26
Conclusion (2)
  • It is probably idle to draw up maps for the ideal assessment policy for a country, even though the principles and the evidence to support such an ideal might be clearly agreed within the expert community.
  • Instead, focus on those arguments and initiatives
    which are least offensive to existing assumptions
    and beliefs, and which will nevertheless serve to
    catalyze a shift in them while at the same time
    improving some aspects of present practice.

27
Questions? Comments?
Institute of Education, University of London
20 Bedford Way, London WC1H 0AL
Tel: +44 (0)20 7612 6000  Fax: +44 (0)20 7612 6126
Email: info@ioe.ac.uk