1
THE GENERATION OF AUTOMATED STUDENT FEEDBACK FOR
A COMPUTER-ADAPTIVE TEST
  • University of Hertfordshire
  • School of Computer Science
  • Mariana Lilley
  • Dr. Trevor Barker
  • Dr. Carol Britton

2
Objectives
  • Overview of ongoing research at the University of
    Hertfordshire on the use of computer-adaptive
    tests (CATs)
  • Our approach to the generation of automated
    feedback
  • Student attitude
  • Future work

3
Research overview
  • Research started in 2001.
  • Five empirical studies, involving over 350
    participants.
  • Findings suggest that the computer-adaptive test
    (CAT) approach has the potential to offer a more
    consistent and accurate measurement of student
    proficiency levels than non-adaptive
    computer-based tests (CBTs).
  • Statistical analysis of the data gathered to date
    suggests that the CAT approach is a fair measure
    of proficiency levels, producing higher
    test-retest correlations than either CBT or
    off-computer assessments.
  • More importantly, these results were observed in
    three different subject domains, namely English
    as a second language, Visual Basic programming
    and Human-Computer Interaction. This was taken
    to indicate that the approach can be transferred
    and generalised to different subject domains.

4
Traditional and adaptive approaches to testing
  • Computer-Based Tests (CBTs) mimic aspects of a
    paper-and-pencil test
  • Accuracy and speed of marking
  • Predefined set of questions presented to all
    participants and thus questions are not tailored
    for each individual student
  • Computer-Adaptive Tests (CATs) mimic aspects of
    an oral interview
  • Accuracy and speed of marking
  • Questions are dynamically selected and thus
    tailored according to student performance

5
Main benefits of the adaptive approach
  • Questions that are too easy or too difficult are
    likely to
  • Be demotivating
  • Provide little or no valuable information about
    student knowledge
  • Questions at the boundary of student knowledge
    are likely to
  • Be challenging
  • Be motivating
  • Provide lecturers with valuable information with
    regard to student ability
  • Beginning in the days when education was for the
    privileged few, the wise tutor would modify the
    oral examination of a student by judiciously
    choosing questions appropriate to the student's
    knowledge and ability (Wainer, 1990).

6
Computer-Adaptive Test
  • Based on Item Response Theory (IRT)
  • If a student answers a question correctly, the
    estimate of his/her ability is raised and a more
    difficult question is presented
  • If a student answers a question incorrectly, the
    estimate of his/her ability is lowered and an
    easier question follows
  • Can be of fixed or variable length
  • Score given by the final ability estimate (the
    adaptive loop is sketched below)
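
Below is a minimal sketch of such a fixed-length adaptive loop, assuming a simple fixed-step update of the ability estimate and difficulty-based item selection; all names (Item, select_item, run_cat) and the step size are illustrative, not taken from the actual application.

    from dataclasses import dataclass

    @dataclass
    class Item:
        text: str
        b: float      # difficulty parameter
        answer: str

    def select_item(pool, theta):
        # Choose the unused question whose difficulty is closest to
        # the current ability estimate, i.e. at the boundary of the
        # student's knowledge.
        return min(pool, key=lambda item: abs(item.b - theta))

    def run_cat(pool, ask, length=20, theta=0.0, step=0.5):
        for _ in range(length):
            item = select_item(pool, theta)
            pool.remove(item)
            if ask(item):       # True if answered correctly
                theta += step   # raise the estimate; a harder item follows
            else:
                theta -= step   # lower the estimate; an easier item follows
        return theta            # the final estimate doubles as the score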

7
Item Response Theory
  • Family of mathematical functions
  • The most well-known models for dichotomously
    scored questions are:
  • One-Parameter Logistic Model (1-PL)
  • Two-Parameter Logistic Model (2-PL)
  • Three-Parameter Logistic Model (3-PL)
  • The CAT application introduced here uses:
  • The 3-PL Model
  • Fixed length

8
The 3-PL model from IRT
  • θ represents the student's ability
  • b represents the question's difficulty
  • a represents the question's discrimination
  • c represents pseudo-chance
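
For reference, the standard 3-PL item characteristic function built from these four parameters gives the probability that a student of ability θ answers a given question correctly:

    P(\theta) = c + \frac{1 - c}{1 + e^{-a(\theta - b)}}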

9
Level of difficulty
  • One of the underlying ideas within Bloom's
    taxonomy of cognitive skills (Anderson &
    Krathwohl, 2001) is that tasks can be arranged in
    a hierarchy from less to more complex.
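
As a toy illustration only, such a hierarchy could be used to seed the IRT difficulty parameter; the Bloom levels below follow Anderson & Krathwohl (2001), but the b values are invented placeholders, not taken from the presentation.

    # Toy mapping from Bloom's taxonomy levels to illustrative IRT
    # difficulty (b) values; the numbers are invented placeholders.
    BLOOM_TO_DIFFICULTY = {
        "remember": -2.0,
        "understand": -1.0,
        "apply": 0.0,
        "analyse": 1.0,
        "evaluate": 1.5,
        "create": 2.0,
    }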

10
Feedback provided for the first and second
assessment sessions
  • Scores sent via email
  • Students seemed pleased to receive their scores
    via email
  • Some students reported that the score on its own
    provided very little, if any, help in determining
    which part of the subject domain they should
    revise next or which topic they should prioritise
  • Student views were in line with the opinion of
    the experts who participated in the pedagogical
    evaluation of the CAT prototype (Lilley & Barker,
    2002)

11
Feedback provided for the first and second
assessment sessions
  • To <<Student_Name>>
  • Your score for the Visual Basic Test 1 was
    <<Student_Score>>.
  • This is an automated message from
  • The Programming_Module team
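
A sketch of how such a mail-merge message might be filled is shown below; the field names are illustrative stand-ins for the <<...>> placeholders.

    from string import Template

    # Template mirroring the score message above.
    SCORE_MESSAGE = Template(
        "To $student_name\n"
        "Your score for the Visual Basic Test 1 was $student_score.\n"
        "This is an automated message from\n"
        "The Programming_Module team"
    )

    body = SCORE_MESSAGE.substitute(student_name="A. Student",
                                    student_score=78)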

12
Assessment
  • Bachelor of Science (BSc) in Computer Science
  • 123 participants
  • The participants took the test in week 30 as part
    of their real assessment for the module
  • 6 non-adaptive questions followed by 14 adaptive
    ones
  • Human-Computer Interaction
  • Issues related to the use of sound at interfaces
  • Graphical representation at interfaces, focusing
    on the use of colour and images
  • User-centred approaches to requirements gathering
  • Design, prototyping and construction
  • Usability goals and User experience goals
  • Evaluation paradigms and techniques

13
Providing students with a copy of the test
  • A simple potential solution was to provide
    students with a copy of all questions they got
    wrong.
  • A major limitation of this approach was the lack
    of any explanation of, or comment on, student
    performance.
  • It seemed unlikely that providing students with
    the answers to the questions they did not get
    right would foster research and/or reflection
    skills.
  • A further practical limitation of the approach
    was increased exposure of the objective questions
    stored in the database.
  • Re-use of questions is one of the perceived
    benefits of computer-assisted assessments
    (Freeman & Lewis, 1998; Harvey & Mogey, 1999).

14
Automated feedback using Item Response Theory
(IRT)
  • Overall proficiency level calculated as in
    previous assessments using the CAT application
    (i.e. using the 3-PL Model).
  • A proficiency level was calculated for each set
    of student responses for a given topic.
  • Questions answered incorrectly by each individual
    student identified.
  • Design and implementation of a feedback database
  • Feedback according to topic (a sketch of this
    step follows below)
  • Feedback according to question
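
A minimal sketch of the topic step, assuming responses are grouped by topic and each (topic, level) pair maps to a stored feedback sentence. The presentation estimates per-topic proficiency with the 3-PL model; the simple estimator below is only a placeholder for that, and the table contents are invented.

    from collections import defaultdict

    # Illustrative feedback table: (topic, level) -> feedback sentence.
    TOPIC_FEEDBACK = {
        ("Use of colour", "low"):  "Revise the basics of colour at interfaces.",
        ("Use of colour", "high"): "Extend your reading on advanced colour use.",
    }

    def topic_level(responses, threshold=0.0):
        # responses: (difficulty, answered_correctly) pairs for one topic.
        # Placeholder estimator: mean difficulty of correctly answered
        # items stands in for a proper 3-PL proficiency estimate.
        correct = [b for b, ok in responses if ok]
        if not correct:
            return "low"
        return "high" if sum(correct) / len(correct) > threshold else "low"

    def feedback_by_topic(all_responses):
        # all_responses: (topic, difficulty, answered_correctly) triples.
        by_topic = defaultdict(list)
        for topic, b, ok in all_responses:
            by_topic[topic].append((b, ok))
        return {topic: TOPIC_FEEDBACK.get((topic, topic_level(r)), "")
                for topic, r in by_topic.items()}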

15
Summary of overall performance
16
Summary of performance per topic
17
Feedback according to topic
18
Feedback according to topic
19
Feedback according to question
  • Section named "Based on your test performance, we
    suggest the following areas for revision".
  • This section of the feedback document comprised a
    list of points for revision, based on the
    questions answered incorrectly by each individual
    student.
  • This feedback sentence did not reproduce the
    question itself.
  • The feedback sentence listed specific sections
    within the recommended reading and/or additional
    learning materials.
  • The same feedback sentence could be used for more
    than one question in the database (see the
    selection sketch below).
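
The selection sketch below illustrates this many-to-one mapping: each incorrectly answered question looks up its feedback sentence, and a sentence shared by several questions appears only once. The question IDs and sentences are invented.

    # Illustrative mapping: question ID -> feedback sentence. Two
    # questions deliberately share the same sentence.
    QUESTION_FEEDBACK = {
        "q17": "Do some independent research on bit depth ...",
        "q22": "Do some independent research on bit depth ...",
        "q31": "Revisit usability goals versus user experience goals.",
    }

    def revision_points(wrong_question_ids):
        # Collect each distinct feedback sentence at most once,
        # preserving the order of the questions.
        seen, points = set(), []
        for qid in wrong_question_ids:
            sentence = QUESTION_FEEDBACK.get(qid)
            if sentence and sentence not in seen:
                seen.add(sentence)
                points.append(sentence)
        return points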

20
Example of question
21
Example of question
22
Example of feedback sentence related to questions
regarding bit-depth
  • Do some independent research on bit depth (the
    number of bits per pixel allocated for storing
    indexed colour information in a graphics file).
    As a starting point, see
    http://www.microsoft.com/windowsxp/experiences/glossary_a-g.asp#24-bitcolor.
    See also Chapter 5 of Principles of Interactive
    Multimedia, as section 5.6.4 introduces important
    aspects related to the use of colour at
    interfaces.

23
Student attitude towards the feedback format
adopted
  • All students who participated in Assessment 3
    were invited to express their views on the
    feedback format used (participation was optional).
  • 58 students (47%) replied to our email.
  • Students asked to classify the feedback received
    as "very useful", "useful" or "not useful".
  • Students were also asked to present one positive
    and one negative aspect of the feedback provided.

24
Summary of positive aspects
25
Summary of negative aspects
26
Summary of problems with document layout and/or
type
27
Example of automated feedback generated
28
Feedback according to topic
29
Recommended action(s)
30
Discussion
  • Like Denton (2003), we believe that the potential
    benefits of automated feedback have not yet been
    fully explored by academic staff, even by those
    who are already making use of computer-assisted
    assessment tools.
  • Our initial ideas on how CATs/IRT can be used to
    provide students with personalised, meaningful
    feedback include:
  • An ability estimation algorithm based on the
    Three-Parameter Logistic Model from IRT
  • A feedback database
  • Feedback sentences selected from the feedback
    database based on the ability level estimated and
    questions answered incorrectly
  • For each individual student only those sentences
    that applied to his or her test performance are
    selected
  • Selected feedback sentences added to a new Word
    document and sent to each individual student's
    email account (this assembly step is sketched
    below).
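
A sketch of this assembly step is shown below, using the python-docx package as an assumed implementation detail; the presentation only states that the selected sentences are added to a new Word document and emailed.

    from docx import Document

    def build_feedback_document(student_name, score, sentences, path):
        # Write one personalised feedback document per student.
        doc = Document()
        doc.add_heading("Automated feedback", level=1)
        doc.add_paragraph(f"Student: {student_name}")
        doc.add_paragraph(f"Overall score: {score}")
        doc.add_heading("Based on your test performance, we suggest "
                        "the following areas for revision", level=2)
        for sentence in sentences:
            doc.add_paragraph(sentence, style="List Bullet")
        doc.save(path)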

31
Discussion
  • Learners like to be assessed and value comments
    on their performance.
  • The investment of effort by learners necessitates
    comment from tutors.
  • As class sizes increase and more use is made of
    online formative and summative assessment
    methods, it becomes increasingly difficult to
    provide individual feedback in HE.
  • Students still value a human contribution to
    feedback, but they also realise that this is
    becoming rarer in their academic lives.
  • Student attitude to this approach was positive in
    general.
  • At the very least we have shown that our
    automated feedback method identifies areas of
    weakness and strength and provides useful advice
    for individual development.

32
Future work
  • Creation of one distinct feedback sentence per
    question.
  • It is anticipated that these sentences will
    resemble the actual question more closely than
    the current comments do.
  • "Would it be possible to attach the question and
    the correct answers from the test?"
  • Overall layout of the document will be reviewed
  • To facilitate the location of information on the
    feedback sheet (some learners reported that they
    did not intuitively locate their overall score in
    the feedback document)
  • The distribution of the feedback document as a
    PDF rather than Word (DOC) file is also being
    considered

33
Future work
  • Review our assumption that performance in one
    topic area within a subject domain is the best
    indicator of performance in a related topic area
    in the same domain.
  • It is possible that students might have differing
    abilities in quite similar topic areas.
  • Impact on test length and/or proficiency level
    estimation.
  • To increase the personalisation of the feedback,
    we intend to compare each learner's performance
    in previous assessments with his or her
    performance in the most recent one.