1
How to Make a Test & Judge its Quality
2
Aim of the Talk
  • Acquaint teachers with the characteristics of a
    good and objective test
  • See Item Analysis techniques that help to
    - improve the quality of a test by identifying
      items that are candidates for retention,
      revision or removal
    - clarify what concepts the examinees have and
      have not mastered

3
Types of Tests
  • Criterion-referenced tests
  • Norm-referenced tests
  • Standardized tests
  • Ipsative Tests

4
Focus of this Talk
  • The following guidelines apply most appropriately
    to tests that are designed to identify
    differences in achievement levels between
    students (Norm-referenced tests)
  • Some of the criteria outlined either do not apply
    or apply in somewhat different ways to tests
    designed to measure mastery of content
    (Criterion-referenced tests)

5
Important factors in judging a test's quality
  1. Course Objectives
  2. Fairness to Students
  3. Conditions of Administration
  4. Measure of Achievements
  5. Time Limits
  6. Difficulty Index
  7. Discrimination Index
  8. Levels of Ability
  9. Test Reliability
  10. Accuracy of Scores

Factors 1-5 depend on the knowledge and judgment
of the teacher
Factors 6-10 can be aided by various statistical
analysis techniques
6
1. Course Objectives
  • Does the test reflect course objectives?
  • Good practices
    - Make a test plan: the content to be covered
      and the relative emphasis to be given to
      included topics
    - Teachers should exchange examinations for
      review and constructive criticism
    - Teachers should not feel obligated to accept
      and apply all the suggestions made by their
      colleagues, as good teachers usually have
      their own unique style and special abilities

7
2. Fairness to Students
  • A test is fair if it emphasizes the knowledge,
    understanding and abilities that were emphasized
    in the actual teaching of the course
  • There is no such thing as "out-of-course" if the
    relevant concepts were covered in class
  • Probably no test has ever been regarded as
    perfectly fair by every person taking it
  • Nevertheless, student feedback after the test is
    very important, e.g., on ambiguity or confusion
    in questions, figures, tables, etc.

8
3. Conditions of Test Administration
  • No confusion or disturbance during the test
  • Prevent cheating and the use of unfair means
  • Satisfactory conditions of light, heat and
    comfort etc.
  • Again, student feedback can be helpful here

9
4. Measure of Achievements
  • Students should be judged on their knowledge,
    understanding, abilities and interests instead of
    on the basis of what they remember or what they
    read in preparation for the test
  • Knowledge of terms and isolated facts/trivia is a
    low measure of achievement
  • For example, questions like "Explain the
    Ethernet frame format" or "Define and explain
    the Two-Army Problem" do not measure important
    achievements
  • The majority of the questions should deal with
    applications, understanding and generalizations
    of the learned concepts

10
5. Time Limits
  • Tests should be work-limit tests rather than
    time-limit tests
  • Students' scores should depend on how much they
    can do and not on how fast they can do it
  • Speed may be important in repetitive,
    clerical-type operations, but it is not important
    in critical or creative thinking or decision
    making
  • Test time limits should be generous enough for
    at least 90% of the students to attempt and
    complete all questions in the test

11
6. Item Difficulty Index (p)
  • It is the proportion of students who answered
    the item correctly (a minimal computation is
    sketched below)
  • If almost all students get an item correct (or
    incorrect), the item is not very efficient
  • For ideal MCQs, difficulty indices are about .50
    to .70
  • For the test as a whole, the difficulty index
    should be about midway between the expected
    chance score and the maximum possible score
  • The p value varies with each class group that
    takes the test
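
As an illustration (not part of the original slides), here is a
minimal Python sketch of the computation, assuming a 0/1 score
matrix with one row per student and one column per item; all names
and data are illustrative:

```python
# Minimal sketch: item difficulty index p from a 0/1 score matrix.
# Assumed layout: scores[s][i] == 1 if student s answered item i correctly.

def difficulty_indices(scores):
    """Return p (the proportion of students answering correctly) per item."""
    n_students = len(scores)
    n_items = len(scores[0])
    return [sum(row[i] for row in scores) / n_students
            for i in range(n_items)]

# Example: 4 students, 3 items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(difficulty_indices(scores))  # [0.75, 0.75, 0.25]
```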

12
7. Item Discrimination Index (D)
  • It is a measure of an item's ability to
    discriminate between good and poor students
  • Students in the top 27% by total test score are
    taken to be good students; those in the bottom
    27% are taken to be poor students
  • The discrimination index is a basic measure of
    the validity of an item
  • Validity: whether a student got an item correct
    is due to their level of knowledge or ability,
    not to something else such as chance or test
    bias

13
7. Item Discrimination Index (D)
  • How to interpret D
    - D ranges from -1.00 to 1.00, so it can take
      negative values
    - D = 1.00 indicates a perfect positive
      discriminator
    - Most psychometricians say that items yielding
      D values of 0.30 and above are good
      discriminators and worthy of retention for
      future exams
  • D value is unique to a group of examinees
  • An item with satisfactory discrimination for one
    group may be unsatisfactory for another
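
A hedged sketch of the computation described above, reusing the same
assumed 0/1 score matrix; the 27% grouping follows the slide, while
the names and the rounding of the group size are illustrative choices:

```python
# Minimal sketch: discrimination index D via upper/lower 27% groups.
# scores[s][i] == 1 if student s answered item i correctly (assumed layout).

def discrimination_indices(scores):
    """Return D = p_upper - p_lower per item, using 27% tail groups."""
    totals = [sum(row) for row in scores]
    order = sorted(range(len(scores)), key=lambda s: totals[s])
    k = max(1, round(0.27 * len(scores)))  # size of each tail group
    lower, upper = order[:k], order[-k:]

    def prop_correct(group, i):
        return sum(scores[s][i] for s in group) / len(group)

    return [prop_correct(upper, i) - prop_correct(lower, i)
            for i in range(len(scores[0]))]

# Items with D >= 0.30 would be kept; a negative D flags an item that
# the weakest students answer correctly more often than the strongest,
# a candidate for revision or removal.
```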

14
8. Levels of Ability
  • For a test to distinguish clearly between
    students at different levels of ability, it
    must yield scores of wide variability
  • The larger the standard deviation (s), the
    better the test
  • An s value equal to one-sixth of the range
    between the highest possible score and the
    expected chance score is generally considered
    an acceptable standard (see the worked example
    below)
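
For instance (a worked example implied by the rule, not stated on
the slide): on a 50-item test of four-option MCQs, guessing yields
an expected chance score of 12.5, so the benchmark s works out as
follows:

```python
# Benchmark standard deviation from the slide's rule of thumb:
#   s_target = (highest possible score - expected chance score) / 6
max_score = 50            # illustrative: 50 one-point MCQ items
chance_score = 50 * 0.25  # expected score from guessing on 4-option items
s_target = (max_score - chance_score) / 6
print(round(s_target, 2))  # 6.25
```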

15
9. Test Reliability
  • The reliability coefficient represents the
    estimated correlation between the scores on the
    test and scores on another equivalent test,
    composed of different items, but designed to
    measure the same kind of achievement
  • The highest possible value is 1.00
  • This level is difficult to achieve consistently
    with homogeneous class groups and with items that
    previously have not been administered, analyzed,
    and revised
  • A reasonable goal for teachers to set is a
    reliability estimate of .80
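
The coefficient described here is a correlation with an equivalent
parallel test; in practice, reliability is often approximated from a
single administration with an internal-consistency estimate such as
KR-20 for 0/1-scored items. That substitution is an assumption, not
something the slides specify; a minimal sketch under it:

```python
# Minimal sketch of KR-20, a common reliability estimate for tests of
# dichotomously (0/1) scored items -- an illustrative choice, since the
# slides do not name an estimator.

def kr20(scores):
    """scores[s][i] == 1 if student s answered item i correctly."""
    n = len(scores)                   # students
    k = len(scores[0])                # items
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n  # assumed nonzero
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # difficulty index of item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)
```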

16
10. Accuracy of Scores
  • The accuracy of the scores is reflected by the
    standard error of measurement (SEM), a statistic
    computed using the standard deviation and the
    reliability coefficient
  • If the SEM is 2 score points, for example, one
    can say that about two-thirds of the scores
    reported were within 2 points of each student's
    true score. About one-sixth of the students
    received scores more than 2 points higher than
    they should have, and the remaining one-sixth
    received scores more than 2 points too low
  • The SEM simply serves as an indication of how
    much chance error remains in the scores from even
    a good test
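
The 2-point example follows from the standard formula
SEM = s * sqrt(1 - r); a minimal sketch with illustrative values:

```python
import math

# Standard error of measurement: SEM = s * sqrt(1 - r), where s is the
# score standard deviation and r the reliability coefficient.
s = 5.0    # illustrative standard deviation
r = 0.84   # illustrative reliability estimate
sem = s * math.sqrt(1 - r)
print(round(sem, 2))  # 2.0 -> about two-thirds of scores lie within
                      #        +/- 2 points of the true score
```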

17
Conclusions
  • Item Analysis does not itself improve a test
  • Its main purpose is to serve as a guide to the
    teacher
  • Teachers can conduct the analysis themselves,
    but usually the last five factors are (and
    should be) implemented by an Evaluation and
    Examination Department
  • The analysis techniques work reliably on classes
    of 30 or more students

18
References
  • How to Judge the Quality of an Objective
    Classroom Test. Evaluation and Examination
    Service, The University of Iowa
  • Haladyna, T.M., Downing, S.M., & Rodriguez, M.C.
    (2002). A review of multiple-choice item-writing
    guidelines for classroom assessment. Applied
    Measurement in Education, 15(3), 309-334
  • Zurawski, R. (1998). Making the Most of Exams:
    Procedures for Item Analysis. National Teaching
    and Learning Forum, Vol. 7
  • Item Analysis Guidelines. Scoring Office,
    Michigan State University (http://scoring.msu.edu)
  • Wikipedia, the free encyclopedia