1
How to Make a Test & Judge its Quality
2
Aim of the Talk
  • Acquaint teachers with the characteristics of a
    good and objective test
  • See Item Analysis techniques that help to
    - improve the quality of a test by identifying
      items that are candidates for retention,
      revision or removal
    - clarify what concepts the examinees have and
      have not mastered

3
Types of Tests
  • Criterion-referenced tests
  • Norm-referenced tests
  • Standardized tests
  • Ipsative Tests

4
Focus of this Talk
  • The following guidelines apply most appropriately
    to tests that are designed to identify
    differences in achievement levels between
    students (Norm-referenced tests)
  • Some of the criteria outlined either do not apply
    or apply in somewhat different ways to tests
    designed to measure mastery of content
    (Criterion-referenced tests)

5
Important factors in judging a test's quality
  1. Course Objectives
  2. Fairness to Students
  3. Conditions of Administration
  4. Measure of Achievements
  5. Time Limits
  6. Difficulty Index
  7. Discrimination Index
  8. Levels of Ability
  9. Test Reliability
  10. Accuracy of Scores

Factors 1-5 depend on the knowledge and judgment
of the teacher
Factors 6-10 can be aided by various statistical
analysis techniques
6
1. Course Objectives
  • Does the test reflect course objectives?
  • Good practices
    - Make a test plan: the content to be covered
      and the relative emphasis to be given to
      included topics
    - Teachers should exchange examinations for
      review and constructive criticism
    - Teachers should not feel obligated to accept
      and apply all the suggestions made by their
      colleagues, as good teachers usually have
      their own unique style and special abilities

7
2. Fairness to Students
  • A test is fair if it emphasizes the knowledge,
    understanding and abilities that were emphasized
    in the actual teaching of the course
  • There is no such thing as "out-of-course" if the
    relevant concepts were covered in class
  • Probably no test has ever been regarded as
    perfectly fair by every person taking it
  • Nevertheless, student feedback after the test is
    very important, e.g., on ambiguity or confusion
    in questions, figures, tables, etc.

8
3. Conditions of Test Administration
  • No confusion or disturbance during the test
  • Prevent cheating and the use of unfair means
  • Satisfactory conditions of light, heat and
    comfort etc.
  • Again, student feedback can be helpful here

9
4. Measure of Achievements
  • Students should be judged on their knowledge,
    understanding, abilities and interests instead of
    on the basis of what they remember or what they
    read in preparation for the test
  • Knowledge of terms and isolated facts/trivia is a
    low measure of achievement
  • For example, questions like "Explain the
    Ethernet frame format" or "Define and explain
    the Two-Army Problem" do not measure important
    achievements
  • The majority of the questions should deal with
    applications, understanding and generalizations
    of the learned concepts

10
5. Time Limits
  • Tests should be work-limit tests rather than
    time-limit tests
  • Students' scores should depend on how much they
    can do and not on how fast they can do it
  • Speed may be important in repetitive,
    clerical-type operations, but it is not important
    in critical or creative thinking or decision
    making
  • Test time limits should be generous enough for
    at least 90% of the students to attempt and
    complete all questions in the test

11
6. Item Difficulty Index (p)
  • It is the proportion of students who answered
    the item correctly (a minimal computation is
    sketched below)
  • If almost all students get an item correct (or
    incorrect), the item is not very efficient
  • For ideal MCQs, difficulty indices are about .50
    to .70
  • For the test as a whole, the difficulty index
    should be about midway between the expected
    chance score and the maximum possible score
  • The p value varies with each class group that
    takes the test
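
As an illustration (not part of the original slides), here is a
minimal Python sketch of the computation, assuming a 0/1 score
matrix with one row per student and one column per item; all names
and data are illustrative:

```python
# Minimal sketch: item difficulty index p from a 0/1 score matrix.
# Assumed layout: scores[s][i] == 1 if student s answered item i correctly.

def difficulty_indices(scores):
    """Return p (the proportion of students answering correctly) per item."""
    n_students = len(scores)
    n_items = len(scores[0])
    return [sum(row[i] for row in scores) / n_students
            for i in range(n_items)]

# Example: 4 students, 3 items.
scores = [
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
    [0, 1, 0],
]
print(difficulty_indices(scores))  # [0.75, 0.75, 0.25]
```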

12
7. Item Discrimination Index (D)
  • It is a measure of an item's ability to
    discriminate between good and poor students
  • Students in the top 27% by total test score are
    taken to be good students; those in the bottom
    27% are taken to be poor students
  • The discrimination index is a basic measure of
    the validity of an item
  • Validity: whether a student got an item correct
    is due to their level of knowledge or ability,
    not to something else such as chance or test
    bias

13
7. Item Discrimination Index (D)
  • How to interpret D
    - D ranges from -1.00 to 1.00, so it can take
      negative values
    - D = 1.00 indicates a perfect positive
      discriminator
    - Most psychometricians say that items yielding
      D values of 0.30 and above are good
      discriminators and worthy of retention for
      future exams
  • D value is unique to a group of examinees
  • An item with satisfactory discrimination for one
    group may be unsatisfactory for another
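
A hedged sketch of the computation described above, reusing the same
assumed 0/1 score matrix; the 27% grouping follows the slide, while
the names and the rounding of the group size are illustrative choices:

```python
# Minimal sketch: discrimination index D via upper/lower 27% groups.
# scores[s][i] == 1 if student s answered item i correctly (assumed layout).

def discrimination_indices(scores):
    """Return D = p_upper - p_lower per item, using 27% tail groups."""
    totals = [sum(row) for row in scores]
    order = sorted(range(len(scores)), key=lambda s: totals[s])
    k = max(1, round(0.27 * len(scores)))  # size of each tail group
    lower, upper = order[:k], order[-k:]

    def prop_correct(group, i):
        return sum(scores[s][i] for s in group) / len(group)

    return [prop_correct(upper, i) - prop_correct(lower, i)
            for i in range(len(scores[0]))]

# Items with D >= 0.30 would be kept; a negative D flags an item that
# the weakest students answer correctly more often than the strongest,
# a candidate for revision or removal.
```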

14
8. Levels of Ability
  • For a test to distinguish clearly between
    students at different levels of ability, it
    must yield scores of wide variability
  • The larger the standard deviation (s), the
    better the test
  • An s value equal to one-sixth of the range
    between the highest possible score and the
    expected chance score is generally considered
    an acceptable standard (see the worked example
    below)
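
For instance (a worked example implied by the rule, not stated on
the slide): on a 50-item test of four-option MCQs, guessing yields
an expected chance score of 12.5, so the benchmark s works out as
follows:

```python
# Benchmark standard deviation from the slide's rule of thumb:
#   s_target = (highest possible score - expected chance score) / 6
max_score = 50            # illustrative: 50 one-point MCQ items
chance_score = 50 * 0.25  # expected score from guessing on 4-option items
s_target = (max_score - chance_score) / 6
print(round(s_target, 2))  # 6.25
```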

15
9. Test Reliability
  • The reliability coefficient represents the
    estimated correlation between the scores on the
    test and scores on another equivalent test,
    composed of different items, but designed to
    measure the same kind of achievement
  • The highest possible value is 1.00
  • This level is difficult to achieve consistently
    with homogeneous class groups and with items that
    previously have not been administered, analyzed,
    and revised
  • A reasonable goal for teachers to set is a
    reliability estimate of .80
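
The coefficient described here is a correlation with an equivalent
parallel test; in practice, reliability is often approximated from a
single administration with an internal-consistency estimate such as
KR-20 for 0/1-scored items. That substitution is an assumption, not
something the slides specify; a minimal sketch under it:

```python
# Minimal sketch of KR-20, a common reliability estimate for tests of
# dichotomously (0/1) scored items -- an illustrative choice, since the
# slides do not name an estimator.

def kr20(scores):
    """scores[s][i] == 1 if student s answered item i correctly."""
    n = len(scores)                   # students
    k = len(scores[0])                # items
    totals = [sum(row) for row in scores]
    mean = sum(totals) / n
    var_total = sum((t - mean) ** 2 for t in totals) / n  # assumed nonzero
    pq = 0.0
    for i in range(k):
        p = sum(row[i] for row in scores) / n  # difficulty index of item i
        pq += p * (1 - p)
    return (k / (k - 1)) * (1 - pq / var_total)
```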

16
10. Accuracy of Scores
  • The accuracy of the scores is reflected by the
    standard error of measurement (SEM), a statistic
    computed using the standard deviation and the
    reliability coefficient
  • If the SEM is 2 score points, for example, one
    can say that about two-thirds of the scores
    reported were within 2 points of each student's
    true score. About one-sixth of the students
    received scores more than 2 points higher than
    they should have, and the remaining one-sixth
    received scores more than 2 points too low
  • The SEM simply serves as an indication of how
    much chance error remains in the scores from even
    a good test
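
The 2-point example follows from the standard formula
SEM = s * sqrt(1 - r); a minimal sketch with illustrative values:

```python
import math

# Standard error of measurement: SEM = s * sqrt(1 - r), where s is the
# score standard deviation and r the reliability coefficient.
s = 5.0    # illustrative standard deviation
r = 0.84   # illustrative reliability estimate
sem = s * math.sqrt(1 - r)
print(round(sem, 2))  # 2.0 -> about two-thirds of scores lie within
                      #        +/- 2 points of the true score
```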

17
Conclusions
  • Item Analysis does not itself improve a test
  • Its main purpose is to serve as a guide to the
    teacher
  • Teachers can conduct the analysis themselves,
    but usually the last five factors are (and
    should be) implemented by an Evaluation and
    Examination Department
  • The analysis techniques work reliably on classes
    of 30 or more students

18
References
  • How to Judge the Quality of an Objective
    Classroom Test. Evaluation and Examination
    Service, The University of Iowa
  • Haladyna, T.M., Downing, S.M., & Rodriguez, M.C.
    (2002). A review of multiple-choice item-writing
    guidelines for classroom assessment. Applied
    Measurement in Education, 15(3), 309-334
  • Zurawski, R. (1998). Making the Most of Exams:
    Procedures for Item Analysis. National Teaching
    and Learning Forum, Vol. 7
  • Item Analysis Guidelines. Scoring Office,
    Michigan State University (http://scoring.msu.edu)
  • Wikipedia, the free encyclopedia