1
VALIDITY, RELIABILITY & PRACTICALITY
  • Prof. Rosynella Cardozo
  • Prof. Jonathan Magdalena

2
QUALITIES OF MEASUREMENT DEVICES
  • Validity
  • Does it measure what it is supposed to measure?
  • Reliability
  • How consistent is the measurement?
  • Practicality
  • Is it easy to construct, administer, score and
    interpret?
  • Backwash
  • What is the impact of the test on the
    teaching/learning process?

3
VALIDITY
  • The term validity refers to whether or not a
    test measures what it intends to measure.
  • On a test with high validity, the items will be
    closely linked to the test's intended focus. For
    many certification and licensure tests, this means
    that the items will be highly related to a
    specific job or occupation. If a test has poor
    validity, it does not measure the job-related
    content and competencies it ought to.
  • There are several ways to estimate the validity
    of a test, including content validity, construct
    validity, criterion-related validity (concurrent
    and predictive), and face validity.

4
VALIDITY
  • Content: related to the objectives and their
    sampling.
  • Construct: referring to the theory underlying
    the target.
  • Criterion: related to concrete criteria in the
    real world. It can be concurrent or predictive.
  • Concurrent: correlating highly with another
    measure already validated.
  • Predictive: capable of anticipating some later
    measure.
  • Face: related to the test's overall appearance.

5
1. CONTENT VALIDITY
  • Content validity refers to the connections
    between the test items and the subject-related
    tasks. The test should evaluate only content
    related to the field of study, and should do so
    in a manner that is sufficiently representative,
    relevant, and comprehensible.

6
2. CONSTRUCT VALIDITY
  • Construct validity implies using the construct
    correctly (concepts, ideas, notions). It seeks
    agreement between a theoretical concept and a
    specific measuring device or procedure. For
    example, a test of intelligence nowadays must
    include measures of multiple intelligences,
    rather than just logical-mathematical and
    linguistic ability measures.

7
3. CRITERION-RELATED VALIDITY
  • Also referred to as instrumental validity, it
    states that the criteria should be clearly
    defined by the teacher in advance. To be
    standardized, it has to take other teachers'
    criteria into account, and it also needs to
    demonstrate the accuracy of a measure or
    procedure by comparing it with another measure
    or procedure that has already been demonstrated
    to be valid.

8
4. CONCURRENT VALIDITY
  • Concurrent validity is a statistical method
    using correlation, rather than a logical method.
  • Examinees who are known to be either masters
    or non-masters of the content measured by the
    test are identified before the test is
    administered. Once the tests have been scored,
    the relationship between the examinees' status
    as masters or non-masters and their performance
    on the test (i.e., pass or fail) is estimated.
    This type of validity provides evidence that
    the test is classifying examinees correctly.
    The stronger the correlation, the greater the
    concurrent validity of the test. A minimal
    sketch of this idea follows.
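
One way to quantify this relationship is a point-biserial
correlation between the known status and the test scores.
A minimal sketch, assuming Python with SciPy available; the
master/non-master labels and scores below are invented for
illustration:

    from scipy.stats import pointbiserialr

    # Correlate known master status (1 = master, 0 = non-master)
    # with observed test scores; a strong positive correlation
    # is evidence of concurrent validity.
    status = [1, 1, 1, 0, 0, 1, 0, 0, 1, 0]
    scores = [88, 92, 75, 60, 58, 81, 65, 70, 90, 55]
    r, p = pointbiserialr(status, scores)
    print(f"point-biserial r = {r:.2f} (p = {p:.3f})")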

9
5. PREDICTIVE VALIDITY
  • This is another statistical approach to validity
    that estimates the relationship of test scores to
    an examinee's future performance as a master or
    non-master. Predictive validity considers the
    question, "How well does the test predict
    examinees' future status as masters or
    non-masters?" For this type of validity, the
    correlation that is computed is based on the test
    results and the examinees' later performance.
    This type of validity is especially useful for
    test purposes such as selection or admissions. A
    short sketch follows.
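
A minimal sketch, again assuming Python with SciPy; the
admission-test scores and later grades below are invented:

    from scipy.stats import pearsonr

    # Correlate admission-test scores with performance measured
    # later; a strong correlation supports using the test for
    # selection or admissions decisions.
    test_scores = [55, 62, 70, 71, 80, 84, 90, 95]
    later_gpa = [2.1, 2.4, 2.8, 2.6, 3.1, 3.0, 3.6, 3.8]
    r, p = pearsonr(test_scores, later_gpa)
    print(f"predictive r = {r:.2f} (p = {p:.3f})")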

10
6. FACE VALIDITY
  • Like content validity, face validity is
    determined by a review of the items and not
    through the use of statistical analyses. Unlike
    content validity, face validity is not
    investigated through formal procedures. Instead,
    anyone who looks over the test, including
    examinees, may develop an informal opinion as to
    whether or not the test is measuring what it is
    supposed to measure. While it is clearly of some
    value to have the test appear to be valid, face
    validity alone is insufficient for establishing
    that the test is measuring what it claims to
    measure.

11
QUALITIES OF MEASUREMENT DEVICES
  • Validity
  • Does it measure what it is supposed to measure?
  • Reliability
  • How consistent is the measurement?
  • Practicality
  • Is it easy to construct, administer, score and
    interpret?
  • Backwash
  • What is the impact of the test on the
    teaching/learning process?

12
RELIABILITY
  • Reliability is the extent to which an
    experiment, test, or any measuring procedure
    shows the same result on repeated trials. Without
    the agreement of independent observers able to
    replicate research procedures, or the ability to
    use research tools and procedures that produce
    consistent measurements, researchers would be
    unable to satisfactorily draw conclusions,
    formulate theories, or make claims about the
    generalizability of their research. For
    researchers, the five key types of reliability are:

13
RELIABILITY
  • Equivalency: related to the co-occurrence of
    two items
  • Stability: related to consistency over time
  • Internal: related to the instrument itself
  • Inter-rater: related to agreement between
    examiners
  • Intra-rater: related to the consistency of a
    single examiner

14
1. EQUIVALENCY RELIABILITY
  • Equivalency reliability is the extent to which
    two items measure identical concepts at an
    identical level of difficulty. Equivalency
    reliability is determined by relating two sets of
    test scores to one another to highlight the
    degree of relationship or association. For
    example, a researcher studying university English
    students happened to notice that when some
    students were studying for finals, they got sick.
    Intrigued by this, the researcher attempted to
    observe how often, or to what degree, these two
    behaviors co-occurred throughout the academic
    year. The researcher used the results of the
    observations to assess the correlation between
    studying throughout the academic year and
    getting sick. The researcher concluded there
    was poor equivalency reliability between the two
    actions. In other words, studying was not a
    reliable predictor of getting sick. A sketch of
    the underlying correlation follows.
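
In practice, equivalency reliability is typically estimated
by correlating scores from two parallel forms of the same
test. A minimal sketch, assuming Python with NumPy; both
score lists are invented:

    import numpy as np

    # Correlate scores from two forms intended to measure the
    # same concept at the same difficulty; a high correlation
    # indicates good equivalency (parallel-forms) reliability.
    form_a = np.array([78, 85, 62, 90, 70, 88, 75, 95])
    form_b = np.array([75, 88, 60, 92, 68, 85, 78, 93])
    r = np.corrcoef(form_a, form_b)[0, 1]
    print(f"parallel-forms r = {r:.2f}")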

15
2. STABILITY RELIABILITY
  • Stability reliability (sometimes called
    test-retest reliability) is the agreement of
    measuring instruments over time. To determine
    stability, a measure or test is repeated on the
    same subjects at a future date. Results are
    compared and correlated with the initial test to
    give a measure of stability. This method of
    evaluating reliability is appropriate only if the
    phenomenon that the test measures is known to be
    stable over the interval between assessments. The
    possibility of practice effects should also be
    taken into account. See the sketch below.
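
A minimal sketch of the test-retest correlation, computed
directly from the Pearson formula in plain Python; the two
sets of scores are invented:

    # Same test given twice to the same examinees; the
    # correlation between the two administrations estimates
    # stability. r = sum((x-mx)(y-my)) / (sx * sy), where
    # sx and sy are the root sums of squared deviations.
    test = [12, 15, 9, 20, 14, 18, 11, 16]
    retest = [13, 14, 10, 19, 15, 17, 12, 18]

    n = len(test)
    mx, my = sum(test) / n, sum(retest) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(test, retest))
    sx = sum((x - mx) ** 2 for x in test) ** 0.5
    sy = sum((y - my) ** 2 for y in retest) ** 0.5
    print(f"test-retest r = {cov / (sx * sy):.2f}")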

16
3. INTERNAL CONSISTENCY
  • Internal consistency is the extent to which
    tests or procedures assess the same
    characteristic, skill or quality. It is a measure
    of the precision between the measuring
    instruments used in a study. This type of
    reliability often helps researchers interpret
    data and predict the value of scores and the
    limits of the relationship among variables. For
    example, analyzing the internal reliability of
    the items on a vocabulary quiz will reveal the
    extent to which the quiz focuses on examinees'
    knowledge of words. A sketch follows.
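
One standard index of internal consistency (not named on
the slide) is Cronbach's alpha. A minimal sketch, assuming
Python with NumPy; the 0/1 item-response matrix below is
invented, with examinees as rows and quiz items as columns:

    import numpy as np

    # Cronbach's alpha: k/(k-1) * (1 - sum of item variances /
    # variance of total scores); a higher alpha means the items
    # hang together as a measure of one characteristic.
    scores = np.array([
        [1, 1, 1, 0, 1],
        [1, 0, 1, 1, 1],
        [0, 0, 1, 0, 0],
        [1, 1, 1, 1, 1],
        [0, 1, 0, 0, 1],
    ])
    k = scores.shape[1]
    item_var = scores.var(axis=0, ddof=1).sum()
    total_var = scores.sum(axis=1).var(ddof=1)
    alpha = (k / (k - 1)) * (1 - item_var / total_var)
    print(f"Cronbach's alpha = {alpha:.2f}")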

17
4. INTER-RATER RELIABILITY
  • Inter-rater reliability is the extent to which
    two or more individuals (coders or raters) agree.
    Inter-rater reliability assesses the consistency
    of how a measuring system is implemented. For
    example, suppose two or more teachers use the
    same scale to rate students' oral responses in
    an interview (1 being most negative, 5 being
    most positive). If one rater gives a "1" to a
    student response while another gives a "5," the
    inter-rater reliability is clearly poor.
    Inter-rater reliability depends on the ability
    of two or more individuals to be consistent;
    training, education, and monitoring can enhance
    it. A sketch of one agreement statistic follows.
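
For two raters, a common chance-corrected agreement
statistic is Cohen's kappa; the slide names no particular
statistic, so this is one reasonable choice. A minimal
sketch in plain Python with invented 1-5 ratings:

    from collections import Counter

    # Cohen's kappa = (observed agreement - chance agreement)
    #               / (1 - chance agreement)
    rater1 = [5, 4, 3, 5, 2, 4, 4, 3, 5, 1]
    rater2 = [5, 4, 2, 5, 2, 4, 3, 3, 4, 1]

    n = len(rater1)
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum((c1[c] / n) * (c2[c] / n)
              for c in set(rater1) | set(rater2))
    kappa = (p_o - p_e) / (1 - p_e)
    print(f"agreement = {p_o:.2f}, kappa = {kappa:.2f}")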

18
5. INTRA-RATER RELIABILITY
  • Intra-rater reliability is a type of reliability
    assessment in which the same assessment is
    completed by the same rater on two or more
    occasions. These different ratings are then
    compared, generally by means of correlation. One
    caveat: since the same individual completes both
    assessments, the rater's later ratings may be
    contaminated by knowledge of the earlier ones. A
    sketch follows.
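
Since ratings on a 1-5 scale are ordinal, a rank correlation
such as Spearman's rho is a reasonable choice here. A
minimal sketch, assuming Python with SciPy; the two rounds
of ratings are invented:

    from scipy.stats import spearmanr

    # Correlate the same rater's scores from two occasions;
    # a high rank correlation indicates good intra-rater
    # reliability.
    first_pass = [4, 3, 5, 2, 4, 1, 3, 5]
    second_pass = [4, 3, 4, 2, 5, 1, 3, 5]
    rho, p = spearmanr(first_pass, second_pass)
    print(f"intra-rater Spearman rho = {rho:.2f}")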

19
SOURCES OF ERROR
  • Examinee (is a human being)
  • Examiner (is a human being)
  • Examination (is designed by and for human beings)

20
RELATIONSHIP BETWEEN VALIDITY & RELIABILITY
  • Validity and reliability are closely related.
  • A test cannot be considered valid unless the
    measurements resulting from it are reliable.
  • Conversely, results from a test can be reliable
    without necessarily being valid.

21
QUALITIES OF MEASUREMENT DEVICES
  • Validity
  • Does it measure what it is supposed to measure?
  • Reliability
  • How consistent is the measurement?
  • Practicality
  • Is it easy to construct, administer, score and
    interpret?
  • Backwash
  • What is the impact of the test on the
    teaching/learning process?

22
PRACTICALITY
  • It refers to the economy of time, effort and
    money in testing. In other words, a test should
    be:
  • Easy to design
  • Easy to administer
  • Easy to mark
  • Easy to interpret (the results)

23
QUALITIES OF MEASUREMENT DEVICES
  • Validity
  • Does it measure what it is supposed to measure?
  • Reliability
  • How consistent is the measurement?
  • Practicality
  • Is it easy to construct, administer, score and
    interpret?
  • Backwash
  • What is the impact of the test on the
    teaching/learning process?

24
BACKWASH EFFECT
  • The backwash effect (also known as washback) is
    the influence of testing on teaching and
    learning. It is also the potential impact that
    the form and content of a test may have on
    learners' conception of what is being assessed
    (language proficiency) and what it involves.
    Therefore, test designers, deliverers, and
    raters have a particular responsibility,
    considering that the testing process may have a
    substantial impact, either positive or negative.

25
LEVELS OF BACKWASH
  • It is believed that backwash is a subset of a
    test's impact on society, educational systems,
    and individuals. Thus, test impact operates at
    two levels:
  • The micro level (the effect of the test on
    individual students and teachers)
  • The macro level (the impact of the test on
    society and the educational system)
  • Bachman and Palmer (1996)

26
  • THANKS