The Research Consumer Evaluates Measurement Reliability and Validity

1
Chapter 6.
  • The Research Consumer Evaluates Measurement
    Reliability and Validity

2
Evidence that Matters: Reliable Measurements
  • Evidence that matters is collected from reliable,
    valid, responsive, and interpretable measures
    (methods of collecting information) of participant
    characteristics and program process, outcomes,
    impact, and costs.
  • For research findings to count, they must come
    from measures that have the capacity to
    consistently and accurately detect changes in
    program participants' knowledge, attitudes, and
    behavior.
  • A reliable measure is a consistent one.
  • A measure of quality of life, for example, is
    reliable if, on average, it produces the same
    information from the same people today and two
    weeks from now.

3
Reliability, Reproducibility, and Precision
  • A reliable measure is reproducible and precise:
    each time it is used, it produces the same value.
  • A beam scale can measure body weight precisely,
    but a questionnaire about good citizenship is
    likely to produce values that vary from person to
    person and even from time to time.
  • A measure (e.g., of good citizenship) cannot be
    perfectly precise if its underlying concept is
    imprecise (e.g., because of differing definitions
    of good citizenship).
  • This imprecision is the gateway to random
    (chance) error.
  • Error comes from three sources: variability in
    the measure itself, variability in the
    respondents, and variability in the observer.

4
Reliability Types
  • Test-retest reliability
  • A measure has test-retest reliability if the
    correlation (reliability coefficient) between
    scores obtained at different times is high.
  • Internal Consistency Reliability.
  • Internal consistency is an indicator of the
    cohesion of the items in a single measure.
  • All items in an internally consistent measure
    actually assess the same idea or concept.
  • One example of internal consistency might be a
    two-item test. The first item says "You almost
    always feel like smoking"; the second says "You
    almost never feel like smoking." If a person
    agrees with the first and disagrees with the
    second, the measure is internally consistent.

5
Reliability Types (Continued)
  • Split-half Reliability
  • To estimate split-half reliability, the
    researcher divides a measure into two equal
    halves (say, by placing all odd-numbered
    questions in the first half and all even-numbered
    questions in the second).
  • Then the researcher calculates the correlation
    between the two halves.
  • Alternate-form Reliability
  • Refers to the extent to which two instruments
    measure the same concepts at the same level of
    difficulty.

6
Reliability Types (Continued)
  • Intra-rater reliability
  • Refers to the extent to which an individual's
    observations are consistent over time.
  • If you score the quality of 10 evaluation
    reports at time 1, for example, and then re-score
    them 2 weeks later, your intra-rater reliability
    will be perfect if the two sets of scores are in
    perfect agreement.
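
The scenario above can be sketched as a simple proportion of exact agreement between the same rater's two scoring sessions (hypothetical 1-5 quality ratings of 10 reports):

```python
# Intra-rater reliability as the proportion of identical scores
# assigned by one rater at two points in time (hypothetical data).
time1 = [4, 3, 5, 2, 4, 4, 3, 5, 2, 4]  # scores at time 1
time2 = [4, 3, 5, 2, 4, 3, 3, 5, 2, 4]  # same rater, two weeks later

agreement = sum(a == b for a, b in zip(time1, time2)) / len(time1)
print(f"intra-rater agreement: {agreement:.0%}")
```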

7
Reliability Types (Continued)
  • Inter-rater reliability
  • Refers to the extent to which two or more
    observers or measurements agree with one another.
    Suppose you and a co-worker score the quality of
    10 evaluation reports. If you and your
    co-worker have identical scores for each of the
    10 reports, your inter-rater reliability will be
    perfect.
  • A commonly used method for determining the
    agreement between observations and observers
    results in a statistic called kappa.
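
Kappa improves on raw percent agreement by subtracting the agreement two raters would reach by chance alone. A minimal sketch of Cohen's kappa for the two-rater scenario above, using hypothetical "good"/"poor" ratings of 10 reports:

```python
# Cohen's kappa: agreement between two raters, corrected for the
# agreement expected by chance (hypothetical ratings).
from collections import Counter

rater_a = ["good", "good", "poor", "good", "poor",
           "good", "good", "poor", "good", "good"]
rater_b = ["good", "poor", "poor", "good", "poor",
           "good", "good", "poor", "good", "poor"]

n = len(rater_a)
observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

# Chance agreement: for each category, multiply the two raters'
# marginal proportions, then sum across categories.
counts_a, counts_b = Counter(rater_a), Counter(rater_b)
categories = set(rater_a) | set(rater_b)
expected = sum(counts_a[c] * counts_b[c] for c in categories) / n**2

kappa = (observed - expected) / (1 - expected)
print(f"observed agreement: {observed:.2f}, kappa: {kappa:.2f}")
```

Here the raters agree on 8 of 10 reports, but because much of that agreement could occur by chance, kappa is noticeably lower than the raw agreement.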

8
Measurement Validity
  • Validity refers to the degree to which a measure
    assesses what it is supposed to measure.
  • Measurement validity is not the same thing as
    the internal and external validity we discussed
    in connection with research design.
  • Measurement validity refers to the extent to
    which a measure or instrument provides data that
    accurately represents the concepts of interest.

9
Validity Types
  • Content validity
  • Refers to the extent to which a measure
    thoroughly and appropriately assesses the skills
    or characteristics it is intended to measure.
  • Face validity
  • Refers to how a measure appears on the surface:
    Does it seem to cover all the important domains
    and ask all the needed questions?
  • Face validity is established by experts in the
    field who are asked to review a measure and to
    comment on its coverage. Face validity is the
    weakest type because it does not have theoretical
    or research support.

10
Validity Types (Continued)
  • Predictive validity
  • Predictive validity refers to the extent to which
    a measure forecasts future performance.
  • A graduate school entry examination that predicts
    who will do well in graduate school (as measured,
    for example, by grades) has predictive validity.
  • Concurrent validity
  • Concurrent validity is demonstrated when two
    measures agree with one another, or a new measure
    compares favorably with one that is already
    considered valid.
  • Construct validity
  • Construct validity is established experimentally
    to demonstrate that a measure distinguishes
    between people who do and do not have certain
    characteristics.
  • To demonstrate construct validity for a
    measure of competent teaching, you need proof
    that teachers who do well on the measure are
    competent whereas teachers who do poorly are
    not.

11
Sensitivity and Specificity
  • Sensitivity and specificity are two terms that
    are used in connection with screening and
    diagnostic tests and measures to detect disease.
  • Sensitivity refers to the proportion of people
    with disease who have a positive test result.
  • A sensitive measure will correctly detect disease
    among people who have the disease.
  • A sensitive measure is a valid measure.
  • What happens when people with the disease get a
    negative test result anyway, as sometimes
    happens? That is called a false negative.
    Insensitive, invalid measures lead to false
    negatives.

12
Sensitivity and Specificity (Continued)
  • Specificity
  • Specificity refers to the proportion of people
    without disease who have a negative test result.
  • Measures with poor specificity lead to false
    positives. They invalidly classify people as
    having a disease when in fact they do not.
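
Both quantities can be computed directly from the four cells of a screening test's results. The counts below are hypothetical:

```python
# Sensitivity and specificity from hypothetical screening results.
true_positives = 45   # have the disease, test positive
false_negatives = 5   # have the disease, test negative (missed)
true_negatives = 90   # disease-free, test negative
false_positives = 10  # disease-free, test positive

# Sensitivity: proportion of diseased people the test flags
sensitivity = true_positives / (true_positives + false_negatives)
# Specificity: proportion of disease-free people the test clears
specificity = true_negatives / (true_negatives + false_positives)
print(f"sensitivity: {sensitivity:.0%}, specificity: {specificity:.0%}")
```

Note the denominators: sensitivity is computed only over people who have the disease, and specificity only over people who do not.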