1
Introduction to Measurement
2
Goals of Workshop
  • Reviewing assessment concepts
  • Reviewing instruments used in norming process
  • Getting an overview of the secondary and
    elementary normative samples
  • Learning how to use the manuals in interpreting
    students' scores

3
ASSESSMENT
  • The process of collecting data for the purpose of
    making decisions about students
  • It's a process and typically involves multiple
    sources and methods.
  • Assessment is in service of a goal or purpose.
  • The data we collect will be used to support some
    type of decision (e.g., monitoring, intervention,
    placement)

4
Major Types of Assessment in Schools
  • More frequently used
  • Achievement: How well is the child doing in the
    curriculum?
  • Aptitude: What are this child's intellectual and
    other capabilities?
  • Behavior: Is the child's behavior affecting
    learning?
  • Less frequently used
  • Teacher competence: Is the teacher actually
    imparting knowledge?
  • Classroom environment: Are classroom conditions
    conducive to learning?
  • Other concerns: home, community, ...

5
Types of Tests
  • Norm-referenced
  • Comparison of performance to a specified
    population/set of individuals
  • Individually-referenced
  • Comparisons to self
  • Criterion-referenced
  • Comparison of performance to mastery of a content
    area: What does the student know?
  • The data in the manual will allow you to look at
    norms and at individual growth.

6
MAJOR CONCEPTS
  • Nomothetic and Idiographic
  • Samples
  • Norms
  • Standardized Administration
  • Reliability
  • Validity

7
Nomothetic
  • Relating to the abstract, the universal, the
    general.
  • Nomothetic assessment focuses on the group as a
    unit.
  • Refers to finding principles that are applicable
    on a broad level.
  • For example, boys report higher math
    self-concepts than girls; girls report more
    depressive symptoms than boys.

8
Idiographic
  • Relating to the concrete, the individual, the
    unique
  • Idiographic assessment focuses on the individual
    student
  • What type of phonemic awareness skills does Joe
    possess?

9
Populations and Samples I
  • A population consists of all the representatives
    of a particular domain that you are interested in
  • The domain could be people, behavior, or
    curriculum (e.g., reading, math, spelling, ...)

10
Populations and Samples II
  • A sample is a subgroup that you actually draw
    from the population of interest
  • Ideally, you want your sample to represent your
    population
  • E.g., the people polled or examined, the test
    content, the manifestations of behavior

11
Random Samples
  • A sample in which each member of the population
    has an equal and independent chance of being
    selected.
  • Random samples are important because the idea is
    to have a sample that represents the population
    fairly: an unbiased sample.
  • A sample can be used to represent the population.
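
A minimal sketch of a simple random draw in Python; the
population of 1,000 student IDs and the sample size of 50
are made-up values for illustration.

    import random

    random.seed(42)  # fix the seed so the draw is reproducible
    population = list(range(1, 1001))       # hypothetical population of student IDs
    sample = random.sample(population, 50)  # every ID has an equal chance of selection
    print(sorted(sample)[:10])              # peek at part of the sample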

12
Probability Samples I
  • Sampling in which elements are drawn according to
    some known probability structure.
  • Random samples are subcases of probability
    samples.
  • Probability samples are typically used in
    conjunction with subgroups (e.g., ethnicity,
    socioeconomic status, gender).

13
Probability Samples II
  • Probability samples using subgroups are also
    referred to as stratified samples.
  • Standardization samples are typically probability
    or stratified samples.
  • Standardization samples need to represent the
    population because the sample's results will be
    used to create norms against which all members of
    the population will be compared.
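
A sketch of proportionate stratified sampling; the
"urban"/"rural" strata and their sizes are fabricated for
illustration.

    import random

    random.seed(42)
    # Hypothetical sampling frame of (student_id, stratum) pairs
    frame = [(i, "urban" if i % 3 else "rural") for i in range(1, 301)]

    def stratified_sample(frame, fraction):
        """Draw the same fraction from every stratum (proportionate allocation)."""
        strata = {}
        for unit, stratum in frame:
            strata.setdefault(stratum, []).append(unit)
        sample = []
        for units in strata.values():
            sample.extend(random.sample(units, round(len(units) * fraction)))
        return sample

    print(len(stratified_sample(frame, 0.10)))  # 30: 10% of each stratum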

14
Norms I
  • Norms are examples of how the "average"
    individual performs.
  • Many of the tests and rating scales that are used
    to compare children in the US are
    norm-referenced.
  • An individual child's performance is compared to
    the norms established using a representative
    sample.

15
Norms II
  • For the score on a normed instrument to be valid,
    the person being assessed must belong to the
    population for which the test was normed
  • If we wish to apply the test to another group of
    people, we need to establish norms for the new
    group

16
Norms III
  • To create new norms, we need to do a number of
    things (sketched in code after this list):
  • Get a representative sample of new population
  • Administer the instrument to the sample in a
    standardized fashion.
  • Examine the reliability and validity of the
    instrument with that new sample
  • Determine how we are going to report on scores
    and create the appropriate tables
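
Once the data are in, one common way to report scores is a
deviation-based standard score. A minimal sketch with a
fabricated norming sample (real norm tables are built from
far larger, representative samples):

    import statistics

    norm_sample = [12, 15, 18, 20, 21, 23, 25, 27, 30, 34]  # fabricated raw scores
    mean = statistics.mean(norm_sample)
    sd = statistics.stdev(norm_sample)

    def standard_score(raw):
        """Express a raw score on a scale with mean 100 and SD 15."""
        z = (raw - mean) / sd
        return round(100 + 15 * z)

    print(standard_score(27))  # a child's raw score located against the norm group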

17
Standardized Administration
  • All measurement has error.
  • Standardized administration is one way to reduce
    error due to examiner/clinician effects.
  • For example, consider these questions asked with
    different facial expressions and tones of voice
  • Please define a noun for me. :-)
  • DEFINE a noun, if you can? :-(

18
Distributions
  • Any group of scores can be arranged in a
    distribution from lowest to highest
  • 10, 3, 31, 100, 17, 4
  • 3, 4, 10, 17, 31, 100
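
In Python, arranging the scores above is a one-line sort:

    scores = [10, 3, 31, 100, 17, 4]
    print(sorted(scores))  # [3, 4, 10, 17, 31, 100]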

19
Normal Curve
  • Many distributions of human traits form a normal
    curve
  • Most cases cluster near the middle, with fewer
    individuals at the extremes; the curve is
    symmetrical
  • We know how the population is distributed based
    on the normal curve

20
Ways of reporting scores
  • Mean, standard deviation
  • Distribution of scores under the normal curve:
    68.26% within ±1 SD, 95.44% within ±2 SD, 99.72%
    within ±3 SD
  • Stanines (1, 2, 3, 4, 5, 6, 7, 8, 9)
  • Standard scores - linear transformations of
    scores, but easier to interpret
  • Percentile ranks
  • Box and Whisker Plots
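
The normal-curve percentages above can be checked with the
standard library (NormalDist needs Python 3.8+; the slide's
figures are the traditional textbook roundings of 68.27,
95.45, and 99.73), and the usual stanine formula is a
clipped, rescaled z-score:

    from statistics import NormalDist

    nd = NormalDist()  # the standard normal curve
    for k in (1, 2, 3):
        pct = (nd.cdf(k) - nd.cdf(-k)) * 100
        print(f"within +/-{k} SD: {pct:.2f}%")  # 68.27%, 95.45%, 99.73%

    def stanine(z):
        """Map a z-score to a stanine: half-SD-wide bands centered on 5."""
        return min(9, max(1, round(2 * z + 5)))

    print(stanine(0.0), stanine(1.3))  # 5 8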

21
Percentiles
  • A way of reporting where a person falls on a
    distribution.
  • The percentile rank of a score tells you what
    percentage of people obtained a score equal to or
    lower than that score.
  • So if we have a score at the 23rd percentile and
    another at the 69th percentile, which score is
    higher?
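
(The 69th-percentile score is the higher one.) A sketch of
the percentile-rank computation on a fabricated score list:

    def percentile_rank(score, scores):
        """Percentage of scores equal to or lower than the given score."""
        at_or_below = sum(1 for s in scores if s <= score)
        return 100 * at_or_below / len(scores)

    scores = [3, 4, 10, 17, 31, 100]
    print(round(percentile_rank(17, scores)))  # 67: 17 sits near the 67th percentile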

22
Percentiles 2
  • Is a high percentile always better than a low
    percentile?
  • It depends on what you are measuring.
  • For example ...
  • Box and whisker plots are visual displays, or
    graphic representations, of the shape of a
    distribution using percentiles.

23
(No Transcript)
24
Correlation
  • We need to understand the correlation coefficient
    to understand the manual
  • The correlation coefficient, r, quantifies the
    relationship between two sets of scores.
  • A correlation coefficient can have a range from
    -1 to 1.
  • An r of zero means the two sets of scores are not
    linearly related.
  • An r of 1 or -1 means the two sets of scores are
    perfectly related (a perfect correlation)
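
A from-scratch sketch of r for two fabricated score lists
(in practice, numpy.corrcoef does the same job in one
call):

    import math

    def pearson_r(x, y):
        """Pearson correlation between two equal-length lists of scores."""
        n = len(x)
        mx, my = sum(x) / n, sum(y) / n
        cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
        sx = math.sqrt(sum((a - mx) ** 2 for a in x))
        sy = math.sqrt(sum((b - my) ** 2 for b in y))
        return cov / (sx * sy)

    reading = [55, 60, 65, 70, 75]
    spelling = [50, 62, 61, 75, 78]
    print(round(pearson_r(reading, spelling), 2))  # 0.96: a strong positive r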

25
Correlation 2
  • Correlations can be positive or negative.
  • A positive correlation tells us that as one set
    of scores increases, the second set of scores
    also increases. Examples?
  • A negative correlation tells us that as one set
    of scores increases, the other set decreases.
    Think of some examples of variables with negative
    rs.
  • The absolute value of a correlation indicates the
    strength of the relationship. Thus .55 is equal
    in strength to -.55.

26
How would you describe the correlations shown by
these charts?
27
Correlation 4
  • .25, .70, -.40, .55, -.87, .58, .05
  • Order these from strongest to weakest
  • -.87, .70, .58, .55, -.40, .25, .05
  • We will meet 3 different types of correlation
    coefficients today
  • Reliability coefficients - Definitions?
  • Validity coefficients
  • Pattern coefficients
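
Because strength is the absolute value of r, the ordering
exercise above is a one-line sort:

    rs = [.25, .70, -.40, .55, -.87, .58, .05]
    print(sorted(rs, key=abs, reverse=True))
    # [-0.87, 0.7, 0.58, 0.55, -0.4, 0.25, 0.05]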

28
Reliability
  • Reliability addresses the stability, consistency,
    or reproducibility of scores.
  • Internal consistency
  • Split-half, Cronbach's alpha
  • Test-retest
  • Parallel forms
  • Inter-rater

29
Reliability 2
  • Internal Consistency
  • How do the items on a scale relate to one
    another? Are respondents responding to them in
    the same way?
  • Test-retest
  • How do respondents' scores at Time 1 relate to
    their scores at Time 2?
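
For internal consistency, Cronbach's alpha compares the
summed item variances to the variance of the total score.
A minimal sketch on a fabricated 3-item, 5-respondent data
set:

    import statistics

    def cronbach_alpha(items):
        """alpha = k/(k-1) * (1 - sum of item variances / variance of totals)."""
        k = len(items)
        item_vars = sum(statistics.pvariance(item) for item in items)
        totals = [sum(scores) for scores in zip(*items)]  # one total per respondent
        return k / (k - 1) * (1 - item_vars / statistics.pvariance(totals))

    items = [[3, 4, 4, 5, 2],   # item 1, five respondents
             [2, 4, 5, 5, 3],   # item 2
             [3, 5, 4, 4, 2]]   # item 3
    print(round(cronbach_alpha(items), 2))  # 0.89 for these fabricated responses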

30
Reliability 3
  • Parallel forms
  • Begin by creating at least two versions of the
    exam. How does respondents' performance on one
    version compare to their performance on the
    other?
  • Inter-rater
  • Connected to ratings of behavior. How do one
    rater's scores compare to another's?

31
Validity
  • Validity addresses the accuracy or truthfulness
    of scores. Are they measuring what we want them
    to?
  • Content
  • Criterion - Concurrent
  • Criterion - Predictive
  • Construct
  • Face

32
Content Validity
  • Is the assessment tool representative of the
    domain (behavior, curriculum) being measured?
  • An assessment tool is scrutinized for its (a)
    completeness or representativeness, (b)
    appropriateness, (c) format, and (d) bias
  • E.g., MSPAS

33
Criterion-related Validity
  • What is the correlation between our instrument,
    scale, or test and another variable that measures
    the same thing, or measures something that is
    very close to ours?
  • In concurrent validity, we compare scores on the
    instrument we are validating to scores on another
    variable that are obtained at the same time.
  • In predictive validity, we compare scores on the
    instrument we are validating to scores on another
    variable that are obtained at some future time.
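
Both flavors are just correlations with a criterion; only
the timing of the criterion differs. A sketch on fabricated
scores (statistics.correlation needs Python 3.10+):

    from statistics import correlation

    screener = [12, 18, 22, 25, 30]      # our new instrument, fall testing
    same_week = [45, 52, 60, 63, 71]     # established test given the same week
    spring = [2.1, 2.8, 3.0, 3.3, 3.7]   # grades earned six months later

    print(round(correlation(screener, same_week), 2))  # concurrent validity r
    print(round(correlation(screener, spring), 2))     # predictive validity r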

34
Structural Validity
  • Used when an instrument has multiple scales.
  • Asks the question, "Which items go together
    best?"
  • For example, how would you group these items from
    the Self-Description Questionnaire?
  • 3. I am hopeless in English classes.
  • 5. Overall, I am no good.
  • 7. I look forward to mathematics class.
  • 15. I feel that my life is not very useful.
  • 24. I get good marks in English.
  • 28. I hate mathematics.

35
Structural Validity 2
  • We expect the English items (3, 24), Math items
    (7, 28) and global items (5, 15) to group
    together.
  • The items that group together make up a new
    composite variable we call a factor.
  • We want each item to correlate highly with the
    factor it clusters on, and less well with other
    factors.
  • Typically, we accept item-factor coefficients
    from about .30 and higher.
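
A sketch of this idea with scikit-learn's FactorAnalysis;
the response matrix is fabricated so that six SDQ-like
items cluster on three factors (this is not real SDQ data,
and the negatively worded items load negatively):

    import numpy as np
    from sklearn.decomposition import FactorAnalysis

    rng = np.random.default_rng(0)
    n = 200  # fabricated respondents
    english = rng.normal(size=n)   # latent English self-concept
    maths = rng.normal(size=n)     # latent math self-concept
    overall = rng.normal(size=n)   # latent global self-concept
    noise = lambda: rng.normal(scale=0.5, size=n)

    # Columns mimic items 3, 5, 7, 15, 24, 28 from the slide
    X = np.column_stack([
        -english + noise(), -overall + noise(), maths + noise(),
        -overall + noise(), english + noise(), -maths + noise(),
    ])

    fa = FactorAnalysis(n_components=3, rotation="varimax").fit(X)
    print(np.round(fa.components_.T, 2))  # rows = items, columns = factors
    # Each item should load (|coefficient| >= ~.30) on exactly one factor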

36
What can we say about the structural validity of
the SDQ given these scores?
37
Construct Validity
  • The overarching question: Is the instrument
    measuring what it is supposed to?
  • Dependent on reliability, content validity, and
    criterion-related validity.
  • We also sometimes look at other types of validity
    evidence
  • Convergent validity: r with a similar construct
  • Discriminant validity: r with an unrelated
    construct
  • Structural validity: What is the structure of the
    scores on this instrument?

38
Statistical Significance
  • When we examine group differences in science, we
    want to make objective rather than subjective
    decisions.
  • We use statistics to tell us whether the
    difference we are observing could have occurred
    by chance.
  • In psychology, we typically set our alpha, or
    error rate, at 5% (i.e., .05), and we conclude
    that if a difference would occur by chance less
    than 5% of the time, that difference is
    statistically significant.

39
Statistical Significance 2
  • Our statistical test tells us whether our
    difference is statistically significant (i.e.,
    p < .05).
  • Statistical significance is affected by a number
    of variables, including sample size. The larger
    the sample, the easier it is to achieve
    statistical significance.
  • We also look at the magnitude of the difference
    (or effect size).
  • A difference may be statistically significant,
    but have a small effect size.
  • .10 to .30 = small effect; .40 to .60 = medium
    effect; > .60 = large effect.
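
A sketch of the full decision (a t-test for significance,
Cohen's d for magnitude) on fabricated groups; ttest_ind is
from SciPy:

    from statistics import mean, stdev
    from math import sqrt
    from scipy.stats import ttest_ind

    group_a = [12, 15, 14, 18, 20, 17, 16, 19]  # fabricated scores
    group_b = [11, 13, 12, 15, 16, 14, 13, 15]

    t, p = ttest_ind(group_a, group_b)
    # Cohen's d with a pooled SD (groups are the same size here)
    pooled_sd = sqrt((stdev(group_a) ** 2 + stdev(group_b) ** 2) / 2)
    d = (mean(group_a) - mean(group_b)) / pooled_sd
    print(f"p = {p:.3f}, d = {d:.2f}")  # significant if p < .05; then judge d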