Title: Obtaining International Benchmarks for States Through Statistical Linking: Presentation at the Insti
1. Obtaining International Benchmarks for States Through Statistical Linking
Presentation at the Institute of Education Sciences (IES), National Center for Education Statistics (NCES)
- Gary W. Phillips
- Chief Scientist
- American Institutes for Research
- May 30, 2008
This paper is intended to promote the exchange of
ideas among researchers and policy makers. The
views expressed in it are part of ongoing
research and analysis and do not necessarily
reflect the position of the National Center for
Education Statistics, the Institute of Education
Sciences, or the U.S. Department of Education.
2. Two Ways of Obtaining International Benchmarks for States
- Hard Way
  - Each state is administered PISA, TIMSS, or PIRLS, and each country is administered PISA, TIMSS, or PIRLS.
  - Very expensive and impractical.
- Easy Way (statistical linking)
  - Each state is administered NAEP, and each country is administered TIMSS or PIRLS.
  - Inexpensive and practical.
3. PISA Cannot Be Statistically Linked to NAEP
- PISA is an age-based assessment (with grade 10 as the modal grade), whereas NAEP is grade-based (grade 8), so PISA results cannot be statistically linked to NAEP.
- Even if NAEP conducted a special study to make it possible to link NAEP and PISA (e.g., an age-based special assessment of NAEP with a modal grade of 10), State-NAEP still could not be compared to PISA because State-NAEP is administered in the 8th grade.
4. TIMSS Was Purposefully Designed to Be Statistically Linkable to NAEP
- Same grades as NAEP
- Similar content standards to NAEP
- Similar test design
- Matrix sampling of cognitive test items
- Policy- and research-related background questions
- Similar nationally representative sampling
- Similar scaling (item-response theory)
- Similar analysis methods (plausible values)
5. What Is Statistical Linking?
- If you link test X to test Y, this means
  - We are expressing the scores on one test (X) in terms of the metric of another test (Y).
  - For example, we are expressing the scores on NAEP (X) in terms of the metric of TIMSS (Y).
- Four types of linking: equating, calibration, projection, and moderation.
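As a rough illustration of expressing X in the metric of Y, the sketch below applies a linear (moderation-style) link that matches the mean and standard deviation of two randomly equivalent samples. The function name `moderation_link` and all scores are made up for illustration; they are not NAEP or TIMSS data, and an operational analysis would also use sampling weights and plausible values.

```python
from statistics import mean, pstdev


def moderation_link(x_scores, y_scores):
    """Return a function that maps scale X onto scale Y.

    The link is linear: y = intercept + slope * x, chosen so that the
    linked X scores have the same mean and standard deviation as Y.
    """
    mu_x, sd_x = mean(x_scores), pstdev(x_scores)
    mu_y, sd_y = mean(y_scores), pstdev(y_scores)
    slope = sd_y / sd_x
    intercept = mu_y - slope * mu_x
    return lambda x: intercept + slope * x


# Hypothetical scores for two randomly equivalent samples (illustration only).
naep_like = [250, 270, 280, 290, 310]   # stands in for test X
timss_like = [440, 480, 500, 520, 560]  # stands in for test Y

to_timss = moderation_link(naep_like, timss_like)
linked_score = to_timss(300)  # an X score of 300 expressed on the Y metric
```

With these invented numbers the X mean maps onto the Y mean, which is the defining property of this kind of link.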
6. Assumptions in Linking
7. Why Is It Important to Statistically Link NAEP to TIMSS or PIRLS?
- Allows policy makers to see how the U.S. (as a whole), the states, and school districts stack up against the rest of the world.
- Provides a common metric that is familiar to U.S. policy makers. It's like converting world currencies to dollars.
- Seeing the results of TIMSS or PIRLS through the lens of NAEP Achievement Levels provides a familiar benchmark for interpreting international educational performance.
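One way to picture "the lens of NAEP Achievement Levels" is to carry a NAEP cut score over to the international scale and then ask what share of a country's students fall at or above it. The sketch below assumes a linear link and a normal score distribution, both simplifications made here for illustration. The helper names and every number are hypothetical, not actual cut scores or country results.

```python
from statistics import NormalDist


def link_cut(naep_cut, naep_mean, naep_sd, intl_mean, intl_sd):
    """Express a NAEP cut score on the international scale (linear link)."""
    slope = intl_sd / naep_sd
    return intl_mean + slope * (naep_cut - naep_mean)


def percent_at_or_above(cut, mean, sd):
    """Percent of a normal(mean, sd) score distribution at or above `cut`."""
    return 100 * (1 - NormalDist(mean, sd).cdf(cut))


# Hypothetical U.S. linking-sample statistics on both scales (illustration only).
cut_on_intl_scale = link_cut(
    naep_cut=300, naep_mean=280, naep_sd=20, intl_mean=500, intl_sd=40
)

# Hypothetical country whose mean happens to sit exactly at the linked cut.
pct = percent_at_or_above(cut_on_intl_scale, mean=540, sd=40)
```

Because the invented country mean equals the linked cut, half of its distribution lies at or above the benchmark under the normality assumption.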
8. What Are the Ideal Data Requirements for Statistically Linking NAEP to International Assessments?
- NAEP and the international assessment must be administered within the United States to groups of students that are
  - Randomly equivalent
  - Nationally representative
  - In the same grades
  - In the same year
- NAEP and the international assessment must cover similar (but not identical) content.
9. Possible Statistical Linkages With NAEP During This Decade
10. Using NAEP Linked to TIMSS as an International Benchmark in Mathematics (Phillips, Chance Favors the Prepared Mind, AIR, 2007)
11. Using NAEP Linked to TIMSS as an International Benchmark in Science (Phillips, Chance Favors the Prepared Mind, AIR, 2007)
12. One Potential Way of Obtaining State Estimates for PISA Is Through Small Area Estimation
- If a survey has been carried out for the nation as a whole, the sample size within each state may be too small to generate accurate state estimates from the collected data.
- To deal with this problem, it may be possible to supplement the survey data with auxiliary data (such as the Common Core of Data, CCD) and use regression modeling techniques to obtain state estimates.
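The regression idea can be sketched in a few lines: fit a model relating direct survey estimates to an auxiliary variable in areas where the sample is adequate, then use the fitted line to produce an estimate for an area where it is not. Everything below is a simplified assumption for illustration: a single covariate, ordinary least squares, and invented numbers. Practical small area models are typically richer.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one covariate; returns (intercept, slope)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical states with adequate samples (illustration only):
# an auxiliary covariate from administrative data and a direct survey estimate.
aux = [10, 20, 30, 40]
direct = [520, 510, 500, 490]

intercept, slope = fit_line(aux, direct)

# Model-based estimate for a hypothetical state with too small a sample,
# using only its auxiliary covariate value.
state_estimate = intercept + slope * 35
```

The estimate for the small-sample state borrows strength from the other states through the fitted relationship, which is the core idea behind the regression approach described above.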
13. Statistical Linking Versus Small Area Estimation
- Statistical Linking: Two assessments (e.g., NAEP and TIMSS), measuring similar content, are administered to randomly equivalent national samples, and the scales are statistically linked (somewhat like converting Celsius to Fahrenheit).
- Small Area Estimation: One assessment (e.g., PISA) is administered to a national sample, and auxiliary data (such as the CCD) are used with regression modeling techniques to obtain state estimates.