Title: Public Perceptions and Professional Judgment: Working Together to Better Understand the Myths
1 Public Perceptions and Professional Judgment
Working Together to Better Understand the Myths
The Reality of Testing and the Current Accountability System
- David Abrams
- Assistant Commissioner for Standards, Assessment and Reporting
- Albany Reading Council
- College of Saint Rose
- January 13, 2005
2 Testing and Assessment: Converging/Diverging Critical Perspectives
- Curriculum and Instruction
- NYS Learning Standards and Performance Indicators
- Quality First Teaching/Teacher Qualifications/Professional Development
- Measurement
- Summative/Formative Multiple Measures Alignment
- Large Scale Assessment Programs: Tests of Demonstrated Quality (NYSTP K-12)
- Teacher and Administrator Familiarity w/Testing Principles and Measurement Theory
3 Testing and Assessment: Converging/Diverging Critical Perspectives
- Policy
- Commissioner's Regulations
- Accountability Systems: State (NYSED) and NCLB (Federal)
4 NYSTP Quality Assessment
Data Driven Schools: Utilization of Data to Inform Curriculum, Instruction, Assessment (Hard, Fluid Numbers). Continuous Improvement.
5 Comprehensive Assessment System: Multiple Measures/Data Sources
- NYSTP K-8 Testing, Regents Examinations/RCTs, Alternate Assessment, NYSESLAT (Large Scale Assessment)
- External Standardized Assessments, e.g. TerraNova, AP (Large Scale Assessment)
- Formative Assessment
- Internally Designed, Local Assessment (Teacher Generated or Purchased/Mid-Terms-Final Exams-Standardized Tests, Course Grades)
- What do you want to learn from the data, and are you sure you are asking the right questions regarding the type of assessments you are reviewing?
- This is a process of Comparative Analysis
6 Why Test in Grades 3 Through 8?
- Initially, mandated by federal government
- Also presents the opportunity to:
- Evaluate the implementation of the learning standards annually
- Measure student progress
- Gather information about student readiness for study at the next level
7 No Child Left Behind (NCLB)
- Specifies that statewide tests must:
- Address the depth and breadth of the state content standards
- Be valid, reliable, and of high technical quality
- Be designed to provide a coherent system across grades and subjects
8 What Will These Tests Look Like?
- The NYS tests are designed to measure student achievement in English Language Arts (ELA) and mathematics in grades 3 through 8.
- The tests reflect New York State content/process standards in each grade and subject area.
- Signal priority content
- Are instructionally sensitive
- For ELA, independent writing prompts have been removed and an editing paragraph has been added.
- Tests in both subjects will be similar in format to existing Grade 4 and 8 assessments.
9 Grade 3 ELA Test Design
- Session 1 (Reading)
- Format
- 3 to 4 passages (literary and informational)
- 20 multiple choice items
- 1 constructed response item
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (literary)
- 4 multiple choice items
- 2 constructed response items
- 1 editing paragraph
- Standards 1, 2, 3 measured
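The Grade 3 design above can be tallied with a short script. This is only an illustrative sketch: the item counts are copied from the slide, while the dictionary layout, key names, and function names are assumptions made for the example and not an official NYSED format.

```python
# Item counts transcribed from the Grade 3 ELA design listed above.
# The dictionary layout and key names are illustrative, not an official format.
GRADE3_ELA = {
    "Session 1 (Reading)": {
        "multiple_choice": 20,
        "constructed_response": 1,
    },
    "Session 2 (Listening/Writing)": {
        "multiple_choice": 4,
        "constructed_response": 2,
        "editing_paragraph": 1,
    },
}


def items_per_session(design):
    """Return {session name: total number of items in that session}."""
    return {name: sum(counts.values()) for name, counts in design.items()}


def items_by_type(design):
    """Return {item type: total number of items of that type across sessions}."""
    totals = {}
    for counts in design.values():
        for item_type, n in counts.items():
            totals[item_type] = totals.get(item_type, 0) + n
    return totals


if __name__ == "__main__":
    print(items_per_session(GRADE3_ELA))
    print(items_by_type(GRADE3_ELA))
```

The same structure could hold the other grades' designs, which makes it easy to compare how the balance of multiple choice and constructed response items shifts from grade to grade.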
10 Grade 4 ELA Test Design
- Session 1 (Reading)
- Format
- 4 to 5 passages (literary and informational)
- 28 multiple choice items
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (literary)
- 2 constructed response items
- 1 extended response item
- Standard 2 measured
- Session 3 (Reading/Writing)
- Format
- 2 paired passages
- 3 constructed response items
- 1 extended response item
- Standard 3 measured
11 Grade 5 ELA Test Design
- Session 1 (Reading)
- Format
- 3 to 4 reading passages (literary and informational)
- 20 multiple choice reading items
- 1 constructed response item
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (informational)
- 4 multiple choice listening items
- 1 constructed response item
- 1 editing paragraph
- Standards 1 and 3 measured
12 Grade 6 ELA Test Design
- Session 1 (Reading)
- Format
- 4 to 5 passages (literary and informational)
- 26 multiple choice items
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (literary)
- 3 constructed response items
- 1 extended response item
- Standard 2 measured
- Session 3 (Reading/Writing)
- Format
- 2 paired passages
- 3 constructed response items
- 1 extended response item
- Standard 3 measured
13 Grade 7 ELA Test Design
- Session 1 (Reading)
- Format
- 4 to 5 passages (literary and informational)
- 26 multiple choice items
- 2 constructed response items
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (informational)
- 4 multiple choice items
- 2 constructed response items
- 1 editing paragraph
- Standards 1 and 3 measured
14 Grade 8 ELA Test Design
- Session 1 (Reading)
- Format
- 4 to 5 passages (literary and informational)
- 26 multiple choice items
- Standards 1, 2, 3 measured
- Session 2 (Listening/Writing)
- Format
- 1 listening selection (informational)
- 3 constructed response items
- 1 extended response item
- Standard 1 measured
- Session 3 (Reading/Writing)
- Format
- 2 paired passages
- 3 constructed response items
- 1 extended response item
- Standard 3 measured
- Sessions 1 and 2 will be given on one day
15 Testing Times for ELA and Mathematics
16 Testing Times for ELA and Mathematics
Sessions 1 and 2 will be given on one day for both ELA and math.
17 Field-Testing Update
- ELA: February 7-11 (no change from original schedule)
- Math: late May (change from original schedule due to demands of test development and the approval of new math standards)
18 Psychometric Architecture
- Yearly testing affords a rare opportunity to measure student growth over time.
- SED is looking at the Vertical Scaling designs.
- To create Vertical Scales, the tests must be linked and share common items from grade to grade; only multiple-choice items will be used.
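To illustrate what linking through common items can involve, here is a minimal sketch of mean/sigma linking, one standard way to place two adjacent grades on a shared scale. The slide does not say which linking method SED will use, so the choice of method is an assumption, and the function names and difficulty values below are invented for the example.

```python
from statistics import mean, pstdev


def mean_sigma_link(b_lower, b_upper):
    """Return (A, B) so that A * x + B maps the lower-grade scale onto the
    upper-grade scale.

    b_lower and b_upper hold difficulty estimates for the SAME common items,
    in the same order, as estimated on each grade's own scale.
    """
    if len(b_lower) != len(b_upper) or len(b_lower) < 2:
        raise ValueError("need two equal-length lists with at least 2 common items")
    sd_lower = pstdev(b_lower)
    if sd_lower == 0:
        raise ValueError("common items must differ in difficulty")
    A = pstdev(b_upper) / sd_lower           # ratio of spreads
    B = mean(b_upper) - A * mean(b_lower)    # shift that aligns the means
    return A, B


def rescale(value, A, B):
    """Express a lower-grade difficulty or ability value on the upper-grade scale."""
    return A * value + B


if __name__ == "__main__":
    # Hypothetical difficulties for three common multiple-choice items.
    grade3 = [-1.0, 0.0, 1.0]
    grade4 = [-2.0, -1.0, 0.0]   # the same items look easier to grade 4 students
    A, B = mean_sigma_link(grade3, grade4)
    print(A, B)
    print(rescale(0.5, A, B))
```

Chaining such transformations across each pair of adjacent grades is what would let a score in grade 3 be compared with a score in grade 8 on one scale.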
19 Psychometric Architecture
- Tests will be standard-set by NYS teachers after
initial administration in 2006.
20 Test Measurement Principles
- Based on how the test results are to be used, is there adequate evidence of the propositions to document the validity of the inferences for students taking the test?
- Is there adequate evidence of reliability of the test scores for proposed use?
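One common piece of reliability evidence is an internal-consistency coefficient such as Cronbach's alpha. The sketch below computes it from a small table of item scores. The slide does not name a specific reliability statistic, so the choice of alpha is an assumption, and the score table is invented for illustration.

```python
from statistics import pvariance


def cronbach_alpha(scores):
    """Compute Cronbach's alpha.

    scores: list of rows, one row per student, each row a list of item scores.
    alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores)
    """
    n_items = len(scores[0])
    if n_items < 2 or len(scores) < 2:
        raise ValueError("need at least 2 items and 2 students")
    item_vars = [pvariance([row[i] for row in scores]) for i in range(n_items)]
    total_var = pvariance([sum(row) for row in scores])
    if total_var == 0:
        raise ValueError("total scores have no variance")
    return (n_items / (n_items - 1)) * (1 - sum(item_vars) / total_var)


if __name__ == "__main__":
    # Hypothetical right/wrong (1/0) scores for four students on three items.
    scores = [
        [1, 1, 1],
        [1, 1, 0],
        [1, 0, 0],
        [0, 0, 0],
    ]
    print(cronbach_alpha(scores))
```

Values closer to 1 indicate that the items rank students more consistently. Whether a given value is adequate depends on how the scores will be used, which is the point of the question on the slide.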
21 Test Measurement Principles
- What is the purpose for which the test is being used?
- What information, besides the test, is being collected to inform this purpose?
- What are the particular propositions that need to be true to support the inferences drawn from the test scores for a given use?
22 Test Measurement Principles
- Is there adequate evidence of fairness in validity and reliability to document that the test score inferences are accurate and meaningful for all groups of students taking the test?
- Is there adequate evidence that cutscores have been properly established and that they will be used in ways that will provide accurate and meaningful information for all test takers?
- Source: The Use of Tests as Part of High-Stakes Decision-Making for Students: A Resource Guide for Educators and Policy-Makers. U.S. Department of Education, Office for Civil Rights. 2000.
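As a concrete picture of how cutscores are applied once they exist, the sketch below maps scale scores to performance levels. The cut values, the number of levels, and the rule that a score exactly at a cut counts as meeting it are all assumptions made for illustration; the source gives no cut values.

```python
from bisect import bisect_right

# Hypothetical cut scores on an invented scale; the source gives no values.
HYPOTHETICAL_CUTS = [600, 650, 700]


def performance_level(scale_score, cuts=HYPOTHETICAL_CUTS):
    """Return a 1-based level: 1 below the first cut, len(cuts) + 1 at or
    above the last cut. A score equal to a cut is placed in the higher level."""
    return bisect_right(cuts, scale_score) + 1


def level_counts(scores, cuts=HYPOTHETICAL_CUTS):
    """Count how many scores fall in each performance level."""
    counts = {level: 0 for level in range(1, len(cuts) + 2)}
    for score in scores:
        counts[performance_level(score, cuts)] += 1
    return counts


if __name__ == "__main__":
    print(performance_level(649))
    print(level_counts([590, 600, 655, 700, 710]))
```

Running `level_counts` separately for each student group is one simple way to start examining whether the cutscores behave sensibly for all test takers.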
23 Recommended Resources
- AERA, APA, NCME (1999). Standards for educational and psychological testing. Washington, D.C.: AERA.
- Cizek, Gregory J. (Ed.) (2001). Setting Performance Standards: Concepts, Methods, Perspectives. Mahwah, NJ: Lawrence Erlbaum Associates.
- Tindal, Gerald and Thomas M. Haladyna (Eds.) (2002). Large-Scale assessment programs for all students. Mahwah, NJ: Lawrence Erlbaum Associates.
- Educational Measurement: Issues and Practice. National Council on Measurement in Education quarterly journal.