Title: Effects of Question Format on 2005 8th Grade Science WASL Scores
Slide 1: Effects of Question Format on 2005 8th Grade Science WASL Scores
Slide 2: A Big Thank-you!
- WERA
- Pete Bylsma
- Andrea Meld
- Roy Beven
- Yoonsun Lee
- Joe Willhoft
- North Central ESD
Slide 3: Today's Presentation
- National trends in assessment
- Washington State trends
- My research on the science WASL
- A look at the literature to try to explain research results
- Take-home messages
Slide 4: National Trends in Science and Mathematics Assessments
Placing More Emphasis On:
- Assessing what is valued in the science professional community (inquiry, application)
- Assessing tightly integrated knowledge linked to application
- Involving teachers and professionals in test development

Compared To:
- What is easily measured
- Discrete bits of knowledge
- Off-the-shelf commercial tests
Slide 5: Improvements in the National Assessment of Educational Progress (NAEP)
- Items grouped into thematic blocks with rich context
- Real-world application
- Emphasizes integrated knowledge rather than bits of information
Slide 6: The NAEP Results
- Lower omission rates on thematically grouped items compared to stand-alone m/c items
- Increased student motivation to try items
- Increased student engagement
- (Silver et al., 2000; Kenney & Lindquist, 2000)
Slide 7: Washington's Science Standards Strands

Slide 8: Washington's Science Strands
Slide 9: 2 Science WASL Question Types
- Mostly scenario type
  - Rich context
  - Clear, authentic task
  - 5 to 6 multiple-choice, short-constructed response, or extended-constructed response items
- Few stand-alone type
  - Discrete bits of knowledge
  - 1 multiple-choice or short-constructed response item
Slide 10: 3 Item Response Formats
- Extended Constructed Response (ECR)
- Students write 3-4 sentences
- Short Constructed Response (SCR)
- Students write 1-2 sentences
- Multiple-choice (M/C)
Slide 11: 3 Categories of Factors That Affect Student Achievement Scores
- (The Student) Model of Cognition
- Culture
- Gender, Ethnicity
- Individual differences
- (The Test Item) Observation
- Item format
- Interpretation
- Measurement model
- (IRT, Bayes Nets)
Slide 12: The Test Item - Observation
- Girls scored much lower on m/c compared to boys (Jones et al., 1992)
- Girls scored higher on constructed response compared to boys (Zenisky et al., 2004)
- Underrepresented groups score higher on performance-like formats (Stecher et al., 2000)
- Embedded context increased comprehension (Solano-Flores, 2002; Zumbach & Reimann, 2002)
Slide 13: State's 2005 Science WASL Scores
Slide 14: Statement of Problem
- Is the science WASL accurately measuring what students know?
Slide 15: Hypothesis
- Contextual, real-world scenarios make information accessible to all ethnicities (cultural validity).
- Clear, authentic tasks within scenario questions unpack prior knowledge for ALL students.
- Extended and short constructed response formats (not just m/c) are gender neutral.
Slide 16: Research Questions
- On the 2005 8th grade science WASL, is there any significant difference in performance between gender and/or ethnic groups:
  - 1) on stand-alone question types?
  - 2) on scenario question types?
Slide 17: Methods - Instrument
- OSPI provided results from the 8th grade 2005 science WASL
- Entire population N = 81,690
- Invalid records excluded (e.g., cheating)
- Incomplete records excluded (e.g., gender or ethnicity omitted)
- Actual population N = 77,692
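The exclusion steps above amount to a simple record filter. A minimal sketch in Python; the field names and flag (`valid`, `gender`, `ethnicity`) are illustrative assumptions, not OSPI's actual data layout:

```python
# Hypothetical student records: each carries demographic fields and a
# validity flag. Names and values are made up for illustration.
records = [
    {"id": 1, "gender": "F", "ethnicity": "Hispanic", "valid": True},
    {"id": 2, "gender": None, "ethnicity": "White", "valid": True},   # incomplete
    {"id": 3, "gender": "M", "ethnicity": "Asian", "valid": False},   # invalid (e.g., cheating)
    {"id": 4, "gender": "M", "ethnicity": "Black", "valid": True},
]

def usable(record):
    """Keep only valid records with both gender and ethnicity reported."""
    return (record["valid"]
            and record["gender"] is not None
            and record["ethnicity"] is not None)

# Records 1 and 4 survive; 2 (missing gender) and 3 (invalid) are excluded.
analysis_sample = [r for r in records if usable(r)]
```

The same two-step logic (drop invalid, then drop incomplete) reduces the deck's N from 81,690 to 77,692.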
Slide 18: Methods - Analysis
- MANOVA with follow-up ANOVAs
- Dependent variables
  - scenario score points
  - stand-alone score points
- Independent variables
  - gender
  - ethnicity
Slide 19: Methods - Analysis (continued)
- Analysis I
  - All item response formats
- Analysis II
  - Multiple-choice response formats only
- Effect size (Cohen's d)
  - Magnitude of differences
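The deck uses Cohen's d but never defines it: the difference between two group means divided by the pooled standard deviation. A minimal sketch in Python, with made-up score data for illustration:

```python
from statistics import mean, stdev

def cohens_d(group_a, group_b):
    """Cohen's d: standardized mean difference between two groups."""
    na, nb = len(group_a), len(group_b)
    sa, sb = stdev(group_a), stdev(group_b)
    # Pooled standard deviation, weighting each group by its degrees of freedom.
    pooled_sd = (((na - 1) * sa**2 + (nb - 1) * sb**2) / (na + nb - 2)) ** 0.5
    return (mean(group_a) - mean(group_b)) / pooled_sd

# Hypothetical scenario-type score points for two subgroups.
group_a = [12, 14, 11, 15, 13, 12, 14]
group_b = [9, 10, 8, 11, 10, 9, 10]
d = cohens_d(group_a, group_b)
# Conventional benchmarks: d of about 0.2 is small, 0.5 moderate, 0.8 large,
# which is the scale behind the "small to moderate" vs. "large" labels
# in the results slides.
```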
Slide 20: Results
Slide 21: Stand-Alone Question Type
- Analysis of variance - significant differences?
  - Gender groups: NO
  - Ethnic subgroups: YES
  - Ethnicity x gender: YES
- Effect size
  - Gender: very small
  - Ethnicity x gender: very small
  - Ethnicity: small to moderate, between the White, Asian, and Multiracial groups and the AI/AN, HPI, Black, and Hispanic groups
Slide 22: Scenario Question Type
- Analysis of variance - significant differences?
  - Gender groups: NO
  - Ethnic subgroups: YES
  - Ethnicity x gender: YES
- Effect size
  - Gender: very small
  - Ethnicity x gender: very small
  - Ethnicity: LARGE, between the White, Asian, and Multiracial groups and the AI/AN, HPI, Black, and Hispanic groups
Slide 23: Result 1
- The achievement gap between ethnic subgroups is LARGER on scenario vs. stand-alone question types.
Slide 24: Result 2
- More students received MORE points on stand-alone question types compared to scenario question types.
Slide 25: Result 3
- A new achievement gap between boys and girls IS CREATED when extended constructed response items are removed.
Slide 26: Three Prevailing Themes in the Literature to Help Explain Differences in Student Achievement
Slide 27: Theme I - Individual Differences
- Expert/Novice theory (Alexander, 2003; Chi, 1988)
- Novice: dependent on working memory limits.
- Expert: fluent; freed-up working memory to focus on the meaning and execution of the problem.
Slide 28: Theme II - Opportunity to Learn and Quality Teaching/Learning (Darling-Hammond, 2000)
- There are differences between schools in students' exposure to knowledge, or opportunity to learn (OTL).
- Deep understanding of science and strategic processing knowledge often requires direct instruction and lots of practice (Garner, 1987).
- OTL is often compromised in high-need schools (lack of PD, support, supplies).
Slide 29: Theme III - Attributes of Items
- 1) Passage length (Davies, 1988)
- 2) Academic vocabulary (Shaftel et al., 2006)
- 3) Degree of knowledge transfer (Chi et al., 1988)
- 4) Ambiguity and complexity in performance-like items (Haydel, 2003)
- 5) Science strand type (Bruschi & Anderson, 1994)
- 6) Instructional sensitivity of item (D'Agostino et al., 2007)
Slide 30: Sensitivity of Items to Variations in Classroom Instruction
[Diagram: Standards, The Test Gap, The Learning Gap]
- Some item response formats are more sensitive to variations in classroom instruction than others. (D'Agostino et al., 2007)
Slide 31: Translating This Into Classroom Practice
- Inspired to dig deeper into detailed learning progressions from novice to expert.
- Using these principles in your formative assessment process can identify where students need rich feedback.
- Many teachers are creating common classroom-based assessments (CBAs) for quarterly benchmarking.
Slide 32: To Go: Classroom-Based Assessment (CBA) Creation Checklist
- Because not all items are created equal.
Slide 33: Lessons to Go
- Use all 3 item response types in your classroom-based assessments (CBAs).
- Keep passage length to a minimum to tease apart content knowledge from reading ability and working memory limitations.
Slide 34: Lessons to Go (continued)
- Use the same academic vocabulary in the classroom and on your CBAs that is on the WASL.
- Use embedded context in a way that is similar to how students learned the material.
Slide 35: Suggestions for Future Research
- 1) Do similar patterns within question types exist between schools? Classrooms?
- 2) Deeper examination of performance variance at the item level: what level of strategic processing knowledge is assumed compared to content knowledge?
- 3) Students' perceptions of assessment items (think-aloud protocol).
- 4) Do the same patterns exist independent of reading proficiency?
Slide 36: References (Page 1)
- Alexander, P. A. (2003). The development of expertise: The journey from acclimation to proficiency. Educational Researcher, 32(8), 10-14.
- Anderson, J. R. (1990). Cognitive Psychology and Its Implications (3rd ed.). New York: W.H. Freeman.
- Bruschi, B. A., & Anderson, B. T. (1994). Gender and ethnic differences in science achievement of nine-, thirteen-, and seventeen-year-old students. Paper presented at the Eastern Educational Research Association, Sarasota, FL.
- Chi, M. T., Glaser, R., & Farr, M. J. (1988). The Nature of Expertise. Hillsdale, NJ: Lawrence Erlbaum Associates.
- Cohen, D. K., & Hill, H. C. (2000). Instructional policy and classroom performance: The mathematics reform in California. Teachers College Record, 102(2), 294-343.
- D'Agostino, J. V., Welsh, M. E., & Corson, M. E. (2007). Instructional sensitivity of a state's standards-based assessment. Educational Assessment, 12, 1-22.
- Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Seattle: Center for the Study of Teaching and Policy, University of Washington.
Slide 37: References (Page 2)
- de Ribaupierre, A., & Rieben, L. (1995). Individual and situational variability in cognitive development. Educational Psychologist, 30(1), 5-14.
- Garner, R. (1990). When children and adults do not use learning strategies: Towards a theory of settings. Review of Educational Research, 60, 517-529.
- Haydel, A. M. (2003). Using cognitive analysis to understand motivational and situational influences in science achievement. Paper presented at the AERA, Chicago, IL.
- Shaftel, J., Belton-Kocher, E., Glasnapp, D., & Poggio, J. (2006). The impact of language characteristics in mathematics test items on the performance of English language learners and students with disabilities. Educational Assessment, 11(2), 105-126.
- Woltz, D. J. (2003). Implicit cognitive processes as aptitudes for learning. Educational Psychologist, 38(2), 95-104.