NCLB and Growth Models: In Conflict or in Concert? (PowerPoint Presentation Transcript)

1
NCLB and Growth Models: In Conflict or in Concert?
  • Susan L. Rigney, United States Department of
    Education
  • Joseph A. Martineau, Michigan Department of
    Education
  • Presented at the MARCES Conference on
    Longitudinal Modeling of Student Achievement,
    College Park, MD, November 7, 2005

2
Introduction
  • "In response to your concerns about giving
    schools credit for improving student achievement,
    we are also considering the idea of a growth
    model."
  • Margaret Spellings, 9/13/05

3
Author Perspectives
  • Sue Rigney
  • Education Specialist in the Office of Student
    Assessment and School Accountability (Title I) at
    the U.S. Department of Education.
  • Primary responsibility: monitoring state
    compliance with the standards, assessment, and
    accountability requirements of NCLB
  • Secondary responsibility: contributing to the
    ongoing discussion, clarification, and
    implementation of policies related to assessment
    and accountability.

4
Author Perspectives
  • Joseph Martineau
  • Psychometrician for the Michigan Office of
    Educational Assessment and Accountability.
  • Primary concerns: congruence of accountability
    systems with the values of educational research;
    adequacy of statistical and psychometric
    methodology
  • Secondary concerns: philosophy and policy of
    accountability in terms of both practicality and
    feasibility
  • Authorship should not be construed as an
    endorsement of NCLB as a whole.

5
In conflict?
  • CRS (Congressional Research Service) says:
  • "Substantial interest in the possible use of
    individual/cohort growth models. Such AYP models
    are not consistent with certain statutory
    provisions of NCLB as currently interpreted by
    USED."
  • But NCLB (Sec. 4) says:
  • "The Secretary shall take such steps as are
    necessary to provide for the orderly transition
    to, and implementation of, programs authorized by
    this Act."

6
In concert?
  • USED Growth Model Study Group
  • IES grant for longitudinal data systems
  • State Accountability Workbook Amendments

7
Types of Models
  • Definitions developed by a State collaborative
    through CCSSO (Goldschmidt et al., 2005)
  • Definitions
  • Cross-sectional models
  • Status Models
  • Improvement Models
  • Longitudinal Models
  • Growth Models
  • Residual Growth (RG) Models
  • Commonly labeled "value-added" models
  • Why we use the term RG (a toy computational
    sketch contrasting these model types follows
    below)
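To make these definitions concrete, the following is a minimal, hypothetical sketch (not part of the original slides) contrasting status, improvement, growth, and residual-growth indicators on invented scores; the cut score, the statewide regression, and all data are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical statewide grade 3 (prior year) and grade 4 (current year) scores,
# used to estimate the expected grade 4 score given a grade 3 score.
state_g3 = rng.normal(300, 25, size=2000)
state_g4 = 100 + 1.0 * state_g3 + rng.normal(0, 15, size=2000)
slope, intercept = np.polyfit(state_g3, state_g4, 1)

# One hypothetical school: the same five students observed in both years,
# plus last year's grade 4 cohort (different students) for the improvement model.
g3_2004 = np.array([310.0, 295.0, 330.0, 305.0, 320.0])  # this cohort, prior year
g4_2005 = np.array([412.0, 390.0, 445.0, 401.0, 430.0])  # same students, this year
g4_2004 = np.array([405.0, 398.0, 420.0, 388.0, 415.0])  # last year's grade 4 cohort
proficient_cut = 400.0

# Status model: proportion proficient this year, this grade.
status = np.mean(g4_2005 >= proficient_cut)

# Improvement model: this year's grade 4 cohort vs. last year's grade 4 cohort.
improvement = status - np.mean(g4_2004 >= proficient_cut)

# Growth model: gain for the same students (assumes a vertical scale,
# an assumption the later slides question).
growth = np.mean(g4_2005 - g3_2004)

# Residual growth ("value-added") model: departure from the statewide expectation.
residual_growth = np.mean(g4_2005 - (intercept + slope * g3_2004))

print(f"Status (proportion proficient): {status:.2f}")
print(f"Improvement (cohort difference): {improvement:+.2f}")
print(f"Growth (same students):          {growth:+.1f} scale points")
print(f"Residual growth:                 {residual_growth:+.1f} scale points")
```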

8
The Intersection of Policy and Growth Models
  • Grades 3-8 assessments provide longitudinal data
  • Safe Harbor
  • Use of Improvement Index in AYP
  • CCSSO SCASS Activities
  • USED Assistant Secretary Luce

9
Systemic Coherence: A Standard for Evaluating
Models
  • Three broad principles of systemic coherence
  • Models are consistent with policy goals
  • Models are integrated as a part of a consistent
    system of content standards, assessments,
    performance standards, and accountability
    criteria
  • Models are implemented in a manner consistent
    with the values of educational research

10
1. Standards-based
  • Assessments must cover depth and breadth
  • Results expressed in terms of performance levels
  • Proficient is most influential component of AYP

11
2. All Students
  • Participate (95% rule)
  • Results reported for all
  • AYP: not all students visible
  • Full Academic Year
  • Minimum n
  • LEP exemption for ELA test
  • Held to the same standards
  • Alternate assessments based on alternate
    achievement standards

12
3. School Improvement
  • Annual Measurable Objectives
  • Increased in 2004-05
  • Adjustment for transition in 2005-06
  • School accountable for subgroups
  • More visible in 2005-06
  • Consequences
  • Can/should growth moderate consequences?

13
Consistency of Content Standards, Assessments,
Performance Standards, and Accountability Criteria
  • Accountability based on academic indicators
  • Peer Review of State Assessment Systems
  • Alignment
  • Performance descriptors
  • Alternate assessments

14
Coherent Assessment System
  • State assessments
  • Rational, coherent design
  • Relative contribution of different tests
  • Matrix forms equivalent
  • Comparability
  • English vs Spanish
  • Computer vs paper pencil
  • Local assessments
  • Aligned, equivalent, comparable results for
    subgroups, aggregable

15
Results understandable
  • Educators know what to do
  • Articulation across grades
  • Articulation across performance levels
  • A progression matrix that shows
  • Proficient is different from basic because...
  • Proficient in third grade is different from
    proficient in fourth grade because...
  • Administrators know how to allocate resources

16
Consistency with Values of Educational Research
  • As defined by Gregory N. Derry [1]
  • Free flow of information
  • Curiosity
  • Replicability
  • Thorough peer review
  • Improvement
  • Honesty and Open-mindedness
  • Willingness to consider multiple alternatives
  • Scrupulous investigations of weaknesses
  • Flexibility to adopt feasible improvements

[1] Professor of Physics at Loyola University and
author of What Science Is and How It Works
(Princeton University Press, 1999)
17
Attributes of Systemic Coherence Applicable in
this Context
  • Alignment of standards and assessments
  • The same performance standards for all
  • Inclusion of all student groups
  • Explicit tracking of achievement gaps
  • Appropriate statistical and psychometric models
  • A program of ongoing research
  • Consistency of reports with all other attributes

18
1. Alignment of Standards and Assessments
  • Foundation of validity of school accountability
    decisions
  • USED expects independent verification of
  • Full range of content standards?
  • Address content and process skills?
  • Same degree and pattern of emphasis?
  • Scores reflect full range of achievement?
  • Procedures to maintain/improve?

19
Alignment methods
  • Alignment Methodology
  • Webb (SCASS TILSA)
  • Porter (SCASS SEC)
  • Achieve
  • Buros
  • Methods do not address articulation across grades
  • JM: Current instantiations of independent review
    may underestimate alignment

20
2. The Same Standards for All Students
  • Grade-level achievement standards
  • Except for students with the most significant
    cognitive disabilities (1% cap)
  • All students proficient by 2013-14
  • What about growth toward proficient?
  • What about length of time in system?
  • Proposals to balance fairness toward both
    educators and student groups should also be a
    part of any plan to implement growth models for
    accountability purposes. Fairness toward one
    should not be sacrificed for fairness toward the
    other.

21
2. The Same Standards for All Students
  • JM: The NCLB expectation that all students will
    be proficient by a given date seems unreasonable.
    The recognition that there will always be
    individual differences among students (and
    aggregate differences across schools in their
    intake populations) should also be incorporated
    in setting policy targets.
  • SR: Safe harbor recognizes that adequate yearly
    progress may be met with less than 100% of
    students meeting annual and long-range goals (a
    worked example of the safe harbor arithmetic
    follows below).
  • JM: The safe harbor provision of NCLB is a good
    beginning, but does not fully account for these
    realities.
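As a small worked illustration of the safe harbor arithmetic mentioned above: under the statute, a subgroup that misses the annual objective can still make AYP if the percentage of students below proficient falls by at least 10 percent from the prior year (and the subgroup progresses on the other academic indicator). The sketch below, with invented numbers, covers only the reduction test.

```python
def meets_safe_harbor(pct_below_prior: float, pct_below_current: float) -> bool:
    """Safe harbor reduction test only: the percentage of students below
    proficient must drop by at least 10% (relative) from the prior year.
    The statute's additional requirement of progress on the other academic
    indicator is not modeled here."""
    return pct_below_current <= 0.90 * pct_below_prior

# 60% below proficient last year -> no more than 54% this year.
print(meets_safe_harbor(60.0, 54.0))  # True
print(meets_safe_harbor(60.0, 55.0))  # False
```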

22
2. The Same Standards for All Students
  • JM: The punitive nature of NCLB consequences can
    actually undermine policy objectives by adding
    turbulence to schools serving low-achieving
    students.
  • SR: The pressures of accountability have resulted
    in remarkable successes (Ed Trust), and there are
    multiple safeguards to prevent Type I error.
  • JM: The multiple safeguards are an important
    start, but policies encouraging more assistance
    to, and attraction of highly effective educators
    to, low-achieving schools are more likely to
    support the policy objectives.
  • SR: NCLB funds are available for recruitment and
    retention bonuses, and data indicate that states
    are beginning to use these funds in this way.

23
Implications for growth model
  • Expectation of the same growth for all maintains
    the achievement gap
  • Expectation of 12 months' growth in 1 year
    maintains the achievement gap
  • Expectation of normative growth maintains the
    achievement gap (a small numerical illustration
    follows below)
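The point can be seen with a few lines of arithmetic on invented scores: if two groups start 60 scale points apart and both are expected to grow by the same 40 points per year, the gap never moves.

```python
# Hypothetical scale scores for two groups under an equal-growth expectation.
group_a, group_b, annual_growth = 420.0, 360.0, 40.0  # 60-point initial gap

for year in range(4):
    a = group_a + annual_growth * year
    b = group_b + annual_growth * year
    print(f"Year {year}: A = {a:.0f}, B = {b:.0f}, gap = {a - b:.0f}")
# The gap stays at 60 points every year; closing it requires larger growth
# targets for the lower-scoring group than for the higher-scoring one.
```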

24
3. Inclusion of All Student Groups
  • Missing data means missing students
  • How many missing students does it take to
    compromise validity?
  • Robustness to missing data does not imply that it
    is OK to leave out data where it can reasonably
    be obtained

25
4. Explicitly Tracking Achievement Gaps
  • Closing the achievement gap is
  • A policy objective
  • A matter of ethics
  • Attainable
  • Tracking the achievement gap makes inequities
    publicly visible

26
4. Explicitly Tracking Achievement Gaps,
continued
  • Separate models from those used to track
    attainment of growth targets
  • Include in the model variables defining
    policy-defined subgroups
  • Interaction of grade with subgroup variables (a
    minimal formula sketch follows this list)
  • Simple graphical representation of the results
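The following is a minimal sketch of one way to include policy-defined subgroup variables and their interaction with grade, using simulated data and the statsmodels library; the column names, effect sizes, and model specification are assumptions for illustration, not the presenters' method.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)

# Simulated long-format data (hypothetical): 300 students tested in grades 3-5,
# with a 0/1 policy-defined subgroup indicator.
n_students, n_grades = 300, 3
student = np.repeat(np.arange(n_students), n_grades)
grade = np.tile(np.arange(3, 3 + n_grades), n_students)
subgroup = np.repeat(rng.integers(0, 2, n_students), n_grades)
score = (280 + 40 * (grade - 3)            # average trajectory
         - 25 * subgroup                   # initial gap of 25 points
         + 5 * subgroup * (grade - 3)      # gap narrows by 5 points per grade
         + rng.normal(0, 12, student.size))
data = pd.DataFrame({"student": student, "grade": grade,
                     "subgroup": subgroup, "score": score})

# Gap-tracking model: the grade-by-subgroup interaction is the term that
# makes changes in the achievement gap explicitly visible in the output.
gap_model = smf.mixedlm("score ~ grade * subgroup", data, groups=data["student"])
print(gap_model.fit().summary())
```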

27
5. Appropriate Statistical and Psychometric Models
  • Statistical concerns
  • Match of model to data structure
  • Violations of assumptions
  • Do random effects models cheat? (a minimal
    model-fitting sketch follows this list)
  • How do we integrate results from alternate
    assessments?
  • What is the sample, and what is the population?
  • Different models needed for different purposes
  • Meeting growth targets
  • Tracking achievement gaps
  • Primary research
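As one concrete, deliberately simplified illustration of matching a model to a nested data structure, the sketch below fits a random-intercept, random-slope growth model to simulated student-by-year data with statsmodels; everything in it, including the column names and parameter values, is an assumption for illustration.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)

# Simulated long-format data (hypothetical): 200 students, each tested in three
# consecutive years, with student-specific starting points and growth rates.
n_students, n_years = 200, 3
student = np.repeat(np.arange(n_students), n_years)
year = np.tile(np.arange(n_years), n_students)
intercepts = rng.normal(300, 20, n_students)   # each student's starting score
slopes = rng.normal(40, 8, n_students)         # each student's annual growth
score = intercepts[student] + slopes[student] * year + rng.normal(0, 10, student.size)
data = pd.DataFrame({"student": student, "year": year, "score": score})

# Random-intercept, random-slope growth model: students vary in both where they
# start and how fast they grow around the average trajectory.
model = smf.mixedlm("score ~ year", data, groups=data["student"], re_formula="~year")
print(model.fit().summary())
```

Grouping by school instead of by student would yield the partially pooled school-level growth estimates that the question "do random effects models cheat?" is about, since each school's estimate is shrunk toward the overall mean.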

28
5. Appropriate Statistical and Psychometric Models
  • Statistical concerns
  • Are the models correlational or causal? The
    mandated data collection is correlational.
  • JM: The mandated policy uses are more causal.
    The descriptive statistics are used to label
    schools as in need of improvement, and if
    students are not achieving reasonable goals, it
    is hard to argue with this label. However, the
    distinction between schools in need of
    improvement and ineffective educators is unlikely
    to be either fathomed or appreciated by many
    people. The nature of NCLB consequences invites
    this unfounded interpretation.
  • SR: The statute provides substantial resources
    for professional development and instructional
    materials in order to help educators meet the
    extraordinary needs of the children they serve.

29
5. Appropriate Statistical and Psychometric
Models, continued
  • Unwarranted assumptions
  • No equating error
  • Vertical: Doran (2005)
  • Horizontal: not studied, but most assessments
    have only a few anchor items in common across
    years (a small simulation of the consequence
    follows this list)
  • Interval level scale
  • If using scale scores, most models assume equal
    interval measurement
  • Psychometrically suspect
  • Effects not well studied
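A small simulation (invented numbers, not from the presentation) of why the "no equating error" assumption matters: an error in horizontal equating is shared by every student in a year, so it does not average away and shifts every school's apparent growth by roughly the full amount.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: 50 schools, 100 students each, a true mean gain of 30
# scale points everywhere, and student-level noise with SD 25.
n_schools, n_students = 50, 100
true_gain = 30.0
student_noise = rng.normal(0.0, 25.0, size=(n_schools, n_students))

# Suppose horizontal equating places this year's form 5 points too easy relative
# to last year's: every observed score, and hence every gain, is inflated by 5.
equating_error = 5.0

observed_gains = true_gain + student_noise + equating_error
school_means = observed_gains.mean(axis=1)

print(f"True mean gain:            {true_gain:.1f}")
print(f"Mean of school mean gains: {school_means.mean():.1f}")
# Every school's apparent growth is biased by roughly the full equating error,
# because the error is common to all students and does not average away.
```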

30
5. Appropriate Statistical and Psychometric
Models, continued
  • Unwarranted assumptions, continued
  • A single continuous scale on the same construct
    across grades (vertical or developmental scales)
  • Mathematical demonstrations (Martineau, 2004, in
    press)
  • We purposely build content shift into our
    assessments across grades
  • High correlations among sub-constructs do not
    take care of the problem
  • Students whose growth is occurring outside the
    curriculum-defined range for the grade are not
    measured well
  • Effects of prior schools/grades become attributed
    to later schools/grades
  • Practically significant effects of the
    misattributions occur in all reasonably
    conceivable assessment scenarios
  • Empirical validation (Lockwood et al., under peer
    review)
  • Subscales of a math assessment show greater
    variability within teacher across subscales than
    across teachers within subscale.
  • Low correlations in value-added estimates across
    subscales
  • The sub-content matters tremendously

31
5. Appropriate Statistical and Psychometric
Models, continued
  • Unwarranted assumptions, continued
  • We need to account for equating error
  • We need to study the effects of the
    interval-level measurement assumption and either
  • Validate the assumption, or
  • Not make the assumption
  • We need to either
  • Develop psychometric models that can account for
    change in content across grades, or
  • Not assume the same content across grades
  • Analytical models that avoid scale assumptions
  • Hill's value table approach (this conference)
  • Betebenner's transition matrix approach (2005)
  • Standards-based interpretations; can use baseline
    data (a toy value-table sketch follows below)
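To make the scale-free alternatives concrete, here is a toy value-table sketch in the spirit of the approaches cited above; the performance levels, point values, and student transitions are invented for illustration and do not reproduce any state's actual table.

```python
# Hypothetical value table: points awarded for each transition between
# performance levels from one year to the next. No vertical scale is needed;
# only each year's standards-based classification is used.
VALUE_TABLE = {
    "Below Basic": {"Below Basic": 0, "Basic": 50, "Proficient": 100, "Advanced": 100},
    "Basic":       {"Below Basic": 0, "Basic": 25, "Proficient": 100, "Advanced": 100},
    "Proficient":  {"Below Basic": 0, "Basic": 0,  "Proficient": 100, "Advanced": 100},
    "Advanced":    {"Below Basic": 0, "Basic": 0,  "Proficient": 100, "Advanced": 100},
}

def school_growth_index(transitions):
    """Average points per student for a list of (prior_level, current_level) pairs."""
    points = [VALUE_TABLE[prior][current] for prior, current in transitions]
    return sum(points) / len(points)

# Example: three students who moved up, stayed proficient, and dropped.
students = [("Below Basic", "Basic"),
            ("Proficient", "Proficient"),
            ("Basic", "Below Basic")]
print(school_growth_index(students))  # (50 + 100 + 0) / 3 = 50.0
```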

32
6. An Ongoing Program of Research
  • A turbulent field (in its adolescence, to quote
    Lissitz)
  • Large-scale implementation in a turbulent field
    requires extraordinary flexibility to keep up
    with the state of the art
  • And yet, too much flexibility can thwart useful
    interpretation of trend data

33
7. Consistency of Reports with Other Attributes
  • Responsive to instruction?
  • Understandable to stakeholders?
  • Grounded in policy aims?
  • Valid and reliable?

34
Setting standards for growth
  • What's reasonable?
  • vs.
  • What do we hope to accomplish?
  • What's fair?

35
Growth and school consequences
36
Conclusions
  • Can we add growth?
  • Yes!
  • Should we add growth?
  • Yes, where there is an evaluative framework tied
    to policy objectives, a systemic approach, and
    alignment with the values of educational research
  • Must we add growth?
  • An option, not a requirement, because of the
    extraordinary infrastructure required

37
Recommendations for Policymakers
  • Understand the basic differences between models
  • Run simulations with real data
  • Understand the limitations
  • Listen to practitioners
  • Listen to methodologists
  • Anticipate cost/benefits
  • Lack of stability corrupts meaning
  • Do not over-specify the details in statute
  • This field moves ahead quickly
  • Flexibility to implement advances is key

38
Recommendations for Accountability Implementation
Staff
  • State Directors: give your staff time to write it
    up!!
  • Require greater detail in the Technical Manuals
    that allows for comprehensive review of the
    procedures
  • Explain it (as much as you can) to your
    legislators and Congresspersons
  • Challenge assumptions
  • Status quo is good
  • Change is good
  • Resource assumptions
  • Claims of proponents

39
Recommendations for Technical Researchers
  • Validity need not conflict with transparency
  • Validity
  • Maintain sufficient complexity to produce valid
    results
  • Transparency for non-technical stakeholders
  • Simple, but accurate reports
  • Grounded interpretations
  • Transparency for technical stakeholders
  • Comprehensive documentation of the entire system,
    including psychometric and statistical models
  • Facilitation of replication
  • Facilitation of primary research on strengths and
    weaknesses

40
Recommendations for Technical Researchers
  • Pay systemic attention to
  • Assumptions of psychometric models
  • Assumptions of content standard models
  • Assumptions of statistical models
  • Think carefully about what the models can tell us
    and cannot tell us about instruction, curriculum,
    and student development
  • Develop simple graphical representations of the
    model and its important concepts for policymaker
    consumption
  • Become involved in public policy forums as a
    community lobby in order to promote appropriate
    interpretation of data.
  • We cannot give our cautions, wash our hands of
    how the data is used, and stand on the outside of
    the political process

41
Recommendations for All Stakeholders
  • Realize that with all of the high stakes
    surrounding accountability uses of student
    achievement data, there are forces that can work
    against community interests
  • Economic benefits, reputations, and other
    personal investments can cause proponents of
    specific systems to avoid scrupulous
    investigations of the shortcomings of those
    systems and/or the benefits of competing
    approaches
  • Willingness to be, and accountability for being,
    rigorously honest and open-minded about multiple
    approaches is an essential part of improving and
    evaluating growth-based accountability systems