Title: Alternative Assessment Approaches to Meeting Accountability Mandates: Issues and Initial Findings
1. Alternative Assessment Approaches to Meeting Accountability Mandates: Issues and Initial Findings
- Conference on Research Innovations in Early Intervention - Spring 2006
- Contact Kristie Pretti-Frontczak (kprettif_at_kent.edu) with questions/concerns
- http://fpsrv.dl.kent.edu/ecis/Web/Research/CRIEI%20Presentations.htm
2. Panel Participants
- Discussant
  - Diane Bricker, Professor Emeritus - University of Oregon
- Presenters
  - Mary McLean, Professor - University of Wisconsin-Milwaukee
  - Jennifer Grisham-Brown, Associate Professor - University of Kentucky
  - Kristie Pretti-Frontczak, Associate Professor - Kent State University
- Additional Discussants
  - Rena Hallam, Assistant Professor - University of Tennessee-Knoxville
  - Tony Ledet, Assistant Professor - Southeastern Louisiana University
  - Kristen Missall, Assistant Professor - University of Kentucky
3. Five Accountability Issues
- Creating an Unconnected Dual System
- Alignment with Standards
- Multi-Population Programs
- Authenticity
- Haste Makes Waste
4. Desired Results Projects
- California Department of Education
5. California Department of Education
- Child Development Division
- Special Education Division
- Desired Results System
6. Desired Results Developmental Profile (DRDP)
- An observational assessment instrument was developed at UCLA that could assess development relative to the four Desired Results for children in infant/toddler, preschool and after-school state-funded programs
7. At the Core of the DR System
- DRDP: Desired Results Developmental Profile, an observation-based tool
- DRDP access: adaptations for observing and reporting the progress of children with disabilities
8. Comparison of DRDP and DRDP access
- DRDP
  - Is funded by CDD
  - Is an observational instrument
  - Is used with children who are enrolled in CDD-funded programs
- DRDP access
  - Is funded by SED
  - Is the same tool as DRDP, but includes adaptations for children with disabilities
  - Is used for children with IFSPs or IEPs in CDE/SED programs
9. Change in measurement model (2002)
- Berkeley Evaluation and Assessment Research Center (BEAR)
- Observational assessment in typical environment (authentic assessment)
- Item Response Theory
  - (Multidimensional Random Coefficients Multinomial Logit Model)
10. Universally Designed Assessments
- Assessments that are designed and developed from the beginning to be accessible and valid for the widest range of students, including students with disabilities and English-language learners
- http://education.umn.edu/nceo
11. DRDP Pilots/Field Tests included children with disabilities
- 1999-2000 pilots included 333 children with disabilities (birth through five years)
- 2001 pilot included 264 children with disabilities (0-5)
- 2003 field test included 625 children with disabilities (0-5)
- 2005 calibration study included 880 children with disabilities (birth through five years)
12. Universal Design
- Remove all non-construct-oriented sensory, emotional and physical barriers to children demonstrating competence in a particular area
- Does the child have to say the names of five objects or can he communicate the names of objects (perhaps using an assistive device)?
13. Desired Results access
- Adaptations for children with disabilities have been developed so that the DRDP will more accurately reflect child abilities rather than the impact of disability
- The DRDP access includes a set of adaptations that allows children with disabilities to participate in the same assessment as their peers.
14. Core Adaptations
- Allow a child to use an augmentative communication device or communication system in place of spoken language.
- Allow a child to use an alternative mode to produce written language.
- Allow adequate time for a child who needs more time for moving, responding, or processing information.
- Provide the visual supports the child might need to see (lighting, visual contrast, or visual aids).
15. Core Adaptations
- Use assistive equipment or devices that the child typically uses in daily routines and activities.
- Ensure functional positioning for the child with a physical disability.
- Provide sensory support if needed.
- Allow for alternative response modes to complete a task.
16. Change in IDEA requirements
- 2002-03 APR: reporting developmental status
- IDEA 04 and the SPP: developmental progress, baselines and targets
17. IDEA 04 and the SPP
- Need to report developmental progress between entry and exit for all 3-, 4-, and 5-year-olds in three broad outcome areas
- Need to report progress in relationship to same-age peers
- Need to establish baseline data and targets for six years
18. DRDP access Meets Federal Requirements
19. Development of the Birth-to-Five Instrument
20. Desired Results access Birth-to-Five Instrument
- Data from previous field studies indicated:
  - The need for a continuous birth-to-five instrument
  - The need to closely examine the linkages between the DRDP Infant/Toddler and Preschool instruments
21. Desired Results access Birth-to-Five Instrument
- The DRDP access Birth-to-Five instrument:
  - Measures the same indicators as the DRDP
    - An Indicator is a developmental domain that shows progress towards a Desired Result
  - Will calibrate with the DRDP
  - Allows children with disabilities to be assessed across a birth-to-five continuum.
22. Current Studies: DRDP access
- Fall 2005: Time 1 calibration study
- Spring 2006: Time 2 calibration study
- Spring 2006: Typical child study
23. Organization of the DRDP
- The DRDP access is arranged as follows:
- Desired Results
- Indicators
- Measures
- Developmental Levels
- Descriptors
- Examples
24. Desired Results
- DR1: Children are Personally and Socially Competent
  - SELF: Children show self-awareness and a positive self-concept.
  - SOC: Children demonstrate effective social and interpersonal skills.
  - REG: Children demonstrate effective self-regulation in their behavior.
  - LANG: Children show growing abilities in communication and language.
25. Desired Results
- DR2: Children are Effective Learners
  - LRN: Children show interest, motivation, and persistence in their approaches to learning.
  - COG: Children show cognitive competence and problem-solving skills through play and daily activities.
  - MATH: Children demonstrate competence in real-life mathematical concepts.
  - LIT: Children demonstrate emerging literacy skills.
26. Desired Results
- DR3: Children Show Physical and Motor Competence
  - MOT: Children demonstrate an increased proficiency in motor skills.
- DR4: Children are Safe and Healthy
  - SH: Children show an emerging awareness and practice of safe and healthy behavior.
27. Mark the Highest Level of Mastery
1. Mark the Mastery Level
28. Internal Consistency: MML reliability for PreK Instrument
29. Special Education Preschool Demographics in California
- 43% White, 41% Latino, 8% African American, 7% Asian, 1% Native American, 1% Pacific Islander
30. Implications for the Instrument
- The DRDP access ensures that the child's primary mode of communication is used during the assessment (e.g., augmentative communication device, sign language, etc.)
- The DRDP access supports the use of home language.
31. Desired Results access Project: Guidelines for IEP Teams
- A one-page Decision-Making Guide was written to help IEP teams:
  - 1) Determine the assessment that is most appropriate for a child with disabilities: DRDP or DRDP access 0-5
  - 2) Identify adaptations for the service delivery environment and the observational assessments
32. Project LINK
- Jennifer Grisham-Brown, Rena Hallam
- Head Start/University Partnership grant
- Purpose: to build the capacity of Head Start programs to link child assessment and curriculum to support positive outcomes for preschool children
- Focus on mandated Head Start Child Outcomes
  - Concepts of Print
  - Oral Language
  - Phonological Awareness
  - Concepts of Number
33. Rationale for Project LINK
- Dissatisfaction with standardized assessment for preschoolers
- Disconnect between current assessment practices and Head Start Child Outcomes
- Recommended practices for assessment of young children
34. Project LINK Model
- Activity-Based Assessments
- Individual Learning Goals/Plans
- Group Curriculum Plans
- Ongoing Data Collection (Portfolio)
35. Activity-Based Protocols
- Identify assessment activities
  - Engaging/high-interest activities
  - Can assess an array of skills across domains
  - Part of school schedule
- Determine skills
  - Identify outcomes for program (Head Start Outcomes Framework)
  - Identify developmental continuum that evidences Head Start Outcomes (AEPS; Bricker, Pretti-Frontczak, Johnson, & Straka, 2002)
- Identify materials
  - Developmentally/age appropriate
  - Relate to skills to be assessed
  - Relate to assessment activities
36. Assessment Activities
- Play dough
- Snack
- Outdoor
- Manipulatives
- Book-Reading/Story
- Book About Me
- Dramatic Play
37. Assessment Activities Protocol Example
38. Individualized Child Plans
- Linked to Activity-Based Assessments
- Included family goals and input
- Focused on embedding skills into typical activities and routines
- Planned for ongoing documentation and collection of evidence related to individual child goals
39. Example of Individualized Child Plan
40. Curriculum Planning Form
- Integrated individualized plans
- Used curriculum webbing to support integration of learning areas
- Planned for ongoing data collection connected specifically to planned activities
41. Ongoing Monitoring: Portfolio Development
- Based on individualized goals
- Use of Work Sampling System (Meisels, Dichtelmiller, Jablon, & Marsden, 2001)
- Evidence that documents an individual child's progress in target area over time
42. Issues - Implementation
- Shifting paradigms at multiple levels (classroom, program)
- Moving from assessment days to assessment every day
- Teachers need intensive support to implement authentic assessment model
- Teachers need tools to support the link between assessment and curriculum
- Understanding the developmental continuum that undergirds child standards
- Understanding of this assessment in relationship to other purposes/types of assessment (screening, diagnostic, monitoring progress)
43. Challenges - Reporting
- Logistics: data entry and analysis
- Reporting data
- Comparing individual children over time
- Educating programs regarding the differences between criterion-referenced and norm-referenced assessments
- Exploring the use of aggregated criterion-referenced data
44. Does Authentic Assessment Yield Reliable and Valid Data?
- Need to determine the reliability and validity of this type of assessment model
- Designed and implemented set of three studies
  - Inter-rater reliability
  - Fidelity
  - Concurrent Validity
45. Inter-Rater Reliability
- Subjects
  - 7 Head Start Teachers
  - 7 Head Start Teaching Assistants
- Method
  - Practiced scoring AEPS items from video
  - Scored AEPS items; checked against master score provided by author
- Results
  - 7 of 7 teachers reached reliability at 80% or higher (range 85%-93%)
  - 5 of 7 teaching assistants reached reliability at 80% or higher (range 75%-90%)
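
The reliability check described above (scoring items and comparing them with a master score against an 80% criterion) can be sketched as simple percent agreement. This is a minimal illustration, not the study's scoring procedure: the function names, the 0/1/2 item scores, and the ten-item example are all hypothetical, and the slides do not state how agreement was computed.

```python
def percent_agreement(rater_scores, master_scores):
    """Return the percentage of items on which the rater matches the master score."""
    if len(rater_scores) != len(master_scores):
        raise ValueError("rater and master score lists must be the same length")
    if not master_scores:
        raise ValueError("at least one scored item is required")
    matches = sum(r == m for r, m in zip(rater_scores, master_scores))
    return 100.0 * matches / len(master_scores)


def meets_criterion(rater_scores, master_scores, criterion=80.0):
    """True if the rater reaches the reliability criterion (assumed here: 80%)."""
    return percent_agreement(rater_scores, master_scores) >= criterion


# Hypothetical item scores (not data from the study).
master = [2, 1, 0, 2, 2, 1, 0, 1, 2, 0]
teacher = [2, 1, 0, 2, 1, 1, 0, 1, 2, 0]  # differs from the master on one item

agreement = percent_agreement(teacher, master)
print(f"{agreement:.1f}% agreement, criterion met: {meets_criterion(teacher, master)}")
```

Simple percent agreement does not adjust for agreement expected by chance, which is one reason chance-corrected indices are often reported alongside it.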
46. Fidelity Study
- Subjects
  - Six (6) Head Start teachers/teaching assistants who reached 80% or higher on the inter-rater reliability study
- Method
  - Used fidelity measure to check teachers' implementation of authentic assessment within seven (7) planned activities
  - Six (6) Authentic Assessment Variables
    - Set up and preparation, decision making, materials, choice, embedding, and procedure
- Procedures
  - Observed participants collecting AEPS data during each of 7 small-group activities
  - Observed participants 7 times for up to 10 minutes per activity
47. Average Ratings on Six Authentic Assessment Variables across Observations and Activities by Teacher
48. Average Ratings on Six Authentic Assessment Variables across Observations for Seven Different Activities
49. Concurrent Validity
- Purpose
  - To examine the concurrent validity between a traditional norm-referenced standardized test (BDI-2) and a curriculum-based assessment (AEPS)
- Subjects
  - 31 Head Start children
  - Ranged in age from 48 months to 67 months (M = 60.68, SD = 4.65)
- Methods
  - Six trained graduate students administered the BDI-2 and six trained Head Start teachers administered the AEPS during a two-week period.
  - Conducted seven (7) bivariate 2-tailed correlations (Pearson's and Spearman's)
- Results
  - Five correlations suggested a moderate to good relationship between the BDI-2 and the AEPS
  - Two correlations suggested a fair relationship between the BDI-2 and the AEPS
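
The two coefficients named in the methods can be illustrated with a short standard-library sketch. It uses the textbook definitions (Pearson's r from centered sums of squares; Spearman's rho as Pearson's r on average ranks). The function names and the five paired scores are hypothetical, not BDI-2 or AEPS data, and no significance test is included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share one averaged rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson's r computed on average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical paired scores for five children (not study data).
test_a = [1, 2, 3, 4, 5]
test_b = [2, 4, 5, 4, 5]
print(round(pearson_r(test_a, test_b), 3), round(spearman_rho(test_a, test_b), 3))
```

Spearman's rho depends only on rank order, so it is the usual choice when scores are ordinal or not normally distributed.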
50. Concurrent Validity Results
- Adaptive
  - Self Care items from the BDI (M = 66.03, SD = 6.67) were moderately correlated with Adaptive items from the AEPS (M = 62.03, SD = 13.57), r = .57, n = 31, p = .01.
- Social
  - Personal Social items from the BDI (M = 175.15, SD = 22.74) had a fair correlation with Social items from the AEPS (M = 80.06, SD = 16.33), r = .50, n = 31, p = .01.
- Communication
  - Communication items from the BDI (M = 121.06, SD = 16.22) were moderately correlated with Social Communication items from the AEPS (M = 88.61, SD = 14.20), r = .54, n = 31, p = .01.
51. Concurrent Validity Results, Continued
- Motor
  - Gross Motor items from the BDI (M = 82.76, SD = 4.70) had a fair correlation with Gross Motor items from the AEPS (M = 30.10, SD = 6.62), r = .48, n = 31, p = .01.
  - Fine Motor items from the BDI (M = 52.45, SD = 5.30) were moderately correlated with Fine Motor items from the AEPS (M = 26.39, SD = 5.68), r = .58, n = 31, p = .01.
  - Perceptual Motor items from the BDI (M = 27.73, SD = 3.63) were moderately correlated with Fine Motor items from the AEPS (M = 26.39, SD = 5.68), r = .58, n = 31, p = .01.
- Cognitive
  - Cognitive items from the BDI (M = 135.85, SD = 23.44) were moderately correlated with Cognitive items from the AEPS (M = 81.26, SD = 24.26), r = .71, n = 31, p = .01.
52. Synthesis and Recommendations
- Rigorous implementation of curriculum-based assessments (CBAs) requires extensive professional development and support of instructional staff
- Findings suggest that CBAs, when implemented with rigor, have the potential to provide meaningful child progress data for program evaluation and accountability purposes
53. Teachers' Accuracy in Assessing Preschoolers' Cognitive Skills Using Observational Assessment
- Kurt Kowalski
  - California State University San Bernardino
- Rhonda Douglas-Brown
  - University of Cincinnati
- Kristie Pretti-Frontczak
  - Kent State University
54. Need
- Because observational assessments are increasingly being used for accountability purposes, which require a higher standard of precision than lower-stakes assessments (Shepard et al., 1998; Schweinhart, 2001), research is urgently needed to investigate the accuracy of these types of measures
55. Current Study
- Study designed to investigate the accuracy of teachers' assessments of children's skills and abilities using observational assessment
- Examined the degree of agreement between assessments of children's Language and Literacy and Early Math skills made by their teachers using an observational assessment instrument and assessments of the same skills made by researchers using a demand performance instrument.
56. Measures
- Observational Measure - Galileo System Scales (Bergan, Bergan, Rattee, & Feld, 2001)
  - Language and Literacy-Revised, Ages 3-5 (n = 68 items, full scale)
  - Early Math-Revised, Ages 3-5 (n = 68 items, full scale)
- Demand Performance Measure
  - Items that could be readily assessed in individual, one-session, performance-based interviews with children were selected from the Galileo System scales and converted into demand performance tasks to create two performance measures
    - Language and Literacy (n = 21 items)
    - Early Math (n = 23 items)
  - Items varied in difficulty and knowledge domain assessed.
  - Standardized sets of materials for administering tasks were also developed (e.g., index cards with printed objects, books, manipulatives, etc.).
  - The performance measures were piloted with preschoolers in two regions of the state and revised accordingly.
57. Examples of Items
- Language and Literacy
  - The Galileo System
    - Writes using scribble form.
    - Writes using scribble form with some letter-like shapes.
    - Writes her/his name without assistance.
  - Performance Measure
    - "Here's a piece of paper and a crayon. Write your name on this paper for me."
- Early Math
  - The Galileo System
    - Counts to find how many are in a group < 6.
    - Counts to find how many are in a group < 11.
    - Makes two equal groups of objects (e.g., blocks).
  - Performance Measure
    - Present child with 3 counting bears. "How many things do we have here?"
    - Present child with 10 counting bears. "Now how many are there?"
    - Present 10 counting bears. Line up in straight row. "Can you split these bears into 2 piles so that you have the same number of bears that I have?" Prompt, if necessary: "Let's share. Make 2 groups so that I get the same number of bears that you get."
58. Characteristics of the Performance Measures
- Scale Reliability
  - Language and Literacy (α = .81)
  - Early Math (α = .75)
- Inter-rater Reliability
  - Second scorers observed and coded the performance of children for approximately 23% of the sample (n = 28).
  - Language and Literacy
    - 93.4% agreement (range 82.1%-100%)
    - 87% corrected for chance agreement (range 64.2%-100%)
    - Kappa = .78 (range .34-1.00)
  - Early Math
    - 93.5% agreement (range 82.1%-100%)
    - 87% corrected for chance agreement (range 64.2%-100%)
    - Kappa = .79 (range .46-1.00)
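
For readers unfamiliar with the kappa values reported above, the following sketch computes Cohen's kappa for two scorers from its standard definition, kappa = (p_o - p_e) / (1 - p_e), where p_o is observed agreement and p_e is the agreement expected by chance from each scorer's marginal proportions. The slides do not say how kappa or the "corrected for chance" percentage was calculated, so this is an assumption-laden illustration: the function name and the ten pass/fail codes are hypothetical.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items into categories."""
    n = len(rater_a)
    if n == 0 or n != len(rater_b):
        raise ValueError("both raters must code the same non-empty set of items")
    categories = set(rater_a) | set(rater_b)
    # Observed proportion of items on which the raters agree.
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of each rater's marginal proportion per category.
    p_e = sum((rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories)
    if p_e == 1.0:
        return 1.0  # both raters used one identical category for every item
    return (p_o - p_e) / (1 - p_e)


# Hypothetical pass (1) / fail (0) codes from two scorers (not study data).
scorer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
scorer_2 = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1]
print(round(cohens_kappa(scorer_1, scorer_2), 3))
```

In this example the scorers agree on 8 of 10 items (80%), yet kappa is noticeably lower because about half of that agreement would be expected by chance given the marginals.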
59. Procedures
- Trained research assistants visited sites across the state,
  - collected data teachers entered into the relevant observation scales of the Galileo System, and
  - administered the Performance Measures.
- In order to ensure that the most up-to-date information was obtained from the Galileo System, data were collected during the 2 weeks prior to and following a state-mandated entry date.
- Order of administration of Performance Measures was counterbalanced across assessment domains.
60. Participants
- 122 children
  - ranged in age from 3 to 6 years (M = 4 years, 11 months)
  - 100 in state-funded Head Start programs
- 66 teachers
- Areas in which children are served
  - 47% urban
  - 41% suburban/small town
  - 11% rural
- Representation by use of the Galileo System
  - 38 first-year users
  - 32 second-year users
  - 23 third-year users
61. Results
- Correlations and measures of agreement between the Galileo System and the Performance Measures were moderate.
- Language and Literacy (n = 19 items)
  - r(n = 122) = .60, p < .001
  - 71% agreement (range 51.6%-97.5%)
  - 41% corrected for chance agreement (range 3.2%-95%)
  - Kappa = .18 (range -.08 to .54)
- Early Math (n = 22 items)
  - r(n = 122) = .47, p < .001
  - 66% agreement (range 41%-95.8%)
  - 33% corrected for chance agreement (range -1.8% to 91.6%)
  - Kappa = .11 (range -.08 to .44)
- Agreement for the Language and Literacy scale was significantly greater than for the Early Math scale, p < .001.
62. Discordance between Measures
- Types of discordance
  - Underestimations
    - Instances when teachers recorded that children had NOT learned a skill or ability, but the child demonstrated it during the Performance Measure
    - Greater for Language and Literacy (M = 19%) in comparison to Early Math (M = 17%), p < .05.
    - This was true even though the overall percent of disagreements was greater for Early Math (M = 34%) in comparison to Language and Literacy (M = 29%), p < .001.
  - Overestimations
    - Instances when teachers indicated that children HAD learned a skill or ability, but the child did NOT demonstrate it during the Performance Measure.
    - Greater for Early Math (M = 17%) than Language and Literacy (M = 10%), p < .001.
- When discordance occurred for Language and Literacy, it was more likely to be due to underestimations than overestimations (p < .001).
- For Early Math, both types of discordance were equally likely to occur.
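
The two types of discordance defined above can be tallied item by item. The sketch below follows those definitions directly (teacher says not learned but child demonstrates it = underestimation; teacher says learned but child does not demonstrate it = overestimation). The function name, the boolean coding, and the ten example items are hypothetical and are not the study's data or analysis code.

```python
def discordance_summary(teacher_learned, child_demonstrated):
    """Percent agreement, underestimation, and overestimation across paired items.

    teacher_learned[i]    -- True if the teacher recorded the skill as learned
    child_demonstrated[i] -- True if the child showed the skill on the performance task
    """
    n = len(teacher_learned)
    if n == 0 or n != len(child_demonstrated):
        raise ValueError("both lists must describe the same non-empty set of items")
    pairs = list(zip(teacher_learned, child_demonstrated))
    under = sum(1 for t, c in pairs if not t and c)  # teacher no, child yes
    over = sum(1 for t, c in pairs if t and not c)   # teacher yes, child no
    agree = n - under - over
    return {
        "agreement_pct": 100.0 * agree / n,
        "underestimation_pct": 100.0 * under / n,
        "overestimation_pct": 100.0 * over / n,
    }


# Hypothetical records for ten items (not study data).
teacher = [True, True, False, False, True, False, True, True, False, True]
child = [True, False, True, False, True, True, True, True, False, True]
print(discordance_summary(teacher, child))
```

By construction the three percentages sum to 100, which matches the pattern in the reported means (for example, 19% underestimation plus 10% overestimation gives the 29% overall disagreement for Language and Literacy).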
63. Discordance between Measures: Language and Literacy
64. Discordance between Measures: Early Math
65. Conclusions
- Overall, levels of concordance were moderate.
  - In the domain in which teachers were most conservative in attributing abilities to children, Language and Literacy, there was the greatest agreement between data teachers entered into the Galileo System and the Performance Measure (71%).
  - In the domain in which teachers were most generous in attributing abilities to children, Early Math, there was the least agreement between the data teachers entered into the Galileo System and the Performance Measure (66%).
- Reliability
  - Teachers using the naturalistic observation instrument (the Galileo System) are not providing inflated estimates of children's skills and abilities.
  - However, they may be underestimating children's skills and abilities in the domain of Language and Literacy.
66. Implications
- The acceptability of the moderate degree of concordance as an indicator of reliability in the current study may depend on the purpose of the assessment.
  - Planning
  - Tracking child progress
  - Accountability
- Reliability of teachers' assessment of preschoolers' cognitive developmental skills and abilities using naturalistic observation may improve by:
  - using assessments with strong psychometric properties and stated criteria;
  - providing teachers with instruction (e.g., a manual) in how to set up the classroom environment to properly observe skills and abilities that were consistently underestimated;
  - conducting inter-rater reliability training sessions to ensure the intended interpretation of items; and
  - providing teachers with more professional development opportunities.