Title: Standardized Testing and California Schools
1Standardized Testing and California Schools: API Scores
2Let's Start Thinking
- 1. Where is the best place to examine direct data about student learning?
- 2. List at least three advantages and three disadvantages of using standardized assessment tools.
- 3. List at least three advantages and three disadvantages of using local or homegrown assessment tools.
- 4. What are some advantages of embedded assessment?
3What's the Deal with Testing?
- As a society, we like numbers. If something can be quantified, it is viewed as valid or more scientific. If it cannot be quantified, we view the activity with suspicion.
- Machine scoring of a test is fast, efficient, and cheap.
- Hand scoring of a test is slow, time consuming, and very expensive.
4Lessons from the Past
- Mass testing came about in the late 1800s and early 1900s.
- It was originally used to decide who was qualified to attend universities and who was bound to work in factories.
- It attempted to model the efficient factory methods of Henry Ford: a test should be easy, cheap, and work for everyone.
- Early IQ tests (the Army Alpha and Beta tests) were developed for the U.S. Army as a way to decide the career path of new recruits.
- Early tests were also developed to determine which immigrants could enter the U.S.
5Standardized Tests: What's the Difference?
- Criterion-Referenced Test
- Criterion-referenced tests, also called mastery tests, compare a person's performance to a set of objectives. Anyone who meets the criterion can get a high score.
- Everyone knows what the benchmarks / objectives are and can attain mastery to meet them.
- It is possible for ALL the test takers to achieve 100% mastery.
6Standardized Tests: What's the Difference?
- Norm-Referenced Test
- Norm-referenced tests compare an individual's performance with the performance of others.
- They are designed to yield a normal curve, with 50% of test takers scoring above the 50th percentile and 50% scoring below it, so half the test takers MUST pass and half the test takers MUST fail.
- The test makers design the test with questions that MOST people will get incorrect.
- If too many people get a question correct, or too many score well, then test questions are thrown out until they achieve a normal curve again.
7Interpreting Test Scores (some definitions)
- Raw score. This is the number of items the student answered correctly. It is used to calculate the other, more useful scores.
- Stanine. One of nine sections of the normal curve, each half a standard deviation wide except the two end sections. Stanines can be easily averaged and compared from test to test, but are less precise than other scores.
- Normal curve equivalent (NCE). For these scores, the normal curve is divided into equal units ranging from 1 to 99, with an average of 50. These can be averaged and compared from test to test or year to year.
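The relationship between these score types can be sketched in a few lines of Python. This is only an illustration, not any publisher's scoring procedure. It assumes the conventional NCE scaling (mean 50, standard deviation about 21.06, chosen so that NCEs 1, 50, and 99 line up with percentiles 1, 50, and 99) and the commonly cited stanine cut points at cumulative percentages 4, 11, 23, 40, 60, 77, 89, and 96. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import NormalDist

# Assumed cumulative-percentage boundaries between stanines 1..9.
STANINE_CUTS = [4, 11, 23, 40, 60, 77, 89, 96]

# Assumed NCE standard deviation, so NCE 1 and 99 match percentiles 1 and 99.
NCE_SD = 21.06


def percentile_to_nce(percentile: float) -> float:
    """Convert a percentile rank (strictly between 0 and 100) to an NCE."""
    z = NormalDist().inv_cdf(percentile / 100)  # standard normal z-score
    return 50 + NCE_SD * z


def percentile_to_stanine(percentile: float) -> int:
    """Map a percentile rank to a stanine from 1 (lowest) to 9 (highest)."""
    return bisect_right(STANINE_CUTS, percentile) + 1


if __name__ == "__main__":
    for p in (10, 50, 90):
        print(p, round(percentile_to_nce(p), 1), percentile_to_stanine(p))
```

Percentiles bunch up near the middle of the curve, which is why the equal-interval NCE scale is the one suited to averaging.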
8Normal Curve
- Half of the test takers are grouped into the passing region of the curve and half into the failing region of the curve.
- So by definition, half the test takers MUST fail, i.e., be below the 50th percentile.
9State/School Goals
- So when a school says that its goal is to have 70% of its students above the 50th percentile, is this possible?
- Well, yes, but it would mean that another school of the same size would have to have 70% of its students below the 50th percentile.
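The arithmetic behind this trade-off can be sketched as follows. The sketch assumes that percentile ranks are computed within the full group of test takers being compared, so exactly half of that group sits above the 50th percentile. The function name and the enrollment figures are hypothetical.

```python
def share_above_median_elsewhere(school_size: int,
                                 school_share_above: float,
                                 total_size: int) -> float:
    """Share of students above the 50th percentile in all OTHER schools.

    Assumes half of the whole test-taking population is above the
    50th percentile by construction.
    """
    above_total = total_size / 2                      # half of everyone
    above_here = school_size * school_share_above     # this school's share
    rest = total_size - school_size                   # everyone else
    return (above_total - above_here) / rest


if __name__ == "__main__":
    # Two equally sized schools: 70% above in one forces 30% above in the other.
    print(share_above_median_elsewhere(1000, 0.70, 2000))
    # In a larger population the shortfall is spread across many schools.
    print(share_above_median_elsewhere(1000, 0.70, 10000))
```

With only two equally sized schools, the second school ends up with 70% of its students below the 50th percentile. In a larger population the same shortfall is spread thinly across everyone else.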
10Closer to Home: San Diego City Schools (SDCS)
- In 2001, SDCS officials reported that as a district (second largest in the state), they had 66% of their students above the 50th percentile on the SAT/9 test for 2000.
- The news media reported the shame of SDCS because one-third of their students were below the 50th percentile.
- Was this a fair report?
11MEASUREMENT AND EVALUATION: CRITERION- VERSUS NORM-REFERENCED TESTING
- Many educators and members of the public fail to grasp the distinctions between criterion-referenced and norm-referenced testing. It is common to hear the two types of testing referred to as if they serve the same purposes or share the same characteristics. Much confusion can be eliminated if the basic differences are understood.
- The following is adapted from Popham, J. W. (1975). Educational evaluation. Englewood Cliffs, New Jersey: Prentice-Hall, Inc.
12MEASUREMENT AND EVALUATION: CRITERION- VERSUS NORM-REFERENCED TESTING
Dimension: Purpose
- Criterion-Referenced Tests: To determine whether each student has achieved specific skills or concepts. To find out how much students know before instruction begins and after it has finished.
- Norm-Referenced Tests: To rank each student with respect to the achievement of others in broad areas of knowledge. To discriminate between high and low achievers.
13MEASUREMENT AND EVALUATION: CRITERION- VERSUS NORM-REFERENCED TESTING
Dimension: Content
- Criterion-Referenced Tests: Measures specific skills which make up a designated curriculum. These skills are identified by teachers and curriculum experts. Each skill is expressed as an instructional objective.
- Norm-Referenced Tests: Measures broad skill areas sampled from a variety of textbooks, syllabi, and the judgments of curriculum experts.
14MEASUREMENT AND EVALUATION: CRITERION- VERSUS NORM-REFERENCED TESTING
Dimension: Item Characteristics
- Criterion-Referenced Tests: Each skill is tested by at least four items in order to obtain an adequate sample of student performance and to minimize the effect of guessing. The items which test any given skill are parallel in difficulty.
- Norm-Referenced Tests: Each skill is usually tested by fewer than four items. Items vary in difficulty. Items are selected that discriminate between high and low achievers.
15MEASUREMENT AND EVALUATION: CRITERION- VERSUS NORM-REFERENCED TESTING
Dimension: Score Interpretation
- Criterion-Referenced Tests: Each individual is compared with a preset standard for acceptable achievement. The performance of other examinees is irrelevant. A student's score is usually expressed as a percentage. Student achievement is reported for individual skills.
- Norm-Referenced Tests: Each individual is compared with other examinees and assigned a score, usually expressed as a percentile, a grade equivalent score, or a stanine. Student achievement is reported for broad skill areas, although some norm-referenced tests do report student achievement for individual skills.
16Tests Currently Used in California
- California Achievement Test, 6th Edition (CAT/6): National Norm-Referenced Test
- California Standards Test (CST): State Norm-Referenced Test with Scaled Scores
- Golden State Exam: Criterion-Referenced Test
- CA High School Exit Exam (CA-HSEE): Criterion-Referenced Test
17Testing Case In Point
18Testing Case In Point
- In this scenario we will use a fictitious norm-referenced test being given at a single high school.
19Testing Case In Point
- John and his fellow students at Anywhere High School are given the Let's Achieve Test, version 1 (LAT/1).
- The LAT/1 is a norm-referenced test.
20Testing Case In Point
- John does not perform well on the test, compared to the other test takers.
- He scores below the 50th percentile and is classified as below grade level.
- John spends the next school year getting extra tutoring, staying after school, and going to Saturday tutoring sessions.
21Testing Case In Point
- The following school year on the LAT/1, John performs better than he did the previous year.
- However, because of a school-wide focus on the test, all the other students in the school also perform better.
- As a result, John's norm-referenced test score is still below the 50th percentile and he is still classified as below grade level.
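John's situation can be reproduced with a small sketch. The scores below are invented for illustration, and the percentile rank is computed against the other test takers at the school, as in the scenario. The `percentile_rank` helper is hypothetical and uses one common definition (percent of scores strictly below the given score).

```python
def percentile_rank(score: float, all_scores: list[float]) -> float:
    """Percent of test takers who scored strictly below `score`."""
    below = sum(1 for s in all_scores if s < score)
    return 100 * below / len(all_scores)


# Hypothetical raw scores for ten students in year 1 (John scored 50).
year1 = [40, 45, 50, 55, 60, 65, 70, 75, 80, 85]
john_y1 = 50

# In year 2 every student, John included, improves by 10 raw-score points.
year2 = [s + 10 for s in year1]
john_y2 = john_y1 + 10

if __name__ == "__main__":
    print(john_y1, percentile_rank(john_y1, year1))
    print(john_y2, percentile_rank(john_y2, year2))
```

John's raw score rises from 50 to 60, but his percentile rank is the same in both years because everyone around him improved by the same amount.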
22Academic Performance Index (API)
- The API score was created to provide a systematic method to rank-order schools based on a number of criteria. It is intended to measure the academic growth and performance of a school. Each school receives a rank compared to ALL other schools in the state and a second ranking comparing it to SIMILAR schools around the state.
23Early Proposed API Criteria (1999)
- Test Results (SAT/9): 60% of score
- Attendance Rates
- Graduation Rates
- Other statewide test results (GSE, CA-HSEE)
- From 1999 to 2002, ONLY the SAT/9 test results were used to calculate 100% of a school's API score.
24Current API Criteria (baseline set in 2002)
- California Achievement Test (CAT/6): about 12% of score. Includes mathematics, reading, language, science.
- California Standards Test (CST): about 73% of score. Includes mathematics, science, language arts, social science.
- CA High School Exit Exam (CA-HSEE): about 15% of score.
- Eventually API scores will also include graduation and attendance rates from schools as part of the overall score.
25Consider This
- So, does this system adequately measure the success of CA students?
- Does it reflect the learning that is happening in CA classrooms?
26Some Questions
- What are the appropriate uses of norm-referenced tests? Criterion-referenced tests?
- How should these tests be used at the state/district/school level?
- What role does testing play in looking at school performance? Student performance? Teacher performance?
27The Real Question We Should Ask
- Testing is a reality that is here to stay.
- It has been legislated by the state of CA under the STAR system and by the federal government under the NCLB Act.
- So we should really be asking: How do we use these tools to support students and their learning in CA schools?