Title: Evaluating educational technology outcomes: Stories from the trenches using the "error model" of research
1. Evaluating educational technology outcomes: Stories from the trenches using the "error model" of research
Jason Ravitz <jason@bie.org>, Buck Institute for Education, www.bie.org
BEAR Center Seminar, November 19, 2002, University of California, Berkeley
- "We may not have a sophisticated aqueduct system, but I have buckets on the ground to catch the rain." (Art Wolinsky, Teacher)
Work funded by the J.A. and Kathryn Albertson Foundation, www.jkaf.org
2. The problem: How do we know if technology is having an impact on teaching and learning in schools?
- "The area of technology research that is regarded as both most important and most poorly addressed in general practice: the investigation of the effects of technology-enabled innovations on student learning." (Haertel & Means, 2000)
3. Technology-enabled innovations
- Is there an aggregate effect of technology after all these years? Is Cuban still right? (Becker & Ravitz, 2001)
- Technology is only one variable: its benefits are tied up in the circumstances and backgrounds of teachers and learners
- When does technology have a positive impact?
- How general is a positive experience with technology?
4. Program Evaluation
- "The systematic process of asking critical questions, collecting appropriate information, analyzing, interpreting and using the information in order to improve programs and be accountable for positive, equitable results and resources invested."
- Source: Univ. of Wisconsin-Extension, Program Development and Evaluation
- Research with value added!
5. Purposes of Evaluation (Mark, Henry & Julnes, 2000)
- 1. Assessment of merit and worth
- 2. Program and organizational improvement
- 3. Oversight and compliance
- 4. Knowledge development
- Without knowledge development, the others are
probably meaningless. You have to understand
what you are evaluating and avoid erroneous
conclusions.
6. Error Model of Research
- An elegant approach that provides the foundation for scientific inquiry
- Many statistics have a literal "percent reduction in error" interpretation
- Helps us understand:
- Research Design
- Inference
- Prediction
- Measurement (Katzer, 1981)
7. "Research is conducted to learn something about the world. Acknowledging and controlling for error allows this goal to be achieved." (Katzer, p. 70)
- What you see = A (truth) + B (bias) + C (noise)
- "Uncontrolled observations are likely to lead to erroneous conclusions" (p. 69)
- Anyone can collect data. What matters is collecting data you can believe.
- Maximize the truth by minimizing error (bias and noise); do this through methods and reasoning
8. Error Model & Research Design
- Error is what prevents you from seeing the truth
- What you see = Truth + Error
- Error consists of systematic error (bias) and random error (noise)
- What you see = Truth + Bias + Noise
- By controlling for error (bias and noise) you improve your knowledge (see the sketch below)
9. Process of Inquiry
- Ask a good question (the hardest part!)
- Identify EXISTING sources of information (literature review), including known sources of error!
- Plan data collection to address known sources of error (better than randomization!)
- Plan to address remaining error and unknown sources of error (randomization, experimental design)
10. Error, Understanding and Prediction
- Contributions to the field can be understood as reduction of error.
- This can be expressed mathematically, using the familiar and generic formula for percent reduction in error (PRE):
- PRE = (Error without knowledge - Error with knowledge) / Error without knowledge
- Example statistics with literal PRE interpretations: correlation R², regression R², standardized residuals (variance formula), Guttman's lambda, gamma, Yule's Q, Goodman & Kruskal's tau. A worked sketch follows.
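Here is a worked sketch of the formula, using regression R² as the example statistic; the six (x, y) pairs are made up purely to show the computation.

```python
# Percent reduction in error (PRE), with R-squared as the example.
# "Without knowledge": predict everyone at the mean of y.
# "With knowledge": predict y from x with a one-variable regression.
x = [1, 2, 3, 4, 5, 6]              # e.g., hours of weekly computer use (made up)
y = [2.0, 2.5, 3.5, 3.0, 4.5, 5.0]  # e.g., a skill score (made up)

n = len(x)
mx, my = sum(x) / n, sum(y) / n

# Least-squares slope and intercept
slope = (sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
         / sum((xi - mx) ** 2 for xi in x))
intercept = my - slope * mx

err_without = sum((yi - my) ** 2 for yi in y)  # squared error, mean-only guess
err_with = sum((yi - (intercept + slope * xi)) ** 2
               for xi, yi in zip(x, y))        # squared error, regression guess

pre = (err_without - err_with) / err_without
print(f"PRE (= R-squared) = {pre:.2f}")  # share of error removed by knowing x
```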
11. Relationship defined in error terms
- A relationship is present between two variables when knowledge of one allows you to make a better prediction of the other than you could make without that knowledge.
- Statistics tell you:
- Is there a relationship? What direction?
- How strong? How likely is it caused by chance?
- Reason tells you:
- What does it mean? Is it important? Is it biased?
12. Researchers have to address
- ACCURACY: are the answers and results correct? And if so,
- GENERALITY: to what people, objects, events, times or conditions do the answers apply? (Katzer, 1981)
- Both are prone to error, and both can be addressed empirically and through reason.
- Possible technology questions:
- Under what conditions (designed or not) is it useful or not? (Accuracy)
- How widely can the findings be applied? (Generality)
13. There is an inverse relationship between error and knowledge
- (Figure: Ravitz, 2002)
14. The Prediction Game
- The prediction game is an easy way to think about statistics and the research process using the error model. It demonstrates this guiding principle:
- The better your information and knowledge about a topic, the fewer errors you will make.
- Note: This game does not mean you have to go around predicting things. It means that if you were making an important prediction, you would want to consider possible sources of error and gather information to help you avoid making these errors!
15. Height Example
- Imagine this: across the room there is a person whose identity is concealed by a dark screen. You know nothing about who the screen conceals, and you cannot see the person at all.
- If you had to make your best guess, what strategy would you use to guess this person's height? Remember, you know nothing about this person.
- This is the same as making your best guess about the impact of technology in schools if you had no data.
16. What information do you want?
- What information would you want in order to make an educated guess about a stranger's height? Would you want to know if the person is:
- male or female?
- how tall their parents are?
- their weight?
- whether they want to be a professional basketball player?
- How confident are you that this information would help you guess this person's height more accurately?
17. Assessing your knowledge
- What if you found out the person is only six years old?
- The usefulness of one piece of information can depend on another. In this case, knowing age is rather essential.
- With more knowledge about the person (say, age and weight) you would improve your accuracy, reducing the error in your prediction. The game is simulated in the sketch below.
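The game can be simulated directly. This sketch assumes an invented population, half six-year-olds (around 115 cm) and half adults (around 170 cm); knowing the age group lets you guess that group's mean instead of the overall mean.

```python
# The prediction game: guess a hidden person's height, with and without
# one useful piece of information (age group). Invented population.
import random

random.seed(2)

def person():
    """Half the population are six-year-olds, half are adults (heights in cm)."""
    if random.random() < 0.5:
        return ("child", random.gauss(115, 5))
    return ("adult", random.gauss(170, 9))

people = [person() for _ in range(10_000)]
heights = [h for _, h in people]
overall_mean = sum(heights) / len(heights)

# No knowledge: best guess is the overall mean, for everyone.
err_without = sum((h - overall_mean) ** 2 for h in heights)

# With knowledge of age group: best guess is that group's mean.
group_means = {}
for g in ("child", "adult"):
    hs = [h for grp, h in people if grp == g]
    group_means[g] = sum(hs) / len(hs)
err_with = sum((h - group_means[g]) ** 2 for g, h in people)

pre = (err_without - err_with) / err_without
print(f"PRE from knowing the age group: {pre:.0%}")  # most of the error is gone
```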
18. A textbook example
- How would you judge the value of a textbook for your students?
- By looking at the cover?
- Reading the table of contents?
- Reading a few key sections?
- As you collected more information you would feel more confident in making a judgment about the quality of the book.
- Additional information:
- Experiences of others
- Intended uses
- Actual uses in the classroom
- Who uses the textbook, and who does it help?
- Reading level
- Cultural perspective
- Historical perspective
- Scientific perspective
- Conclusion: It may be effective for some purposes and not for others. If you had different criteria for quality (biases) you might draw different conclusions. The more you know about the textbook (and potential sources of bias), the less error you will make in judging its value.
19. Error Model & Research Design
- Use the literature review to identify sources of bias
- What information would you want to have?
- Control for systematic bias
- How could you have more confidence?
- Controlling for bias is better than (and precedes)...
- Randomizing remaining error
- Systematic error (bias) is more problematic than random error: it cannot be quantified and treated mathematically.
20. Four sources of bias
- Measurement
- Researcher
- Subject (selection)
- Context
- It is better to address bias systematically than to convert it to noise. Randomization converts all error to random error; it does not remove error! (Katzer, 1981)
21. Purpose of Research Design: To identify and prevent bias, and (then) minimize noise
- Literature review (make an argument, based on reasoning, that sources of bias have been addressed)
- Turning a biasing variable into a constant (an argument for factual accuracy, at the expense of generality)
- Including the biasing variable in the study! (an argument for both factual accuracy and generality)
- Randomization: a methodological case for factual accuracy (random assignment) and generality (random sampling). A simulation sketch follows.
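The contrast between a biased comparison and randomization can be simulated. In this sketch the numbers are invented: a hidden "prior skill" variable inflates a self-selected comparison, while random assignment turns that same bias into noise that averages out.

```python
# Why randomization helps: self-selection vs. random assignment.
# All effect sizes and distributions below are invented for illustration.
import random

random.seed(3)

TRUE_EFFECT = 0.3

def outcome(prior_skill, treated):
    """Outcome depends on hidden prior skill, the treatment, and noise."""
    return prior_skill + (TRUE_EFFECT if treated else 0.0) + random.gauss(0, 1)

N = 20_000
priors = [random.gauss(0, 1) for _ in range(N)]

# Self-selected: stronger students opt into the treatment (selection bias).
self_selected = [(p, p > 0) for p in priors]
# Randomized: treatment is unrelated to prior skill.
randomized = [(p, random.random() < 0.5) for p in priors]

def estimated_effect(groups):
    treated = [outcome(p, True) for p, t in groups if t]
    control = [outcome(p, False) for p, t in groups if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

print(f"true effect:            {TRUE_EFFECT:.2f}")
print(f"self-selected estimate: {estimated_effect(self_selected):.2f}")  # inflated
print(f"randomized estimate:    {estimated_effect(randomized):.2f}")     # close
```

Note that randomization does not remove the prior-skill differences; it only spreads them evenly across groups, which is exactly Katzer's point that randomization converts bias to noise rather than removing error.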
22. Error & Measurement
- The validity of a measure is inversely related to the amount of error (systematic and random) in that measurement.
- Large unknown biases cause the most difficulty.
- Reliability does not deal with possible biases (just as statistical significance cannot save a bad study, a reliable measure need not be valid).
- Error grows exponentially throughout a study (the principle of squared variance); it must be removed as early as possible.
- Katzer, 1981
23. Error & Inference
- Inferential statistics do NOT address bias, only noise.
- There is no relationship between probability and importance. A good statistical result cannot save a poorly designed study.
- "People who conduct research need to be well-informed, clever, creative, methodologically competent, and sometimes a bit lucky to obtain important results. In contrast, low probability results can be obtained by simply increasing the number of people or objects used in the study." (Katzer, 1981; my emphasis)
- I.e., you can BUY statistical significance! The hard part is determining the accuracy and generalizability of a finding using conceptual and empirical analysis. The sketch below makes this concrete.
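A simulation sketch of "buying" significance, using a two-sample z-test on simulated data. The effect size is deliberately trivial and stays fixed; only the sample size changes.

```python
# Buying statistical significance: a trivial effect becomes "significant"
# once n is large enough. Simulated data, two-sided z-test on group means.
import math
import random

random.seed(4)

TINY_EFFECT = 0.05  # a practically meaningless difference (in SD units)

def p_value(n):
    """Two-sided p for the difference in means of two groups of size n."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(TINY_EFFECT, 1) for _ in range(n)]
    se = math.sqrt(2 / n)  # standard error, known SD = 1 in both groups
    z = (sum(b) / n - sum(a) / n) / se
    return math.erfc(abs(z) / math.sqrt(2))

for n in (50, 500, 5_000, 50_000):
    print(f"n = {n:>6}: p = {p_value(n):.4f}")
# p drifts toward zero as n grows, while the effect stays trivially small.
```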
24. Replications
- "True confidence in the research findings requires replications; it requires agreement among the results of other studies." (Katzer, p. 67)
25. How do we know if technology is having an impact?
- What would it look like? If I were a visitor, what would I see? Invite others to give their perspectives. Check your ideas with others. (Be sensible, be direct, and use proxy measures when necessary.) (UW-Extension, Cooperative Extension, p. 2-51)
26. What information would you require in order to answer questions about educational technology impacts?
- In order to confidently characterize technology impacts in schools, what would you want to know?
- What would you ask a school principal to quickly and confidently assess the impact of technology use in her school?
- List your top 3 questions:
- ____
- ____
- ____
- How would the answers to these questions improve your knowledge?
- How confident are you that with this information you would understand the role technology is playing and its impact on learning?
27. Logic Models
- Inputs
- What we invest
- Outputs
- What we do
- Who we reach
- Outcomes
- Short-term results (teacher & school change?)
- Medium-term results (student learning?)
- Long-term results (student lifelong learning?)
- Source: UW-Extension Logic Model, E. Taylor-Powell, 1998, University of Wisconsin-Extension
28. Technology is Multi-Level (Source: Rumberger, 2000)
- SCHOOL LEVEL
- School Inputs: structural characteristics, student composition, resources (technology)
- School Processes: decision-making (using technology), academic & social climate
- School Outputs: engagement, learning, achievement
- CLASSROOM LEVEL
- Classroom Inputs: student composition, teacher background, resources (technology)
- Classroom Processes: curriculum, instructional strategies (using technology)
- Classroom Outputs: engagement, learning, achievement
- STUDENT LEVEL
- Student Background: demographics, family background, academic background
- Student Experiences: class activities, homework, use of computers
- Student Outcomes: engagement, learning, achievement
29. You can't study everything
- Depending on what data is deemed relevant, one can tell very different stories about technology use in schools.
- No single set of numbers can ever tell the whole story. Data is always collected or presented selectively.
- Remember to ask yourself:
- What question is being asked (about schools, teachers or students)?
- Is this the right question?
- What can help you answer the question more confidently?
30. Major sources of error
- "Not everything that counts can be counted" (UW-Extension, Cooperative Extension, p. 2-52)
- Under-representation of technology-supported skills in traditional testing formats (Haertel & Means, 2000; Messick, 1994; Russell, 2000)
- No fair comparison groups:
- need for performance assessments that ALLOW but do not REQUIRE technology use (Becker & Lovitts, 2000)
- who does "tons of work" without using technology? (Coalition of Essential Schools and NASDC schools, besides Co-NECT) (Becker & Ravitz, 1999)
- "Most paper authors stress the need for better and more comprehensive measures of the implementation of technology innovations and the context or contexts in which they are expected to function... Paper authors recommend combining various methodologies in order to increase the richness, accuracy, and reliability of implementation data." (Haertel & Means, 2000; check all of these!)
31. Educational technology research and evaluation is error prone
- Errors are the likely result of:
- Conceptualization/Design: fuzzy logic and goals
- Measurement: fuzzy measures
- Inference: complex situatedness
- Understanding & Prediction: complex causality
- BUT it can be worth it!...
- "If a choice has to be made, it is much better to obtain approximate answers to important research questions than to obtain precise answers to trivial questions" (Katzer, p. 70).
32. Shift the focus from teachers to student learning
- Teacher knowledge, skills, and use of technology do not directly address students' experience
- Meaningful student use is much harder to predict and to achieve than teacher knowledge and skills (Ravitz, 1999; Buck Institute for Education, 2002b)
- Idaho research focuses on:
- Longitudinal data from a teacher professional development program (Teaching with Technology study)
- Student achievement and technology use across the state (Opportunity 1 study)
33. Teacher Use, Beliefs and Practices
- Teachers' beliefs and practices are related to the objectives for their technology use and the extent of their software use
- Coupled with a classroom cluster of computers and technology expertise, a constructivist-oriented social studies teacher is more likely than others to use simulation software.
- This is where you would predict an impact is most likely.
34. Source: Becker, AERA, 2002; Teaching, Learning & Computing 1998
35. Possible Model of Teacher Technology Use and Practices (Teaching, Learning & Computing 1998)
- Support & Staff Development for Technology
- Technology Decision-Making Structures
- Technology Investment Practices
- Access to Technology
- Computer Use Background
- Pattern of Computer Use
- Teaching Philosophy
- Educational Background & Teaching Experiences
- Current Teaching Practices & Recent Changes in Practice
- Role Orientation
- Practices of Other Teachers
- Teaching Responsibilities
- Staff Development in Constructivist Practice
- Professional Work Culture
36. Technology use and student achievement example
- Imagine you believe the most important things to consider, before drawing any conclusions about the relationship of school technology use to achievement, are:
- student prior learning,
- use of computers on their own time, and
- use of computers in school.
- If these variables have a relationship with learning, you could predict differences in learning based on these conditions.
37. Getting to learning outcomes: Tough conceptual work
- 3 binary variables yield 8 conditions (it is easier to use continuous measures for analyses, and counts for descriptives); a sketch of this grouping follows
- Source: Ravitz, 2002.
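A sketch of the grouping logic: three yes/no variables define 2³ = 8 conditions. The variable names and the two example records are hypothetical placeholders, not fields from the Idaho data.

```python
# Three binary variables -> 2**3 = 8 conditions.
from itertools import product

VARS = ("high_prior_learning", "uses_computer_at_home", "uses_computer_at_school")

conditions = list(product((False, True), repeat=len(VARS)))
print(len(conditions))  # -> 8

# Counting students per condition (the "counts for descriptives" idea):
students = [  # hypothetical records
    {"high_prior_learning": True, "uses_computer_at_home": True,
     "uses_computer_at_school": False},
    {"high_prior_learning": False, "uses_computer_at_home": True,
     "uses_computer_at_school": True},
]
counts = {c: 0 for c in conditions}
for s in students:
    counts[tuple(s[v] for v in VARS)] += 1

for cond, n in counts.items():
    if n:
        print(dict(zip(VARS, cond)), "->", n)
```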
38. Idaho study: School size, income, achievement & computer use
- If one were not careful, one might conclude that computer use is either unimportant or negatively related to achievement. At the same time, there is a strong relationship between home use and student achievement at school.
- Source: Ravitz, Mergendoller & Rush (2002)
39. Student computer skills are correlated with use at home more strongly than with use at school. Users in BOTH locations report the most skills.
40. Idaho: Student software capability is related to home use more than to school use, but there is an additive effect
- Source: Buck Institute for Education (2002a)
41. Computer skills & achievement: Higher-achieving students report more computer skills
- (Chart: mean computer skills, scale mean 2, SD .5, by test score quintiles within gender & grade)
42. School size in Idaho: a suppressor variable on the relationships between achievement and technology use? A simulation of this masking pattern follows.
- Source: Buck Institute for Education (2002a)
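A simulation sketch of how such masking can work, with invented coefficients rather than the Idaho estimates: school size boosts technology use but depresses achievement, so the raw use-achievement correlation sits near zero until size is held (roughly) constant.

```python
# Suppressor/masking-variable sketch: invented coefficients only.
import random

random.seed(5)

rows = []
for _ in range(5_000):
    size = random.gauss(0, 1)                          # school size (standardized)
    use = 0.6 * size + random.gauss(0, 1)              # bigger schools use more tech
    ach = 0.3 * use - 0.8 * size + random.gauss(0, 1)  # size also depresses scores
    rows.append((size, use, ach))

def corr(a, b):
    """Pearson correlation, computed by hand to stay dependency-free."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

size, use, ach = zip(*rows)
print(f"raw corr(use, achievement): {corr(use, ach):+.2f}")  # about zero

# Holding school size (roughly) constant, the positive link reappears:
subset = [(u, a) for s, u, a in rows if s < -0.5]  # similar-size (small) schools
u, a = zip(*subset)
print(f"corr among small schools:   {corr(u, a):+.2f}")      # positive
```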
43. Idaho: Teacher measures and school achievement gains
- Source: Buck Institute for Education (2002a)
44. Idaho: Teacher measures and achievement gains
- Source: Buck Institute for Education (2002a)
45. Idaho: Within-school student scores and student software capability
- Source: Buck Institute for Education (2002a)
46. Warnings
- Here are some key points to keep in mind:
- Failing to take sources of error into account can lead to erroneous conclusions. Judgment and experience play more of a role than is often believed. Researchers and readers of research have to decide if the results are correct and, if so, to whom they apply. Statistics cannot compensate for a poorly designed study, and they cannot tell you if a finding is important. (Katzer, 1981)
- Averages for groups do not reflect on individuals. Even if certain subgroups perform lower than others, there is still often a full range of performance within those subgroups. One should not draw conclusions from group averages via stereotyping when judging individuals, or generalize from experiences with individuals to others of the same group. (Popham, 2002)
47. Other interests
- Problem-based Economics study (Buck Institute for Education)
- Online education: Netcourse on Technology-Supported Assessment; WBI chapter on evaluation
- Distance scholarship: cumulative use of tools/resources by investigators contributing to shared knowledge at a distance (i.e., replication studies, R&D collaboration between faculty, students, and developers)
- www.bie.org
- www.bie.org/Ravitz
48. Four Modes of Inquiry (Mark, Henry & Julnes, 2000)
- 1. Description of experience: counts, percents, means, variances, exploratory analyses (Hartwig & Dearing, 1979), qualitative validation (we've been here)
- 2. Classification of underlying structure: correlations, reliability, factor analyses (we are moving here)
- 3. Causal analysis of underlying mechanisms: sources of variance, regression, alternative explanations (and trying to get here)
- 4. Values inquiry: What will better society? (implications and policy arguments)
49. Teaching with Technology (TWT): Indicators of effectiveness
- Reported Helpfulness and Attitude Changes
- Changes in Technology Training Requests
- Changes in Objectives for Computer Use
- Changes in Beliefs about Teaching and Learning
- Changes in Technology Skills
- "Enthusiasm" and "reality" effects occur, depending on timing of data collection (Buck Institute for Education, 2002b)
50. References - 1
- Baker, E. (1998, November). Understanding Educational Quality: Where Validity Meets Technology. William H. Angoff Memorial Lecture Series. Princeton, NJ: Educational Testing Service. Available: http://www.ets.org/research/pic/angoff5.pdf
- Becker, H., & Lovitts, B. (2000). A Project-Based Assessment Model for Judging the Effects of Technology Use in Comparison Group Studies. In Haertel & Means (Eds.), Stronger Designs for Research on Educational Uses of Technology: Conclusion and Implications. Menlo Park, CA: SRI International. Available: http://www.sri.com/policy/designkt/found.html
- Becker, H., & Ravitz, J. (2001, April). Computer Use by Teachers: Are Cuban's Predictions Correct? Paper presented at the 2001 Annual Meeting of the American Educational Research Association, Seattle. Available: http://www.bie.org/Ravitz/aera01.pdf
- Black, P., & Wiliam, D. (1998). Inside the Black Box. Phi Delta Kappan, October, 139-148. Available: http://www.pdkintl.org/kappan/kbla9810.htm
- Buck Institute for Education (2002a). Opportunity One Technology Initiative: Evaluation for the J.A. and Kathryn Albertson Foundation. Available: http://www.bie.org/research/tech/large-albertson.php
- Buck Institute for Education (2002b). Teaching With Technology: A Statewide Professional Development Program. Evaluation for the J.A. and Kathryn Albertson Foundation. Available: http://www.bie.org/research/tech/twtfinal.php
- Center on Education Policy (2001, April). It Takes More Than Testing: Closing the Achievement Gap. Available: http://www.ctredpol.org/improvingpublicschools/closingachievementgap.pdf
- Haertel, G., & Means, B. (2000). Stronger Designs for Research on Educational Uses of Technology: Conclusion and Implications. Menlo Park, CA: SRI International. Available: http://www.sri.com/policy/designkt/found.html
- Hartwig, F., & Dearing, B. (1979). Exploratory Data Analysis. Newbury Park, CA: Sage Publications.
- Katzer, J. (1981). Understanding the Research Process: An Analysis of Error. In Busha, Charles H. (Ed.), A Library Science Research Reader and Bibliographic Guide. Littleton, CO: Libraries Unlimited, pp. 51-71.
51. References - 2
- Messick, S. (1994). The interplay of evidence and consequences in the validation of performance assessments. Educational Researcher, 23(2), 13-23.
- Popham, J. (2002). Preparing for the coming avalanche of accountability tests. Harvard Education Letter, 18(3), pp. 1-3, May/June.
- Taylor-Powell, E. (2002, September). The Logic Model: A Program Performance Framework. University of Wisconsin Cooperative Extension, Madison, Wisconsin. http://www.uwex.edu/ces/pdande
- Ravitz, J. (2002, June). Demystifying data about technology impacts in schools. Paper presented at the National Educational Computing Conference, San Antonio, TX. Available: http://www.bie.org/Ravitz/
- Ravitz, J., Mergendoller, J., & Rush, W. (2002, April). Cautionary tales about correlations between student computer use and academic achievement. Paper presented at the annual meeting of the American Educational Research Association, New Orleans, LA. http://www.bie.org/research/tech/large-whatschool.php
- Russell, M. (2000). It's Time to Upgrade: Tests and Administration Procedures for the New Millennium. Secretary's Conference on Educational Technology. Washington, DC: U.S. Department of Education. Available: http://www.ed.gov/Technology/techconf/2000/russell_paper.html
- Rumberger, R. (2000). A multi-level, longitudinal approach to evaluating the effectiveness of educational technology. Paper presented at the Design Meeting on Effectiveness of Educational Technology, SRI International, Menlo Park, California, February 25-26.
- Teaching, Learning & Computing 1998. WWW Documents. http://www.crito.uci.edu/TLC
- U.S. Department of Education, Office of the Under Secretary, Planning and Evaluation Service, Elementary and Secondary Division (2000, October). Does professional development change teaching practice? Results from a three-year study. Doc. 2000-04. Available: http://www.ed.gov/offices/OUS/PES/school_improvement.html#subepdp2
- University of Wisconsin-Extension: http://www.uwex.edu/ces/pdande