Emulating human essay scoring with machine learning methods
Darrell Laham, Tom Landauer, Peter Foltz
Cognitive Systems: Human Cognitive Models in System Design, June 30, 2003

1
Emulating human essay scoring with machine learning methods
Darrell Laham, Tom Landauer, Peter Foltz
Cognitive Systems: Human Cognitive Models in System Design
June 30, 2003
2
  • Marcia Derr, Ph.D.
  • Scott Dooley
  • Terry Drissell
  • Dave Farnham
  • Peter Foltz, Ph.D.
  • Shawn Frederickson
  • Brent Halsey
  • Pat Hilton-Suiter
  • Darrell Laham, Ph.D.
  • Tom Landauer, Ph.D.
  • Karen Lochbaum, Ph.D.
  • Dian Martin
  • Jeff Nock
  • Jim Parker
  • Randy Sparks, Ph.D.
  • Lynn Streeter, Ph.D.

3
Taxonomy of essay assessment
  • Writing Assessment Types
  • Composition (Language Arts)
  • Does the writer write well?
  • Exposition (Content Areas, e.g. history)
  • Does the writer understand the topic?
  • Levels of Assessment
  • 1. Holistic Scoring
  • 2. Trait and Componential Scoring
  • 3. Annotation
  • 4. Situated Value Judgments
  • Which levels are open to automated scoring?

4
Taxonomy of essay assessment
(Diagram of the Levels of Assessment, Level 1 to Level 4, for Language Arts
(composition) and Content Areas (exposition). Labels in the diagram: Holistic
Score, Analytics, Trait Scores, Knowledge, Annotations, Local Errors, Truth
Values, Situated Value Judgments.)
5
Architecture of scoring systems
  • Intelligent Essay Assessor technologies
  • Latent Semantic Analysis for scoring quality of
    content and providing tutorial feedback
  • Style and mechanics measures for scoring and for
    validating that the essay is appropriate for the task
  • Student essays written to directed prompts
  • Constructed-response alternative to
    multiple-choice for domain knowledge assessment
  • Directed essay questions or summaries
  • Reliable, objective, consistent and immediate
  • Used as second reader, formative evaluations,
    diagnostic tutorials, interactive textbooks

6
Architecture of scoring systems
7
Latent Semantic Analysis
  • LSA is both a psychological theory of knowledge
    representation and a computational modeling and
    application tool
  • LSA learns the relationships between text
    documents and their constituent words (terms)
    when trained on large numbers of background texts
    (thousands to millions)
  • Each term, document, or new combination of terms
    (new document) is represented as a point in a
    high dimensional Semantic Space (300-500
    dimensions, not 2 or 3)
  • LSA effectively measures semantic content against
    prescribed standards of quality based on human
    judgments
  • Extensive and varied research shows LSA judgments
    of similarity agree well with human judgments
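The bullets above can be illustrated with a minimal sketch, assuming NumPy is available. The five-document corpus, the choice of k = 2 dimensions, and the names `docs`, `term_vecs`, and `cosine` are hypothetical and chosen only for illustration; a real LSA space is trained on thousands to millions of texts and keeps 300-500 dimensions. This is not the Intelligent Essay Assessor implementation.

```python
import numpy as np

# Toy background corpus (hypothetical). "doctor" and "physician" never
# share a document; they are linked only through the word "patient".
docs = [
    "doctor operates patient",
    "physician surgery patient",
    "surgeon surgery hospital",
    "lawyer court contract",
    "attorney court lawsuit",
]
vocab = sorted({w for d in docs for w in d.split()})
index = {w: i for i, w in enumerate(vocab)}

# Term-by-document count matrix: one row per term, one column per document.
counts = np.zeros((len(vocab), len(docs)))
for j, d in enumerate(docs):
    for w in d.split():
        counts[index[w], j] += 1.0

# Singular value decomposition, truncated to the k largest dimensions.
U, s, Vt = np.linalg.svd(counts, full_matrices=False)
k = 2
term_vecs = U[:, :k] * s[:k]      # each term as a point in the reduced space
doc_vecs = Vt[:k, :].T * s[:k]    # each document as a point in the same space


def cosine(a, b):
    """Cosine of the angle between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


def term(word):
    return term_vecs[index[word]]


print(cosine(term("doctor"), term("physician")))  # close to 1
print(cosine(term("doctor"), term("lawyer")))     # close to 0
```

In this toy space the two medical terms land on the same direction and the legal terms on another, even though the raw count matrix gives "doctor" and "physician" no shared document.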

8
Meaning Based Representation
  • LSA is NOT simple co-occurrence
  • Over 99% of word pairs whose similarities are
    induced never appear together in a context
    (paragraph)
  • Synonyms are rarely seen in the same context
  • LSA is NOT simple keyword matching
  • LSA operates on the deep level (latent) meaning
    of words rather than the surface characteristics
    (exact matches)
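For contrast, here is a minimal sketch of the surface-level keyword matching that LSA is said not to be. The stop-word list and the function names are hypothetical; the three sentences are the ones used elsewhere in this presentation.

```python
STOP_WORDS = {"the", "on", "is", "in", "he"}  # tiny hypothetical stop list


def content_words(sentence):
    """Lower-case the sentence, drop periods and stop words."""
    words = sentence.lower().replace(".", "").split()
    return {w for w in words if w not in STOP_WORDS}


def keyword_overlap(a, b):
    """Jaccard overlap of content words: a purely surface-level measure."""
    wa, wb = content_words(a), content_words(b)
    return len(wa & wb) / len(wa | wb)


s1 = "The doctor operates on the patient."
s2 = "The physician is in surgery."
s3 = "He is the car doctor."

print(keyword_overlap(s1, s2))  # 0.0
print(keyword_overlap(s1, s3))  # 0.25
```

Exact matching sees no relation between the first two sentences and rates the "car doctor" sentence as the closer one, which is the reverse of the ordering the presentation reports for LSA (0.86 versus 0.49).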

9
            doctor  physician  surgeon  lawyer  attorney
doctor      1
physician   0.61    1
surgeon     0.64    0.65       1
lawyer      0.06    0.06       0.13     1
attorney    0.03    0.05       0.09     0.73    1
10
                                          (1)    (2)    (3)
(1) The doctor operates on the patient.   1
(2) The physician is in surgery.          0.86   1
(3) He is the car doctor.                 0.49   0.35   1
11
Latent Semantic Analysis
12
What features of LSA are most important?
  • It is a fully automated model of memory
  • Training data of same magnitude as human
    experience
  • It begins with first-order local associations
    between a stimulus and other temporally
    contiguous stimuli
  • Represents concepts and contexts (episodes) in
    same way
  • Conjointly learns about concepts from their
    natural contexts and contexts from their
    constituent concepts
  • No explicit hand coding of rules or features
  • Induction stage for generalization
  • High dimensional vector mathematics offers
    neurologically plausible computations
  • Not claimed to be a comprehensive model

13
What features of LSA are ad hoc?
  • Based on performance in applications, not
    requirements of cognitive models
  • Singular Value Decomposition (SVD) as induction
    mechanism
  • Many other candidate algorithms have emerged
  • SVD is computationally tractable (e.g. a 750K x 10M
    matrix reduced to 300 dimensions on an 8-node
    Beowulf cluster in 20-30 hours)
  • Emphasis on easily parsable symbol systems, e.g.
    text
  • Text is relatively easy to work with compared to
    visual data
  • Now applied to other symbol systems, e.g. genetic
    codes
  • Text pre-processing specifics
  • Local log, global entropy weighting
  • Similarity metrics (Cosine, Euclidean Distance,
    etc.)
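The slide names "local log, global entropy weighting" and two similarity metrics without giving formulas. The sketch below assumes the commonly used log-entropy formulation: local weight log(1 + tf), and global weight 1 + sum_j p_ij log(p_ij) / log(n), where p_ij is the share of term i's occurrences that fall in document j and n is the number of documents. The function names and the use of natural logarithms are assumptions, not details taken from the presentation.

```python
import math


def log_entropy_weight(counts):
    """Apply log-entropy weighting to a term-by-document count matrix.

    counts: list of rows, one per term; each row holds raw counts per
    document. Assumes at least two documents and no all-zero rows.
    """
    n_docs = len(counts[0])
    weighted = []
    for row in counts:
        total = sum(row)
        entropy = 0.0
        for tf in row:
            if tf > 0:
                p = tf / total
                entropy += p * math.log(p)
        # 1 for a term concentrated in one document, 0 for a term
        # spread evenly over all documents.
        global_w = 1.0 + entropy / math.log(n_docs)
        weighted.append([global_w * math.log(1.0 + tf) for tf in row])
    return weighted


def cosine(a, b):
    """Cosine similarity: compares direction and ignores vector length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def euclidean(a, b):
    """Euclidean distance: sensitive to vector length as well as direction."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


toy_counts = [
    [1, 1, 1, 1],  # evenly spread term: carries no information
    [4, 0, 0, 0],  # term concentrated in one document
    [2, 2, 0, 0],  # term split across two documents
]
for row in log_entropy_weight(toy_counts):
    print([round(x, 3) for x in row])
```

Under this weighting a term that appears equally in every document is scaled to zero before the SVD, while a term confined to one document keeps its full local weight.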

14-21
Performance assessment of system
(No Transcript)
22
Performance assessment of system
  • Focus is on quality of content as judged by
    people rather than on measures of surface
    features or keywords
  • Uses background knowledge of domain in assessment
    in addition to previously scored essays
  • Measures what students are saying rather than
    just how well they are saying it
  • Does best when linked to course and student
    learning materials; provides formative assessment
    of domain knowledge with tutorial feedback rather
    than just a simple overall score
  • Requires fewer training essays (100 vs. 500)
  • More difficult to coach students in ways to
    receive an artificially high score (e.g. use
    semi-colons or say "Thus" and "Therefore")
  • Models do NOT use any count variables (Word
    count, etc.)
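The slides do not spell out how previously scored essays enter the score. One plausible reading, sketched below purely as an assumption, is that a new essay's semantic vector is compared with the vectors of human-scored essays and receives a similarity-weighted average of the scores of its nearest neighbours. The function `predict_score`, the value of `k`, and the two-dimensional vectors are hypothetical; real essay vectors would come from the LSA space.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def predict_score(essay_vec, scored_essays, k=3):
    """Similarity-weighted mean score of the k most similar scored essays.

    scored_essays: list of (vector, human_score) pairs.
    Returns None when no neighbour has positive similarity.
    """
    ranked = sorted(
        ((cosine(essay_vec, vec), score) for vec, score in scored_essays),
        reverse=True,
    )[:k]
    weights = [max(sim, 0.0) for sim, _ in ranked]
    total = sum(weights)
    if total == 0:
        return None
    return sum(w * score for w, (_, score) in zip(weights, ranked)) / total


# Hypothetical pre-scored essays in a 2-dimensional space.
scored = [
    ([1.0, 0.0], 5),
    ([0.9, 0.1], 4),
    ([0.0, 1.0], 1),
    ([0.1, 0.9], 2),
]
print(predict_score([1.0, 0.05], scored, k=2))  # about 4.5
```

Nothing in this sketch counts words or surface features: the prediction depends only on where the essay falls in the semantic space relative to essays that people have already scored.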

23
Performance assessment of system
24
Performance assessment of system
25
Performance assessment of system
MAINFRAMES
Mainframes are primarily referred to
large computers with rapid, advanced processing
capabilities that can execute and perform tasks
equivalent to many Personal Computers (PCs)
machines networked together. It is characterized
with high quantity Random Access Memory (RAM),
very large secondary storage devices, and
high-speed processors to cater for the needs of
the computers under its service. Consisting of
advanced components, mainframes have the
capability of running multiple large applications
required by many and most enterprises and
organizations. This is one of its advantages.
Mainframes are also suitable to cater for those
applications (programs) or files that are of very
high demand by its users (clients). Examples of
such organizations and enterprises using
mainframes are online shopping websites such as
Ebay, Amazon, and computing-giant Microsoft.
MAINFRAMES
Mainframes usually are referred those
computers with fast, advanced processing
capabilities that could perform by itself tasks
that may require a lot of Personal Computers (PC)
Machines. Usually mainframes would have lots of
RAMs, very large secondary storage devices, and
very fast processors to cater for the needs of
those computers under its service. Due to the
advanced components mainframes have, these
computers have the capability of running multiple
large applications required by most enterprises,
which is one of its advantage. Mainframes are
also suitable to cater for those applications or
files that are of very large demand by its users
(clients). Examples of these include the large
online shopping websites -i.e. Ebay, Amazon,
Microsoft, etc.
26
(No Transcript)