Automatic Assessment of Spoken Modern Standard Arabic - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Automatic Assessment of Spoken Modern Standard Arabic


1
Automatic Assessment of Spoken Modern Standard Arabic
  • NAACL
  • Boulder, Colorado
  • 5 June 2009
  • Pearson Knowledge Technologies
  • Palo Alto, California
  • Jian Cheng
  • Jared Bernstein
  • Ulrike Pado
  • Masa Suzuki

2
Outline
  • Pearson Knowledge Technologies
  • 1. How Versant tests operate
  • 2. Versant Arabic Test (development)
  • 3. Validation evidence
  • 4. Predictive accuracy

3
Pearson Knowledge Tech. (PKT)
  • KAT and Ordinate are now PKT
  • KAT: LSA, Essay Scoring, Write-to-Learn, PTE,
    etc.
  • Ordinate: Versant, ORF for NCES, VersaReader,
    PTE, etc.
  • PKT is part of Pearson
  • Pearson: FT, Economist, Penguin, Longman,
    PsychCorp, etc.
  • PearsonKT is in Boulder, Colorado and Palo Alto,
    California.

4
[Diagram labels: Test delivery; Delivery Interface; Communication Network; Scoring system; Database (tests, prompts, responses); speech; report; languages: English, Arabic, Dutch, Spanish; locations: California, Anywhere]
5
How Versant tests operate
[Diagram: example utterance "The train's been delayed by one hour"; components: Test Delivery Server, Versant Database, Scoring]
6
Versant Arabic Test
  • DLI purpose
  • 1000 students at DLI need predictive speaking
    tests
  • Requirements
  • Accurate test of Arabic listening and speaking
  • Convenient to use at DLI and worldwide (ILR is
    costly)
  • Suitable for repeated formative testing
  • High peak capacity for mass screening

7
Construct Comparison
  • OPI Construct: Oral proficiency, as manifest in
    an Oral Proficiency Interview, is compatible with
    communicative competence as reflected in the
    functional level and/or complexity of content
    accurately produced.
  • Versant Construct: Facility in spoken language,
    the ability to understand spoken language and
    speak appropriately in response at a
    conversational pace on everyday topics.

8
Versant Arabic Test
Test Structure
  • Part A: Reading
  • Part B: Repeat 1
  • Part C: Short Answers
  • Part D: Sentence Builds
  • Part E: Repeat 2
  • Part F: Passage Retelling
9
Versant Scoring
10
How Versants are developed (1)
[Diagram: development and validation flow for the Versant Arabic Test. Labels: Test Spec; Native Test Developers; Item Text; Recorded Items; Arabic Natives; Arabic Learners; Native Scribes (transcripts); Native Judges (scale scores); Scale Estimates; Criteria; Ordinate System; Versant Scores; Concurrent ILR Interviews; ILR Scores; Validation (Internal, External)]
11
Arabic Challenges: Voweling
  • kutubu al-waladi: "the books of the boy"
  • kataba al-waladu: "wrote the boy (subj)"
  • No disambiguating short vowels written
  • Vowels carry phonetic information
  • Vowels carry grammar information
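
The ambiguity in the two example phrases can be made concrete with a small sketch. It assumes a fully voweled Arabic-script spelling of the slide's transliterations (built here from Unicode code points, not taken from the presentation), and the helper name `strip_short_vowels` is hypothetical. Dropping the short-vowel marks, as ordinary written MSA does, leaves both phrases with the same spelling.

```python
# Arabic diacritics (tanween, fatha, damma, kasra, shadda, sukun)
# occupy the Unicode range U+064B..U+0652.
DIACRITICS = {chr(cp) for cp in range(0x064B, 0x0653)}

def strip_short_vowels(text: str) -> str:
    """Return text with Arabic diacritic marks removed (hypothetical helper)."""
    return "".join(ch for ch in text if ch not in DIACRITICS)

FATHA, DAMMA, KASRA = "\u064E", "\u064F", "\u0650"
KAF, TA, BA = "\u0643", "\u062A", "\u0628"
ALIF, LAM, WAW, DAL = "\u0627", "\u0644", "\u0648", "\u062F"

# Assumed voweled spelling of "kutubu al-waladi" (the books of the boy)
books_of_boy = (KAF + DAMMA + TA + DAMMA + BA + DAMMA + " "
                + ALIF + LAM + WAW + FATHA + LAM + FATHA + DAL + KASRA)
# Assumed voweled spelling of "kataba al-waladu" (the boy wrote)
boy_wrote = (KAF + FATHA + TA + FATHA + BA + FATHA + " "
             + ALIF + LAM + WAW + FATHA + LAM + FATHA + DAL + DAMMA)

# The voweled forms differ, but the unvoweled forms are identical.
same_when_unvoweled = strip_short_vowels(books_of_boy) == strip_short_vowels(boy_wrote)
print(books_of_boy != boy_wrote, same_when_unvoweled)
```

A recognizer or scorer working from unvoweled text therefore has to recover both the pronunciation and the grammatical role from context.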

12
Complex Morphology
li + ziyaarat + naa
  • "for" + "visit" + "of us" = "for our visit"
  • Complicates lexicon lookup, frequency estimates
  • Short Arabic items are harder than English
    items with the same number of words

13
Development and Run-time Processes
  • Compilation of expectation and runtime flow

14
Training data sources
Prompt Voices and Training Samples

Prompt Voices
Country   Egypt  Iraq  Jordan  Morocco  Lebanon  Palestine  Syria
Voices    F, M   F, M  M       F        M        F, M       F, M

Native Data
Egypt  Syria  Iraq  Palestine  Other  Total
484    281    179   187        517    1648

Learner Data
DLI   Non-DLI  Total
1120  552      1672
15
Validation Criteria
  • Reliability
  • Scores are consistent
  • Validity
  • Native and non-native speakers should be clearly
    distinct
  • MSA and dialect speakers should be
    distinct (since we're testing MSA)
  • Machine scores should predict human scores

16
Reliability
Score             Split-Half Reliability (N = 134)  Test-Retest Reliability (N = 100)
Overall           0.98                              0.97
Sentence Mastery  0.97                              0.96
Vocabulary        0.89                              0.82
Fluency           0.97                              0.96
Pronunciation     0.96                              0.94
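
For readers unfamiliar with split-half reliability, the following is a minimal sketch of one conventional way to compute it: correlate each test taker's score on the odd-numbered items with their score on the even-numbered items, then apply the Spearman-Brown correction to estimate full-length reliability. The slides do not say how the halves were formed or which correction was used, so the odd/even split, the function names, and the data below are all assumptions for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def split_half_reliability(item_scores):
    """Correlate the two half-test scores, then apply Spearman-Brown."""
    # Items 1, 3, 5, ... (indices 0, 2, ...) form one half; the rest form the other.
    odd_half = [sum(row[0::2]) for row in item_scores]
    even_half = [sum(row[1::2]) for row in item_scores]
    r_half = pearson(odd_half, even_half)
    # Spearman-Brown: reliability of the full test from the half-test correlation.
    return 2 * r_half / (1 + r_half)

# Synthetic item scores (rows = test takers, columns = items).
# These are NOT data from the presentation.
demo_scores = [
    [1, 1, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 0, 1, 1],
    [0, 1, 0, 0],
]
reliability = split_half_reliability(demo_scores)
print(round(reliability, 3))
```

Test-retest reliability, the second column in the table, would instead correlate the scores from two separate administrations of the test to the same people.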
17
Native and Non-Native Scores
18
Natives by Countries
19
Educated and Uneducated Speakers
[Plot: Cumulative Density vs. Arabic Overall Score]
20
Machine-Human Comparison
Score             Correlation (N = 134)
Overall           0.97
Sentence Mastery  0.97
Vocabulary        0.96
Fluency           0.84
Pronunciation     0.83
21
How Versants Compare to OPIs
[Scatter plot: ILR OPI Score (logits) vs. Versant Arabic Overall Score; N = 118, r = 0.87]
22
Spanish and English: Versant vs. Human
[Scatter plots, panels: Spanish; English. Reported values, in source order: N = 37, r = 0.92; N = 151, r = 0.86]
23
Summary
  • Versant Arabic Test (VAT) is in operation
  • Based on a large and wide body of transcribed
    spoken material
  • VAT is available on demand
  • Returns consistent, accurate scores that reflect
    real-time skills with MSA
  • VAT can triage or screen for OPI tests

24

Thanks to Waheed Samy, Naima Bousofara Omar, Eli
Andrews, Mohamed Al-Saffar, Nazir Kikhia, Rula
Kikhia, and Linda Istanbulli for item development
and data collection/transcription in Arabic, and
to Andy Freeman for providing diacritic markings.