The TBALL Project Data Collection: Making a Young Children's Speech Corpus



Abe Kazemzadeh, Hong You, Markus Iseli, Barbara Jones, Xiaodong Cui, Margaret Heritage, Patti Price, Elaine Anderson, Shrikanth Narayanan, and Abeer Alwan
University of Southern California, University of California Los Angeles, and PPRICE Speech and Language Technology
Results and Observations

Data Collection Motivation
Establish a corpus for studying child and non-native speech.
Build speech applications for under-represented populations.
Analyze pronunciation variation.
Test bed for our target child-computer interface.
Test hardware, animations, timing, vocabulary, etc.
Measure children's ability with respect to grade level and other factors.

Project Goals
Automation of literacy assessment measures using speech and language technology.
Development of standards and methods for reliable, objective assessment.
One-on-one interaction with child, which leaves teachers with more time for teaching.
Focused on fair assessment robust to dialect variation, including non-native speakers.
Support for teacher feedback and database records.

[Figure: Native Language Distribution of Recorded Subjects]
[Figure: Wizard of Oz Interface]

Transcriptions
Enhanced ARPABET symbols to represent phenomena peculiar to non-native and children's speech:
Dental stops
Unaspirated voiceless stops
Negative VOT (prevoiced) stops
Lispy /s/
Glottalized /t/
Long frication of /f/
Trill
Syllabic consonants
A convention represents the vowel space with respect to English vowels: non-native vowels are defined by the two nearest English vowels, with the highest vowel first (e.g., /iyih/).
82% phone-level transcriber agreement.
Transcribers started with 100% overlap (each file transcribed twice), reduced to 25% after agreement was established.
Sentences are transcribed with word-level alignments, with phone-level detail where there was pronunciation variation.
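
The vowel-labeling convention and the transcriber agreement figure can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration: the `FRONT_VOWEL_HEIGHT` table covers only a few front vowels with made-up rank values, and `phone_agreement` uses one plausible definition (aligned matching phones divided by the length of the longer transcription). The source does not say how its agreement figure was computed.

```python
from difflib import SequenceMatcher

# Hypothetical height ranks for a few ARPABET front vowels (larger = higher
# tongue position). Illustrative assumption, not the project's actual table.
FRONT_VOWEL_HEIGHT = {"iy": 5, "ih": 4, "ey": 3, "eh": 2, "ae": 1}


def nonnative_vowel_label(v1, v2):
    """Join the two nearest English vowels, highest vowel first."""
    first, second = sorted(
        (v1, v2), key=lambda v: FRONT_VOWEL_HEIGHT[v], reverse=True
    )
    return first + second


def phone_agreement(phones_a, phones_b):
    """Fraction of phones on which two transcriptions agree.

    Assumed definition: phones matched by sequence alignment, divided by the
    length of the longer transcription.
    """
    if not phones_a and not phones_b:
        return 1.0
    matcher = SequenceMatcher(a=phones_a, b=phones_b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / max(len(phones_a), len(phones_b))


# A vowel heard between /iy/ and /ih/ is labeled with the higher vowel first.
print(nonnative_vowel_label("ih", "iy"))  # iyih

# Two transcriptions of "start", one with the final /t/ dropped.
print(phone_agreement(["s", "t", "aa", "r", "t"], ["s", "t", "aa", "r"]))  # 0.8
```

Averaging such a score over all doubly transcribed files would give a corpus-level figure comparable in spirit to the one reported, although the exact number depends on the alignment and normalization chosen.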

Language Background Effects
Difficulty associating words with pictures.
Children who could read in Spanish but not English sometimes read sentences better than individual words.
Sounding out words with Spanish letter-to-sound rules.

Corpus Statistics
256 children recorded.
30,000 utterances.
40 hours of speech.
13 GB of speech data sampled at 44.1 kHz.
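
As a rough consistency check on these figures, the short Python sketch below derives the storage size and a few averages from the reported totals. The sample width (16-bit) and channel count (mono) are assumptions; the source gives only the 44.1 kHz sampling rate.

```python
# Reported corpus totals.
HOURS = 40
CHILDREN = 256
UTTERANCES = 30_000
SAMPLE_RATE_HZ = 44_100

# Assumptions not stated in the source.
BYTES_PER_SAMPLE = 2  # 16-bit PCM
CHANNELS = 1          # mono

total_seconds = HOURS * 3600
size_gb = total_seconds * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS / 1e9
avg_utterance_seconds = total_seconds / UTTERANCES
avg_minutes_per_child = HOURS * 60 / CHILDREN

print(f"uncompressed size: {size_gb:.1f} GB")                  # 12.7 GB
print(f"mean utterance length: {avg_utterance_seconds:.1f} s")  # 4.8 s
print(f"mean speech per child: {avg_minutes_per_child:.1f} min")  # 9.4 min
```

Under these assumptions the uncompressed audio comes to about 12.7 GB, which agrees with the reported 13 GB.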

Age/Grade Effects
Position in school year is important (children learn a lot between the beginning and end of the school year).
Younger children are more timid.
Less social and reading experience.
Less exposure to computers.

Reading Tactics
Sounding out words generally helped children.
Mispronunciations when a subword portion is confused with another word (e.g., "once", "using").
Confusion with the different sounds an orthographic symbol may have (e.g., "now" pronounced as "no").

Our recording setup was similar to our target application.
Two visible operators:
The techie controls the presentation of stimuli and monitors recording quality.
The tester instructs and guides the child.
The target application would include just the child and computer.
20 min. per child maximum.
A secure database stores child demographic info and speech recordings:
Age, grade, English level, native language, language used at home, language used with friends, parents' native language, parents' birthplaces.
Accommodations for children.
15 children recorded per day (~1.9 hours of recorded speech per day).
Stimuli: pictures, colors, alphabet, numbers, words, and sentences.
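
The demographic fields and session limits listed above could be modeled roughly as follows. This is only a sketch: the class name `ChildRecord`, its field names and types, and both helper functions are hypothetical and do not describe the project's actual database schema. The example values are invented placeholders.

```python
from dataclasses import dataclass
from typing import Tuple

MAX_SESSION_MINUTES = 20  # from the text: 20 min. per child maximum


@dataclass(frozen=True)
class ChildRecord:
    """Hypothetical layout for one child's demographic entry."""
    age: int
    grade: str
    english_level: str
    native_language: str
    home_language: str
    friends_language: str
    parents_native_languages: Tuple[str, ...]
    parents_birthplaces: Tuple[str, ...]


def session_within_limit(minutes):
    """True if a session length respects the per-child maximum."""
    return 0 < minutes <= MAX_SESSION_MINUTES


def mean_minutes_per_child(hours_per_day=1.9, children_per_day=15):
    """Average recorded speech per child implied by the daily figures."""
    return hours_per_day * 60 / children_per_day


# Placeholder values for illustration only, not real subject data.
example = ChildRecord(
    age=6,
    grade="1",
    english_level="intermediate",
    native_language="Spanish",
    home_language="Spanish",
    friends_language="English",
    parents_native_languages=("Spanish", "Spanish"),
    parents_birthplaces=("unknown", "unknown"),
)

print(round(mean_minutes_per_child(), 1))  # 7.6
print(session_within_limit(mean_minutes_per_child()))  # True
```

The daily figures imply about 7.6 minutes of recorded speech per child, comfortably inside the 20-minute session cap, which leaves room for instructions and setup.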

Higher Level Phenomena
Using "a", "an", "some" in picture naming.
Perhaps due to grammatical differences between English and Spanish.
Verb tense changes when reading sentences.
Formation of contractions from long forms (but not vice versa).
Reanalysis of a sentence after the child realizes a mistake he or she has made.

Pronunciation Variation
Read speech is slower.
Long breaks in fricatives followed by stops (e.g., s-tart).
Long liquids, nasals, and fricatives.
Syllables spoken slowly (e.g., a-long).
Final consonants may be delayed or dropped (e.g., par-t or par-).
Difficulty with "am" and "an" in isolation.
At times, children speak in an exaggerated voice.

Acknowledgements
This project is supported in part by the NSF. In addition, this work would not be possible without the hard work of transcribers Daylen Riggs and Nathan Go; the patience and bilingualism of Kimberly Reynolds and Blanca Martinez; the careful recordings of Erdem Unal, Vivek Rangarajan, Shiva Sundaram, Yirong Yang, Jinjin Ye, and Yijian Bai; and the planning of Larry Casey and Christy Boscardin.
