1
CALL Practice II
  • Xiaofei Lu
  • APLNG 588
  • October 18, 2007

2
Agenda
  • Assignments 4 and 5
  • Hincks (2003)
  • Naber (2003)

3
Hincks (2003)
  • Introduction
  • Speech processing for CALL
  • Theoretical background
  • Test of effectiveness of Talk to Me
  • Results
  • Discussion

4
Introduction
  • Challenges for CALL applications for speaking
  • Technical: storage and transfer, equipment, noise
  • Linguistic: variability, complex analyses
  • Pronunciation training outside of the classroom
  • Monolingual vs. mixed-language classroom
  • Traditional vs. digital language labs
  • Teacher feedback vs. automatic feedback

5
Speech processing for CALL
  • Computer-assisted feedback using signal analysis
    software
  • Perception and production of target sounds in L2
  • Audio-visual feedback vs. audio feedback alone
  • Computer as a tool for visualization
  • Teacher provides essential feedback

6
Constraints on using ASR in LL
  • NLP tech, computing power, accented users
  • Quantified vs. corrective, constructive feedback
  • Segmental vs. prosodic level feedback
  • Age and gender model of speaking voices
  • Speech databases and ASR engines costly

7
Strategies for using ASR in LL
  • Talk to Me by Aurolog
  • 6 dialogue sequences, 30 Q&A screens each
  • Visual, audio, or video enhancements
  • User response limited to 3 possibilities
  • Vocal tract animations for phoneme production
  • Record/playback model for word- and sentence-level
    training
  • Visual display of production score and deviant
    sounds
  • Levels of difficulty adjustable by user

8
Strategies for using ASR in LL
  • TriplePlayPlus and Accent Coach
  • EduSpeak
  • ASR in dictation systems
  • NaturallySpeaking
  • Recognition of native vs. foreign-accented speech
  • Errors perceived by humans vs. machine
    misrecognitions
  • Adaptation of ASR with non-native phonetic models

9
Evaluating language with ASR
  • Versant tests
  • 10-minute test of spoken English over the phone
  • Model trained on native speech, adapted for
    non-native speech
  • Technology: forced alignment, pronunciation
    dictionary, HMM ASR, expected-response network
  • Correlation between scores and human measures
  • Cucchiarini et al. (2002)
  • Correlation between human ratings and temporal
    measures (see the sketch after this list)
  • Poor job of assessing segmental quality
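
A minimal sketch of the validation idea named above: machine scores are compared with human ratings by computing a correlation coefficient. The numbers below are invented for illustration and are not data from the Versant or Hincks studies.

# Minimal sketch: validate machine scores against human ratings by
# computing their Pearson correlation. The score pairs are invented
# for illustration only.
from statistics import correlation  # available in Python 3.10+

machine_scores = [4.1, 5.3, 3.8, 6.0, 4.7]
human_ratings = [4.0, 5.5, 3.5, 6.2, 4.9]

print(round(correlation(machine_scores, human_ratings), 3))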

10
Theoretical background
  • Talk to Me and communicative LL
  • Communication: interaction with the computer
  • Noticing: highlighting of the worst word
  • Affective aspects of LL: multimedia-enriched
    input
  • Automating learning: simple practice of language
  • Audio-lingual LL
  • Imitation of models in Talk to Me
  • Presentation of minimal pairs

11
Theoretical background
  • Acquisition of L2 phonology
  • Commercial products: examples of 'prescriptive
    artefact design' vs. vehicles for theory testing
  • Talk to Me tests global pronunciation quality
  • No broader bearing on theory development
  • The tutor-tool distinction
  • Talk to Me as a tutor or a tool?
  • Students also met with a human tutor in this study

12
Test of effectiveness of Talk to Me
  • Framework
  • Control group: 5 hours of pronunciation tutoring
    assisted by an IPA program
  • Experimental group: 4 hours of tutoring using Talk
    to Me, plus a copy of Talk to Me at home
  • Subjects
  • Middle-aged engineers with diverse L1s
  • 15 students in control group
  • 13 in experimental group; 9 used TTM at home

13
Test of effectiveness of Talk to Me
  • Use
  • 4 hours of program use with a pronunciation tutor
  • Students reported 2-48 hours of program use at home
  • Pre- and post-testing
  • 10-minute PhonePass (now Versant) test
  • Tests conducted at beginning and end of course

14
Results
  • Questionnaire on satisfaction
  • Fun to use
  • Lack of time to use it
  • Overall scores
  • Control group: 4.48 to 4.74
  • Experimental group: 4.4 to 4.51

15
Pronunciation results
  • Mean scores
  • Neither group showed significant improvement
  • Individual scores
  • Students grouped according to proficiency level
  • Strongly accented students in experimental group
    showed significant improvement
  • Time on task
  • No clear relationship found

16
Evaluating CALL software
  • Chapelle's (2001) 6 criteria
  • Language learning potential
  • A language learning vs. language use activity
  • Focus on form
  • Feedback: visual, with no expert explanation
  • Learner fit
  • Fits better for students with a strong foreign
    accent

17
Evaluating CALL software
  • Meaning focus: weak in TTM
  • Impact
  • Development of meta-cognitive strategies
  • Students were able to work on their own
  • Authenticity
  • Authentic dialogues
  • More natural speech needed
  • Practicality
  • Pricing, hardware requirements

18
Evaluating ASR for spoken language evaluation
  • Varieties of English
  • ASR models important
  • Student adaptation to negative feedback
  • Precoda et al. (2000): negative correlation
    between practice with LL software and fluency
  • Cucchiarini et al. (2000): temporal measures (rate
    of speech, total duration) work best but can be
    misleading

19
Conclusions
  • No significant improvement in pronunciation by
    using TTM
  • Most beneficial to students with an intrusive
    foreign accent
  • Use of ASR in an intelligent, talking language tutor

20
Naber (2003)
  • Introduction
  • Theoretical background
  • Design and implementation
  • Evaluation results
  • Conclusion

21
Introduction
  • Open-source English style and grammar checker
  • Input: text
  • Output: a list of possible errors
  • Rule-based approach
  • POS tagging and chunking
  • Error rules expressed in a simple XML format (see
    the sketch after this list)
  • Rules include an explanation of the error
  • Tested on an error corpus and the BNC
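
The slides do not show the rule format itself. The sketch below illustrates the general idea of XML-encoded error rules that carry an explanation and are matched against plain text; the schema, rule IDs, and patterns are invented for this sketch and are not Naber's actual format.

# Rough sketch of a rule-based checker driven by XML error rules.
# The XML schema and the example rules are invented for illustration;
# Naber's actual rule format differs in its details.
import re
import xml.etree.ElementTree as ET

RULES_XML = """
<rules>
  <rule id="OF_CAUSE" message="Did you mean 'of course'?">
    <pattern>\\bof cause\\b</pattern>
  </rule>
  <rule id="A_PLURAL" message="Singular article before a plural noun.">
    <pattern>\\ba (?:things|people|ideas)\\b</pattern>
  </rule>
</rules>
"""

def load_rules(xml_text):
    """Compile each rule's pattern into a regular expression."""
    rules = []
    for rule in ET.fromstring(xml_text).findall("rule"):
        pattern = re.compile(rule.findtext("pattern"), re.IGNORECASE)
        rules.append((rule.get("id"), rule.get("message"), pattern))
    return rules

def check(text, rules):
    """Return (rule id, explanation, offset) for every rule match."""
    matches = []
    for rule_id, message, pattern in rules:
        for m in pattern.finditer(text):
            matches.append((rule_id, message, m.start()))
    return matches

rules = load_rules(RULES_XML)
print(check("Of cause we will attend.", rules))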

22
Theoretical background
  • Desired system features
  • Processing speed for interactive use
  • Integrated into an existing word processor
  • Limited false positives
  • Adaptable for personal needs
  • High recall

23
Error categories
  • Spelling errors - lexicon
  • Grammar errors - context
  • Style errors - situation and text type
  • Sentence length and structure
  • Vocabulary use
  • Semantic errors - world knowledge

24
Pre-processing
  • POS tagging using Qtag 3.1
  • Phrase chunking
  • Rule-based chunkers
  • Probabilistic chunkers

25
Grammar checking
  • Syntax-based checking
  • Full parsing requires a grammar with broad
    coverage
  • Constraint relaxation required to detect the
    error type
  • Statistics-based checking
  • Thresholding of (un)common POS n-grams (see the
    sketch after this list)
  • No specific error message
  • Rule-based checking
  • Incomplete sentences, easy configuration,
    extendable
  • Error messages, incremental development
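
A minimal sketch of the statistics-based approach named above: flag POS trigrams that are rare in a reference corpus. The trigram counts and threshold below are invented; a real checker would estimate them from a large tagged corpus such as the BNC, and, as the slide notes, this approach cannot produce a specific error message.

# Sketch of statistics-based checking: flag POS trigrams that are rare
# in a reference corpus. Counts and threshold are invented here.
from collections import Counter

# Pretend these trigram counts came from a large tagged reference corpus.
REFERENCE_TRIGRAMS = Counter({
    ("DT", "JJ", "NN"): 5000,   # e.g. "the red car"
    ("DT", "NN", "VBZ"): 4200,  # e.g. "the car runs"
    ("PRP", "VBZ", "JJ"): 900,  # e.g. "it is red"
})
THRESHOLD = 10  # trigrams seen fewer times than this are suspicious

def suspicious_trigrams(pos_tags):
    """Return positions of POS trigrams that fall below the threshold."""
    flagged = []
    for i in range(len(pos_tags) - 2):
        trigram = tuple(pos_tags[i:i + 3])
        if REFERENCE_TRIGRAMS.get(trigram, 0) < THRESHOLD:
            flagged.append((i, trigram))
    return flagged

# Tags for an odd sequence such as "the car run" yield an unseen trigram.
print(suspicious_trigrams(["DT", "NN", "VBP"]))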

26
Grammar checking
  • Some grammar errors
  • Subject-verb agreement and tag questions
  • Agreement between article and following word (see
    the sketch after this list)
  • Sentence boundary detection
  • Direct implementation into a translation system
  • Rule-based representation of sentences as regular
    expressions
  • Supervised machine learning algorithms
  • 98% recall (Walker et al. 2001)
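
As a rough illustration of the article-agreement check and of representing sentences as strings matched by regular expressions, the sketch below encodes a tagged sentence as "TAG:word" tokens and searches for "a" followed by a plural noun. The encoding and the hand-supplied tags are assumptions made for this sketch; in the real system the tags would come from the POS tagger.

# Sketch: treat a tagged sentence as a string and match error patterns
# with a regular expression. The (word, tag) pairs are supplied by hand
# here; in practice they would come from the POS tagger.
import re

# "a" followed by a plural noun (NNS), e.g. "a cars".
ARTICLE_PLURAL = re.compile(r"DT:a NNS:\S+")

def tags_to_string(tagged):
    """Encode a tagged sentence as 'TAG:word' tokens separated by spaces."""
    return " ".join(f"{tag}:{word.lower()}" for word, tag in tagged)

def has_article_agreement_error(tagged):
    return bool(ARTICLE_PLURAL.search(tags_to_string(tagged)))

sentence = [("She", "PRP"), ("bought", "VBD"), ("a", "DT"), ("cars", "NNS")]
print(has_article_agreement_error(sentence))  # True: "a" + plural noun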

27
Controlled language checking
  • Controlled language
  • Made simpler by rules, avoids ambiguity
  • Easier to understand and parse by computers
  • Lexical restrictions
  • Grammar restrictions
  • Semantic restrictions
  • Style restrictions

28
Style checking
  • Choice of words
  • Simplicity
  • Punctuation
  • Dates, times, and numbers
  • Contracted forms
  • Lexical repetition

29
False friends
  • A pair of words from 2 different languages
  • Similarly written or pronounced
  • Have different meanings
  • Easily identifiable by rules

30
Evaluation with corpora
  • Precision
  • real errors found/all errors found
  • Recall
  • real errors found/real errors in text (see the
    sketch after this list)
  • BNC: precision
  • Small tagged mailing-list error corpus: recall
  • Using search engines to find words/phrases
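
The two measures above translate directly into code; a minimal sketch follows, using the mailing-list figure from the evaluation slide (42 of 224 real errors identified) as the recall example.

# Precision and recall exactly as defined on the slide.
def precision(real_errors_found, all_errors_found):
    """Share of flagged items that are real errors."""
    return real_errors_found / all_errors_found if all_errors_found else 0.0

def recall(real_errors_found, real_errors_in_text):
    """Share of real errors in the text that were flagged."""
    return real_errors_found / real_errors_in_text if real_errors_in_text else 0.0

# Mailing-list error corpus: 42 of 224 real errors identified.
print(recall(42, 224))  # 0.1875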

31
Related projects
  • Ispell and Aspell
  • Style and Diction (demo in Unix)
  • CLAWS as a Grammar Checker
  • Systems that are not publicly available
  • EasyEnglish
  • Critique
  • GramCheck
  • Park et al.'s Grammar Checker
  • FLAG

32
Design and implementation
  • Implemented using Python (brief demo)
  • System independent
  • Supports object-oriented programming
  • Built-in support for common data types
  • Supports Unicode
  • Implicitly typed

33
Design and implementation
  • Class hierarchy (see the sketch after this list)
  • TextChecker()
  • Rule and RuleMatch objects
  • check() and match() methods
  • SentenceSplitter()
  • Tagger()
  • Chunker()
  • Installation
  • KWord
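
Only the class and method names in the sketch below come from the slide; the signatures, attributes, and placeholder bodies are assumptions made to show how the pieces could fit together.

# Skeleton of the class hierarchy named on the slide. Bodies are
# simplified placeholders, not Naber's actual implementation.
class RuleMatch:
    """One error found by one rule."""
    def __init__(self, rule_id, offset, message):
        self.rule_id = rule_id
        self.offset = offset
        self.message = message

class Rule:
    """A single error rule; concrete rules override match()."""
    def __init__(self, rule_id, message):
        self.rule_id = rule_id
        self.message = message

    def match(self, tagged_sentence):
        """Return a list of RuleMatch objects for this rule."""
        raise NotImplementedError

class SentenceSplitter:
    def split(self, text):
        return [s for s in text.split(". ") if s]  # naive placeholder

class Tagger:
    def tag(self, sentence):
        return [(w, "UNK") for w in sentence.split()]  # placeholder tags

class Chunker:
    def chunk(self, tagged_sentence):
        return tagged_sentence  # no-op placeholder

class TextChecker:
    """Splits, tags, chunks, then runs every rule over every sentence."""
    def __init__(self, rules):
        self.rules = rules
        self.splitter = SentenceSplitter()
        self.tagger = Tagger()
        self.chunker = Chunker()

    def check(self, text):
        matches = []
        for sentence in self.splitter.split(text):
            tagged = self.chunker.chunk(self.tagger.tag(sentence))
            for rule in self.rules:
                matches.extend(rule.match(tagged))
        return matches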

34
Rule development and testing
  1. Take an undetected error
  2. Write a rule for the error
  3. Generalize the rule using POS tags if possible
  4. Check whether the rule pattern appears in correct
     sentences
  5. Redo step 3 if step 4 yields too many false
     positives (see the sketch after this list)
  6. Test the new rule using real sentences from the BNC
  • Example: "Of cause" on page 34
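
Step 4 of the workflow can be illustrated with a small sketch: count how often a candidate rule fires on sentences known to be correct. The example uses the "Of cause" confusion from the slide; the sample sentences stand in for real BNC text.

# Sketch of step 4: count false positives of a candidate rule on
# known-correct sentences (stand-ins for real BNC text).
import re

OF_CAUSE = re.compile(r"\bof cause\b", re.IGNORECASE)  # candidate rule

correct_sentences = [
    "Of course we will attend the meeting.",
    "The cause of the delay was never explained.",
    "They spoke of cause and effect in physics.",   # legitimate use
]

false_positives = sum(bool(OF_CAUSE.search(s)) for s in correct_sentences)
print(f"{false_positives} false positive(s) out of {len(correct_sentences)}")
# The hit on "cause and effect" is the kind of false positive that sends
# the developer back to step 3 to refine the rule.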

35
Other checks
  • Checks outside the rule system (see the sketch
    after this list)
  • Sentence length check
  • Determiner check
  • Word repeat check
  • Whitespace check
  • Style checking
  • Short forms
  • Sentences beginning with "Or"
  • Ambiguous use of "billion"
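
A minimal sketch of two of the checks listed above, the word-repeat check and the whitespace check; the exact behaviour of Naber's checks is not given in the slides, so the details here are assumptions.

# Sketch of two checks that live outside the XML rule system: a repeated-
# word check and a whitespace check. Details are assumptions.
import re

def word_repeat_check(text):
    """Flag immediately repeated words, e.g. 'the the'."""
    return [m.group(1) for m in re.finditer(r"\b(\w+)\s+\1\b", text, re.IGNORECASE)]

def whitespace_check(text):
    """Flag double spaces and space before punctuation."""
    issues = []
    if re.search(r"  +", text):
        issues.append("multiple consecutive spaces")
    if re.search(r"\s[,.;:!?]", text):
        issues.append("space before punctuation")
    return issues

print(word_repeat_check("He said that that the plan works."))  # ['that']
print(whitespace_check("He liked the  plan ."))  # both issues flagged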

36
Evaluation
  • POS tagger: 93.58% accuracy
  • Sentence boundary detection: 96% recall, 97%
    precision
  • Style and grammar checker
  • BNC: 16 out of 75,900 marked erroneous
  • Mailing list error corpus: 42 out of 224 errors
    identified; MS Word identified 49