Dialog Design 3 - PowerPoint PPT Presentation

Provided by: johns82
Transcript and Presenter's Notes

1
Dialog Design 3
  • Speech and natural language

2
Agenda
  • Video
  • Speech generation
  • Speech recognition
  • Natural language interfaces
  • Project part 3
  • Mid-term exam preview

3
Speech Input
  • Speech synthesis
  • Speaker recognition
  • Speech recognition
  • Natural language understanding

4
English Speech
  • Made up of 40 phonemes: 24 consonants and 16 vowels

5
Speech Synthesis
  • Often hear robotic voice
  • Store tones, then put them together
  • -> The transition is the difficult thing to do
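
Why the transition is the hard part can be seen in a small sketch. The code below is only an illustration under assumptions, not the method used by any particular synthesizer: the stored units are plain sine tones, and `tone`, `crossfade_concat`, the sample rate, and the overlap length are all hypothetical names and values chosen for the example. Butting two stored units together leaves an abrupt jump at the join; blending a short overlap with a linear ramp is one simple way to soften it.

```python
import math


def tone(freq_hz, duration_s, rate=8000):
    """Return a sine tone as a list of float samples in [-1, 1].

    Stands in for a stored speech unit (an assumption for this sketch).
    """
    n = int(duration_s * rate)
    return [math.sin(2 * math.pi * freq_hz * i / rate) for i in range(n)]


def crossfade_concat(a, b, overlap):
    """Join two sample lists.

    The last `overlap` samples of `a` are blended with the first
    `overlap` samples of `b` using a linear ramp. With overlap <= 0
    the units are simply placed end to end.
    """
    if overlap <= 0:
        return a + b
    overlap = min(overlap, len(a), len(b))
    head, tail = a[:-overlap], b[overlap:]
    blended = []
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)  # weight of the incoming unit
        blended.append((1 - w) * a[len(a) - overlap + i] + w * b[i])
    return head + blended + tail


unit_a = tone(220.0, 0.05)
unit_b = tone(330.0, 0.05)
abrupt = crossfade_concat(unit_a, unit_b, 0)   # units placed end to end
smooth = crossfade_concat(unit_a, unit_b, 80)  # 80-sample blended join
```

The blended version is shorter by the overlap length, since the two units share those samples. Real speech units differ in pitch, loudness, and spectrum at the join, so a linear ramp is only the crudest form of smoothing.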

6
Speaker Recognition
  • Tell which person it is (voice print)
  • Could be important for monitoring meetings

7
Speech Recognition
  • Primarily identifying words
  • Improving all the time
  • Commercial systems
  • IBM ViaVoice, Dragon Dictate, ...

8
Recognition Dimensions
  • Speaker dependent/independent
    • Parametric patterns are sensitive to speaker
    • With training (dependent) can get better
  • Vocabulary
    • Some are getting 50,000 words
  • Isolated word vs. continuous speech
    • Continuous: where do words stop and begin?
  • Typically a pattern match, no context used
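
The isolated-word vs. continuous-speech distinction can be sketched with a simple energy threshold. This is a minimal illustration under assumptions, not how any commercial recognizer works: `find_word_boundaries`, the frame size, the threshold, and the toy signals are all hypothetical. Words separated by silence are easy to cut apart; words spoken without a pause come out as a single segment.

```python
def find_word_boundaries(samples, frame=4, threshold=0.1):
    """Return (start, end) sample index pairs for loud regions.

    A frame counts as speech when its mean absolute amplitude exceeds
    `threshold`; consecutive speech frames are merged into one segment.
    """
    segments = []
    start = None
    for pos in range(0, len(samples), frame):
        chunk = samples[pos:pos + frame]
        energy = sum(abs(s) for s in chunk) / len(chunk)
        if energy > threshold and start is None:
            start = pos                      # speech begins
        elif energy <= threshold and start is not None:
            segments.append((start, pos))    # speech ends
            start = None
    if start is not None:
        segments.append((start, len(samples)))
    return segments


silence = [0.0] * 8
word = [0.5, -0.6, 0.7, -0.5, 0.6, -0.4, 0.5, -0.6]  # toy "word"

isolated = silence + word + silence + word + silence
continuous = silence + word + word + silence

print(find_word_boundaries(isolated))    # [(8, 16), (24, 32)]
print(find_word_boundaries(continuous))  # [(8, 24)]
```

With pauses, the two words are found separately. Without them, the threshold alone cannot tell where one word stops and the next begins.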

9
Recognition Systems
  • Typical system has 5 components
    • Speech capture device - Has analog -> digital converter
    • Digital Signal Processor - Gets word boundaries, scales, filters, cuts out extra stuff
    • Preprocessed signal storage - Processed speech buffered for recognition algorithm
    • Reference speech patterns - Stored templates or generative speech models for comparisons
    • Pattern matching algorithm - Goodness of fit from templates/model to user's speech
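
The last two components (reference patterns and pattern matching) can be sketched as template matching. The example below uses dynamic time warping (DTW) as the goodness-of-fit measure; that choice is an assumption for illustration, since the slides do not name an algorithm. The one-number-per-frame "features", the `templates` dictionary, and the function names are likewise hypothetical.

```python
def dtw_distance(x, y):
    """Dynamic time warping distance between two 1-D feature sequences.

    Lower is a better fit. Warping lets a slow utterance line up with
    a faster stored template.
    """
    inf = float("inf")
    n, m = len(x), len(y)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # stretch the template
                cost[i][j - 1],      # stretch the input
                cost[i - 1][j - 1],  # advance both
            )
    return cost[n][m]


def recognize(signal, templates):
    """Return (best_word, score) for the closest stored template."""
    scores = {word: dtw_distance(signal, ref) for word, ref in templates.items()}
    best = min(scores, key=scores.get)
    return best, scores[best]


# Hypothetical reference patterns: one feature value per frame.
templates = {
    "yes": [1, 3, 5, 3, 1],
    "no": [5, 4, 2, 1, 1],
}
utterance = [1, 1, 3, 5, 5, 3, 1]  # "yes" spoken more slowly

print(recognize(utterance, templates))  # ('yes', 0.0)
```

The slower utterance still matches "yes" exactly because warping absorbs the timing difference. A real system would compare multi-dimensional spectral features rather than single numbers.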

10
Errors
  • Systems make four types of errors
    • Substitution - one for another
    • Rejection - detected, but not recognized
    • Insertion - added
    • Deletion - not detected
  • Which is more common, dangerous?
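
Three of these error types can be counted by aligning what was said against what the system output. The sketch below uses a standard minimum-edit-distance alignment over words; `count_errors` and the example sentences are hypothetical. Rejection is left out on purpose: it is an assumption of this sketch that a rejection has to be reported by the recognizer itself and cannot be inferred from the two word lists.

```python
def count_errors(reference, hypothesis):
    """Count substitutions, insertions, and deletions between word lists.

    `reference` is what the user said, `hypothesis` is the recognizer
    output. Rejections are not modelled (assumption: the recognizer
    would have to flag them itself).
    """
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = edit distance between reference[:i] and hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i
    for j in range(m + 1):
        dist[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            same = reference[i - 1] == hypothesis[j - 1]
            dist[i][j] = min(
                dist[i - 1][j - 1] + (0 if same else 1),  # match / substitution
                dist[i - 1][j] + 1,                       # deletion
                dist[i][j - 1] + 1,                       # insertion
            )

    # Walk back through the table to label each error.
    counts = {"substitution": 0, "insertion": 0, "deletion": 0}
    i, j = n, m
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and reference[i - 1] == hypothesis[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            counts["substitution"] += 1
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            counts["deletion"] += 1
            i -= 1
        else:
            counts["insertion"] += 1
            j -= 1
    return counts


said = "turn the lights off now".split()
heard = "turn a lights off right now".split()
print(count_errors(said, heard))
# {'substitution': 1, 'insertion': 1, 'deletion': 0}
```

Counting errors this way over a test set gives one way to approach the question of which type is more common. Which is more dangerous depends on the task.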

11
Natural Language Understanding
  • Putting meaning to the words
  • Input might be speech or could be typed in
  • Holy grail of Artificial Intelligence problems

12
NL Factors/Terms
  • Syntactic
    • Grammar or structure
  • Prosodic
    • Inflection, stress, pitch, timing
  • Pragmatic
    • Situated context of utterance, location, time
  • Semantic
    • Meaning of words

13
SR/NLU Assessment
  • Advantages
  • Disadvantages

14
SR/NLU Advantages
  • Easy to learn and remember
  • Powerful
  • Fast, efficient (not always)
  • Little screen real estate

15
SR/NLU Disadvantages
  • Doesn't work well enough yet
  • Assumes knowledge of problem domain
  • Not prompted, like menus
  • Requires typing skill (if keyboard)
  • Enhancements are invisible
  • Expensive to implement

16
Good in Situations
  • Hands busy
  • Mobility required
  • Eyes occupied
  • Conditions preclude use of keyboard
  • Visual impairment
  • Physical limitation

17
Project Part 3
  • Decide on a design
  • Implement a prototype
  • (Develop evaluation plan)

18
Mid-term Exam Review
  • Lectures
  • Chapters 1-7, 11.1-11.4
  • Short answer style questions