LING 138/238 SYMBSYS 138 Intro to Computer Speech and Language Processing

1
LING 138/238 SYMBSYS 138: Intro to Computer Speech
and Language Processing
  • Lecture 3, October 5, 2004
  • Dan Jurafsky

2
Week 2: Dialogue and Conversational Agents
  • Examples of spoken language systems
  • Components of a dialogue system, focusing on these 3:
    • ASR
    • NLU
    • Dialogue management
  • VoiceXML
  • Grounding and Confirmation

3
Conversational Agents
  • AKA:
    • Spoken Language Systems
    • Dialogue Systems
    • Speech Dialogue Systems
  • Applications:
    • Travel arrangements (Amtrak, United Airlines)
    • Telephone call routing
    • Tutoring
    • Communicating with robots
    • Anything with limited screen/keyboard

4
A travel dialog: Communicator
5
Call routing: AT&T HMIHY
6
A tutorial dialogue: ITSPOKE
7
Dialogue System Architecture
  • Simplest possible architecture: ELIZA
  • Read-search/replace-print loop
  • We'll need something with more sophisticated
    dialogue control
  • And speech
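
The read-search/replace-print loop can be sketched in a few lines of Python. This is only an illustration: the three rules and the default reply are hypothetical, not ELIZA's actual script, and the function names `respond` and `converse` are made up for this sketch.

```python
import re

# Hypothetical ELIZA-style rules (not from the lecture): each pairs a search
# pattern with a replacement template that reuses part of the user's input.
RULES = [
    (re.compile(r".*\bI am (.*)", re.IGNORECASE), r"Why are you \1?"),
    (re.compile(r".*\bI want (.*)", re.IGNORECASE), r"What would it mean if you got \1?"),
    (re.compile(r".*\bmy (mother|father)\b.*", re.IGNORECASE), r"Tell me more about your \1."),
]
DEFAULT_REPLY = "Please go on."


def respond(utterance: str) -> str:
    """Search the rules in order and apply the first replacement that matches."""
    text = utterance.strip().rstrip(".!?")
    for pattern, template in RULES:
        match = pattern.fullmatch(text)
        if match:
            return match.expand(template)
    return DEFAULT_REPLY


def converse(utterances):
    """The loop: read each utterance, search/replace, collect what to print."""
    return [respond(u) for u in utterances]
```

Note that `respond` looks only at the current utterance. Nothing is remembered between turns, which is exactly the limitation a real dialogue manager has to overcome.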

8
Dialogue System Architecture
9
ASR engine
  • ASR: Automatic Speech Recognition
  • Job of ASR system is to go from speech (telephone
    or microphone) to words
  • We will be studying this in a few weeks

10
ASR Overview (pic from Yook 2003)
11
ASR in Dialogue Systems
  • ASR systems work better if we can constrain what
    words the speaker is likely to say.
  • A dialogue system often has these constraints:
  • System: What city are you departing from?
  • Can expect sentences of the form:
  • I want to (leave|depart) from CITYNAME
  • From CITYNAME
  • CITYNAME
  • etc.
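
As a rough illustration of such a constraint, the expected answer forms can be written as one pattern over the recognizer's word string. This is a sketch only: real recognizers apply the constraint inside the search over audio rather than on text, and the city list and the name `parse_departure` are assumptions made for this example.

```python
import re

# Hypothetical city vocabulary; a real system would load this from a database.
CITIES = ["boston", "denver", "san francisco"]
CITY = "(?P<city>" + "|".join(re.escape(c) for c in CITIES) + ")"

# Covers: "I want to (leave|depart) from CITYNAME", "From CITYNAME", "CITYNAME".
EXPECTED = re.compile(
    rf"(?:(?:i want to (?:leave|depart) )?from )?{CITY}",
    re.IGNORECASE,
)


def parse_departure(words: str):
    """Return the departure city if the words fit an expected form, else None."""
    match = EXPECTED.fullmatch(words.strip())
    return match.group("city").lower() if match else None
```

Anything outside the expected forms returns `None`, which a dialogue system would treat as a recognition failure.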

12
ASR in Dialogue Systems
  • Also, can adapt to speaker
  • But!! ASR is errorful
  • So, unlike ELIZA, we can't count on the words being
    correct
  • As we will see, this fact about error plays a
    huge role in dialogue system design

13
Natural Language Understanding
  • Also called NLU
  • We will discuss this later in the quarter
  • There are many ways to represent the meaning of
    sentences
  • For speech dialogue systems, perhaps the most
    common is a simple one called frame-and-slot
    semantics.
  • Semantics = meaning

14
An example of a frame
  • Show me morning flights from Boston to SF on
    Tuesday.
  • SHOW:
    • FLIGHTS:
      • ORIGIN:
        • CITY: Boston
        • DATE: Tuesday
        • TIME: morning
      • DEST:
        • CITY: San Francisco

15
How to generate this semantics?
  • Many methods, as we will see in week 9
  • Simplest: semantic grammars
  • LIST -> show me | I want | can I see
  • DEPARTTIME -> (after|around|before) HOUR |
    morning | afternoon | evening
  • HOUR -> one|two|three|...|twelve (am|pm)
  • FLIGHTS -> (a) flight | flights
  • ORIGIN -> from CITY
  • DESTINATION -> to CITY
  • CITY -> Boston | San Francisco | Denver |
    Washington
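
One way to try these rules out is to transcribe them as regular expressions and tag the phrases they match. This is a sketch that assumes regexes are adequate for this toy grammar (the lecture uses a parser); the names `RULES`, `TAGGER`, and `label` are made up here, and the `(am|pm)` part is treated as optional.

```python
import re

# The slide's rules, transcribed as regex fragments.
CITY = r"(?:boston|san francisco|denver|washington)"
HOUR = (r"(?:one|two|three|four|five|six|seven|eight|nine|ten|eleven|twelve)"
        r"(?: (?:am|pm))?")
RULES = {
    "LIST": r"show me|i want|can i see",
    "FLIGHTS": r"(?:a )?flights?",
    "ORIGIN": rf"from {CITY}",
    "DESTINATION": rf"to {CITY}",
    "DEPARTTIME": rf"(?:after|around|before) {HOUR}|morning|afternoon|evening",
}

# One alternation with a named group per rule, so each match reports its label.
TAGGER = re.compile(
    "|".join(rf"\b(?P<{name}>{pattern})\b" for name, pattern in RULES.items()),
    re.IGNORECASE,
)


def label(sentence: str):
    """Return (semantic label, matched phrase) pairs in left-to-right order."""
    return [(m.lastgroup, m.group().lower()) for m in TAGGER.finditer(sentence)]
```

Words that no rule covers (such as "on Tuesday", since this slide has no date rule) are simply skipped.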

16
Semantics for a sentence
  • LIST     FLIGHTS  ORIGIN
  • Show me  flights  from Boston
  • DESTINATION       DEPARTDATE
  • to San Francisco  on Tuesday
  • DEPARTTIME
  • morning

17
Frame-filling
  • We use a parser (week 10) to take these rules and
    apply them to the sentence.
  • Resulting in a semantics for the sentence
  • We can then write some simple code
  • That takes the semantically labeled sentence
  • And fills in the frame.
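
The "simple code" for the last step might look like the sketch below. It assumes the parser has already produced (label, phrase) pairs, and it uses a flat four-slot frame rather than the nested frame shown earlier. The mapping table and the name `fill_frame` are hypothetical.

```python
# Maps a semantic label to (frame slot, number of leading function words to drop).
LABEL_TO_SLOT = {
    "ORIGIN": ("ORIGIN", 1),       # "from boston" -> "boston"
    "DESTINATION": ("DEST", 1),    # "to san francisco" -> "san francisco"
    "DEPARTDATE": ("DATE", 1),     # "on tuesday" -> "tuesday"
    "DEPARTTIME": ("TIME", 0),     # "morning" -> "morning"
}


def fill_frame(labeled_phrases):
    """Copy each labeled phrase into its slot; labels without a slot are ignored."""
    frame = {"ORIGIN": None, "DEST": None, "DATE": None, "TIME": None}
    for label, phrase in labeled_phrases:
        if label in LABEL_TO_SLOT:
            slot, skip = LABEL_TO_SLOT[label]
            frame[slot] = " ".join(phrase.split()[skip:])
    return frame
```

Slots the sentence did not mention stay `None`, which tells the dialogue manager what still needs to be asked.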

18
Other NLU Approaches
  • Cascade of Finite-State-Transducers
  • Instead of a parser, we could use FSTs, which are
    very fast, to create the semantics.
  • Or we could use Syntactic rules with semantic
    attachments
  • This latter is what is done in VoiceXML, so we
    will see that today.

19
Generation and TTS
  • Won't say much about this today
  • TTS next week!
  • Generation: two main approaches
  • Simple templates (prescripted sentences)
  • Unification: use similar grammar rules as for
    parsing, but run them backwards!

20
Dialogue Manager
  • ELIZA was the simplest dialogue manager
  • Read-search/replace-print loop
  • No state was kept; system did the same thing on
    every sentence
  • A real dialogue manager needs to keep state
  • We can't keep asking the same question over and
    over!

21
Three architectures for dialogue management
  • Finite State
  • Frame-based
  • Planning Agents

22
Finite State Dialogue Manager
23
Finite-state dialogue managers
  • System completely controls the conversation with
    the user.
  • It asks the user a series of questions
  • Ignoring (or misinterpreting) anything the user
    says that is not a direct answer to the system's
    questions
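
A finite-state manager of this kind can be sketched as a transition table. The state names, prompts, and the function `run_dialogue` are assumptions made for illustration, and user replies are passed in as a list so the sketch stays self-contained.

```python
# Each state: (prompt to speak, slot the reply is stored in, next state).
STATES = {
    "ask_origin": ("What city are you leaving from?", "origin", "ask_dest"),
    "ask_dest": ("Where are you going?", "dest", "ask_date"),
    "ask_date": ("What day would you like to leave?", "date", "done"),
}


def run_dialogue(replies):
    """Walk the states in their fixed order, storing one reply per question."""
    replies = iter(replies)
    state, answers, prompts = "ask_origin", {}, []
    while state != "done":
        prompt, slot, next_state = STATES[state]
        prompts.append(prompt)
        # Whatever the user says is taken as the answer to the current question.
        answers[slot] = next(replies)
        state = next_state
    return answers, prompts
```

If the first reply were "from Boston to Denver on Tuesday", the whole string would be stored as the origin and the system would still ask the other two questions, which is the misinterpretation described above.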

24
Dialogue Initiative
  • Initiative means who has control of the
    conversation at any point
  • Single initiative:
    • System
    • User
  • Mixed initiative

25
System Initiative
  • Systems which completely control the conversation
    at all times are called system initiative.
  • Advantages:
    • Simple to build
    • User always knows what they can say next
    • System always knows what user can say next
    • Known words: better performance from ASR
    • Known topic: better performance from NLU
  • Disadvantage:
    • Too limited

26
User Initiative
  • User directs the system
  • Generally, user asks a single question, system
    answers
  • System can't ask questions back or engage in
    clarification or confirmation dialogue
  • Used for simple database queries
  • User asks question, system gives answer
  • Web search is user initiative dialogue.

27
Problems with System Initiative
  • Real dialogue involves give and take!
  • In travel planning, users might want to say
    something that is not the direct answer to the
    question.
  • For example, answering more than one question in a
    sentence:
  • Hi, I'd like to fly from Seattle Tuesday morning
  • I want a flight from Milwaukee to Orlando one way
    leaving after 5 p.m. on Wednesday.

28
Single initiative universals
  • We can give users a little more flexibility by
    adding universal commands
  • Universals: commands you can say anywhere
  • As if we augmented every state of FSA with these:
  • Help
  • Correct
  • This describes many implemented systems
  • But still doesn't deal with mixed initiative
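
Augmenting every state with universals amounts to checking for them before the state's own handling. The sketch below assumes a linear two-question dialogue. The help text, the behaviour chosen for "correct" (go back one question), and the name `step` are all hypothetical.

```python
QUESTIONS = [
    ("origin", "What city are you leaving from?"),
    ("dest", "Where are you going?"),
]
HELP_TEXT = "Please answer the question, or say 'correct' to redo your last answer."


def step(index, answers, user_input):
    """One transition from question `index`; returns (new index, system output)."""
    word = user_input.strip().lower()
    # Universals are checked first, in every state.
    if word == "help":
        return index, HELP_TEXT + " " + QUESTIONS[index][1]
    if word == "correct":
        index = max(index - 1, 0)
        answers.pop(QUESTIONS[index][0], None)
        return index, QUESTIONS[index][1]
    # Otherwise behave like the plain finite-state manager.
    answers[QUESTIONS[index][0]] = user_input
    index += 1
    if index == len(QUESTIONS):
        return index, "Thank you."
    return index, QUESTIONS[index][1]
```

The system still decides which question comes next, so this remains single initiative.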

29
Mixed Initiative
  • Conversational initiative can shift between
    system and user
  • Simplest kind of mixed initiative: use the
    structure of the frame itself to guide dialogue
  • Slot: Question
  • ORIGIN: What city are you leaving from?
  • DEST: Where are you going?
  • DEPT DATE: What day would you like to leave?
  • DEPT TIME: What time would you like to leave?
  • AIRLINE: What is your preferred airline?

30
Frames are mixed-initiative
  • User can answer multiple questions at once.
  • System asks questions of user, filling any slots
    that user specifies
  • When frame is filled, do database query
  • If user answers 3 questions at once, system has
    to fill slots and not ask these questions again!
  • Anyhow, we avoid the strict constraints on order
    of the finite-state architecture.
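
A frame-based manager along these lines can be sketched as below, using the slot/question table from the previous slide. It assumes NLU has already turned the user's utterance into a slot-to-value dictionary. The names `update` and `next_question` are made up for this sketch.

```python
SLOT_QUESTIONS = {
    "ORIGIN": "What city are you leaving from?",
    "DEST": "Where are you going?",
    "DEPT DATE": "What day would you like to leave?",
    "DEPT TIME": "What time would you like to leave?",
    "AIRLINE": "What is your preferred airline?",
}


def new_frame():
    """An empty frame: every slot present but unfilled."""
    return {slot: None for slot in SLOT_QUESTIONS}


def update(frame, parsed):
    """Fill every slot the user mentioned, whichever question was asked."""
    for slot, value in parsed.items():
        if slot in frame:
            frame[slot] = value


def next_question(frame):
    """Ask about the first unfilled slot; None means the frame is complete."""
    for slot, question in SLOT_QUESTIONS.items():
        if frame[slot] is None:
            return question
    return None
```

For "I'd like to fly from Seattle Tuesday morning", the update fills three slots at once and the next question is about the destination, so none of the answered questions is asked again. When `next_question` returns `None`, the system can run the database query.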

31
Multiple frames
  • flights, hotels, rental cars
  • Flight legs: each flight can have multiple legs,
    which might need to be discussed separately
  • Presenting the flights (if there are multiple
    flights meeting user's constraints)
  • It has slots like 1ST_FLIGHT or 2ND_FLIGHT so user
    can ask "how much is the second one?"
  • General route information:
  • Which airlines fly from Boston to San Francisco?
  • Airfare practices:
  • Do I have to stay over Saturday to get a decent
    airfare?

32
Multiple Frames
  • Need to be able to switch from frame to frame
  • Based on what user says.
  • Disambiguate which slot of which frame an input
    is supposed to fill, then switch dialogue control
    to that frame.

33
VoiceXML
  • Voice eXtensible Markup Language
  • An XML-based dialogue design language
  • Makes use of ASR and TTS
  • Deals well with simple, frame-based mixed
    initiative dialogue.
  • Most common in commercial world (too limited for
    research systems)
  • But useful to get a handle on the concepts.

34
Voice XML
  • Each dialogue is a <form>. (Form is the VoiceXML
    word for frame)
  • Each <form> generally consists of a sequence of
    <field>s, with other commands

35
Sample vxml doc
  • <form>
  • <field name="transporttype">
  • <prompt>
  • Please choose airline, hotel, or rental
    car. </prompt>
  • <grammar type="application/x-nuance-gsl">
  • [airline hotel "rental car"]
  • </grammar>
  • </field>
  • <block>
  • <prompt>
  • You have chosen <value expr="transporttype"/>.
    </prompt>
  • </block>
  • </form>

36
VoiceXML interpreter
  • Walks through a VXML form in document order
  • Iteratively selecting each item
  • If multiple fields, visit each one in order.
  • Special commands for events
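
The interpreter's behaviour on a simple form can be imitated in Python. This is a deliberately simplified model, not the full VoiceXML form interpretation algorithm: fields are plain dictionaries, grammars are sets of accepted phrases, and a reply of `None` stands for silence. All names here are assumptions made for illustration.

```python
FORM = [
    {"name": "origin",
     "prompt": "Which city do you want to leave from?",
     "grammar": {"san francisco", "denver", "new york", "barcelona"}},
    {"name": "destination",
     "prompt": "And which city do you want to go to?",
     "grammar": {"san francisco", "denver", "new york", "barcelona"}},
]
NOINPUT = "I'm sorry, I didn't hear you."
NOMATCH = "I'm sorry, I didn't understand that."


def interpret(form, replies):
    """Visit fields in document order; reprompt on noinput/nomatch events."""
    replies = iter(replies)
    values, spoken = {}, []
    for field in form:
        while field["name"] not in values:
            spoken.append(field["prompt"])
            reply = next(replies)
            if reply is None:                         # noinput event
                spoken.append(NOINPUT)
            elif reply.lower() not in field["grammar"]:
                spoken.append(NOMATCH)                # nomatch event
            else:
                values[field["name"]] = reply.lower()
    return values, spoken
```

After an event message the loop goes round again and repeats the field's prompt, which plays the role of a reprompt.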

37
Another vxml doc (1)
  • <noinput>
  • I'm sorry, I didn't hear you. <reprompt/>
  • </noinput>
  • <nomatch>
  • I'm sorry, I didn't understand that. <reprompt/>
  • </nomatch>

38
Another vxml doc (2)
  • <form>
  • <block> Welcome to the air travel
    consultant. </block>
  • <field name="origin">
  • <prompt> Which city do you want to
    leave from? </prompt>
  • <grammar type="application/x-nuance-gsl">
  • [(san francisco) denver (new york)
    barcelona]
  • </grammar>
  • <filled>
  • <prompt> OK, from <value expr="origin"/>
    </prompt>
  • </filled>
  • </field>

39
Another vxml doc (3)
  • <field name="destination">
  • <prompt> And which city do you want to go
    to? </prompt>
  • <grammar type="application/x-nuance-gsl">
  • [(san francisco) denver (new york)
    barcelona]
  • </grammar>
  • <filled>
  • <prompt> OK, to <value
    expr="destination"/> </prompt>
  • </filled>
  • </field>
  • <field name="departdate" type="date">
  • <prompt> And what date do you want to
    leave? </prompt>
  • <filled>
  • <prompt> OK, on <value
    expr="departdate"/> </prompt>
  • </filled>
  • </field>

40
Another vxml doc (4)
  • <block>
  • <prompt> OK, I have you are departing from
  • <value expr="origin"/> to <value
    expr="destination"/> on <value expr="departdate"/>
  • </prompt>
  • send the info to book a flight...
  • </block>
  • </form>

41
A mixed initiative VXML doc
  • Mixed initiative: user might answer a different
    question
  • So VoiceXML interpreter can't just evaluate each
    field of form in order
  • User might answer field2 when system asked field1
  • So need grammar which can handle all sorts of
    input:
  • Field 1
  • Field 2
  • Field 1 and field 2
  • etc.

42
VXML Nuance-style grammars
  • Rewrite rules
  • Wantsentence -> I want to (fly|go)
  • Nuance VXML format is:
  • () for concatenation, [] for disjunction
  • Each rule has a name
  • Wantsentence (I want to [fly go])
  • Airports [(san francisco) denver]

43
Mixed-init VXML example (3)
  • <noinput> I'm sorry, I didn't hear you.
    <reprompt/> </noinput>
  • <nomatch> I'm sorry, I didn't understand that.
    <reprompt/> </nomatch>
  • <form>
  • <grammar type="application/x-nuance-gsl">
  • <![CDATA[

44
Grammar
  • Flight ( ?[
  • (i [wanna (want to)] [fly go])
  • (i'd like to [fly go])
  • ([(i wanna)(i'd like a)] flight)
  • ]
  • [
  • ( [from leaving departing] City:x)
    {<origin $x>}
  • ( [(?going to)(arriving in)] City:x)
    {<dest $x>}
  • ( [from leaving departing] City:x
  • [(?going to)(arriving in)] City:y)
    {<origin $x> <dest $y>}
  • ]
  • ?please
  • )

45
Grammar
  • City [ [(san francisco) (s f o)] {return( "san
    francisco, california")}
  • [(denver) (d e n)] {return( "denver,
    colorado")}
  • [(seattle) (s t x)] {return(
    "seattle, washington")}
  • ] ]]> </grammar>

46
Grammar
  • <initial name="init">
  • <prompt> Welcome to the air travel
    consultant. What are your travel plans?
    </prompt>
  • </initial>
  • <field name="origin">
  • <prompt> Which city do you want to leave
    from? </prompt>
  • <filled>
  • <prompt> OK, from <value expr="origin"/>
    </prompt>
  • </filled>
  • </field>

47
Grammar
  • <field name="dest">
  • <prompt> And which city do you want to go
    to? </prompt>
  • <filled>
  • <prompt> OK, to <value expr="dest"/>
    </prompt>
  • </filled>
  • </field>
  • <block>
  • <prompt> OK, I have you are departing from
    <value expr="origin"/>
  • to <value expr="dest"/>. </prompt>
  • send the info to book a flight...
  • </block>
  • </form>

48
Grounding and Confirmation
  • Dialogue is a collective act performed by speaker
    and hearer
  • Common ground: set of things mutually believed by
    both speaker and hearer
  • Need to achieve common ground, so hearer must
    ground or acknowledge speaker's utterance.
  • Clark (1996):
  • Principle of closure: agents performing an
    action require evidence, sufficient for current
    purposes, that they have succeeded in performing
    it

49
Clark and Schaefer Grounding
  • Continued attention: B continues attending to A
  • Relevant next contribution: B starts in on next
    relevant contribution
  • Acknowledgement: B nods or says continuer like
    uh-huh, yeah, or assessment (great!)
  • Demonstration: B demonstrates understanding A by
    paraphrasing or reformulating A's contribution,
    or by collaboratively completing A's utterance
  • Display: B displays verbatim all or part of A's
    presentation

51
Grounding examples
  • Display:
  • C: I need to travel in May
  • A: And, what day in May did you want to travel?
  • Acknowledgement:
  • C: He wants to fly from Boston
  • A: mm-hmm
  • C: to Baltimore Washington International

52
Grounding Examples (2)
  • Acknowledgement + next relevant contribution:
  • And, what day in May did you want to travel?
  • And you're flying into what city?
  • And what time would you like to leave?

53
Grounding and Dialogue Systems
  • Grounding is not just a tidbit about humans
  • Is key to design of conversational agent
  • Why?

54
Grounding and Dialogue Systems
  • Grounding is not just a tidbit about humans
  • Is key to design of conversational agent
  • Why?
  • HCI researchers find users of speech-based
    interfaces are confused when system doesn't give
    them an explicit acknowledgement signal
  • Experiment with this

55
Confirmation
  • Another reason for grounding
  • Speech is a pretty errorful channel
  • Hearer could misinterpret the speaker
  • This is important in Conv. Agents
  • Since we are using ASR, which is still really
    buggy.
  • So we need to do lots of grounding and
    confirmation

56
Explicit confirmation
  • S: Which city do you want to leave from?
  • U: Baltimore
  • S: Do you want to leave from Baltimore?
  • U: Yes

57
Explicit confirmation
  • U: I'd like to fly from Denver Colorado to New
    York City on September 21st in the morning on
    United Airlines
  • S: Let's see then. I have you going from Denver
    Colorado to New York on September 21st. Is that
    correct?
  • U: Yes

58
Implicit confirmation: display
  • U: I'd like to travel to Berlin
  • S: When do you want to travel to Berlin?
  • U: Hi, I'd like to fly to Seattle Tuesday morning
  • S: Traveling to Seattle on Tuesday, August
    eleventh in the morning. Your name?

59
Implicit vs. Explicit
  • Complementary strengths
  • Explicit: easier for users to correct system's
    mistakes (can just say "no")
  • But explicit is cumbersome and long
  • Implicit: much more natural, quicker, simpler (if
    system guesses right).

60
Implicit and Explicit
  • Early systems: all-implicit or all-explicit
  • Modern systems: adaptive
  • How to decide?
  • ASR system can give confidence metric.
  • This expresses how convinced system is of its
    transcription of the speech
  • If high confidence, use implicit confirmation
  • If low confidence, use explicit confirmation
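
The adaptive rule can be written as a single threshold test. The threshold value of 0.8, the prompt wording, and the name `choose_confirmation` are assumptions made for this sketch. A deployed system would tune the threshold on its own data.

```python
def choose_confirmation(city: str, confidence: float, threshold: float = 0.8) -> str:
    """Pick the next system prompt from the ASR confidence for the city slot."""
    if confidence >= threshold:
        # Implicit: display the value inside the next question and move on.
        return f"When do you want to travel to {city}?"
    # Explicit: ask a yes/no question before accepting the value.
    return f"Do you want to travel to {city}?"
```

With high confidence the user hears "When do you want to travel to Berlin?" and can still object. With low confidence the system stops to ask first.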

61
Next Lecture
  • Dialogue acts
  • More on VXML
  • More on design of dialogue agents
  • Evaluation of dialogue agents
  • Don't forget to look at the homework early!!!!