Intelligent Help (or lack thereof) in Spoken Dialog Systems

Provided by: SCS6
Learn more at: http://www.cs.cmu.edu

1
HELP!
  • Intelligent Help (or lack thereof) in Spoken
    Dialog Systems
  • Dialogs on Dialogs discussion
  • Stefanie Tomko
  • 20-Feb-04

2
Papers
  • Adding intelligent help to mixed-initiative
    spoken dialogue systems. G. Gorrell, I. Lewin,
    and M. Rayner. In Proc. of ICSLP, 2002.
  • Targeted help for spoken dialogue systems:
    intelligent feedback improves naive users'
    performance. B.A. Hockey, O. Lemon, E. Campana,
    L. Hiatt, G. Aist, J. Hieronymus, A. Gruenstein,
    and J. Dowding. In Proc. of EACL, 2003.
  • ????? (there isn't a lot out there about this!)

3
We need Help!
  • 56% of NL system users in the experiment asked for
    help without explicit knowledge that they could
    do so
  • Speech Graffiti users knew about various
    help/orientation keywords
  • 91% used "options"
  • 70% used "where was I?"
  • 48% used "help"

4
What is Help?
How do I do <something>?
I didn't understand what you said.
How do I say that?
Where was I?
What can I do?

5
User-initiated Help examples
  • NL Movieline
  • Wordy, general
  • This system allows you to obtain movie and
    theater information for Pittsburgh. You can ask
    for the location, phone number, and movie listing
    for a certain theater. Or, you can ask about a
    particular movie to get the rating and/or genre
    and find out where it is playing. Specify both a
    movie and theater to obtain showtimes. If you get
    stuck, you can say Reset, to start over.
  • Jupiter
  • Example based
  • You can ask about general weather forecasts as
    well as information on temperature, windspeed,
  • Try saying one of the following: 'what's the
    weather for Denver?' 'what cities do you know
    about?' 'what do you know about besides weather?'
    'what can I say?'
  • Try saying one of the following: 'are there any
    advisories for the United States?' 'what is the
    extended forecast for Boston?' 'will it rain in
    Toronto?'

6
User-initiated Help examples
  • Speech Graffiti
  • Somewhat "state" based
  • "slot options": "You can say, rating is... G, PG,
    PG-13, R, NC-17, not rated, or you can ask, what
    is the rating?"
  • "options": "You can specify or ask about title, show
    time, day"
  • "help": gives a list of keywords on the 1st round,
    then gives an explanation of keyword functions
  • TellMe
  • Orientation, or, at main level, lots of general
    system info
  • You're in Sports, in the NHL section.

7
These are all kind of "dumb"
  • They might not take system state into account
  • They aren't smart about what users really want to
    do
  • They might not tell users exactly how to speak
  • They might not orient users to where they are in
    the system
  • But at least they give users some information

8
System-initiated "Help" examples
  • NL Movieline
  • Excuse me?
  • Didn't catch that.
  • Jupiter
  • Pardon me?
  • Speech Graffiti
  • I'm sorry, I'm having trouble understanding you
  • TellMe
  • I'm sorry, I didn't get that. Please say a
    category in Travel.
  • These are really dumb!

9
Intelligent/Targeted Help
  • Makes system-initiated help a little smarter
  • Goal: provide immediate feedback, tailored to
    what the user said, for cases in which the system
    was not able to understand an utterance
  • Kind of different perspective compared to
    traditional error handling

What should I do to deal with this error?
How can I help the user not make this error in
the future?

10
Gorrell et al ICSLP paper
  • Grammar-based vs. statistical LMs
  • Grammars easy to create (?)
  • GB performs better if users know what to say
  • SLMs better for unusual/less constrained utts
  • 1st attempt: recognition only (i.e., no help)
  • Run all utts through GBLM & SLM, choose based on
    confidence scores
  • Not reliable enough
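
The first-attempt strategy on this slide (run every utterance through both recognizers and keep the more confident result) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Hypothesis` type, the `choose_hypothesis` function, the example scores, and the rejection threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Hypothesis:
    source: str        # "GBLM" or "SLM" (illustrative labels)
    text: str          # recognized word string
    confidence: float  # assumed comparable across both recognizers


def choose_hypothesis(gblm: Hypothesis, slm: Hypothesis,
                      reject_below: float = 0.5) -> Optional[Hypothesis]:
    """Keep the more confident hypothesis, or return None to reject both."""
    best = max(gblm, slm, key=lambda h: h.confidence)
    return best if best.confidence >= reject_below else None


# Made-up scores, purely for illustration.
picked = choose_hypothesis(
    Hypothesis("GBLM", "turn on the kitchen light", 0.82),
    Hypothesis("SLM", "turn on the chicken light", 0.64),
)
print(picked.source if picked else "rejected")
```

The sketch assumes the two recognizers' confidence scores can be compared directly; the slide reports that this selection scheme was not reliable enough in practice.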

11
On/Off House
  • User initiative
  • Natural language
  • Turn off the light in the bathroom
  • Are the hall and kitchen lights switched on?
  • Could you tell me which lights are on?

12
Targeted Help

13
Classification
  • Hand-classified training set
  • 12 classes
  • 24 features
  • Most common classes:
  • REFEXP_COMMAND (35%)
  • I didn't quite catch that. To turn a device on or
    off, you could try something like 'turn on the
    kitchen light.'
  • LONG_COMMAND (13%)
  • I didn't quite catch that. Long commands can be
    difficult to understand. Perhaps try giving
    separate commands for each device.
  • PRON_COMMAND (11%)
  • I didn't quite catch that. To change the status
    of a device or group of devices you've just
    referred to, you could try for example 'turn it
    on' or 'turn them off.'
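
Once an utterance has been assigned a class, producing the targeted prompt is essentially a lookup. The sketch below reuses the three class names and messages from this slide; the `help_for` function and the generic fallback prompt are assumptions added for illustration, and the classifier itself (12 classes, 24 features) is not reproduced.

```python
# Class names and prompts are taken from the slide; only the three most
# common of the 12 classes are listed there.
HELP_MESSAGES = {
    "REFEXP_COMMAND": (
        "I didn't quite catch that. To turn a device on or off, you could "
        "try something like 'turn on the kitchen light.'"
    ),
    "LONG_COMMAND": (
        "I didn't quite catch that. Long commands can be difficult to "
        "understand. Perhaps try giving separate commands for each device."
    ),
    "PRON_COMMAND": (
        "I didn't quite catch that. To change the status of a device or "
        "group of devices you've just referred to, you could try for "
        "example 'turn it on' or 'turn them off.'"
    ),
}

# Hypothetical generic prompt for classes without a targeted message.
FALLBACK_MESSAGE = "I didn't quite catch that."


def help_for(utterance_class: str) -> str:
    """Return the targeted prompt for a class, or the generic fallback."""
    return HELP_MESSAGES.get(utterance_class, FALLBACK_MESSAGE)


print(help_for("LONG_COMMAND"))
```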

14
Evaluation
  • Baseline classification error: 65%
  • Cross-validated final decision tree error: 12%
  • Between-subjects user study task:
  • call a voice-enabled house and leave it in a secure
    state
  • No training
  • Targeted help (N=16) vs. control help (N=15)

15
Results

16
Results (2)
  • Targeted help group had more variety in
    constructions
  • Targeted help users requested help more often
  • Six TH users vs. only one (!) control user
  • Longer dialogs in TH groups
  • Some of this is system exploration
  • No significant differences in awareness of final
    house state or perception of systems' abilities
  • No comparison of task completion

17
Hockey et al EACL paper
  • Domain: WITAS, command & control for a robotic
    helicopter
  • Targeted Help is an independent module

[Flowchart: decision nodes "Grammar-based LM parsable?" and "SLM parsable?", each with yes/no branches; action nodes "Send to SLM", "Play regular output", and "Create & play appropriate help message"]

18
Help message content
  • Message contains one or more of:
  • A. What the system heard
  • A report of the backup SLM recognition hypothesis
  • B. What the problem was (diagnostic)
  • A description of the problem with the user's
    utterance
  • C. What you might say instead
  • A similar in-grammar example
  • Rule-based determination of exact content
    for B & C
  • Not clear how often A, B, & C appear, or in what
    combinations
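
Assembling a message from whichever of parts A, B, and C are available could look roughly like the sketch below. The function name `build_help_message` and the exact wording of each part are hypothetical; the slide only says that a message contains one or more of the three parts.

```python
from typing import Optional


def build_help_message(heard: Optional[str] = None,
                       diagnosis: Optional[str] = None,
                       example: Optional[str] = None) -> str:
    """Join the available parts in A, B, C order (wording is illustrative)."""
    parts = []
    if heard:        # A. what the system heard (backup SLM hypothesis)
        parts.append(f"I heard: '{heard}'.")
    if diagnosis:    # B. what the problem was
        parts.append(diagnosis)
    if example:      # C. a similar in-grammar example
        parts.append(f"You could try saying: '{example}'.")
    if not parts:
        raise ValueError("a help message needs at least one of A, B, or C")
    return " ".join(parts)


print(build_help_message(heard="zoom in on the red car", example="zoom in"))
```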

19
B. Diagnostic
  • Endpointing
  • Check if initial recognized word is an OK initial
    parsable-input word
  • Out-of-vocabulary
  • Compare SLM vocab to GBLM vocab
  • Subcategorization
  • Check features of verbs in SLM hypothesis
  • "Zoom in" is intransitive
  • => !Zoom in on the red car
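
The three diagnostic checks on this slide can be illustrated with simple word-level tests on the SLM hypothesis. This is a sketch under stated assumptions, not the system's actual rules: the vocabulary, the list of valid initial words, the intransitive-verb lexicon, and all function names are hypothetical.

```python
from typing import List, Optional, Set

# Hypothetical toy resources; a real system would derive these from its grammar.
GRAMMAR_VOCAB: Set[str] = {"zoom", "in", "fly", "to", "the", "red", "car",
                           "land", "at", "tower"}
VALID_INITIAL_WORDS: Set[str] = {"zoom", "fly", "land"}
INTRANSITIVE_VERBS: Set[str] = {"zoom in"}


def endpointing_problem(hypothesis: str, valid_initial: Set[str]) -> bool:
    """True if the first recognized word cannot start a parsable input."""
    words = hypothesis.lower().split()
    return bool(words) and words[0] not in valid_initial


def out_of_vocabulary(hypothesis: str, grammar_vocab: Set[str]) -> List[str]:
    """Words in the SLM hypothesis that the grammar's vocabulary lacks."""
    return [w for w in hypothesis.lower().split() if w not in grammar_vocab]


def subcat_problem(hypothesis: str, intransitive: Set[str]) -> Optional[str]:
    """Return an intransitive verb that was followed by extra material."""
    text = hypothesis.lower().strip()
    for verb in intransitive:
        if text.startswith(verb + " "):
            return verb
    return None


print(out_of_vocabulary("fly to the purple car", GRAMMAR_VOCAB))
print(subcat_problem("zoom in on the red car", INTRANSITIVE_VERBS))
```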

20
C. In-grammar example
  • Try to use words & dialog-move type from user's
    original utterance
  • wh-question
  • yn-question
  • answer
  • command
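
One simple way to pick an in-grammar example that reuses the user's words and dialog-move type is to rank stored examples of that move type by word overlap. The sketch below is only an illustration of that idea: the example sentences, the `pick_example` function, and the overlap scoring are assumptions, not the method described in the paper.

```python
from typing import Dict, List

# Hypothetical in-grammar examples, keyed by the four dialog-move types.
EXAMPLES: Dict[str, List[str]] = {
    "wh-question": ["where is the red car"],
    "yn-question": ["is the red car at the tower"],
    "answer": ["yes", "the red car"],
    "command": ["fly to the red car", "zoom in", "land at the tower"],
}


def pick_example(user_utterance: str, move_type: str,
                 examples: Dict[str, List[str]] = EXAMPLES) -> str:
    """Return the example of this move type sharing the most words."""
    user_words = set(user_utterance.lower().split())
    candidates = examples[move_type]
    # max() keeps the first candidate when overlap counts are tied.
    return max(candidates, key=lambda ex: len(user_words & set(ex.split())))


print(pick_example("land near the tower", "command"))
```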

21
Evaluation
  • Between-groups user study
  • Targeted help vs. no help
  • Was user-initiated help available?
  • N=20, 5 tasks each
  • Only T1 & T5 assessed
  • Locate an x and then land at the y

22
Results
  • Significantly fewer TH users gave up on tasks
  • Control users gave up on 39% of tasks
  • TH users gave up on only 6%
  • Time to completion effects
  • Hard to measure "completion"!
  • Task (=> users get better over time)
  • Help x Task
  • Help alone
  • (p<.1 in "lenient" analysis)

23
Discussion
  • Definitely an improvement over "dumb" options
  • How easy are these options to automate and port
    to new domains/systems?
  • Classifier version needs training data
  • Rule-based version needs rules
  • Is there such a thing as too smart?
  • The system doesn't understand the word X
  • The system doesn't understand the word X used
    with the red car

24
Discussion (2)
  • Do grammaticality improvements fostered by TH
    persist?
  • How frequently is TH activated?
  • Does frequency decrease over time?
  • At a faster rate cf. plain-old help?
  • In rule-based system, how often do both LMs fail?

25
Discussion (3)
  • How often does either system (esp. rule-based)
    provide inappropriate help?
  • Wrong dialogue-move type?
  • Wrong vocabulary?
  • What % of 1st-utt-after-TH are grammatical?
  • cf. plain-old help
  • Are there other ways to implement/supplement TH?
  • State information?
  • Back-off to directed dialog? (in worst case)

26
Anything else?
  • Let me know if you come across any more
    references to this sort of thing