A Newbie Experience of Dialogue System Construction Using the Ravenclaw Framework

Provided by: Arthu61
Learn more at: http://www.cs.cmu.edu
1
A Newbie Experience of Dialogue System
Construction Using the Ravenclaw Framework
  • Arthur Chan

2
Introduction
  • Do you know?
  • Arthur Chan actually takes classes at CMU!
  • Course he took this year:
  • Project Course: Dialogue Systems
  • The course required the use of Ravenclaw/Olympus
  • A journal was kept of what I learned in the
    process
  • Requested by gang members such as Dan and Thomas

3
Speaker's Bio
  • Mainly a speech recognition guy
  • i.e. the part that transforms speech to text
  • Not very experienced in dialogue systems
  • Only worked on directed dialogue systems
  • SpeechWorks 6.5
  • i.e. an all-in-one dialogue system / speech
    recognizer
  • Dialogues are modularized
  • E.g. Digits, Alphabets, ZipCode

4
What did we do this year?
  • 3 systems by 3 groups
  • RoadFinder: Aaron, Dave and Wen
  • ICSLPInfo: Arthur, Lingyun and Rohit
  • Extension of Vera: Mohit, Kaimin and ?
  • The actual situation
  • Dave did most of the stunts
  • Each group had one person just to take care of
    development kick-start and system issues
  • The mailing list became the means of collaboration

5
New Development
  • Sphinx3_Engine
  • With Sphinx 3.6 RC1
  • With Powerful Wideband Models (CALO) and
    Narrowband Models (Communicator)
  • LM Training Scripts
  • With tools newly built in Project L
    (CMU-Cambridge LM Toolkit V3)
  • IAX_Server
  • Allows systems to be used with an Asterisk server (?)

6
This talk
  • Mini Case Study of ICSLPInfo
  • Tried to learn what information we could give to
    users at a conference
  • The type of information is unknown
  • Two perspectives
  • From a new user's perspective
  • From a developer's perspective

7
The New User's Perspective
  • Generally, as a new user, is it easy to learn
    Ravenclaw?
  • Related Questions
  • Do I hate Dan? (Forever? Or even for a moment?)
  • Is it scary to use Ravenclaw?
  • What do we know / not know at a given stage?
  • What is the overall impression of the software?

8
The Developer's Perspective
  • From a developer's standpoint, what are the
    issues in development?
  • Issues in speech recognition?
  • Issues in dialogue system development?
  • Issues in general application development?
  • Issues in multi-developer development?
  • When should we work on SR/RP/DS/BE?

9
The Development Process
  • Stage 0: Planning, drawing diagrams and such
  • Stage 1: Making some existing systems run
  • Stage 2: Making simple systems run
  • Making SR work without the backend
  • Making the backend work without SR
  • Stage 3: Making the first end-to-end system run
  • (Not covered today) Stage 4: Final adjustment and
    final demo

10
Stage 0: Planning (2-3 weeks)
  • Major issue
  • The type of useful information could be unknown
  • Author?
  • Session?
  • Title?
  • Venue?
  • We actually didn't know which was the most useful
    at Stage 0

11
Stage 1: Making some existing systems run (1 month)
  • Wide variety of pre-built systems using
    Ravenclaw
  • Path 1: Starting from ConvertProj
  • ConvertProj is a very simple project
  • Path 2: Starting from RoomLine
  • Path 3: Starting from scratch
  • Path 1 was chosen first so that everyone could
    get an initial system

12
Notes on Stage 1
  • Not everyone had an easy time getting the initial
    setup running (1-2 weeks)
  • Forgot to install ActivePerl and miscellaneous
    tools
  • At the beginning, didn't know where to debug
  • The synthesizer turned out not to be pre-built
    (1-2 weeks)
  • The speech recognizer was not running yet
  • Didn't know why at that point.

13
If we start from ConvertProj
  • How do we write the first system then?
  • ConvertProj is very simple but we didn't know
    what it does
  • We didn't understand how Phoenix/Ravenclaw works
  • Rohit: "Let us start from Roomline then."
  • Turned out to be a very good idea
  • Why?
  • Roomline is complicated but the learner can learn
    from the code
  • There are also a couple of patterns that could be
    reused, e.g. for-loop, if-then-else

14
Note
  • We had already got hold of the Description of the
    Ravenclaw Agent Description Language
  • Not a tutorial, no examples
  • We didn't know how to start based on it
  • That's why a template was needed
  • We ended up tracing the whole Roomline system

15
Stage 2a: Making a system with working SR
  • Our biggest problem: Name Recognition
  • Recognizing 1000 names
  • Many of them are Asian names
  • No training data
  • Dave hadn't built the LM building script yet
  • The type of information was not yet set
  • Should we handle names?

16
Stage 2a: Making a system with working SR (cont.)
  • Our first bootstrapping system
  • Use Sphinx3_Engine with the CALO model
  • Probably the strongest SR we could use
  • Use the Roomline language model
  • Just tweaked the grammar a little bit
  • Added a lot of compound words into classes
  • Also, only session chairs (180 names) are in the
    grammar

17
The First System (No BE)
[Diagram: dialogue task tree rooted at icslpinfo, with nodes Reset DateTime, Welcome, Logout, Task, Request Satisfied, Inform Logout, HMIHY]

18
Notes on Stage 2a
  • Finally got something running
  • But the system did nothing
  • We were still very vague on
  • how messages are passed in Galaxy and
  • how results are transferred from SR to RP to DM

19
Stage 2b: Making the backend work without the SR
  • The backend was finally built at this stage
  • The backend/DM/RP works, and text console mode
    works
  • The DM now gives the abstract when asked about the
    author
  • But this time, SR fails because
  • the grammar accepts too much
  • the Roomline LM was used.

20
Note at Stage 2b
  • Another difficult issue shows up
  • SR/RP/DM are very tightly coupled with each other
  • Other problems
  • Occasionally, is shown in the prompts
  • Because some prompts weren't filled in
  • Good part
  • The first type of information we will handle is
    finally decided
  • This constrains SR
  • We started to feel time was running short

21
Stage 3: Making the first end-to-end system run
  • Speech Recognition
  • Retrained the LM using faked corpora
  • Significantly trimmed down the number of authors
    to recognize (from 200 to 30)
  • Even so, only a few author names are recognized
    easily.
  • The lucky ones:
  • Alan Black
  • Arthur Toth
  • Julia Hirschberg
  • Andrew Rosenberg
  • (Alex is not very happy about this. His name is
    confused with context key)
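
A minimal sketch of how a "faked corpus" for LM retraining could be generated from a short author list. The four names come from the slide; the sentence templates, the upper-casing, and the function name fake_corpus are assumptions made for illustration, not the scripts the group actually used.

```python
# Author names taken from the slide above.
NAMES = ["Alan Black", "Arthur Toth", "Julia Hirschberg", "Andrew Rosenberg"]

# Hypothetical query templates; the real corpus wording is not in the slides.
TEMPLATES = [
    "PAPERS BY {name}",
    "I WANT PAPERS WRITTEN BY {name}",
    "WHAT DID {name} PUBLISH",
]


def fake_corpus(names, templates):
    """Expand every template with every name to get synthetic training text."""
    return [t.format(name=n.upper()) for n in names for t in templates]


corpus = fake_corpus(NAMES, TEMPLATES)
for sentence in corpus:
    print(sentence)
```

Trimming the name list shrinks the corpus and the vocabulary together, which is one plausible reason the cut from 200 to 30 authors made recognition more tractable.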

22
Notes at this point
  • Started to realize that SR couldn't be improved
    quickly
  • The problems of the DM started to be glaring
  • No disambiguation
  • When multiple results are returned, there is no
    strategy to handle them.
  • Also, SR kept failing to recognize things in the
    grammar.
  • A lot of GARBAGE is recognized
  • See a lot On Alan Black

23
DM
  • Allows disambiguation using author name and
    session name
  • Takes care of different scenarios of results
  • If there are no results,
  • Say sorry and restart.
  • If there is one result,
  • Present the details of the paper,
  • Then ask whether to present the abstract of the
    paper
  • If there are 5 or fewer results,
  • Tell the user the number of papers found
  • Then ask whether to present the summary of the
    papers.
  • (List of titles of the papers)
  • If there are more than 5 results,
  • Say sorry
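
The result-handling strategy above can be sketched as a small branching function. This is an illustration in Python, not Ravenclaw code: the Paper type, the prompt wording, and the action labels are hypothetical; only the four cases and the threshold of 5 come from the slide.

```python
from dataclasses import dataclass

MAX_LISTED = 5  # threshold from the slide


@dataclass
class Paper:
    title: str
    authors: str
    session: str
    abstract: str


def plan_response(papers):
    """Return (prompt, next_action) for the list of papers the backend found."""
    n = len(papers)
    if n == 0:
        # No results: say sorry and restart.
        return ("Sorry, I could not find any paper.", "restart")
    if n == 1:
        # One result: present the details, then offer the abstract.
        p = papers[0]
        prompt = (f"I found '{p.title}' by {p.authors} in session {p.session}. "
                  "Would you like to hear the abstract?")
        return (prompt, "offer_abstract")
    if n <= MAX_LISTED:
        # A few results: report the count, then offer the list of titles.
        return (f"I found {n} papers. Would you like to hear the titles?",
                "offer_titles")
    # Too many results: the slide only says "say sorry".
    return ("Sorry, too many papers match.", "sorry")
```

For example, plan_response([]) yields the "restart" action, while a list of six papers yields "sorry".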

24
Other Small Things We Hacked
  • Confidence of the recognizer
  • The audio server is hacked such that
  • we are always confident about the results.
  • Annoying restarting issue
  • Commented out the restarting routine in Windows

25
Backend and NLG
  • Backend
  • (maybe for this demo only)
  • SQL-based
  • Can do author search and session-name search
  • NLG
  • Fills in all sorts of prompts
  • Many implicit and explicit confirmations are
    missing
  • That caused a lot of in the system
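
A minimal sketch of what an SQL-based backend with author search and session-name search might look like, using Python's built-in sqlite3 module. The table layout, the function names, and the placeholder rows are all assumptions for illustration; the slides do not describe the real schema or database engine.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'papers' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers (title TEXT, author TEXT, session TEXT, abstract TEXT)"
    )
    # Placeholder rows only; these are not real conference papers.
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?, ?)",
        [
            ("Placeholder Paper A", "Author One", "Session X", "..."),
            ("Placeholder Paper B", "Author One", "Session Y", "..."),
            ("Placeholder Paper C", "Author Two", "Session X", "..."),
        ],
    )
    return conn


def search_by_author(conn, author):
    """Return (title, session) rows whose author field contains the given name."""
    cur = conn.execute(
        "SELECT title, session FROM papers WHERE author LIKE ? ORDER BY title",
        (f"%{author}%",),
    )
    return cur.fetchall()


def search_by_session(conn, session):
    """Return (title, author) rows for an exact session name."""
    cur = conn.execute(
        "SELECT title, author FROM papers WHERE session = ? ORDER BY title",
        (session,),
    )
    return cur.fetchall()


conn = build_demo_db()
print(search_by_author(conn, "author one"))
print(search_by_session(conn, "Session X"))
```

Parameterized queries are used so that whatever string the recognizer produces is passed as data rather than spliced into the SQL text.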

26
Demo
  • Scenario
  • A user wants information on the papers written
    by
  • Alan Black
  • Julia Hirschberg and
  • Andrew Rosenberg
  • What it shows
  • How bad recognition is handled now.
  • What happens when a single answer or multiple
    answers are returned.

27
Note
  • Rohit Kummar and Lingyun Gao actually hold the
    latest and greatest system.
  • This system only shows how we built up from
    ground zero.

28
Summary: 3 Difficult Issues in the Task
  • 1. Tight coupling of SR/RP/DM
  • When one part is right, the others could fail
  • 2. SR issues
  • The SR task could be affected by different
    constraints.
  • The first system is hard to get up and running
  • Compounded with 1
  • 3. Lack of documentation for the DM
  • The current documentation base is not strong
    enough
  • The read-and-implement approach doesn't work yet
  • Some concepts are difficult to understand
  • E.g. COMPLETE/SUCCEED/FAILED
  • GRAMMAR_MAPPING

29
Lessons learned
  • Iteratively develop the system by bootstrapping
    each part with simple systems
  • This would greatly reduce the pain of coupling
  • SR issue
  • The first system could be completed with some
    smaller grammars first
  • In some tasks, SR shouldn't be the focus at
    certain points.
  • Aligned with common observation
  • DM Development
  • A good working template is necessary
  • What we need: for-loop and if-then-else templates

30
The bright side 1: a birthday gift for Dave
  • Once understood, pretty easy to program
  • E.g. birthday celebration system
  • Sample Dialogue
  • S: Do you want to know what's going on?
  • U: Yes (or No)
  • S: No matter whether you say yes or no, I will
    have to tell you. Begin message.
  • Hmm-hm. Today is Mr David Huggins-Daines'
    birthday. Because everyone is too shy to sing
    the birthday song for him, me, Frank, will have
    to sing it. Here you go. Happy Birthday to you,
    Happy Birthday to you. Happy Birthday to David.
    Happy Birthday to you. This message is brought to
    you by
  • End message

31
Bright Side 2
  • Compared to a directed dialogue system, the
    current system can give unexpected results.
  • Why?
  • Several sub-systems of the dialogue system are
    working together
  • Built-in libraries
  • Grounding
  • Focuses
  • Developer-defined libraries
  • It is delightful to use it in general

32
Bright Side 3
  • Source code has consistent coding style
  • Development problems will mainly stem from
  • 1. Lack of automatic regression tests
  • 2. Lack of a central manager
  • Not a bad thing in dialogue system if
    developer/system 1

33
Conclusion
  • Summarized how the end-to-end ICSLPInfo system
    was first developed
  • Discussed several issues including
  • Coupling of systems
  • SR
  • DM development
  • Overall
  • Thrilled when the system got running and
    working