A Newbie Experience of Dialogue System Construction Using the Ravenclaw Framework

Slides: 34

Provided by: Arthu61

Learn more at: http://www.cs.cmu.edu

1
A Newbie Experience of Dialogue System
Construction Using the Ravenclaw Framework

Arthur Chan

2
Introduction

Do you know?
Arthur Chan actually takes classes at CMU!
Course he took this year:
Project Course: Dialogue Systems
The course required the use of Ravenclaw/Olympus
I kept a journal of what I learned in the
process
Requested by gang members such as Dan and Thomas

3
Speaker's Bio

Mainly a speech recognition guy
i.e. the part that transforms speech to text
Not very experienced in dialogue systems
Only worked on a directed dialogue system
SpeechWorks 6.5
i.e. an all-in-one dialogue system and speech
recognizer
Dialogues are modularized
E.g. Digits, Alphabets, ZipCode

4
What did we do this year?

3 systems by 3 groups
RoadFinder: Aaron, Dave and Wen
ICSLPInfo: Arthur, Lingyun and Rohit
Extension of Vera: Mohit, Kaimin and ?
The actual situation
Dave did most of the stunts
Each group had one person just to take care of
the development kick-start and system issues
The mailing list became the means of collaboration

5
New Development

Sphinx3_Engine
With Sphinx 3.6 RCI
With Powerful Wideband Models (CALO) and
Narrowband Models (Communicator)
LM Training Scripts
With tools newly built in Project L
(CMU-Cambridge LM Toolkit V3)
IAX_Server
Allows systems to be used with an Asterisk server (?)

6
This talk

Mini Case Study of ICSLPInfo
Try to learn what information we could give to
users for a conference
The type of information is unknown
Two perspectives
From a new user's perspective
From a developer's perspective

7
The New User's Perspective

Generally, as a new user, is it easy to learn
Ravenclaw?
Related Questions
Do I hate Dan? (Forever? Or even for a moment?)
Is it scary to use Ravenclaw?
What do we know/not know at a certain stage?
What is the general comment on the software?

8
The Developer's Perspective

From a developer's standpoint, what are the
issues of development?
Issues in speech recognition?
Issues in dialogue system development?
Issues in general application development?
Issues in multi-developer development?
When should we work on SR/RP/DS/BE?

9
The Development Process

Stage 0: Planning, drawing diagrams and stuff
Stage 1: Making some existing systems run
Stage 2: Making simple systems run
Making SR work without the backend
Making the backend work without SR
Stage 3: Making the first end-to-end system
run
(Not covered today) Stage 4: Final adjustment and
final demo

10
Stage 0: Planning (2-3 weeks)

Major issue
The type of useful information could be unknown
Author?
Session?
Title?
Venue?
We actually didn't know which was the most useful
at Stage 0

11
Stage 1: Making some existing systems run (1 month)

A wide variety of pre-built systems use
Ravenclaw
Path 1: Starting from ConvertProj
ConvertProj is a very simple project
Path 2: Starting from RoomLine
Path 3: Starting from scratch
Path 1 was chosen first so that everyone could
get an initial system

12
Note on Stage 1

Not everyone had an easy time getting the initial
setup running (1-2 weeks)
Forgot to install ActivePerl and miscellaneous
tools
At the beginning, didn't know where to debug
The synthesizer turned out not to be pre-built
(1-2 weeks)
The speech recognizer was not running yet
Didn't know why at that point.

13
If we start from ConvertProj

How do we write the first system then?
ConvertProj is very simple, but we didn't know
what it does
We didn't understand how Phoenix/Ravenclaw works
Rohit: "Let us start from RoomLine then."
Turned out to be a very good idea
Why?
RoomLine is complicated, but the learner can learn
from the code
There are also a couple of patterns that could be reused
e.g. for-loop, if-then-else

14
Note

We already had a copy of the description of the
Ravenclaw Agent Description Language
Not a tutorial, no examples
We didn't know how to start based on it
That's why a template was needed
We ended up tracing the whole RoomLine system

15
Stage 2a: Making a system with working SR

Our biggest problem: name recognition
Recognizing 1000 names
Many of them are Asian names
No training data
Dave hadn't built the LM building script yet
The type of information was not yet set
Should we handle names?

16
Stage 2a: Making a system with working SR (cont.)

Our first bootstrapping system
Use Sphinx3_Engine with the CALO model
Probably the strongest SR we could use
Use the RoomLine language model
Just tweak the grammar a little bit
Add a lot of compound words into classes
Also, only session chairs (180 names) are in the
grammar

17
The First System (No BE)

[Diagram: dialogue task tree rooted at icslpinfo, with nodes Reset DateTime, Welcome, Logout, Task, Request Satisfied, Inform Logout, HMIHY]

18
Note at Stage 2a

Finally got something running
But the system did nothing
We were still very vague on
how messages are passed in Galaxy and
how results are transferred from SR to RP to DM

19
Stage 2b: Making the backend work without SR

The backend was finally built at this stage
The backend/DM/RP is working and text console
mode is working
DM now gives the abstract when asked about the
author
But this time, SR fails because
the grammar accepts too many
the RoomLine LM was used.

20
Note at Stage 2b

Another difficult issue shows up
SR/RP/DM are very tightly coupled with each other
Other problems
Occasionally, is shown in the prompts
Because some prompts weren't filled in
Good part
The first type of information we will handle is
finally decided
This constrains SR
We start to feel time is running short

21
Stage 3: Making the first end-to-end system run

Speech Recognition
Retrained the LM using fake corpora
Significantly trimmed down the number of authors
to recognize (from 200 to 30)
Still, only a few author names are easily recognized.
The lucky ones
Alan Black
Arthur Toth
Julia Hirschberg
Andrew Rosenberg
(Alex is not very happy about this. His name is
confused with context key)
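
The "fake corpora" idea above can be sketched as template expansion over the trimmed author list. This is only an illustrative sketch: the sentence templates, the function name fake_corpus, and the underscore-joined compound tokens are assumptions, not the scripts the group actually used.

```python
import itertools

# Hypothetical sentence templates; the talk does not list the ones actually used.
TEMPLATES = [
    "papers by {author}",
    "what did {author} write",
    "i am looking for a paper by {author}",
]

# Names taken from the slide above; the real list held about 30 authors.
AUTHORS = ["Alan Black", "Arthur Toth", "Julia Hirschberg", "Andrew Rosenberg"]


def fake_corpus(templates, authors):
    """Return one upper-cased sentence per (template, author) pair."""
    sentences = []
    for template, author in itertools.product(templates, authors):
        # Join each name into a single compound token, e.g. ALAN_BLACK
        # (an assumed convention for treating a full name as one word).
        token = author.replace(" ", "_")
        sentences.append(template.format(author=token).upper())
    return sentences


if __name__ == "__main__":
    for sentence in fake_corpus(TEMPLATES, AUTHORS):
        print(sentence)
```

The printed sentences could then be fed to whatever LM training script is available; keeping the author list small keeps the resulting vocabulary small as well.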

22
Note at this point

Started to realize that SR couldn't improve
quickly
The problems of DM started to be glaring
No disambiguation
When multiple results are returned, there is no
strategy to handle them.
Also, SR frequently couldn't recognize things in
the grammar.
A lot of GARBAGE is recognized
See a lot On Alan Black

23
DM

Allows disambiguation using author name and
session name
Takes care of different result scenarios
If there are no results,
Say sorry and restart.
If there is one result,
Present the details of the paper,
Then ask whether to present the abstract of the
paper
If there are 5 or fewer results,
Tell the user the number of papers found
Then ask whether to present a summary of the
papers.
(List of the paper titles)
If there are more than 5 results,
Say sorry
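
The result-handling scenarios above amount to a branch on the number of papers returned. The sketch below is plain Python, not Ravenclaw agent code; the function name respond, the dictionary keys, and the prompt wording are all hypothetical.

```python
MAX_LISTED = 5  # threshold taken from the slide


def respond(papers):
    """Pick a system prompt from the number of papers the backend returned."""
    count = len(papers)
    if count == 0:
        # No results: apologize and restart the dialogue.
        return "Sorry, I could not find any papers. Let's start over."
    if count == 1:
        # One result: present the details, then offer the abstract.
        paper = papers[0]
        return (f"I found '{paper['title']}' by {paper['author']}. "
                "Would you like to hear the abstract?")
    if count <= MAX_LISTED:
        # A handful of results: report the count, then offer the title list.
        return (f"I found {count} papers. "
                "Would you like to hear a summary of their titles?")
    # Too many results to present.
    return "Sorry, that matched too many papers."
```

For example, respond([]) gives the apology and restart prompt, while a list of three papers gives the count followed by the offer to summarize.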

24
Other small things We Hacked Out

Confidence of The Recognizer
Audio Server is hacked such that
We are always confident about the results.
Annoying restarting issue
Commented out the restarting routine in Windows

25
Backend and NLG

Backend
(maybe for this demo only)
SQL-based
Could do author-search and session-name-search
NLG
Fill in all sorts of prompts
Many implicit confirmation and explicit
confirmation prompts are missing
That caused a lot of in the system
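
An SQL-based backend with author search and session-name search could look roughly like the sketch below, using Python's built-in sqlite3 module. The table layout, column names, function names, and placeholder rows are assumptions for illustration; the talk does not describe the actual schema or database engine.

```python
import sqlite3


def build_db():
    """Create an in-memory database with a hypothetical papers table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE papers (title TEXT, author TEXT, session TEXT, abstract TEXT)"
    )
    # Placeholder rows only; these are not real conference data.
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?, ?, ?)",
        [
            ("Placeholder title A", "Alan Black", "Placeholder session 1", "..."),
            ("Placeholder title B", "Alan Black", "Placeholder session 2", "..."),
            ("Placeholder title C", "Julia Hirschberg", "Placeholder session 2", "..."),
        ],
    )
    return conn


def search_by_author(conn, author):
    """Return the titles of all papers by the given author."""
    rows = conn.execute(
        "SELECT title FROM papers WHERE author = ? ORDER BY title", (author,)
    )
    return [title for (title,) in rows]


def search_by_session(conn, session):
    """Return the titles of all papers in the given session."""
    rows = conn.execute(
        "SELECT title FROM papers WHERE session = ? ORDER BY title", (session,)
    )
    return [title for (title,) in rows]
```

Both searches use parameterized queries, so a misrecognized name is passed to the database as data rather than spliced into the SQL text.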

26
Demo

Scenario
A user wants to know about the papers
written by
Alan Black
Julia Hirschberg and
Andrew Rosenberg
What it shows
How bad recognition is handled now.
What happens when one or multiple answers
are returned.

27
Note

Rohit Kummar and Lingyun Gao actually hold the
latest and greatest system.
This system only shows how we built up from
ground zero.

28
Summary: 3 Difficult Issues in the Task

1. Tight coupling of SR/RP/DM
When one part is right, others could fail
2. SR issues
The SR task could be affected by different
constraints.
The first system is hard to bring up
Compounded by issue 1
3. Lack of documentation for DM
The current documentation base is not strong
enough
The read-and-implement approach doesn't work yet
Some concepts are difficult to understand
Say, COMPLETE/SUCCEED/FAILED
GRAMMAR_MAPPING

29
Lessons Learned

Develop the system iteratively, bootstrapping
each stage with simple systems
This would greatly reduce the pain of coupling
SR issue
The first system could be completed with some
smaller grammars first
In some tasks, SR shouldn't be the focus at a
certain point.
Aligned with common observation
DM development
A good working template is necessary
What we need: for-loop and if-then-else templates

30
Bright Side 1: A Birthday Gift for Dave

Once understood, pretty easy to program
E.g. a birthday celebration system
Sample Dialogue
S: Do you want to know what's going on?
U: Yes (or No)
S: No matter whether you say yes or no, I will
have to tell you. Begin message.
Hmm-hm. Today is Mr. David Huggins-Daines's
birthday. Because everyone is too shy to sing
the birthday song for him, me, Frank, will have
to sing it. Here you go. Happy Birthday to you,
Happy Birthday to you. Happy Birthday to David.
Happy Birthday to you. This message is brought to
you by
End message

31
Bright Side 2

Compared to a directed dialogue system, the
current system could give unexpected results.
Why?
Several subsystems of the dialogue system are
working together
Built-in libraries
Grounding
Focuses
Developer-defined libraries
It is delightful to use in general

32
Bright Side 3

Source code has consistent coding style
Development problems will mainly stem from
1. Lack of automatic regression tests
2. Lack of a central manager
Not a bad thing in dialogue system if
developer/system 1

33
Conclusion

Summarized how the first end-to-end ICSLPInfo
system was developed
Discussed several issues including
Coupling of systems
SR
DM development
Overall
Thrilled when getting the system running and
working
