Development and Operational Result of Real Environment Speech-oriented Guidance Systems Kita-robo and Kita-chan

1
Development and Operational Result of Real
Environment Speech-oriented Guidance Systems
Kita-robo and Kita-chan
Hiromichi KAWANAMI, Tobias CINCAREK, Shota
TAKEUCHI, Hiroshi SARUWATARI, Kiyohiro SHIKANO
kawanami_at_is.naist.jp
Nara Institute of Science and Technology
2
Outline
  • Motivation and Goal
  • Introduction of speech-oriented information
    guidance systems, Kita-chan and Kita-robo
  • Kita-chan and Kita-robo user speech database

3
Motivation and Goal
  • Investigation of data portability
  • How much do the acoustic model (AM), language model
    (LM), and example questions of an existing dialogue
    system contribute to a new dialogue system?
  • How much data from the new system must be
    transcribed and labeled manually, in addition to the
    above, to reach the response accuracy of the
    preceding system?
  • See Cincarek et al., Trans. IEICE(E) (to be
    published)
  • The result makes it possible to estimate the cost of
    developing a new dialogue system.
  • Comparison of the CG agent system and the robot body
  • Which interface is used, and by which age groups?
  • The result supports the design of appropriate
    interfaces.

4
Outline
  • Motivation and Goal
  • Introduction of speech-oriented information
    guidance systems, Kita-chan and Kita-robo
  • The preceding system, Takemaru-kun
  • Kita-chan with CG agent and Kita-robo with
    robot-body
  • System structures
  • Speech recognition module
  • Response generation module
  • Kita-chan and Kita-robo database

5
The preceding system, Takemaru-kun
  • Location
  • Entrance of a public center
  • Domain
  • Facilities of the center
  • Local information (the city, sightseeing,
    traffic, public institution)
  • General (News, weather forecast, date, time)
  • Character profile, Greetings
  • Dialogue strategy
  • Example-based one-question-one-answer
  • Interface
  • User input: speech, mouse
  • System response: synthetic speech, CG animation,
    web browser
  • Operation period: Nov. 2002 to present

6
Appearance of Takemaru-kun
The North Community Center, Ikoma city, Nara
CG agent animation
Web browser
directional microphone
mouse
speaker
Takemaru-kun is a mascot character of Ikoma city.
7
Kita-chan and Kita-robo appearances
Gakken Kita-Ikoma railway station in Ikoma
city, Nara
Kita-chan dialogue system
Kita-robo dialogue system
8
Appearance of Kita-chan
speakers
directional microphone
Web browsers
CG agent animation
Touch panel display
Kita-chan is a mascot character of
Gakken Kita-Ikoma station.
9
Appearance of Kita-robo
speakers
(Video camera for speaker detection (planned))
Web browser
CG eyes animation
directional microphone
No mouse or touch panel
10
System comparison
11
Speech recognition module
  • Mic. input
  • Power and ZC (zero-crossing) threshold
  • Duration threshold
  • Speech / noise discrimination using GMMs (adult /
    child / laugh / cough / noise) and length
  • if noise/cough/laugh, reject
  • if speech, continue decoding
  • Parallel decoding
  • Using adult AM/LM (N-gram)
  • Using child AM/LM (N-gram)
  • Using adult AM/LM (described grammar)
  • Using child AM/LM (described grammar)
  • AM likelihood comparison
  • Likelihood threshold (reject)
  • System input: text, used decoder info.
  • Response generation module (next slide)
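The flow on this slide can be sketched roughly as follows. This is only an illustration under assumptions: the names `Hypothesis`, `classify_input`, and `recognize`, the score values, and the decoder labels are hypothetical, and the slide does not say exactly where the likelihood threshold is applied. Here it is assumed to apply to the hypothesis that wins the AM likelihood comparison.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Classes that lead to rejection, as listed on the slide.
REJECT_CLASSES = {"noise", "cough", "laugh"}


@dataclass
class Hypothesis:
    text: str             # recognized text
    decoder: str          # decoder that produced it (hypothetical label)
    am_likelihood: float  # acoustic-model log-likelihood (hypothetical scale)


def classify_input(gmm_scores: Dict[str, float]) -> str:
    """Return the class whose GMM gives the highest score."""
    return max(gmm_scores, key=gmm_scores.get)


def recognize(
    gmm_scores: Dict[str, float],
    hypotheses: List[Hypothesis],
    likelihood_threshold: float,
) -> Optional[Hypothesis]:
    """Reject non-speech, then keep the decoder output with the best AM likelihood."""
    if classify_input(gmm_scores) in REJECT_CLASSES:
        return None  # noise/cough/laugh input is rejected
    if not hypotheses:
        return None
    best = max(hypotheses, key=lambda h: h.am_likelihood)
    # Assumption: the likelihood threshold is checked on the winning hypothesis.
    if best.am_likelihood < likelihood_threshold:
        return None
    return best


# Hypothetical scores and decoder outputs for one input.
scores = {"adult": -41.0, "child": -45.5, "laugh": -60.2,
          "cough": -63.0, "noise": -58.1}
outputs = [
    Hypothesis("where is the ticket gate", "adult/ngram", -5200.0),
    Hypothesis("where is the ticket", "child/ngram", -5350.0),
]
result = recognize(scores, outputs, likelihood_threshold=-6000.0)
print(result)
```

The returned object carries both the text and the decoder label, which corresponds to the "text, used decoder info." passed on to the response generation module.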
12
Response generation module
  • Input: text, decoder
  • if decoded by described grammar
  • Generating surface sentence using rules
    (separated for adult and for child)
  • if decoded by N-gram
  • Searching the most similar example question and
    outputting the corresponding response
  • if adult: QADB (pairs of example question and
    system response) for adult
  • if child: QADB (pairs of example question and
    system response) for child
  • Output: response sentence, Web URL, animation
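The example-based branch (input decoded by N-gram) might look like the sketch below. The slide does not state the similarity measure, so word overlap (Jaccard similarity) is used here purely as a stand-in. The QADB contents and the names `word_overlap` and `respond_by_example` are hypothetical, and the rule-based branch for grammar-decoded input is not covered.

```python
from typing import Dict, List, Tuple

# Hypothetical QADBs: (example question, system response) pairs per age group.
QADB: Dict[str, List[Tuple[str, str]]] = {
    "adult": [
        ("where is the ticket gate", "The ticket gate is straight ahead."),
        ("what time is it now", "I will tell you the current time."),
    ],
    "child": [
        ("where is the ticket gate", "The gate is right over there!"),
        ("what is your name", "My name is Kita-chan!"),
    ],
}


def word_overlap(a: str, b: str) -> float:
    """Jaccard similarity over whitespace-separated words (an assumed measure)."""
    words_a, words_b = set(a.split()), set(b.split())
    union = words_a | words_b
    return len(words_a & words_b) / len(union) if union else 0.0


def respond_by_example(text: str, age_group: str) -> str:
    """Return the response paired with the most similar example question."""
    pairs = QADB[age_group]
    _, response = max(pairs, key=lambda pair: word_overlap(text, pair[0]))
    return response


print(respond_by_example("where is the gate", "adult"))
print(respond_by_example("where is the gate", "child"))
```

Keeping one QADB per age group lets the same recognized question map to a different response for adults and children, which is the split shown on the slide.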
13
From Takemaru-kun Video
  • noise, cough, laugh rejection
  • Adult/child discrimination

14
Outline
  • Motivation and Goal
  • Introduction of speech-oriented information
    guidance systems, Kita-chan and Kita-robo
  • Kita-chan and Kita-robo database
  • System inputs over 21 months (since Mar. 2006)
  • Eight-month database with manual transcriptions
    and labels
  • Preliminary analysis

15
Database
  • All system inputs to the two systems are recorded.
  • Twenty-one months (at present)
  • Noise input is also preserved.
  • Manual Database
  • Database of the first eight months from each system,
    with manual transcriptions and labels assigned by
    ear by five labelers

16
Manual Database
  • Waveform with speech/noise classification
    information
  • if speech,
  • Transcription
  • Transcription
  • Pronunciation using Kana
  • Noise tags inserted into them
  • noise, background conversation, missing initial
    part, overflow, etc.
  • Valid / Invalid classification
  • Valid: utterance intended to get a system
    response
  • Invalid: utterance not intended to get a
    system response
  • Label
  • Age group
  • Pre-school / lower grade student / higher grade
    student / adult / elderly
  • Gender
  • All labels are assigned subjectively by listening.
  • (Appropriate system response for Valid utterance)
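One way to picture a single entry of the manual database is the record type below. It is a sketch only: the class and field names, the file paths, and the sample values are hypothetical and do not reflect the actual file format of the database.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class AgeGroup(Enum):
    PRESCHOOL = "pre-school"
    LOWER_GRADE = "lower grade student"
    HIGHER_GRADE = "higher grade student"
    ADULT = "adult"
    ELDERLY = "elderly"


@dataclass
class UtteranceRecord:
    waveform_path: str                    # recorded input (hypothetical path)
    is_speech: bool                       # speech/noise classification
    transcription: Optional[str] = None   # only present if speech
    kana: Optional[str] = None            # pronunciation using Kana
    noise_tags: List[str] = field(default_factory=list)
    is_valid: Optional[bool] = None       # intended to get a system response?
    age_group: Optional[AgeGroup] = None  # subjective label
    gender: Optional[str] = None          # subjective label
    appropriate_response: Optional[str] = None  # only for valid utterances


def valid_speech(records: List[UtteranceRecord]) -> List[UtteranceRecord]:
    """Keep only speech inputs labeled as valid utterances."""
    return [r for r in records if r.is_speech and r.is_valid]


# Hypothetical sample entries.
records = [
    UtteranceRecord("in/0001.wav", is_speech=False),
    UtteranceRecord("in/0002.wav", is_speech=True, transcription="hello",
                    is_valid=True, age_group=AgeGroup.ADULT, gender="female"),
    UtteranceRecord("in/0003.wav", is_speech=True, transcription="(chatting)",
                    noise_tags=["background conversation"], is_valid=False),
]
print(len(valid_speech(records)))
```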

17
Operational results from Database
Number of valid utterances input to Kita-chan.
7 months (2006/04 to 2006/07) total
Valid input total: 14,682 utterances
Invalid input total: 12,849 utterances
18
Operational results from Database
Number of valid utterances input to Kita-robo.
7 months (2006/04 to 2006/07) total
Valid input total: 27,397 utterances
Invalid input total: 21,637 utterances
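The totals reported for the two systems can be compared directly. The short calculation below uses only the four counts given on the operational-result slides; the variable and function names are illustrative.

```python
# Utterance counts as reported on the operational-result slides.
counts = {
    "Kita-chan": {"valid": 14682, "invalid": 12849},
    "Kita-robo": {"valid": 27397, "invalid": 21637},
}


def total(system: str) -> int:
    """Valid plus invalid inputs for one system."""
    return counts[system]["valid"] + counts[system]["invalid"]


def valid_share(system: str) -> float:
    """Fraction of a system's inputs that were labeled valid."""
    return counts[system]["valid"] / total(system)


valid_ratio = counts["Kita-robo"]["valid"] / counts["Kita-chan"]["valid"]
total_ratio = total("Kita-robo") / total("Kita-chan")

print(f"Valid inputs, Kita-robo / Kita-chan: {valid_ratio:.2f}")
print(f"All inputs,   Kita-robo / Kita-chan: {total_ratio:.2f}")
for name in counts:
    print(f"{name}: {valid_share(name):.1%} of inputs valid")
```

Both ratios come out a little under two, and in both systems slightly more than half of the inputs are valid.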
19
Conclusion
  • Introduction of Kita-chan and Kita-robo
  • Kita-chan with CG agent and Kita-robo with
    robot-body
  • Real-environment system at a railway station
  • Database
  • System inputs over 21 months (since Mar. 2006)
  • Eight-month database with manual transcriptions
    and labels
  • Operational result
  • The number of Kita-robo inputs is about twice
    that of Kita-chan.

20
  • Thank you for your attention.