Multimodal Architecture for Integrating Voice and Ink XML Formats

1
Multimodal Architecture for Integrating Voice and
Ink XML Formats
  • Under the guidance of Dr. Charles Tappert
  • By Darshan Desai, Shobhana Misra, Yani Mulyani,
    Than NyiNyi

2
Agenda
  • Introduction of Architecture
  • System Architecture
  • Implemented Design Model
  • Sample Dialogue Design
  • InkXML Architecture
  • Tools Used
  • Conclusion

3
Introduction of Architecture
  • Generic in nature
  • Supports development of multimodal applications
    that handle integration patterns across speech,
    ink, and touch-tone digits, interpreting both
    unimodal input in any of these modes and
    combined multimodal input.
  • The system consists of Ink and Voice SDKs and a
    multimodal integrator.
  • The Voice SDK provides the voice processing
    capabilities.
  • The Ink SDK processes the information entered
    through the ink medium.
  • The multimodal integrator handles disambiguation
    and errors, and generates the confirmation
    feedback.
  • Dialogue design
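
The integrator's role described on this slide (choose among recognizer results, handle failures, generate confirmation feedback) can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Recognition` record, the confidence threshold, and the prompt wording are hypothetical and are not taken from the Voice or Ink SDK.

```python
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical record for one recognizer result; the field names are
# assumptions for this sketch, not part of the Voice or Ink SDK.
@dataclass
class Recognition:
    modality: str      # "voice", "ink", or "dtmf" (touch-tone digits)
    text: str          # recognized words or digits
    confidence: float  # 0.0 to 1.0, as reported by the recognizer


# Verb used in the confirmation prompt for each modality.
VERBS = {"voice": "say", "ink": "write", "dtmf": "enter"}


def integrate(results: List[Recognition],
              threshold: float = 0.5) -> Optional[Recognition]:
    """Return the most confident usable result, or None if none qualifies."""
    usable = [r for r in results if r.confidence >= threshold]
    if not usable:
        return None
    return max(usable, key=lambda r: r.confidence)


def feedback(choice: Optional[Recognition]) -> str:
    """Build the confirmation prompt, or an error prompt if nothing was usable."""
    if choice is None:
        return "Sorry, I did not understand. Please try again."
    return f"Did you {VERBS[choice.modality]} {choice.text}?"
```

Picking the single most confident result is the simplest possible disambiguation rule; a real integrator would likely also combine partial results from several modalities.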

4
System Architecture
5
Implemented Design Model
  • Voice Input/Output Device
  • TTS Engine
  • VoiceXML Browser
  • Cisco Router
  • Speech-to-Text Engine
  • PSTN
  • Database
  • InkXML Interpreter (Java/C)
  • Ink Input Device
  • Handwriting Recognition Engine
6
Sample Dialogue Design
  • (Banking information application)
  • System: You can access your existing account or
    you can open a new account. What would you like
    to do?
  • User: Check existing account.
  • System: Did you say existing account?
  • User: Yes.
  • System: Please enter your account number.
  • User: one eight one four six five
  • System: Did you write one eight seven four six
    five?
  • User: No.
  • System: Sorry, my mistake. Please enter your
    account number.
  • User: one eight one four six five
  • System: Did you write one eight one four six
    five?
  • User: Yes.
  • System: Please speak your four-digit PIN.
  • User: one two three four
  • System: Did you say one two three four?
  • User: Yes.
  • System: Please use the ink to input your full
    name.
  • Control passes to the ink medium. The system
    waits for the user to input the text and
    submit it.
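
The prompt, echo, and re-prompt pattern in this dialogue can be sketched as a small loop. This is a minimal sketch, not the project's dialogue code: `confirmed_input`, its parameters, and the retry limit are hypothetical names and values chosen for the example, and the recognizer is replaced by a canned iterator.

```python
from typing import Callable, List, Optional, Tuple


def confirmed_input(
    prompt: str,
    verb: str,
    recognize: Callable[[], str],
    confirm: Callable[[str], bool],
    max_tries: int = 3,
) -> Tuple[Optional[str], List[str]]:
    """Prompt, echo the recognized value, and re-prompt until confirmed.

    Returns the confirmed value (None if max_tries is exhausted) together
    with the list of system prompts that were issued.
    """
    said: List[str] = []
    current = prompt
    for _ in range(max_tries):
        said.append(current)
        heard = recognize()
        said.append(f"Did you {verb} {heard}?")
        if confirm(heard):
            return heard, said
        # The user rejected the echo: apologize and ask again.
        current = f"Sorry, my mistake. {prompt}"
    return None, said


# Replay the account-number exchange; the first recognition is wrong.
recognized = iter(["one eight seven four six five",
                   "one eight one four six five"])
value, prompts = confirmed_input(
    prompt="Please enter your account number.",
    verb="write",
    recognize=lambda: next(recognized),
    confirm=lambda heard: heard == "one eight one four six five",
)
```

The `verb` argument lets the same loop serve spoken fields ("Did you say ...?") and ink fields ("Did you write ...?").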

7
Sample Dialogue Design (cont.)
  • System: Choose personal information, checking, or
    savings.
  • User: Personal information.
  • System: Did you say personal information?
  • User: Yes.
  • System: What would you like to do? Access your
    information or change your information?
  • User: Change information.
  • System: Did you say change information?
  • User: Yes.
  • System: Would you like to change the address or
    telephone number, or exit?
  • User: Address.
  • System: Did you say address?
  • User: Yes.
  • System: Please enter your new address by ink.
  • Control passes to the ink medium. The system
    waits for the user to input the new text and
    submit it. Once the user has submitted the data,
    control switches back to voice.
  • System: Did you write one martine av white
    plains new york one zero six zero three?
  • User: Yes.
  • System: Your address has been changed.
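
The hand-off of control between voice and ink in this exchange can be modelled with a tiny controller. The sketch below is illustrative only: the `DialogueController` class and its method names are hypothetical, and it assumes that voice is the default modality and that an ink submission always returns control to voice.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogueController:
    """Tracks which modality currently has control of the dialogue."""
    active: str = "voice"
    log: List[str] = field(default_factory=list)

    def request_ink(self, prompt: str) -> None:
        """Speak the prompt, then pass control to the ink medium."""
        self.log.append(f"System: {prompt}")
        self.active = "ink"

    def submit_ink(self, recognized_text: str) -> str:
        """Accept submitted ink, return control to voice, and echo the text."""
        if self.active != "ink":
            raise RuntimeError("ink submitted while voice has control")
        self.active = "voice"
        question = f"Did you write {recognized_text}?"
        self.log.append(f"System: {question}")
        return question
```

A typical use is to call `request_ink("Please enter your new address by ink.")`, wait for the tablet's submit event, and then pass the handwriting recognizer's output to `submit_ink`.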

8
InkXML
  • InkXML's primary goal is to bring the full power
    of web development and content delivery to ink
    applications.
  • InkXML enables the exchange of virtual ink among
    devices such as handhelds, laptops, desktops,
    and servers.
  • InkXML will provide the ink component of
    web-based multimodal applications.
  • Numerous standards already exist that are closely
    related to or could be used to represent digital
    ink (e.g., ITU-T T.150, UNIPEN, and Jot).
  • InkXML has two kinds of requirements: functional
    requirements, which enumerate the functions
    required by ink applications, and pragmatic
    requirements, which make InkXML usable and
    efficient for developing ink applications.
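
To make the idea of exchanging virtual ink between devices concrete, the sketch below serializes pen strokes to XML and parses them back. The slides do not show the InkXML schema, so the `ink` and `trace` element names and the "x y" point encoding are placeholders assumed for this example, not the actual format.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

Point = Tuple[int, int]


def strokes_to_xml(strokes: List[List[Point]]) -> str:
    """Serialize strokes to XML (element names are placeholders)."""
    root = ET.Element("ink")
    for stroke in strokes:
        trace = ET.SubElement(root, "trace")
        # One "x y" pair per sampled point, separated by commas.
        trace.text = ", ".join(f"{x} {y}" for x, y in stroke)
    return ET.tostring(root, encoding="unicode")


def xml_to_strokes(document: str) -> List[List[Point]]:
    """Parse the XML produced by strokes_to_xml back into strokes."""
    root = ET.fromstring(document)
    strokes: List[List[Point]] = []
    for trace in root.findall("trace"):
        points: List[Point] = []
        for pair in (trace.text or "").split(","):
            if pair.strip():
                x, y = pair.split()
                points.append((int(x), int(y)))
        strokes.append(points)
    return strokes
```

Because the result is plain XML text, the same document could be sent from a handheld to a server over HTTP and parsed there with any XML library.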

9
InkXML Architecture
  • Application
  • SDK Library
  • Ink Log Generator
  • API
  • Event Handler
  • Driver
  • Pen Hardware
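
The slide lists the layers only, so the division of work between them is not stated. As one plausible reading, the sketch below assumes that the event handler groups raw pen events from the driver into strokes and passes each finished stroke up to a log. The `EventHandler` and `InkLog` classes and the "down"/"move"/"up" event names are hypothetical.

```python
from typing import Callable, List, Tuple

Point = Tuple[int, int]


class EventHandler:
    """Groups raw pen events (as a driver might report them) into strokes."""

    def __init__(self, on_stroke: Callable[[List[Point]], None]) -> None:
        self._on_stroke = on_stroke
        self._current: List[Point] = []
        self._pen_down = False

    def handle(self, kind: str, x: int, y: int) -> None:
        if kind == "down":
            # Pen touches the tablet: start a new stroke.
            self._pen_down = True
            self._current = [(x, y)]
        elif kind == "move" and self._pen_down:
            self._current.append((x, y))
        elif kind == "up" and self._pen_down:
            # Pen lifts: close the stroke and pass it to the layer above.
            self._current.append((x, y))
            self._pen_down = False
            self._on_stroke(self._current)
            self._current = []
        # Moves or lifts without a preceding "down" are ignored.


class InkLog:
    """Stand-in for the ink log generator: collects finished strokes."""

    def __init__(self) -> None:
        self.strokes: List[List[Point]] = []

    def record(self, stroke: List[Point]) -> None:
        self.strokes.append(stroke)
```

Wiring the layers together is then a matter of `handler = EventHandler(log.record)` and feeding `handler.handle(...)` with each event the driver produces.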
10
Tools Used
  • Software
    • VoiceXML gateway (Nuance Voice Server)
    • Tomcat Server
    • Ink SDK (IBM)
    • Windows 2000 Server
    • Pingtel softphone (for SIP dial-up)
  • Hardware
    • Wacom pen tablet
    • Cisco 2600 router with FXO card
    • Enterprise server
    • Microphone and speakers

11
Conclusion
  • The proposed architecture for developing
    multimodal voice/ink applications for noisy
    mobile environments combines different input
    modalities to facilitate the development of
    robust, user-friendly multimodal applications
    with superior error handling.
  • We envision that users will soon employ smart
    devices such as wireless phones with integrated
    pen tablets and more powerful processing
    capabilities to take full advantage of the
    proposed multimodal voice/ink architecture.
  • Such smart devices should be able to perform
    locally enhanced media processing, such as voice
    recognition, speech synthesis, and handwriting
    recognition.
  • Graphic generation capabilities on the users' pen
    tablets should also enhance the efficiency of
    multimodal applications and may allow for the
    development of applications for a broader
    spectrum of the population, including permanently
    and temporarily disabled users.

12
Advisors
  • Dr. Charles Tappert
  • Dr. Zouheir Trabelsi
  • Yi-Min Chee (IBM T.J. Watson)
  • Dr. Michael Perrone (IBM T.J. Watson)