Voice Based Email System

1
Voice Based Email System

Master's Project
Advisor: Dr. Irwin Levinstein
Student: Sriram Mallepedi

2
Introduction

For most people today, the preferred means of
communication is email. Most people use
computers to check their email, which means that
if they are away from a computer or are using a
computer without Internet access, they cannot
check their email.
The Voice Based Email System provides a way to
check your email, and even send email, over the
telephone. It checks your INBOX and reads aloud
whichever message you choose.

3
Technologies Used

XML
VoiceXML
BeVocal Café
Tellme
Voxeo
Java Servlets
JavaMail
JavaBeans Activation Framework (JAF)
JavaServer Pages (JSP)

4
XML Basics

An XML document is composed of one or more named
elements organized into a nested hierarchy.
Element - An element is an opening tag, some
data, and a closing tag.
Tag - A tag is an element name preceded by a
less-than symbol (<) and followed by a
greater-than symbol (>).
Attribute - An attribute specifies a property of
the element it modifies and consists of a
name/value pair. Attribute values must be
enclosed in matching single or double quotes.

5
XML Rules

For any given element, the name of the opening
tag must match that of the closing tag.
Tag names are case-sensitive.
An element may include zero or more attributes.
Elements may be nested to any depth, but only
one element in the document can be the root
document element.
Sample:
<salutations>
  <welcome type="texan">Howdy, partner</welcome>
  <welcome type="australian">G'day, mate</welcome>
</salutations>
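
The rules above can be checked with a parser. The following is a minimal sketch, not part of the original project, that uses the XML parser bundled with the JDK to read the salutations sample and to reject two documents that break the rules. The class and method names (SalutationsDemo, parse, isWellFormed) are illustrative assumptions.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class; the names are not from the presentation.
class SalutationsDemo {
    // The sample document from the slide, written as a Java string.
    static final String SAMPLE =
        "<salutations>"
        + "<welcome type=\"texan\">Howdy, partner</welcome>"
        + "<welcome type=\"australian\">G'day, mate</welcome>"
        + "</salutations>";

    // Parses the text with the JDK's DOM parser; a document that
    // breaks the XML rules is rejected with an exception.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("not well-formed XML", e);
        }
    }

    static boolean isWellFormed(String xml) {
        try {
            parse(xml);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        Element root = parse(SAMPLE).getDocumentElement();
        System.out.println("root element: " + root.getTagName());

        // Each welcome element carries one attribute and some data.
        NodeList welcomes = root.getElementsByTagName("welcome");
        for (int i = 0; i < welcomes.getLength(); i++) {
            Element welcome = (Element) welcomes.item(i);
            System.out.println(welcome.getAttribute("type")
                + ": " + welcome.getTextContent());
        }

        // Tag names are case-sensitive, so these tags do not match.
        System.out.println(isWellFormed("<Welcome>Howdy</welcome>"));
        // Only one root element is allowed.
        System.out.println(isWellFormed("<a/><b/>"));
    }
}
```

The JDK parser may also print its own error message to standard error when it rejects a document; that output is separate from the values printed here.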

6
XML Reserved Characters

XML reserves certain characters, including
less-than (<), greater-than (>), and ampersand
(&). To express these characters in your document
data, use the equivalent character entity:
&lt; for less-than, &gt; for greater-than, and
&amp; for ampersand.
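
When document data comes from a program, such as an email subject line that will be read aloud, the reserved characters have to be replaced before the text is placed inside an element. A minimal sketch of that replacement follows; the XmlEscaper class and its escape method are hypothetical helpers, not something the presentation defines.

```java
// Hypothetical helper; not part of the original project.
class XmlEscaper {
    // Replaces the three reserved characters with their entities.
    // The loop visits each input character once, so an ampersand that
    // is emitted as part of an entity is never escaped a second time.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length());
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:  out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("AT&T <sales>"));
    }
}
```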

7
Voice XML

8
Voice Server

Telephony: The VoiceXML server must first
accept the call from the telephony server and
access telephony data regarding the call, such as
the dialed and the dialing numbers. Based on the
caller's dialing number, the server may provide
VoiceXML applications with locality information,
such as the caller's city and state. If the user is
transferred to another phone number using the
transfer tag, the VoiceXML server is responsible
for initiating a new outbound call leg to the
transfer number and bridging it with the user's
call.
URL database: When a call is terminated at the
server, the server must match the dialed number
to the desired service's URL. The server may have
provisioning and billing systems associated with
this database.
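
The URL database step amounts to a lookup keyed by the dialed number. The sketch below shows the idea with an in-memory map; a real voice server would use its own provisioning store. The ServiceDirectory class, the phone numbers and the URL are all made up for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the voice server's URL database.
class ServiceDirectory {
    private final Map<String, String> urlByDialedNumber = new HashMap<>();

    // Associates a dialed number with the start URL of a service.
    void register(String dialedNumber, String serviceUrl) {
        urlByDialedNumber.put(normalize(dialedNumber), serviceUrl);
    }

    // Returns the service URL for the dialed number, if one is registered.
    Optional<String> lookup(String dialedNumber) {
        return Optional.ofNullable(urlByDialedNumber.get(normalize(dialedNumber)));
    }

    // Keeps digits only, so "1-800-555-0100" and "18005550100" match.
    static String normalize(String number) {
        return number.replaceAll("[^0-9]", "");
    }

    public static void main(String[] args) {
        ServiceDirectory directory = new ServiceDirectory();
        directory.register("1-800-555-0100", "http://example.com/email/start.vxml");
        System.out.println(directory.lookup("18005550100").orElse("no service"));
        System.out.println(directory.lookup("18005550199").orElse("no service"));
    }
}
```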

9
Voice Server (continued)

Retrieving VoiceXML: When the VoiceXML service's
URL is known, the gateway must retrieve the
VoiceXML page and associated files, such as
recorded audio and grammar files, from the
service's Web host.
Interpreting VoiceXML: With the application's
VoiceXML code and associated files now located
on the server, the server must interpret the
code, stepping through the dialogs and
interacting with ASR, TTS, DTMF and other
services as required. This may involve requesting
additional files from the Web server.
Accessing ASR and TTS: These services may be
hosted on the VoiceXML server as either software
or hardware, or may be located remotely on a
server with dedicated speech processing
capabilities.
Caching: The VoiceXML server can cache
prerecorded audio files, grammars and the VoiceXML
pages themselves.
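
To illustrate the caching step, here is a small sketch of a bounded cache that fetches a resource only when it is not already held and evicts the least recently used entry when full. It is an assumption-laden illustration: the presentation does not say how any particular voice server caches, and the ResourceCache class, its fetcher function and the fetchCount counter are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical least-recently-used cache for fetched resources.
class ResourceCache {
    private final Function<String, byte[]> fetcher;
    private final LinkedHashMap<String, byte[]> entries;
    int fetchCount = 0; // counts fetches, for demonstration only

    ResourceCache(final int maxEntries, Function<String, byte[]> fetcher) {
        this.fetcher = fetcher;
        // accessOrder = true keeps the least recently used entry first.
        this.entries = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                return size() > maxEntries;
            }
        };
    }

    // Returns the cached copy, or fetches and stores it on a miss.
    byte[] get(String url) {
        byte[] cached = entries.get(url);
        if (cached != null) {
            return cached;
        }
        fetchCount++;
        byte[] fetched = fetcher.apply(url);
        entries.put(url, fetched);
        return fetched;
    }

    public static void main(String[] args) {
        // The fetcher here just fakes a download from the Web host.
        ResourceCache cache = new ResourceCache(2, url -> url.getBytes());
        cache.get("greeting.wav");
        cache.get("greeting.wav"); // served from the cache
        System.out.println("fetches so far: " + cache.fetchCount);
    }
}
```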

10
Voice XML Basics

While XML is designed to represent arbitrary data,
VoiceXML describes grammars, prompts, event
handlers, and other data structures useful in
describing voice interaction between a human and
a computer.
VoiceXML is a specific kind of XML designed to
describe voice applications. VoiceXML follows all
the rules of XML and adds some of its own.
At the root of every VoiceXML document is a root
element, the vxml element.
VoiceXML has predefined elements which the voice
browser can understand. These elements have a
predefined set of attributes.
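
As an illustration of the points above, the sketch below builds a very small VoiceXML 2.0 document as a Java string: a vxml root element containing one form whose block speaks a prompt. The HelloVxml class and its method are hypothetical, and a particular voice platform may expect additional attributes or a different version.

```java
// Hypothetical example class; not taken from the project.
class HelloVxml {
    // Builds a one-prompt VoiceXML document. Reserved characters in
    // the message are replaced so the result stays well-formed XML.
    static String helloDocument(String message) {
        String safe = message.replace("&", "&amp;")
                             .replace("<", "&lt;")
                             .replace(">", "&gt;");
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n"
            + "  <form>\n"
            + "    <block>\n"
            + "      <prompt>" + safe + "</prompt>\n"
            + "    </block>\n"
            + "  </form>\n"
            + "</vxml>\n";
    }

    public static void main(String[] args) {
        System.out.print(helloDocument("Welcome to the Voice Based Email System."));
    }
}
```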

11
Application Structure

A VoiceXML application consists of a set of
VoiceXML documents.
Each VoiceXML document contains one or more
dialogs describing a specific interaction with
the user.
Dialogs may present the user with information or
prompt the user to provide information, and when
complete, they can redirect the flow of control
to another dialog in that document, to a dialog
in another document in the same application, or
to a dialog in another application entirely.
VoiceXML provides two types of dialogs: forms and
menus.

12
Forms

VoiceXML's primary user interface paradigm is
that of a form with a number of elements that
either provide information for the user or
present fields to be filled by user input.
Every form contains one or more form items, which
are elements within a form that describe some
kind of user interaction related to filling-in
the form.
Child elements: block, catch, data, error,
field, filled, grammar, help, initial, link,
noinput, nomatch, object, property, record,
script, subdialog, transfer, var

13
Menu

The menu element is a convenient shorthand
for a form. The menu element uses the
prompt element to present a list of choices to
the user.
Each option in the list is mirrored in a choice
element as a speech or DTMF grammar fragment.
If the user's input matches the grammar fragment,
the platform navigates to the location specified
by the next or expr attribute or fires the event
specified by the event attribute of that choice
element.
Child elements: audio, catch, choice, data,
enumerate, error, help, noinput, nomatch, prompt,
property, script, value, var
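
A menu of this kind could be generated as follows. This is a sketch under stated assumptions: the MainMenuVxml class, the choice phrases and the target page names are invented for illustration, the phrases and URLs are assumed to contain no reserved XML characters, and the dtmf numbering assumes at most nine choices.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical generator for a VoiceXML menu dialog.
class MainMenuVxml {
    // choices maps the spoken phrase to the location to navigate to.
    // A LinkedHashMap keeps the choices in the order they were added.
    static String render(Map<String, String> choices) {
        StringBuilder vxml = new StringBuilder();
        vxml.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        vxml.append("<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n");
        vxml.append("  <menu id=\"main\">\n");
        // enumerate reads out the list of choices.
        vxml.append("    <prompt>Please choose one of: <enumerate/></prompt>\n");
        int key = 1;
        for (Map.Entry<String, String> choice : choices.entrySet()) {
            // Each choice accepts the spoken phrase or the touch-tone key.
            vxml.append("    <choice dtmf=\"").append(key++)
                .append("\" next=\"").append(choice.getValue()).append("\">")
                .append(choice.getKey()).append("</choice>\n");
        }
        vxml.append("    <noinput>I did not hear anything. <reprompt/></noinput>\n");
        vxml.append("    <nomatch>I did not understand. <reprompt/></nomatch>\n");
        vxml.append("  </menu>\n");
        vxml.append("</vxml>\n");
        return vxml.toString();
    }

    public static void main(String[] args) {
        Map<String, String> choices = new LinkedHashMap<>();
        choices.put("inbox", "inbox.jsp");     // hypothetical page names
        choices.put("compose", "compose.jsp");
        System.out.print(render(choices));
    }
}
```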

14
Field

The primary method of gathering input from a user
is the field element. A field is a blank slate
waiting to be filled by user input.
For each field, you must provide a grammar, which
is a piece of information describing the
allowable user inputs for a given field. The
VoiceXML interpreter uses this information to
determine whether the user's response was
meaningful for this particular field.
You can specify grammars in one of two ways:
through the type attribute of the field element,
which causes the VoiceXML interpreter to use one
of its built-in grammars, or through a grammar
element, which contains a grammar of your own
creation.
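
The first of these two options, a built-in grammar selected by the type attribute, might look like the sketch below, which generates a form with a single field of type digits. The MessageNumberForm class, the field name and the prompt wording are assumptions made for illustration; both arguments are assumed to contain no reserved XML characters.

```java
// Hypothetical generator for a form with one digits field.
class MessageNumberForm {
    static String render(String fieldName, String question) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
            + "<vxml version=\"2.0\" xmlns=\"http://www.w3.org/2001/vxml\">\n"
            + "  <form id=\"choose\">\n"
            // type="digits" selects the interpreter's built-in grammar.
            + "    <field name=\"" + fieldName + "\" type=\"digits\">\n"
            + "      <prompt>" + question + "</prompt>\n"
            + "      <noinput>Sorry, I did not hear you. <reprompt/></noinput>\n"
            + "      <nomatch>Please say or key in digits only. <reprompt/></nomatch>\n"
            // filled runs once the input has matched the grammar.
            + "      <filled>\n"
            + "        <prompt>You chose message <value expr=\"" + fieldName + "\"/>.</prompt>\n"
            + "      </filled>\n"
            + "    </field>\n"
            + "  </form>\n"
            + "</vxml>\n";
    }

    public static void main(String[] args) {
        System.out.print(render("msgnum", "Which message number would you like to hear?"));
    }
}
```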

15
Grammar

A grammar defines the set of valid expressions
that a user can say or type when interacting with
a voice application.
Each interactive dialog in an application
references one or more grammars using one or more
grammar elements.
VoiceXML provides you with several choices when
integrating grammars into your voice application:
Reference a built-in grammar by setting the type
attribute of the field element.
Define your own grammar using the Nuance Grammar
Specification Language (GSL).
VoiceXML supports both voice input and touch-tone
input, also known as DTMF input.

16
Development Cycle

The typical development life cycle for a voice
application using the BeVocal Café is as follows:
Design the dialog. The dialog of an application
is an essential part of the voice user interface.
Build an application in VoiceXML using static
content.
Use the VoiceXML Checker to check for syntax
errors and use the Vocal Debugger to test the flow
of dialogs.
Test the application by calling the toll-free
number (1-877-33-VOCAL) provided by the Café.
Extend the application to serve dynamic content
using server-side scripts (such as JSP).
Deploy your application.
17
Voice Based Email System

Voice-based and Web-based user interfaces.
Use Microsoft Exchange as the mail management
system.
Receive emails with text and voice attachments
using IMAP.
Send emails with text and voice attachments using
SMTP (forward and reply).
Delete emails from both (voice and web)
interfaces.
Move emails to specific folders from the web
interface.
Manage folders from the web-based user interface:
create, delete and edit.
The session is maintained on the voice-based UI
as long as the user does not hang up.

18
Architecture

20
Restrictions and Problems Faced

Restrictions
Due to Tellme's FetchTimeOut variable, a limit
of 6.5MB per audio attachment file is imposed.
We use the BeVocal Café website for VoiceXML
interpretation, so we have to use its grammar.
Problems Faced
There is a default configuration limit of 2MB on
the connector in Tomcat.
BeVocal Café does not support
Session.getDefaultInstance but supports
Session.getInstance.

21
Future Enhancements