Voice Based Email System
  • Master's Project
  • Advisor: Dr. Irwin Levinstein
  • Student: Sriram Mallepedi

  • For most people today, the preferred means of
    communication is by email. Most people use
    computers to check their email, which means that
    if they are away from a computer or are using a
    computer without Internet access, they cannot
    check their email.
  • The Voice Based Email System provides a way for
    you to check your email and even send email. It
    checks your email and reads whichever INBOX
    message you choose aloud over the telephone.

Technologies Used
  • XML
  • Voice XML
  • BeVocal Café
  • Tellme
  • Voxeo
  • Java Servlets
  • JavaMail
  • JAF
  • Java Server Pages

XML Basics
  • An XML document is composed of one or more named
    elements organized into a nested hierarchy.
  • Element - An element is an opening tag, some
    data, and a closing tag.
  • Tag - A tag is an element name preceded by a
    less-than symbol (<) and followed by a
    greater-than symbol (>).
  • Attribute - An attribute specifies a property of
    the element it modifies and consists of a
    name/value pair. Attribute values must be
    enclosed in matching single or double quotes.

XML Rules
  • For any given element, the name of the opening
    tag must match that of the closing tag.
  • Tag names are case-sensitive.
  • An element may include zero or more attributes.
  • Elements may be arranged in an infinitely nested
    hierarchy, but only one element in the document
    can be designated as the root document element.
  • Sample:
      <salutations>
        <welcome type="texan">Howdy,</welcome>
        <welcome type="australian">G'day,</welcome>
      </salutations>

XML Reserved Characters
  • XML reserves certain characters, including
    less-than (<), greater-than (>), and ampersand
    (&). To express these characters in your document
    data, use the equivalent character entity:
    &lt; for less-than, &gt; for greater-than, and
    &amp; for ampersand.
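
A small illustration of the entity rule, as a sketch only; the subject element and its text are invented for this example and do not come from the project:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical element: only the escaping of the reserved characters matters here. -->
<subject>Q&amp;A: is 3 &lt; 5 and 5 &gt; 3?</subject>
```

An XML parser hands the application the text with the entities already decoded, so the reserved characters never appear raw inside element data.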

Voice XML
Voice Server
  • Telephony: The Voice XML server must first
    accept the call from the telephony server and
    access telephony data regarding the call, such as
    the dialed and the dialing numbers. Based on the
    caller's dialing number, the server may provide
    Voice XML applications with locality information,
    such as the caller's city and state. If the user
    is transferred to another phone number using the
    transfer tag, the Voice XML server is responsible
    for initiating a new outbound call leg to the
    transfer number and bridging it with the user's
    call.
  • URL database: When a call is terminated at the
    server, the server must match the dialed number
    to the desired service's URL. The server may have
    provisioning and billing systems associated with
    this database.

Voice Server (continued)
  • Retrieving Voice XML: When the Voice XML
    service's URL is known, the gateway must retrieve
    the Voice XML page and associated files, such as
    recorded audio and grammar files, from the
    service's Web host.
  • Interpreting Voice XML: With the application's
    Voice XML code and associated files now located
    on the server, the server must interpret the
    code, stepping through the dialogs and
    interacting with ASR, TTS, DTMF, and other
    services as required. This may involve requesting
    additional files from the Web server.
  • Accessing ASR and TTS: These services may be
    hosted on the Voice XML server as either software
    or hardware, or may be located remotely on a
    server with dedicated speech processing.
  • Caching: The Voice XML server can cache
    prerecorded audio files, grammars, and the Voice
    XML pages themselves.

Voice XML Basics
  • Whereas XML is designed to represent arbitrary
    data, VoiceXML describes grammars, prompts, event
    handlers, and other data structures useful in
    describing voice interaction between a human and
    a computer.
  • VoiceXML is a specific kind of XML designed to
    describe voice applications. VoiceXML follows all
    the rules of XML and adds some of its own.
  • At the root of every VoiceXML document is a root
    element, the vxml element.
  • VoiceXML has predefined elements that the voice
    browser can understand. These elements have a
    predefined set of attributes.
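
A minimal sketch of such a document, assuming VoiceXML 2.0 (the slides do not say which version the project targeted). The form id and the prompt text are invented for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- vxml is the root element; version and xmlns follow VoiceXML 2.0. -->
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- A single dialog (a form) whose only job is to speak one prompt. -->
  <form id="hello">
    <block>
      <prompt>Welcome to the voice based email system.</prompt>
    </block>
  </form>
</vxml>
```

Everything inside vxml uses element and attribute names the voice browser already knows, which is what distinguishes it from free-form XML.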

Application Structure
  • A VoiceXML application consists of a set of
    VoiceXML documents.
  • Each VoiceXML document contains one or more
    dialogs describing a specific interaction with
    the user.
  • Dialogs may present the user with information or
    prompt the user to provide information, and when
    complete, they can redirect the flow of control
    to another dialog in that document, to a dialog
    in another document in the same application, or
    to a dialog in another application entirely.
  • VoiceXML provides two types of dialogs: form and
    menu.
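
The flow of control between dialogs can be sketched as follows, assuming VoiceXML 2.0. The dialog ids, the file name inbox.vxml, and the example.com URL are all hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- First dialog: a form that only presents information. -->
  <form id="greeting">
    <block>
      <prompt>Welcome.</prompt>
      <!-- Redirect to another dialog in this same document. -->
      <goto next="#main_menu"/>
    </block>
  </form>

  <!-- Second dialog: a menu that asks the user for a choice. -->
  <menu id="main_menu">
    <prompt>Say inbox or help.</prompt>
    <!-- A dialog in another document of the same application (hypothetical file). -->
    <choice next="inbox.vxml#list_messages">inbox</choice>
    <!-- A document belonging to a different application (hypothetical URL). -->
    <choice next="http://example.com/help/index.vxml">help</choice>
  </menu>
</vxml>
```

The three next values correspond to the three redirect targets named in the list: same document, another document in the application, and another application.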

  • VoiceXML's primary user interface paradigm is
    that of a form with a number of elements that
    either provide information for the user or
    present fields to be filled by user input.
  • Every form contains one or more form items, which
    are elements within a form that describe some
    kind of user interaction related to filling-in
    the form.
  • Child elements: block, catch, data, error,
    field, filled, grammar, help, initial, link,
    noinput, nomatch, object, property, record,
    script, subdialog, transfer, var
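
A sketch of a form with a single form item, assuming VoiceXML 2.0. The form id, field name, and prompts are invented, and the prompts only announce an outcome; no message is actually deleted by this markup:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="confirm_delete">
    <!-- One form item: a field filled by a yes/no answer. -->
    <field name="answer" type="boolean">
      <prompt>Do you want to delete this message?</prompt>
      <!-- Event handlers for silence and for unrecognized input. -->
      <noinput>I did not hear you. <reprompt/></noinput>
      <nomatch>Please say yes or no. <reprompt/></nomatch>
      <!-- Runs once the field has a value. -->
      <filled>
        <if cond="answer">
          <prompt>Message deleted.</prompt>
        <else/>
          <prompt>Message kept.</prompt>
        </if>
      </filled>
    </field>
  </form>
</vxml>
```

The field, noinput, nomatch, and filled elements used here all appear in the child element list above.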

  • The menu element is a convenient shorthand
    version of a form. The menu element uses the
    prompt element to present a list of choices to
    the user.
  • Each option in the list is mirrored in a choice
    element as a speech or DTMF grammar fragment.
  • If the user's input matches the grammar fragment,
    the platform navigates to the location specified
    by the next or expr attribute, or fires the event
    specified by the event attribute of that choice.
  • Child elements: audio, catch, choice, data,
    enumerate, error, help, noinput, nomatch, prompt,
    property, script, value, var
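
A sketch of a menu, assuming VoiceXML 2.0. The dialog ids, choice phrases, and the event name exit_requested are invented for illustration:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <!-- Handles the event fired by the "goodbye" choice below. -->
  <catch event="exit_requested">
    <prompt>Goodbye.</prompt>
    <exit/>
  </catch>

  <!-- dtmf="true" asks the platform to assign keys 1, 2, 3 to the
       choices in document order. -->
  <menu id="main_menu" dtmf="true">
    <!-- enumerate reads the choice phrases back to the caller. -->
    <prompt>Main menu. You can say: <enumerate/></prompt>
    <choice next="#read_inbox">read messages</choice>
    <choice next="#compose">send a message</choice>
    <choice event="exit_requested">goodbye</choice>
    <noinput>Please choose an option. <reprompt/></noinput>
  </menu>

  <!-- Stub dialogs so that the next targets above resolve. -->
  <form id="read_inbox">
    <block><prompt>Reading your inbox.</prompt></block>
  </form>
  <form id="compose">
    <block><prompt>Starting a new message.</prompt></block>
  </form>
</vxml>
```

Each choice carries its own grammar fragment (the phrase inside the element), so no separate grammar element is needed.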

  • The primary method of gathering input from a user
    is through the field element. A field is a blank
    slate waiting to be filled by user input.
  • For each field, you must provide a grammar, which
    is a piece of information describing the
    allowable user inputs for a given field. The
    VoiceXML interpreter uses this information to
    determine whether the user's response was
    meaningful for this particular field.
  • You can specify grammars in one of two ways:
    through the type attribute of the field element,
    which causes the VoiceXML interpreter to use one
    of its built-in grammars, or through a grammar
    element, which contains a grammar of your own.
  • A grammar defines the set of valid expressions
    that a user can say or type when interacting with
    a voice application.
  • Each interactive dialog in an application
    references one or more grammars using one or more
    grammar elements.
  • VoiceXML provides you with several choices when
    integrating grammars into your voice application:
  • Reference a built-in grammar by setting the type
    attribute of the field element.
  • Define your own grammar using the Nuance Grammar
    Specification Language (GSL).
  • VoiceXML supports both voice input and touch-tone
    input, also known as DTMF input.
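
A sketch of a custom GSL grammar that accepts either speech or a key press, assuming VoiceXML 2.0 on a Nuance-style platform. The folder names and the slot name are invented. The MIME type, the dtmf-1 style key tokens, and the slot-filling braces are recalled from Nuance GSL conventions and are assumptions to verify against the platform documentation:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="pick_folder">
    <field name="folder">
      <prompt>Say inbox or press 1. Say sent items or press 2.</prompt>
      <grammar type="application/x-nuance-gsl">
        <![CDATA[
        [
          [inbox dtmf-1]         { <folder "inbox"> }
          [(sent items) dtmf-2]  { <folder "sent"> }
        ]
        ]]>
      </grammar>
    </field>
  </form>
</vxml>
```

The CDATA section is there because GSL slot commands contain less-than and greater-than characters, which XML otherwise reserves.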

Development Cycle
  • The typical development life cycle for a voice
    application using the BeVocal Café is as follows:
  • Design the dialog. The dialog of an application
    is an essential part of the voice user interface.
  • Build an application in VoiceXML using static
    content.
  • Use the VoiceXML Checker to check for syntax
    errors and use the Vocal Debugger to test the
    application's flow.
  • Test the application by calling the toll-free
    number (1-877-33-VOCAL) provided by the Café.
  • Extend the application to serve dynamic content
    using server-side scripts (such as JSP).
  • Deploy your application.
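
The step from static to dynamic content can be sketched on the VoiceXML side as a form that hands its field to a server-side script, assuming VoiceXML 2.0. The URL, the JSP name readMessage.jsp, and the field name are hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="open_message">
    <field name="msg_number" type="digits">
      <prompt>Which message number?</prompt>
    </field>
    <!-- Runs after the field is filled: send msg_number to a
         server-side script, which is expected to answer with a
         freshly generated VoiceXML page. -->
    <block>
      <submit next="http://example.com/voicemail/readMessage.jsp"
              namelist="msg_number" method="get"/>
    </block>
  </form>
</vxml>
```

The static version of the same application would point next at a fixed .vxml file instead of a script.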

Voice Based Email System
  • Voice-based and Web-based user interfaces.
  • Use Microsoft Exchange for mail management.
  • Receive emails with text and voice attachments
    using IMAP.
  • Send emails with text and voice attachments using
    SMTP, including forward and reply.
  • Delete emails from both (voice and Web)
    interfaces.
  • Move emails to specific folders from the
    Web-based user interface.
  • Manage folders from the Web-based user interface:
    create, delete, and edit.
  • The session is maintained on the voice-based UI
    as long as the user does not hang up.
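
How a voice attachment might be captured on the voice side can be sketched as follows, assuming VoiceXML 2.0. This is not taken from the project's code; the form id, variable name, time limits, and URL are invented:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<vxml version="2.0" xmlns="http://www.w3.org/2001/vxml">
  <form id="compose">
    <!-- Record the caller; stop on a key press, after 3 seconds of
         trailing silence, or after 60 seconds. -->
    <record name="voice_msg" beep="true" maxtime="60s"
            finalsilence="3s" dtmfterm="true">
      <prompt>Record your message after the beep. Press any key when done.</prompt>
      <noinput>I did not hear anything. <reprompt/></noinput>
    </record>
    <!-- Post the recording to a hypothetical server-side handler that
         would attach it to an outgoing email. -->
    <block>
      <submit next="http://example.com/voicemail/sendMessage"
              method="post" enctype="multipart/form-data"
              namelist="voice_msg"/>
    </block>
  </form>
</vxml>
```

Recorded audio has to be submitted with method post and the multipart/form-data encoding, so the size of the recording counts against any POST size limit on the receiving server.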

Restrictions and Problems Faced
  • Restrictions:
  • Due to Tellme's FetchTimeOut variable, a limit
    of 6.5MB per audio attachment file is imposed.
  • We use the BeVocal Café website for VoiceXML
    interpretation, so we have to use its grammar.
  • Problems Faced:
  • There is a default configuration limit of 2MB on
    the connector in Tomcat.
  • BeVocal Café does not support
    Session.getDefaultInstance but supports
    Session.getInstance.
Future Enhancements
  • Support for multiple email accounts
  • Address Book Feature
  • SMS message on cell phone when a new message
    arrives
  • WAP-based UI to access email from cell phone
  • Queue management system from Web-based UI

  • Thank you.