Microsoft Speech Server PowerPoint PPT Presentation

Title: Microsoft Speech Server


1
Microsoft Speech Server 2004
2
Advanced Techniques for Speech Application
Development
Duane Laflotte MCSE, MCSD, MCT, MCPI, MCSA,
MCDBA
3
Who am I?
  • Duane Laflotte
  • CriticalSites
  • dlaflotte_at_criticalsites.com
  • Digital Samurai
  • Security Department Manager
  • Senior Developer Architect
  • MCSE, MCSD, MCT, MCPI, MCSA, MCDBA
  • Blog: http://www.criticalsites.com/dlaflotte

4
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

5
Our Sample Application
  • Blog (n.): a shared online journal where people can
    post diary entries about their personal
    experiences and hobbies

6
Demo
  • This demonstration will show the blog software we
    use at CriticalSites (dasBlog, www.dasblog.net)
    with some custom modifications to allow for
    speech interaction.

7
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

8
Speech Flow Control
  • Client Side
      • RunSpeech: It's all about control
      • SpeechCommon Object
      • SemanticItem Manipulation
  • Server Side
      • ASP.NET
      • SemanticItem Manipulation

9
Client Side: RunSpeech
  • RunSpeech is the client-side object that manages
    dialog flow.
  • To change the flow you don't interact with RunSpeech
    directly. Instead, dialog flow is modified by
    changing the state or attributes of the semantic
    items that RunSpeech monitors.
  • Some methods: Pause(), Resume(), CurrentCall(),
    EventSource(), OnUserDisconnected()

10
Client Side: SpeechCommon
  • SpeechCommon allows you to check the state of
    SemanticItems.
  • Some constants: SpeechCommon.EMPTY,
    SpeechCommon.NEEDSCONFIRMATION,
    SpeechCommon.CONFIRMED
  • Example:
      if (mySemanticItem.state == SpeechCommon.CONFIRMED)
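
The idea that flow follows semantic-item state can be sketched in plain JavaScript. This is a simplified model rather than the actual RunSpeech algorithm: the `State` values and the `nextAction` function are hypothetical stand-ins invented for illustration, and only the item names come from this deck.

```javascript
// Hypothetical stand-ins for SpeechCommon.EMPTY and friends. The real
// constants come from the Speech SDK client script; these values are invented.
const State = Object.freeze({
  EMPTY: "empty",
  NEEDSCONFIRMATION: "needsConfirmation",
  CONFIRMED: "confirmed",
});

// Decide what a dialog manager might do next, looking only at item state.
// This is a simplified model, not the actual RunSpeech algorithm.
function nextAction(items) {
  // Ask for the first item that has no value yet.
  const empty = items.find((item) => item.state === State.EMPTY);
  if (empty) {
    return { kind: "ask", item: empty.name };
  }
  // Otherwise confirm the first item that was heard but not confirmed.
  const pending = items.find((item) => item.state === State.NEEDSCONFIRMATION);
  if (pending) {
    return { kind: "confirm", item: pending.name };
  }
  // Everything is confirmed, so there is nothing left to collect.
  return { kind: "done", item: null };
}

// Item names are taken from this deck's examples; the states are made up.
const items = [
  { name: "siAccountNumber", state: State.CONFIRMED },
  { name: "siPIN", state: State.NEEDSCONFIRMATION },
  { name: "siNumberOfShares", state: State.EMPTY },
];

const before = nextAction(items); // asks for siNumberOfShares

// Changing an item's state redirects the flow without touching the manager.
items[0].state = State.EMPTY;
const after = nextAction(items); // now asks for siAccountNumber again
```

The point of the sketch is the last two lines: the calling code never tells the manager where to go, it only resets an item and lets the next pass pick it up.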

11
Client Side: SemanticItem Manipulation
  • SemanticItems hold information gathered from
    users.
  • Examples: siNumberOfShares, siPIN, siAccountNumber
  • Some methods: SetText(string, bool), Confirm(),
    Clear(), IsEmpty(), GetAttribute(string),
    SetAttribute(string, object), isConfirmed()
  • Some properties: .value, .state, .attributes

12
Show Code
  • This demonstration will show how to manipulate
    the dialog flow utilizing client side functions,
    events, and objects.

13
Server Side: ASP.NET
  • Sometimes it is necessary to retrieve data or
    access other resources that aren't available in
    client-side code.
  • There are several ways to post data back to the
    server for processing:
      • Post back when the page completes
      • autoPostBack when the value of a SemanticItem changes
      • autoPostBack when a Command is matched against
        the grammar
      • Explicitly, by calling SpeechCommon.Submit()

14
Server Side: SemanticItem Manipulation
  • SemanticItems can be interrogated and manipulated
    on the server side as well as the client side.
  • Example:
      if (siQuantity.State != Microsoft.Speech.Web.UI.SemanticState.Empty)
          siPrice.Text = GetPrice(siQuantity.Text);

15
Show Code
  • This demonstration will show how to manipulate
    the dialog flow utilizing server side functions,
    events, and objects.

16
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

17
Interaction With Outside Data Sources
  • Relational Databases
  • Web Services

18
Speech and Relational Data
  • DataTableNavigator
      • The web is 2D and great for displaying relational
        data. Voice is 1D, which presents a problem.
      • Built-in commands and events help the application
        respond to users (e.g. NVG_previousOnFirstError).
  • ADO.NET can also be used
      • All navigation of the data would then need to be
        handled by the developer, not by a custom control.
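
If navigation is handled by hand, the core of it is walking a 2D result set one row at a time and deciding what to do at the edges. The sketch below is a minimal illustration of that idea, not the DataTableNavigator control: the `RowNavigator` class, its command words, and its error names are hypothetical and only loosely echo events such as NVG_previousOnFirstError.

```javascript
// A hand-rolled navigator over rows that were already fetched from a database.
// Hypothetical helper for illustration; not part of Speech Server.
class RowNavigator {
  constructor(rows) {
    this.rows = rows;
    this.index = 0;
  }

  // Apply a spoken navigation command and report what the dialog should say.
  handle(command) {
    if (this.rows.length === 0) {
      return { ok: false, error: "noData" };
    }
    switch (command) {
      case "next":
        if (this.index >= this.rows.length - 1) {
          // Let the dialog play a "that was the last entry" prompt.
          return { ok: false, error: "nextOnLast" };
        }
        this.index += 1;
        break;
      case "previous":
        if (this.index === 0) {
          // Let the dialog play a "you are at the first entry" prompt.
          return { ok: false, error: "previousOnFirst" };
        }
        this.index -= 1;
        break;
      case "first":
        this.index = 0;
        break;
      case "last":
        this.index = this.rows.length - 1;
        break;
      default:
        return { ok: false, error: "unknownCommand" };
    }
    return { ok: true, row: this.rows[this.index] };
  }
}

// Made-up blog post titles standing in for a query result.
const posts = new RowNavigator(["First post", "Second post", "Third post"]);
const stuck = posts.handle("previous"); // already on the first row
const moved = posts.handle("next");     // moves to the second row
```

Returning an error code instead of throwing keeps the edge cases in the dialog layer, where each one can be mapped to its own prompt.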

19
Show Code
  • This demonstration will show how to retrieve data
    from a database and how to use that dynamic data
    in your dialogs.

20
Speech and Web Services
  • You can leverage web services from your speech
    applications
  • Pros
      • Allows for dynamic, up-to-the-second data from
        disparate data sources.
  • Cons
      • Web services rely on the web protocol (HTTP), which
        isn't known for its speed.

21
Demo: Show Code
  • This demonstration will show how to retrieve data
    from an external web service and use the returned
    response in your dialog.

22
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

23
Dynamic Grammars
  • Pros
      • Allows for a natural speech interface
      • Allows your application to utilize responses that
        may be unknown at design time
  • Cons
      • Doesn't allow you to leverage some of the caching
        features built into Speech Server
      • May make your application difficult to debug and
        optimize
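
As a rough illustration of what "dynamic" means here, the sketch below builds a W3C SRGS-style XML grammar from a list of phrases that is only known at run time, such as blog post titles read from a database. The helper names `escapeXml` and `buildGrammar`, the rule id, and the sample titles are assumptions made for this example, and how the resulting string is handed to Speech Server is left out.

```javascript
// Escape characters that would otherwise break the XML.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// Build a one-rule grammar whose alternatives are the given phrases.
// Hypothetical helper; the output follows the general SRGS XML shape.
function buildGrammar(ruleId, phrases) {
  // Keep rule ids to a conservative character set so they are safe in attributes.
  if (!/^[A-Za-z_][A-Za-z0-9_]*$/.test(ruleId)) {
    throw new Error("rule id must be a simple identifier");
  }
  const cleaned = phrases.map((p) => p.trim()).filter((p) => p.length > 0);
  if (cleaned.length === 0) {
    throw new Error("a grammar needs at least one phrase");
  }
  const items = cleaned.map((p) => `      <item>${escapeXml(p)}</item>`);
  return [
    `<grammar version="1.0" xml:lang="en-US" root="${ruleId}"`,
    `         xmlns="http://www.w3.org/2001/06/grammar">`,
    `  <rule id="${ruleId}" scope="public">`,
    "    <one-of>",
    ...items,
    "    </one-of>",
    "  </rule>",
    "</grammar>",
  ].join("\n");
}

// Made-up titles standing in for data that is unknown at design time.
const titles = ["Speech Server tips", "Grammars & caching", "  "];
const grammarXml = buildGrammar("PostTitles", titles);
```

Because the text changes with the data, a grammar built this way is a new document each time, which is one way to picture why caching is harder to benefit from.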

24
Example Grammar
    <!-- Yes._value: string ("Yes") -->
    <rule id="Yes" scope="public">
      <example> yes </example>
      <example> yes please </example>
      <example> correct </example>
      <example> sure </example>
      <example> ok </example>
      <item>
        <one-of>
          <item> yes </item>
          <item> yeah </item>
          <item> yeh </item>
          <item> ya </item>
          <item> yup </item>
          <item> yep </item>
          <item> indeed </item>
          <item> positive </item>
          <item> ok </item>
          <item> sure </item>

25
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

26
Multi-Modality
  • No one wants you to read the screen
  • Think of this as an opportunity to deliver extra
    content, help around a topic, or to allow for
    mixed initiative interactions.
  • Small screen real-estate and/or difficult data
    entry paradigms are perfect candidates!

27
Agenda
  • Our Sample Application
  • Speech Flow Control
  • Interaction With Outside Data Sources
  • Dynamic Grammars
  • Multi-Modal: Computer? Hello Computer?
  • Design Considerations
  • Questions?

28
Design Considerations
  • Less is not more, but it is sometimes better than more
      • "We are pleased to..."
      • "I'm sorry, I didn't understand"
  • Design of the VUI must be done up front
  • Design for your user base
      • Are most one-time callers?
      • Should our system be advanced and allow
        mixed-initiative responses?
      • Is Yes/No really that hard to understand?

29
Questions?
30
Thank You!