HLT Development

Slides: 18
Provided by: AlonL
Learn more at: http://www.cs.cmu.edu
Transcript and Presenter's Notes

1
HLT Development
  • NESPOLE! Pittsburgh Meeting
  • December 4, 2000

2
Session Agenda
  • HLT Server demo
  • Partner updates on HLT module development: SR,
    Analysis, Generation
  • Status of HLT servers and architecture:
    functionality, coverage
  • Status of data collection, transcription and
    annotation
  • Planning of D6 annotated data for 1st showcase
  • Development timelines

3
Timeline for Integration and HLT Development
  • Mar 1, 2001: Demonstration at EC
  • Feb 23, 2001: Complete system fully tested
  • Feb 12, 2001: Start intensive tests between E/G/F
    clients and I agent at APT - initial user
    studies!
  • Jan 29, 2001: System integration complete, begin
    technical tests
  • Jan 15, 2001: Each site completes integration with
    Aethra mediator, starts tests of integration

4
D6 Annotated Data for SC-1
  • Description of scenario development
  • Description of data collection procedures
  • Summary of data collected
  • Annotated data for the four languages? (at least
    samples). Fabio: check with PO

5
Discussion Issues for Tuesday
  • Prioritize scenarios, focus on ONE?
  • Functionality of mediator: audio transmission to
    both sides
  • Status of mediator-HLT integration (timestamps
    etc.)
  • Lessons learned from data collection - Celine and
    Susi

6
Nespole! HLT Objectives
  • Scalability - expansion of existing domain
  • expanding coverage of IF to broader Travel Domain
    as required for first showcase
  • development of analysis and generation approaches
    that support easy expansion
  • new broad and general IF representation, and
    appropriate analysis and generation approaches

7
Nespole! HLT Objectives
  • Portability - easy expansion into new domains
  • extending existing IF with Domain Actions for
    other domains (Help Desk for 2nd showcase)
  • new broad IF representation
  • new analysis and generation approaches that are
    appropriate for the new broad IF

8
Nespole! HLT Objectives
  • Robustness - ability to handle more corrupt input
    and graceful degradation of performance
  • multiple alternative analysis/translation
    approaches
  • better identification of out-of-domain utterances
    and confidence measures

9
HLT Server Components
  • Each HLT Server consists of an Analysis Chain and
    a Generation Chain
  • Analysis Chain
  • Speech Recognition -> analysis into IF
  • Generation Chain
  • Generation from IF -> Speech Synthesis
  • Each site free to develop its own analysis and
    generation technology
  • Communication between modules is primarily via
    IF, using the CommSwitch server and protocol
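
The two chains on this slide can be sketched in Python as below. This is only an illustration of the data flow: the `HLTServer` class, the `translate` helper, and the one-word IF labels are hypothetical, the real IF format and the CommSwitch protocol are not shown in the slides, and "audio" is faked as UTF-8 text bytes.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stand-in: an IF message is just a string label here.
IF = str


@dataclass
class HLTServer:
    """One site's server: an analysis chain plus a generation chain."""
    language: str
    recognize: Callable[[bytes], str]    # speech recognition: audio -> text
    analyze: Callable[[str], IF]         # analysis: text -> IF
    generate: Callable[[IF], str]        # generation: IF -> text
    synthesize: Callable[[str], bytes]   # speech synthesis: text -> audio

    def analysis_chain(self, audio: bytes) -> IF:
        return self.analyze(self.recognize(audio))

    def generation_chain(self, interlingua: IF) -> bytes:
        return self.synthesize(self.generate(interlingua))


def translate(source: HLTServer, target: HLTServer, audio: bytes) -> bytes:
    # Only the IF crosses between the two servers, so each site can
    # swap in its own analysis and generation technology.
    return target.generation_chain(source.analysis_chain(audio))


# Toy modules (assumptions for the demo, not real engines).
english = HLTServer(
    language="en",
    recognize=lambda audio: audio.decode("utf-8"),
    analyze=lambda text: "greeting" if "hello" in text.lower() else "unknown",
    generate=lambda if_: {"greeting": "hello"}.get(if_, "sorry?"),
    synthesize=lambda text: text.encode("utf-8"),
)
italian = HLTServer(
    language="it",
    recognize=lambda audio: audio.decode("utf-8"),
    analyze=lambda text: "greeting" if "buongiorno" in text.lower() else "unknown",
    generate=lambda if_: {"greeting": "buongiorno"}.get(if_, "scusi?"),
    synthesize=lambda text: text.encode("utf-8"),
)

result = translate(english, italian, b"Hello there")
```

Because the servers share nothing but the IF, replacing one site's `analyze` or `generate` function leaves the other site untouched, which is the plug-and-play property the architecture aims for.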

10
Main Constraints and Requirements
  • Maintain site technology freedom and distributed
    HLT development as much as possible
  • Leverage off existing C-STAR technology
  • start with existing analysis and generation
    engines
  • use (extend) C-STAR CommSwitch protocol
  • New server architecture allows:
  • constant availability for testing and development
  • plug-and-play of new modules
  • separation of external API issues from required
    HLT communication

11
CMU/UKA Approach
  • New analysis approach for domain-specific
    task-oriented language combines rule-based and
    statistical/trainable methods
  • New analysis engine for new style IF, using chunk
    parser followed by new combiner and mapper
  • Possible addition of MEMT direct translation
    approach for coverage and robustness
  • Effective combination and disambiguation of all
    of the above approaches
  • New generation from IF using GenKit

12
New Approach: SALT
  • SALT - Statistical Analyzer for Language Translation
  • Combines ML trainable and rule-based analysis
    methods for robustness and portability
  • Rule-based parsing restricted to well-defined set
    of argument-level phrases and fragments
  • Trainable classifiers (NN, Decision Trees, etc.)
    used to derive the DA (speech-act and concepts)
    from the sequence of argument concepts.
  • Phrase-level grammars are more robust and
    portable to new domains
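
A minimal sketch of the two-stage idea, under stated assumptions: regular expressions stand in for the phrase-level grammars, and a small naive Bayes classifier stands in for the trainable classifiers (the slide names NN and decision trees; naive Bayes is swapped in here only to keep the example self-contained). The patterns, concept names, and DA labels are all hypothetical.

```python
import math
import re
from collections import Counter, defaultdict
from typing import List, Tuple

# Stage 1 (rule-based): hypothetical argument-level patterns.
ARGUMENT_PATTERNS = [
    ("price", re.compile(r"\b(how much|price|cost)\b")),
    ("time", re.compile(r"\b(when|tomorrow|\d{1,2} o'clock)\b")),
    ("hotel", re.compile(r"\b(hotel|room)\b")),
    ("request", re.compile(r"\b(i want|i would like|can i)\b")),
]


def extract_arguments(utterance: str) -> List[str]:
    """Return the argument concepts found, in order of appearance."""
    text = utterance.lower()
    found = []
    for concept, pattern in ARGUMENT_PATTERNS:
        match = pattern.search(text)
        if match:
            found.append((match.start(), concept))
    return [concept for _, concept in sorted(found)]


# Stage 2 (trainable): naive Bayes over argument concepts.
class DomainActionClassifier:
    def __init__(self) -> None:
        self.label_counts: Counter = Counter()
        self.concept_counts = defaultdict(Counter)
        self.vocab = set()

    def train(self, examples: List[Tuple[List[str], str]]) -> None:
        for concepts, label in examples:
            self.label_counts[label] += 1
            self.concept_counts[label].update(concepts)
            self.vocab.update(concepts)

    def predict(self, concepts: List[str]) -> str:
        total = sum(self.label_counts.values())
        best_label, best_score = "", float("-inf")
        for label, count in self.label_counts.items():
            score = math.log(count / total)
            # Add-one smoothing over the concept vocabulary.
            denom = sum(self.concept_counts[label].values()) + len(self.vocab)
            for concept in concepts:
                score += math.log((self.concept_counts[label][concept] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


classifier = DomainActionClassifier()
classifier.train([
    (["request", "hotel"], "request-reservation"),
    (["price", "hotel"], "ask-price"),
    (["time"], "ask-time"),
])

arguments = extract_arguments("How much does the room cost?")
domain_action = classifier.predict(arguments)
```

Porting this sketch to a new domain would mean adding argument patterns and retraining the classifier on newly annotated examples, rather than writing a full-sentence grammar.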

13
Alternative Approach: MEMT
  • Multi Engine Machine Translation
  • Translates directly into target language (no IF)
  • Based on Pangloss/Diplomat translation system
    developed at CMU
  • Uses a combination of EBMT, phrase glossaries and
    a bilingual dictionary
  • English/German system operational
  • Good fall-back for uncovered utterances
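
The multi-engine idea can be illustrated with a toy sketch in which three engines each propose a translation with a score and the best proposal wins. This is a deliberate simplification and not the Pangloss/Diplomat algorithm: the tables, engine weights, and coverage-based scoring are assumptions made for the example.

```python
import re
from typing import Dict, List, Optional, Tuple

# Toy English -> German resources (hypothetical, lower-cased).
EXAMPLES: Dict[str, str] = {"where is the hotel": "wo ist das hotel"}
GLOSSARY: Dict[str, str] = {"good morning": "guten morgen", "thank you": "danke"}
DICTIONARY: Dict[str, str] = {"where": "wo", "is": "ist", "the": "das",
                              "hotel": "hotel", "room": "zimmer"}


def normalize(sentence: str) -> List[str]:
    """Lower-case, drop punctuation, split into words."""
    return re.sub(r"[^\w\s]", "", sentence.lower()).split()


def phrase_translate(words: List[str], table: Dict[str, str],
                     max_len: int) -> Tuple[str, float]:
    """Greedy longest-match lookup; returns (translation, coverage)."""
    out, covered, i = [], 0, 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            phrase = " ".join(words[i:i + n])
            if phrase in table:
                out.append(table[phrase])
                covered += n
                i += n
                break
        else:
            out.append(words[i])  # pass unknown word through unchanged
            i += 1
    coverage = covered / len(words) if words else 0.0
    return " ".join(out), coverage


def ebmt(words: List[str]) -> Optional[Tuple[str, float]]:
    """Stand-in for EBMT: exact match against stored examples only."""
    hit = EXAMPLES.get(" ".join(words))
    return (hit, 1.0) if hit else None


def memt(sentence: str) -> Tuple[str, str]:
    """Return (winning engine, translation) for the best-scoring proposal."""
    words = normalize(sentence)
    candidates = []
    example = ebmt(words)
    if example:
        candidates.append((1.0 * example[1], "ebmt", example[0]))
    text, coverage = phrase_translate(words, GLOSSARY, max_len=3)
    candidates.append((0.8 * coverage, "glossary", text))
    text, coverage = phrase_translate(words, DICTIONARY, max_len=1)
    candidates.append((0.5 * coverage, "dictionary", text))
    _, engine, translation = max(candidates, key=lambda c: c[0])
    return engine, translation
```

The dictionary engine scores lowest but always produces something, which mirrors the fall-back role described above for utterances the other approaches do not cover.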

14
Data Collection for First Showcase
  • Data collection with APT agent
  • real dialogues between users and APT agents
  • monolingual dialogues
  • 288 English dialogues collected in 4 sessions
  • 28 dialogues transcribed
  • none annotated with IF (yet)
  • Lessons and Comments
  • realistic scenario
  • uneven dialogues: agent dominates conversation
  • problems with recording/collection setup

15
Data Transcription and Annotation
  • May-00 Goals and Time-line
  • 50 dialogues per language, 4 dialogues per hour
  • data collection by end of August
  • transcription by end of September
  • Annotation with IF by end of October
  • Revised schedule...

16
Points for Discussion
  • Definition of the Scenario for SC-1
  • Timeline for data annotation
  • Timeline for HLT module development
  • Planning D6

17
Definition of Scenario (May-00)
  • Analysis of APT email data (Paolo)
  • 9 main categories
  • developed 20 specific scenarios
  • APT will look at scenarios and prioritize them,
    and prioritize web pages (for translation to
    French) within 10 days
  • We will use existing web pages for APT (in
    I,G,E), and some translated into French
  • Goal is to focus on up to 10 scenarios