NESPOLE! Project Status, Carnegie Mellon University

Provided by: AlonL

1
NESPOLE! Project Status, Carnegie Mellon University
  • Ancona Meeting
  • April 18-19, 2002

2
Recent Developments Apr-02
  • Improved analysis and generation grammars (using
    old C-STAR data)
  • Improved SR engines
  • Packet-loss, video, and modem connection tests
  • Data Collection for Showcase 2A
  • Evaluation Scheme Experiment
  • Paper and Demo at HLT-02
  • Paper submissions to ACL-02, ICSLP-02, ESSLLI-02

3
IF Status Report
  • Presented by Donna Gates

4
WP5 HLT Modules
  • Data Collection for Showcase-2A completed in
    February-2002
  • Status of transcriptions from all sites?
  • CMU will maintain a data repository (Alon
    collecting all data CDs here)
  • IF discussions and development have already
    started (Donna)
  • Development Schedule?

5
WP7 Evaluation
  • D9 Evaluation of Showcase-1 Report: draft
    circulated earlier this week
  • Each site should verify that the most up-to-date
    results are being reported
  • Include detailed tables in the report?
  • Majority vote: finalize a common procedure
  • New evaluation experiments

6
Majority Vote Scheme
  • Issue: did all sites use the same guidelines?
  • What to do when there is no majority?
    • e.g., 4 graders assign P/P/K/K
  • What to do when there is complete disagreement?
    • e.g., 3 graders assign P/K/B
  • Need to recalculate scores from the previous
    evaluation?
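The two problem cases above can be detected mechanically. The following is a minimal sketch, not the project's agreed procedure: it assumes a strict-majority rule (more than half of the graders), and the function name `majority_grade` is hypothetical. It only flags the undecided cases and leaves open how they should be resolved.

```python
from collections import Counter

def majority_grade(grades):
    """Return the grade chosen by a strict majority of graders, else None.

    Assumption: "majority" means more than half of the graders,
    so ties and complete disagreement both yield None.
    """
    if not grades:
        return None
    grade, count = Counter(grades).most_common(1)[0]
    return grade if count > len(grades) / 2 else None

print(majority_grade(["P", "P", "K"]))       # clear majority
print(majority_grade(["P", "P", "K", "K"]))  # no majority (tie)
print(majority_grade(["P", "K", "B"]))       # complete disagreement
```

A `None` result marks an item that needs a tie-breaking rule, such as an additional grader or a fallback to the lower grade. Which rule to adopt is exactly the open question on this slide.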

7
New Evaluation Experiments
  • We are investigating three main issues
    • Binary versus 3-way grading
    • Majority vote versus averaging of scores
    • Intercoder and Intracoder agreement
  • Grading Experiment
    • Four groups, three graders in each group
    • Each group grades two sets, two weeks apart
    • Sets are different but have a common large
      overlap
    • Groups differ in eval scheme used (binary/3-way)
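To illustrate how majority voting and score averaging can diverge, here is a small sketch under stated assumptions. It assumes the grades P and K both count as acceptable and B as unacceptable. That mapping, the names `ACCEPTABLE`, `majority_score`, `average_score` and `summarize`, and the sample grades are all hypothetical and do not come from the evaluation data.

```python
ACCEPTABLE = {"P", "K"}  # assumption: P and K are acceptable, B is not

def majority_score(grades):
    """1.0 or 0.0 by strict majority on acceptability; None on an exact tie."""
    votes = sum(g in ACCEPTABLE for g in grades)
    if votes * 2 == len(grades):
        return None
    return 1.0 if votes * 2 > len(grades) else 0.0

def average_score(grades):
    """Fraction of graders who judged the item acceptable."""
    return sum(g in ACCEPTABLE for g in grades) / len(grades)

def summarize(items):
    """Aggregate both schemes over a list of per-item grade lists."""
    majority = [majority_score(g) for g in items]
    decided = [m for m in majority if m is not None]
    averages = [average_score(g) for g in items]
    return {
        "majority": sum(decided) / len(decided) if decided else None,
        "average": sum(averages) / len(averages),
        "undecided": len(majority) - len(decided),
    }

# Illustrative grades only: three items, three graders each.
items = [["P", "K", "B"], ["P", "P", "B"], ["B", "B", "B"]]
print(summarize(items))
```

On this toy input, the majority scheme counts two of three items as acceptable, while averaging gives a lower figure because each dissenting B grade still lowers the score.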

8
Planned Analysis of Data
  • Compare results across grading schemes (binary
    vs. 3-way) on same set of data
  • Compare majority scores with average scores
  • Evaluate Intercoder agreement between graders (on
    same set and same scheme)
  • Evaluate Intracoder agreement of same grader (on
    overlap data in the two sets, same grading scheme
    in both sessions)
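The agreement analyses above could be computed along the following lines. This is a sketch only: the slides do not name an agreement statistic, so the use of raw percent agreement and Cohen's kappa here is an assumption, and the function names and sample grades are hypothetical. The same functions cover both cases: pass two graders' grades on the same items for intercoder agreement, or one grader's two sessions on the overlap items for intracoder agreement.

```python
from collections import Counter

def percent_agreement(a, b):
    """Fraction of items on which the two grade sequences match."""
    if len(a) != len(b) or not a:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)

def cohen_kappa(a, b):
    """Cohen's kappa: agreement corrected for chance agreement."""
    n = len(a)
    p_o = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each sequence's label frequencies.
    p_e = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both sequences use one identical label throughout
    return (p_o - p_e) / (1 - p_e)

# Illustrative data: one grader's two sessions on six overlap items.
session_1 = ["P", "K", "B", "P", "K", "B"]
session_2 = ["P", "K", "B", "P", "B", "B"]
print(percent_agreement(session_1, session_2))
print(cohen_kappa(session_1, session_2))
```

With three graders per group, one option is to report kappa for each pair of graders. A multi-rater statistic such as Fleiss' kappa would be an alternative.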

9
Preliminary Results

10
Plans for Final Evaluations
  • Improved end-to-end evaluations
  • Additional component evaluations?
  • Additional user studies?
  • How do we evaluate user interfaces, communication
    effectiveness?