To Share a Task or Not: Some Ramblings from a Mad (i.e., crazy) INLGer

Transcript and Presenter's Notes

1
To Share a Task or Not: Some Ramblings from a Mad
(i.e., crazy) INLGer
  • Kathy McCoy
  • CIS Department
  • University of Delaware

2
What is intended by "Shared Task"?
  • A competition for money?
  • A funded activity in itself?
  • A competition just for the fun of it?
  • A competition or a cooperation?
  • A cooperation would entail groups of researchers
    collaborating on a larger system (need
    agreed-upon architecture)
  • A competition would entail different groups
    working against each other on the same problem

3
What is the desired outcome?
  • An advance in technology that may be applicable
    in lots of different places?
  • An advance in NLG technology that will allow more
    commercialization? bigger web presence? more
    excitement?
  • More funding for INLG research?
  • More publications of INLG research?
  • Bring more people into the field?
  • Get some important task done (that needs INLG)?

4
What about Comparative Evaluations?
  • Major problem here is that we must agree on what
    is to be evaluated and how.
  • Must have a number of different groups working on
    precisely the same problem with same assumptions.
  • What is the desired outcome of comparative
    evaluations?
  • We get to name a system winner?
  • Presumably we would learn something about the
    task, but it isn't quite clear to me what that
    something is.

5
Two Ends of the Spectrum in Shared Task/Evaluations
  • The killer application
  • Text summarization should have been it!
  • Generates excitement in the field
  • Generates funding opportunities
  • Component pieces
  • Referring expression generation is an example
  • What will be accomplished?
  • Someone gets a gold star?

6
  • The kind you want depends on your ultimate goal.
  • Both share some dangers revolving around choice
    of evaluation methods.

7
Dangers in Shared Task
  • Exclusion
  • Shared task metrics become the de facto standard
    for evaluating research in the field
  • Doesn't allow one to do research that doesn't do
    well with the metrics (and the metrics are going
    to be prejudiced)
  • May leave generation behind: Killer Apps may
    find such interesting problems that generation
    becomes secondary.
  • Emphasis on shallow processing, excluding
    theoretical benefits

8
Multiple or Human-Based Metrics Don't Help

9
The Killer App Story
  • The application itself must define the
    appropriate metric(s): does the application
    work?
  • Many of the things we hold near and dear have a
    significantly smaller influence than some other
    things
  • Discourse coherence
  • Complicated syntax/variation in syntax
  • Lexical choice
  • Referring expression generation

10
But
  • We KNOW these things are important!
  • Problem becomes:
  • Other more important aspects are deemed to make
    more of a difference
  • By the time these issues come up, people have
    invested too much time into a particular kind of
    solution

11
Comparative Evaluations
  • The nature of the shared/agreed-upon evaluation
    methods places a judgment on the importance of some
    aspects over others
  • Evaluation is necessarily prejudiced with respect
    to which issues are stressed
  • Referring expressions: distinguishing
    descriptions with a concrete knowledge base
  • What about referring expressions in news stories?
    Pronoun use? Conjunctions? Influence of
    surrounding text?