Transcript and Presenter's Notes

Title: QUALIFIER in TREC-12 QA Main Task


1
QUALIFIER in TREC-12 QA Main Task
  • Hui Yang, Hang Cui, Min-Yen Kan, Mstislav
    Maslennikov, Long Qiu, Tat-Seng Chua
  • School of Computing
  • National University of Singapore
  • Email: yangh@comp.nus.edu.sg

2
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

3
Introduction
  • Given a question and a large text corpus, return
    an answer rather than relevant documents
  • QA is at the intersection of IR, IE, and NLP
  • Our system: QUALIFIER
  • Consists of 3 subsystems: factoid, list, and
    definition
  • External resources: Web, WordNet, Ontology
  • Event-based Question Answering
  • New modules introduced

4
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

5
Factoid System Overview
6
Factoid Subsystem
  • Detailed Question Analysis
  • QA Event Construction
  • QA Event Mining
  • Answer Selection
  • Answer Justification
  • Fine-grained Named Entity Recognition
  • Anaphora Resolution
  • Canonicalization / Coreference
  • Successive Constraint Relaxation

8
Why Event-based QA - I
  • The world consists of two basic types of things:
    entities and events, and people often ask
    questions about them.
  • From Question Answering's point of view:
  • Questions are enquiries about entities or events.

9
Why Event-based QA - II
  • QA Entities
  • Anything having existence (living or nonliving)
  • E.g. What is the Democratic Party symbol?
  • QA Events
  • Something that happens at a given place and
    time.
  • E.g. How did the donkey become the Democratic
    Party symbol?

(Answer: Thomas Nast's 1870 Harper's Weekly cartoon)
10
Why Event-based QA - III
  • Entity Questions ask about
  • properties of entities, or
  • the entities themselves
    (definition questions)
  • Event Questions ask about
  • elements of events:
  • Location,
  • Time,
  • Subject,
  • Object,
  • Quantity,
  • Description,
  • Action, etc.
  • Table 1: Correspondence of WH-Questions and Event
    Elements

question about    maps to
event             event_element: time, location, subject,
                  object, quantity, description, action, other
entity            entity_property: object, subject, quantity,
                  description, other
11
Event-based QA Hypothesis
  • Equivalency: for QA events Ei, Ej, if
    all_elements(Ei) = all_elements(Ej), then Ei =
    Ej, and vice versa
  • Generality: if all_elements(Ei) is a subset of
    all_elements(Ej), then Ei is more general than
    Ej
  • Cohesiveness: if elements a, b both belong to an
    event Ei, and a, c do not belong to a known
    event, then co-occurrence(a, b) is greater than
    co-occurrence(a, c)
  • Predictability: if elements a, b both belong to
    an event Ei, then a predicts b and b predicts a.

12
QA Event Space
  • Consider an event to be a point in a
    multi-dimensional QA event space.
  • If we know all the elements of an event, then
    we can easily answer different questions about it
  • E.g. When did Bob Marley die?
  • As there are innate associations among these
    elements when they belong to the same event
    (Cohesiveness), we can use the elements already
    known
  • To narrow the search scope
  • To find the rest of the unknown event elements,
    i.e., the answer (Predictability), as in the
    sketch below
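
A minimal illustration of this idea in Python (an assumption for
exposition, not QUALIFIER's actual data structure): an event is a
point whose coordinates are its elements, and the empty coordinate
is the answer being sought.

    # An event as a partially known set of elements; the known
    # slots constrain the search, the missing slot is the answer.
    event = {
        "subject": "Bob Marley",   # known from the question
        "action": "die",           # known from the question
        "time": None,              # unknown element = the answer
    }

    known = {k: v for k, v in event.items() if v is not None}
    unknown = [k for k, v in event.items() if v is None]
    print(known)    # {'subject': 'Bob Marley', 'action': 'die'}
    print(unknown)  # ['time']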

13
Problems to be Solved
  • However, for most of the cases, it is difficult
    to find the correct unknown element(s), i.e., the
    correct answer
  • Two major problems
  • Insufficient known elements
  • Inexact known elements
  • Solution
  • Explore the use of world knowledge (Web and
    WordNet glosses) to find more known elements
  • Exploit the lexical knowledge from (WordNet
    synsets and morphemics) to find exact forms.

14
How to Find a QA Event
  • Using the Web
  • From the original query q(0), retrieve the top N
    web documents
  • For each qi(0) ∈ q(0), extract nearby non-trivial
    words (in the same sentence or at most n words
    away) into Cq and rank them by their probability
    of correlation with qi(0)
  • Using WordNet
  • For each qi(0) ∈ q(0), extract terms that are
    lexically related to qi(0) by locating them in the
    gloss set Gq and synset Sq
  • Combine the external knowledge resources to form
    the term collection (a sketch follows)
  • Kq = Cq ∪ (Gq ∪ Sq)
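
A hedged sketch of this combination step; the three helper
functions below are hypothetical stand-ins for the Web retrieval
and WordNet lookups described above.

    def web_context_terms(query_terms, n_docs=100, window=10):
        """Non-trivial words co-occurring with query terms in the
        top-N web documents, ranked by correlation (stub)."""
        return set()  # would call a search engine here

    def wordnet_gloss_terms(query_terms):
        """Terms found in WordNet glosses of the query terms (stub)."""
        return set()

    def wordnet_synset_terms(query_terms):
        """Terms from WordNet synsets of the query terms (stub)."""
        return set()

    q0 = {"Spanish", "explorer", "discovered", "Mississippi", "River"}
    Cq = web_context_terms(q0)
    Gq = wordnet_gloss_terms(q0)
    Sq = wordnet_synset_terms(q0)

    Kq = Cq | (Gq | Sq)  # Kq = Cq ∪ (Gq ∪ Sq), as on the slide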

15
QA Event Construction
  • Structured Query Formulation
  • We perform structural analysis on Kq to form
    semantic groups of terms
  • Given any two distinct terms ti, tj ∈ Kq, we
    compute their
  • Lexical correlation
  • Co-occurrence correlation
  • Distance correlation
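
A sketch of how two of these correlations might be computed; the
exact formulas and their weighting are not given on the slide, so
the functions below are illustrative assumptions.

    def cooccurrence_corr(ti, tj, sentences):
        """Fraction of sentences containing both terms."""
        both = sum(1 for s in sentences if ti in s and tj in s)
        return both / max(len(sentences), 1)

    def distance_corr(ti, tj, tokens):
        """Inverse of the closest distance between the two terms."""
        pos_i = [k for k, t in enumerate(tokens) if t == ti]
        pos_j = [k for k, t in enumerate(tokens) if t == tj]
        if not pos_i or not pos_j:
            return 0.0
        return 1.0 / min(abs(a - b) for a in pos_i for b in pos_j)

    # Term pairs whose combined correlation clears a threshold go
    # into the same semantic group, e.g. {"Hernando", "Soto", "De"}.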

16
QA Event Construction
  • For example: What Spanish explorer discovered
    the Mississippi River?

The final Boolean query becomes (Mississippi) AND
(French OR Spanish) AND (Hernando OR Soto OR De) AND
(1541) AND (explorer) AND (first OR European OR river).
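
A small sketch of how such a query could be assembled from the
semantic groups, assuming each group is a disjunction and the
groups are joined conjunctively (the operators are not explicit on
the original slide):

    groups = [
        ["Mississippi"],
        ["French", "Spanish"],
        ["Hernando", "Soto", "De"],
        ["1541"],
        ["explorer"],
        ["first", "European", "river"],
    ]

    query = " AND ".join("(" + " OR ".join(g) + ")" for g in groups)
    print(query)
    # (Mississippi) AND (French OR Spanish) AND (Hernando OR Soto OR De)
    #   AND (1541) AND (explorer) AND (first OR European OR river)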
17
QA Event Mining
  • Extract important association rules among the
    elements by using data mining techniques.
  • Given a QA event Ei, we define X, Y as two sets
    of event elements.
  • Event mining studies rules of the form X → Y,
    where X, Y are QA event element sets, X ∩ Y = ∅,
    and Y ∩ element_original ≠ ∅.
  • If X ⊇ Y, ignore X → Y.
  • If cardinality(Y) > 1, ignore X → Y.
  • If Y ∩ element_original = ∅, ignore X → Y
    (pruning sketched below).
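
A sketch of this pruning as a single predicate, following the
reading of the conditions given above; X, Y, and element_original
are sets of event elements:

    def keep_rule(X, Y, element_original):
        """Keep X -> Y only if it passes all pruning conditions."""
        if X >= Y:                      # X ⊇ Y: trivial rule
            return False
        if len(Y) != 1:                 # multi-element consequent
            return False
        if not (Y & element_original):  # unrelated to the question
            return False
        return True

    print(keep_rule({"Mississippi", "explorer"}, {"Spanish"},
                    {"Spanish", "explorer", "Mississippi"}))  # True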

18
Passage Answer Selection
  • Select passages based on the Answer Event Score
    (AES) from the relevant documents in the QA corpus
  • Support(X → Y)
  • Confidence(X → Y)
  • The weight for answer candidate j is defined in
    terms of these support and confidence scores.
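
For reference, the standard data-mining definitions of these two
quantities (which may differ in detail from the exact AES
weighting) are

    \mathrm{support}(X \to Y) = P(X \cup Y), \qquad
    \mathrm{confidence}(X \to Y) = \frac{P(X \cup Y)}{P(X)} = P(Y \mid X)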

19
Related Modules: Fine-grained Named Entity
Recognition
  • Fine-grained NE Tagging
  • Non-ASCII Character Remover
  • Number Format Converter
  • E.g. one hundred eleven → 111 (sketched below)
  • Rule Conflict Resolver, using
  • Longer length
  • Ontology
  • Handcrafted priorities
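
A minimal sketch of the number-format conversion mentioned above
("one hundred eleven" → 111), covering numbers below one thousand
(an illustration, not the system's actual converter):

    UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
             "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
             "eleven": 11, "twelve": 12, "thirteen": 13,
             "fourteen": 14, "fifteen": 15, "sixteen": 16,
             "seventeen": 17, "eighteen": 18, "nineteen": 19}
    TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
            "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}

    def words_to_number(text):
        """Convert an English number phrase (< 1000) to an integer."""
        total = 0
        for word in text.lower().replace("-", " ").split():
            if word in UNITS:
                total += UNITS[word]
            elif word in TENS:
                total += TENS[word]
            elif word == "hundred":
                total *= 100
        return total

    print(words_to_number("one hundred eleven"))  # 111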

20
Related Modules: Answer Justification
  • We generate axioms based on our manually
    constructed ontology. For example,
  • q1425: What is the population of Maryland?
  • Sentence: Maryland's population is 50,000 and
    growing rapidly.
  • Ontology Axiom (OA): Maryland(c1) ∧
    population(c1, c2) → 5,000,000(c2)
  • In this way, we can identify 50,000, the value
    in the surface text, as a wrong answer, as
    sketched below.
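
A sketch of this check in Python; the axiom table and the relative
tolerance are illustrative assumptions, not the system's actual
axiom representation:

    ONTOLOGY = {("Maryland", "population"): 5_000_000}

    def justified(entity, prop, candidate, tolerance=0.5):
        """Accept a numeric candidate only if it is within a
        relative tolerance of the ontology's expected value."""
        expected = ONTOLOGY.get((entity, prop))
        if expected is None:
            return True  # no axiom to check against
        return abs(candidate - expected) / expected <= tolerance

    print(justified("Maryland", "population", 50_000))     # False
    print(justified("Maryland", "population", 5_300_000))  # True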

21
Factoid Results
22
Factoid Results
23
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

24
List System Overview
25
List Subsystem
  • Multiple answers from the same paragraph
  • Canonicalization resolution
  • Unique answer for variants such as
  • the States, USA, United States, etc.
  • Pattern-based answer extraction (a sketch follows)
  • <A1>, <A2>, ... and <An> <verb>
  • <NP> include <A1>, <A2>, ...
  • list of <NP>
  • top <number> <adj-superlative> <NP>
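
A sketch of extraction for one of these patterns ("<NP> include
<A1>, <A2>, ..."); the regular expression is an illustrative
approximation, not the system's actual pattern:

    import re

    PATTERN = re.compile(r"include[sd]?\s+([^.]+?)(?:\.|$)")

    def extract_list_answers(sentence):
        m = PATTERN.search(sentence)
        if not m:
            return []
        items = re.split(r",\s*|\s+and\s+", m.group(1))
        return [i.strip() for i in items if i.strip()]

    print(extract_list_answers(
        "Famous reggae artists include Bob Marley, Peter Tosh and Jimmy Cliff."))
    # ['Bob Marley', 'Peter Tosh', 'Jimmy Cliff']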

26
List Results
27
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

28
System Overview
29
Definition Subsystem
30
Definition Subsystem
  • Pre-processing
  • Document filter
  • Anaphora resolution
  • Sentence positive set and negative set
  • Sentence Ranking
  • Sentence weighting in the corpus
  • Sentence weighting on the Web
  • Overall weighting

31
Definition Subsystem
  • Answer Generation (Progressive Maximal Marginal
    Relevance)
  • 1) Order all sentences in descending order of
    weight.
  • 2) Add the first sentence to the summary.
  • 3) Examine the following sentences: if
    Weight(stc) - Weight(next_stc) < avg_sim(stc),
    add next_stc to the summary.
  • 4) Go to Step 3) until the length limit of the
    target summary is reached.
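
A sketch of this loop; the slide's comparison operator is garbled
in this transcript, so "<" is assumed, and avg_sim is taken to be
the candidate's average similarity to the sentences already
selected:

    def progressive_mmr(ranked, weight, sim, max_sentences):
        """ranked: sentences in descending weight order;
        weight: sentence -> score; sim(a, b): similarity."""
        summary = [ranked[0]]
        prev = ranked[0]
        for nxt in ranked[1:]:
            if len(summary) >= max_sentences:  # simplified length limit
                break
            avg_sim = sum(sim(nxt, s) for s in summary) / len(summary)
            if weight[prev] - weight[nxt] < avg_sim:
                summary.append(nxt)
            prev = nxt
        return summary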

32
Definition Results
  • We empirically set the length of the summary for
    People and Objects based on question
    classification results.

33
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

34
Overall Performance
35
Conclusion and Future Work
  • Conclusion
  • Event-based Question Answering
  • Factoid and list questions explore the power of
    Event-based QA
  • Definition question answering combines IR and
    Summarization
  • Use of an ontology boosts the performance of our
    NE and answer justification modules
  • Future Work
  • Give a formal proof of our QA event hypothesis
  • Work towards an online question answering
    system
  • Interactive QA
  • Analysis and opinion questions
  • VideoQA: question answering on news video