Transcript and Presenter's Notes

Title: QUALIFIER in TREC-12 QA Main Task


1
QUALIFIER in TREC-12 QA Main Task
  • Hui Yang, Hang Cui, Min-Yen Kan, Mstislav
    Maslennikov, Long Qiu, Tat-Seng Chua
  • School of Computing
  • National University of Singapore
  • Email: yangh@comp.nus.edu.sg

2
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

3
Introduction
  • Given a question and a large text corpus, return
    an answer rather than relevant documents
  • QA is at the intersection of IR, IE, and NLP
  • Our system: QUALIFIER
  • Consists of 3 subsystems: factoid, list, and
    definition
  • External resources: Web, WordNet, Ontology
  • Event-based Question Answering
  • New modules introduced

4
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

5
Factoid System Overview
6
Factoid Subsystem
  • Detailed Question Analysis
  • QA Event Construction
  • QA Event Mining
  • Answer Selection
  • Answer Justification
  • Fine-grained Named Entity Recognition
  • Anaphora Resolution
  • Canonicalization / Coreference
  • Successive Constraint Relaxation

8
Why Event-based QA - I
  • The world consists of two basic types of things:
    entities and events, and people often ask
    questions about them.
  • From Question Answering's point of view:
  • Questions are enquiries about entities or events.

9
Why Event-based QA - II
  • QA Entities
  • Anything having existence (living or nonliving)
  • E.g. What is the Democratic Party symbol?
  • QA Events
  • Something that happens at a given place and
    time.
  • E.g. How did the donkey become the Democratic
    Party symbol?

(Answer: Thomas Nast's 1870 Harper's Weekly cartoon)
10
Why Event-based QA - III
  • Entity Questions ask about
  • properties of entities, or
  • the entities themselves
    (definition questions)
  • Event Questions ask about
  • elements of events:
  • Location,
  • Time,
  • Subject,
  • Object,
  • Quantity,
  • Description,
  • Action, etc.
  • Table 1: Correspondence of WH-Questions and Event
    Elements

question about    maps to
event             event_element: time, location, subject,
                  object, quantity, description, action, other
entity            entity_property: object, subject, quantity,
                  description, other
11
Event-based QA Hypothesis
  • Equivalency: for QA events Ei, Ej, if
    all_elements(Ei) = all_elements(Ej), then Ei =
    Ej, and vice versa
  • Generality: if all_elements(Ei) is a subset of
    all_elements(Ej), then Ei is more general than
    Ej
  • Cohesiveness: if elements a, b both belong to an
    event Ei, and a, c do not belong to a known
    event, then co-occurrence(a, b) is greater than
    co-occurrence(a, c)
  • Predictability: if elements a, b both belong to
    an event Ei, then a predicts b and b predicts a.

12
QA Event Space
  • Consider an event to be a point in a
    multi-dimensional QA event space.
  • If we know all the elements of an event, then
    we can easily answer different questions about it
  • E.g. When did Bob Marley die?
  • As there are innate associations among these
    elements when they belong to the same event
    (Cohesiveness), we can use the elements already
    known
  • To narrow the search scope
  • To find the rest of the unknown event elements,
    i.e., the answer (Predictability), as in the
    sketch below
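
A minimal illustration of this idea in Python (an assumption for
exposition, not QUALIFIER's actual data structure): an event is a
point whose coordinates are its elements, and the empty coordinate
is the answer being sought.

    # An event as a partially known set of elements; the known
    # slots constrain the search, the missing slot is the answer.
    event = {
        "subject": "Bob Marley",   # known from the question
        "action": "die",           # known from the question
        "time": None,              # unknown element = the answer
    }

    known = {k: v for k, v in event.items() if v is not None}
    unknown = [k for k, v in event.items() if v is None]
    print(known)    # {'subject': 'Bob Marley', 'action': 'die'}
    print(unknown)  # ['time']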

13
Problems to be Solved
  • However, for most of the cases, it is difficult
    to find the correct unknown element(s), i.e., the
    correct answer
  • Two major problems
  • Insufficient known elements
  • Inexact known elements
  • Solution
  • Explore the use of world knowledge (Web and
    WordNet glosses) to find more known elements
  • Exploit the lexical knowledge from (WordNet
    synsets and morphemics) to find exact forms.

14
How to Find a QA Event
  • Using the Web
  • From the original query q(0), retrieve the top N
    web documents
  • For each qi(0) ∈ q(0), extract nearby non-trivial
    words (in the same sentence or at most n words
    away) into Cq and rank them by their probability
    of correlation with qi(0)
  • Using WordNet
  • For each qi(0) ∈ q(0), extract terms that are
    lexically related to qi(0) by locating them in the
    gloss set Gq and synset Sq
  • Combine the external knowledge resources to form
    the term collection (a sketch follows)
  • Kq = Cq ∪ (Gq ∪ Sq)
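
A hedged sketch of this combination step; the three helper
functions below are hypothetical stand-ins for the Web retrieval
and WordNet lookups described above.

    def web_context_terms(query_terms, n_docs=100, window=10):
        """Non-trivial words co-occurring with query terms in the
        top-N web documents, ranked by correlation (stub)."""
        return set()  # would call a search engine here

    def wordnet_gloss_terms(query_terms):
        """Terms found in WordNet glosses of the query terms (stub)."""
        return set()

    def wordnet_synset_terms(query_terms):
        """Terms from WordNet synsets of the query terms (stub)."""
        return set()

    q0 = {"Spanish", "explorer", "discovered", "Mississippi", "River"}
    Cq = web_context_terms(q0)
    Gq = wordnet_gloss_terms(q0)
    Sq = wordnet_synset_terms(q0)

    Kq = Cq | (Gq | Sq)  # Kq = Cq ∪ (Gq ∪ Sq), as on the slide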

15
QA Event Construction
  • Structured Query Formulation
  • We perform structural analysis on Kq to form
    semantic groups of terms
  • Given any two distinct terms ti, tj ∈ Kq, we
    compute their
  • Lexical correlation
  • Co-occurrence correlation
  • Distance correlation
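
A sketch of how two of these correlations might be computed; the
exact formulas and their weighting are not given on the slide, so
the functions below are illustrative assumptions.

    def cooccurrence_corr(ti, tj, sentences):
        """Fraction of sentences containing both terms."""
        both = sum(1 for s in sentences if ti in s and tj in s)
        return both / max(len(sentences), 1)

    def distance_corr(ti, tj, tokens):
        """Inverse of the closest distance between the two terms."""
        pos_i = [k for k, t in enumerate(tokens) if t == ti]
        pos_j = [k for k, t in enumerate(tokens) if t == tj]
        if not pos_i or not pos_j:
            return 0.0
        return 1.0 / min(abs(a - b) for a in pos_i for b in pos_j)

    # Term pairs whose combined correlation clears a threshold go
    # into the same semantic group, e.g. {"Hernando", "Soto", "De"}.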

16
QA Event Construction
  • For example: What Spanish explorer discovered
    the Mississippi River?

The final Boolean query becomes (Mississippi) AND
(French OR Spanish) AND (Hernando OR Soto OR De) AND
(1541) AND (explorer) AND (first OR European OR river).
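
A small sketch of how such a query could be assembled from the
semantic groups, assuming each group is a disjunction and the
groups are joined conjunctively (the operators are not explicit on
the original slide):

    groups = [
        ["Mississippi"],
        ["French", "Spanish"],
        ["Hernando", "Soto", "De"],
        ["1541"],
        ["explorer"],
        ["first", "European", "river"],
    ]

    query = " AND ".join("(" + " OR ".join(g) + ")" for g in groups)
    print(query)
    # (Mississippi) AND (French OR Spanish) AND (Hernando OR Soto OR De)
    #   AND (1541) AND (explorer) AND (first OR European OR river)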
17
QA Event Mining
  • Extract important association rules among the
    elements by using data mining techniques.
  • Given a QA event Ei, we define X, Y as two sets
    of event elements.
  • Event mining studies rules of the form X → Y,
    where X, Y are QA event element sets, X ∩ Y = ∅,
    and Y ∩ element_original ≠ ∅.
  • If X ⊇ Y, ignore X → Y.
  • If cardinality(Y) > 1, ignore X → Y.
  • If Y ∩ element_original = ∅, ignore X → Y
    (pruning sketched below).
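
A sketch of this pruning as a single predicate, following the
reading of the conditions given above; X, Y, and element_original
are sets of event elements:

    def keep_rule(X, Y, element_original):
        """Keep X -> Y only if it passes all pruning conditions."""
        if X >= Y:                      # X ⊇ Y: trivial rule
            return False
        if len(Y) != 1:                 # multi-element consequent
            return False
        if not (Y & element_original):  # unrelated to the question
            return False
        return True

    print(keep_rule({"Mississippi", "explorer"}, {"Spanish"},
                    {"Spanish", "explorer", "Mississippi"}))  # True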

18
Passage Answer Selection
  • Select passages based on the Answer Event Score
    (AES) from the relevant documents in the QA corpus
  • Support(X → Y)
  • Confidence(X → Y)
  • The weight for answer candidate j is defined in
    terms of these support and confidence scores.
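
For reference, the standard data-mining definitions of these two
quantities (which may differ in detail from the exact AES
weighting) are

    \mathrm{support}(X \to Y) = P(X \cup Y), \qquad
    \mathrm{confidence}(X \to Y) = \frac{P(X \cup Y)}{P(X)} = P(Y \mid X)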

19
Related Modules: Fine-grained Named Entity
Recognition
  • Fine-grained NE Tagging
  • Non-ASCII Character Remover
  • Number Format Converter
  • E.g. one hundred eleven → 111 (sketched below)
  • Rule Conflict Resolver, using
  • Longer length
  • Ontology
  • Handcrafted priorities
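
A minimal sketch of the number-format conversion mentioned above
("one hundred eleven" → 111), covering numbers below one thousand
(an illustration, not the system's actual converter):

    UNITS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
             "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
             "eleven": 11, "twelve": 12, "thirteen": 13,
             "fourteen": 14, "fifteen": 15, "sixteen": 16,
             "seventeen": 17, "eighteen": 18, "nineteen": 19}
    TENS = {"twenty": 20, "thirty": 30, "forty": 40, "fifty": 50,
            "sixty": 60, "seventy": 70, "eighty": 80, "ninety": 90}

    def words_to_number(text):
        """Convert an English number phrase (< 1000) to an integer."""
        total = 0
        for word in text.lower().replace("-", " ").split():
            if word in UNITS:
                total += UNITS[word]
            elif word in TENS:
                total += TENS[word]
            elif word == "hundred":
                total *= 100
        return total

    print(words_to_number("one hundred eleven"))  # 111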

20
Related Modules: Answer Justification
  • We generate axioms based on our manually
    constructed ontology. For example,
  • q1425: What is the population of Maryland?
  • Sentence: Maryland's population is 50,000 and
    growing rapidly.
  • Ontology Axiom (OA): Maryland(c1) ∧
    population(c1, c2) → 5,000,000(c2)
  • In this way, we can identify 50,000, the value
    in the surface text, as a wrong answer, as
    sketched below.
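
A sketch of this check in Python; the axiom table and the relative
tolerance are illustrative assumptions, not the system's actual
axiom representation:

    ONTOLOGY = {("Maryland", "population"): 5_000_000}

    def justified(entity, prop, candidate, tolerance=0.5):
        """Accept a numeric candidate only if it is within a
        relative tolerance of the ontology's expected value."""
        expected = ONTOLOGY.get((entity, prop))
        if expected is None:
            return True  # no axiom to check against
        return abs(candidate - expected) / expected <= tolerance

    print(justified("Maryland", "population", 50_000))     # False
    print(justified("Maryland", "population", 5_300_000))  # True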

21
Factoid Results
22
Factoid Results
23
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

24
List System Overview
25
List Subsystem
  • Multiple answers from the same paragraph
  • Canonicalization resolution
  • Unique answer for variants such as
  • the States, USA, United States, etc.
  • Pattern-based answer extraction (a sketch follows)
  • <A1>, <A2>, ... and <An> <verb>
  • <NP> include <A1>, <A2>, ...
  • list of <NP>
  • top <number> <adj-superlative> <NP>
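
A sketch of extraction for one of these patterns ("<NP> include
<A1>, <A2>, ..."); the regular expression is an illustrative
approximation, not the system's actual pattern:

    import re

    PATTERN = re.compile(r"include[sd]?\s+([^.]+?)(?:\.|$)")

    def extract_list_answers(sentence):
        m = PATTERN.search(sentence)
        if not m:
            return []
        items = re.split(r",\s*|\s+and\s+", m.group(1))
        return [i.strip() for i in items if i.strip()]

    print(extract_list_answers(
        "Famous reggae artists include Bob Marley, Peter Tosh and Jimmy Cliff."))
    # ['Bob Marley', 'Peter Tosh', 'Jimmy Cliff']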

26
List Results
27
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

28
System Overview
29
Definition Subsystem
30
Definition Subsystem
  • Pre-processing
  • Document filter
  • Anaphora resolution
  • Sentence positive set and negative set
  • Sentence Ranking
  • Sentence weighting in the corpus
  • Sentence weighting on the Web
  • Overall weighting

31
Definition Subsystem
  • Answer Generation (Progressive Maximal Marginal
    Relevance)
  • 1) Order all sentences in descending order of
    weight.
  • 2) Add the first sentence to the summary.
  • 3) Examine the following sentences: if
    Weight(stc) - Weight(next_stc) < avg_sim(stc),
    add next_stc to the summary.
  • 4) Go to Step 3) until the length limit of the
    target summary is reached.
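
A sketch of this loop; the slide's comparison operator is garbled
in this transcript, so "<" is assumed, and avg_sim is taken to be
the candidate's average similarity to the sentences already
selected:

    def progressive_mmr(ranked, weight, sim, max_sentences):
        """ranked: sentences in descending weight order;
        weight: sentence -> score; sim(a, b): similarity."""
        summary = [ranked[0]]
        prev = ranked[0]
        for nxt in ranked[1:]:
            if len(summary) >= max_sentences:  # simplified length limit
                break
            avg_sim = sum(sim(nxt, s) for s in summary) / len(summary)
            if weight[prev] - weight[nxt] < avg_sim:
                summary.append(nxt)
            prev = nxt
        return summary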

32
Definition Results
  • We empirically set the length of the summary for
    People and Objects based on question
    classification results.

33
Outline
  • Introduction
  • Factoid Subsystem
  • List Subsystem
  • Definition Subsystem
  • Result
  • Conclusion and Future Work

34
Overall Performance
35
Conclusion and Future Work
  • Conclusion
  • Event-based Question Answering
  • Factoid and list questions explore the power of
    Event-based QA
  • Definition question answering combines IR and
    Summarization
  • Use of an ontology boosts the performance of our
    NE and answer justification modules
  • Future Work
  • Give a formal proof of our QA event hypothesis
  • Work towards an online question answering
    system
  • Interactive QA
  • Analysis and opinion questions
  • VideoQA: question answering on news video