A Novel Pattern Learning Method for Open Domain Question Answering

1
A Novel Pattern Learning Method for Open Domain
Question Answering
  • IJCNLP 2004
  • Yongping Du, Xuanjing Huang, Xin Li, Lide Wu

2
Abstract
  • They develop a novel pattern based method for
    implementing answer extraction in QA
  • For each type of question, the corresponding
    answer patterns can be learned from the Web
    automatically
  • Given a new question, these answer patterns can
    be applied to find the answer
  • They give a performance analysis of this approach
    using the TREC-11 question set

3
Introduction
  • Three main components in a QA system
  • Many other question answering systems have used
    pattern-based methods, with two limitations
  • Their patterns are not learned
  • They can handle only one question term in the
    candidate answer sentence

4
Introduction (Cont.)
  • Each answer pattern consists of three parts
  • <Q_Tag> ConstString <A>
  • <Q_Tag>: the key phrases of the question
  • <A>: the answer; any string holding this position
    will be extracted as the answer
  • ConstString: a sequence of words
  • <Q_Tag> identification -> learning answer
    patterns -> answer extraction -> performance
    analysis

5
Question Analysis
  • A set of symbols is defined to represent
    questions
  • Q_Focus: the head word of the noun phrase that
    is bound to the interrogative
  • Interrogative be verb NP
  • How Adj be verb

6
Question Analysis
  • They select 182 questions from TREC, and these
    questions contain all the Q_Tag symbols
  • Every element of the questions is replaced with
    its corresponding Q_Tag
  • The classification of questions will be built
    based on the Q_Pattern and the answer type
  • Six answer types

7
Example
  • Question: What is the largest city in Germany?
  • Question type: LCN, What Q_BeVerb Q_Focus in
    Q_LCN?
  • Doc 1: , Berlin is the largest city in Germany
    and is develop...
  • Answer pattern: , <A> Q_BeVerb Q_Focus in Q_LCN
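
The mapping from a question to its Q_Pattern in this example can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `to_q_pattern` is hypothetical, and the term-to-tag mapping is assumed to be supplied by a question analyser that the slides do not describe.

```python
import re

def to_q_pattern(question, tagged_terms):
    """Replace each question term with its Q_Tag symbol.

    tagged_terms maps a Q_Tag symbol to the question term it stands for.
    Assumption: this mapping is already known (the slides do not say how
    the terms are identified).
    """
    term_to_tag = {term: tag for tag, term in tagged_terms.items()}
    # Try longer terms first so a short term cannot split a longer one.
    alternation = "|".join(
        re.escape(term) for term in sorted(term_to_tag, key=len, reverse=True)
    )
    regex = re.compile(r"\b(?:" + alternation + r")\b")
    # A single pass, so inserted tags are never re-scanned.
    return regex.sub(lambda m: term_to_tag[m.group(0)], question)

terms = {"Q_BeVerb": "is", "Q_Focus": "the largest city", "Q_LCN": "Germany"}
q_pattern = to_q_pattern("What is the largest city in Germany?", terms)
print(q_pattern)  # What Q_BeVerb Q_Focus in Q_LCN?
```

The result matches the Q_Pattern shown on the slide; matching is case-sensitive and whole-word only, which is a simplification.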

8
Pattern Learning
  • 1. Constructing the query: Q_Tag terms plus the
    answer, for example, the largest city Germany
    Berlin
  • 2. Searching: Submit the query to Google, and
    download the top 100 documents
  • 3. Snippet selection: Extract the snippets
    containing 10 words around the answer
  • 4. Answer pattern extraction: Replace the
    question term in each snippet with the
    corresponding Q_Tag, and the answer term with the
    tag <A>
  • The shortest string containing the Q_Tag and the
    tag <A> is extracted as the answer pattern
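
Step 4 can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the snippet is made up, the helper names `tag_snippet` and `shortest_pattern` are hypothetical, tokens are split on whitespace, and "shortest string" is read as the smallest token window that contains <A> and every Q_Tag at least once. Punctuation handling (such as the leading comma in the slide's example pattern) is left out.

```python
import re

ANSWER_TAG = "<A>"

def tag_snippet(snippet, tagged_terms, answer):
    """Replace question terms with their Q_Tags and the answer with <A>."""
    term_to_tag = {term: tag for tag, term in tagged_terms.items()}
    term_to_tag[answer] = ANSWER_TAG
    alternation = "|".join(
        re.escape(term) for term in sorted(term_to_tag, key=len, reverse=True)
    )
    regex = re.compile(r"\b(?:" + alternation + r")\b")
    return regex.sub(lambda m: term_to_tag[m.group(0)], snippet)

def shortest_pattern(tagged_snippet, required_tags):
    """Return the shortest token window containing every required tag."""
    tokens = tagged_snippet.split()
    best = None
    for start in range(len(tokens)):
        seen = set()
        for end in range(start, len(tokens)):
            if tokens[end] in required_tags:
                seen.add(tokens[end])
            if seen == required_tags:
                if best is None or end - start < best[1] - best[0]:
                    best = (start, end)
                break  # a longer window from this start cannot be shorter
    if best is None:
        return None  # some tag is missing from the snippet
    return " ".join(tokens[best[0]:best[1] + 1])

terms = {"Q_BeVerb": "is", "Q_Focus": "the largest city", "Q_LCN": "Germany"}
# Hypothetical snippet, invented for illustration only.
snippet = "As many know Berlin is the largest city in Germany and is well known"
tagged = tag_snippet(snippet, terms, answer="Berlin")
pattern = shortest_pattern(tagged, set(terms) | {ANSWER_TAG})
print(pattern)  # <A> Q_BeVerb Q_Focus in Q_LCN
```

The brute-force window search is quadratic in the snippet length, which is acceptable here because snippets are limited to roughly 10 words around the answer.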

9
Pattern Learning (Cont.)
  • 5. Computing the weight of each answer pattern
  • They learn answer patterns for each question type

10
Pattern Evaluation
  • The answer pattern <A> Q_BeVerb Q_Focus may
    extract the candidate answer Portland from the
    snippet "Portland is the largest city in Oregon",
    which is wrong when the question asks about
    Germany
  • They provide an approach to evaluate the answer
    patterns
  • 1. A query for each answer pattern of the question
    is formed and submitted to Google, and the top 100
    snippets are downloaded for evaluation
  • The query consists of three parts
  • Head Tail Q_Focus Q_NameEntity
  • Answer pattern: <A> Q_BeVerb Q_Focus
  • Query: is the largest city Germany

11
Pattern Evaluation (Cont.)
  • 2. The confidence of each answer pattern is
    calculated by the formula
  • 3. Finally, the score of each answer pattern is
    computed by the formula
  • The major advantage over other pattern based QA
    systems is that more than one question term can
    be included in the answer pattern

12
Sample Answer Patterns

13
Answer Extraction
  • Use Google as the search engine and get the top 100
    snippets for answer extraction
  • 1. Identify the Q_Tag of the new question and
    generate its Q_Pattern
  • What is the most populous city in the United
    States?
  • What Q_BeVerb Q_Focus in Q_LCN?
  • 2. Determine the question type of the question.
    The corresponding answer patterns of this
    question type are also selected
  • 3. Replace the Q_Tag symbols of each answer
    pattern with the corresponding question term of
    the question
  • , <A> Q_BeVerb Q_Focus in Q_LCN
  • , <A> is the most populous city in the United
    States
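
Step 3 and the matching that follows it can be sketched with a regular expression. This is a rough illustration rather than the authors' method: the names `instantiate` and `extract_candidates` are hypothetical, the snippet is made up, and <A> is assumed to capture between one and three words, since the slides do not say how the extent of the answer is bounded.

```python
import re

ANSWER_TAG = "<A>"

def instantiate(answer_pattern, tagged_terms, max_answer_words=3):
    """Build a regex from an answer pattern and the new question's terms.

    Q_Tag symbols are replaced by the question terms; <A> becomes a
    capture group of 1..max_answer_words words (an assumption).
    """
    pieces = []
    for token in answer_pattern.split():
        if token == ANSWER_TAG:
            pieces.append(r"(\w+(?:\s+\w+){0,%d})" % (max_answer_words - 1))
        else:
            # A Q_Tag becomes its question term; other tokens stay literal.
            text = tagged_terms.get(token, token)
            pieces.append(r"\s+".join(re.escape(w) for w in text.split()))
    return re.compile(r"\s+".join(pieces))

def extract_candidates(regex, snippets):
    """Collect whatever fills the <A> position in each matching snippet."""
    return [m.group(1) for s in snippets for m in regex.finditer(s)]

terms = {
    "Q_BeVerb": "is",
    "Q_Focus": "the most populous city",
    "Q_LCN": "the United States",
}
regex = instantiate(", <A> Q_BeVerb Q_Focus in Q_LCN", terms)
# Hypothetical snippet, invented for illustration only.
snippets = [
    "On the East Coast, New York is the most populous city in the United States."
]
candidates = extract_candidates(regex, snippets)
print(candidates)  # ['New York']
```

Here the leading comma anchors where the answer starts. A pattern that begins directly with <A> would let the capture group absorb up to three preceding words, so a real system would need a better boundary for the answer.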

14
Answer Extraction (Cont.)
  • 4. Select the words matching the tag <A> as the
    candidate answer
  • 5. Discard the candidate answers that do not
    satisfy the answer type of the question, using
    a named entity tagger
  • 6. Sort the remaining candidate answers by their
    answer pattern scores and their frequencies.
    The candidate with the highest score is selected
    as the final answer
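
Steps 5 and 6 can be sketched as a filter followed by a ranking. The slides do not give the rule for combining pattern scores with frequencies, so this sketch assumes one simple option: sum the pattern score over every extraction of the same candidate. The function `rank_candidates`, the toy tagger, the scores, and the "DAT" type label are all hypothetical.

```python
from collections import defaultdict

def rank_candidates(matches, answer_type, entity_type_of):
    """Filter candidates by answer type, then rank them.

    matches: list of (candidate, pattern_score) pairs, one per extraction.
    entity_type_of: callable standing in for a named entity tagger.
    Assumption: a candidate's score is the sum of the scores of the
    patterns that extracted it, so frequent candidates rank higher.
    """
    totals = defaultdict(float)
    for candidate, pattern_score in matches:
        if entity_type_of(candidate) != answer_type:
            continue  # step 5: wrong answer type, discard
        totals[candidate] += pattern_score
    # Step 6: highest combined score first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Toy stand-in for a named entity tagger (illustrative labels only).
toy_types = {"New York": "LCN", "Los Angeles": "LCN", "Tuesday": "DAT"}

matches = [
    ("New York", 0.5),
    ("Los Angeles", 0.5),
    ("New York", 0.25),
    ("Tuesday", 1.0),
]
ranked = rank_candidates(matches, "LCN", toy_types.get)
print(ranked)  # [('New York', 0.75), ('Los Angeles', 0.5)]
```

The first entry of the ranked list would be returned as the final answer; any other monotone combination of score and frequency could be swapped in for the sum.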

15
Performance Analysis
  • TREC-9 and TREC-10 questions are used as training
    examples to learn answer patterns
  • 500 questions of TREC-11 are used for the
    experiments

16
Conclusion
  • They took part in TREC-12 this year, and
    their result was above the median score of all
    runs submitted
  • Some of the answer patterns they learned are too
    specific and are useless for answering new
    questions
  • What is a shaman?
  • Q_Focus was the priest, the <A> and