Acquiring Syntactic and Semantic Transformations in Question Answering - PowerPoint PPT Presentation




1
Acquiring Syntactic and Semantic Transformations
in Question Answering
MSR Summer School 2010
Michael Kaisser
Was: PhD student, School of Informatics, University of Edinburgh
Now: Program Manager, Search Technology Center Munich, Bing
2
Overview
  • What is Question Answering (QA)?
  • Why is QA difficult?
  • A Corpus of Question-Answer Sentence Pairs
    (QASPs)
  • Acquiring reformulation rules from the QASP Corpus

3
Part 1: What is Question Answering and why is it
difficult?
4
What is factoid QA?
  • Question answering (QA) is the task of
    automatically answering a question posed in
    natural language.
  • → something like a search engine.
  • Usually, a QA system searches for the answer in a
    collection of natural language texts.
  • This might be a newspaper corpus or the WWW (or
    something else).

5
What is factoid QA?
6
Why is factoid QA difficult?
  • Questions are fairly simple.
  • But what about the sentences containing the
    answers?
  • Average length in words (TREC 02-06 data):
  • Questions: 8.14 (st. dev. 2.81)
  • Answer Sentences: 28.99 (st. dev. 13.13)
  • (in a corpus of newspaper articles, e.g. from
    the NYT)

7
Why is factoid QA difficult?
But what about the sentences containing the
answers?

Who defeated the Spanish Armada?
"The old part of Plymouth city clusters around the
Hoe, the famous patch of turf on which Drake is
said to have finished a calm game of bowls before
heading off to defeat the invading Spanish Armada
in 1588."

What day did Neil Armstrong land on the moon?
"Charlie Duke, Jim Lovell, Apollo 11's back-up
commander, and Fred Haise, the back-up lunar
module pilot, during the tense moments before the
lunar module carrying Neil Armstrong and Edwin
"Buzz" Aldrin Jr. landed on the moon on July 20,
1969."
8
Why is factoid QA difficult?
But what about the sentences containing the
answers? Average length in words (TREC 02-06
data) Questions 8.14 (st. dev. 2.81) Answer
Sentences 28.99 (st. dev. 13.13) The
problematic part of factoid QA is not to the
questions, but the answer sentences.
9
Why is factoid QA difficult?
The problematic part of factoid QA are not the
questions, but the answer sentences. The problem
here are the many syntactic and semantic
possibilities in which an answer to a question
can be formulated. (Paraphrasing)
10
Why is factoid QA difficult?
The problematic part of factoid QA are not the
questions, but the answer sentences. The problem
here are the many syntactic and semantic
possibilities in which an answer to a question
can be formulated. (Paraphrasing) Who can we
deal with this? How can we detect all these
possible answer sentence formulations?
11
Part 2: A Corpus of Question-Answer Sentence Pairs
(QASPs)
12
Usefulness of TREC data
  • TREC publishes lots of valuable data
  • question test sets
  • correct answers
  • lists of documents that contain the identified
    instances of the correct answers

13
Usefulness of TREC data
  • TREC publishes lots of valuable data
  • question test sets
  • correct answers
  • lists of documents that contain the identified
    instances of the correct answers
  • yet, no answer sentences are identified

14
Usefulness of TREC data
  • TREC publishes lots of valuable data
  • question test sets
  • correct answers
  • lists of documents that contain the identified
    instances of the correct answers
  • yet, no answer sentences are identified
  • But maybe we can get these ourselves?

15
Excursus: Mechanical Turk
  • Amazon Web Service
  • Artificial Artificial Intelligence
  • Requesters upload Human Intelligence Tasks (HITs)
  • Users (turkers) complete HITs for small monetary
    rewards

16
QASP Corpus Creation
[Screenshot of a HIT: TREC question, instructions,
AQUAINT document (shortened for the screenshot),
input field for the answer sentence, input field
for the answer]
17
QASP Corpus Numbers
  • Data collected: 8,830 QASPs
  • Price for the complete experiment: approx. USD 650
18
QASP Corpus - Examples
1396, XIE19961004.0048, "What is the name of the
volcano that destroyed the ancient city of
Pompeii?", "However, both sides made some
gestures of appeasement before Chirac set off for
the Italian resort city lying beside the Vesuve
volcano which destroyed the Roman city of
Pompeii.", "Vesuve", 1

1396, NYT19980607.0105, "What is the name of the
volcano that destroyed the ancient city of
Pompeii?", "The ruins of Pompeii, the ancient
city wiped out in A.D. 79 by the eruption at
Vesuvius, are Italy's most popular tourist
attraction, visited by two million people a
year.", "Vesuvius", 1

1396, NYT19981201.0229, "What is the name of the
volcano that destroyed the ancient city of
Pompeii?", "Visiting tourists enter the excavated
ruins of the city - buried by the eruption of
Mount Vesuvius - via a tunnel through the
defensive walls that surround it, just as
visiting traders did 2,000 years ago.",
"Mount Vesuvius", C
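Each record reads as a CSV row: question id, document id, question, answer sentence, answer string, and a judgement field. A minimal Python sketch of parsing one such record; the comma-separated layout is inferred from the examples above, and the corpus's actual file format may differ.

```python
import csv
import io

# One record from the QASP corpus, as shown above. The field layout
# (question id, document id, question, answer sentence, answer,
# judgement) is an assumption based on these examples.
line = ('1396, NYT19980607.0105, "What is the name of the volcano that '
        'destroyed the ancient city of Pompeii?", "The ruins of Pompeii, '
        'the ancient city wiped out in A.D. 79 by the eruption at '
        'Vesuvius, are Italy\'s most popular tourist attraction, visited '
        'by two million people a year.", "Vesuvius", 1')

# skipinitialspace lets the quoted fields start after ", ".
reader = csv.reader(io.StringIO(line), skipinitialspace=True)
qid, doc_id, question, answer_sentence, answer, judgement = next(reader)

print(qid)     # 1396
print(doc_id)  # NYT19980607.0105
print(answer)  # Vesuvius
```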
19
Part 3: Acquiring Syntactic and Semantic
Reformulations from the QASP Corpus
20
Employing the QASP Corpus
  • We will use the QASP corpus to learn syntactic
    structures of answer sentences for classes of
    questions.
  • The algorithm has three steps:
  • Rule Creation
  • Rule Evaluation
  • Rule Execution

21
Employing the QASP Corpus
  • Example QASP:
  • Q: Who is Tom Cruise married to?
  • AS: Tom Cruise and Nicole Kidman are married.
  • A: Nicole Kidman (Side note: data from 2000.)
  • Question matches pattern "Who is NP VERB to?"
  • Non-stop-word constituents are:
  • NP (Tom Cruise), V (married)
  • These and the answer can be found in the answer
    sentence.
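The pattern match above can be sketched with a regular expression. This is purely illustrative: the actual system identifies the NP and VERB constituents syntactically, not with a regex.

```python
import re

# Hypothetical stand-in for matching a question against the pattern
# "Who is NP VERB to?". Named groups capture the two non-stop-word
# constituents.
pattern = re.compile(r"^Who is (?P<np>.+?) (?P<verb>\w+) to\?$")

m = pattern.match("Who is Tom Cruise married to?")
if m:
    print(m.group("np"))    # Tom Cruise
    print(m.group("verb"))  # married
```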

22
Employing the QASP Corpus
  • Nicole Kidman and Tom Cruise are married.
  • Sentence is parsed with the Stanford Parser.
  • Paths from each question constituent to the
    answer are extracted and stored in a rule
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑conj
  • Path 4: ↓nsubj

23
Employing the QASP Corpus
  • Nicole Kidman and Tom Cruise are married.
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑conj
  • Path 4: ↓nsubj
  • 1 Nicole  (Nicole, NNP, 2) nn
  • 2 Kidman  (Kidman, NNP, 7) nsubj
  • 3 and  (and, CC, 2) cc
  • 4 Tom  (Tom, NNP, 5) nn
  • 5 Cruise  (Cruise, NNP, 2) conj
  • 6 are  (be, VBP, 7) cop
  • 7 married  (married, JJ, null) ROOT

[Dependency tree: married (ROOT) has nsubj Kidman
and cop are; Kidman has nn Nicole, cc and, conj
Cruise; Cruise has nn Tom]
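A minimal sketch of reading such dependency paths off the parse above. The dictionary encoding and the helper function are illustrative assumptions, not the system's actual data structures.

```python
# The parse from the slide, encoded as
# {token index: (word, head index, relation)}; head 0 is the
# virtual root above the ROOT token.
parse = {
    1: ("Nicole", 2, "nn"),
    2: ("Kidman", 7, "nsubj"),
    3: ("and", 2, "cc"),
    4: ("Tom", 5, "nn"),
    5: ("Cruise", 2, "conj"),
    6: ("are", 7, "cop"),
    7: ("married", 0, "ROOT"),
}

def path_to_root(parse, index):
    """Collect the dependency relations from a token up to the root."""
    rels = []
    while parse[index][2] != "ROOT":
        rels.append(parse[index][2])
        index = parse[index][1]
    return rels

# From "Cruise" (the question NP) the first step up is the conj edge
# to "Kidman", the answer head.
print(path_to_root(parse, 5))  # ['conj', 'nsubj']
```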
24
Employing the QASP Corpus
  • Tom Cruise married Nicole Kidman in 1990.
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑nsubj ↓dobj
  • Path 4: ↓dobj
  • Tom Cruise is married to Nicole Kidman.
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑nsubjpass ↓prep ↓pobj
  • Path 4: ↓prep ↓pobj

25
Employing the QASP Corpus
  • Tom Cruise is married to Nicole Kidman.
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑nsubjpass ↓prep ↓pobj
  • Path 4: ↓prep ↓pobj
  • The process is repeated for all QASPs (a test set
    might be set aside).
  • All rules are stored in a file.

26
Employing the QASP Corpus
  • Rule evaluation:
  • For each question in the corpus:
  • Search for candidate sentences in the AQUAINT
    corpus using Lucene.
  • Test if the paths are present and point to the same
    node. Check if the answer is correct. Store the
    results in a file.
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑nsubj ↓dobj
  • Path 4: ↓dobj
  • correct: 5
  • incorrect: 3

27
Employing the QASP Corpus
  • Pattern: Who_1 is_2 NP_3 VERB_4 to_5
  • Path 3: ↑nsubj ↓dobj
  • Path 4: ↓dobj
  • correct: 5
  • incorrect: 3

Pattern precision: p = correct / (correct + incorrect)
(see e.g. Ravichandran and Hovy, 2002)
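The precision formula can be written directly in code:

```python
def pattern_precision(correct, incorrect):
    """p = correct / (correct + incorrect); None if the rule never fired."""
    total = correct + incorrect
    return correct / total if total else None

# The example rule above matched 5 correct and 3 incorrect answers:
print(pattern_precision(5, 3))  # 0.625
```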
28
Employing the QASP Corpus
  • Rule execution:
  • Very similar to rule evaluation.
  • For each question in the corpus:
  • Search for candidate sentences in the AQUAINT
    corpus using Lucene.
  • Test if the paths are present and point to the same
    node. If so, extract the answer.
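The steps above can be sketched as a heavily simplified, self-contained program. Retrieval (Lucene over AQUAINT) and parsing (the Stanford Parser) are replaced by a single hand-built candidate parse, and the rule encoding is a hypothetical stand-in for the stored rule files.

```python
# One learned rule for the pattern "Who is NP VERB to?"
# (hypothetical encoding for illustration).
rule = {"pattern": "Who is NP VERB to?",
        "paths": {"NP": ["nsubj", "dobj"], "VERB": ["dobj"]}}

# A pre-parsed candidate sentence, token -> (head, relation), for
# "Tom Cruise married Nicole Kidman in 1990." (simplified: one token
# per name, only the edges needed here).
candidate = {
    "Cruise": ("married", "nsubj"),
    "married": (None, "ROOT"),
    "Kidman": ("married", "dobj"),
}

def follow_down(parse, head, relation):
    """Find the token attached to `head` by `relation`, if any."""
    for token, (h, rel) in parse.items():
        if h == head and rel == relation:
            return token
    return None

# Execute the rule: starting at the VERB constituent ("married"),
# follow its stored path one edge down to extract the answer.
answer = follow_down(candidate, "married", rule["paths"]["VERB"][0])
print(answer)  # Kidman
```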

29
Finally...
30
Employing the QASP Corpus
Results (evaluation set 3)
Results rise to 0.278 / 0.411 with semantic
alignment (not covered in this talk).
31
Employing the QASP Corpus
  • Comparison with a baseline:
  • The baseline gets syntactic answer structures from
    the questions, not from the answer sentences, but
    is otherwise similar.
  • There is a long tradition in QA of doing this:
  • Katz and Lin, 2003; Punyakanok et al., 2004;
    Bouma et al., 2005; Cui et al., 2005; etc.
  • The baseline is very simple and uses none of the
    commonly used improvements (e.g. fuzzy matching),
    but neither does the method proposed here.
  • Baseline performance (evaluation set 3):
  • 0.068 (accuracy overall) compared to
    0.278 (308)
  • 0.100 (accuracy if rule exists) compared to
    0.411 (311)

32
Employing the QASP Corpus
33
Part 4: STC Europe
34
STC Europe
  • Program Manager at STC Europe in Munich
  • STC = Search Technology Center
  • We have sites in London, Munich, Paris and soon
    in Poland
  • In Munich we work on Relevance (Quality of our 10
    blue links) for European markets
  • In my group, we work on query alterations

35
  • Thank You!

PS: The QASP corpus is downloadable from my homepage
(http://homepages.inf.ed.ac.uk/s0570760/).
36
References
  • Fillmore and Lowe, 1998
  • Baker, C. F., Fillmore, C. J., and Lowe, J. B.
    (1998). The Berkeley FrameNet Project. In
    Proceedings of COLING-ACL 1998.
  • Gildea and Jurafsky, 2002
  • Gildea, D. and Jurafsky, D. (2002). Automatic
    Labeling of Semantic Roles. Computational
    Linguistics, 28(3):245-288.
  • Lin and Katz, 2005
  • Lin, J. and Katz, B. (2005). Building a Reusable
    Test Collection for Question Answering. Journal
    of the American Society for Information Science
    and Technology.
  • Xue and Palmer, 2004
  • Xue, N. and Palmer, M. (2004). Calibrating
    Features for Semantic Role Labeling. In
    Proceedings of EMNLP 2004, Barcelona, Spain.