1
Question Answering (Open-Domain)
(modified lecture from E. Riloff's webpage)
  • Grand Challenge Problem for NLP: a program that can find the answer to arbitrary questions from text resources.
  • WWW, encyclopedias, books, manuals, medical literature, scientific papers, etc.
  • Another application: database queries.
  • Converting natural language questions into database queries was one of the earliest NLP applications!
  • A scientific reason to do Q/A: the ability to answer questions about a story is the hallmark of understanding.

2
Multiple Document Question Answering
  • A multiple document Q/A task involves questions
    posed against a collection of documents.
  • The answer may appear in the collection multiple times, or may not appear at all! For this task, it doesn't matter where the answer is found.
  • Applications include WWW search engines, and
    searching text repositories such as news
    archives, medical literature, or scientific
    articles.

3
TREC-9 Q/A Task
Number of Documents: 979,000
Megabytes of Text: 3,033
Document Sources: AP, WSJ, Financial Times, San Jose Mercury News, LA Times, FBIS
Number of Questions: 682
Question Sources: Encarta log, Excite log
  • Sample questions
  • How much folic acid should an expectant mother
    get daily?
  • Who invented the paper clip?
  • What university was Woodrow Wilson president of?
  • Where is Rider College located?
  • Name a film in which Jude Law acted.
  • Where do lobsters like to live?

4
TREC and AQUAINT
  • TREC-10: new questions from MSN Search logs and AskJeeves, some of which have no answers, or require fusion across documents.
  • List questions: Name 32 countries Pope John Paul II has visited.
  • Dialogue processing: Which museum in Florence was damaged by a major bomb explosion in 1993? On what day did this happen?
  • AQUAINT: Advanced QUestion Answering for INTelligence (e.g., beyond factoids, the Multiple Perspective Q-A work at Pitt).

5
Single Document Question Answering
  • A single document Q/A task involves questions
    associated with one particular document.
  • In most cases, the assumption is that the answer
    appears somewhere in the document and probably
    once.
  • Applications involve searching an individual
    resource, such as a book, encyclopedia, or
    manual.
  • Reading comprehension tests are also a form of
    single document question answering.

6
Reading Comprehension Tests
  • Mars Polar Lander - Where Are You?
  • (January 18, 2000) After more than a month of searching for a single sign from NASA's Mars Polar Lander, mission controllers have lost hope of finding it. The Mars Polar Lander was on a mission to Mars to study its atmosphere and search for water, something that could help scientists determine whether life ever existed on Mars. Polar Lander was to have touched down on December 3 for a 90-day mission. It was to land near Mars' south pole. The lander was last heard from minutes before beginning its descent. The last effort to communicate with the three-legged lander ended with frustration at 8 a.m. Monday. "We didn't see anything," said Richard Cook, the spacecraft's project manager at NASA's Jet Propulsion Laboratory. The failed mission to the Red Planet cost the American government more than 200 million dollars. Now, space agency scientists and engineers will try to find out what could have gone wrong. They do not want to make the same mistakes in the next mission.
  • When did the mission controllers lose hope of communicating with the Lander?
  • Who is the Polar Lander's project manager?
  • Where on Mars was the spacecraft supposed to touch down?
  • What was the mission of the Mars Polar Lander?


7
Reading Comprehension Tests
  • Mars Polar Lander - Where Are You?
  • (Same passage as on the previous slide, repeated with the answers.)
  • When did the mission controllers lose hope of communicating with the Lander? (Answer: 8 a.m., Monday, Jan. 17)
  • Who is the Polar Lander's project manager? (Answer: Richard Cook)
  • Where on Mars was the spacecraft supposed to touch down? (Answer: near Mars' south pole)
  • What was the mission of the Mars Polar Lander? (Answer: to study Mars' atmosphere and search for water)


8
Why use reading comprehension tests?
  • The tests were designed to ask questions that
    would demonstrate whether a child understands a
    story. So they are an objective way to evaluate
    the reading ability of computer programs.
  • Questions and answer keys already exist!
  • Tests are available for many grade levels, so we
    can challenge our Q/A computer programs with
    progressively harder questions.
  • The grade level of an exam can give us some idea of the reading ability of our computer programs (e.g., it reads at a 2nd-grade level).
  • Grade school exams typically ask factual
    questions that mimic real-world applications (as
    opposed to high school exams that often ask
    general inferential questions, e.g. what is the
    topic of the story).

9
Judging Answers
There are several possible ways to present an answer:
Short Answer: the exact answer to the question.
Answer Sentence: the sentence containing the answer.
Answer Passage: a passage containing the answer (e.g., a paragraph).

Short answers are difficult to score automatically because many variations are often acceptable.

Example:
Text: The 2002 Winter Olympics will be held in beautiful Salt Lake City, Utah.
Q: Where will the 2002 Winter Olympics be held?
A1: beautiful Salt Lake City, Utah
A2: Salt Lake City, Utah
A3: Salt Lake City
A4: Salt Lake
A5: Utah
10
Reciprocal Ranking Scheme
In a real Q/A application, it doesn't make much sense to produce several possible answers. But for the purposes of evaluating computer models, several answer candidates are often ranked by confidence.

Reciprocal Ranking Scheme: the score for a question is 1/R, where R is the rank of the first correct answer in the list.

Q: What is the capital of Utah?
A1: Ogden
A2: Salt Lake City
A3: Provo
A4: St. George
A5: Salt Lake

The score for question Q would be 1/2, since the first correct answer (Salt Lake City) appears at rank 2.
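A minimal Python sketch of this scoring scheme (the candidate list is the slide's example; the answer key is assumed):

    def reciprocal_rank(candidates, correct_answers):
        # Score = 1/R, where R is the 1-based rank of the first
        # correct candidate; 0 if no candidate is correct.
        for rank, candidate in enumerate(candidates, start=1):
            if candidate in correct_answers:
                return 1.0 / rank
        return 0.0

    def mean_reciprocal_rank(ranked_lists, answer_keys):
        # MRR: the average of the per-question reciprocal ranks.
        scores = [reciprocal_rank(c, k) for c, k in zip(ranked_lists, answer_keys)]
        return sum(scores) / len(scores)

    candidates = ["Ogden", "Salt Lake City", "Provo", "St. George", "Salt Lake"]
    print(reciprocal_rank(candidates, {"Salt Lake City"}))  # 0.5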
11
Architecture of Typical Q/A Systems
Question Typing: input = question; output = entity type(s)
Document/Passage Retrieval: input = text(s); output = relevant texts
Named Entity Tagging: input = relevant texts; output = tagged text
Answer Identification: input = question, entity type(s), tagged text; output = answer
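A sketch of how these four modules might be chained; every helper below is a placeholder stub (not from the lecture), standing in for the real modules described on the next slides:

    # Placeholder stubs, one per module of the pipeline above.
    def question_type(q):
        return ["LOCATION"] if q.lower().startswith("where") else ["DEFAULT_NP"]

    def retrieve_passages(q, texts):
        return texts[:2]  # pretend IR step: keep the "most relevant" texts

    def tag_entities(text):
        return (text, [])  # (text, list of (type, string) entities)

    def identify_answer(q, qtypes, tagged_texts):
        return tagged_texts[0][0] if tagged_texts else None

    def answer_question(question, collection):
        qtypes = question_type(question)                    # Question Typing
        passages = retrieve_passages(question, collection)  # Document/Passage Retrieval
        tagged = [tag_entities(p) for p in passages]        # Named Entity Tagging
        return identify_answer(question, qtypes, tagged)    # Answer Identification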
12
Question Typing
Many common varieties of questions expect a specific type of answer. For example:
WHO: person, organization, or country
WHERE: location (specific or general)
WHEN: date or time period
HOW MUCH: an amount
HOW MANY: a number
WHICH CITY: a city

Most Q/A systems use a question classifier to assign a type to each question. The question type constrains the set of possible answers. The classification rules are often developed by hand and are quite simple.
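A minimal sketch of such a hand-built classifier; the regex patterns and type names are illustrative, not from any particular system:

    import re

    # Hand-written rules: a question-word pattern maps to expected answer types.
    RULES = [
        (r"^who\b",        ["PERSON", "ORGANIZATION", "COUNTRY"]),
        (r"^where\b",      ["LOCATION"]),
        (r"^when\b",       ["DATE", "TIME"]),
        (r"^how much\b",   ["AMOUNT"]),
        (r"^how many\b",   ["NUMBER"]),
        (r"^which city\b", ["CITY"]),
    ]

    def classify_question(question):
        q = question.strip().lower()
        for pattern, types in RULES:
            if re.match(pattern, q):
                return types
        return ["DEFAULT_NP"]  # no rule fired: fall back to the default type

    print(classify_question("Where is Rider College located?"))  # ['LOCATION']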
13
A Question Type Hierarchy (excerpt)
Default: NP
Thing: name, Title
Temporal: Time, Date
Definition
Agent: Organization, Person, Country
Location: Country
14
Document/Passage Retrieval
  • For some applications, the text collection that
    must be searched is very large. For example, the
    TREC Q/A collection is about 3 GB!
  • Applying NLP techniques to large text collections is too expensive to do in real time. So, information retrieval (IR) engines identify the most relevant texts, using the question words as key words.
  • Document Retrieval Systems return the N documents
    that are most relevant to the question. Passage
    retrieval systems return the N passages that are
    most relevant to the question.
  • Only the most relevant documents/passages are given to the remaining modules of the Q/A system. If the IR engine doesn't retrieve text(s) containing the answer, the Q/A system is out of luck!
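A minimal sketch of passage retrieval by keyword overlap, using the question words as key words as described above; a real system would use a full IR engine with weighting and indexing:

    def retrieve_passages(question, passages, n=5):
        # Score each passage by how many question words it shares,
        # then return the n highest-scoring passages.
        keywords = set(question.lower().split())
        scored = [(len(keywords & set(p.lower().split())), p) for p in passages]
        scored.sort(key=lambda pair: pair[0], reverse=True)
        return [p for score, p in scored[:n] if score > 0]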

15
Named Entity Tagging
Named Entity (NE) Taggers recognize certain types of named objects and other easily identifiable semantic classes. Common NE classes are:
People: Mr. Fripper; John Fripper; President Fripper
Locations: Salt Lake City; Massachusetts; France
Dates/Times: November; Monday; 5:10 pm
Companies: KVW Co.; KVW Inc.; KVW Corporation
Measures: 500 dollars; 40 miles; 32 lbs
16
Sample Text
Consider this sentence: President George Bush announced a new bill that would send 1.2 million dollars to Miami, Florida for a new hurricane tracking system.

After applying a Named Entity Tagger, the text might look like this:

<PERSON>President George Bush</PERSON> announced a new bill that would send <MEASURE>1.2 million dollars</MEASURE> to <LOCATION>Miami, Florida</LOCATION> for a new hurricane tracking system.
17
Rules for Name Entity Tagging
Most Named Entity Taggers use simple rules that are developed by hand. Most rules use the following types of clues:
Keywords: Ex. Mr., Corp., city
Common Lists: Ex. cities, countries, months of the year, common first names, common last names
Special Symbols: Ex. dollar signs, percent signs
Structured Phrases: Ex. dates often appear as MONTH DAY, YEAR
Syntactic Patterns (more rarely): Ex. LOCATION_NP, LOCATION_NP is usually a single location (e.g., Boston, Massachusetts).
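A toy tagger built from the first few clue types above; the word lists and patterns are illustrative stand-ins for the much larger hand-built resources a real tagger would use:

    import re

    MONTH = (r"(?:January|February|March|April|May|June|July"
             r"|August|September|October|November|December)")

    # One rule per NE class: keyword clues, structured date phrases,
    # and special symbols/units.
    PATTERNS = [
        ("PERSON",  re.compile(r"\b(?:Mr\.|Mrs\.|President)\s+[A-Z][a-z]+")),
        ("COMPANY", re.compile(r"\b[A-Z][A-Za-z]+\s+(?:Co\.|Inc\.|Corp\.)")),
        ("DATE",    re.compile(MONTH + r"\s+\d{1,2},\s+\d{4}")),
        ("MEASURE", re.compile(r"\$?\d[\d,.]*\s*(?:dollars|miles|lbs)")),
    ]

    def tag_entities(text):
        # Return (type, string) pairs for every rule match in the text.
        return [(label, m.group())
                for label, pattern in PATTERNS
                for m in pattern.finditer(text)]

    print(tag_entities("President Fripper paid 500 dollars to KVW Co. on November 3, 1999."))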
18
Answer Identification
  • At this point, we've assigned a type to the question and we've tagged the text with Named Entities. So we can now narrow down the candidate pool to entities of the right type.
  • Problem: there are often many objects of the right type, even in a single text.
  • The Answer Identification module is responsible for finding the best answer to the question.
  • For questions that have Named Entity types, this module must figure out which item of the right type is correct.
  • For questions that do not have Named Entity types, this module is essentially starting from scratch.

19
Word Overlap
The most common method of Answer Identification is to measure the amount of word overlap between the question and an answer candidate.
Basic Word Overlap: each answer candidate is scored by counting how many question words are present in or near the candidate.
Stop Words: sometimes closed-class words (often called stop words in IR) are not included in the word overlap measure.
Stemming: sometimes morphological analysis is used to compare only the root forms of words (e.g., "walk" and "walked" would match).
Weights: some words may be weighted more heavily than others (e.g., verbs might be given more weight than nouns).
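A minimal sketch of word-overlap scoring with the stop-word and stemming refinements; the stop list and the crude suffix stemmer are illustrative stand-ins:

    STOP_WORDS = {"the", "a", "an", "of", "in", "to", "is", "was", "will", "be"}

    def stem(word):
        # Crude suffix stripping; a real system would use morphological analysis.
        for suffix in ("ing", "ed", "s"):
            if word.endswith(suffix) and len(word) > len(suffix) + 2:
                return word[: -len(suffix)]
        return word

    def content_words(text):
        words = [w.strip(".,?!") for w in text.lower().split()]
        return {stem(w) for w in words if w and w not in STOP_WORDS}

    def word_overlap(question, candidate):
        # Score a candidate by the number of stemmed content words
        # it shares with the question.
        return len(content_words(question) & content_words(candidate))

    q = "Where will the 2002 Winter Olympics be held?"
    s = "The 2002 Winter Olympics will be held in beautiful Salt Lake City, Utah."
    print(word_overlap(q, s))  # 4: {2002, winter, olympic, held}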
20
The State of the Art in Q/A
Most Remedia reading comprehension results: answer sentence identification around 40%.
Best TREC-9 results (Mean Reciprocal Rank):
  50-byte answers: MRR = 0.58; no correct answer was found for 34% of questions.
  250-byte answers: MRR = 0.76; no correct answer was found for 14% of questions.
The best TREC Q/A system is a more sophisticated Q/A model that uses syntactic dependency structures, semantic hierarchies, etc. But more intelligent Q/A models are still highly experimental.
21
Answer Confusability Experiments
  • Manually annotated data for 165 TREC-9 questions and 186 CBC questions, assuming perfect question typing, perfect answer sentence identification, and perfect semantic tagging.
  • Idea: an oracle gives you the correct question type, a sentence containing the answer, and correctly tags all entities in the sentence that match the question type.
  • Ex: the oracle tells you that the question expects a person, gives you a sentence containing the correct person, and tags all person entities in that sentence. The one thing the oracle does not tell you is which person is the correct one.
  • Measured the answer confusability: the score that a Q/A system would get if it randomly selected an item of the designated type from the answer sentence.

22
Example
Q1: When was Fred Smith born?
S1: Fred Smith lived from 1823 to 1897.

Q2: What city is Massachusetts General Hospital located in?
S2: It was conducted by a cooperative group of oncologists from Hoag, Massachusetts General Hospital in Boston, Dartmouth College in New Hampshire, UC San Diego Medical Center, McGill University in Montreal and the University of Missouri in Columbia.
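Under the oracle setup of the previous slide, randomly picking among the same-type candidates in the answer sentence yields an expected score of 1/N; a tiny sketch:

    def confusability(num_candidates, num_correct=1):
        # Expected score when choosing uniformly at random among the
        # N same-type candidates in the answer sentence.
        return num_correct / num_candidates

    # S1 contains two date expressions (1823 and 1897), but only 1823
    # answers Q1, so a random pick scores 1/2 on average.
    print(confusability(2))  # 0.5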