1
Interpreting Loosely Encoded Questions
  • James Fan and Bruce Porter
  • University of Texas at Austin

Full support for this research was provided by
Vulcan Inc. as part of Project Halo
2
Problem
(Diagram: end user -> English question -> question encoding -> KB)
3
Task
  • Context: end users pose questions to knowledge-based question-answering systems without intimate knowledge of the structure of the knowledge base.
  • Task: translate end users' encodings so that they align with the KB.

4
Input and output
  • Naïve encoding: a question encoded without regard for the structure of the knowledge base. Naïve encodings are often literal translations of the original English expressions, i.e. the form of questions we should expect from end users.
  • Correct encoding: a question encoding that aligns with the structure of the knowledge base.

5
Loose speak
  • Loose speak: the part of an encoding that fails to align with the knowledge base.
  • Not meant to be pejorative: "loose" refers to the imprecise way that people form English expressions.

6
Project Halo phase I
  • Three systems for Advanced Placement chemistry (Barker et al. 2004; Angele et al. 2003).
  • A chemistry KB is built.
  • The best KB answers enough questions to score a 3.
  • Knowledge engineers encode 160 English test questions (10 man-weeks).

7
Project Halo phase II
  • Develop a knowledge-acquisition tool that will
    enable domain experts in the sciences to
    independently formulate and debug high quality,
    reusable knowledge modules.
  • Develop a knowledge-based question-answering
    system that allows an untrained end-user to pose
    questions and problems to those underlying
    knowledge modules.

8
Examples
  • When dilute nitric acid was added to a solution of one of the following chemicals, a gas was evolved. This gas turned a drop of limewater, Ca(OH)2, cloudy, due to the formation of a white precipitate. The chemical was
  • (a) household ammonia, NH3
  • (b) baking soda, NaHCO3
  • (c) table salt, NaCl
  • (d) Epsom salt, MgSO4·7H2O
  • (e) bleach, 5% NaOCl

9
Examples (continued)
  • Which of the following aqueous solutions has the
    lowest conductivity?
  • (a) 0.1 M CuSO4
  • (b) 0.1 M KOH
  • (c) 0.1 M BaCl2
  • (d) 0.1 M HF
  • (e) 0.1 M HNO3

10
Burden of interpreting loose speak
  • Without interpreting loose speak, question encodings will not yield the right answers.
  • Without intimate knowledge of the knowledge base structure, loose speak cannot be interpreted.
  • Obtaining such intimate knowledge of the knowledge base and interpreting loose speak is a heavy burden.

11
Previous approaches
  • Restrict expressiveness:
  • keywords
  • question templates, such as "what happens to ___ during ___?"
  • Unsuitable for questions such as the previous examples.
  • Educate users:
  • but different KBs require different education.
  • Unsuitable for untrained end users.

12
Project goal
  • To improve knowledge-based question-answering systems by automating the interpretation of loose speak to produce correct encodings of questions.
  • Input: a naïve encoding of a question.
  • Output: an encoding of the input question that conveys the intended semantics of the input and does not contain loose speak.

13
Project Goal
(Diagram: end user -> English question -> question encoding -> Interpreter -> KB)
14
Study 1: types of loose speak
  • Purpose: since a naïve encoding may differ from a correct encoding in many ways, we need to discover the types of loose speak.
  • Methodology: compare naïvely encoded questions with the correct encodings.
  • Data: two sets of questions from Project Halo (150 questions in total).

15
Types of loose speak
16
Types and frequencies
17
Algorithm
  • Overview: reuse the knowledge in the KB being queried.
  • Consists of a test function and a repair function (a minimal sketch follows this list).
  • Test: check whether an input contains loose speak, based on constraint violations and the knowledge in the KB.
  • Repair: find a list of interpretations by spreading activation over the KB, using the input as anchor points.
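
A minimal sketch of this test-and-repair loop, assuming a question encoding is represented as (head, relation, tail) triples and a hypothetical `kb` object with `violates_domain_range` and `has_resembling_edge` methods; none of these names come from the actual Halo system.

```python
# Illustrative sketch only: the triple representation, the kb interface and
# the repair callable are assumptions, not the authors' implementation.

def interpret(encoding, kb, repair):
    """encoding: list of (head, relation, tail) triples; returns a new list
    in which every edge flagged as loose speak is replaced by its repair."""
    output = []
    for edge in encoding:
        # Test: an edge contains loose speak if it violates a structural
        # constraint or does not resemble any edge already in the KB.
        if kb.violates_domain_range(edge) or not kb.has_resembling_edge(edge):
            output.extend(repair(edge, kb))   # repair returns replacement edges
        else:
            output.append(edge)
    return output
```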

18
Example
  • Question: Hydrolysis of NaCH3COO yields?
  • a strong acid and a strong base
  • a weak acid and a weak base
  • a strong acid and a weak base
  • a weak acid and a strong base
  • none of the above

19
Example (continued): test
  • There is no constraint violation, because the domain of raw-material is Event, the range of raw-material is Tangible-Entity, Hydrolysis is an Event, and NaCH3COO is a Tangible-Entity.
  • However, the test detects loose speak because the KB contains no super- or subclass of Hydrolysis whose raw-material is a super- or subclass of NaCH3COO.

(Diagram of the naïve encoding: Hydrolysis --raw-material--> NaCH3COO; Hydrolysis --result--> ?; ? --intensity--> ?)
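
The same naïve encoding written out as triples (an illustrative notation only, not the system's actual syntax); names starting with "?" stand for the values the question asks for.

```python
# Naïve encoding of "Hydrolysis of NaCH3COO yields?" as (head, relation, tail)
# triples. The test flags the raw-material edge (no resembling edge in the KB)
# and, later in the example, the intensity edge (the query returns no filler).
naive_encoding = [
    ("Hydrolysis", "raw-material", "NaCH3COO"),
    ("Hydrolysis", "result",       "?result"),
    ("?result",    "intensity",    "?strength"),
]
```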
20
Example (continued): repair
  • Breadth-first search starting from Hydrolysis.
  • Spreading activation terminates when it finds a super- or subclass of NaCH3COO.

(Diagram of the Halo KB fragment: Hydrolysis --time--> Time-Interval; Hydrolysis --site--> Place; Hydrolysis --raw-material--> Chemical; Chemical --has-basic-structural-unit--> Chemical-Entity)
21
Example (continued)
(Diagram of the repaired encoding: Hydrolysis --raw-material--> Chemical; Chemical --has-basic-structural-unit--> NaCH3COO; Hydrolysis --result--> ?; ? --intensity--> ?)
22
Study 2: interpreter performance
  • Data:
  • 50 multiple-choice questions from AP chemistry practice tests.
  • Distinct from the data used in the frequency study.
  • Users:
  • 3 users with different backgrounds in knowledge engineering and chemistry.
  • Given a brief 3-page tutorial on encoding questions, not a complete tutorial on using the KB.
  • Measurements:
  • precision and recall.

23
Experimental results
24
Discussion and analysis
  • Loose speak is very common: on average, 91.3% of the encodings by the users contain loose speak.
  • None of the encodings that contain loose speak would be answered correctly by our knowledge base.
  • The loose speak interpreter works well in our test: precision is 95% and recall is near 90%.

25
Related work
  • Metonymy
  • Based on a set of rules (Weischedel & Sondheimer 1983; Grosz et al. 198; Lytinen, Burridge & Kirtner 1992; Fass 1997).
  • Based on KB search (Browse 1978; Markert & Hahn 1997; Harabagiu 1998).
  • KB search in knowledge acquisition (Davis 1979; Kim & Gil 1999; Blythe 2001).

26
Summary
  • Defined loose speak as the part of a question
    encoding that misaligns with existing knowledge
    base structures.
  • Preliminary evaluation shows that loose speak is
    common.
  • The interpreter can detect and interpret most
    occurrences of loose speak correctly in our test.

27
Future work
  • Expand the investigation of loose speak into
    other aspects of knowledge base interaction, such
    as knowledge acquisition.

28
Why doesn't traversal order matter (most of the time)?
  • The interpretation of an edge does not affect other edges, because the interpreter does not alter the original head and tail and does not depend on the interpretation of other edges.
  • Exceptions:
  • the overly generic concept type of loose speak: use backtracking.
  • queries: process them last, and process them in the direction of the edges in the queries.

29
Example (continued)
30
Why didn't you use a more sophisticated search?
  • Deeper search is not better. A very deep search will return encodings that are not closely related to the input, so they are less likely to convey the intended meaning of the input.
  • If only a shallow search is needed, then brute force is sufficient.

31
Isn't everything related to something in the taxonomy? So most search results must be useless.
  • The semantic relations in the searches do include the subclasses relation, but they do not include the superclasses relation.
  • If both superclasses and subclasses are included, then any concept can be found from another by climbing up and down the taxonomy, and a large number of spurious interpretations may be returned.

32
Precision and recall definition
  • Measurements (Jurafsky & Martin 2000):
  • Precision = # of correct answers given by the system / # of answers given by the system
  • Recall = # of correct answers given by the system / total # of possible correct answers
  • "# of correct answers given by the system" is the number of question encodings interpreted correctly.
  • "# of answers given by the system" is the number of question encodings for which the interpreter detects loose speak and finds an interpretation.
  • "Total # of possible correct answers" is the number of question encodings that contain loose speak (the formulas are restated as code below).
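
The two formulas restated as a small helper, using the slide's own wording for the three counts; this is purely a restatement of the definitions above, not code from the study.

```python
def precision_recall(correct, answered, with_loose_speak):
    """correct:          question encodings interpreted correctly
       answered:         encodings for which an interpretation was found
       with_loose_speak: encodings that actually contain loose speak"""
    precision = correct / answered
    recall = correct / with_loose_speak
    return precision, recall
```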

33
Experiment details
  • tp: inputs contain LS, and they are interpreted correctly.
  • fp: inputs don't contain LS, but they are interpreted.
  • tn: inputs don't contain LS, and they are not interpreted.
  • fn: inputs contain LS, but they are not interpreted.
  • Special cases:
  • If an input has syntax mistakes (such as a missing paren, or a set filler used instead of a single instance), the fixed version is used.
  • If an input causes the interpreter to crash, it counts as no interpretation found (hence fn), no matter what the cause of the crash is (it could be the KB or a really bad encoding).
  • If the interpretation solves the LS in an input correctly, it counts as a true positive even if the result isn't the perfect encoding for the question.
  • If the input has LS and the interpretation is incorrect or partially correct, it counts as fn.

34
Test & repair: test
  • Constraint violation:
  • If the edge violates structural constraints, then it must contain loose speak (because correct encodings are consistent with the structure of the knowledge base).
  • Returns many true positives and false negatives.
  • Resemblance test:
  • If the input does not resemble any existing knowledge, then it may contain loose speak, because studies have shown that one frequently repeats similar versions of general theories (Clark et al. 2000).
  • Returns many false positives and true negatives.

35
Test & repair: test (continued)
  • Constraint violation: implemented as a test for domain and range violations of the relation in an edge (both tests are sketched below).
  • Resemblance test:
  • if the edge represents a query, then it passes the test only if the KB can compute one or more fillers for the tail.
  • otherwise, an edge <Head_q, relation, Tail_q> passes if the KB contains an edge <Head_kb, relation, Tail_kb> such that Head_kb subsumes or is subsumed by Head_q, and Tail_kb subsumes or is subsumed by Tail_q.
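
A sketch of both tests under the same illustrative triple representation used earlier; the KB interface assumed here (`domain`, `range_of`, `isa`, `subsumes`, `edges_with`, `query`) is for illustration only and is not the Halo system's API.

```python
def violates_domain_range(edge, kb):
    """Constraint-violation test: does the edge break the declared domain
    or range of its relation?"""
    head, rel, tail = edge
    return not (kb.isa(head, kb.domain(rel)) and kb.isa(tail, kb.range_of(rel)))

def passes_resemblance(edge, kb):
    """Resemblance test as described on this slide."""
    head, rel, tail = edge
    if tail.startswith("?"):                 # the edge represents a query
        return bool(kb.query(head, rel))     # KB must compute at least one filler

    def related(a, b):
        # subsumes or is subsumed by
        return kb.subsumes(a, b) or kb.subsumes(b, a)

    # Look for an existing KB edge with the same relation whose head and tail
    # are subsumption-related to this edge's head and tail.
    return any(related(h, head) and related(t, tail)
               for (h, t) in kb.edges_with(rel))
```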

36
Test and repair: repair
  • Given a loose-speak edge <C1, relation, C2>, repair is implemented as two breadth-first procedures (sketched below):
  • search_head: start at C1, traverse all semantic relations, and stop when a suitable instance C3 is found. C3 is suitable if <C3, relation, C2> does not contain loose speak. The successful search path is returned.
  • search_tail: a similar search starting from C2.
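
A sketch of `search_head` as a breadth-first traversal. `kb.neighbors` (yielding (relation, concept) pairs) and the `edge_is_clean` predicate (for example, the two tests sketched above) are illustrative assumptions; `search_tail` would be the symmetric search starting from C2.

```python
from collections import deque

def search_head(c1, rel, c2, kb, edge_is_clean):
    """Breadth-first search from c1 for a concept C3 such that (C3, rel, c2)
    no longer contains loose speak; returns the successful search path as a
    list of (head, relation, tail) edges, or None if nothing is found."""
    queue = deque([(c1, [])])
    visited = {c1}
    while queue:
        concept, path = queue.popleft()
        if concept != c1 and edge_is_clean((concept, rel, c2), kb):
            return path                       # path from c1 to the suitable C3
        for relation, nxt in kb.neighbors(concept):
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, path + [(concept, relation, nxt)]))
    return None
```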

37
Example (continued): interpreting loose speak
  • The domain of intensity is Thing, and the range is Intensity-Value. Because the result of Hydrolysis is a Chemical, which is a Thing, the edge passes the constraint violation test.
  • However, because the query about intensity does not return any value, it fails the resemblance test.

(Diagram of the partially repaired encoding: Hydrolysis --raw-material--> Chemical; Chemical --has-basic-structural-unit--> NaCH3COO; Hydrolysis --result--> ?; ? --intensity--> ?)
38
Example (continued): interpreting loose speak
  • search_head finds that the Base-Role played by the resulting Chemical has an intensity value.
  • It's a role type of loose speak.

(Diagram of the Halo KB fragment: Chemical --has-basic-structural-unit--> Chemical-Entity; Chemical --plays--> Base-Role; Base-Role --intensity--> Intensity-Value)
39
Example (continued): interpreting loose speak
(Diagram of the final encoding: Hydrolysis --raw-material--> Chemical; Chemical --has-basic-structural-unit--> NaCH3COO; Hydrolysis --result--> ?; ? --plays--> ?; ? --intensity--> ?)