Reading to Learn Q3 Review - PowerPoint PPT Presentation

Slides: 91
Provided by: PeterC182
Learn more at: http://www.cs.utexas.edu
Transcript and Presenter's Notes

Title: Reading to Learn Q3 Review


1
Reading to Learn Q3 Review
  • Peter Clark
  • John Thompson
  • Tom Jenkins
  • Phil Harrison
  • Bill Murray

2
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • Generics
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

3
SRI-Boeing's Reading to Learn Seedling
  • Goal
  • study issues in learning through reading by
    working with a reduced version of the problem,
    namely working with controlled, rather than
    unrestricted natural language. The NLP task is
    factored into two:
  • full NL → CL, CL → logic
  • Rationale
  • by sidestepping some of the linguistic issues of
    full NLP, can focus on knowledge integration
    issues
  • methods for full NL → CL can be studied
    separately

this project
4
SRI-Boeing's Reading to Learn Seedling
  • Approach
  • Rewrite 5 pages of chemistry text into our
    controlled language, CPL
  • Extend and use our CPL interpreter to generate
    logic
  • Integrate this new knowledge with an existing
    chemistry knowledge base (from the Halo Pilot),
    which has the new knowledge surgically deleted
    from it
  • Evaluate the performance of the CPL-extended KB
    against the original
  • Report on the problems encountered and solutions
    developed

5
This Seedling in Mobius
[Diagram labels: Test Generation, Natural Language Processing, Knowledge Integration, Introspection; callout: This seedling]
6
Summary
  • Q3
  • Completed coding of key sentences in CPL
  • Demonstration of inference with that knowledge
  • Study of cues for identifying important text
  • Assembly of key lessons learned
  • Interaction with ISI
  • Exploration of shallow knowledge extraction
  • Q4
  • Finish interpretation of additional sentences
  • Assemble qualitative and quantitative evaluations
  • Continue interaction with ISI: side-by-side study
  • Final report

7
Main Results and Messages
  • With some hand-holding, part of the Mobius loop
    can be done
  • But chemistry is a formidable domain
  • Contributions
  • 10 key lessons learned for a larger project
  • Qualitative and quantitative evaluation data

8
10 Key Lessons
  • Much of the text is irrelevant (fluff)
  • Much important knowledge is conveyed by examples
    and diagrams
  • General principles are rarely spelt out clearly
  • Text is full of ambiguity, metaphor, and
    metonymy/loosespeak
  • Declarative knowledge may be hidden in procedural
    descriptions
  • Text creates disconnected knowledge, which may
    not chain well
  • Discourse structure is important
  • Generic sentences are ubiquitous
  • Many sentences pose major representational
    challenges
  • Traditional KR structures are difficult to extend

9
Two Reformulations into CPL
  • Reformulation of the whole 5 pages into CPL
  • Approximately 250 sentences
  • Syntactic conversion to pseudo-logic
  • generally not inference capable, esp. generics
  • Re-reformulation of first subsection into
    explicit if-thens
  • Inference capable but greater distance from
    source text
  • Reformulation of key pieces into CPL
  • approximately 10 if-then rules
  • inference capable
  • barely recognizable from the original source text

10
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • Generics
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

11
Some CPL Rules
IF a substance is an acid THEN the substance tastes sour.
IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye.
IF a substance is a base THEN the substance tastes bitter.
IF a substance is a base THEN the substance feels slippery.
IF a substance is an acid THEN the substance contains hydrogen.
IF a thing is a base THEN the thing is a substance.
IF an Arrhenius base contacts water THEN the base emits OH-minus ions in the water.
IF an Arrhenius acid is dissolving in water THEN the dissolving is increasing the concentration of H-plus ions in the water.
IF an Arrhenius base is dissolving in water THEN the dissolving is increasing the concentration of OH-minus ions in the water.
IF a substance is a HCl substance THEN the substance is an Arrhenius acid.
IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water.
IF hydrogen chloride gas is in water THEN the gas reacts with the water.
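A minimal sketch of how if-then rules like these could be chained, assuming each rule is reduced to a set of premise strings and one conclusion string about a single substance. The `forward_chain` function and the `BRIDGE` rule are hypothetical illustrations, not part of CPL or its interpreter. The sketch also shows the "disconnected knowledge" lesson: from the rules as listed, an HCl substance is derived to be an Arrhenius acid, but nothing says an Arrhenius acid is an acid, so "tastes sour" is not reached without an extra bridging rule.

```python
# Each rule is (set of premises, conclusion); facts are plain strings.
RULES = [
    ({"acid"}, "tastes sour"),
    ({"acid"}, "contains hydrogen"),
    ({"base"}, "tastes bitter"),
    ({"base"}, "feels slippery"),
    ({"base"}, "substance"),
    ({"HCl substance"}, "Arrhenius acid"),
]


def forward_chain(facts, rules):
    """Apply rules repeatedly until no new fact can be derived."""
    derived = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            if premises.issubset(derived) and conclusion not in derived:
                derived.add(conclusion)
                changed = True
    return derived


without_bridge = forward_chain({"HCl substance"}, RULES)

# Hypothetical bridging rule; it is NOT among the CPL rules on the slide.
BRIDGE = [({"Arrhenius acid"}, "acid")]
with_bridge = forward_chain({"HCl substance"}, RULES + BRIDGE)

print(sorted(without_bridge))
print(sorted(with_bridge))
```

With the bridging rule added, the chain reaches the sensory properties of acids; without it, the derivation stops at "Arrhenius acid".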
12
Reformulation of the 5 pages
  • Note introductory material, flowery language,
    fluff, complex sentences, parentheticals.

13
IF a substance is an acid THEN the substance tastes sour.
IF an acid contacts an acid-sensitive dye THEN the acid changes the color of the dye.
IF a substance is a base THEN the substance tastes bitter.
IF a substance is a base THEN the substance feels slippery.
14
IF a substance is a HCl substance THEN the substance is an Arrhenius acid.
IF hydrogen chloride gas is in water THEN the gas dissolves easily in the water.
IF hydrogen chloride gas is in water THEN the gas reacts with the water.
HCl is the chemical symbol for hydrogen chloride.
IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid.
IF a substance is concentrated hydrochloric acid THEN 37 percent of the mass of the substance is HCl.
IF a substance is concentrated hydrochloric acid THEN the concentration of HCl in the substance is 12 M.
(Implied but not explicit)
15
(surface logical form)
the'(e1,x1,e2) aqueous'(e3,x1) solution'(e2,x1) of'(e4,x1,x2) hcl'(e5,x2)
know'(e6,z1,x1,x3) as'(e7,e6,x3) hydrochloric'(e8,x3) acid'(e9,x3)
CPL
IF a substance is an aqueous solution of HCl substance THEN the substance is hydrochloric acid.
Halo KB style
(every Hydrochloric-Acid has-definition
  (instance-of (Aqueous-Solution))
  (has-solute ((a HCl-Substance))))
16
Summary of Interpretation Challenges
  • Interpreting generics.
  • "Acids cause some dyes to change color."
  • how to handle negation.
  • "Some substances containing hydrogen are not
    acids."
  • "The transfer leaves no undissociated acid
    molecules"
  • Vague attributes ("properties", "due to")
  • "Properties of aqueous solutions of Arrhenius
    acids are due to H-plus ions"
  • coreference with nominalizations
    ("react"/"reaction")
  • "Hydrogen chloride reacts... The reaction
    produces..."
  • naming: how to represent both the name and the
    symbol for a chemical.
  • "An aqueous solution of HCl is called
    hydrochloric acid."
  • how to get new technical vocabulary meanings
    into the system.
  • "NaOH dissociates in water."
  • "H2O abstracts the proton from HX"
  • how to represent definitions.
  • "Arrhenius acids are defined..."
  • how to state that one category is more general
    than another.
  • "Bronsted-Lowry acids are more general than
    Arrhenius acids."

17
Summary of Interpretation Challenges (cont)
  • how to represent "sometimes".
  • "An H3O-plus ion sometimes reacts with an H2O
    molecule."
  • how to represent modals/tendencies like "can".
  • "A molecule of a Bronsted-Lowry acid can donate a
    proton..."
  • how to represent an argument (proof), and
    generalize from it.
  • "Therefore, the H2O molecule acts as a
    Bronsted-Lowry base."
  • "Substances with negligible acidity contain
    hydrogen, but the substances do not behave as
    acids in water."
  • vagueness ("is mostly", "nearby", "some")
  • "The NH4Cl is mostly solid particles."
  • "Some acids are better proton donors than other
    acids."
  • "A weak acid partly transfers the acid's protons
    to the water."
  • "Proton-transfer reactions are governed by the
    relative strengths of the bases"
  • "The solution has a negligible concentration of
    HCl molecules."
  • "An aqueous solution of acetic acid consists
    mainly of HC2H3O2 molecules"
  • "The aqueous solution has relatively few H3O-plus
    ions"
  • metonymy
  • "The H2O molecule in Equation 16.5 donates a
    proton"
  • "In Equation 16.9 HX dissolves in water."
  • "Equation 16.9 describes the behavior of a strong
    acid in water."

18
Summary of Interpretation Challenges (cont)
  • definitions with negation.
  • "An H-plus ion is a proton with no valence
    electron."
  • presuppositions
  • "Acids cause some dyes to change color."
  • "A Bronsted-Lowry acid always reacts with a
    nearby Bronsted-Lowry base."
  • generalized formulae and equations
  • "In Equation 16.6 the symbol HX denotes an acid."
  • how to compute and represent differences
  • "An acid and a base differing only in a proton
    are called a conjugate pair"
  • how to handle definite references ("the" base)
    that haven't been introduced.
  • "Removing a proton from the acid produces the
    conjugate base."
  • change over time
  • "The HNO2 molecule becomes the NO2-minus ion."
  • "The H2O molecule changes into the hydronium ion"
  • "Acids cause some dyes to change color."
  • semi-malformed sentences
  • "A stronger acid has a weaker conjugate base."
  • How to state and represent hypothetical
    situations.

19
Summary of Interpretation Challenges (cont)
  • Generalization from examples
  • In any reaction we can identify two sets of
    conjugate acid-base pairs. For example, consider
    the reaction
  • Information in tables and diagrams

20
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

21
Recall from Last Time
  • Most of the textbook sentences are fluff and
    examples
  • and are not needed to solve test questions
  • A few key sentences (and a table) are the heart
    of this section of the textbook
  • and are often given in italics
  • These key sentences are not worded as precisely
    as needed for automatic translation into axioms
    that can chain together to solve a problem
  • in fact, some parts are not stated at all
  • students look at diagrams and examples and figure
    it out

22
Overview
  • 4 key pieces of knowledge in the Section
  • Computing the direction of the reaction
  • Rewriting in CPL
  • Compare to UT's KM encoding
  • Compare to ISI's shallow logical form
  • Identifying the acids/bases in a reaction
  • Computing the conjugate of an acid/base
  • Comparing the strengths of two acids/bases

24
A Key Sentence in Our Textbook
  • Let's look at one example of a key sentence
  • From these examples we conclude that in every
    acid-base reaction the position of the
    equilibrium favors transfer of the proton to the
    stronger base.
  • Restated in Sample Exercise 16.3
  • Thus, the equilibrium favors the direction in
    which the proton moves from the stronger acid and
    becomes bonded to the stronger base.
  • In other words, the reaction favors consumption
    of the stronger acid and stronger base and
    formation of the weaker acid and weaker base.

26
Rewriting a Sentence into CPL
Textbook
In every acid-base reaction the position of the
equilibrium favors transfer of the proton to the
stronger base.
Naïve Encoding 1
IF there is a reaction AND one base in the
reaction is stronger than the other base in the
reaction THEN the direction of the reaction is
away from the stronger base.
("favors transfer to" → "direction is away from")
Naïve Encoding 2
IF there is a reaction AND there is a base on the
left side of the reaction AND there is a base on
the right side of the reaction AND the first base
is stronger than the second base THEN the
direction of the reaction is to the right.
27
Further Refinement of the CPL
Naïve Encoding 2
IF there is a reaction AND there is a base on the
left side of the reaction AND there is a base on
the right side of the reaction AND the first base
is stronger than the second base THEN the
direction of the reaction is to the right.
The chemical entity whose formula is on the left
side of the equation of the reaction and which
plays a base role
28
Final CPL Rule That Worked!
IF there is an equation of a reaction
AND a first chemical entity has a chemical formula
AND the first chemical formula is part of the left side of the equation
AND the first chemical entity is playing a base role   (the base on the LHS)
AND a second chemical entity has a second chemical formula
AND the second chemical formula is part of the right side of the equation
AND the second chemical entity is playing a base role   (the base on the RHS)
AND the first chemical entity is stronger than the second chemical entity   (means "stronger base than")
THEN the direction of the reaction is right   (to the right)
AND the equilibrium side of the reaction is right.   (lies on the right)
(UT's rep. uses Reaction, but should use Equation)
29
Compare Sentence to Final CPL
  • In every acid-base reaction the position of the
    equilibrium favors transfer of the proton to the
    stronger base.
  • IF there is an equation of a reaction
  • AND a first chemical entity has a chemical
    formula
  • AND a second chemical entity has a second
    chemical formula
  • AND the first chemical formula is part of the
    left side of the equation
  • AND the second chemical formula is part of the
    right side of the equation
  • AND the first chemical entity is playing a base
    role
  • AND the second chemical entity is playing a base
    role
  • AND the first chemical entity is stronger than
    the second chemical entity
  • THEN the direction of the reaction is right
  • AND the equilibrium side of the reaction is
    right.
  • (There's a 2nd rule like this that concludes the
    direction is left)

not actually used!
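The final rule and its mirrored twin can be sketched as a small function. This is only an illustration under stated assumptions: the `Equation` class, the `direction` function, and the way base roles and the "stronger base than" relation are passed in as plain sets are all hypothetical, and the example fact that H2O is a stronger base than Cl-minus is supplied by hand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Equation:
    left: frozenset   # formulas on the left side of the equation
    right: frozenset  # formulas on the right side of the equation


def direction(equation, bases, stronger_base_than):
    """Return 'right', 'left', or None, following the two CPL rules.

    bases: formulas playing a base role in the equation.
    stronger_base_than: set of (stronger, weaker) pairs.
    """
    for lhs in equation.left & bases:
        for rhs in equation.right & bases:
            if (lhs, rhs) in stronger_base_than:
                return "right"  # the rule shown on the slide
            if (rhs, lhs) in stronger_base_than:
                return "left"   # the mirrored second rule
    return None


# HCl + H2O on the left, H3O+ + Cl- on the right.
eq = Equation(frozenset({"HCl", "H2O"}), frozenset({"H3O+", "Cl-"}))
result = direction(eq, bases={"H2O", "Cl-"},
                   stronger_base_than={("H2O", "Cl-")})
print(result)
```

Note that the function needs the base roles and the strength comparison as inputs; neither is stated in the textbook sentence itself.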
30
KM Generated from CPL
IF
  • (_Equation7461 equation-of _Reaction7462)
(chem. on LHS)
  • (_Chemical Entity7468 has-chemical-formula
    _Chemical Formula7469)
  • (_Chemical Formula7469 equal _Part7485)
  • (_Part7485 is-part-of _Left Side7483)
  • (_Left Side7483 is-region-of _Equation7461)
(chem. on RHS)
  • (_Chemical Entity7475 has-chemical-formula
    _Chemical Formula7476)
  • (_Chemical Formula7476 equal _Part7494)
  • (_Part7494 is-part-of _Right Side7492)
  • (_Right Side7492 is-region-of _Equation7461)
  • (_Chemical Entity7468 plays _Base Role7501)
  • (_Chemical Entity7475 plays _Base Role7508)
  • (_Chemical Entity7468 stronger-base-than
    _Chemical Entity7475)
THEN
  • (_Direction7518 value right)
  • (_Direction7518 direction-of _Reaction7462)
  • (_Equilibrium Side7524 property right)
  • (_Equilibrium Side7524 equilibrium-side-of
    _Reaction7462)
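A rule made of triples like this can be applied by matching the IF triples against known facts and instantiating the THEN triples. The sketch below is not the KM or CPL implementation; it assumes variables are strings starting with "?" and collapses the formula/part/side triples into two hypothetical relations, `formula-on-left-of` and `formula-on-right-of`, to keep the example short.

```python
def is_var(term):
    return term.startswith("?")


def match(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(patterns, facts, bindings=None):
    """Yield every binding that satisfies all patterns against the facts."""
    bindings = {} if bindings is None else bindings
    if not patterns:
        yield bindings
        return
    for fact in facts:
        extended = match(patterns[0], fact, bindings)
        if extended is not None:
            yield from solve(patterns[1:], facts, extended)


def apply_rule(if_part, then_part, facts):
    """Return the THEN triples instantiated for every match of the IF part."""
    new_facts = set()
    for b in solve(if_part, sorted(facts)):
        for triple in then_part:
            new_facts.add(tuple(b.get(t, t) for t in triple))
    return new_facts


IF = [
    ("?eq", "equation-of", "?rxn"),
    ("?b1", "formula-on-left-of", "?eq"),    # hypothetical shorthand relation
    ("?b2", "formula-on-right-of", "?eq"),   # hypothetical shorthand relation
    ("?b1", "plays", "base-role"),
    ("?b2", "plays", "base-role"),
    ("?b1", "stronger-base-than", "?b2"),
]
THEN = [("?rxn", "direction", "right"),
        ("?rxn", "equilibrium-side", "right")]

FACTS = {
    ("eq1", "equation-of", "rxn1"),
    ("H2O", "formula-on-left-of", "eq1"),
    ("Cl-", "formula-on-right-of", "eq1"),
    ("H2O", "plays", "base-role"),
    ("Cl-", "plays", "base-role"),
    ("H2O", "stronger-base-than", "Cl-"),
}

derived = apply_rule(IF, THEN, FACTS)
print(sorted(derived))
```

Because the rule is a flat list of triples, adding or removing a condition is a one-line change, which is the syntactic simplicity argued for later in the deck.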
31
Structure of the CPL Axioms
1. Find equilibrium side (or direction) of equation
2. Find out if a chemical is playing a base role in the equation
3. Find out if a chemical is the conjugate base of another chemical
3a. Look in Table, or
3b. Check whether one formula differs from another in an H
4. Check whether one base is stronger than another base
4a. Look in Table

(not in CPL)
32
Notes on our CPL Rule
  • The wording is way different from the original
    text!
  • The literal sentence translation would not have
    produced anything that could solve a problem,
    given an equation
  • In every acid-base reaction the position of the
    equilibrium favors transfer of the proton to the
    stronger base.
  • this would create a Favoring event
  • the position of the equilibrium is the agent
  • the transfer of the proton is the object
  • what does this mean?

33
Overview
  • 4 key pieces of knowledge in the Section
  • Computing the direction of the reaction
  • Rewriting in CPL
  • Compare to UT's KM encoding
  • Compare to ISI's shallow logical form
  • Identifying the acids/bases in a reaction
  • Computing the conjugate of an acid/base
  • Comparing the strengths of two acids/bases

34
How UT Encoded This
  • "In acid/base equilibrium reactions, the reaction
    proceeds in the direction of the side where
    equilibrium lies" (their comment, for use in
    explanations)

(every Reaction has
  (direction (
    (if (not (the direction of Self))
     then (a Direction-Value with
       (value ((if (the output of (a Compute-Equilibrium-Position with (input (Self))))
                then (if ((the output of (a Compute-Equilibrium-Position with (input (Self))))
                          = (the raw-material of Self))
                      then left
                      else right)))))))))

To find the direction of a reaction
Compute the equilibrium position
If the chemicals match the raw materials
Then the direction is left, else right
35
UT's Compute-Equilibrium-Position
(every Compute-Equilibrium-Position has
  (input ((a Reaction)))
  (output (
    ;; See if both the strong acid and base are on the LHS.
    (if (
         ;; Check the acids.
         ((the output of
            (a Compare-Relative-Strengths-of-Acids with
               (input (
                 (oneof (the raw-material of (the input of Self))
                  where (the Acid-Role plays of It))
                 (oneof (the result of (the input of Self))
                  where (the Acid-Role plays of It))))))
          = (oneof (the raw-material of (the input of Self))
             where (the Acid-Role plays of It)))
         and
         ;; Check the bases.
         ((the output of
            (a Compare-Relative-Strengths-of-Bases with
               (input (
                 (oneof (the raw-material of (the input of Self))
                  where (the Base-Role plays of It))
                 (oneof (the result of (the input of Self))
                  where (the Base-Role plays of It))))))
          = (oneof (the raw-material of (the input of Self))
             where (the Base-Role plays of It))))
     then (the result of (the input of Self))
     else (the raw-material of (the input of Self))))))
If the stronger of
the raw material acid
and the result acid
is the raw material acid
(same for bases)
then equilibrium is on the result side else the
raw material side
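The procedural flavor of UT's encoding is easier to see when the two methods are paraphrased as ordinary functions. The sketch below follows the callouts on the two slides; the `Reaction` class, the function names, and the numeric strength ranks are hypothetical stand-ins (the KB compares strengths through its own methods, not through ranks).

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    raw_acid: str     # acid among the raw materials
    raw_base: str     # base among the raw materials
    result_acid: str  # acid among the results
    result_base: str  # base among the results


def stronger(a, b, strength):
    """Return whichever of a, b has the larger (assumed) strength rank."""
    return a if strength[a] >= strength[b] else b


def equilibrium_position(rxn, acid_strength, base_strength):
    """'result' if the stronger acid and the stronger base are raw materials."""
    acids_ok = stronger(rxn.raw_acid, rxn.result_acid, acid_strength) == rxn.raw_acid
    bases_ok = stronger(rxn.raw_base, rxn.result_base, base_strength) == rxn.raw_base
    return "result" if acids_ok and bases_ok else "raw-material"


def direction(rxn, acid_strength, base_strength):
    """Left when equilibrium lies with the raw materials, else right."""
    position = equilibrium_position(rxn, acid_strength, base_strength)
    return "left" if position == "raw-material" else "right"


# Illustrative ranks only: a higher number means stronger.
ACIDS = {"HCl": 2, "H3O+": 1}
BASES = {"H2O": 2, "Cl-": 1}

forward = Reaction("HCl", "H2O", "H3O+", "Cl-")
backward = Reaction("H3O+", "Cl-", "HCl", "H2O")
print(direction(forward, ACIDS, BASES), direction(backward, ACIDS, BASES))
```

The knowledge "the proton goes to the stronger base" is present only implicitly, in the control flow, which is the mismatch the following slides discuss.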
36
Notes on UT's Encoding
  • Very procedural!
  • Various procedural methods are encoded
  • both qualitative and quantitative
  • Nothing like the textbook sentences
  • Their representation does not match the natural
    conceptual model we expected
  • see the next slide

37
Mismatches between UT and CPL
  • UT put a direction slot on a Reaction; we
    expected it to be on an Equation
  • UT has no model of the left and right sides of an
    Equation, only the raw-materials and result
    slots of a Reaction
  • UT has a Conjugate-Acid-Base-Pair concept, but
    lacks the conjugate-base and conjugate-acid
    relations we expected
  • UT has no slot for the equilibrium-side of an
    Equation, only the direction of a reaction

38
More Mismatches between UT and CPL
  • UT gives us no primitives to use for formula
    manipulation (adding an H); it's buried within
    their Compute-Conjugate-Acid
  • UT's model of Formula does not include a charge
    slot; they've only attached it to the Chemical
    itself
  • UT has no notion of stronger-base-than; they
    only label a chemical with intensity "strong"
    or "weak".
  • So, it would help if the conceptual model were
    closer to natural language!

39
Overview
  • 4 key pieces of knowledge in the Section
  • Computing the direction of the reaction
  • Rewriting in CPL
  • Compare to UT's KM encoding
  • Compare to ISI's shallow logical form
  • Identifying the acids/bases in a reaction
  • Computing the conjugate of an acid/base
  • Comparing the strengths of two acids/bases

40
ISI's Shallow Logical Form for our Sentence
From these examples we conclude that in every
acid-base reaction the position of the
equilibrium favors transfer of the proton to the
stronger base.
from'(e1,e2,x1) these'(e3,s1,e4)
example'(e4,x1) plural'(e7,x1,s1)
we'(e8,x2) plural'(e9,x2,s2)
conclude'(e2,x2,x3,z1) that'(e10,e2,e11)
in'(e12,e11,x4) every'(e13,x4,e14)
acid-base'(e15,x4) reaction'(e14,x4)
the'(e16,x5,e17)
position'(e17,x5) of'(e18,x5,x6)
the'(e19,x6,e20) equilibrium'(e20,x6)
favor'(e11,x5,x7,z2) transfer'(e21,x7)
of'(e22,x7,x8) the'(e23,x8,e24)
proton'(e24,x8) to'(e25,x7,x9)
the'(e26,x9,e27) strong'(e28,x9)
base'(e27,x9)
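A shallow logical form in this notation is simple to load into a program, which makes its gaps easy to inspect. The sketch below assumes only the surface syntax visible on the slide (a predicate name, an apostrophe, and a comma-separated argument list); the regular expression and function names are hypothetical. Indexing by variable shows that x3, the thing concluded, is mentioned by no predicate other than conclude.

```python
import re
from collections import defaultdict

LOGICAL_FORM = """
from'(e1,e2,x1) these'(e3,s1,e4) example'(e4,x1) plural'(e7,x1,s1)
we'(e8,x2) plural'(e9,x2,s2) conclude'(e2,x2,x3,z1) that'(e10,e2,e11)
in'(e12,e11,x4) every'(e13,x4,e14) acid-base'(e15,x4) reaction'(e14,x4)
the'(e16,x5,e17) position'(e17,x5) of'(e18,x5,x6) the'(e19,x6,e20)
equilibrium'(e20,x6) favor'(e11,x5,x7,z2) transfer'(e21,x7) of'(e22,x7,x8)
the'(e23,x8,e24) proton'(e24,x8) to'(e25,x7,x9) the'(e26,x9,e27)
strong'(e28,x9) base'(e27,x9)
"""

# Predicate name (letters, digits, hyphens), apostrophe, parenthesized arguments.
PREDICATE = re.compile(r"([\w-]+)'\(([^)]*)\)")


def parse(text):
    """Return a list of (predicate, [arguments]) pairs."""
    return [(name, args.split(",")) for name, args in PREDICATE.findall(text)]


def index_by_variable(literals):
    """Map each x-variable to the predicates that mention it, in order."""
    index = defaultdict(list)
    for name, args in literals:
        for arg in args:
            if arg.startswith("x"):
                index[arg].append(name)
    return index


literals = parse(LOGICAL_FORM)
index = index_by_variable(literals)
print(index["x3"], index["x9"])
```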
41
Graph of ISI's Shallow Logical Form

[Figure: graph of the shallow logical form. Nodes: z1 conclude(x2, x3); x2 we; x3 missing!; x1 example (these); x4 reaction (every, acid-base); z2 favor(x5, x7); x5 position; x6 equilibrium; x7 transfer; x8 proton; x9 base (strong). Links: from(x1), that, in(x4), of(x5, x6), of(x7, x8), to(x7, x9).]
42
Notes on ISI's Shallow Logical Form
  • Not far removed from a syntactic parse
  • They plan to do much more development of this
  • Will probably produce a literal translation
  • there will be a Favoring event, with agent
    and object
  • As with the naïve CPL sentence, a literal
    translation would not help solve a Chemistry
    problem

43
Overview
  • 4 key pieces of knowledge in the Section
  • Computing the direction of the reaction
  • Rewriting in CPL
  • Compare to UT's KM encoding
  • Compare to ISI's shallow logical form
  • Identifying the acids/bases in a reaction
  • Computing the conjugate of an acid/base
  • Comparing the strengths of two acids/bases

45
CPL for 2nd Key Sentence
  • In any acid-base (proton transfer) reaction we
    can identify two sets of conjugate acid-base
    pairs.
  • IF there is an equation of a reaction
  • AND a first chemical entity has a chemical
    formula
  • AND a second chemical entity has a second
    chemical formula
  • AND the first chemical formula is part of the
    left side of the equation
  • AND the second chemical formula is part of the
    right side of the equation
  • AND the first chemical entity is the conjugate
    base of the second chemical entity
  • THEN the first chemical entity is playing a base
    role
  • AND the second chemical entity is playing an
    acid role.
  • (There's a 2nd rule like this with first and second
    reversed)

46
UT Code for 2nd Key Sentence
(every Chemical has
  (plays (
    (if ((the term of (the atomic-chemical-formula of
                        (the has-basic-structural-unit of Self)))
         and (not (the Base-Role plays of Self)))
     then
     (if ((has-value
            (oneof (the result of (the Reaction raw-material-of of Self))
             where (((the elements of
                       (the term of
                         (the atomic-chemical-formula of
                           (the has-basic-structural-unit of It))))
                     (forall2 (the elements of
                                (the term of
                                  (the atomic-chemical-formula of
                                    (the has-basic-structural-unit of Self))))
                      (if ((the2 of It2) H)
                       then (pair ((the1 of It2) 1) H)
                       else It2)))
       or...
      then (a Base-Role)

jump to the other side of the equation!
[Diagram labels: Reaction; result; raw-material; Chemical; Chemical]
IF one of the chemicals on the other side of the reaction
has an extra H
THEN this chemical's a base
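The "has an extra H" test can be written as a standalone formula comparison, the kind of primitive that the slides say is buried inside UT's code. This is a minimal sketch, assuming flat formulas with no parentheses and ignoring charge; `element_counts` and `has_one_extra_h` are hypothetical helper names.

```python
import re
from collections import Counter

# An element symbol (capital letter, optional lowercase) and an optional count.
ELEMENT = re.compile(r"([A-Z][a-z]?)(\d*)")


def element_counts(formula):
    """Count atoms in a flat formula such as 'H3O' (no parentheses, no charge)."""
    counts = Counter()
    for symbol, number in ELEMENT.findall(formula):
        counts[symbol] += int(number) if number else 1
    return counts


def has_one_extra_h(candidate_acid, candidate_base):
    """True if the first formula equals the second plus exactly one H."""
    return element_counts(candidate_acid) == (
        element_counts(candidate_base) + Counter({"H": 1})
    )


print(has_one_extra_h("H3O", "H2O"))       # hydronium vs. water
print(has_one_extra_h("HCl", "Cl"))
print(has_one_extra_h("H2O", "H3O"))       # wrong way round
```

Applied across the two sides of an equation, the same test labels the formula with the extra H as the acid and its partner as the conjugate base.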
47
Overview
  • 4 key pieces of knowledge in the Section
  • Computing the direction of the reaction
  • Rewriting in CPL
  • Compare to UT's KM encoding
  • Compare to ISI's shallow logical form
  • Identifying the acids/bases in a reaction
  • Computing the conjugate of an acid/base
  • Comparing the strengths of two acids/bases
  • These last two items are presented in a table

48
Conjugate Acid-Base Pairs
Textbook
CPL
  • IF there is an HCl and a Cl-Minus
  • THEN the conjugate base of the HCl is the
    Cl-Minus.
  • IF there is an H3O-Plus and an H2O
  • THEN the conjugate base of the H3O-Plus is the
    H2O.
  • Etc.

49
Relative Strengths of Bases
Textbook
CPL
  • IF there is a Cl-Minus and an HSO4-Minus
  • THEN the HSO4-Minus is a stronger base than the
    Cl-Minus.
  • IF there is an HSO4-Minus and an NO3-Minus
  • THEN the NO3-Minus is a stronger base than the
    HSO4-Minus.
  • IF there is an NO3-Minus and an H2O
  • THEN the H2O is a stronger base than the
    NO3-Minus.
  • Etc.
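The two tables can be held as plain data and combined to label roles and compare bases. The sketch below uses only the pairs listed on these two slides. Treating "stronger base than" as transitive, so that non-adjacent table entries can be compared, is an assumption of this sketch, and the names `transitive_closure` and `roles` are hypothetical.

```python
# (acid, conjugate base) pairs from the CPL rules on the slide.
CONJUGATE_BASE = {"HCl": "Cl-", "H3O+": "H2O"}

# Adjacent (stronger base, weaker base) pairs from the CPL rules on the slide.
STRONGER_BASE = {("HSO4-", "Cl-"), ("NO3-", "HSO4-"), ("H2O", "NO3-")}


def transitive_closure(pairs):
    """Add (a, d) whenever (a, b) and (b, d) are present, until stable."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for a, b in closure for c, d in closure if b == c}
        if extra.issubset(closure):
            return closure
        closure |= extra


def roles(left, right):
    """Label chemicals whose conjugate partner sits on the other side."""
    found = {}
    for acid, base in CONJUGATE_BASE.items():
        if (acid in left and base in right) or (acid in right and base in left):
            found[acid] = "acid"
            found[base] = "base"
    return found


left, right = {"HCl", "H2O"}, {"H3O+", "Cl-"}
labels = roles(left, right)
ordering = transitive_closure(STRONGER_BASE)
print(labels)
print(("H2O", "Cl-") in ordering)
```

With the roles and the ordering in hand, the direction rule has everything it needs: the base on the left (H2O) is stronger than the base on the right (Cl-minus).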

50
Lessons from Key Sentences - 1
  • The key sentences did not translate literally
    into useful logic
  • they had to be carefully rewritten in CPL
  • and knowledge was added from studying diagrams
    and examples
  • and they were tested with each other to chain
    together
  • It was difficult to make use of the UT
    representations
  • they were very procedural
  • their representations were further removed from
    the English
  • so, we should use more natural representations
  • ISI's shallow logical forms may produce literal
    translations
  • again, not useful for solving problems

51
Lessons from Key Sentences - 2
  • Reading knowledge directly from a Chemistry text
    would be very challenging
  • the knowledge has to be written precisely enough
    for a computer (with little common sense) to
    encode
  • knowledge in tables and diagrams may be critical
  • the knowledge has to chain together to solve
    difficult exam problems
  • we need text that is written much more dryly and
    precisely
  • we need a domain that doesn't have such difficult
    exam problems

52
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • Generics
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

53
Are Other Chemistry Texts Better?
  • We looked at Web explanations and at "Chemistry
    Made Simple" types of books
  • Discovered that each teacher explains it
    differently
  • Most jump right into quantitative formulas for
    computing where a reaction's equilibrium lies
  • but our textbook teaches it qualitatively first,
    which is rare
  • Other sources are not any easier to process

54
Examples of Other Sources
  • Think of a Bronsted acid-base reaction as a
    competition between the 2 bases in the system for
    protons. The stronger base wins and forces the
    equilibrium in the direction of the weaker acid
    and base. (Web)
  • some books say that an acid is a proton donor;
    the acid molecule does not give or donate
    the proton, it has it taken away. In the same
    sense, you do not donate your wallet to the
    pickpocket, you have it removed from you.
    (another website)
  • The base is a molecule with a built-in drive
    to collect protons. As soon as the base
    approaches the acid, it will (if it is strong
    enough) rip the proton off the acid molecule and
    add it to itself.

55
More Examples from Other Sources
  • You see, some bases are stronger than others,
    meaning some have a large desire for protons,
    while other bases have a weaker drive. It's the
    same way with acids, some have very weak bonds
    and the proton is easy to pick off, while other
    acids have stronger bonds, making it harder to
    get the proton.
  • Remember that an acid-base reaction is a
    competition between two bases (think about it!)
    for a proton. If the stronger of the two acids
    and the stronger of the two bases are reactants
    (appear on the left side of the equation), the
    reaction is said to proceed to a large extent.
  • Note the heavy use of metaphors in these
    qualitative explanations!
  • The more readable by humans, the less readable by
    computers!

56
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • Generics
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

58
Review
  • Earlier analysis
  • Much of the textbook is irrelevant for the
    purposes of computer-based reading
  • motivational material, illustrative material,
    humor
  • Other sentences/parts are critical
  • Questions
  • Can a computer automatically find the critical
    items?
  • What cues might indicate the important material?

59
This brief analysis
  • Here, just consider two categories
  • important vs. unimportant material
  • Categories of surface cues
  • linguistic
  • context
  • layout
  • typography (e.g., font changes)
  • Looked at several text books
  • BL, Chemistry Made Simple, Cliffs Notes

60
Cues for Importance/Unimportance
  • Verb tense: past tense suggests irrelevance
  • chemical facts are generally presented in the
    present tense; past tense usually signals a
    historical digression. But biological facts
    include evolutionary facts, which require past
    tense.
  • Cue phrases for important generalizations
  • "for example" and (less so) "thus" precede
    examples but follow important generalizations.

61
Cues for Importance/Unimportance
  • Long sentences (> 20 words) suggest irrelevance
  • Average sentence length for chemistry is about 15
    words; for biology, ca. 24 words.
  • 15 words seems to allow a good balance of
    simplicity and complexity for stepping through
    explanations. CPL should target this number.
  • Summaries tend to have long complex sentences
    that are harder to process. Also true for
    sentences in review texts (Cliffs Notes, Instant
    Notes).

62
Cues for Importance/Unimportance
  • Everyday words suggest applications.
  • Nominalized verbs suggest irrelevance
  • exception: basic chemical changes (e.g.,
    reaction, combustion, evaporation)
  • Keywords
  • "if", "when", "because", "for" indicate important
    sentences
  • "For example" precedes an illustration
  • also indicates that the preceding text is an
    important generality
  • although typically part of a fluffy sentence

63
Cues for Importance/Unimportance
  • Definitional patterns: important!
  • "x is substance y", "x is a y that does z", "x
    is called y"
  • First and last sentences in a paragraph tend to
    be important (unless transitional)
  • set the topic of the paragraph
  • Text in bold or italics is often important
  • Repetition: could this be exploited?
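Several of the surface cues from these slides can be combined into a rough score. This is a toy sketch, not an evaluated classifier: the weights are arbitrary assumptions, the definitional pattern covers only two of the listed forms, the keyword "for" is left out because it would also fire on "for example", and tense, layout, and typography cues are not modeled.

```python
import re

KEYWORDS = {"if", "when", "because"}
# Covers "x is called y" and "x is a y that ..." only.
DEFINITION = re.compile(r"\bis called\b|\bis an? \w+ that\b", re.IGNORECASE)


def importance_score(sentence):
    """Sum of hand-picked cue weights; a positive score suggests important text."""
    words = re.findall(r"[A-Za-z0-9'-]+", sentence.lower())
    score = 0
    if len(words) > 20:
        score -= 1  # long sentences suggest irrelevance
    if KEYWORDS & set(words):
        score += 1  # conditional or causal keyword
    if DEFINITION.search(sentence):
        score += 2  # definitional pattern
    if sentence.lower().startswith("for example"):
        score -= 1  # introduces an illustration
    return score


definition = "An aqueous solution of HCl is called hydrochloric acid."
illustration = "For example, consider the reaction of ammonia with water."
print(importance_score(definition), importance_score(illustration))
```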

64
Summary
  • Many surface cues exist
  • Could identify important material by
  • surface cues
  • deeper model of the document structure
  • e.g. Motivation → General principle → Example →
    Reinforce general principle
  • Could the document automatically be turned into a
    labeled, networked structure like this?
  • How document-specific are these patterns?

65
Agenda
  • This Seedling and Mobius
  • Major lessons learned
  • Reformulations in CPL
  • Whole 5 pages
  • Key Sentences
  • How do other texts compare?
  • Generics
  • How to identify important text
  • Principles for an extensible KB
  • Evaluation discussion
  • Tuples as another source of knowledge

66
Principles for an Extensible KB
Elaboration Tolerance
"A formalism is elaboration tolerant to the extent
that it is convenient to modify a set of facts
expressed in the formalism to take into account
new phenomena or changed circumstances." (John
McCarthy)
e.g. add/modify knowledge (semantics) by (only)
adding formulae (syntactics)
Three Key Desirables for this
  • Syntactic simplicity
  • Metonymy-tolerant reasoning
  • Separate procedural and declarative knowledge

67
Syntactic Simplicity
  • Many syntactically large and complex structures
    in the original Halo KB, e.g.,

(every Acid-Role has
  (intensity (
    (a Intensity-Value with
      (value (
        (pair
          ;; Case statement for Acids.
          (if ((the played-by of Self) isa Ionic-Compound-Substance)
           then (if (((the played-by of Self) isa HCl-Substance) or
                     ((the played-by of Self) isa HBr-Substance) or
                     ((the played-by of Self) isa HI-Substance) or
                     ((the played-by of Self) isa HClO3-Substance) or
                     ((the played-by of Self) isa HClO4-Substance) or
                     ((the played-by of Self) isa H2SO4-Substance) or
                     ((the played-by of Self) isa HNO3-Substance))
                 then strong else
Not elaboration-tolerant
68
Syntactic Simplicity
  • Better would be to factor them into smaller units,
    e.g.,

Elaboration-tolerant
intensity(HCl-Substance, strong)
intensity(HBr-Substance, strong)
intensity(HI-Substance, strong)
intensity(HClO3-Substance, strong)
intensity(HClO4-Substance, strong)
intensity(H2SO4-Substance, strong)
intensity(HNO3-Substance, strong)
intensity(HF-Substance, weak)
intensity(HC2H3O2-Substance, weak)
intensity(H2CO3-Substance, weak)
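
To illustrate why the factored form is elaboration-tolerant, here is a minimal sketch that stores each intensity assertion as an independent fact. The dictionary representation and the helper names `intensity` and `add_fact` are assumptions made for illustration; they are not how KM or the Halo KB store knowledge.

```python
# One entry per assertion, mirroring intensity(X, strong|weak) on the slide.
INTENSITY = {
    "HCl-Substance": "strong",
    "HBr-Substance": "strong",
    "HI-Substance": "strong",
    "HClO3-Substance": "strong",
    "HClO4-Substance": "strong",
    "H2SO4-Substance": "strong",
    "HNO3-Substance": "strong",
    "HF-Substance": "weak",
    "HC2H3O2-Substance": "weak",
    "H2CO3-Substance": "weak",
}


def intensity(substance):
    """Look up the qualitative intensity; None if nothing is known."""
    return INTENSITY.get(substance)


def add_fact(substance, value):
    """Elaborate the KB by adding one fact; no existing rule is edited."""
    INTENSITY[substance] = value


# Hypothetical elaboration (HCN is not on the slide; it is a weak acid).
add_fact("HCN-Substance", "weak")
```

With the nested case statement, the same elaboration would require editing the middle of a large expression rather than appending one line.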
69
CPL Produces Syntactically Simple Structures
Traditional KM
(every Compare-Relative-Strengths-of-Acids has
  (output ((if (((the intensity of (the first of (the Chemicals))) = strong)
                and
                ((the intensity of (the second of (the Chemicals))) = weak))
            then (the strongest of (the Chemicals))
                 (the first of (the Chemicals))))))
CPL triples
IF   (_Intensity9 instance-of Intensity-Value)
     (_Chemical8 instance-of Chemical)
     (_Intensity5 instance-of Intensity-Value)
     (_Chemical4 instance-of Chemical)
     (_Intensity5 property strong)
     (_Intensity5 intensity-of _Chemical4)
     (_Intensity9 property weak)
     (_Intensity9 intensity-of _Chemical8)
THEN (_Chemical4 stronger-than _Chemical8)
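
Because the CPL output is a flat list of triples, a very small pattern matcher is enough to apply such a rule. The sketch below is an illustration under stated assumptions, not the CPL interpreter or KM: terms starting with an underscore are treated as variables, the ground facts (`i1`, `i2`, `HCl`, `H2CO3`) are made up for the example, and `solve` is a naive backtracking search.

```python
def is_var(term):
    # Assumption: terms beginning with "_" are rule variables.
    return term.startswith("_")


def match(pattern, fact, bindings):
    """Unify one pattern triple with one ground fact; None on failure."""
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if is_var(p):
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(patterns, facts, bindings=None):
    """Yield every variable binding that satisfies all patterns."""
    bindings = bindings or {}
    if not patterns:
        yield bindings
        return
    for fact in facts:
        b = match(patterns[0], fact, bindings)
        if b is not None:
            yield from solve(patterns[1:], facts, b)


def apply_rule(conditions, conclusion, facts):
    """Return the set of triples derived by one IF/THEN rule."""
    return {tuple(b.get(t, t) for t in conclusion)
            for b in solve(conditions, facts)}


RULE_IF = [
    ("_Intensity9", "instance-of", "Intensity-Value"),
    ("_Chemical8", "instance-of", "Chemical"),
    ("_Intensity5", "instance-of", "Intensity-Value"),
    ("_Chemical4", "instance-of", "Chemical"),
    ("_Intensity5", "property", "strong"),
    ("_Intensity5", "intensity-of", "_Chemical4"),
    ("_Intensity9", "property", "weak"),
    ("_Intensity9", "intensity-of", "_Chemical8"),
]
RULE_THEN = ("_Chemical4", "stronger-than", "_Chemical8")

# Hypothetical ground facts for illustration.
FACTS = [
    ("HCl", "instance-of", "Chemical"),
    ("H2CO3", "instance-of", "Chemical"),
    ("i1", "instance-of", "Intensity-Value"),
    ("i2", "instance-of", "Intensity-Value"),
    ("i1", "property", "strong"),
    ("i1", "intensity-of", "HCl"),
    ("i2", "property", "weak"),
    ("i2", "intensity-of", "H2CO3"),
]

derived = apply_rule(RULE_IF, RULE_THEN, FACTS)
```

Each condition is an independent triple, so conditions can be added or removed without restructuring the rule.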
70
Metonymy/Loosespeak
  • Metonymy: one word substitutes for a closely
    related word
  • Loosespeak: more generally, the literal
    interpretation is wrong
  • Examples
  • The kettle is boiling.
  • I'm just going to change the washing machine.
  • It's your turn to clean out the rabbit.
  • Remove a proton from the acid
  • The acid on the left of the equation
  • The reaction moves to the right
  • NaCl dissolves in water

71
Handling Metonymy/Loosespeak
  • 1. Detect inconsistencies / anomalies
  • Need extensive world knowledge for this
  • 2. If found, create and evaluate alternative
    interpretations
  • Metonymic transformation rules (e.g., Lakoff,
    Fass)
  • PART for WHOLE ("Get your butt over here")
  • PLACE for INSTITUTION ("The White House isn't
    saying anything")
  • PLACE for EVENT ("Remember the Alamo")
  • SUBSTANCE for MOLECULE ("NaCl dissolves")
  • FORMULA for SUBSTANCE ("NaCl is on the left of
    the equation")
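
The two-step procedure (detect a type clash, then try a metonymic transformation) can be sketched as follows. This is only an illustration: the type table, the selectional restrictions, and the relation names are hypothetical, and the CONTAINER for CONTENTS rule is an assumption added to cover the kettle example; it is not in the slide's list.

```python
# Hypothetical world knowledge: the literal type of each term.
TYPE_OF = {
    "kettle": "Container",
    "water": "Liquid",
    "White House": "Place",
}
# Hypothetical selectional restrictions: the type each verb expects.
EXPECTS = {
    "boil": "Liquid",
    "say": "Institution",
}
# Metonymic transformation rules: (literal type, expected type) -> relation.
COERCIONS = {
    ("Container", "Liquid"): "contents-of",       # assumed rule
    ("Place", "Institution"): "institution-at",   # PLACE for INSTITUTION
}


def interpret(verb, arg):
    """Return arg if the literal reading type-checks, else a coerced reading."""
    actual, expected = TYPE_OF[arg], EXPECTS[verb]
    # Step 1: detect an inconsistency between literal and expected type.
    if actual == expected:
        return arg
    # Step 2: look for a transformation rule that repairs the clash.
    relation = COERCIONS.get((actual, expected))
    if relation is None:
        raise ValueError(f"no interpretation for {verb}({arg})")
    return (relation, arg)
```

For example, `interpret("boil", "kettle")` yields `("contents-of", "kettle")`, while `interpret("boil", "water")` keeps the literal reading.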

72
Metonymy Tolerance (Loosespeak)
  • Could greatly reduce syntactic complexity
  • 50% of the Halo KB is doing type conversions
  • Example of extensive metonymy

[Diagram: HC2H3O2(aq) and C2H3O2-, with links labeled
basic-unit and formula]
73
Metonymy Tolerance
(every Compare-Relative-Strengths-of-Acids has
  (output ((if (((the1 of (the value of (the intensity of
                   (the Acid-Role plays of (the first of (the input of Self))))))
                 = strong)
                and
                ((the1 of (the value of (the intensity of
                   (the Acid-Role plays of (the second of (the input of Self))))))
                 /= strong))
            then (the first of (the input of Self))))))
If we had a metonymy-tolerant reasoner, we could
instead write:
(every Compare-Relative-Strengths-of-Acids has
  (output ((if (((the intensity of (the first of (the Chemicals))) = strong)
                and
                ((the intensity of (the second of (the Chemicals))) /= strong))
            then (the strongest of (the Chemicals))
                 (the first of (the Chemicals))))))
74
Separating Procedural and Declarative Knowledge
  • Procedural descriptions are uni-directional, and
    difficult to introspect on
  • Better: domain-specific, declarative knowledge +
    general-purpose procedural algorithms

"Every acid has a conjugate base, formed by
removing a proton from the acid. ... Similarly,
every base has associated with it a conjugate
acid, formed by adding a proton to the base."
Acid-Chemical = Base-Chemical + H+
75
Separating Procedural and Declarative Knowledge
Mixed
(every Compare-Relative-Strengths-of-Acids has
  (output ((if (((the1 of (the value of (the intensity of
                   (the Acid-Role plays of (the first of (the input of Self))))))
                 = strong)
                and
                ((the1 of (the value of (the intensity of
                   (the Acid-Role plays of (the second of (the input of Self))))))
                 /= strong))
            then (the first of (the input of Self))
            Compare-Relative-Strengths-of-Acids-output-1
            ))))

Declarative
HCl: strong
H2CO3: weak
Theory of magnitudes: strong > weak > ...

Procedural (PSM)
Find object(s) with qualitatively largest attribute value
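
The separation on this slide can be sketched as declarative facts plus one general-purpose procedure. This is a minimal illustration, not a KM problem-solving method: the function name `largest` is hypothetical, and placing `negligible` at the bottom of the ordering is an assumption.

```python
# Declarative, domain-specific facts.
INTENSITY = {"HCl": "strong", "H2CO3": "weak"}

# Declarative theory of magnitudes, lowest to highest.
# Assumption: "negligible" ranks below "weak".
MAGNITUDE_ORDER = ["negligible", "weak", "strong"]


def largest(objects, attribute_values, order=MAGNITUDE_ORDER):
    """General-purpose procedure: object(s) with the qualitatively largest value."""
    rank = {value: i for i, value in enumerate(order)}
    best = max(rank[attribute_values[o]] for o in objects)
    return [o for o in objects if rank[attribute_values[o]] == best]


strongest = largest(["HCl", "H2CO3"], INTENSITY)
```

The procedure knows nothing about acids, so the same code could rank any attribute for which an ordering of qualitative values is supplied.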
77
Possible Quantitative Metrics
  • Behavioral
  • Ablation study: question-answering performance
  • Analytic
  • Complexity of CPL vs Halo KB encodings
  • Amount of domain knowledge added by Boeing in
    writing CPL
  • % of Halo KB that would be simplified if metonymy
    handled
  • % of original text encodable in CPL
  • Time taken to encode the KBs
  • % of source text which is important (vs. fluff)
  • Bar graph of textual phenomena vs. frequency of
    occurrence
  • e.g., metaphor, examples, metonymy, diagrams
  • Measure of redundancy in the textbook

78
Behavioral Evaluation: Ablation Methodology
  • Approach
  • a. Create set of questions
  • b. Send questions to Halo KB, measure % correct
  • c. Ablate the Halo KB, add in ours
  • d. Send questions to new KB, measure % correct
  • Issues
  • How to ensure a fair comparison?
  • defining the space of questions to look at
  • How to ablate the UT KB?
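
Steps a to d can be sketched as a small harness. Everything here is a toy stand-in used only to show the control flow: a KB is modeled as a dictionary from (topic, entity) to answer, and the contents of `halo_kb`, `cpl_kb`, and `questions` are illustrative placeholders, not project data or results.

```python
def percent_correct(answer, questions):
    """Percentage of (question, gold) pairs that the KB answers correctly."""
    correct = sum(1 for q, gold in questions if answer(q) == gold)
    return 100.0 * correct / len(questions)


def ablate(kb, topics):
    """Remove every entry whose topic is in the ablated set."""
    return {key: value for key, value in kb.items() if key[0] not in topics}


# Toy stand-ins for the two knowledge sources.
halo_kb = {
    ("conjugate-base", "HCl"): "Cl-",
    ("conjugate-base", "H2CO3"): "HCO3-",
}
cpl_kb = {("conjugate-base", "HCl"): "Cl-"}

# a. Create a set of questions with gold answers.
questions = [
    (("conjugate-base", "HCl"), "Cl-"),
    (("conjugate-base", "H2CO3"), "HCO3-"),
]
# b. Score the original KB.
baseline = percent_correct(halo_kb.get, questions)
# c. Ablate the KB, then add in the new knowledge.
new_kb = {**ablate(halo_kb, {"conjugate-base"}), **cpl_kb}
# d. Score the new KB.
extended = percent_correct(new_kb.get, questions)
```

The fairness issues raised on the slide show up directly in this sketch: the scores depend entirely on which questions are chosen and on which topics are ablated.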

79
Behavioral Evaluation: Relevant AP Questions from
the Halo Pilot
  • Questions from Halo Pilot Syllabus Sample Questions
  • Q10. Given an equilibrium reaction, which species
    in the reaction act as bases?
  • Q33. Each of the following can act as both a
    Bronsted acid and a Bronsted base EXCEPT ...
  • Questions from Challenge Exam, Project Halo
  • Q18. Given an equilibrium reaction, the species
    that act as acids include which of the following?
  • Q19. Given an equilibrium reaction, the correct
    acid/conjugate base pair is ...
  • Q37. Which of the following species forms an acid
    when added to water?
  • Q38. Which of the following (lists of chemicals)
    is in correct order of increasing acidity?

80
Behavioral Evaluation: Variations on a Theme
  • Four main question patterns
  • What is the conjugate base/acid of X?
  • Is X stronger/weaker acid/base than Y?
  • Find the conjugate acid-base pairs in equation E
  • What is the direction of the equilibrium?

81
Core Knowledge Encodings
Task | Halo KB | CPL
Conjugate pairs | Giant KM procedure for formula manipulation | Lookup table
Relative strengths | Qualitative absolute strengths (strong/weak/negligible), qualitative comparison | Relative strength assertions
Labelling acids/bases in a reaction | Giant KM procedure for reaction manipulation | if-then rule using conjugate pairs
Computing direction of the reaction | KM rule | if-then rule
82
Core Knowledge Encodings
More general
Task | Halo KB | CPL
Conjugate pairs | Giant KM procedure for formula manipulation | Lookup table
Relative strengths | Qualitative absolute strengths (strong/weak/negligible), qualitative comparison | Relative strength assertions
Labelling acids/bases in a reaction | Giant KM procedure for reaction manipulation | if-then rule using conjugate pairs
Computing direction of the reaction | KM rule | if-then rule
(equivalent)
83
Behavioral Evaluation: Discussion Points
  • We can predict the outcome of any evaluation
  • can see the internals of each system
  • So what is a fair sample set?
  • Generate instantiations of the 4 templates?
  • AP exam questions?
  • Extend to cover other knowledge in the 5 pages?
  • none of it contained in Halo KB

84
Analytic Evaluation
  • Possible Metrics include
  • Complexity of CPL vs Halo KB encodings
  • Amount of domain knowledge added by us in writing CPL
  • % of KB simplified if metonymy handled
  • % of original text encodable in CPL
  • Time taken to encode the KBs
  • % of source text which is important (vs. fluff)
  • Bar graph of textual phenomena vs. frequency
  • e.g., metaphor, examples, metonymy, diagrams
  • Measure of redundancy in the textbook

86
Knowledge Mining
Schubert's Conjecture
There is a largely untapped source of general
knowledge in texts, lying at a level beneath the
explicit assertional content, and which can be
harnessed.
The camouflaged helicopter landed near the
embassy.
→ helicopters can land
→ helicopters can be camouflaged
Our attempt: lightweight LFs generated from
Reuters
LF forms:
(S subject verb object (prep noun) (prep noun))
(NN noun noun)
(AN adj noun)
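
Turning such tuples into general statements like the helicopter examples can be sketched as follows. This is a hedged illustration, not the project's extraction code: the tuples are assumed to arrive already parsed as Python tuples, the function names `generalize` and `mine` are hypothetical, and pluralization is a naive "add s".

```python
from collections import Counter


def generalize(lf):
    """Map one lightweight LF tuple to a generic statement, or None."""
    form = lf[0]
    if form == "S":
        # (S subject verb ...) -> "<subject>s can <verb>"
        return f"{lf[1]}s can {lf[2]}"
    if form == "AN":
        # (AN adj noun) -> "<noun>s can be <adj>"
        return f"{lf[2]}s can be {lf[1]}"
    # Other forms (e.g. NN) are not generalized in this sketch.
    return None


def mine(lfs):
    """Count how often each generic statement is supported by the tuples."""
    return Counter(g for g in map(generalize, lfs) if g is not None)


counts = mine([
    ("S", "helicopter", "land"),
    ("AN", "camouflaged", "helicopter"),
    ("S", "helicopter", "land"),
])
```

Counting support across a large corpus is one plausible way to separate reliable generalities from one-off tuples.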
87
Knowledge Mining
Newswire Article
HUTCHINSON SEES HIGHER PAYOUT. HONG KONG. Mar
2. Li said Hong Kong's property market remains
strong while its economy is performing better
than forecast. Hong Kong Electric reorganized and
will spin off its non-electricity related
activities. Hongkong Electric shareholders will
receive one share in the new subsidiary for every
owned share in the sold company. Li said the
decision to spin off
88
Knowledge Mining: Our Attempt
Fragment of the raw data (Brown & LeMay)

Atoms can combine
(S "atom" "combine")

For example, combustion reactions are redox reactions
because elemental oxygen is converted to
compounds of oxygen (Section 3.2).
(S "reaction" "be" "reaction")
(S-ADJ "oxygen" "converted" ("to" "compound"))
(AN "elemental" "oxygen")

Plan: Metals react with acids to form salts and
gas.
(S "metal" "react" (PP "with" "acid"))

Extensive oxidation can lead to the failure of
metal machinery parts or the deterioration of
metal structures.
(S "oxidation" "lead" (PP "to" "failure"))
(S "oxidation" "lead" (PP "to" "deterioration"))
(AN "extensive" "oxidation")
90
Summary
  • Q3
  • Completed coding of key sentences in CPL
  • Demonstration of inference with that knowledge
  • Study of cues for identifying important text
  • Assembly of key lessons learned
  • Interaction with ISI
  • Exploration of shallow knowledge extraction
  • Q4
  • Finish interpretation of additional sentences
  • Assemble qualitative and quantitative evaluations
  • Continue interaction with ISI: side-by-side study
  • Final report