Recognizing Textual Entailment Challenge PASCAL - PowerPoint PPT Presentation

1 / 21
About This Presentation
Title:

Recognizing Textual Entailment Challenge PASCAL

Description:

Vampire theorem prover and Paradox where used for entailment proof. A knowledge base was used to validate results with real world. WordNet ... – PowerPoint PPT presentation

Number of Views:50
Avg rating:3.0/5.0
Slides: 22
Provided by: suleiman4
Category:

less

Transcript and Presenter's Notes

Title: Recognizing Textual Entailment Challenge PASCAL


1
Recognizing Textual Entailment ChallengePASCAL
  • Suleiman BaniHani

2
Textual Entailment
  • Textual entailment recognition is the task of
    deciding, given two text fragments, whether the
    meaning of one text is entailed (can be inferred)
    from another text.

3
Task Definition
  • Given pairs of small text snippets, referred to
    as Text-Hypothesis (T-H) pairs. Build a system
    that will decide for each T-H pair whether T
    indeed entails H or not. Results will be compared
    to the manual gold standard generated by
    annotators.
  • Example
  • T Kurdistan Regional Government Prime Minister
    Dr. Barham Salih was unharmed after an
    assassination attempt.
  • Prime minister targeted for assassination

4
Dataset Collection and Application Settings
  • The dataset of Text-Hypothesis pairs was
    collected by human annotators. It consists of
    seven subsets
  • Information Retrieval (IR)
  • Comparable Documents (CD)
  • Reading Comprehension (RC)
  • Question Answering (QA)
  • Information Extraction (IE)
  • Machine Translation (MT)
  • Paraphrase Acquisition (PP)

5
Approaching textual entailment recognition
  • Solution approaches can be categorizes as.
  • Deep analysis or understanding
  • Using types of linguistic knowledge and resources
    to accurately recognize textual entailment
  • Patterns of entailment (e.g. lexical relations,
    syntactic alternations)
  • Processing technology (word co-occurrence
    statistics, thesaurus, parsing, etc.)
  • Shallow approach

6
Baseline
  • Given that half the pairs are FALSE, the simplest
    baseline is to label all pairs FALSE. This will
    achieve 50 accuracy.

7
Application of the BLEU (BiLingual Evaluation
Understudy) algorithm
  • Shallow based on lexical level.
  • It is based on calculating the percentage of
    n-grams for a given translation to the human
    standard one, a typical values for N are taken,
    i.e. 1, 2, 3, 4.
  • It limits each n-gram appearance to a maximum
    frequency.
  • The result of each n-gram is combined, and a
    penalty is added to short text.
  • Scored 54 for development set, and a 50 in the
    test set.
  • Good results in the CD, bad results in IE and IR.
  • Problem, does not recognize syntactical or
    semantics, such as synonyms and antonyms.

8
Syntactic similarities
  • Human annotators were asked to divide the data
    set to
  • True by syntax
  • False by syntax
  • Not syntax
  • Cannot decide
  • Then using a robust parser to establish the
    results.
  • A partial submission was provided. And humans
    were used for the test.

9
Tree edit distance
  • The text as well as the Hypothesis is transformed
    to a tree using a sentence splitter and a parser
    to create the syntactic representation.
  • A matching module, find the beast sequence of
    editing operations to obtain H from T.
  • Each editing operation (deletion, insertion and
    substitution) is given a relative score.
  • Finally the total score is evaluated, if it
    exceeds a certain limit them the pair is labeled
    as true.
  • High accuracy for CD but 55 overall accuracy.
  • Should be enriched by using resources as WordNet
    and other libraries.

10
Dependency Analysis and WordNet
  • A dependency parser is used to normalized data in
    appropriate tree representation.
  • Then a lexical entailment module is used, where
    the sub branches of T an H can be entailed from
    the other using
  • Synonymy and similarity
  • Hyponymy and WordNet entailment, i.e. death
    entail kill.
  • Multiwords, i.e. melanoma entails skin-cancer.
  • Negation and antonymy, where negation is
    propagated through tree leaves.
  • A matching between dependency trees using a
    matching algorithm searching for matching
    branches between T and H.
  • Results show high score in CD and a between 42 to
    55 in other fields

11
Syntactic Graph Distance a rule based and a SVM
based approach
  • Use a graph distance theory, where a graph is
    used to represent the H and T pair.
  • Use similarity measures to determine entailment
  • T semantically subsumes H, e.g. H The cat eats
    the mouse and T the cat devours the mouse,
    eat generalizes devour).
  • T syntactically subsumes H, e.g., H The cat
    eats the mouse and T the cat eats the mouse in
    the garden, T contains a specializing
    prepositional phrase).
  • T directly implies H (e.g., H The cat killed
    the mouse, T the cat devours the mouse).

12
Cont.
  • A rule based system realize the following
  • Node similarity
  • Syntactic similarity
  • Semantic similarity
  • Applying a machine learning technique to evaluate
    the parameters and make the final decision
  • Results high for CD .76 and .44-.59 for others

13
hierarchical knowledge representation
  • A hierarchical logic passed representation o the
    T H pairs, where a description logic inspired
    language is used, extended feature description
    login (EFDL) which is similar to concept graph.
  • Nodes in the graph represent words or phrases.
  • Manually generated rewriting rules are used for
    semantic and syntactic representations.
  • A sentence in the text can have different
    alternatives
  • The evaluation is based if any of the sentence
    representations can infer H.
  • Results in the system set 64.8 while in the test
    56.1, high CD lowest QA 50

14
Logic like formula representation
  • A parser is used to transfer the pair T and H to
    graph, of logical phrases, where the nodes are
    the words and the links are the relations.
  • A matching score is given for each pair of terms.
  • The theorem proof is used to find the proof with
    the lowest coast.
  • The final cost is evaluated is it is less than a
    threshold, then the entailment is proved.
  • High results in the CD 79, Lowest with MT 47
    average 55.

15
Atomic Propositions
  • Find entailment relation by comparing the atomic
    proposition contained in the T and H.
  • The comparison of the atomic propositions is done
    using a deduction system OTTER.
  • The atomic propositions are extracted from the
    text using a parser.
  • WordNet is used for word relations.
  • A semantic analyzer is used to transform the
    output of the parser to first order logic.
  • Low accuracy .5188 especially for QA 47.

16
Combining shallow over lapping technique with
deep theorem proving
  • In the shallow stage a simple frequency test of
    over lapping words is used.
  • In the deep stage CCGparser is used to generate
    DRS, discourse representation theory. Which is
    transformed to first order logic.
  • Vampire theorem prover and Paradox where used for
    entailment proof.
  • A knowledge base was used to validate results
    with real world.
  • WordNet
  • Geographical knowledge from CIA
  • Generic axioms for, for instance, the semantics
    of possessives, active-passives, and locations.
  • The combined system has accuracy of .562 while
    the shallow approach has an accuracy of 0.55.

17
Applying COGEX logic prover
  • First use parser to convert into logic.
  • Then use COGEX, which is a modified version of
    OTTER.
  • The prover requires a set of clauses called the
    set of support which is used to initiate the
    search for inferences.
  • The set of support is loaded with the negated
    form of the hypothesis as well as the predicates
    that make up the text passage.
  • Another list is required called the usable list,
    contains clauses used by OTTER to generate
    inferences.
  • The usable list consists of all the axioms that
    have been generated either automatically or by
    hand.
  • World Knowledge Axioms (Manually)
  • NLP Axioms(SS and SM)
  • WordNet Lexical Chains
  • Overall accuracy .551, a lot of errors in the
    parsing stage.

18
Comparing task accuracy
CD Comparable Documents IE Information
Extraction QA Question Answering IR
Information Retrieval MT Machine
Translation RC Reading Comprehension PP
Paraphrasing
19
Rsullt comparison
20
Future work
  • Search for a candidate parser to transform NL to
    first order logic.
  • Use the largest set of KB to caputre similarity.
  • Search for a robust theorem prover.

21
References
  • The first PASCAL Recognising Textual Entailment
    Challenge (RTE I)
  • Ido Dagan, Oren Glickman and Bernardo Magnini.
    The PASCAL Recognising Textual Entailment
    Challenge.In Proceedings of the PASCAL
    Challenges Workshop on Recognising Textual
    Entailment, 2005.
  • http//www.cs.biu.ac.il/glikmao/rte05/index.html
Write a Comment
User Comments (0)
About PowerShow.com