Use of Patterns for Detection of Answer Strings

Provided by: EPLT

Transcript and Presenter's Notes

1
Use of Patterns for Detection of Answer Strings
  • Soubbotin and Soubbotin

2
Essentials of Approach
  • A certain shift from deep text analysis and NLP
    methods to surface techniques
  • Use of formulas describing the structure of
    strings likely bearing certain semantic
    information

3
Example
  • FBI Director Louis Freeh
  • A person represented by his/her first/last names
  • A person occupies a post in an organization

4
The formula
  • A word composed of capital letters
  • An item from a list of posts in an organization
  • An item from a list of first names
  • A capitalized word
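A minimal sketch of how the four-part formula above could be written as a regular expression with word lists spliced in. The list contents, the group names, and the `alternation` helper are assumptions made for illustration; the slides only say that such lists exist.

```python
import re

# Hypothetical word lists; the real lists are not given in the slides.
POSTS = ["Director", "President", "Chairman"]
FIRST_NAMES = ["Louis", "John", "Mary"]

def alternation(items):
    """Build a regex alternation, longest items first so longer entries win."""
    ordered = sorted(items, key=len, reverse=True)
    return "|".join(re.escape(item) for item in ordered)

# One regex element per bullet of the formula.
PERSON_POST = re.compile(
    r"\b(?P<org>[A-Z]{2,})\s+"                          # word composed of capital letters
    r"(?P<post>" + alternation(POSTS) + r")\s+"         # item from the list of posts
    r"(?P<first>" + alternation(FIRST_NAMES) + r")\s+"  # item from the list of first names
    r"(?P<last>[A-Z][a-z]+)\b"                          # capitalized word
)

match = PERSON_POST.search("Yesterday FBI Director Louis Freeh spoke to reporters.")
print(match.groupdict() if match else None)
```

Named groups keep the semantic roles (organization, post, person) attached to the matched substrings, which is what makes the string usable as an answer to more than one question type.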

5
Patterns
  • Formulas of such kind were called patterns
  • First used at TREC-10 QA track
  • Each pattern is characterized by a certain
    generalized semantics

6
Steps (Overview)
  • Identify strings corresponding to a formula
  • Identify the question terms (types)
  • Check for expressions negating the semantics of
    the found strings
  • Apply the set of formulas (for a particular
    question type) to match the strings in
    question-relevant passages
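The negation check in the steps above might look roughly like the following sketch. The list of negating expressions and the fixed character window are assumptions; the slides do not say which expressions are checked or how far from the match they may lie.

```python
import re

# Hypothetical negating expressions; the actual ones are not listed in the slides.
NEGATORS = ("not", "never", "no longer", "former")

def is_negated(text, start, window=30):
    """Return True if a negating expression occurs shortly before position `start`."""
    left_context = text[max(0, start - window):start].lower()
    return any(
        re.search(r"\b" + re.escape(negator) + r"\b", left_context)
        for negator in NEGATORS
    )

sentence = "He is no longer FBI Director, officials said."
print(is_negated(sentence, sentence.index("FBI Director")))
```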

7
A Surface Approach
  • No need to distinguish linguistic entities
  • Formulas for strings look like regular
    expressions
  • But patterns include elements referring to lists
    of predefined words/phrases

8
Patterns and Question Types
  • Who is person X?
  • Who occupies post Y in organization Z?
  • A relationship is established between two or more
    entities (person, post, organization, etc.)
  • Where-questions suggest geographical items as
    answers
  • Construct formulas such as "item from a list of
    cities/towns/counties, countries/states"
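One way to connect questions to pattern sets is a small ordered rule table keyed on the opening words of the question. This is only a sketch: the type labels and the rules themselves are hypothetical and far cruder than what a real system would need.

```python
import re

# Hypothetical question-type rules, checked in order; the first match wins.
QUESTION_TYPES = [
    ("YEAR", re.compile(r"^in what year\b", re.IGNORECASE)),
    ("LOCATION", re.compile(r"^where\b", re.IGNORECASE)),
    ("PERSON_POST", re.compile(r"^who (?:is|was) the\b", re.IGNORECASE)),
    ("PERSON", re.compile(r"^who\b", re.IGNORECASE)),
]

def question_type(question):
    """Return the label of the first rule that matches, or "OTHER"."""
    stripped = question.strip()
    for label, rule in QUESTION_TYPES:
        if rule.search(stripped):
            return label
    return "OTHER"

print(question_type("Where is the Taj Mahal?"))
```

The label returned here would then select which set of formulas to apply to the retrieved passages.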

9
Examples
  • "In what year" questions
  • Find strings with a sequence of 4 digits
  • Questions regarding length, area, weight, speed,
    etc.
  • Digits plus units of measurement
  • What is the area of Venezuela?
  • 340,569 square miles (a simple pattern match)
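The two simple patterns on this slide can be sketched as regular expressions. The unit list is an assumption and is deliberately short.

```python
import re

# A sequence of exactly four digits, as used for "In what year" questions.
YEAR = re.compile(r"\b\d{4}\b")

# Hypothetical list of measurement units; a real list would be much longer.
UNITS = ["square miles", "square kilometers", "miles", "kilometers", "feet", "pounds", "mph"]
UNIT_ALTERNATION = "|".join(
    re.escape(unit) for unit in sorted(UNITS, key=len, reverse=True)
)

# Digits (optionally with thousands separators and decimals) plus a unit.
MEASURE = re.compile(r"\b\d+(?:,\d{3})*(?:\.\d+)?\s+(?:" + UNIT_ALTERNATION + r")\b")

print(YEAR.findall("The treaty was signed in 1848."))
print(MEASURE.findall("Venezuela covers 340,569 square miles."))
```

A bare four-digit match also fires on numbers that are not years, which is one reason simple patterns like these need additional checks against the question.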

10
Complex Patterns
  • Strings expressing relationship between several
    semantic entities
  • The more complex a pattern is, the higher its
    reliability

11
Names and Dates
  • People Names
  • Items from first name list
  • Capitalized words
  • Specific name elements (bin, van, etc.)
  • Abbreviations like Sr. and Jr.
  • Dates
  • Prepositions, articles, digits, month names,
    commas, dashes, brackets, phrases like "early",
    "in the period of", "years ago", "B.C."
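A rough sketch of how a few of the listed elements could be combined into name and date patterns. All list contents are placeholders, and only some of the elements from the slide (particles, "Sr."/"Jr.", month names, digits, commas, "B.C.") are covered.

```python
import re

# Hypothetical lists; the slides do not give the actual contents.
FIRST_NAMES = ["Louis", "Vincent", "Martin"]
PARTICLES = ["bin", "van", "von", "de"]
MONTHS = ["January", "February", "March", "April", "May", "June", "July",
          "August", "September", "October", "November", "December"]

NAME = re.compile(
    r"\b(?:" + "|".join(FIRST_NAMES) + r")"      # item from the first-name list
    r"(?:\s+(?:" + "|".join(PARTICLES) + r"))?"  # optional name element such as "van"
    r"(?:\s+[A-Z][a-z]+)+"                       # one or more capitalized words
    r"(?:,?\s+(?:Sr|Jr)\.)?"                     # optional "Sr." / "Jr."
)

DATE = re.compile(
    r"\b(?:(?:" + "|".join(MONTHS) + r")\s+(?:\d{1,2},\s+)?)?"  # optional month and day
    r"\d{3,4}\b"                                                # year digits
    r"(?:\s+B\.C\.)?"                                           # optional era marker
)

print(NAME.search("Martin Luther King, Jr. spoke in Washington.").group(0))
print(DATE.search("He was born on July 4, 1776 in Boston.").group(0))
```

The greedy run of capitalized words will also swallow a following capitalized word that is not part of the name, so a pattern of this kind needs further filtering in practice.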

12
Pattern-Matching Strings and Question Semantics
  • How question words are located relative to the
    pattern-matching string (distance, left/right,
    position relative to other matching strings, etc.)
  • Simplicity of a pattern's structure is
    compensated by complexity of rules
  • Without applying heuristic rules, sufficiently
    reliable results cannot be ensured
  • Rank assigned to question words/phrases and score
    assigned to candidate answers
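To make the idea of ranking question words and scoring candidates concrete, here is a minimal sketch in which each question term contributes its rank divided by its character distance from the candidate. The scoring formula, the `pattern_weight` parameter, and the use of only the first occurrence of each term are all assumptions; the slides do not specify the actual heuristic rules.

```python
def score_candidate(text, span, question_terms, pattern_weight=1.0):
    """Score a pattern-matching string by how close ranked question terms lie to it.

    `span` is the (start, end) character range of the candidate in `text`;
    `question_terms` maps each term to its rank (higher means more important).
    """
    start, end = span
    lowered = text.lower()
    score = 0.0
    for term, rank in question_terms.items():
        position = lowered.find(term.lower())
        if position == -1:
            continue  # term absent from the passage: contributes nothing
        term_end = position + len(term)
        if term_end <= start:
            distance = start - term_end   # term lies to the left of the candidate
        elif position >= end:
            distance = position - end     # term lies to the right of the candidate
        else:
            distance = 0                  # term overlaps the candidate
        score += rank / (1 + distance)
    return pattern_weight * score

passage = ("Venezuela has an area of 340,569 square miles "
           "and lies about 1,000 miles from Miami.")
terms = {"area": 2.0, "Venezuela": 1.0}

near = "340,569 square miles"
far = "1,000 miles"
near_span = (passage.index(near), passage.index(near) + len(near))
far_span = (passage.index(far), passage.index(far) + len(far))
print(score_candidate(passage, near_span, terms) > score_candidate(passage, far_span, terms))
```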

13
QA Process
  • Define question types for all questions
  • Order the questions so that those with more
    reliable patterns come first
  • Form and rank queries from question terms
  • Modify queries (if score is below threshold)
  • Identify pattern-matching strings (apply complex
    patterns first, then simple ones)
  • Check correlation between patterns and question
    semantics
  • Identify exact answers and calculate their scores
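The "complex first, then simple" ordering can be sketched as a loop over patterns sorted by reliability. The two patterns, their names, and the reliability values are invented for illustration and do not come from the slides.

```python
import re

# (name, compiled pattern, reliability); all values are illustrative only.
PATTERNS = [
    ("bare_year", re.compile(r"\b\d{4}\b"), 0.3),
    ("born_in_year", re.compile(r"\bborn in (\d{4})\b"), 0.9),
]

def find_answers(passage):
    """Apply patterns from most to least reliable and collect distinct answers."""
    candidates = []
    seen = set()
    for name, pattern, reliability in sorted(PATTERNS, key=lambda p: p[2], reverse=True):
        for match in pattern.finditer(passage):
            # Use the captured group if the pattern has one, else the whole match.
            answer = match.group(match.lastindex or 0)
            if answer not in seen:
                seen.add(answer)
                candidates.append((answer, name, reliability))
    return candidates

print(find_answers("Mozart, born in 1756, moved to Vienna in 1781."))
```

Because the more complex pattern runs first, an answer it finds keeps the higher reliability even when a simpler pattern would also have matched it.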

14
Analysis of Results
  • TREC 2002
  • Confidence-weighted score: 0.691
  • 271 right answers, 209 wrong answers, 148 no
    answer
  • First 29 correct answers belonged to question
    types with highly reliable patterns
  • Incorrectly identified answer strings: 13.6%
    (excluding NIL answers)
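The slides do not define the confidence-weighted score. The sketch below uses the definition from the TREC 2002 QA main task, where answers are ordered from most to least confident and the score is the average, over all ranks i, of the fraction of correct answers among the first i. It is shown only to explain why confident correct answers at the top matter, and it does not attempt to reproduce the 0.691 figure.

```python
def confidence_weighted_score(judgements):
    """Compute the confidence-weighted score for answers ordered by confidence.

    `judgements` is a list of booleans, most confident answer first,
    where True means the answer was judged correct.
    """
    if not judgements:
        return 0.0
    correct_so_far = 0
    total = 0.0
    for rank, is_correct in enumerate(judgements, start=1):
        correct_so_far += 1 if is_correct else 0
        total += correct_so_far / rank   # precision over the first `rank` answers
    return total / len(judgements)

# Same number of correct answers, different ordering.
print(confidence_weighted_score([True, True, False, False]))
print(confidence_weighted_score([False, False, True, True]))
```

With the same number of correct answers, the run that places them first scores much higher, which fits the observation that the first 29 correct answers came from question types with highly reliable patterns.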