Open Information Extraction From The Web

About This Presentation

Slides: 12

Provided by: Rani157

Tags: extraction | information | knowledge | open | web

Transcript and Presenter's Notes

1
Open Information Extraction From The Web

2
What is Information Extraction

This article surveys a range of Information
Extraction (IE) methods, particularly Open IE.
Information Extraction is a venerable technology
that maps natural language text into structured
relational data.
Open Information Extraction is the setting where
the identities of the relations to be extracted
are unknown in advance, and the billions of
documents found on the Web necessitate highly
scalable processing.

3
Most common Ways to do IE

4
Direct Knowledge

5
Example of Supervised Learning

6
Self Supervised Knowledge

A system that labels its own training examples
(example: KnowItAll).
For a given relation:
Use generic patterns → instantiate
relation-specific extraction rules → learn
domain-specific extraction rules → apply the rules
to Web pages and assign each extraction a
probability.
Example: the generic pattern "X is a Y",
instantiated as "X is a country", matches both
China is a country.
Garth Brooks is a country singer. (a spurious
match, since "country" here modifies "singer")
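
The "X is a Y" example above can be sketched in a few lines of Python. This is only an illustration of instantiating a generic pattern for one class, not KnowItAll's actual code: the regular expression for X, the helper names instantiate and extract, and the two-sentence corpus are all assumptions made for the sketch.

```python
import re

# Generic, relation-independent template (the "X is a Y" pattern from the slide).
GENERIC_PATTERN = "{x} is a {y}"

def instantiate(class_name):
    """Turn the generic template into a rule for one class, e.g. 'country'.

    The sub-pattern for X is a naive stand-in for a proper-noun phrase:
    one or more capitalized words (an assumption for this sketch).
    """
    x = r"(?P<x>[A-Z]\w*(?: [A-Z]\w*)*)"
    return re.compile(GENERIC_PATTERN.format(x=x, y=re.escape(class_name)))

def extract(rule, sentences):
    """Return the X part of every sentence the rule matches."""
    found = []
    for sentence in sentences:
        match = rule.search(sentence)
        if match:
            found.append(match.group("x"))
    return found

sentences = ["China is a country.", "Garth Brooks is a country singer"]
rule = instantiate("country")
candidates = extract(rule, sentences)
print(candidates)
```

The naive rule returns both "China" and "Garth Brooks" as candidate countries, which is why the extracted candidates still need to be scored with probabilities rather than accepted as they are.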

7
Open Information Extraction

The challenge of Web extraction is to be able to
do Open Information Extraction.
Unbounded number of relations
Web corpus contains billions of documents.

8
How open IE systems work

Learn a general model of how relations are
expressed (in a particular language), based on
unlexicalized features such as part-of-speech
tags (e.g., identifying a verb).
Learn domain-independent regular expressions
(e.g., over punctuation such as commas).
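
A minimal sketch of what "unlexicalized" features might look like is given here. It assumes the sentence has already been tokenized and part-of-speech tagged (the Penn Treebank style tags are supplied by hand), and the feature names and the example sentence are hypothetical. The point is that no feature records the word itself, so the same model can apply to relations it has never seen.

```python
import re

# Domain-independent regular expression: a token made only of punctuation.
PUNCT = re.compile(r"^[^\w\s]+$")

def unlexicalized_features(tagged_tokens):
    """Map (word, pos_tag) pairs to features that never store the word itself."""
    features = []
    for word, pos in tagged_tokens:
        features.append({
            "pos": pos,                          # part-of-speech tag
            "is_verb": pos.startswith("VB"),     # e.g. VBD, VBN, VBZ
            "is_capitalized": word[:1].isupper(),
            "is_punct": bool(PUNCT.match(word)),
        })
    return features

# Hand-tagged example sentence (an assumption, not output of a real tagger).
tagged = [("Einstein", "NNP"), ("was", "VBD"), ("born", "VBN"),
          ("in", "IN"), ("Ulm", "NNP"), (".", ".")]
features = unlexicalized_features(tagged)
for (word, _), feats in zip(tagged, features):
    print(word, feats)
```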

9
Is there a general model of relationships in
English?

10
TextRunner

Works in two phases:
Phase 1: Using a conditional random field, the
extractor learns to assign labels to each of the
words in a sentence.
Phase 2: The extractor outputs one or more textual
triples that aim to capture (some of) the
relationships in each sentence.
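
The step from per-word labels to textual triples can be sketched as follows. This is not TextRunner's implementation: the label set (ENT, REL, O), the function name labels_to_triples, and the hand-assigned labels that stand in for the output of a trained conditional random field are all assumptions made for illustration.

```python
from itertools import groupby

def labels_to_triples(tokens, labels):
    """Group consecutive same-label tokens, then read off ENT-REL-ENT triples."""
    # Collapse runs of identical labels into (label, phrase) spans, skipping "O".
    spans = []
    for label, group in groupby(zip(tokens, labels), key=lambda pair: pair[1]):
        if label != "O":
            spans.append((label, " ".join(tok for tok, _ in group)))

    # Every ENT, REL, ENT window of adjacent spans becomes one triple.
    triples = []
    for i in range(len(spans) - 2):
        (l1, arg1), (l2, rel), (l3, arg2) = spans[i:i + 3]
        if (l1, l2, l3) == ("ENT", "REL", "ENT"):
            triples.append((arg1, rel, arg2))
    return triples

# Labels are assigned by hand here; in the two-phase design they would come
# from the learned sequence labeler.
tokens = ["Einstein", "was", "born", "in", "Ulm", "."]
labels = ["ENT", "REL", "REL", "REL", "ENT", "O"]
triples = labels_to_triples(tokens, labels)
print(triples)
```

For the hand-labeled sentence this yields the single triple (Einstein, was born in, Ulm). The relation phrase comes from the text itself, which is how an open extractor avoids needing a predefined list of relations.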

11
Additional Tasks to Accomplish

Opinion mining, in which Open IE can extract
opinion information about particular objects
(including products, political candidates, and
more) contained in blog posts, reviews, and
other texts.
Fact checking, in which Open IE can identify
assertions that directly or indirectly conflict
with the body of knowledge extracted from the Web
and various other knowledge bases.
