Information Extraction I: Kistler/Marais Web Language

Provided by: raymie

Transcript and Presenter's Notes

1
Information Extraction I: Kistler/Marais Web Language
2
Information extraction applications
  • Find useful information
  • Extract it into form that can be processed
  • Process it
  • Present it back
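
The four steps above can be sketched as a small pipeline. This is a minimal illustration in plain Python, not WebL: the `PAGES` dictionary stands in for the Web so the sketch runs offline, and the function names, the example URLs, and the "name: $price" pattern are all assumptions made for the example.

```python
import re
from html.parser import HTMLParser

# Hypothetical in-memory "Web" so the sketch runs without network access.
PAGES = {
    "http://example.com/a": "<html><body><p>Widget: $12.50</p></body></html>",
    "http://example.com/b": "<html><body><p>No prices here</p></body></html>",
    "http://example.com/c": "<html><body><p>Gadget: $7.25</p></body></html>",
}


class TextCollector(HTMLParser):
    """Collect the text content of a page, dropping the markup."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def find(pages):
    """Step 1: keep only pages that look useful (here: they mention a price)."""
    return [url for url, html in sorted(pages.items()) if "$" in html]


def extract(html):
    """Step 2: turn markup into (name, price) records that can be processed."""
    parser = TextCollector()
    parser.feed(html)
    text = " ".join(parser.chunks)
    pattern = r"(\w+): \$(\d+\.\d{2})"
    return [(name, float(price)) for name, price in re.findall(pattern, text)]


def process(records):
    """Step 3: process the records (here: sort by price, cheapest first)."""
    return sorted(records, key=lambda record: record[1])


def present(records):
    """Step 4: present the result back (here: as a plain-text report)."""
    return "\n".join(f"{name}: ${price:.2f}" for name, price in records)


def run(pages):
    records = []
    for url in find(pages):
        records.extend(extract(pages[url]))
    return present(process(records))


print(run(PAGES))
```

Keeping the steps as separate functions means the presentation step can be swapped out without touching extraction.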

3
A model of info-extraction applications
  • Robustness is the key criterion
  • Tricky part: theoretically, this will be obviated by the Semantic Web and Web Services
  • Not necessarily Web presentation
  • From Kistler/Marais, WWW7
4
Example applications
  • Shopping robots
  • Personalized news
  • Financial applications
      • Use free data on Web
  • Intra/extranets
      • Manufacturing info
      • Project info
  • Meta-search engines
  • Convert LaTeX2HTML-generated pages into printable form

5
Marais/Kistler Web Language
  • Language for writing Web info extraction
    applications
  • Like Perl LWP, but specialized
  • Good for O(10K)-page applications
  • Manual/semi-automatic resource discovery
  • Manual (heuristic) extraction
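
Semi-automatic resource discovery can be sketched as a bounded crawl: a person picks the seed page and a rule for which links are worth following, and the program does the rest. The sketch below is plain Python rather than WebL, and the names `discover`, `wanted`, `http_fetch`, and `max_pages` are hypothetical. The `fetch` function is passed in as a parameter so the crawl can be tried against a canned site before it touches the network.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href values of <a> tags."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def http_fetch(url, timeout=10):
    """Fetch a page over the network (not exercised in the usage example)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


def discover(seed, fetch, wanted, max_pages=10_000):
    """Breadth-first crawl from `seed`, following only links where wanted(url) is true.

    Returns the visited URLs in visiting order, stopping after `max_pages` pages.
    """
    seen = {seed}
    queue = deque([seed])
    visited = []
    while queue and len(visited) < max_pages:
        url = queue.popleft()
        visited.append(url)
        parser = LinkCollector()
        parser.feed(fetch(url))
        for href in parser.hrefs:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen and wanted(target):
                seen.add(target)
                queue.append(target)
    return visited


# Usage with a canned three-page site (hypothetical URLs).
SITE = {
    "http://example.com/": '<a href="a.html">A</a> <a href="http://other.example/x">X</a>',
    "http://example.com/a.html": '<a href="/">home</a> <a href="b.html">B</a>',
    "http://example.com/b.html": "",
}
same_site = lambda url: url.startswith("http://example.com/")
print(discover("http://example.com/", SITE.__getitem__, same_site))
```

The `max_pages` cap reflects the O(10K)-page scale mentioned above: the crawl keeps every visited URL in memory, which is reasonable at that size but not for a Web-scale crawler.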

6
Challenges of info-extraction applications
  • Web is unreliable
      • Internet failures
      • Site failures
  • Resource-discovery problem
      • Where are pages with interesting data?
  • Pages are unstructured
      • Difficult to reliably extract information
  • Pages change frequently
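
One common way to cope with Internet and site failures is to retry a request and then fall back to an alternative source. The sketch below shows that idea in plain Python. It is an illustration under assumptions, not WebL's mechanism: `fetch_with_fallback`, `retries`, and `delay` are hypothetical names, and `fetch` is any function that raises `OSError` on failure (as `urllib.request.urlopen` does for network errors).

```python
import time


def fetch_with_fallback(urls, fetch, retries=2, delay=0.0):
    """Try each URL in order, retrying each one up to `retries` extra times.

    Returns (url, content) for the first successful fetch.
    Raises RuntimeError if every source fails.
    """
    last_error = None
    for url in urls:
        for attempt in range(retries + 1):
            try:
                return url, fetch(url)
            except OSError as error:
                last_error = error
                if attempt < retries:
                    # Back off a little longer after each failed attempt.
                    time.sleep(delay * (2 ** attempt))
    raise RuntimeError(f"all sources failed: {last_error}")


# Usage with a simulated outage (hypothetical URLs).
def flaky(url):
    if url == "http://primary.example/":
        raise OSError("site down")
    return "quote: 42"


source, body = fetch_with_fallback(
    ["http://primary.example/", "http://mirror.example/"], flaky, retries=1
)
print(source, body)
```

Retries help with transient Internet failures; the ordered list of sources helps when a single site stays down.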

7
Rest of today's lecture
  • From Marais SRI talk (slide 12)