Automatic Extraction of Information Behind Web Forms Based on Application Ontologies - PowerPoint PPT Presentation

Slides: 12
Provided by: saih
Learn more at: https://www.deg.byu.edu
Transcript and Presenter's Notes

1
Automatic Extraction of Information Behind
Web Forms Based on Application Ontologies
  • by
  • Sai Ho Yau
  • Brigham Young University

2
Introduction
  • There is an enormous amount of information
    available on the Web, but it is difficult to
    extract the data automatically for several
    reasons:
  • Dynamically generated Web pages
  • Form interfaces
  • Relevant information can be obtained only after a
    Web form is filled out and submitted

3
Problems Dealing with Forms
  • No general Web form design
  • Required text fields
  • One form may lead to another
  • Resulting information embedded within forms
  • Returned error messages versus valid data
  • Elimination of possible duplicate data
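
One of the problems listed above, telling returned error messages apart from valid data, can be illustrated with a rough sketch. The slides do not say how the check is done, so everything here is an assumption for illustration: the phrase list, the word-count threshold, and the function names `strip_tags` and `looks_like_error_page` are all hypothetical.

```python
import re

# Hypothetical phrases that often appear on "nothing found" or
# "bad input" pages; a real system would need a tuned list.
ERROR_PHRASES = (
    "no results",
    "no matches",
    "not found",
    "invalid",
    "required field",
    "please enter",
)


def strip_tags(html):
    """Remove markup and collapse whitespace, leaving the visible text."""
    text = re.sub(r"<[^>]+>", " ", html)
    return re.sub(r"\s+", " ", text).strip()


def looks_like_error_page(html, max_words=60):
    """Guess whether a returned page is an error message.

    Heuristic (an assumption, not the presenter's method): the page is
    short and contains at least one known error phrase.
    """
    text = strip_tags(html).lower()
    has_phrase = any(phrase in text for phrase in ERROR_PHRASES)
    return has_phrase and len(text.split()) <= max_words
```

A page that passes this screen would then go on to data extraction; a page that fails it would be discarded or trigger a different form submission.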

4
The Framework

5
Tools
  • Languages and Internet browser used
  • JavaScript, Java, PHP3, MySQL
  • Microsoft Internet Explorer
  • Platform
  • Solaris Intel (Unix), with Sun Java 1.1.6.

6
Method: Construct the Query String
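
The slide itself carries only the title, so the following is a minimal sketch of what constructing a query string can look like, not the presenter's implementation (the deck lists Java, JavaScript, and PHP3 as its tools). It assumes a form submitted with GET, uses only the Python standard library, and treats the first option of each select list as the default. The names `FormFieldParser` and `build_query_url` and the example URL are hypothetical.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode, urljoin

# Input types that do not contribute a value worth submitting here.
SKIPPED_INPUT_TYPES = {"submit", "reset", "button", "image", "file"}


class FormFieldParser(HTMLParser):
    """Collect the action URL and default field values of the first form."""

    def __init__(self):
        super().__init__()
        self.action = None      # stays None until a <form> tag is seen
        self.fields = {}        # field name -> default value
        self._in_form = False
        self._select = None     # name of the <select> being read, if any

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "form" and self.action is None:
            self._in_form = True
            self.action = attr.get("action") or ""
            return
        if not self._in_form:
            return
        name = attr.get("name")
        if tag == "input" and name:
            if (attr.get("type") or "text").lower() not in SKIPPED_INPUT_TYPES:
                self.fields[name] = attr.get("value") or ""
        elif tag == "select" and name:
            self._select = name
        elif tag == "option" and self._select and self._select not in self.fields:
            # Simplification: take the first option as the default choice.
            self.fields[self._select] = attr.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "select":
            self._select = None
        elif tag == "form":
            self._in_form = False


def build_query_url(page_url, html, overrides=None):
    """Return the GET URL that submitting the first form would request."""
    parser = FormFieldParser()
    parser.feed(html)
    if parser.action is None:
        raise ValueError("no form found in page")
    values = dict(parser.fields)
    values.update(overrides or {})      # caller-supplied field values win
    return urljoin(page_url, parser.action) + "?" + urlencode(values)
```

For example, a form with a text field `make` and a select list `year` whose first option is `1998`, posted to `/search`, would yield a URL ending in `/search?make=Ford+Escort&year=1998` when called with `overrides={"make": "Ford Escort"}`.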
7
The Goal
  • Deal with as many Web forms as possible.
  • Retrieve all relevant information.
  • Automate the extraction process.

8
Returned Web Page

9
Suggested Solution

10
Conclusions
We can automatically:
  • Fill in Web forms.
  • Extract information behind forms.
  • Screen out error messages and inapplicable Web
    pages.
  • Eliminate duplicate data.
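
As an illustration of the last point, duplicate elimination can be sketched as follows. The slides do not describe the actual technique, so this assumes each extracted record is a dictionary of field names to values and treats two records as duplicates when they match after trimming whitespace and ignoring case. The names `normalize` and `eliminate_duplicates` are hypothetical.

```python
def normalize(record):
    """Build a hashable, order-independent key for one extracted record."""
    return tuple(
        sorted(
            (key.strip().lower(), " ".join(str(value).split()).lower())
            for key, value in record.items()
        )
    )


def eliminate_duplicates(records):
    """Keep the first occurrence of each record, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = normalize(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique
```

Because different form submissions can return overlapping result sets, a filter of this kind would run over the combined results of all queries rather than over each returned page separately.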

11
Thank You