Web Mining - PowerPoint PPT Presentation

Title: Web Mining


1
Web Mining
  • 4/17/2007
  • Hye Seon Yi

2
Why Web mining?
  • To mine semi-structured data with hyperlinks and
    HTML tags
  • Traditional data mining, information retrieval,
    and machine learning systems use well-structured
    data (e.g., tabular data)
  • To find information systematically from the vast
    collection of Web documents

3
Data for Web mining
  • Web pages
      • Written in HTML and XML
  • Web logs
      • Generated and kept at Web servers
  • Hyperlink structures
  • Other related data
      • User profiles and registration information
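
As a rough illustration of how two of these data sources (page content and hyperlink structure) can be pulled from a single HTML page, here is a minimal sketch using Python's standard `html.parser` module. The class name, the `extract` helper, and the sample page are hypothetical and not part of the original slides.

```python
from html.parser import HTMLParser


class LinkAndTextExtractor(HTMLParser):
    """Collects href targets and visible text from one HTML page."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.text_parts = []

    def handle_starttag(self, tag, attrs):
        # Every <a href="..."> contributes one edge of the hyperlink structure.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

    def handle_data(self, data):
        # Keep non-empty text fragments as the page's content.
        stripped = data.strip()
        if stripped:
            self.text_parts.append(stripped)


def extract(html):
    """Return (list of link targets, page text) for an HTML string."""
    parser = LinkAndTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.links, " ".join(parser.text_parts)


# Hypothetical sample page, for illustration only.
SAMPLE_PAGE = (
    "<html><body><h1>Web Mining</h1>"
    '<p>See <a href="/usage.html">usage</a> and '
    '<a href="/structure.html">structure</a>.</p>'
    "</body></html>"
)

links, text = extract(SAMPLE_PAGE)
```

The extracted links would feed structure mining, while the text would feed content mining.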

4
Tasks of Web Mining
  • Resource finding
      • Collect data
  • Information selection and pre-processing
      • Convert data into a desired format
  • Generalization
      • Discover patterns
  • Analysis
      • Validate and/or interpret the patterns
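
The four tasks can be read as a pipeline. The sketch below is one possible toy version, assuming a handful of in-memory pages instead of a real crawler; all function names, the sample pages, and the `min_docs` threshold are hypothetical.

```python
import re
from collections import Counter


def find_resources():
    """Resource finding: a stand-in for a crawler (hypothetical pages)."""
    return {
        "a.html": "Web mining finds patterns in Web data.",
        "b.html": "Usage mining studies Web logs.",
        "c.html": "Structure mining studies hyperlinks.",
    }


def preprocess(pages):
    """Selection and pre-processing: turn each page into a set of lowercase terms."""
    return {url: set(re.findall(r"[a-z]+", text.lower()))
            for url, text in pages.items()}


def generalize(token_sets):
    """Generalization: count in how many pages each term occurs."""
    doc_freq = Counter()
    for tokens in token_sets.values():
        doc_freq.update(tokens)
    return doc_freq


def analyze(doc_freq, min_docs=2):
    """Analysis: keep only terms that recur across at least min_docs pages."""
    return sorted(term for term, n in doc_freq.items() if n >= min_docs)


common_terms = analyze(generalize(preprocess(find_resources())))
```

Here the "pattern" is simply the set of terms shared by several pages; a real system would substitute its own pattern-discovery and validation steps.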

5
Categories of Web Mining
  • Web Usage Mining
      • Find usage patterns of users
  • Web Content Mining
      • Knowledge discovery in Web data
  • Web Structure Mining
      • Analyze the node and connection structure of a
        Web site

6
Web Usage Mining
  • Find usage patterns of users
  • Mining Objects
      • Web log records
          • Visits, clicks, accessed pages, and so on
      • User input from forms and surveys
  • Utilization at e-commerce sites
      • Suggesting pages and resources according to
        users' browsing trends
      • e.g., Amazon.com, Netflix.com
  • Active research area
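
To make the idea of suggesting pages from browsing trends concrete, here is a minimal sketch that parses Web server log lines and suggests the page most often visited together with a given page. It assumes logs in Common Log Format and treats each client host as one user; the log lines and function names are invented for illustration and do not describe how Amazon.com or Netflix.com work.

```python
import re
from collections import Counter, defaultdict
from itertools import permutations

# Hypothetical log records in Common Log Format.
LOG_LINES = [
    '10.0.0.1 - - [17/Apr/2007:10:00:00 +0000] "GET /books HTTP/1.0" 200 512',
    '10.0.0.1 - - [17/Apr/2007:10:01:00 +0000] "GET /dvds HTTP/1.0" 200 734',
    '10.0.0.2 - - [17/Apr/2007:10:02:00 +0000] "GET /books HTTP/1.0" 200 512',
    '10.0.0.2 - - [17/Apr/2007:10:03:00 +0000] "GET /dvds HTTP/1.0" 200 734',
    '10.0.0.3 - - [17/Apr/2007:10:04:00 +0000] "GET /books HTTP/1.0" 200 512',
    '10.0.0.3 - - [17/Apr/2007:10:05:00 +0000] "GET /music HTTP/1.0" 200 291',
]

LINE_RE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]+\] "GET (\S+) [^"]*" (\d{3}) \S+$')


def pages_by_visitor(lines):
    """Group successfully served pages by client host (a crude user id)."""
    visits = defaultdict(set)
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group(3) == "200":
            visits[match.group(1)].add(match.group(2))
    return visits


def co_visit_counts(visits):
    """Count, for each page, how many visitors also accessed each other page."""
    counts = defaultdict(Counter)
    for pages in visits.values():
        for a, b in permutations(sorted(pages), 2):
            counts[a][b] += 1
    return counts


def suggest(counts, page, k=1):
    """Return up to k pages most often co-visited with the given page."""
    return [other for other, _ in counts[page].most_common(k)]


suggestions = suggest(co_visit_counts(pages_by_visitor(LOG_LINES)), "/books")
```

With these sample records, two of the three visitors to `/books` also viewed `/dvds`, so `/dvds` is the top suggestion.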

7
Web Content Mining
  • Knowledge discovery in Web data
  • Mining Objects
      • Textual data (unstructured, semi-structured,
        well-structured)
      • Multimedia data (image, audio, video, etc.)
  • IR view (bag of words) vs. DB view (structured)
    of Web content
  • Agent-based approach (AI agent) vs. DB approach
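
The IR view can be illustrated with a small bag-of-words sketch: each document becomes a multiset of terms, word order is discarded, and documents are compared by cosine similarity. The tokenizer, function names, and sample texts below are assumptions made for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Represent a document as term counts, ignoring word order."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two term-count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical documents.
doc_mining = bag_of_words("Web mining mines Web documents")
doc_logs = bag_of_words("Mining Web logs")
doc_other = bag_of_words("Cooking pasta at home")

related = cosine_similarity(doc_mining, doc_logs)
unrelated = cosine_similarity(doc_mining, doc_other)
```

Documents that share terms score above zero, while documents with no terms in common score exactly zero. The DB view would instead map the same page onto a structured record with named fields.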

8
Web Structure Mining
  • Mining Objects
      • Hyperlink structure
  • Utilization for search engines
      • Rank search results
      • e.g., the PageRank algorithm
          • Developed and used by Google
          • Each link counts as a vote
          • Each vote is weighted by the importance of
            the page that casts it
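
The "link as a weighted vote" idea can be sketched with a simplified power-iteration version of PageRank. This is a textbook-style approximation, not Google's production implementation; the damping factor of 0.85, the iteration count, the handling of pages without outgoing links, and the tiny link graph are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Iteratively estimate page importance from a dict {page: [linked pages]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Every page receives a small baseline share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its vote evenly among the pages it links to,
                # so votes from important pages carry more weight.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Assumption: a page with no outgoing links spreads its vote evenly.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical four-page site: C is linked from A, B, and D.
LINKS = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(LINKS)
top_page = max(ranks, key=ranks.get)
```

In this graph, page C collects votes from three pages and ends up ranked highest, and page A ranks second because its single incoming vote comes from the important page C.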

9
Summary
  • Web Mining: mining Web documents
  • 3 sub-categories
      • Web Usage Mining
      • Web Content Mining
      • Web Structure Mining