Introduction to Data Mining Chapter 1 - PowerPoint PPT Presentation

Slides: 18
Provided by: SEAS80
Transcript and Presenter's Notes

1
Introduction to Data Mining: Chapter 1

2
Chapter 1 Outline
  • Background
  • Information is Power
  • Knowledge is Power
  • Data Mining

3
Introduction

4
(No Transcript)

5
Information is Power
  • Relevant
  • Right Information
  • Globalised world
  • Vast amount of information available

6
What is Information?
  • A collection of data
  • The act of human analysis and interpretation of
    activities
  • Decomposing it into various components and
    tackling them

7
What is Knowledge?
  • The act of human synthesis and evaluation of
    information
  • Integration of the relevant components to form
    a coherent whole system

8
Data Mining Definition I
  • The nontrivial extraction of hidden, previously
    unidentified, and potentially valuable knowledge
    from data
  • A variety of techniques such as neural networks,
    decision trees or standard statistical techniques
    to identify nuggets of information or
    decision-making knowledge in bodies of data, and
    extracting these in such a way that they can be
    put to use in areas such as decision support,
    prediction, forecasting, and estimation.

9
Data Mining Definition II
  • Finding hidden information in a database

10
Hidden Information
  • Years of experience
  • Great secret recipes
  • Success Factors

11
Database Processing vs. Data Mining Processing
  • Query
    Database: well defined; expressed in SQL
    Data mining: poorly defined; no precise query language
  • Data
    Database: operational data
    Data mining: not operational data
  • Output
    Database: precise; a subset of the database
    Data mining: fuzzy; not a subset of the database

12
Query Examples
Database
  • Find all credit applicants with the surname Lee.
  • Identify customers who have purchased more than
    100,000 in the last year.
  • Find all customers who have purchased bread.
Data Mining
  • Find all credit applicants who are good credit
    risks. (classification)
  • Identify customers with similar eating habits.
    (clustering)
  • Find all items which are frequently purchased
    with bread. (association rules)
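
The contrast between the two kinds of query can be sketched in Python. This is only a minimal illustration on made-up data: the baskets and the helper names `bought` and `frequently_bought_with` are hypothetical and do not come from the presentation. The database-style query returns an exact subset of the stored records, while the mining-style query derives co-occurrence ratios that are not stored anywhere in the data.

```python
from collections import Counter

# Hypothetical toy transactions; each set is one customer's basket.
transactions = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]

def bought(item, baskets):
    """Database-style query: the exact subset of baskets containing item."""
    return [b for b in baskets if item in b]

def frequently_bought_with(item, baskets, min_confidence=0.5):
    """Mining-style query: items that appear alongside `item` in at least
    min_confidence of the baskets containing `item`."""
    with_item = bought(item, baskets)
    counts = Counter(other for b in with_item for other in b if other != item)
    return {
        other: n / len(with_item)
        for other, n in counts.items()
        if n / len(with_item) >= min_confidence
    }

bread_baskets = bought("bread", transactions)
rules = frequently_bought_with("bread", transactions)
print(len(bread_baskets))
print(rules)
```

The threshold `min_confidence` is an assumed tuning parameter; a real association rule algorithm would typically also apply a minimum support threshold.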

13
Data Mining Models and Tasks

14
Data Mining vs. KDD
  • Knowledge Discovery in Databases (KDD): Process
    of finding useful information and patterns in
    data.
  • Data Mining: Use of algorithms to extract the
    information and patterns derived by the KDD
    process.

15
KDD Process
Modified from FPSS96C
  • Selection (Pre-Mining 1): Obtain data from
    various sources.
  • Preprocessing (Pre-Mining 2): Cleanse data.
  • Transformation (Pre-Mining 3): Convert to common
    format. Transform to new format.
  • Data Mining: Obtain desired results.
  • Interpretation/Evaluation (Post-Mining): Present
    results to user in meaningful manner.

16
KDD Process Example: Web Log
  • Selection: Select log data (dates and locations) to
    use.
  • Preprocessing: Remove identifying URLs and error
    logs.
  • Transformation: Sessionize (sort and group).
  • Data Mining: Identify and count patterns; construct
    a data structure.
  • Interpretation/Evaluation: Identify and display
    frequently accessed sequences.
  • Potential User Applications: Cache prediction,
    personalisation.
17
Data Mining Development
  • Similarity Measures
  • Hierarchical Clustering
  • IR Systems
  • Imprecise Queries
  • Textual Data
  • Web Search Engines
  • Relational Data Model
  • SQL
  • Association Rule Algorithms
  • Data Warehousing
  • Scalability Techniques
  • Bayes Theorem
  • Regression Analysis
  • EM Algorithm
  • K-Means Clustering
  • Time Series Analysis
  • Algorithm Design Techniques
  • Algorithm Analysis
  • Data Structures
  • Neural Networks
  • Decision Tree Algorithms