What is Data Mining ? - PowerPoint PPT Presentation

About This Presentation
Title:

What is Data Mining ?

Description:

What is Data Mining ? Jinseog Kim Department of Statistics & Information Science Dongguk University jinseog.kim_at_gmail.com ... – PowerPoint PPT presentation


Transcript and Presenter's Notes

Title: What is Data Mining ?


1
What is Data Mining ?
  • Jinseog Kim
  • Department of Statistics & Information Science
  • Dongguk University
  • jinseog.kim_at_gmail.com

2
????? ?? ?? ??
???
??
????
  • ????
  • Point of Sale
  • ATM
  • ????
  • ????
  • ??
  • ????
  • ????
  • ??????
  • A?? ???? 80? B??? ????
  • ????? ??? ???? 6??? ??
  • A??? ?? ??? B??? 2?
  • ?? ??? ??? ??
  • ????? ?
  • ??? ??
  • ??? ?? ??? ?
  • ????? ????? ?
  • ??? ?? ???? ?
  • ??? ?

3
Data Mining ?? ?
  • ???? ??????
  • ??? ??? ????
  • ???? ?? ??? ????
  • ?? ??? ????? ???? ??? ??
  • ??? ????? ??, ??, ??, ??,??? ???

4
? ?
  • ???? ??? ?? ??? ??
  • ?????? ???? ?? ??
  • ??? ??? ???-POS data, Internet Log
  • ??, ??? ?? (???)
  • ??? ??? ??
  • ????? ?? ??
  • ????(Machine Learning) ??? ??
  • Knowledge Discovery, Knowledge Extraction,
    Machine Learning, Data/Pattern Analysis

5
Data Mining ??
  • ??? ??
  • ??? ??? ?? ??
  • ??? ??
  • ?? ?? ??? ?? ?? ??
  • ???, ???, ???,
  • ?? ??
  • ?? ??
  • ??? (??), ?? ??
  • ??, ???

6
Data Mining ??
Select
Transform
Mine
Assimilate
????
????
????
?? ? ??
DATABASE
??? ???
Extracted Data
Selected Data
Assimilated data
Transformed Data
Visualization
???
7
??????(CRM)? ?
????
????
????
?? ? ??
DATABASE
??? ???
Targeting for Sales
?????? ??? ????
  • ????
  • POS Data
  • Survey data

60? ??? ??? ?? ??
?????? (buys the same brand 80% of the time)
8
Data Mining?? ??
  • ??? ??, ??? ??? ???
  • ??? ??????? ??? ???
  • ??? ?? ??? ???

9
Data Mining?? ?? ??? ??? ??? ???
  • Summarization (??)
  • Association (??? ??)
  • Classification (??)
  • Clustering (???)
  • Characterization (????)
  • Sequential Pattern Discovery (??????)
  • Trend (?? ??)
  • Deviation Detection (??????)

10
Data Mining?? ?? ??? DB? ??? ???
  • Relational DB
  • transactional DB
  • Object-oriented DB
  • Spatial DB
  • Temporal DB
  • Textual vs Multimedia
  • Heterogeneous

11
Data Mining?? ?? ?? ??? ???
  • ????, ???? ??
  • ??? ??, rule induction
  • ????? ??? functional mapping? ??
  • ??? ?? algorithm? ??
  • ??? ??/ ????
  • Statistical Classification (supervised learning)
  • Clustering Techniques (unsupervised learning)
  • Time Series Analysis

12
??? ?? ??
  • Transaction DB? ????
  • <??????> ???? ?
  • RULE ??? ??
  • A => B: support, confidence
  • support = (A and B) / (total transactions)
  • confidence = (A and B) / (A)
  • ? ?? => ??? (Agrawal, ??? ?????? ??)
  • ?? 1 ??????? ????
  • ?? 2 AMAZON.COM
  • ????? ??
  • ?? 3 ??? ??????
  • ???? ?? ? ???? ?? ??(??????)
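
The support and confidence measures defined on this slide can be sketched in Python as follows. The transactions and the function names `support` and `confidence` are hypothetical, chosen only for illustration; this is a minimal sketch of the two ratios, not the implementation used by any particular tool.

```python
from typing import FrozenSet, List

# Hypothetical transaction database: each transaction is a set of items.
transactions: List[FrozenSet[str]] = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]


def support(a: FrozenSet[str], b: FrozenSet[str], db: List[FrozenSet[str]]) -> float:
    """(transactions containing A and B) / (total transactions)."""
    both = a | b
    return sum(1 for t in db if both <= t) / len(db)


def confidence(a: FrozenSet[str], b: FrozenSet[str], db: List[FrozenSet[str]]) -> float:
    """(transactions containing A and B) / (transactions containing A)."""
    n_a = sum(1 for t in db if a <= t)
    if n_a == 0:
        return 0.0  # the rule never applies, so report zero confidence
    n_ab = sum(1 for t in db if (a | b) <= t)
    return n_ab / n_a


a, b = frozenset({"bread"}), frozenset({"milk"})
print(support(a, b, transactions), confidence(a, b, transactions))
```

In this toy data, bread and milk appear together in 2 of 4 transactions, and bread appears in 3, so the rule bread => milk has support 2/4 and confidence 2/3.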

13
??? ?? ??
Association Rules with Maximum support of 50%
?? ??
14
Classification
  • ?????? ??? ??? ??
  • ????? Class-label ? feature set?? ??
  • ????(Supervised Learning)? ??
  • ????? ??? ??, ??? ??
  • ??? ??? ??? ? ??? ?? ??
  • ?? Credit Approval, ?? ??
  • ? ??? ???? ? ????? ?? ???? ?? ?? ??
  • Decision Tree, ???, ??? ???(logistic model, LDA,
    QDA)

15
Classification Example
??, ???, ??, ???, ??????
Classifier
Class 1 ??? ??
Class 2 ??? ??
Class 3 ??? ??
16
Decision Tree Classifier
  • ?????? Decision Tree ???? ??
  • ID3, CART, C5.0

17
Neural Network Classifier
  • ??? ?????? ??? ???? ??
  • ??? Neuron? ????? ???
  • ?? ???? ??
  • Error-back-propagation ????????
  • ??? Functional Mapping? ?? ???

18
Neural Network Classifier
19
Sequential Pattern Discovery
  • Transaction ????? ??? ?? ??
  • ??
  • ??????? ?? ?? ??
  • ???? ?? ??
  • ?? ??? ?? ?? ??, ??
  • ??? ??? ?? ??, ??
  • ???
  • ??? ??? ??
  • Hidden Markov Model for doubly stochastic process
    modeling

20
Sequential Pattern Example
Sequential Pattern in DataBase
21
Similar Time Series
Matching Curve Found
22
Clustering(???)
  • ?? ???? ?? ???? ???? ??? ??? ?? ???? ??
  • ????? ??? ???
  • Unsupervised Learning Algorithms
  • Symbolic, Neural Network based (Kohonen Feature
    Map)
  • Statistical clustering ???
  • ??
  • ???? ??? ??? - ?? ??? ??
  • ??? ???, ????? ?? ?? ????

23
Clustering Example
24
Symbolic Clustering
Similarity: 2, 2, 3
Difference: 3, 2.83, 3
Total score for this cluster partition = average similarity + average difference
= 2.33 + 2.94 = 5.27
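
The figures on this slide are consistent with the total score being the sum of the two averages. The short check below reproduces that arithmetic. It assumes the three similarity values and three difference values are simply averaged, and that the slide adds the averages after rounding each to two decimals; the variable names are illustrative.

```python
# Values read off the slide; which object pairs they belong to is not
# recoverable from the transcript.
similarities = [2, 2, 3]
differences = [3, 2.83, 3]


def mean(values):
    return sum(values) / len(values)


# Round each average to two decimals first, as the slide appears to do.
avg_similarity = round(mean(similarities), 2)
avg_difference = round(mean(differences), 2)
total_score = round(avg_similarity + avg_difference, 2)

print(avg_similarity, avg_difference, total_score)
```

Summing the unrounded averages instead gives roughly 5.277, so the slide's 5.27 only matches when the averages are rounded before they are added.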
25
Data Mining Interface
  • Interactive Mining
  • GUI? ?? Task? ??
  • Data Mining Query Language
  • find association rules
  • related to gpa, birth_place, family_income
  • from student
  • where major = "CS" and birth_place = "Seoul"
  • with support threshold = 0.05
  • with confidence threshold = 0.7
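
What such a query asks for can be approximated in plain Python. The sketch below reads the where clause as equality tests on major and birth_place, treats each selected student as a set of (attribute, value) items, and keeps one-to-one rules that meet both thresholds. The student records, the discretized values such as "high" and "mid", and all names are hypothetical; this is not an implementation of the query language itself.

```python
from itertools import permutations

# Hypothetical student relation with already-discretized attribute values.
students = [
    {"major": "CS", "birth_place": "Seoul", "gpa": "high", "family_income": "mid"},
    {"major": "CS", "birth_place": "Seoul", "gpa": "high", "family_income": "mid"},
    {"major": "CS", "birth_place": "Seoul", "gpa": "low", "family_income": "low"},
    {"major": "CS", "birth_place": "Busan", "gpa": "high", "family_income": "high"},
    {"major": "Math", "birth_place": "Seoul", "gpa": "low", "family_income": "mid"},
]

RELATED = ("gpa", "birth_place", "family_income")  # "related to ..."
MIN_SUPPORT, MIN_CONFIDENCE = 0.05, 0.7            # the two thresholds

# "from student where major = CS and birth_place = Seoul"
rows = [s for s in students if s["major"] == "CS" and s["birth_place"] == "Seoul"]

# Each selected row becomes a basket of (attribute, value) items.
baskets = [{(k, r[k]) for k in RELATED} for r in rows]
items = sorted(set().union(*baskets))


def count(itemset):
    """Number of baskets that contain every item in itemset."""
    return sum(1 for basket in baskets if itemset <= basket)


rules = []
for a, b in permutations(items, 2):
    n_a, n_ab = count({a}), count({a, b})
    sup, conf = n_ab / len(baskets), n_ab / n_a
    if sup >= MIN_SUPPORT and conf >= MIN_CONFIDENCE:
        rules.append((a, b, sup, conf))

for a, b, sup, conf in rules:
    print(f"{a} => {b}  support={sup:.2f} confidence={conf:.2f}")
```

Only single-item antecedents and consequents are enumerated here to keep the sketch short; a real miner would also consider larger itemsets.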

26
Kohonen's Feature Map
  • ???? ??? ??? ??
  • ??? ??? ??? ???? ???? ??
  • ??? Feature Map??? ? ??? ??
  • ???? ??
  • Feature Map ?? ??? ?? Difference
  • ????? ?? ??
  • 1) ??? ?? X? ?? ? ?? ?? N? ??
  • 2) N? ? N? ???? ????? X? ???? ??
  • 3) ?? ??? ??? ??? ??? ?? ?? ??
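
The three training steps on this slide are not legible in this transcript, so the sketch below follows the standard Kohonen self-organizing map procedure instead: for each input X, find the closest node N, move N and its neighbours toward X, and repeat with a shrinking learning rate. That correspondence to the slide's steps is an assumption. The one-dimensional map layout, the function name `train_som`, and all parameter values are illustrative choices.

```python
import random


def sq_dist(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def train_som(data, n_nodes=4, epochs=50, lr=0.5, radius=1, seed=0):
    """Train a small 1-D feature map; returns the list of node weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    nodes = [[rng.random() for _ in range(dim)] for _ in range(n_nodes)]
    for epoch in range(epochs):
        rate = lr * (1 - epoch / epochs)  # learning rate decays linearly
        for x in data:
            # 1) Find the node N whose weights are closest to the input X.
            winner = min(range(n_nodes), key=lambda i: sq_dist(nodes[i], x))
            # 2) Move N and its neighbours on the map toward X.
            lo, hi = max(0, winner - radius), min(n_nodes, winner + radius + 1)
            for i in range(lo, hi):
                nodes[i] = [w + rate * (xi - w) for w, xi in zip(nodes[i], x)]
        # 3) Repeat over the data with the smaller rate of the next epoch.
    return nodes


# Hypothetical two-dimensional inputs scaled to [0, 1].
data = [[0.1, 0.1], [0.0, 0.2], [0.9, 1.0], [1.0, 0.8]]
nodes = train_som(data)
for node in nodes:
    print([round(w, 2) for w in node])
```

After training, each input can be assigned to its closest node, which is how the map is used for clustering.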

27
???? ??? ?? ??(customer segmentation)
  • ?????? ?
  • ??? ?? ???? ??
  • ? ??? ???? ?
  • ?? ??? ??? ???? ??? ????? ?
  • ?? ??? ?? ??? ?????
  • ??? ?? ?? ? ?? ??? ??????
  • ?? ??? ??? ????? ?
  • ?? ??
  • ??? ?????(mass marketing)?? ????? ????
    ?????(personalization or target marketing)?? ??
  • ?? ??, ????, ?? ??, ?? ??

28
??? ?? ??
Scoring???
???? ???? ???? ??? ??
?? ?? ?? ?? ????
????
? ??? ????
29
??????? ??? Overview
???? ???? ???? ?????
???? DB
Credit ???
Decision Tree
??? ??
???? ??
?? ??? Scoring (Neural Network)
Scoring ???
Credit ?? ? ???? ??
30
?? ?? ???? ????
  • LG?????
  • ???? ????? ??? ??
  • ?? ???? ???? ???? ?? ?? ??
  • ????? ?? ??
  • ????, ????, ??? ??, ??? ??
  • ??? ???? Fraud Score ??
  • 1995? LG???? ???? 14???? ??
  • ?? ??? ??

31
Data Mining Tools
  • IBM Intelligent Miner
  • SAS Enterprise Miner
  • S-PLUS (Insightful)

32
?? ????
?   ? ?    ?     ?    ?
??/??? ??? ????? ??? DM (Direct Mail)? ??? ???? ?? ?? ?? ??/??? ?? ?? ????? ??? ??? ?? ?? ????, ??? ???? ??, ????, ???
??/?? ???? ?? ?? ?? ?? ???? ?? ? ???? ???? ?? ? ?? ?? ???? ?? ???? ?? ???? ?? ?? ?? ?? ?? ??
?? ????? ?? ??? ?? ?? ?? ??? ?? ?? ?? ?? ??? ?? ??? ??? ??? ?? ??
?? ??? ??/?? ??? ??? ?? ?? ?? ?? ?? ?? ?? ? ?? ?? ????? ?? ?? ?? ?? ?? ?? ? ?? ??
33
?? ??
  • Mining Business Databases, Brachman, et al.,
    CACM, Vol. 39, No. 11, 1996
  • Mining Scientific Data, Fayyad, et al., CACM,
    Vol. 39, No. 11, 1996
  • Quest (IBM Almaden)
  • http://www.almaden.ibm.com/cs/quest
  • DBMiner (Simon Fraser Univ.)
  • http://db.cs.sfu.ca/DBMiner
  • KDD (GTE)
  • http://info.gte.com/kdd/index.html
  • International Conference on Knowledge Discovery
    and Data Mining
  • Advances in Knowledge Discovery and Data Mining,
    MIT Press, 1996

34
? ?
  • ??? ?? ?? => ??, ??? ?? ??
  • ??????? ??? ??
  • ??? ??????? ??? ??
  • ???? ??? ??? ?? ??? ??
  • ?? ?? ??? ?? ?
  • ??? ?????? ?? ?? ??
  • Hot Research Item