1
An Experiment with Fuzzy Sets in Data Mining
International Conference on Computational Science, Beijing, 2007
  • David L. Olson University of Nebraska
  • Helen Moshkovich University of Montevallo
  • Alexander Mechitov University of Montevallo

2
Data Mining and Uncertainty
  • Data mining is highly useful
  • Deals with large datasets in many fields
  • Data are often vague and uncertain
  • Fuzzy set theory (Zadeh, 1965)
  • Rough set theory (Pawlak, 1982)
  • Probability theory (Pearl, 1988)
  • Set pair theory (Zhao, 1989)

3
Fuzzy Set Theory
  • Interval-valued fuzzy sets
  • Degree of membership is an interval in the 0-1 range
  • Vague sets and intuitionistic sets are essentially the
    same
  • Grey-related analysis
  • Use interval membership as part of process
  • Rough set theory
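The degree-of-membership idea above can be illustrated with a simple triangular membership function (an illustrative sketch; the set name and breakpoints are invented for the example, not taken from the presentation):

```python
def triangular_membership(x, a, b, c):
    """Degree of membership of x in a triangular fuzzy set: 0 outside
    [a, c], rising linearly to 1 at the peak b, then falling back to 0."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical "young" fuzzy set: full membership at age 25, none outside 15-45
print(triangular_membership(30, 15, 25, 45))  # 0.75
```

Interval-valued variants replace the single returned degree with a lower and upper bound on membership.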

4
Purpose
  • Review variants of fuzzy sets in data mining
  • Demonstrate fuzzy set implementation in decision
    tree model
  • Examine relative number of rules, accuracy

5
Data Mining Use of Fuzzy Sets
  • Neural Networks
  • Pattern Classification
  • Cluster Analysis
  • Genetic Algorithms
  • Association Rules

6
Fuzzy Sets in Neural Networks
  • Simpson 1992
  • method using neural networks to classify fuzzy
    data
  • Min-max function determined degree of membership
  • Generalization of k-nearest-neighbor classifier
  • Fuzzy min-max neural network classifier proved to
    be at least as good as traditional methods

7
Other Fuzzy Neural Networks
  • Zhang et al. 2000
  • Procedure to process numerical linguistic data
  • Hu et al. 2004
  • Two-phased method
  • Build fuzzy knowledge base from transactional
    data
  • Find weights through single-layer perceptron
  • Fuzzy linguistic input: customer evaluations

8
Neural Networks
  • Can be applied to many data mining applications
  • Prediction
  • Classification
  • Clustering (self-organizing maps)
  • Work relatively well on data with nonlinear
    relationships
  • Including complex interactions

9
Fuzzy Pattern Classification
  • Abe 1995
  • Generated fuzzy rules over variable fuzzy regions
  • Used attribute hyperboxes
  • Classification data
  • License plate recognition
  • More rules, greater accuracy
  • Compared with fuzzy min-max neural network
    approach of Simpson 1992
  • Neural networks better if data more complex

10
Fuzzy Pattern Classification
  • Liu et al (1999)
  • Fuzzy matching
  • Discover patterns against expectations
  • Sought to identify more interesting patterns
  • Led to fuzzy association rule generation

11
Fuzzy Linear Programming
  • Discriminant analysis
  • Find cutoff (or cutoffs) between categories
  • DEA fits fuzzy well

12
Cluster Analysis
  • Drobics et al. 2002
  • 3-stage approach
  • Self-organizing maps represent input data
  • Fuzzy c-means clustering applied to the cleaned data
    to display fuzzy clusters
  • Fuzzy rules generated inductively
  • Tested on classification data, image segmentation

13
Clustering Web Data
  • De and Krishna 2002
  • User transactions, recommend products
  • Measured transaction similarity
  • Fuzzy proximity relations basis of clusters
  • Le 2003
  • Fuzzy logic to assess Website popularity and
    satisfaction
  • Association rules
  • Lee and Liu 2004
  • Framework for information retrieval, filtering, and
    Internet shopping
  • Agents used to fuzzify data
  • Neural network model to select products

14
Fuzzy Clustering Methods
  • K-Means
  • Fuzzy c-means
  • Hierarchical
  • Bayesian Classification
  • ROUGH SET CLUSTERING
  • Puts indiscernible objects together
  • If the similarity index is below a threshold, objects
    are indiscernible
  • A higher threshold yields fewer clusters
  • Similarity is a weighted sum of
  • Euclidean distance for numerical attributes
  • Hamming distance for nominal attributes
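Of the fuzzy clustering methods listed above, fuzzy c-means is the most widely used; a minimal sketch of the generic algorithm (not the implementation used in any study cited here; the parameter defaults are illustrative, and a convergence check is omitted for brevity):

```python
import numpy as np

def fuzzy_c_means(X, c=2, m=2.0, iters=100, seed=0):
    """Minimal fuzzy c-means: returns cluster centers and the
    membership matrix U (each row gives a point's degrees of
    membership in the c clusters, summing to 1)."""
    rng = np.random.default_rng(seed)
    U = rng.random((X.shape[0], c))
    U /= U.sum(axis=1, keepdims=True)          # memberships sum to 1
    for _ in range(iters):
        Um = U ** m                            # fuzzified memberships
        centers = (Um.T @ X) / Um.sum(axis=0)[:, None]   # weighted means
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))          # standard FCM update rule
        U = inv / inv.sum(axis=1, keepdims=True)
    return centers, U

# Two well-separated invented blobs: memberships become nearly crisp
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
centers, U = fuzzy_c_means(X, c=2)
```

Unlike crisp k-means, every point retains a graded membership in every cluster, which is what makes the result "fuzzy".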

15
Genetic Algorithms
  • Bruha et al. 2000
  • Method to process symbolic attributes
  • CN4 beam search technology to categorize
    numerical attributes
  • Data fuzzified into three classes: 0, uncertain, 1
  • Genetic learning algorithm used to process each
    observation into best-fitting category
  • Used on credit screening data
  • Fuzzy was expected to be better, but the difference
    was insignificant
  • More hypotheses were significantly better, but much
    greater computational support was required

16
Association Rules
  • If PRECEDENT then CONSEQUENT (single output
    result)
  • Support
  • degree to which relationship appears in the data
  • Confidence
  • Probability that if precedent occurs, consequence
    will occur
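Support and confidence as defined above can be computed directly from transaction data (an illustrative sketch; the baskets and function name are invented for the example):

```python
def support_confidence(transactions, antecedent, consequent):
    """Support: fraction of transactions containing both itemsets.
    Confidence: among transactions containing the antecedent, the
    fraction that also contain the consequent."""
    ant, both = set(antecedent), set(antecedent) | set(consequent)
    n_ant = sum(1 for t in transactions if ant <= set(t))
    n_both = sum(1 for t in transactions if both <= set(t))
    support = n_both / len(transactions)
    confidence = n_both / n_ant if n_ant else 0.0
    return support, confidence

# Four invented baskets: "milk -> bread" holds in 2 of 4 transactions
baskets = [{"milk", "bread"}, {"milk", "bread", "eggs"}, {"bread"}, {"milk"}]
print(support_confidence(baskets, {"milk"}, {"bread"}))  # (0.5, 0.666...)
```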

17
Fuzzy Association Rules
  • Many based on the Apriori algorithm
  • Treat all attributes (or at least the linguistic ones)
    as uniform
  • Lowering support and confidence requirements lowers
    algorithm efficiency
  • Generates many uninteresting rules
  • Gyenesei 2000 used weighted quantitative
    association rules based on fuzzy data

18
Rough Sets (Pawlak 1991 book)
  • Bayes theorem statistical inference
  • Given the number of times an unknown event has
    happened and failed,
  • What is the chance that the probability of its
    happening in a single trial lies between stated
    probability limits
  • Rough Set Theory
  • Doesn't refer to prior or posterior probabilities
  • Reveals probabilistic structure of data
  • ANY DATA SET SATISFIES THE TOTAL PROBABILITY
    THEOREM
  • BAYES THEOREM CAN BE USED TO DRAW CONCLUSIONS
    FROM THE DATA
  • Can invert implications (give reasons for
    decisions)

19
Bayes Theorem
  • H = hypothesis
  • D = data
  • Pr(H|D) = Pr(D|H) × Pr(H) / Pr(D)
  • Pr(H) = probabilistic statement of belief in H
    before obtaining data D (prior)
  • Pr(H|D) becomes the probabilistic statement of belief
    about H after obtaining D (posterior)
  • Given Pr(D|H) and Pr(D), we can learn from data
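A worked numeric instance of the formula above, with Pr(D) expanded by the total probability theorem (the prior and likelihood values are invented for illustration):

```python
def bayes_posterior(prior_h, pr_d_given_h, pr_d_given_not_h):
    """Pr(H|D) = Pr(D|H) * Pr(H) / Pr(D), where Pr(D) is expanded
    by the total probability theorem over H and not-H."""
    pr_d = pr_d_given_h * prior_h + pr_d_given_not_h * (1 - prior_h)
    return pr_d_given_h * prior_h / pr_d

# Invented numbers: prior belief 0.2; data is 9x more likely under H
print(round(bayes_posterior(0.2, 0.9, 0.1), 4))  # 0.6923
```

Observing data that is much more likely under H raises the belief in H from 0.2 to about 0.69.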

20
Information System
  • Data table of Universe, Attributes, Values
  • Attributes C condition D decision
  • IF C THEN D
  • Each row of decision table determines decisions
    that must be taken when specified conditions
    satisfied
  • Crisp: no conflict (universal truth)
  • Rough: boundaries to certainty

21
Decision Table
22
Concepts
  • Support
  • Number of cases
  • Strength
  • Support / Universe
  • Certainty Factor
  • Conditional probability of D given C
  • true / (true + false) for the condition
  • Coverage Factor
  • Conditional probability of C given D
  • Inverse of the decision rule

23
Calculations 1st Row
  • Young, Green, OK (250 of 1000)
  • Strength 250/1000 0.250
  • Certainty 250/25050 0.833
  • Coverage 250/25010044040 0.301
  • Young, Green, Problem (50 of 1000)
  • Strength 50/1000 0.050
  • Certainty 50/25050 0.167
  • Coverage 50/501001010 0.294
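The arithmetic above generalizes to any decision rule; a small helper (a sketch: the function name is ours, and the counts are the first row's numbers as they appear in the slide's arithmetic):

```python
def rule_measures(rule_count, universe, condition_total, decision_total):
    """Strength, certainty, and coverage of a decision rule, following
    the definitions on the Concepts slide."""
    strength = rule_count / universe           # support / universe
    certainty = rule_count / condition_total   # Pr(D | C)
    coverage = rule_count / decision_total     # Pr(C | D)
    return strength, certainty, coverage

# First row: 250 (Young, Green -> OK) cases of 1000; 300 cases match
# the condition; 830 cases share the OK decision
s, cert, cov = rule_measures(250, 1000, 250 + 50, 250 + 100 + 440 + 40)
print(s, round(cert, 3), round(cov, 3))  # 0.25 0.833 0.301
```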

24
Results
25
Data
  • Shi et al., International Journal of Information
    Technology & Decision Making, 2005
  • Bank credit card data (1990s)
  • 6000 observations (5040 good, 960 bad)
  • Outcome good, bad
  • 64 Explanatory variables
  • 9 binary
  • 3 categorical
  • 52 continuous
  • Generated decision tree rules

26
Data Mining Software Supporting Fuzzy Data
  • PolyAnalyst
  • Claims fuzzy sets are used in a number of algorithms
    (discriminant analysis)
  • Inserts boundaries into the data
  • User doesn't do anything
  • See5
  • User selects fuzzy option
  • Inserts buffer at boundaries
  • Based on sensitivity of classification to small
    changes in threshold

27
Controls
  • PolyAnalyst
  • Minimum Support / minimum association
  • Minimum part of transactions that should contain
    a basket of products
  • Too high: no clusters. Default 1
  • Minimum Confidence
  • Probability of B given A
  • Too high: no rules. Default 50
  • Minimum Improvement
  • How much better the confidence of an association rule
    is than random
  • Default 1

28
Clementine Apriori Controls
  • Minimum rule support
  • Percentage of records for which the antecedent is true
  • Minimum rule confidence
  • Of the records where the rule antecedents are true,
    the percentage where the consequent is true
  • Maximum number of antecedents
  • Confidence ratio
  • Ratio of rule confidence to prior confidence

29
See5 Controls (tree)
  • Pruning confidence factor
  • Smaller values prune more of the tree
  • Minimum cases
  • Number of cases (support) required to keep a rule
  • Too high: fits training data less
  • Locked data for 5 replications
  • Pruning 10 (most) 20 30 40 (least)
  • MinSupport 10, 20, 30
  • So 60 runs, replicated for crisp, fuzzy,
    ordinal, ordinal-fuzzy, categorical

30
Example Crisp Model
  • RULE 1 IF RevtoPayNov <= 11.441 THEN good
  • RULE 2 IF RevtoPayNov > 11.441 AND
  • IF CoverBal3 = 1 THEN good
  • RULE 3 IF RevtoPayNov > 11.441 AND
  • IF CoverBal3 = 0 AND
  • IF OpentoBuyDec > 5.35129 THEN good
  • RULE 4 IF RevtoPayNov > 11.441 AND
  • IF CoverBal3 = 0 AND
  • IF OpentoBuyDec <= 5.35129 AND
  • IF NumPurchDec <= 2.30259 THEN bad
  • ELSE good
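The crisp rules above translate directly into code. A sketch: the transcription lost some comparison operators, so the "<=" and "=" tests below are reconstructed as the complements of the surviving ">" branches, and the direction of the NumPurchDec test is an assumption:

```python
def classify(rev_to_pay_nov, cover_bal3, open_to_buy_dec, num_purch_dec):
    """Crisp decision-tree rules from the slide, applied in order."""
    if rev_to_pay_nov <= 11.441:        # Rule 1
        return "good"
    if cover_bal3 == 1:                 # Rule 2 (RevtoPayNov > 11.441 here)
        return "good"
    if open_to_buy_dec > 5.35129:       # Rule 3 (CoverBal3 = 0 here)
        return "good"
    if num_purch_dec <= 2.30259:        # Rule 4 (direction assumed)
        return "bad"
    return "good"                       # ELSE

print(classify(15.0, 0, 4.0, 2.0))  # bad
```

The fuzzy model on the next slide has the same structure; only the thresholds shift, reflecting the buffer See5 inserts at the boundaries.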

31
Example Fuzzy Model
  • RULE 1 IF RevtoPayNov <= 11.50565 THEN good
  • RULE 2 IF RevtoPayNov > 11.50565 AND
  • IF CoverBal3 = 1 THEN good
  • RULE 3 IF RevtoPayNov > 11.50565 AND
  • IF CoverBal3 = 0 AND
  • IF OpentoBuyDec > 5.351905 THEN good
  • RULE 4 IF RevtoPayNov > 11.50565 AND
  • IF CoverBal3 = 0 AND
  • IF OpentoBuyDec <= 5.351905 AND
  • IF NumPurchDec <= 2.64916 THEN bad
  • ELSE good

32
Rules
33
Error on 3000 test cases
34
Fuzzy Data Mining
  • Fuzzy set theory found in almost every area of
    data mining
  • Appropriate if
  • Large scale databases
  • Uncertain relationships
  • One approach
  • Partition data into categories to create fuzzy
    grids

35
Conclusions
  • Fuzzy representation very appropriate
  • Humans perceive a great deal of uncertainty
  • A number of ways to incorporate fuzzy ideas
  • Fuzzifying data loses some detail
  • Ordinal could yield more robust models
  • Not necessarily more accurate
  • Humans could guess direction wrong
  • More pruning will focus on more interesting rules
  • Regardless of whether fuzzy or not
  • Categorical data more robust in this set of tests
  • Ordinal data treatment should be even better
  • FUZZIFYING DATA DOES NOT SEEM TO MAKE MODELS LESS
    ACCURATE