1
Fuzzy-Rough Feature Significance for Fuzzy Decision Trees
  • Richard Jensen
  • Qiang Shen

Advanced Reasoning Group, Department of Computer Science,
The University of Wales, Aberystwyth
2
Outline
  • Utility of decision tree induction
  • Importance of attribute selection
  • Introduction of fuzzy-rough concepts
  • Evaluation of the fuzzy-rough metric
  • Results of F-ID3 vs FR-ID3
  • Conclusions

3
Decision Trees
  • Popular classification algorithm in data mining
    and machine learning
  • Fuzzy decision trees (FDTs) follow similar
    principles to crisp decision trees
  • FDTs allow greater flexibility
  • Partitioning of the instance space: attributes
    are selected to derive partitions
  • Hence, attribute selection is an important factor
    in decision tree quality

4
Fuzzy Decision Trees
  • Object membership
  • Traditionally, node membership is crisp: 0 or 1
  • Here, membership is any value in the range [0,1]
  • Calculated as the conjunction of membership degrees
    along the path to the node (see the sketch below)
  • Fuzzy tests
  • Carried out within nodes to determine the
    membership of feature values to fuzzy sets
  • Stopping criteria
  • Measure of feature significance
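
A minimal sketch of this path-membership computation, assuming the
min t-norm for the conjunction (a common choice, though others are
possible); the function name and example degrees are illustrative:

    def node_membership(path_memberships):
        """Membership of an object in a node: the conjunction (here the
        min t-norm) of its degrees in the fuzzy tests along the path
        from the root to that node."""
        degree = 1.0  # full membership at the root
        for m in path_memberships:
            degree = min(degree, m)  # conjunction via min
        return degree

    # An object matching the first test to 0.8 and the second to 0.6
    # belongs to the resulting node to degree 0.6.
    print(node_membership([0.8, 0.6]))  # -> 0.6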

5
Decision Tree Algorithm
  • Input: training set S and (optionally) maximum
    depth of the decision tree, l
  • Form the decision tree from the top level down
  • Do loop until
  • the depth of the tree reaches l, or
  • there is no node left to expand
  • a) Gauge the significance of each attribute of S
    not already expanded in this branch
  • b) Expand the attribute with the greatest
    significance
  • c) Stop expanding a leaf node if maximum
    significance is obtained
  • End do loop (a sketch of this loop follows below)
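
A compact Python rendering of this loop, sketched for the crisp case
(the fuzzy memberships are omitted for brevity, and the helper names
and data layout are assumptions, not from the slides); any
feature-significance measure can be plugged in:

    def build_tree(S, features, significance, depth=0, max_depth=None):
        """S is a list of (feature_dict, label) pairs."""
        labels = [y for _, y in S]
        # Stop: depth limit reached, no attributes left, or node is pure.
        if (max_depth is not None and depth >= max_depth) \
                or not features or len(set(labels)) == 1:
            return max(set(labels), key=labels.count)  # majority-label leaf
        # a) gauge the significance of each remaining attribute
        scores = {f: significance(f, S) for f in features}
        best = max(scores, key=scores.get)             # b) expand the best one
        # (step c, stopping early at maximal significance, is omitted here)
        children = {}
        for v in {x[best] for x, _ in S}:              # one branch per value
            subset = [(x, y) for x, y in S if x[best] == v]
            rest = [f for f in features if f != best]
            children[v] = build_tree(subset, rest, significance,
                                     depth + 1, max_depth)
        return (best, children)

    # Toy usage with a placeholder significance measure:
    data = [({"a": 0, "b": 1}, "yes"), ({"a": 1, "b": 1}, "no"),
            ({"a": 0, "b": 0}, "yes"), ({"a": 1, "b": 0}, "no")]
    sig = lambda f, S: len({x[f] for x, _ in S})  # illustrative only
    print(build_tree(data, ["a", "b"], sig))      # ('a', {0: 'yes', 1: 'no'})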

6
Feature Significance
  • Previous FDT inducers use fuzzy entropy
  • Little research in the area of alternatives
  • Fuzzy-rough feature significance has been used
    previously in feature selection with much success
  • This can also be used to gauge feature importance
    within FDT construction
  • The fuzzy-rough measure extends concepts from
    crisp rough set theory

7
Crisp Rough Sets
[Diagram: a set X with its lower approximation, upper approximation,
and the equivalence classes [x]B that generate them]
  • [x]B is the set of all points which are indiscernible
    from point x in terms of feature subset B
    (formal definitions below)
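
For reference, the standard crisp rough set definitions the diagram
depicts, written here in LaTeX (U denotes the universe of objects):

    \[ [x]_B = \{\, y \in U \mid \forall a \in B,\ a(x) = a(y) \,\} \]
    \[ \underline{B}X = \{\, x \mid [x]_B \subseteq X \,\}, \qquad
       \overline{B}X  = \{\, x \mid [x]_B \cap X \neq \emptyset \,\} \]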

8
Fuzzy Equivalence Classes
At the centre of Fuzzy-Rough Feature Selection
  • Incorporate vagueness
  • Handle real-valued data
  • Cope with noisy data

Image: Rough Fuzzy Hybridization: A New Trend in Decision Making,
S. K. Pal and A. Skowron (eds.), Springer-Verlag, Singapore, 1999
9
Fuzzy-Rough Significance
  • Deals with real-valued features via fuzzy sets
  • Fuzzy lower approximation
  • Fuzzy positive region
  • Evaluation function
  • Feature importance is estimated with this function
    (the definitions are reproduced below)
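
The slide's bullets name three formulas that appear to have been
figures in the original deck; the definitions standardly used in
Jensen and Shen's fuzzy-rough feature selection work (reproduced
here as an assumption about what the slide showed) are, for an
attribute subset P, decision attribute(s) Q, and universe U:

    % fuzzy lower approximation of a (fuzzy) set X
    \[ \mu_{\underline{P}X}(x) = \sup_{F \in U/P}
         \min\!\Big( \mu_F(x),\,
           \inf_{y \in U} \max\{\, 1 - \mu_F(y),\ \mu_X(y) \,\} \Big) \]
    % fuzzy positive region
    \[ \mu_{POS_P(Q)}(x) = \sup_{X \in U/Q} \mu_{\underline{P}X}(x) \]
    % evaluation (dependency) function used to rank features
    \[ \gamma'_P(Q) = \frac{\sum_{x \in U} \mu_{POS_P(Q)}(x)}{|U|} \]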

10
Evaluation
  • Is the γ' metric a useful gauge of feature
    significance?
  • The γ' metric is compared with leading feature
    rankers
  • Information Gain, Gain Ratio, χ², Relief, OneR
  • Applied to test data
  • 30 random feature values for 400 objects
  • 2 or 3 of the features determine the classification
  • Task: locate those features that affect the
    decision (a data-generation sketch follows below)
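
A sketch of how such test data can be generated (uniform sampling in
[0,1] and the seed are assumptions; the decision function is the
first test from the next slide):

    import random

    random.seed(0)  # arbitrary seed, for repeatability
    # 400 objects, each with 30 random feature values in [0,1]
    objects = [[random.random() for _ in range(30)] for _ in range(400)]

    def label(obj):
        x, y, z = obj[0], obj[1], obj[2]   # the three relevant features
        return int(x * y * z**2 > 0.125)   # decision rule from slide 11

    data = [(obj, label(obj)) for obj in objects]
    # A good significance measure should rank features 0, 1, 2 highest.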

11
Evaluation
  • Results for xyz² > 0.125
  • Results for (x y)³ < 0.125
  • FR, IG and GR perform best
  • FR metric locates the most important features

12
FDT Experiments
  • Fuzzy ID3 (F-ID3) compared with Fuzzy-Rough ID3
    (FR-ID3)
  • The only difference between the methods is the
    choice of feature significance measure
  • Datasets taken from the machine learning
    repository
  • Data split into two equal halves: training and
    testing
  • Resulting trees converted to equivalent rulesets

13
Results
  • Real-valued data
  • Average ruleset size
  • 56.7 for F-ID3
  • 88.6 for FR-ID3
  • F-ID3 performs marginally better than FR-ID3

14
Results
  • Crisp data
  • Average ruleset size
  • 30.2 for F-ID3
  • 28.8 for FR-ID3
  • FR-ID3 performs marginally better than F-ID3

15
Conclusion
  • Decision trees are a popular means of
    classification
  • The selection of branching attributes is key to
    resulting tree quality
  • The use of a fuzzy-rough metric for this purpose
    looks promising
  • Future work
  • Further experimental evaluation
  • Fuzzy-rough feature reduction pre-processor