Decision Trees with Minimal Costs (ICML 2004, Banff, Canada)
1
Decision Trees with Minimal Costs (ICML 2004, Banff, Canada)
  • Charles X. Ling, Univ of Western Ontario, Canada
  • Qiang Yang, HK UST, Hong Kong
  • Jianning Wang, Univ of Western Ontario, Canada
  • Shichao Zhang, UTS, Australia
  • Contact: cling@csd.uwo.ca

2
Outline
  • Introduction
  • Building Trees with Minimal Total Costs
  • Testing Strategies
  • Experiments and Results
  • Conclusions

3
Costs in Machine Learning
  • Most inductive learning algorithms aim to minimize
    classification errors
  • Different types of misclassification have different costs,
    e.g., false positives (FP) and false negatives (FN)
  • In this talk: test costs should also be considered
  • Cost-sensitive learning considers a variety of costs; see
    the survey by Peter Turney (2000)

4
Applications
  • Medical practice
  • Doctors may ask a patient to go through a number of tests
    (e.g., blood tests, X-rays)
  • Which of these new tests will bring the most value?
  • Biological experimental design
  • When testing a new drug, new tests are costly
  • Which experiments should be performed?

5
Previous Work
  • Many previous works consider the two types of cost
    separately, an obvious oversight
  • (Turney 1995): ICET uses a genetic algorithm to build trees
    that minimize the total cost
  • (Zubek and Dietterich 2002): a Markov Decision Process (MDP)
    that searches a state space for optimal policies
  • (Greiner et al. 2002): PAC learning

6
An Example of Our Problem
  • Training: values marked "?" cannot be obtained

| ID | Fever (C1) | X-ray (C2) | Blood_1 (C3) | Blood_2 (C4) | D   |
|----|------------|------------|--------------|--------------|-----|
| 12 | 101        | ?          | H            | ?            | Yes |
| 23 | ?          | L          | M            | L            | No  |

Goal 1: build a tree that minimizes the total cost.

  • Testing: many "?" values; each may be obtained at a cost

| ID | Fever (C1) | X-ray (C2) | Blood_1 (C3) | Blood_2 (C4) | D |
|----|------------|------------|--------------|--------------|---|
| 45 | 98         | ?          | ?            | ?            | ? |
| 58 | ?          | ?          | ?            | ?            | ? |

Goal 2: obtain test values at a cost so as to minimize the
total cost.
7
Outline
  • Introduction
  • Building Trees with Minimal Total Costs
  • Testing Strategies
  • Experiments and Results
  • Conclusions

8
Building Trees with Minimal Total Costs
  • Assumption: binary classes, with misclassification costs FP
    and FN
  • Goal: minimize the total cost
  • Total cost = misclassification cost + test cost
  • Previous work: information gain as the attribute selection
    criterion
  • This work needs a new attribute selection criterion

9
Attribute Selection Criterion (vs. C4.5)
  • Minimal total cost (C4.5: minimal entropy)
  • If growing the tree yields a smaller total cost, then choose
    the attribute with minimal total cost; else stop and form a
    leaf
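The criterion above can be sketched as follows (an illustration under assumptions, not the authors' code; the helper names `leaf_cost`, `split_total_cost`, `choose_attribute` and the candidate encoding are hypothetical): split only when the cheapest candidate split beats forming a leaf now.

```python
def leaf_cost(p, n, fp_cost, fn_cost):
    # Cost of stopping here: labeling positive misclassifies the
    # n negatives (n * FP); labeling negative misclassifies the
    # p positives (p * FN). Take the cheaper label.
    return min(n * fp_cost, p * fn_cost)

def split_total_cost(branches, test_cost, fp_cost, fn_cost):
    # branches: list of (p_i, n_i) example counts after the split.
    # Every example sent down a branch pays the test cost once;
    # each branch is then costed as if it became a leaf.
    tested = sum(p + n for p, n in branches)
    return tested * test_cost + sum(
        leaf_cost(p, n, fp_cost, fn_cost) for p, n in branches)

def choose_attribute(candidates, p, n, fp_cost, fn_cost):
    # candidates: dict attr -> (branches, test_cost).
    # Return the cheapest attribute, or None if a leaf is cheaper.
    no_split = leaf_cost(p, n, fp_cost, fn_cost)
    attr, (branches, c) = min(
        candidates.items(),
        key=lambda kv: split_total_cost(kv[1][0], kv[1][1],
                                        fp_cost, fn_cost))
    if split_total_cost(branches, c, fp_cost, fn_cost) < no_split:
        return attr
    return None
```

Note how a large enough test cost makes every split more expensive than the leaf, so the function returns None and the tree stops growing.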

10
  • Label the leaf according to minimal total cost
  • If P × FN ≥ N × FP, then class = positive; else
    class = negative
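As a one-line check (a sketch; `p` and `n` are the positive and negative counts reaching the leaf):

```python
def label_leaf(p, n, fp_cost, fn_cost):
    # Labeling the leaf negative would misclassify the p positives
    # (cost p * FN); labeling it positive would misclassify the n
    # negatives (cost n * FP). Pick the cheaper label, positive on ties.
    return "positive" if p * fn_cost >= n * fp_cost else "negative"
```

With unbalanced costs the minority class can win: a leaf with 1 positive and 10 negatives is still labeled positive when FN is 20 times FP.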

11
Difference on "?" Values
  • First, how to handle "?" values in the training data
  • Previous work built a "?" branch, which is problematic
  • This work deals with unknown values in the training set:
  • no branch for "?" is built,
  • such examples are gathered inside the internal nodes

12
A Tree Building Example

[Diagram: a node holding P positives and N negatives, of which
P0 + N0 have "?" for the candidate attribute A (test cost C);
branch 1 receives P1 + N1 examples (labeled positive), branch 2
receives P2 + N2 (labeled negative).]

If P0 × FN > N0 × FP, then total cost = N0 × FP if we do not
split further.

Total cost = total test cost + total misclassification cost
Total test cost = (P1 + N1 + P2 + N2) × C
Total misclassification cost = N1 × FP + P2 × FN + N0 × FP
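Plugging illustrative numbers into these formulas (all values hypothetical: P1=4, N1=1, P2=1, N2=4, P0=N0=1, C=2, FP=FN=10):

```python
P1, N1, P2, N2 = 4, 1, 1, 4    # examples reaching branches 1 and 2
P0, N0 = 1, 1                   # examples with '?' for attribute A
C, FP, FN = 2, 10, 10           # test cost and misclassification costs

total_test_cost = (P1 + N1 + P2 + N2) * C       # 10 examples * 2 = 20
total_mis_cost = N1 * FP + P2 * FN + N0 * FP    # 10 + 10 + 10 = 30
total_split = total_test_cost + total_mis_cost  # 50

# Not splitting: label all P+N examples with the cheaper class.
no_split = min((N1 + N2 + N0) * FP, (P1 + P2 + P0) * FN)  # min(60, 60)

print(total_split, no_split)    # splitting on A is cheaper here
```

Raise C to 3 and the split costs 60, so the node would become a leaf instead, which is exactly the behavior the next slide calls desirable.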
13
Desirable Properties
  • 1. Effect of difference between misclassification
    costs and the test costs

14
  • 2. Prefer attribute with smaller test costs

15
  • 3. If test cost increases, attribute tends to be
    pushed down and falls out of the tree

16
Outline
  • Introduction
  • Building Trees with Minimal Total Costs
  • Testing Strategies
  • Experiments and Results
  • Conclusions

17
Missing Values in Test Cases
A new patient arrives:

| Blood test | X-ray result | Urine test | S-test |
|------------|--------------|------------|--------|
| ?          | good         | ?          | ?      |
18
OST Intuition
  • Intuition: perform tests one at a time, in the order the
    tree asks for them, so each test result determines which
    test (if any) comes next

19
Four Testing Strategies
  • First Optimal Sequential Test (OST)(Simple
    batch test do all tests)
  • Second No test will be performed, predict with
    internal node
  • Third No test will be performed, predict with
    weighted sum of subtrees
  • Fourth A new tree is built dynamically for each
    test case using only the known attributes
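The first (OST) strategy might be sketched like this (assumptions: a dict-based tree with hypothetical keys `attr`, `cost`, `children`, `label`; `oracle` stands in for actually performing a medical test):

```python
def ost_predict(tree, case, oracle):
    """Optimal Sequential Test: walk the tree; whenever it asks for
    an attribute the case lacks, pay that node's test cost and
    obtain the value via oracle(attr). Returns (prediction, cost
    paid for tests)."""
    cost = 0
    node = tree
    while "label" not in node:          # internal node
        attr = node["attr"]
        if case.get(attr) is None:      # unknown: pay to test it
            case[attr] = oracle(attr)
            cost += node["cost"]
        node = node["children"][case[attr]]
    return node["label"], cost
```

Because the walk follows one root-to-leaf path, only the tests that path actually needs are ever paid for, unlike a naive do-all-tests batch.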

20
Six Testing Strategies (5 and 6)
  • Fifth: batch test. When the test case stops at its first
    unknown attribute, all the unknown values in that attribute's
    subtree are tested at once
  • Sixth: always test the first unknown attribute
  • Baseline: sequential test as in C4.5
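The fifth strategy could be sketched as follows (a hypothetical dict-based tree with assumed keys `attr`, `children`, `label`): walk down using known values, and at the first unknown attribute return the whole batch of tests appearing anywhere below that node.

```python
def subtree_attrs(node):
    # Collect every attribute tested anywhere in this subtree.
    if "label" in node:                  # leaf: nothing to test
        return set()
    acc = {node["attr"]}
    for child in node["children"].values():
        acc |= subtree_attrs(child)
    return acc

def batch_tests(tree, case):
    # Follow known values; at the first unknown attribute, order
    # the entire subtree's tests in one go (the "batch").
    node = tree
    while "label" not in node:
        if case.get(node["attr"]) is None:
            return subtree_attrs(node)
        node = node["children"][case[node["attr"]]]
    return set()   # reached a leaf without needing any test
```

The batch may include tests the final path never uses; that waste is the price of ordering everything at once, which motivates the "more intelligent batch test" item in the future work.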

21
Outline
  • Introduction
  • Building Trees with Minimal Total Costs
  • Testing Strategies
  • Experiments and Results
  • Conclusions

22
Experiment Settings
  • Five datasets, binary-class
  • 60/40 split for training/testing, repeated 5 times
  • Unknown values for training/test examples are selected
    randomly with a given probability
  • Also compared with the C4.5 tree, using OST for testing
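The random injection of unknown values might look like this (a sketch; the dict encoding of examples and the reserved `class` key are assumptions):

```python
import random

def mask_unknown(example, p, rng=None):
    # Independently replace each attribute value with None (i.e. '?')
    # with probability p; the class label is never masked.
    rng = rng or random.Random(0)   # fixed seed for reproducibility
    return {k: (None if k != "class" and rng.random() < p else v)
            for k, v in example.items()}
```

Running this over both splits with p in {0.2, 0.4, ...} would produce the varying-percentage-of-unknowns conditions the result slides compare.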

23
Results with Different % of Unknown Values
[Chart omitted; legend includes "No test" and "distributed".]
  • OST is best; M4 and C4.5 are next; M3 is worst
  • OST's cost does not increase with more "?" values; the
    others' do, overall

24
Results with Different Test Costs
[Chart omitted; legend includes "No test" and "distributed".]
  • With large test costs, OST, M2, M3, and M4 perform similarly
  • C4.5 is much worse (its tree building is cost-insensitive)

25
Results with Unbalanced Class Costs
  • With large test costs, OST, M2, and M4 perform similarly
  • C4.5 is much worse (its tree building is cost-insensitive)
  • M3 is worse than M2 (M3 is the strategy used in C4.5)

26
Comparing OST with C4.5 across 6 Datasets
  • OST always outperforms C4.5

27
Outline
  • Introduction
  • Building Trees with Minimal Total Costs
  • Testing Strategies
  • Experiments and Results
  • Conclusions

28
Conclusions
  • New tree building algorithm for minimal costs
  • Desirable properties
  • Computationally efficient (similar to C4.5)
  • Test strategies (OST and batch) are very
    effective
  • Can solve many real-world diagnosis problems

29
Future Work
  • More intelligent batch test methods
  • Consider the cost of an additional batch test
  • Optimal sequential batch test: batch 1 = (test 1, test 2);
    batch 2 = (test 3, test 4, test 5); ...
  • Other learning algorithms with minimal total cost
  • A wrapper that works for any black box