Title: Search Strategies for Ensemble Feature Selection in Medical Diagnostics
1. Search Strategies for Ensemble Feature Selection in Medical Diagnostics
- Alexey Tsymbal, Pádraig Cunningham, Department of Computer Science, Trinity College Dublin, Ireland
- Mykola Pechenizkiy, Seppo Puuronen, Department of Computer Science, University of Jyväskylä, Finland
2. Contents
- Introduction: the task of classification
- Classification of acute abdominal pain
- Ensembles of classifiers and feature selection; ensemble feature selection
- Simple Bayes ensembles: pro et contra
- HC, EFSS, EBSS, and GEFS strategies
- Experimental results: appendicitis
- Conclusions and future work
3. The task of classification
- Given: J classes, n training observations, p instance attributes
- Task: using the training set, determine the class membership of a new instance to be classified
- Examples: prognostics of recurrence of breast cancer, diagnosis of thyroid diseases, heart attack prediction, etc.
4Classification of acute abdominal pain
- 3 large datasets with cases of acute abdominal
pain (AAP) 1254, 2286, and 4020 instances, and
18 parameters (features) from history-taking and
clinical examination - the task of separating acute appendicitis
- the second most important cause of abdominal
surgeries - AAP I from 6 surgical departments in Germany,
AAP II from 14 centers in Germany, and AAP III
from 16 centers in Central and Eastern Europe - the 18 features are standardized by the World
Organization of Gastroenterology (OMGE)
Features 1 Sex 2 Age 3 Progress of pain 4
Duration of pain 5 Type of pain 6 Severity of
pain 7 Location of pain at present 8 Location of
pain at onset 9 Previous similar complaints 10
Previous abdominal operation 11 Distended
abdomen 12 Tenderness 13 Severity of
tenderness 14 Movement of abdominal wall 15
Rigidity 16 Rectal tenderness 17 Rebound
tenderness 18 Leukocytes
The data sets for research were kindly provided
by the Laboratory for System Design, Faculty
of Electrical Engineering and Computer Science,
University of Maribor, Slovenia, and the
Theoretical Surgery Unit, Department of General
and Trauma Surgery, Heinrich-Heine University,
Düsseldorf, Germany
5. Ensemble classification
6. Ensemble feature selection
- How to prepare inputs for generation of the base classifiers?
  - Sampling the training set
  - Manipulation of input features
  - Manipulation of output targets (class values)
- Goal of traditional feature selection: find and remove features that are unhelpful or destructive to learning, producing one feature subset for a single classifier
- Goal of ensemble feature selection: find and remove features that are unhelpful or destructive to learning, producing different feature subsets for a number of classifiers, and find feature subsets that promote disagreement between the classifiers
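The disagreement the slide aims for can be quantified directly. A minimal sketch (the function name and example values are illustrative, not from the talk):

```python
# Minimal sketch: pairwise disagreement (diversity) between two base
# classifiers, measured as the fraction of test instances on which
# their predictions differ.

def disagreement(preds_a, preds_b):
    """Fraction of instances where the two classifiers disagree."""
    assert len(preds_a) == len(preds_b)
    diffs = sum(1 for a, b in zip(preds_a, preds_b) if a != b)
    return diffs / len(preds_a)

# Two classifiers trained on different feature subsets typically err
# on different instances, yielding non-zero diversity:
print(disagreement([1, 0, 1, 1], [1, 1, 1, 0]))  # 0.5
```

Feature subsets that raise this pairwise measure across the ensemble are exactly what ensemble feature selection searches for.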
7. Simple Bayesian classification
- Bayes theorem: P(C|X) = P(X|C) P(C) / P(X)
- Naïve assumption of attribute independence: P(x1,…,xk|C) = P(x1|C) ··· P(xk|C)
- If the i-th attribute is categorical, P(xi|C) is estimated as the relative frequency of samples having value xi as the i-th attribute in class C
- If the i-th attribute is continuous, P(xi|C) is estimated through a Gaussian density function
- Computationally easy in both cases
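For the continuous case, the estimation above can be sketched in a few lines. This is an illustrative implementation of Gaussian naive Bayes, not the code used in the talk:

```python
import math
from collections import defaultdict

# Gaussian naive Bayes: P(C|X) ∝ P(C) * Π_i P(x_i|C), with each
# P(x_i|C) estimated by a per-class Gaussian density.

def fit(X, y):
    """Estimate class priors, and per-attribute means/variances per class."""
    model = {}
    by_class = defaultdict(list)
    for xs, c in zip(X, y):
        by_class[c].append(xs)
    for c, rows in by_class.items():
        n = len(rows)
        means = [sum(col) / n for col in zip(*rows)]
        # a small floor keeps every variance strictly positive
        vars_ = [max(sum((v - m) ** 2 for v in col) / n, 1e-9)
                 for col, m in zip(zip(*rows), means)]
        model[c] = (n / len(y), means, vars_)
    return model

def predict(model, xs):
    """Pick the class maximizing the log-posterior."""
    best_c, best_lp = None, -math.inf
    for c, (prior, means, vars_) in model.items():
        lp = math.log(prior)
        for x, m, v in zip(xs, means, vars_):
            lp += -0.5 * math.log(2 * math.pi * v) - (x - m) ** 2 / (2 * v)
        if lp > best_lp:
            best_c, best_lp = c, lp
    return best_c

model = fit([[1.0], [1.2], [5.0], [5.2]], [0, 0, 1, 1])
print(predict(model, [1.1]))  # 0
```

Fitting reduces to counting and averaging, which is why the slide calls it computationally easy.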
8. Bayesian ensembles: pro et contra
CONTRA
- naïve feature-independence assumption
- extremely stable algorithm, so diverse base classifiers are hard to obtain
PRO
- simplicity, speed, interpretability
- SB is optimal even when the independence assumption is violated (in theory and experiments)
- can be effectively used with boosting (bias reduction)
- feature selection reduces the error of the naïve assumption
9. Integration of classifiers
- Integration divides into selection and combination, each either static or dynamic:
  - Static selection (CVM)
  - Dynamic Selection (DS)
  - Voting-type combination: Weighted Voting (WV), Dynamic Voting (DV), Dynamic Voting with Selection (DVS)
  - Meta-type combination
Motivation for the dynamic integration: the main assumption is that each classifier is best in some sub-areas of the whole data set, where its local error is comparatively less than the corresponding errors of the other classifiers.
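The simplest combination scheme above, weighted voting, can be sketched as follows. The weights are assumed here to be validation accuracies; dynamic integration would instead derive them from local, instance-specific errors. Names and values are illustrative:

```python
from collections import defaultdict

# Static weighted voting (WV): each base classifier casts a vote for a
# class, weighted e.g. by its validation accuracy; the class with the
# highest total weight wins.

def weighted_vote(predictions, weights):
    """predictions[i] is classifier i's class vote; weights[i] its weight."""
    score = defaultdict(float)
    for p, w in zip(predictions, weights):
        score[p] += w
    return max(score, key=score.get)

# Three weaker voters for class 1 outweigh one stronger voter for class 0:
print(weighted_vote([0, 1, 1, 1], [0.9, 0.4, 0.4, 0.4]))  # 1
```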
10. Search in EFS
- Search space: 2^NumOfFeatures × NumOfClassifiers = 2^18 × 25 = 6,553,600
- 4 search strategies to heuristically explore the search space:
  - Hill-Climbing (HC) (CBMS 2002)
  - Ensemble Forward Sequential Selection (EFSS)
  - Ensemble Backward Sequential Selection (EBSS)
  - Genetic Ensemble Feature Selection (GEFS)
11. Hill-Climbing (HC) strategy (CBMS 2002)
- Generation of initial feature subsets using the random subspace method (RSM)
- A number of refining passes on each feature subset while there is improvement in fitness
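The two steps above can be sketched for a single base classifier's subset. The fitness function here is a toy stand-in (an assumption, not the talk's fitness, which combines accuracy and diversity):

```python
import random

# Hill-climbing over one feature subset: random subspace initialization,
# then refining passes that flip each feature's membership and keep the
# flip only if fitness improves.

def hill_climb(n_features, fitness, rng):
    # RSM initialization: each feature included with probability 0.5
    subset = [rng.random() < 0.5 for _ in range(n_features)]
    best = fitness(subset)
    improved = True
    while improved:                        # refining passes
        improved = False
        for i in range(n_features):
            subset[i] = not subset[i]      # tentatively flip feature i
            score = fitness(subset)
            if score > best:
                best = score               # keep the flip
                improved = True
            else:
                subset[i] = not subset[i]  # undo the flip
    return subset, best

# Toy fitness: reward inclusion of "relevant" features 0, 3 and 5
relevant = {0, 3, 5}
fit = lambda s: sum(1 if i in relevant else -1
                    for i, inc in enumerate(s) if inc)
subset, score = hill_climb(8, fit, random.Random(42))
print(score)  # 3: exactly the relevant features are kept
```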
12. Ensemble Forward Sequential Selection (EFSS)
- forward selection of features for each base classifier, starting from the empty subset
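Forward sequential selection for one base classifier can be sketched as below; in EFSS this greedy loop runs once per base classifier, each time with a fitness that also rewards diversity. The fitness shown is an illustrative assumption:

```python
# Forward sequential selection: start from the empty subset and greedily
# add the single feature that most improves fitness, stopping when no
# addition helps.

def forward_select(n_features, fitness):
    subset = set()
    best = fitness(subset)
    while True:
        candidates = [(fitness(subset | {i}), i)
                      for i in range(n_features) if i not in subset]
        if not candidates:
            break
        score, i = max(candidates)
        if score <= best:        # no remaining feature improves fitness
            break
        subset.add(i)
        best = score
    return subset, best

relevant = {1, 4}                # toy target: features 1 and 4 are useful
fit = lambda s: len(s & relevant) - 0.1 * len(s - relevant)
subset, score = forward_select(6, fit)
print(sorted(subset))  # [1, 4]
```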
13. Ensemble Backward Sequential Selection (EBSS)
- backward elimination of features for each base classifier, starting from the full feature set
14. Genetic Ensemble Feature Selection (GEFS)
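The genetic search can be sketched as follows: individuals are bit-strings (one bit per feature), evolved by fitness-based selection, one-point crossover and bit-flip mutation. Population size, generation count, mutation rate and the fitness function are all illustrative assumptions, not the talk's settings:

```python
import random

# Genetic search over feature subsets encoded as boolean bit-strings.

def gefs(n_features, fitness, pop_size=20, n_gen=30, p_mut=0.05, seed=1):
    rng = random.Random(seed)
    pop = [[rng.random() < 0.5 for _ in range(n_features)]
           for _ in range(pop_size)]
    for _ in range(n_gen):
        pop.sort(key=fitness, reverse=True)
        parents = pop[:pop_size // 2]           # truncation selection (elitist)
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_features)  # one-point crossover
            child = a[:cut] + b[cut:]
            # bit-flip mutation with probability p_mut per bit
            child = [not bit if rng.random() < p_mut else bit
                     for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)

# Toy fitness: reward "relevant" features, penalize the rest
relevant = {0, 2, 7}
fit = lambda s: sum(1 if i in relevant else -1
                    for i, inc in enumerate(s) if inc)
best = gefs(10, fit)
```

Keeping the parents each generation makes the search elitist, so the best subset found never degrades across generations.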
15Computational complexity
EFSS and EBSS where S is the number of base
classifiers, N is the total number of features,
and N is the number of features included or
deleted on average in an FSS or BSS search.
Example EFSS 251831350 (and not 6 553
600!) HC where Npasses is the average number of
passes through the feature subsets in HC until
there is some improvement. GEFS where S is
the number of individuals (feature subsets) in
one generation, and Ngen is the number of
generations.
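The gap between exhaustive and greedy search on this slide is easy to verify numerically:

```python
# Numbers from the slide: S = 25 base classifiers, N = 18 features,
# N_bar = 3 features added on average in an FSS search.
S, N, N_bar = 25, 18, 3

exhaustive = 2 ** N * S    # full search space of (subset, classifier) pairs
efss = S * N * N_bar       # greedy EFSS fitness evaluations

print(exhaustive)  # 6553600
print(efss)        # 1350
```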
16. An example: EFSS on AAP III, alpha = 4
- C1: f2 (age), f7 (location of pain at present)
- C2: f6 (severity of pain), f13 (severity of tenderness)
- C3: f6 (severity of pain), f13 (severity of tenderness)
- C4: f9 (previous similar complaints), f14 (movement of abdominal wall)
- C5: f3 (progress of pain), f15 (rigidity)
- C6: f2 (age), f16 (rectal tenderness)
- C7: f1 (sex), f12 (tenderness)
- C8: f4 (duration of pain), f18 (leukocytes)
- C9: f4 (duration of pain)
- C10: f11 (distended abdomen)
17. Experiments with the AAP data sets
- HC, EFSS, EBSS, and GEFS strategies
- integration: three DI variations (DS, DV, and DVS), weighted voting (WV), and static selection (SS)
- 30 test runs of Monte-Carlo cross-validation
- collected characteristics: classification accuracy, sensitivity, specificity, relative number of features in the base classifiers, total ensemble diversity, and ensemble coverage
- the test environment is implemented within the MLC++ framework (the Machine Learning library in C++)
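The evaluation protocol above can be sketched as follows: repeat random train/test splits and average the collected measures. Sensitivity and specificity are shown for a binary task (1 = appendicitis); the trivial threshold "classifier" is a placeholder assumption standing in for a trained ensemble:

```python
import random

# Monte-Carlo cross-validation: R random train/test splits, averaging
# sensitivity and specificity over the runs.

def mc_cv(data, labels, train_frac=0.7, runs=30, seed=0):
    rng = random.Random(seed)
    sens, spec = [], []
    for _ in range(runs):
        idx = list(range(len(data)))
        rng.shuffle(idx)
        cut = int(train_frac * len(idx))
        test = idx[cut:]
        # placeholder "classifier": predicts 1 when the single feature > 0.5
        preds = {i: int(data[i][0] > 0.5) for i in test}
        tp = sum(preds[i] == 1 and labels[i] == 1 for i in test)
        tn = sum(preds[i] == 0 and labels[i] == 0 for i in test)
        fn = sum(preds[i] == 0 and labels[i] == 1 for i in test)
        fp = sum(preds[i] == 1 and labels[i] == 0 for i in test)
        sens.append(tp / (tp + fn) if tp + fn else 1.0)
        spec.append(tn / (tn + fp) if tn + fp else 1.0)
    return sum(sens) / runs, sum(spec) / runs

# Perfectly separable toy data: both averages come out 1.0
data = [[0.1]] * 50 + [[0.9]] * 50
labels = [0] * 50 + [1] * 50
print(mc_cv(data, labels))  # (1.0, 1.0)
```

Unlike k-fold cross-validation, the random splits may overlap, which is why a fixed number of runs (30 here, as on the slide) is specified explicitly.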
18. Experimental results
19. Feature importance table (EFSS, alpha = 0)
20. Conclusions
- 4 new strategies proposed and analyzed
- EFSS is the best strategy (only for this domain!)
- the best previously published specificity and sensitivity were achieved
- only 7 features in each classifier on average (less than 3)
- importance of the features was analyzed
21. Future work
- collaboration with medical experts is needed to analyze the results obtained
- other medical domains, especially those including many features with complex inter-feature dependencies
- other search strategies (beam search, simulated annealing, etc.)
- better tuning of the GEFS parameters
22. Contact info
- Alexey Tsymbal
- Dept. of Computer Science, Trinity College Dublin, Ireland
- Alexey.Tsymbal_at_cs.tcd.ie