Greedy Feature Grouping for Optimal Discriminant Subspaces

1
Greedy Feature Grouping for Optimal Discriminant Subspaces

Mahesan Niranjan
Department of Computer Science
The University of Sheffield
European Bioinformatics Institute

2
Overview

Motivation
Feature Selection
Feature Grouping Algorithm
Simulations
Synthetic Data
Gene Expression Data
Conclusions and Future

3
Motivation

Many new high dimensional problems
Language processing
Synthetic chemical molecules
High throughput experiments in genomics
Discriminant information may well lie in a small subspace
Better classifiers
Better interpretation of classifier

4
Curse of dimensionality

Density estimation in high dimensions is difficult

5
Support Vector Machines

Classification, not density estimation

6
Support Vector Machines: Nonlinear Kernel Functions

7
Classifier design

Usually to minimize error rate
Error rates can be misleading
Large imbalance in classes
Cost of misclassification can change
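
As a small illustration of why error rate alone can mislead under class imbalance, here is a minimal sketch with made-up numbers. The 950/50 class split and the always-predict-benign classifier are assumptions chosen for the example, not data from the talk.

```python
import numpy as np

# Hypothetical, made-up labels: 950 benign (0) and 50 adverse (1) cases.
y_true = np.array([0] * 950 + [1] * 50)

# A trivial "classifier" that always predicts the majority (benign) class.
y_pred = np.zeros_like(y_true)

# Fraction of all cases that are misclassified.
error_rate = np.mean(y_pred != y_true)

# Fraction of adverse cases that are correctly detected.
true_positive_rate = np.mean(y_pred[y_true == 1] == 1)

print(f"error rate: {error_rate:.3f}")                  # 0.050
print(f"true positive rate: {true_positive_rate:.3f}")  # 0.000
```

The error rate looks good, yet no adverse case is ever detected, which is one reason to look at the full trade-off between true and false positives instead of a single error figure.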

8
[Figure: data points (x) labelled Adverse Outcome and Benign Outcome, with a Class Boundary and a Threshold]

9
Area under the ROC Curve: Neat Statistical Interpretation
[Figure: ROC curve; axes: True Positive, False Positive]

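
The statistical interpretation usually meant here is that the area under the ROC curve equals the probability that a randomly chosen positive example scores higher than a randomly chosen negative one (the Mann-Whitney statistic). The sketch below checks this numerically on toy scores. The function names and the data are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def auc_pairwise(scores_pos, scores_neg):
    """P(random positive scores above random negative); ties count one half."""
    pos = np.asarray(scores_pos, dtype=float)[:, None]
    neg = np.asarray(scores_neg, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def auc_trapezoid(scores_pos, scores_neg):
    """Area under the empirical ROC curve, integrated by the trapezoid rule."""
    pos = np.asarray(scores_pos, dtype=float)
    neg = np.asarray(scores_neg, dtype=float)
    # Sweep the decision threshold from the highest score to the lowest.
    thresholds = np.unique(np.concatenate([pos, neg]))[::-1]
    tpr = np.array([0.0] + [np.mean(pos >= t) for t in thresholds])
    fpr = np.array([0.0] + [np.mean(neg >= t) for t in thresholds])
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy scores (made up): higher score means "more likely positive".
pos_scores = [0.9, 0.8, 0.4]
neg_scores = [0.7, 0.3, 0.2]
print(auc_pairwise(pos_scores, neg_scores))   # 8 of 9 pairs are ordered correctly
print(auc_trapezoid(pos_scores, neg_scores))  # same value from the ROC curve
```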
10
Convex Hull of ROC Curves
[Figure: ROC curves and their convex hull; axes: True Positive, False Positive]
Provost & Fawcett; Scott, Niranjan & Prager

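
A minimal sketch of how the convex hull of several ROC curves could be computed, assuming each classifier contributes a set of (false positive rate, true positive rate) operating points. The function names and the example points are hypothetical. This is a generic upper-hull routine, not the implementation used by the cited authors.

```python
def _cross(o, a, b):
    """z-component of (a - o) x (b - o); negative means a clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def roc_convex_hull(points):
    """Upper convex hull of (fpr, tpr) points, always including (0,0) and (1,1)."""
    pts = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    hull = []
    for p in pts:
        # Drop the last hull point while it lies on or below the chord
        # from the point before it to the new point p.
        while len(hull) >= 2 and _cross(hull[-2], hull[-1], p) >= 0:
            hull.pop()
        hull.append(p)
    return hull

def area_under(hull):
    """Trapezoid-rule area under a piecewise-linear curve of (fpr, tpr) points."""
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(hull[:-1], hull[1:]))

# Operating points of two hypothetical classifiers (made-up numbers).
classifier_a = [(0.1, 0.5), (0.4, 0.7)]
classifier_b = [(0.2, 0.6), (0.3, 0.9)]

hull = roc_convex_hull(classifier_a + classifier_b)
print(hull)              # vertices that survive on the hull
print(area_under(hull))  # area under the convex hull
```

Points on the hull may come from different classifiers, so the hull describes the best achievable trade-off when one is free to pick a different classifier at each operating point.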
11
Feature selection in classification

Filters
select subset that scores high
Wrappers
Sequential Forward Selection / Backward deletion
PARCEL
Scott, Niranjan & Prager: uses convex hulls of ROC curves

12
PARCEL: Feature subset selection

Area under Convex Hull of multiple ROCs
Different classifier architectures (including different features) at different operating points
Has been put to good use in independent implementations
Oxford, UCL, Surrey
Sheffield Speech Group

13
Gene Expression Microarrays

14
Inference problems in Microarray Data

Clustering
Similar expression patterns might imply
similar function
regulated in the same way
e.g. activated by the same transcription factor, concentration maintained by the same mechanism, etc.
Classification
diagnostics - e.g. disease / not
prediction - e.g. survival

? discrimination with features that do cluster
15
Subspaces of gene expressions

Singular Value Decomposition (SVD)
Robust SVD for missing values & outliers
Combining different datasets
Pseudo-inverse Projection
Generalized SVD

Eigenarrays & Eigengenes
Alter, Brown & Botstein, PNAS, 2000; Alter & Golub, PNAS, 2004

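
A minimal sketch of the basic SVD step on an expression matrix, using random numbers as a stand-in for real data. The matrix size, the choice of k and the variable names are assumptions for illustration. With genes as rows and arrays as columns, the columns of U play the role of eigenarrays and the rows of Vt the role of eigengenes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for expression data: rows are genes, columns are arrays.
n_genes, n_arrays = 200, 12
X = rng.normal(size=(n_genes, n_arrays))

# Thin SVD: X = U @ diag(s) @ Vt.
U, s, Vt = np.linalg.svd(X, full_matrices=False)

# Fraction of the total squared expression captured by each component.
fraction = s**2 / np.sum(s**2)

# Project onto the subspace spanned by the k leading components.
k = 3
X_k = (U[:, :k] * s[:k]) @ Vt[:k, :]

print(U.shape, Vt.shape)           # (200, 12) (12, 12)
print(np.round(fraction[:k], 3))   # share of the k leading components
```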
16
Yeast Gene Classification
(Switch to MATLAB here)

2000 yeast genes, 79 experiments
Ribosome (125) / Not (1750)
First use of SVM: Brown et al., PNAS 1999

17
Discriminant Subspaces

18
Seemingly similar models

Product of Experts (Hinton)
Modular Mixture Model (Attias)
mixture model in subspaces
Combined by hidden nodes

Full feature set
None of these search for combinations of features

19
Algorithm

Select M
Initial Assignment -- one feature per group (at random / domain knowledge)
Sequential search through remaining features -- which feature, which group -- maximize average AUROC / Sum of Fisher Ratios
Stopping criterion

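
The steps on this slide can be sketched roughly as follows. This is one reading of the outline, not the author's code. It takes M to be the number of groups, uses the sum of two-class Fisher ratios as the score, initialises at random, and stops when the best gain falls below a threshold. The names `fisher_ratio`, `greedy_feature_grouping`, `min_gain` and `ridge` are hypothetical, and the stopping rule is an assumption because the slide does not specify one.

```python
import numpy as np

def fisher_ratio(X, y, features, ridge=1e-6):
    """Two-class Fisher ratio d^T Sw^{-1} d restricted to a feature subset."""
    A = X[y == 0][:, features]
    B = X[y == 1][:, features]
    d = A.mean(axis=0) - B.mean(axis=0)            # separation of means
    Sw = (np.atleast_2d(np.cov(A, rowvar=False))   # within-class scatter
          + np.atleast_2d(np.cov(B, rowvar=False)))
    Sw = Sw + ridge * np.eye(len(features))        # small ridge for stability
    return float(d @ np.linalg.solve(Sw, d))

def greedy_feature_grouping(X, y, n_groups, min_gain=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    # Initial assignment: one randomly chosen feature per group.
    initial = [int(f) for f in rng.choice(n_features, size=n_groups, replace=False)]
    groups = [[f] for f in initial]
    scores = [fisher_ratio(X, y, g) for g in groups]
    remaining = set(range(n_features)) - set(initial)

    while remaining:
        best = None  # (gain, feature, group index, new group score)
        # Sequential search: which feature, into which group?
        for f in sorted(remaining):
            for gi, g in enumerate(groups):
                new_score = fisher_ratio(X, y, g + [f])
                gain = new_score - scores[gi]
                if best is None or gain > best[0]:
                    best = (gain, f, gi, new_score)
        gain, f, gi, new_score = best
        if gain < min_gain:  # assumed stopping criterion
            break
        groups[gi].append(f)
        scores[gi] = new_score
        remaining.remove(f)
    return groups, scores

# Toy data (made up): 8 features, only features 0 and 1 carry class information.
rng = np.random.default_rng(1)
y = np.repeat([0, 1], 100)
X = rng.normal(size=(200, 8))
X[y == 1, 0] += 1.0
X[y == 1, 1] += 1.0

groups, scores = greedy_feature_grouping(X, y, n_groups=2)
print(groups, sum(scores))
```

Because only one group changes per step, maximising the gain of the chosen group is the same as maximising the sum of Fisher ratios over all groups. Swapping `fisher_ratio` for a cross-validated AUROC estimate would give the other criterion named on the slide.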
20
Another view

Within Class Scatter
Separation of Means

21
Another view

22
Block diagonal scatter matrix

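
One way to read the block-diagonal view, written out as a short derivation. The notation is an assumption, since the slide gives only the title: $\mathbf{m}_1, \mathbf{m}_2$ are the class means, $S_W$ is the within-class scatter matrix, and $S^{(g)}$ and $\mathbf{d}_g$ are the scatter block and the mean-difference sub-vector belonging to feature group $g$ of $M$.

```latex
\[
J = (\mathbf{m}_1 - \mathbf{m}_2)^{\top} S_W^{-1} (\mathbf{m}_1 - \mathbf{m}_2),
\qquad
S_W =
\begin{pmatrix}
S^{(1)} &        &         \\
        & \ddots &         \\
        &        & S^{(M)}
\end{pmatrix}
\;\Longrightarrow\;
J = \sum_{g=1}^{M} \mathbf{d}_g^{\top} \bigl(S^{(g)}\bigr)^{-1} \mathbf{d}_g .
\]
```

The inverse of a block-diagonal matrix is block diagonal with the inverted blocks, so the Fisher criterion splits into one term per group. Under this reading, assuming a block-diagonal within-class scatter is equivalent to scoring a grouping by its sum of per-group Fisher ratios.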
23
Simulations