Recursive Partitioning for Tumor Classification with Gene Expression Microarray Data

1
Recursive Partitioning for Tumor Classification
with Gene Expression Microarray Data
  • Heping Zhang, Chang-Yung Yu,
  • Burton Singer, Momian Xiong
  • Presented by Weihua Huang

2
Data Used in the Article
Expression profiles of 2,000 genes were measured with an
Affymetrix oligonucleotide array in 22 normal and
40 colon cancer tissues. The response is binary,
indicating normal or cancer tissue, and the
predictor variables are the 2,000 genes.
3
Classification Tree Using Recursive Partitioning
Goal: To partition the feature space into
disjoint regions by growing a tree, so that the
tissues in the same region are homogeneous in terms
of response.
Algorithm: Start with a root node
containing the study sample and split it into
smaller and smaller nodes according to whether a
particular selected predictor is above a chosen
cutoff value. At each splitting step, the
selected predictor and its corresponding cutoff
are chosen to maximize the reduction in node
impurity,
$\Delta I = P(A)\,I(A) - P(A_L)\,I(A_L) - P(A_R)\,I(A_R)$,
where A is the node being split, A_L and A_R are
its left and right daughter nodes, I is the node
impurity, and P is the probability of a tissue
falling into the node.
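The splitting step can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes entropy as the impurity, treats the node being split as the whole sample (so P(A) = 1 and the daughter probabilities are the fractions of tissues sent left and right), and tries midpoints between consecutive observed values as candidate cutoffs. The names `entropy` and `best_cutoff` and the toy values are hypothetical.

```python
import math

def entropy(labels):
    """Entropy impurity of a node; labels are 1 (normal) or 0 (cancer)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    if p == 0.0 or p == 1.0:
        return 0.0  # a pure node has zero impurity
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

def best_cutoff(values, labels):
    """Return (cutoff, reduction) maximizing the impurity reduction
    for a single predictor, relative to the node being split."""
    n = len(values)
    parent = entropy(labels)
    best = (None, 0.0)
    xs = sorted(set(values))
    # Candidate cutoffs: midpoints between consecutive distinct values.
    for lo, hi in zip(xs, xs[1:]):
        c = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= c]
        right = [y for x, y in zip(values, labels) if x > c]
        reduction = (parent
                     - len(left) / n * entropy(left)
                     - len(right) / n * entropy(right))
        if reduction > best[1]:
            best = (c, reduction)
    return best

# Made-up toy expression values for one gene (not from the colon data).
values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
labels = [1, 1, 1, 0, 0, 0]
cutoff, gain = best_cutoff(values, labels)
print(cutoff, gain)
```

To choose among many genes, the same search would be run once per predictor and the gene with the largest reduction kept for the split.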
4
Classification Tree Using Recursive Partitioning
Node impurity: One example of a node impurity
measure is the entropy function
$-P \log(P) - (1-P) \log(1-P)$, where P is
the probability of a tissue being normal within
the node.
  • Minimum impurity (0): all tissues within the
    node are of the same type (P = 0 or 1)
  • Maximum impurity (log 2): half of the tissues
    within the node are normal and half are cancer
    (P = 0.5)
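As a quick numerical check of the two extremes, the entropy can be evaluated directly. The sketch below uses the natural logarithm and a hypothetical helper name, `entropy_impurity`; the last line evaluates the impurity of a node holding all 22 normal and 40 cancer tissues, which is an illustration rather than a figure quoted from the article.

```python
import math

def entropy_impurity(p):
    """Entropy impurity for P = proportion of normal tissues in a node."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # p*log(p) tends to 0 as p approaches 0
    return -p * math.log(p) - (1 - p) * math.log(1 - p)

pure = entropy_impurity(1.0)       # all tissues of one type
balanced = entropy_impurity(0.5)   # half normal, half cancer
root = entropy_impurity(22 / 62)   # 22 normal among 62 tissues
print(pure, balanced, root)
```

The balanced node reaches log 2 (about 0.693), and the 22-versus-40 node sits slightly below that maximum.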

5
Results From Classification Tree on the Data
Fig. 1. Classification tree for tissue types by using
expression data from three genes (M26383,
R15447, M28214).
6
Another Way to Visualize the Recursive
Partitioning
Fig. 3. A scatterplot of expression
data from R15447 and M28214 for a subset of
tissues (node 3 in Fig. 1).
7
Results from Recursive Partitioning
  • Quality of the tree-based classification
  • Using localized 5-fold cross-validation error
    rate
  • The same genes are kept at the same nodes
  • Randomly divide the 40 cancer tissues into 5
    subsamples of 8, and the 22 normal tissues into 5
    subsamples of 4, 4, 4, 5, and 5. Four subsamples
    each from the cancer and normal tissues were
    used to choose the cutoff values for the three
    splits. The remaining samples were used to count
    the misclassified tissues as a result of the new
    cutoff values.
  • The error rate is between 6% and 8% from two runs of
    cross-validation, which is much better than that
    obtained by existing analyses.
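The fold construction described on this slide can be sketched as follows. This is an illustrative assumption about how the random division might be done, not the authors' code: `make_folds` is a hypothetical helper, tissues are identified only by index within their class, and the step that re-estimates the three cutoffs on each training set is left out.

```python
import random

def make_folds(n_items, k, rng):
    """Shuffle indices 0..n_items-1 and deal them into k near-equal folds."""
    idx = list(range(n_items))
    rng.shuffle(idx)
    return [idx[i::k] for i in range(k)]

rng = random.Random(0)                 # fixed seed so the run is repeatable
cancer_folds = make_folds(40, 5, rng)  # five subsamples of 8
normal_folds = make_folds(22, 5, rng)  # subsamples of 5, 5, 4, 4, 4

splits = []
for held_out in range(5):
    # The held-out subsample of each class is used to count errors.
    test = ([("cancer", i) for i in cancer_folds[held_out]]
            + [("normal", i) for i in normal_folds[held_out]])
    # The other four subsamples of each class are used to choose cutoffs.
    train = ([("cancer", i) for f in range(5) if f != held_out
              for i in cancer_folds[f]]
             + [("normal", i) for f in range(5) if f != held_out
                for i in normal_folds[f]])
    splits.append((train, test))

print([len(test) for _, test in splits])
```

Dividing each class separately keeps the normal-to-cancer ratio roughly constant across folds, which matters with only 22 normal tissues.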

8
Correlation Analysis on Genes
  • Functional expressions from various genes are
    correlated.
  • Examine the correlation patterns of the three
    selected genes in Fig. 1.

9
Correlation Between the Three Selected Genes and
the Remaining Expression Data
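One way to carry out this kind of comparison is to compute a correlation coefficient between a selected gene and every other gene, then rank the results. The sketch below assumes the Pearson coefficient, which the slides do not specify; the function names and the expression values are made up for illustration and are not taken from the colon cancer data.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def rank_by_correlation(selected, others):
    """Return [(gene, r)] sorted by absolute correlation, strongest first."""
    scores = [(name, pearson(selected, vals)) for name, vals in others.items()]
    return sorted(scores, key=lambda t: abs(t[1]), reverse=True)

# Made-up toy expression profiles across five tissues.
selected = [1.0, 2.0, 3.0, 4.0, 5.0]
others = {
    "gene_a": [2.0, 4.0, 6.0, 8.0, 10.0],  # moves with the selected gene
    "gene_b": [5.0, 3.0, 4.0, 1.0, 2.0],   # moves against it
    "gene_c": [1.0, 3.0, 2.0, 3.0, 1.0],   # unrelated
}
ranked = rank_by_correlation(selected, others)
print(ranked)
```

Genes near the top of such a ranking are candidates for carrying much the same information as the selected gene.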
10
Another Tree Based on a Different Set of Three
Genes
Fig. 6. Classification tree for tissue
types using expression data from three genes
(R87126, T62947, X15183).
11
Correlation Matrix Among Genes in Fig. 1 and
Fig. 6

12
Advantages of the Classification Tree
1. Efficient with a large number of genes
2. Automatically selects valuable and user-friendly
   genes as predictors
3. More precise than some other classification
   methods, such as support vector machines and
   linear discriminant analysis
13
Conclusions
1. It is likely that the information contained in
   a large number of genes can be captured by a
   small optimal set of genes without significant
   loss of information.
2. The classification precision of recursive
   partitioning is important for clinical
   applications.