Probing the systems biology of Mycobacterium tuberculosis through gene expression and genomic data - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Probing the systems biology of Mycobacterium tuberculosis through gene expression and genomic data


1
Probing the systems biology of Mycobacterium
tuberculosis through gene expression and genomic
data
  • Luke Alden Yancy, Jr.
  • Mentor: Robert Riley
  • Broad Institute of MIT and Harvard
  • Cambridge, MA

2
What is Tuberculosis?
Source: http://staff.vbi.vt.edu/pathport/pathinfo_images/Mycobacterium_tuberculosis/AerosolTransmission.jpg
3
The Problem
TB mortality, all forms (per 100,000 population per year), by country, total, 2006
Deaths caused by TB (estimated by WHO)
Year    Deaths
1998    1,751,858
2006    1,654,805
Source: WHO Stop TB Department, website: www.who.int/tb
4
Why this study?
  • Learn more about Mycobacterium tuberculosis (Mtb)
    using analysis of gene expression data
  • Biclustering
      • Bimax (Prelic et al. 2006)
      • CC (Cheng and Church, 2000)
      • Plaid Model (Turner et al. 2003)
      • Spectral (Kluger et al. 2003)
      • Xmotifs (Murali and Kasif, 2003)
  • Traditional Clustering
      • K-Means (MacQueen, 1967)
      • Hierarchical (Eisen et al. 1998)

5
What are clustering and biclustering?
6
Biclustering vs. Standard Clustering
                             | Traditional Clustering | Biclustering
Gene Clusters Based on       | All Experiments        | Subsets of Experiments
Genes Assigned to Clusters   | One-to-One             | Many-to-Many / One-to-Many
Reproducibility              | Yes                    | No (due to random steps in algorithm)
Source: Machine Learning and Its Applications to Biology, Tarca et al. 2007 (Editor: Fran Lewitter, Whitehead Institute)
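
The "One-to-One" and "All Experiments" entries for traditional clustering can be illustrated with a minimal k-means sketch. The toy expression matrix, the gene names, and the deterministic initialization (the first k rows seed the centroids) are assumptions made for illustration; they are not the presentation's data or its exact procedure.

```python
import math

def kmeans(rows, k, iterations=20):
    """Plain k-means. The first k rows seed the centroids, an assumption
    made here only to keep the run deterministic."""
    centroids = [list(r) for r in rows[:k]]
    labels = [0] * len(rows)
    for _ in range(iterations):
        # Assignment step: each gene goes to exactly one nearest centroid,
        # with distance measured over ALL experiments (columns).
        labels = [
            min(range(k), key=lambda c: math.dist(row, centroids[c]))
            for row in rows
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [row for row, lab in zip(rows, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical toy data: 4 genes x 3 experiments (log expression ratios).
genes = ["geneA", "geneB", "geneC", "geneD"]
expression = [
    [2.0, 2.1, 1.9],
    [-1.0, -1.2, -0.9],
    [1.8, 2.2, 2.0],
    [-1.1, -0.8, -1.0],
]
labels = kmeans(expression, k=2)
clusters = dict(zip(genes, labels))
print(clusters)
```

Every gene receives a single label, which is the one-to-one assignment in the table. Random initialization, common in practice, is deliberately left out of this sketch.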
7
What did we do?
Boshoff data (processed: 3924 genes, 359 experiments) → Bimax and K-Means → clusters of genes
Source: The Transcriptional Responses of Mycobacterium tuberculosis to Inhibitors of Metabolism (Boshoff et al. 2004)
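
For readers unfamiliar with Bimax, the sketch below shows the model it works with: the expression matrix is binarized, and a bicluster is an inclusion-maximal submatrix containing only ones. The published algorithm uses a divide-and-conquer search; the brute-force enumeration here is swapped in for brevity and is only practical for tiny inputs. The threshold and the toy matrix are assumptions, not values from the Boshoff data.

```python
from itertools import combinations

def binarize(matrix, threshold):
    """Mark an entry 1 when its expression value reaches the threshold."""
    return [[1 if v >= threshold else 0 for v in row] for row in matrix]

def maximal_biclusters(binary):
    """Enumerate inclusion-maximal all-ones submatrices by brute force."""
    n_cols = len(binary[0])
    found = set()
    for size in range(1, n_cols + 1):
        for cols in combinations(range(n_cols), size):
            # Genes that are 'on' in every chosen experiment.
            rows = tuple(i for i, row in enumerate(binary)
                         if all(row[j] for j in cols))
            if not rows:
                continue
            # Extend to every experiment shared by those genes.
            closed = tuple(j for j in range(n_cols)
                           if all(binary[i][j] for i in rows))
            found.add((rows, closed))
    return sorted(found)

# Hypothetical toy data: 4 genes x 4 experiments.
expression = [
    [2.5, 2.1, 0.1, 0.2],
    [2.2, 2.4, 2.3, 0.0],
    [0.3, 2.0, 2.6, 0.1],
    [0.2, 0.1, 0.3, 0.4],
]
biclusters = maximal_biclusters(binarize(expression, threshold=2.0))
for rows, cols in biclusters:
    print("genes", rows, "experiments", cols)
```

In this toy run the gene at row index 1 belongs to every bicluster found, and each bicluster uses only a subset of the experiments, which is the many-to-many behaviour that separates biclustering from K-Means.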
8
Benchmarking Biclusters Using Operons
(proS loci of Mtb)
Diagram labels: Operon, Cluster, Gene Pair; quantities (N), (m), (n), and overlap (k)
Significance of overlap k estimated using the hypergeometric distribution
Source: http://www.nature.com/nature/journal/v409/n6823/full/4091007a0.html
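
The overlap test can be sketched as follows. The slide does not spell out what N, m, and n denote, so this example assumes N is the number of all possible gene pairs, m the number of pairs that share an operon, n the number of pairs that share a cluster, and k the number of pairs that share both. The genes, operons, and clusters are made up for illustration.

```python
from itertools import combinations
from math import comb

def gene_pairs(groups):
    """All unordered gene pairs that share a group (operon or cluster)."""
    pairs = set()
    for group in groups:
        pairs.update(frozenset(p) for p in combinations(sorted(group), 2))
    return pairs

def hypergeom_tail(k, N, m, n):
    """P(X >= k) when n pairs are drawn from N, of which m are operon pairs."""
    upper = min(m, n)
    hits = sum(comb(m, i) * comb(N - m, n - i) for i in range(k, upper + 1))
    return hits / comb(N, n)

# Hypothetical toy genome of 6 genes with made-up operons and clusters.
genes = ["g1", "g2", "g3", "g4", "g5", "g6"]
operons = [{"g1", "g2", "g3"}, {"g4", "g5"}]
clusters = [{"g1", "g2", "g3", "g6"}, {"g4", "g5"}]

N = comb(len(genes), 2)                  # all possible gene pairs
operon_pairs = gene_pairs(operons)
cluster_pairs = gene_pairs(clusters)
m, n = len(operon_pairs), len(cluster_pairs)
k = len(operon_pairs & cluster_pairs)    # pairs sharing operon and cluster
p_value = hypergeom_tail(k, N, m, n)
print(N, m, n, k, p_value)
```

A small p-value means the clusters recover operon pairs more often than a random choice of n gene pairs would, which is the sense in which operons serve as a benchmark here.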
9
Algorithm Performance
Bimax Biclustering Operon Overlap
Source: Prolinks: a database of protein functional linkages derived from coevolution (Bowers et al. 2005)
10
Problems with Biclustering
  • Random step lacks reproducibility
  • No biological soundness
      • Artificial arrangement of data
      • Large data sets produce statistically
        significant, but small clusters
  • Practicality
      • Implementation
      • Large Input Data Sets

11
Conclusions and Next Steps
  • K-Means clustering performs better than
    biclustering on our data set
  • Next, use motif recognition methods to identify
    regulatory motifs in clusters
  • Further development of improved biclustering
    algorithms

12
Acknowledgments
  • Project Team
  • Robert Riley (Mentor)
  • Brian Weiner
  • Summer Research Program in Genomics (SRPG)
  • Shawna Young
  • Bruce Birren
  • Lucia Vielma
  • Maura Silverstein
  • The Broad Institute
  • Eric Lander
  • Core Members
  • SRPG Program Members