Statistical Modeling of Time Course Gene Expressions Ping Ma Department of Statistics Center for Adv - PowerPoint PPT Presentation

1
Statistical Modeling of Time Course Gene Expressions
Ping Ma
Department of Statistics
Center for Advanced Study
Institute for Genomic Biology
2
mRNA Expression
  • Microarray: a snapshot of mRNA expression levels
    of thousands of genes
  • Expression profiles: expression measurements
    under different conditions and for different
    types of cells

3
mRNA expression pattern
4
Time Course Microarray
  • A series of microarray experiments conducted
    sequentially during a biological process
  • Provides insight into the underlying biology
  • Helps decipher the dynamic gene regulatory
    network

5
Experiment 1: Comparative Genomics
  • The worm and the fruit fly shared their last
    common ancestor one billion years ago

6
Life Cycle Microarray of Worm and Fruit Fly
  • Worm: cDNA array, 17,871 genes at 6 time points
    from egg to adulthood (Jiang et al. 2000, PNAS)
  • Fruit fly: cDNA array, expression of 4,028 genes
    at 67 time points (Arbeitman et al. 2002, Science)
7
Experiment 2: Anaerobic and Aerobic Microarray
in Yeast
8
Experiment 3: Factorial Microarray on Zebrafish
Retina Development
Leung, Y. F., Ma, P., Link, B. A. and Dowling,
J. (2008) PNAS, 105, 12909-12914.
9
Experiment 3: Factorial Microarray on Zebrafish
Retina Development
Leung, Y. F., Ma, P., Link, B. A. and Dowling,
J. (2008) PNAS, 105, 12909-12914.
10
Objective
  • How to analyze time course gene expression while
    taking into account time dependence and
    biological conditions
  • Establish a flexible framework to facilitate
    information extraction

11
Challenges
  • Both continuous and discrete factors
  • Time-dependent correlation
  • Different sampling strategies:
    • adult males and females sampled separately
      in Arbeitman (2002)
    • break point in oxygen in Lai (2006)

12
Current Research
  • Conventional methods, e.g. K-means and
    hierarchical clustering, ignore these factors
  • Multivariate Gaussian models do not account for
    the time intervals
  • Time series models require stationarity and the
    Markov property

13
Functional Data Approach
  • The true expression profiles are modeled as
    curves, which are described mathematically as
    functions

14
Mixed-Effect Representation
  • The expression profile of the ith gene is y_i
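
For orientation, one common way to write a nonparametric mixed-effect model for a single gene's profile is sketched below. The notation is an assumption and may differ from the talk's own: mu is a smooth mean curve evaluated at the gene's sampling times t_i, Z_i is a random-effect design matrix, b_i is the gene-specific random effect with covariance B, and epsilon_i is measurement noise.

```latex
% Assumed notation, a generic mixed-effect form rather than the slide's own formula
\mathbf{y}_i = \mu(\mathbf{t}_i) + Z_i \mathbf{b}_i + \boldsymbol{\epsilon}_i,
\qquad
\mathbf{b}_i \sim N(\mathbf{0}, B),
\qquad
\boldsymbol{\epsilon}_i \sim N(\mathbf{0}, \sigma^2 I)
```

Under this form the fixed part mu captures the shared time trend, while Z_i b_i absorbs gene-specific departures and induces correlation across time points.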

15
Illustration
16
Functional ANOVA
  • Decomposition
  • More generally
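
As a point of reference, a standard functional ANOVA decomposition for a curve that depends on time t and one discrete condition tau can be sketched as follows. The symbols are assumptions chosen for illustration, not necessarily the notation used in the talk.

```latex
% Two-factor case (assumed notation): constant, main effects, interaction
f(t, \tau) = f_{\emptyset} + f_{1}(t) + f_{2}(\tau) + f_{12}(t, \tau)

% General case with d factors x_1, ..., x_d
f(x_1, \dots, x_d) = f_{\emptyset} + \sum_{j} f_{j}(x_j)
  + \sum_{j < k} f_{jk}(x_j, x_k) + \dots + f_{1 \dots d}(x_1, \dots, x_d)
```

Side conditions (for example, each component averaging to zero over its arguments) are what make the terms identifiable, in the same way as in classical ANOVA.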

17
Branching Spline
18
Penalized Henderson's Likelihood
19
Matrix Representation
20
Smoothing Parameter Selection
  • How to choose ? and O?
  • Generalized Cross-validation
  • The asymptotic optimality of GCV was shown by Gu
    and Ma (2005 Ann Stat) in a decision-theoretic
    framework

21
Clustering Analysis
  • The expression profile of the ith gene is y_i

22
EM algorithm
23
Rejection-Controlled EM
  • Typical EM is infeasible for large-scale data
  • A rejection-controlled step alleviates the
    computational cost
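
The slide does not spell out the rejection-controlled step, so the following is only a sketch of the generic rejection-control idea applied to E-step membership probabilities. Probabilities below a threshold are randomly set to zero or raised to the threshold, which keeps their expectation unchanged before renormalization and lets the M-step skip genes with zero weight. The function name and the threshold value are hypothetical.

```python
import numpy as np

def rejection_control(post, threshold, rng):
    """Sparsify posterior membership probabilities (rows sum to 1).

    Entries >= threshold are kept. An entry p < threshold becomes
    threshold with probability p / threshold and 0 otherwise, so its
    expected value is still p before the rows are renormalized.
    """
    post = np.asarray(post, dtype=float)
    n_clusters = post.shape[1]
    if not 0.0 < threshold <= 1.0 / n_clusters:
        # guarantees the largest entry of every row always survives
        raise ValueError("threshold must lie in (0, 1 / n_clusters]")
    small = post < threshold
    keep = rng.random(post.shape) < post / threshold
    adjusted = np.where(small, np.where(keep, threshold, 0.0), post)
    return adjusted / adjusted.sum(axis=1, keepdims=True)

# Hypothetical posterior probabilities for two genes and three clusters
rng = np.random.default_rng(1)
post = np.array([[0.90, 0.08, 0.02],
                 [0.34, 0.33, 0.33]])
sparse_post = rejection_control(post, threshold=0.1, rng=rng)
print(sparse_post)
```

In an M-step, each cluster's update would then use only the genes whose adjusted weight for that cluster is nonzero, which is where the savings come from.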

24
Model Selection
  • Assessing the number of components in the mixture
    model
  • Bayes factor
  • Bayesian information criterion (BIC) as an
    approximation
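
As an illustration of choosing the number of mixture components with BIC, the sketch below fits plain one-dimensional Gaussian mixtures by EM to simulated data. This is an assumption made for brevity: the talk's model is a mixture for whole expression curves, and the function names and parameter counts here are specific to this toy example.

```python
import numpy as np

def log_joint(x, weights, means, variances):
    """log(weight_k * N(x_i | mean_k, var_k)) for every point and component."""
    log_dens = (-0.5 * np.log(2 * np.pi * variances)
                - 0.5 * (x[:, None] - means) ** 2 / variances)
    return np.log(weights) + log_dens

def fit_gmm_loglik(x, k, n_iter=200, seed=0):
    """Run EM for a k-component 1-D Gaussian mixture; return the log-likelihood."""
    rng = np.random.default_rng(seed)
    means = rng.choice(x, size=k, replace=False)
    variances = np.full(k, x.var())
    weights = np.full(k, 1.0 / k)
    for _ in range(n_iter):
        # E-step: posterior membership probabilities
        lj = log_joint(x, weights, means, variances)
        resp = np.exp(lj - np.logaddexp.reduce(lj, axis=1)[:, None])
        # M-step: weighted updates (small constants guard against collapse)
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = resp.T @ x / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return np.logaddexp.reduce(log_joint(x, weights, means, variances), axis=1).sum()

def bic(loglik, n_params, n_obs):
    """BIC = -2 log L + (number of free parameters) * log(n)."""
    return -2.0 * loglik + n_params * np.log(n_obs)

# Simulated data with two well-separated groups (illustration only)
rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-3.0, 1.0, 150), rng.normal(3.0, 1.0, 150)])

candidates = [1, 2, 3, 4]
# k means + k variances + (k - 1) free weights = 3k - 1 parameters
bics = {k: bic(fit_gmm_loglik(x, k), 3 * k - 1, x.size) for k in candidates}
best_k = min(bics, key=bics.get)
print(bics, "-> selected number of components:", best_k)
```

Smaller BIC is better under this sign convention; some references define BIC with the opposite sign and maximize it instead.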

25
Comparative Genomics
[Diagram: Gene A of Ancestor gives rise to Gene A1 of Species 1 and Gene A2 of Species 2]
  • 808 orthologous genes
  • We identified 34 clusters
  • Annotated functions using Gene Ontology analysis
  • 21 of the 34 clusters are enriched for
    biological functions
26
Pattern Formation
embryonic development (P-value 0.0003)
post-embryonic body morphogenesis (P-value 0.007)
mRNA processing (P-value 0.002)
27
Development
larval development (P-value 0.008)
growth regulation (P-value < 10^-6)
28
Reproduction
embryonic development (P-value < 10^-6)
reproduction (P-value < 10^-7)
29
Software
  • SSClust: http://www.stat.uiuc.edu/pingma/research/software/SSClust.html
  • MFDA: http://cran.r-project.org/

30
Time Course Microarray Database
  • http://error.stat.uiuc.edu/timeseries/pradoproject/index.php

31
Reference
  • Ma, P. and Zhong, W. (2008) JASA
  • Leung, Y. F., Ma, P., Link, B. A. and Dowling,
    J. (2008) PNAS
  • Ma, P., Castillo-Davis, C., Zhong, W. and Liu,
    J. S. (2006) NAR

32
Acknowledgement
  • Yuk Fai Leung (Purdue), Wenxuan Zhong (UIUC),
    Jun S. Liu (Harvard)
  • Leonid Zamdborg, Ji Young Kim
  • NSF DMS