Course Module: Brief Introduction to Systems Biology, Sven Bergmann, Department of Medical Genetics, University of Lausanne

1
Course Module: Brief Introduction to Systems Biology
Sven Bergmann
Department of Medical Genetics, University of Lausanne
Rue de Bugnon 27 - DGM 328, CH-1005 Lausanne, Switzerland
work: +41-21-692-5452, cell: +41-78-663-4980
http://serverdgm.unil.ch/bergmann
2
Recap: Part 2, Standard Analysis Tools
  • Motivation
      • Why study a large heterogeneous set of
        expression data?
      • What biological questions can we ask?
      • Supervised vs. unsupervised approaches
  • Practical Part
      • K-means clustering
      • Coupled two-way clustering (CTWC)
      • Principal component analysis (PCA)
      • Singular value decomposition (SVD)

3
Part 3: Advanced Analysis Tools
  • Motivation
      • What are the limitations of standard tools?
      • What is a modular approach?
      • How can we integrate different datasets?
  • Practical Part
      • Overview of Advanced Analysis Tools
      • The (Iterative) Signature Algorithm
      • The Ping-Pong Algorithm

4
How to make sense of millions of numbers?
Hundreds of samples
Thousands of genes
New Analysis and Visualization Tools are needed!
5
Pooling genome-wide expression measurements from
many experiments
(Figure labels: cell cycle; sets of specific conditions)
6
The challenge of many datasets: How to integrate
all the information?
  • Protein expression
  • Tissue specific expression
  • Interaction data
  • Localization data

7
How to extract biological information from
large-scale expression data?
  • Clusters cannot overlap!
  • Clustering is based on correlations over all
    conditions:
      • sensitive to noise
      • computation intensive

8
How to extract biological information from
large-scale expression data?
9
Overview of modular analysis tools
  • Cheng Y and Church GM. Biclustering of
    expression data. (Proc Int Conf Intell Syst Mol
    Biol. 2000;8:93-103)
  • Getz G, Levine E, Domany E. Coupled two-way
    clustering analysis of gene microarray data.
    (Proc Natl Acad Sci U S A. 2000 Oct
    24;97(22):12079-84)
  • Tanay A, Sharan R, Kupiec M, Shamir R. Revealing
    modularity and organization in the yeast
    molecular network by integrated analysis of
    highly heterogeneous genomewide data. (Proc Natl
    Acad Sci U S A. 2004 Mar 2;101(9):2981-6)
  • Sheng Q, Moreau Y, De Moor B. Biclustering
    microarray data by Gibbs sampling.
    (Bioinformatics. 2003 Oct;19 Suppl 2:ii196-205)
  • Gasch AP and Eisen MB. Exploring the conditional
    coregulation of yeast gene expression through
    fuzzy k-means clustering. (Genome Biol. 2002 Oct
    10;3(11):RESEARCH0059)
  • Hastie T, Tibshirani R, Eisen MB, Alizadeh A,
    Levy R, Staudt L, Chan WC, Botstein D, Brown P.
    'Gene shaving' as a method for identifying
    distinct sets of genes with similar expression
    patterns. (Genome Biol. 2000;1(2):RESEARCH0003)
  • and many more!

http://serverdgm.unil.ch/bergmann/Publications/review.pdf
10
One example in more detail: The (Iterative)
Signature Algorithm
  • No need for correlations!
  • decomposes data into transcription modules
  • integrates external information
  • allows for interspecies comparative analysis

J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai,
Nature Genetics (2002)
11
Trip to the 'Amazon'
12
How to find related items?
(Figure: matrix of items vs. customers)
13
How to find related genes?
(Figure: matrix of genes vs. conditions)
J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai,
Nature Genetics (2002)
14
Signature Algorithm: Score definitions
15
How to find related genes? Scores and thresholds!
condition scores
initial guesses (genes)
16
How to find related genes? Scores and thresholds!
condition scores
gene scores
thresholding
17
How to find related genes? Scores and thresholds!
condition scores
gene scores
18
Iterative Signature Algorithm
OUTPUT
SB, J Ihmels & N Barkai, Physical Review E (2003)
19
Identification of transcription modules using
many random seeds
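
The score definitions themselves did not survive in this transcript, so the following is only a minimal sketch of the score-and-threshold idea and of the iteration from many random seeds. The z-score normalization, the threshold values, the use of an absolute value for conditions, and all function names (zscores, signature_step, iterate_to_module, find_modules) are assumptions made for illustration, not the published definitions.

```python
import random
from statistics import mean, pstdev


def zscores(values):
    """Standardize a list of scores; all zeros if there is no spread."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]


def signature_step(expr, genes, t_cond=1.0, t_gene=1.0):
    """One round: gene set -> condition scores -> gene scores -> new gene set.

    expr is a list of rows (one per gene); each row lists the expression
    values of that gene (one per condition). Thresholds are in units of
    standard deviations (an assumption of this sketch).
    """
    if not genes:
        return frozenset()
    n_conds = len(expr[0])
    # Condition score: mean expression of the current genes in that condition.
    cond_scores = [mean(expr[g][c] for g in genes) for c in range(n_conds)]
    cond_z = zscores(cond_scores)
    conds = [c for c in range(n_conds) if abs(cond_z[c]) > t_cond]
    if not conds:
        return frozenset()
    # Gene score: expression averaged over the selected conditions,
    # weighted by the condition scores.
    gene_scores = [
        sum(cond_scores[c] * row[c] for c in conds) / len(conds) for row in expr
    ]
    gene_z = zscores(gene_scores)
    return frozenset(g for g, z in enumerate(gene_z) if z > t_gene)


def iterate_to_module(expr, seed, max_iter=50, **thresholds):
    """Repeat signature_step until the gene set stops changing."""
    genes = frozenset(seed)
    for _ in range(max_iter):
        new_genes = signature_step(expr, genes, **thresholds)
        if new_genes == genes:
            break
        genes = new_genes
    return genes


def find_modules(expr, n_seeds=20, seed_size=2, rng_seed=0, **thresholds):
    """Iterate from many random seeds and keep the distinct fixed points."""
    rng = random.Random(rng_seed)
    modules = set()
    for _ in range(n_seeds):
        seed = rng.sample(range(len(expr)), seed_size)
        module = iterate_to_module(expr, seed, **thresholds)
        if module:
            modules.add(module)
    return modules
```

Because each seed converges independently, different seeds can end in modules that share genes, which is the property that non-overlapping clusters lack.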
20
New Tools: Module Visualization
http://serverdgm.unil.ch/bergmann/Fibroblasts/visualiser.html
21
Gene enrichment analysis
  • The hypergeometric distribution f(M,A,K,T) gives
    the probability that K out of A genes with a
    particular annotation match with a module having
    M genes if there are T genes in total.

http://en.wikipedia.org/wiki/Hypergeometric_distribution
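
As a rough illustration of this enrichment test, the sketch below evaluates the hypergeometric probability with the same argument order (M, A, K, T) and sums the upper tail to get a p-value for "K or more matches". The function names are hypothetical, and the use of the upper tail as the enrichment score is an assumption; the slide only states the point probability.

```python
from math import comb


def hypergeom_pmf(M, A, K, T):
    """Probability that exactly K of the A annotated genes fall into a
    module of M genes, when there are T genes in total.

    f(M, A, K, T) = C(A, K) * C(T - A, M - K) / C(T, M)
    """
    return comb(A, K) * comb(T - A, M - K) / comb(T, M)


def enrichment_p_value(M, A, K, T):
    """Probability of observing K or more matches by chance (upper tail)."""
    return sum(hypergeom_pmf(M, A, k, T) for k in range(K, min(M, A) + 1))
```

For example, with T = 10 genes in total, A = 4 annotated genes, and a module of M = 3 genes, hypergeom_pmf(3, 4, 2, 10) evaluates 6 * 6 / 120 = 0.3.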
22
Decomposing expression data into annotated
transcriptional modules
identified >100 transcriptional modules in yeast
high functional consistency!
many functional links waiting to be verified experimentally
J Ihmels, SB & N Barkai, Bioinformatics (2005)
23
Module hierarchies and networks
24
Higher-order structure
(Figure labels: correlated, C, anti-correlated)
25
Mapping Transcription Modules
BLAST
26
For distant organisms correlation patterns
generally are distinct
SB, J Ihmels & N Barkai, PLoS Biology (2004)
27
What about related organisms?
(Figure: pairwise correlation between genes, over all arrays)
J Ihmels, SB, J Berman & N Barkai, Science (2005)
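
A pairwise correlation over all arrays can be sketched as a Pearson correlation between every pair of gene rows. This is only an illustrative sketch: the choice of Pearson correlation and the names pearson and correlation_matrix are assumptions, since the transcript does not say which correlation measure was used.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation information
    return cov / sqrt(var_x * var_y)


def correlation_matrix(expr):
    """Correlation between every pair of genes (rows), over all arrays."""
    n = len(expr)
    return [[pearson(expr[i], expr[j]) for j in range(n)] for i in range(n)]
```

Comparing such matrices for orthologous genes in two organisms is one way to ask whether their correlation patterns agree.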
28
Promoter analysis: The Rapid Growth Element
AATTTT
29
Data Integration Example: NCI60
60 cancer cell lines (9 tissue types)
30
Modules and Co-modules
Z Kutalik, J Beckmann & SB, Nature Biotechnology
(2008)
31
How to identify Co-modules?
Iteratively refine genes, cell-lines and drugs to
get co-modules
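
One way to picture this refinement is a loop that bounces between two matrices sharing the cell-line dimension: gene expression (genes x cell lines) and drug response (drugs x cell lines). The sketch below is a simplified illustration under stated assumptions: scores are plain means, selection uses a one-sided z-score threshold, and every name (select, column_scores, row_scores, ping_pong) is hypothetical rather than taken from the published method.

```python
from statistics import mean, pstdev

EMPTY = (frozenset(), frozenset(), frozenset())


def select(scores, threshold):
    """Indices whose z-score exceeds the threshold (assumed selection rule)."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        return frozenset()
    return frozenset(
        i for i, s in enumerate(scores) if (s - mu) / sigma > threshold
    )


def column_scores(matrix, rows):
    """For every column, the mean over the chosen rows."""
    return [mean(matrix[r][c] for r in rows) for c in range(len(matrix[0]))]


def row_scores(matrix, cols):
    """For every row, the mean over the chosen columns."""
    return [mean(row[c] for c in cols) for row in matrix]


def ping_pong(expr, resp, seed_genes, threshold=1.0, max_iter=50):
    """Refine (genes, cell lines, drugs) until the triple stops changing.

    expr: genes x cell lines, resp: drugs x cell lines (same cell-line order).
    """
    state = (frozenset(seed_genes), frozenset(), frozenset())
    if not state[0]:
        return EMPTY
    for _ in range(max_iter):
        # genes -> cell lines (via expression)
        lines = select(column_scores(expr, state[0]), threshold)
        if not lines:
            return EMPTY
        # cell lines -> drugs (via drug response)
        drugs = select(row_scores(resp, lines), threshold)
        if not drugs:
            return EMPTY
        # drugs -> cell lines (via drug response)
        lines = select(column_scores(resp, drugs), threshold)
        if not lines:
            return EMPTY
        # cell lines -> genes (via expression)
        genes = select(row_scores(expr, lines), threshold)
        if not genes:
            return EMPTY
        new_state = (genes, lines, drugs)
        if new_state == state:
            break
        state = new_state
    return state
```

The returned triple plays the role of a co-module: a set of genes and a set of drugs linked through a shared set of cell lines.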
32
Gene (g)-drug (d) associations
Association score
Drug module score Gene module score
Sum over all modules m
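
The formula on this slide is not preserved in the transcript; only its labels are. A plausible reading, assumed here, is that the association score multiplies the gene's module score by the drug's module score and sums the products over all modules m. The sketch below implements that reading with a hypothetical data layout (one pair of score dictionaries per co-module) and hypothetical names.

```python
def association_score(co_modules, gene, drug):
    """Sum over all co-modules of (gene module score * drug module score).

    co_modules: list of (gene_scores, drug_scores) pairs, where each element
    is a dict mapping a gene or drug name to its score in that co-module.
    Genes or drugs absent from a co-module contribute a score of 0.
    """
    return sum(
        gene_scores.get(gene, 0.0) * drug_scores.get(drug, 0.0)
        for gene_scores, drug_scores in co_modules
    )


# Illustrative numbers only; "geneA", "drugX" etc. are placeholders.
example_co_modules = [
    ({"geneA": 0.5, "geneB": 1.0}, {"drugX": 2.0}),
    ({"geneA": 1.0}, {"drugX": 0.5, "drugY": 1.0}),
]
score_a_x = association_score(example_co_modules, "geneA", "drugX")
```

With these placeholder values, score_a_x is 0.5 * 2.0 + 1.0 * 0.5 = 1.5, while a gene and drug that never share a co-module score 0.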
33
Co-modules predict drug-gene associations
recorded in DrugBank
(Figure axes: True Positive Rate vs. False Positive Rate)
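
A curve of true positive rate against false positive rate can be computed by ranking all drug-gene pairs by their predicted score and checking each against a reference list of known associations. The sketch below shows the generic calculation; the function names are hypothetical, tied scores are not treated specially, and no actual results from the presentation are reproduced.

```python
def roc_points(scores, labels):
    """(FPR, TPR) points obtained by lowering the score threshold step by step.

    scores: predicted association scores, one per drug-gene pair.
    labels: 1 (or True) if the pair is a known association, else 0.
    """
    n_pos = sum(1 for label in labels if label)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative pair")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    for _, is_known in ranked:
        if is_known:
            true_pos += 1
        else:
            false_pos += 1
        points.append((false_pos / n_neg, true_pos / n_pos))
    return points


def area_under_curve(points):
    """Trapezoidal area under the (FPR, TPR) curve."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )
```

An area of 0.5 corresponds to random ranking and 1.0 to a ranking that places every known association ahead of every other pair.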
34
Geneset (G)-drug (d) associations
Association score
Drug module score Gene set module score
Sum over all modules m
35
Predictive power for drug-geneset associations
is even stronger
36
Take-home Messages
  • Analysis of large-scale expression data bears
    great potential to understand global
    transcription programs and their evolution
  • Innovative analysis tools needed to extract
    information from such data
  • (Iterative) Signature and Ping-Pong Algorithms
      • decompose data into transcription modules
      • integrate external information
      • allow for interspecies comparative analysis