Course Module: Brief Introduction to Systems Biology, Sven Bergmann, Department of Medical Genetics

1
Course Module: Brief Introduction to Systems Biology
Sven Bergmann
Department of Medical Genetics, University of Lausanne
Rue de Bugnon 27 - DGM 328, CH-1005 Lausanne, Switzerland
work 41-21-692-5452, cell 41-78-663-4980
http://serverdgm.unil.ch/bergmann
2
Recap Part 2: Standard Analysis Tools

Motivation
Why study a large heterogeneous set of expression data?
What biological questions can we ask?
Supervised vs. unsupervised approaches
Practical Part
K-means clustering
Coupled two-way clustering (CTWC)
Principal component analysis (PCA)
Singular value decomposition (SVD)

3
Part 3: Advanced Analysis Tools

Motivation
What are the limitations of standard tools?
What is a modular approach?
How can we integrate different datasets?
Practical Part
Overview of Advanced Analysis Tools
The (Iterative) Signature Algorithm
The Ping-Pong Algorithm

4
How to make sense of millions of numbers?
Hundreds of samples
Thousands of genes
New Analysis and Visualization Tools are needed!
5
Pooling genome-wide expression measurements from many experiments
cell cycle
sets of specific conditions
6
The challenge of many datasets: How to integrate all the information?

Protein expression
Tissue specific expression
Interaction data
Localization data

7
How to extract biological information from large-scale expression data?

Clusters cannot overlap!
Clustering is based on correlations over all conditions:
sensitive to noise
computationally intensive
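
The sensitivity of correlations taken over all conditions can be illustrated with a minimal sketch. The two gene profiles below are invented toy data and `pearson` is a helper written only for this example: the genes move together in the first four conditions and independently in the remaining eight, so the correlation over all conditions is much weaker than the correlation over the relevant subset.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Toy profiles (assumed, not from the slides): co-regulated in conditions 0-3,
# unrelated in conditions 4-11.
gene_a = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
gene_b = [1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, -1.0, -1.0, -1.0, 1.0, 1.0]

r_all = pearson(gene_a, gene_b)             # correlation over all 12 conditions
r_subset = pearson(gene_a[:4], gene_b[:4])  # correlation over the relevant 4

print(f"all conditions: {r_all:.2f}, relevant subset: {r_subset:.2f}")
```

With more unrelated conditions pooled in, the global correlation shrinks further, which is one motivation for methods that select conditions as well as genes.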

8
How to extract biological information from
large-scale expression data?
9
Overview of modular analysis tools

Cheng Y and Church GM. Biclustering of expression data. (Proc Int Conf Intell Syst Mol Biol. 2000;8:93-103)
Getz G, Levine E, Domany E. Coupled two-way clustering analysis of gene microarray data. (Proc Natl Acad Sci U S A. 2000 Oct 24;97(22):12079-84)
Tanay A, Sharan R, Kupiec M, Shamir R. Revealing modularity and organization in the yeast molecular network by integrated analysis of highly heterogeneous genomewide data. (Proc Natl Acad Sci U S A. 2004 Mar 2;101(9):2981-6)
Sheng Q, Moreau Y, De Moor B. Biclustering microarray data by Gibbs sampling. (Bioinformatics. 2003 Oct;19 Suppl 2:ii196-205)
Gasch AP and Eisen MB. Exploring the conditional coregulation of yeast gene expression through fuzzy k-means clustering. (Genome Biol. 2002 Oct 10;3(11):RESEARCH0059)
Hastie T, Tibshirani R, Eisen MB, Alizadeh A, Levy R, Staudt L, Chan WC, Botstein D, Brown P. 'Gene shaving' as a method for identifying distinct sets of genes with similar expression patterns. (Genome Biol. 2000;1(2):RESEARCH0003)
and many more!

http://serverdgm.unil.ch/bergmann/Publications/review.pdf
10
One example in more detail: The (Iterative) Signature Algorithm

No need for correlations!
decomposes data into transcription modules
integrates external information
allows for interspecies comparative analysis

J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai, Nature Genetics (2002)
11
Trip to the 'Amazon'
12
How to find related items?
[Figure: matrix with axes labeled "items" and "customers"]
13
How to find related genes?
[Figure: matrix with axes labeled "genes" and "conditions"]
J Ihmels, G Friedlander, SB, O Sarig, Y Ziv & N Barkai, Nature Genetics (2002)
14
Signature Algorithm: Score definitions
15
How to find related genes? Scores and thresholds!
condition scores
initial guesses (genes)
16
How to find related genes? Scores and thresholds!
condition scores
gene scores
thresholding
17
How to find related genes? Scores and thresholds!
condition scores
gene scores
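
The sequence sketched on these slides (initial gene guesses, condition scores, thresholding, gene scores, thresholding) might look roughly as follows. This is a simplified illustration, not the published implementation: it assumes the expression values are already normalized, uses plain averages as scores, and expresses both thresholds in standard deviations from the mean score. The names `zscores`, `signature_step`, `t_cond` and `t_gene`, and the toy matrix, are assumptions made for this example.

```python
import statistics

def zscores(values):
    """Deviation of each value from the mean, in units of standard deviation."""
    mu = statistics.fmean(values)
    sd = statistics.pstdev(values) or 1.0  # guard against constant vectors
    return [(v - mu) / sd for v in values]

def signature_step(expr, seed_genes, t_cond=1.0, t_gene=1.0):
    """One pass: seed genes -> condition scores -> gene scores -> gene set.

    expr[g][c] is the (assumed normalized) expression of gene g in condition c.
    """
    n_genes, n_conds = len(expr), len(expr[0])
    # Condition score: average expression of the seed genes in each condition.
    cond_scores = [statistics.fmean([expr[g][c] for g in seed_genes])
                   for c in range(n_conds)]
    # Keep conditions whose score deviates strongly (up or down) from the rest.
    cond_z = zscores(cond_scores)
    conds = [c for c in range(n_conds) if abs(cond_z[c]) > t_cond]
    if not conds:
        return set(), []
    # Gene score: average over the kept conditions, weighted by condition score.
    gene_scores = [statistics.fmean([cond_scores[c] * expr[g][c] for c in conds])
                   for g in range(n_genes)]
    gene_z = zscores(gene_scores)
    genes = {g for g in range(n_genes) if gene_z[g] > t_gene}
    return genes, conds

# Toy matrix (invented): 12 genes x 10 conditions with small deterministic
# "noise"; genes 0-3 are jointly induced in conditions 0-2.
expr = [[0.1 * ((7 * g + 3 * c) % 5 - 2) for c in range(10)] for g in range(12)]
for g in range(4):
    for c in range(3):
        expr[g][c] += 3.0

# The initial guess contains two true members and one unrelated gene.
genes, conds = signature_step(expr, seed_genes={0, 1, 7})
print(sorted(genes), conds)
```

In this toy case the imperfect guess is enough to pick out the relevant conditions, and the gene scores then recover the full planted set while dropping the unrelated gene.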
18
Iterative Signature Algorithm
OUTPUT
SB, J Ihmels & N Barkai, Physical Review E (2003)
19
Identification of transcription modules using many random seeds
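
A rough sketch of the idea: apply the score-and-threshold step repeatedly until the gene set no longer changes, start from many random gene sets, and keep the distinct fixed points as candidate modules. The step used here is a simplified stand-in (plain averages, thresholds in standard deviations, pre-normalized data assumed), and `above`, `refine`, `isa`, `find_modules` and the toy matrix are hypothetical names and data chosen for this example.

```python
import random
import statistics

def above(scores, t, two_sided=False):
    """Indices whose z-score exceeds t (or whose |z-score| does, if two_sided)."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores) or 1.0  # guard against constant scores
    z = [(s - mu) / sd for s in scores]
    return {i for i, v in enumerate(z) if (abs(v) if two_sided else v) > t}

def refine(expr, genes, t_cond=1.0, t_gene=1.0):
    """One signature step: gene set -> selected conditions -> refined gene set."""
    cond_scores = [statistics.fmean([expr[g][c] for g in genes])
                   for c in range(len(expr[0]))]
    conds = above(cond_scores, t_cond, two_sided=True)
    if not conds:
        return set()
    gene_scores = [statistics.fmean([cond_scores[c] * row[c] for c in conds])
                   for row in expr]
    return above(gene_scores, t_gene)

def isa(expr, seed, max_iter=100):
    """Iterate refine() until the gene set stops changing; None if it never does."""
    genes = set(seed)
    for _ in range(max_iter):
        if not genes:
            return None
        new_genes = refine(expr, genes)
        if new_genes == genes:
            return frozenset(genes)
        genes = new_genes
    return None

def find_modules(expr, n_seeds=200, seed_size=3, rng_seed=0):
    """Run isa() from many random gene sets; collect the distinct fixed points."""
    rng = random.Random(rng_seed)
    found = set()
    for _ in range(n_seeds):
        module = isa(expr, rng.sample(range(len(expr)), seed_size))
        if module is not None:
            found.add(module)
    return found

# Toy matrix (invented): two planted modules plus small deterministic "noise".
expr = [[0.1 * ((7 * g + 3 * c) % 5 - 2) for c in range(10)] for g in range(12)]
for g in range(0, 4):
    for c in range(0, 3):
        expr[g][c] += 3.0
for g in range(4, 8):
    for c in range(5, 8):
        expr[g][c] += 3.0

modules = find_modules(expr)
for m in sorted(modules, key=sorted):
    print(sorted(m))
```

Which fixed points are reached depends on the seeds and thresholds; scanning several threshold values is a natural extension that this sketch leaves out.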
20
New Tools: Module Visualization
http://serverdgm.unil.ch/bergmann/Fibroblasts/visualiser.html
21
Gene enrichment analysis

The hypergeometric distribution f(M,A,K,T) gives the probability that K of the A genes carrying a particular annotation fall within a module of M genes, given T genes in total.

http://en.wikipedia.org/wiki/Hypergeometric_distribution
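
As a worked illustration of this probability, the sketch below evaluates the standard hypergeometric formula with the slide's symbols (M, A, K, T). The function names are chosen for this example, and the tail sum used as an enrichment p-value is a common convention rather than something stated on the slide.

```python
from math import comb

def hypergeom_pmf(M, A, K, T):
    """Probability that exactly K of the A annotated genes lie in a module
    of M genes drawn from T genes in total."""
    return comb(A, K) * comb(T - A, M - K) / comb(T, M)

def enrichment_pvalue(M, A, K, T):
    """Probability of seeing K or more annotated genes in the module by chance."""
    return sum(hypergeom_pmf(M, A, k, T) for k in range(K, min(M, A) + 1))

# Small example that can be checked by hand: T=10 genes, A=4 annotated,
# module of M=3 genes, K=2 of them annotated.
p_exact = hypergeom_pmf(M=3, A=4, K=2, T=10)      # 6 * 6 / 120 = 0.3
p_tail = enrichment_pvalue(M=3, A=4, K=2, T=10)   # (36 + 4) / 120
print(p_exact, p_tail)
```

When many annotations are tested against many modules, some correction for multiple testing would normally be applied on top of these p-values.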
22
Decomposing expression data into annotated transcriptional modules
identified >100 transcriptional modules in yeast
high functional consistency!
many functional links waiting to be verified experimentally
J Ihmels, SB & N Barkai, Bioinformatics (2005)
23
Module hierarchies and networks
24
Higher-order structure
[Figure: relations between modules, labeled "correlated" and "anti-correlated"]
25
Mapping Transcription Modules
BLAST
26
For distant organisms correlation patterns generally are distinct
SB, J Ihmels & N Barkai, PLoS Biology (2004)
27
What about related organisms?
[Figure: pairwise correlation between genes (over all arrays)]
J Ihmels, SB, J Berman & N Barkai, Science (2005)
28
Promoter analysis: The Rapid Growth Element
AATTTT
29
Data Integration Example: NCI60
60 cancer cell lines (9 tissue types)
30
Modules and Co-modules
Z Kutalik, J Beckmann & SB, Nature Biotechnology (2008)
31
How to identify Co-modules?
Iteratively refine genes, cell lines and drugs to get co-modules
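
One way to picture this refinement is to alternate between two matrices that share the cell-line dimension: gene expression (genes x cell lines) and drug response (drugs x cell lines). The sketch below is a strongly simplified illustration under stated assumptions: scores are plain averages, selection keeps only entries more than `t` standard deviations above the mean, and the data are invented. The names `select`, `column_scores`, `row_scores` and `ping_pong` are hypothetical.

```python
import statistics

def select(scores, t):
    """Indices scoring more than t standard deviations above the mean."""
    mu = statistics.fmean(scores)
    sd = statistics.pstdev(scores) or 1.0  # guard against constant scores
    return {i for i, s in enumerate(scores) if (s - mu) / sd > t}

def column_scores(matrix, rows):
    """Average the selected rows: one score per column (cell line)."""
    return [statistics.fmean([matrix[r][c] for r in rows])
            for c in range(len(matrix[0]))]

def row_scores(matrix, cols):
    """Average the selected columns: one score per row (gene or drug)."""
    return [statistics.fmean([row[c] for c in cols]) for row in matrix]

def ping_pong(expr, resp, genes, t=1.0, max_iter=50):
    """expr: genes x cell lines; resp: drugs x cell lines (same column order).

    Returns (genes, cell_lines, drugs) at a fixed point, or None."""
    genes = set(genes)
    for _ in range(max_iter):
        cells = select(column_scores(expr, genes), t)   # genes -> cell lines
        if not cells:
            return None
        drugs = select(row_scores(resp, cells), t)      # cell lines -> drugs
        if not drugs:
            return None
        cells = select(column_scores(resp, drugs), t)   # drugs -> cell lines
        if not cells:
            return None
        new_genes = select(row_scores(expr, cells), t)  # cell lines -> genes
        if not new_genes:
            return None
        if new_genes == genes:
            return genes, cells, drugs
        genes = new_genes
    return None

# Toy data (invented): genes 0-2 and drugs 0-1 share cell lines 0-2.
expr = [[2.0 if g < 3 and c < 3 else 0.0 for c in range(8)] for g in range(10)]
resp = [[2.0 if d < 2 and c < 3 else 0.0 for c in range(8)] for d in range(6)]

co_module = ping_pong(expr, resp, genes={0, 1, 5})
print(co_module)
```

Starting from a partly wrong gene set, the alternation settles on a consistent triple of genes, cell lines and drugs in this toy case; real data would need normalization and a scan over seeds and thresholds.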
32
Gene (g)-drug (d) associations
Association score
Drug module score Gene module score
Sum over all modules m
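
Reading the labels on this slide as a product of the two module scores summed over modules (an interpretation, since the formula itself did not survive in the transcript), the association score can be sketched as below. The score tables are invented and `association_scores` is a name chosen for this example.

```python
def association_scores(gene_scores, drug_scores):
    """Association of every gene g with every drug d.

    gene_scores[m][g] and drug_scores[m][d] are the scores of gene g and
    drug d in co-module m (0 if not a member). The result[g][d] is the sum
    over modules m of gene_scores[m][g] * drug_scores[m][d].
    """
    n_genes = len(gene_scores[0])
    n_drugs = len(drug_scores[0])
    return [[sum(gm[g] * dm[d] for gm, dm in zip(gene_scores, drug_scores))
             for d in range(n_drugs)]
            for g in range(n_genes)]

# Toy scores (invented): 2 co-modules, 3 genes, 2 drugs.
gene_scores = [[1.0, 0.5, 0.0],   # module 0
               [0.0, 0.5, 1.0]]   # module 1
drug_scores = [[1.0, 0.0],        # module 0
               [0.0, 2.0]]        # module 1

assoc = association_scores(gene_scores, drug_scores)
for g, row in enumerate(assoc):
    print(f"gene {g}: {row}")
```

A gene and a drug get a high score only if they score highly in the same co-modules; the same construction would apply to gene sets by replacing per-gene scores with per-set scores.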
33
Co-modules predict drug-gene associations recorded in DrugBank
[Figure: ROC curve, True Positive Rate vs. False Positive Rate]
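
For readers unfamiliar with this kind of plot, the sketch below shows how true and false positive rates can be computed from ranked association scores and a list of known associations. The scores and labels are invented (not DrugBank data), ties between scores are not treated specially, and `roc_points` and `auc` are names chosen for this example.

```python
def roc_points(scores, labels):
    """(FPR, TPR) pairs obtained by lowering the score threshold step by step.

    labels[i] is 1 if pair i is a known association, else 0."""
    pos = sum(labels)
    neg = len(labels) - pos
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    tp = fp = 0
    points = [(0.0, 0.0)]
    for i in order:
        if labels[i]:
            tp += 1
        else:
            fp += 1
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Toy example (invented): five candidate pairs, three of them known.
scores = [0.9, 0.8, 0.7, 0.4, 0.2]
labels = [1, 1, 0, 1, 0]

points = roc_points(scores, labels)
area = auc(points)
print(points, area)
```

An area of 0.5 corresponds to random ranking and 1.0 to a perfect one, so the further the curve bends toward the top-left corner, the better the predictions.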
34
Geneset (G)-drug (d) associations
Association score
Drug module score Gene set module score
Sum over all modules m
35
Predictive power for drug-geneset associations is even stronger
36
Take-home Messages

Analysis of large-scale expression data bears great potential to understand global transcription programs and their evolution
Innovative analysis tools are needed to extract information from such data
(Iterative) Signature and Ping-Pong Algorithms:
decompose data into transcription modules
integrate external information
allow for interspecies comparative analysis
