Identify regulatory modules from gene expression data presentation

Title: Identify regulatory modules from gene expression data

1
Identify regulatory modules from gene expression
data

2
Introduction

Much of a cell's activity is organized as a
network of interacting modules: sets of genes
coregulated to respond to different conditions.
Identifying this organization is crucial for
understanding cellular responses to internal and
external signals.
Genome-wide expression profiles (e.g., DNA
microarray) provide important information about
regulatory mechanisms.
With complete genome sequences now available,
identifying cis-regulatory elements genome-wide
with a bioinformatics approach is a promising
solution.

3
Tasks

4
General scheme (1)

Clustering-based approaches for finding motifs
from gene expression and sequence data

5
General scheme (2)

Sequence-based (or knowledge-based) approaches for
finding motifs from gene expression and sequence
data

6
General scheme (3)

Comparative genomics has also been applied to
identify eukaryotic regulatory elements (e.g.,
human-mouse comparisons), because functional
noncoding sequences may be conserved across
species under evolutionary constraints.
Finding a good pair of species to compare and
choosing a good sequence conservation threshold
are critical, and such information is not
available for most species.
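
To illustrate why the conservation threshold matters, here is a minimal sketch that scans two pre-aligned sequences with a sliding window and reports windows whose percent identity meets a threshold. The function name, the toy sequences, and the window size are hypothetical, and the sketch assumes the alignment has already been computed.

```python
def conserved_windows(aligned_a, aligned_b, window, threshold):
    """Return start positions of windows whose identity is >= threshold.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use '-' for gaps.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    hits = []
    for start in range(len(aligned_a) - window + 1):
        pairs = zip(aligned_a[start:start + window],
                    aligned_b[start:start + window])
        # A column counts as conserved only if both bases match and are not gaps.
        matches = sum(a == b and a != "-" for a, b in pairs)
        if matches / window >= threshold:
            hits.append(start)
    return hits


# Toy aligned sequences (made up for illustration).
seq_a = "ACGTACGTAA"
seq_b = "ACGTTTTTAA"
strict_hits = conserved_windows(seq_a, seq_b, window=4, threshold=1.0)
loose_hits = conserved_windows(seq_a, seq_b, window=4, threshold=0.75)
```

Lowering the threshold from 1.0 to 0.75 changes which windows are reported, which is the sensitivity to the threshold choice that the slide points out.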

7
Related work

Predicting gene expression from sequence
Michael A. Beer and Saeed Tavazoie
Cell, 2004, 117: 185-198
A successful application of existing
computational approaches to the study of the
yeast transcriptional regulatory network

8
Approach

Clustering (k-means): modules of coregulated
genes
Motif finding (AlignACE): putative regulatory
elements (TFBSs)
Bayesian network learning: regulation conditions
(motifs, positional and combinatorial constraints)
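
The clustering step can be sketched as plain k-means over expression profiles. This is a minimal illustration, not the setup used in the cited paper: the seeding rule (the first k profiles), the Euclidean distance, and the toy data are all assumptions.

```python
import math


def kmeans(profiles, k, iters=50):
    """Cluster expression profiles (equal-length lists of floats) into k groups."""
    # Assumption: seed centroids with the first k profiles so runs are deterministic.
    centroids = [list(p) for p in profiles[:k]]
    labels = [0] * len(profiles)
    for _ in range(iters):
        # Assignment step: each gene goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in profiles
        ]
        # Update step: each centroid becomes the mean of its member profiles.
        for c in range(k):
            members = [p for p, lab in zip(profiles, new_labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
    return labels, centroids


# Toy profiles: two genes low in both conditions, two genes high in both.
toy_profiles = [[0.0, 0.1], [5.0, 5.0], [0.2, 0.0], [5.1, 4.9]]
toy_labels, toy_centroids = kmeans(toy_profiles, k=2)
```

Genes that share a label would then form one candidate module whose upstream sequences are passed to the motif finder.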

9
Bayesian Network

Sequence features ($x_1, \ldots, x_n$) → expression
patterns ($e_i$)
Sequence feature ($x_i$): presence of motifs,
positional constraints, and combinatorial
constraints
Expression pattern ($e_i$): a binary, one-layer
network
Maximizing $P(e_i \mid x_1, \ldots, x_n)$, the
probability that genes with these sequence
features will participate in expression pattern i
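
As a rough illustration of the quantity $P(e_i \mid x_1, \ldots, x_n)$, the sketch below estimates it by counting over genes with binary sequence features. The function name, the pseudocount, and the toy data are assumptions made for this example; the cited work may parameterize the network differently.

```python
from collections import defaultdict


def fit_cpt(features, in_pattern, pseudocount=1.0):
    """Estimate P(e = 1 | x) for every observed combination of binary features.

    features:   one tuple of 0/1 sequence features per gene
    in_pattern: one 0/1 flag per gene (does the gene follow the pattern?)
    """
    counts = defaultdict(lambda: [0.0, 0.0])  # x -> [n(e=0), n(e=1)]
    for x, e in zip(features, in_pattern):
        counts[tuple(x)][int(e)] += 1
    # Smoothed estimate, so unseen outcomes do not get probability zero.
    return {
        x: (n1 + pseudocount) / (n0 + n1 + 2 * pseudocount)
        for x, (n0, n1) in counts.items()
    }


# Toy data: feature 0 = "motif A present", feature 1 = "motif B present".
toy_features = [(1, 1), (1, 1), (1, 0), (0, 0)]
toy_pattern = [1, 1, 0, 0]
cpt = fit_cpt(toy_features, toy_pattern)
```

In this toy table, genes carrying both motifs get the highest estimated probability of following the pattern.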

10
Properties

Easy to integrate all kinds of sequence features
Explicit sequence features
To keep complex networks from overfitting the
training data, a parameter that penalizes dense
networks is used.
The optimal network is learned greedily.
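
A minimal sketch of greedy learning with a density penalty follows. The score used here (smoothed log-likelihood minus a fixed penalty per added feature) is an assumption chosen for illustration, not the scoring function of the cited work, and all names are hypothetical.

```python
import math
from collections import defaultdict


def log_likelihood(features, labels, parents, pseudocount=1.0):
    """Smoothed log-likelihood of the 0/1 labels given the chosen parent features."""
    counts = defaultdict(lambda: [0.0, 0.0])
    for x, e in zip(features, labels):
        counts[tuple(x[j] for j in parents)][e] += 1
    ll = 0.0
    for x, e in zip(features, labels):
        n = counts[tuple(x[j] for j in parents)]
        ll += math.log((n[e] + pseudocount) / (n[0] + n[1] + 2 * pseudocount))
    return ll


def greedy_parents(features, labels, penalty):
    """Add one feature at a time while the penalized score keeps improving."""
    parents = []
    best = log_likelihood(features, labels, parents)  # empty network, no penalty
    while True:
        candidates = [j for j in range(len(features[0])) if j not in parents]
        if not candidates:
            break
        # Each added parent costs `penalty`, which discourages dense networks.
        score, j = max(
            (log_likelihood(features, labels, parents + [j])
             - penalty * (len(parents) + 1), j)
            for j in candidates
        )
        if score <= best:
            break
        parents.append(j)
        best = score
    return parents, best


# Toy data: feature 0 determines the label, feature 1 is uninformative.
toy_x = [(1, 0), (1, 1), (1, 0), (1, 1), (0, 0), (0, 1), (0, 0), (0, 1)]
toy_e = [1, 1, 1, 1, 0, 0, 0, 0]
chosen, chosen_score = greedy_parents(toy_x, toy_e, penalty=1.0)
```

Because the search is greedy, it only guarantees a local optimum of the penalized score.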

11
Motif finding approaches

12
MEME

Each sequence is broken up into all the
overlapping subsequences of length W that it
contains.
Two-component finite mixture model: motif (a
set of similar subsequences of fixed width) and
background (all other positions in the
sequences)
Motif model: each example of the motif is assumed
to be generated by a sequence of independent
multinomial random variables.
Background model: each position that is not
part of a motif is generated independently by a
multinomial random variable.
Maximize the likelihood of the model M given the
data D, $L(M \mid D) = p(D \mid M)$, using the EM
algorithm
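
The two-component mixture and its EM fit can be sketched as follows. This is a simplified illustration of the idea on the slide, not MEME's actual implementation: the seed-word initialization, the pseudocounts, the clamping of the mixing weight, and the toy sequences are all assumptions.

```python
ALPHABET = "ACGT"


def em_motif(seqs, w, seed_word, iters=30, pseudo=0.1):
    """Fit a motif/background mixture over all width-w windows with EM."""
    subs = [s[i:i + w] for s in seqs for i in range(len(s) - w + 1)]
    # Assumption: start the motif model from a seed word of length w.
    motif = [{a: (0.7 if a == c else 0.1) for a in ALPHABET} for c in seed_word]
    background = {a: 0.25 for a in ALPHABET}
    lam = len(seqs) / len(subs)  # initial guess: one motif site per sequence
    for _ in range(iters):
        # E-step: posterior probability that each window came from the motif.
        z = []
        for x in subs:
            pm, pb = lam, 1.0 - lam
            for j, c in enumerate(x):
                pm *= motif[j][c]
                pb *= background[c]
            z.append(pm / (pm + pb))
        # M-step: re-estimate the mixing weight, motif columns, and background.
        lam = min(max(sum(z) / len(subs), 1e-6), 1.0 - 1e-6)
        for j in range(w):
            col = {a: pseudo for a in ALPHABET}
            for x, zi in zip(subs, z):
                col[x[j]] += zi
            norm = sum(col.values())
            motif[j] = {a: v / norm for a, v in col.items()}
        bg = {a: pseudo for a in ALPHABET}
        for x, zi in zip(subs, z):
            for c in x:
                bg[c] += 1.0 - zi
        norm = sum(bg.values())
        background = {a: v / norm for a, v in bg.items()}
    consensus = "".join(max(col, key=col.get) for col in motif)
    return consensus, motif, lam


# Toy sequences with the word TGACTC planted once in each.
toy_seqs = ["AAAATGACTCAAAA", "CCTGACTCCCCCCC", "GGGGGGTGACTCGG", "TTTGACTCTTTTTT"]
consensus, motif_model, mix_weight = em_motif(toy_seqs, w=6, seed_word="TGACTC")
```

Each window contributes to both components in proportion to its posterior weight, which is the weighted average that distinguishes EM from sampling a single alignment.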

13
Gibbs motif sampler

Works with a specific model alignment rather
than a weighted average, as EM does.
Iteratively samples motif models (or possibly
the background model) for each subsequence and
thereby partitions motif-encoding regions into
different motifs.
An iterative heuristic method that combines
gradient search steps with random jumps in the
search space; it is therefore not guaranteed to
reach the optimum, but it is less likely than EM
to get stuck at local maxima.
Identifies the most probable motif models by
locating the optimal alignments, which maximize
the ratios of the corresponding target
probabilities to the background probabilities
(the MAP, or maximum a posteriori, score).
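
A minimal single-motif site sampler is sketched below to illustrate the sampling idea. It is simpler than the multi-motif sampler described on the slide: it assumes exactly one site per sequence, a uniform background, and a plain log-odds score in place of a full MAP score. All names and parameters are hypothetical.

```python
import math
import random

ALPHABET = "ACGT"


def profile(sites, pseudo=0.5):
    """Column-wise base frequencies of the aligned sites, with pseudocounts."""
    cols = []
    for j in range(len(sites[0])):
        col = {a: pseudo for a in ALPHABET}
        for s in sites:
            col[s[j]] += 1
        total = sum(col.values())
        cols.append({a: v / total for a, v in col.items()})
    return cols


def log_odds(site, prof, bg):
    """Log ratio of the motif probability to the background probability."""
    return sum(math.log(prof[j][c] / bg[c]) for j, c in enumerate(site))


def gibbs_sampler(seqs, w, iters=200, seed=0):
    """Sample one site per sequence; needs at least two sequences."""
    rng = random.Random(seed)
    bg = {a: 0.25 for a in ALPHABET}  # assumption: uniform background
    pos = [rng.randrange(len(s) - w + 1) for s in seqs]
    best_pos, best_score = list(pos), -math.inf
    for _ in range(iters):
        for i, s in enumerate(seqs):
            # Build the motif model from every sequence except the current one.
            others = [t[p:p + w]
                      for k, (t, p) in enumerate(zip(seqs, pos)) if k != i]
            prof = profile(others)
            # Sample a new site in proportion to its motif/background ratio.
            weights = [math.exp(log_odds(s[p:p + w], prof, bg))
                       for p in range(len(s) - w + 1)]
            pos[i] = rng.choices(range(len(weights)), weights=weights)[0]
        # Score the current alignment and remember the best one seen so far.
        prof = profile([t[p:p + w] for t, p in zip(seqs, pos)])
        score = sum(log_odds(t[p:p + w], prof, bg) for t, p in zip(seqs, pos))
        if score > best_score:
            best_pos, best_score = list(pos), score
    return best_pos, best_score


toy_seqs = ["AAAATGACTCAAAA", "CCTGACTCCCCCCC", "GGGGGGTGACTCGG", "TTTGACTCTTTTTT"]
site_positions, site_score = gibbs_sampler(toy_seqs, w=6)
```

Because each site is drawn at random rather than chosen greedily, the sampler can leave a poor alignment, but a run of fixed length may still end away from the best one; keeping the best-scoring alignment seen is a common safeguard.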

14
Future work

An ab initio approach to motif finding from gene
expression and sequence data, trying new
heuristic or statistical models.
Integrating prior knowledge (e.g., GO) to
facilitate identification of regulatory elements
and transcriptional networks.
