Detecting Subtle Sequence Signals: a Gibbs Sampling Strategy for Multiple Alignment
1
Detecting Subtle Sequence Signals: a Gibbs
Sampling Strategy for Multiple Alignment
  • Lawrence, Altschul, Boguski, Liu, Neuwald,
    Wootton, Science, 1993

2
Motif Finding Problem
  • Biological description: given a set of sequences,
    find the motif shared by all or most of the
    sequences; the motif's starting position in each
    sequence is unknown
  • Assumptions
  • The motif appears exactly once in each sequence
  • The motif has a fixed length

3
Generative Model
  • Suppose the sequences are aligned; the aligned
    regions are generated from a motif model (a
    position-specific multinomial distribution)
  • The unaligned regions are generated from a
    background model

4
Notations
  • Set of symbols
  • Sequences: S = {S1, S2, …, SN}
  • Starting positions of motifs: A = {a1, a2, …, aN}
  • Motif model (θ): qij = P(symbol at the i-th
    position = j)
  • Background model: pj = P(symbol = j)
  • Count of symbols in each column: cij = count of
    symbol j in the i-th column of the aligned
    region
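The count matrix cij can be tallied directly from the aligned windows. A minimal sketch (the function name, toy sequences, and offsets are illustrative, not from the slides):

```python
# Tally c[i][j]: occurrences of symbol j in column i of the aligned
# motif windows, given each sequence's starting position.
def count_matrix(sequences, starts, W, alphabet="ACGT"):
    counts = [{sym: 0 for sym in alphabet} for _ in range(W)]
    for seq, a in zip(sequences, starts):
        window = seq[a:a + W]          # the aligned region of this sequence
        for i, sym in enumerate(window):
            counts[i][sym] += 1
    return counts

seqs = ["TACGTT", "GACGTA", "AACGTC"]
c = count_matrix(seqs, [1, 1, 1], 4)   # every window is "ACGT"
# c[0]["A"] == 3, c[1]["C"] == 3, c[2]["G"] == 3, c[3]["T"] == 3
```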

5
Motif Finding Problem
  • Problem: find the starting positions and the
    model parameters simultaneously so as to maximize
    the posterior probability P(θ, A | S)
  • By Bayes' theorem this is equivalent to
    maximizing the likelihood P(S | θ, A), assuming a
    uniform prior distribution

6
Equivalent Scoring Function
  • Maximize the log-odds ratio
    F = Σi Σj cij log(qij / pj)
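In code, the log-odds score reads as follows (a sketch; the per-column dict layout for `counts` and `q` and the toy values are my assumptions):

```python
import math

# F = sum_i sum_j c_ij * log(q_ij / p_j): the log-odds of the aligned
# columns under the motif model versus the background model.
def log_odds(counts, q, p):
    F = 0.0
    for i, col in enumerate(counts):
        for j, c_ij in col.items():
            if c_ij > 0:               # zero counts contribute nothing
                F += c_ij * math.log(q[i][j] / p[j])
    return F

# Toy example (values illustrative): one strongly A-biased column.
counts = [{"A": 9, "C": 1}]
q = [{"A": 0.9, "C": 0.1}]
p = {"A": 0.5, "C": 0.5}
F = log_odds(counts, q, p)   # positive: the motif model fits better
```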

7
Sampling and Optimization
  • To maximize a function f(x):
  • Brute-force method: try all possible x
  • Sampling method: sample x from a probability
    distribution p(x) ∝ f(x)
  • Idea: if xmax is the argmax of f(x), then it is
    also the argmax of p(x), so we have a high
    probability of selecting xmax
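A discrete illustration of this idea: drawing x with probability proportional to f(x) concentrates the draws on the maximizer (a sketch; the peaked toy function is my own):

```python
import random

def sample_proportional(xs, f, rng=random.Random(0)):
    weights = [f(x) for x in xs]          # p(x) ∝ f(x), unnormalized
    return rng.choices(xs, weights=weights, k=1)[0]

xs = list(range(10))
f = lambda x: (x == 7) * 100 + 1          # sharply peaked at x = 7
draws = [sample_proportional(xs, f) for _ in range(200)]
# the argmax of f is also the argmax of p, so 7 dominates the sample
```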

8
Detailed Balance of Markov Chain
  • To sample from a probability distribution p(x),
    we set up a Markov chain s.t. each state
    represents a value of x, and for any two states x
    and y the transition probabilities satisfy the
    detailed balance condition
    p(x) T(x→y) = p(y) T(y→x)
  • This then implies that p is the stationary
    distribution of the chain
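The implication can be written out in one line by summing the detailed-balance condition over x (a standard derivation, reconstructed here because the slide's equations did not survive the transcript):

```latex
% Detailed balance, for all states x, y:
p(x)\,T(x \to y) \;=\; p(y)\,T(y \to x)
% Summing over x, and using \sum_x T(y \to x) = 1:
\sum_x p(x)\,T(x \to y) \;=\; p(y) \sum_x T(y \to x) \;=\; p(y)
```

so p is preserved by one step of the chain, i.e. it is the stationary distribution.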

9
Gibbs Sampling
  • Idea: the joint distribution may be hard to
    sample from, but it may be easy to sample from
    the conditional distributions in which all
    variables are fixed except one
  • To sample from p(x1, x2, …, xn), let each state
    of the Markov chain represent (x1, x2, …, xn);
    the probability of moving to the state that
    changes only the i-th coordinate is
    p(xi | x1, …, xi−1, xi+1, …, xn). Then detailed
    balance is satisfied.
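A minimal Gibbs sampler for a toy two-variable joint distribution, updating one coordinate at a time from its conditional (the probability table and all names are illustrative):

```python
import random

# Toy joint p(x1, x2) on {0,1} x {0,1}, given as a table.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def conditional(i, state):
    """p(x_i | the other variable fixed), renormalized from the joint."""
    other = state[1 - i]
    w = []
    for v in (0, 1):
        key = (v, other) if i == 0 else (other, v)
        w.append(joint[key])
    z = sum(w)
    return [wi / z for wi in w]

rng = random.Random(0)
state = [0, 0]
counts = {k: 0 for k in joint}
for t in range(20000):
    i = t % 2                         # update each coordinate in turn
    probs = conditional(i, state)
    state[i] = rng.choices((0, 1), weights=probs, k=1)[0]
    counts[tuple(state)] += 1

freq = {k: v / 20000 for k, v in counts.items()}
# freq approximates the joint table, even though we only ever
# sampled from the one-variable conditionals
```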

10
Gibbs Sampling
11
Gibbs Sampling in Motif Finding
  • We should sample from the joint distribution of A
    and θ, but θ is high-dimensional, so this is
    inefficient.
  • Instead, we draw samples from p(az | A, S)
  • It can be shown that computing the conditional
    probability p(az | A, S) is roughly equivalent to
    computing the estimator of θ from A and then
    computing the probability of generating az
    according to that θ.

12
Estimator of θ
  • Given an alignment A, i.e. the starting positions
    of the motifs, θ can be estimated by its MLE with
    smoothing: qij = (cij + bj) / (N − 1 + B), where
    B = Σj bj (equivalently, a Dirichlet prior with
    parameters bj)
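The smoothed estimate is a one-liner per column; in the algorithm below one sequence is held out, so the denominator counts N − 1 sequences. A sketch (function and variable names are mine):

```python
def estimate_theta(counts, n_seqs, b):
    """q[i][j] = (c_ij + b_j) / (n_seqs + B), with B = sum of the
    pseudocounts b_j -- the Dirichlet-prior smoothing from the slide."""
    B = sum(b.values())
    q = []
    for col in counts:
        q.append({j: (col.get(j, 0) + b[j]) / (n_seqs + B) for j in b})
    return q

b = {"A": 0.5, "C": 0.5, "G": 0.5, "T": 0.5}   # illustrative pseudocounts
counts = [{"A": 3}]            # one column, 3 sequences, all showing A
q = estimate_theta(counts, 3, b)
# q[0]["A"] = (3 + 0.5) / (3 + 2) = 0.7; unseen symbols get 0.5 / 5 = 0.1
```

Without the pseudocounts, any symbol absent from a column would get probability zero and veto every window containing it during sampling.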

13
Algorithm
  • Randomly initialize A0
  • Repeat:
  • (1) randomly choose a sequence z from S;
    set A ← At \ {az};
    compute θt, the estimator of θ given S and A
  • (2) sample az = x according to P(az = x), which
    is proportional to Qx / Px;
    update At+1 ← A ∪ {x}
  • Finally, select the At that maximizes F

Qx is the probability of generating x according to
θt; Px is the probability of generating x
according to the background model
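Putting the loop together, a compact sketch of the whole sampler (DNA alphabet, uniform background, fixed width; the toy data and all names are illustrative, and a real implementation would add the paper's convergence and phase-shift refinements):

```python
import math, random

ALPHABET = "ACGT"

def estimate(seqs, starts, W, b=0.5):
    """Smoothed motif model q and uniform background p from an alignment."""
    q = [{s: b for s in ALPHABET} for _ in range(W)]
    for seq, a in zip(seqs, starts):
        for i, sym in enumerate(seq[a:a + W]):
            q[i][sym] += 1
    denom = len(seqs) + 4 * b
    q = [{s: col[s] / denom for s in ALPHABET} for col in q]
    p = {s: 0.25 for s in ALPHABET}   # uniform background for simplicity
    return q, p

def gibbs_motif(seqs, W, iters=200, rng=random.Random(0)):
    starts = [rng.randrange(len(s) - W + 1) for s in seqs]
    best, best_score = list(starts), -math.inf
    for t in range(iters):
        z = rng.randrange(len(seqs))                 # hold one sequence out
        rest_seqs = seqs[:z] + seqs[z + 1:]
        rest_starts = starts[:z] + starts[z + 1:]
        q, p = estimate(rest_seqs, rest_starts, W)
        # Weight every candidate start x in sequence z by Qx / Px.
        weights = []
        for x in range(len(seqs[z]) - W + 1):
            w = seqs[z][x:x + W]
            Qx = math.prod(q[i][sym] for i, sym in enumerate(w))
            Px = math.prod(p[sym] for sym in w)
            weights.append(Qx / Px)
        starts[z] = rng.choices(range(len(weights)), weights=weights, k=1)[0]
        # Track the best alignment seen so far, scored by the log-odds F.
        q, p = estimate(seqs, starts, W)
        F = sum(math.log(q[i][sym] / p[sym])
                for seq, a in zip(seqs, starts)
                for i, sym in enumerate(seq[a:a + W]))
        if F > best_score:
            best, best_score = list(starts), F
    return best

seqs = ["ATGGGGCA", "CGGGGTAT", "TAAGGGGT"]   # "GGGG" planted at 2, 1, 3
starts = gibbs_motif(seqs, 4)
```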