1
Bayesian inference
  • Based on "Bayesian inference using Markov chain
    Monte Carlo in phylogenetic studies" by Torbjörn
    Karfunkel

Presented by Amir Hadadi, Bioinformatics seminar,
spring 2005
2
What is Bayesian inference?
  • Definition: an approach to statistics in which
    all forms of uncertainty are expressed in terms
    of probability (Radford M. Neal)

3
Probability reminder
  • Conditional probability:
  • P(D ∩ T) = P(D|T) · P(T)
  • P(D ∩ T) = P(T|D) · P(D)

Bayes' theorem: P(T|D) = P(D|T) · P(T) / P(D)
  • P(T|D) is called the posterior probability of T
  • P(T) is the prior probability, that is, the
    probability assigned to T before seeing the data
  • P(D|T) is the likelihood of T, which is what we
    try to maximize in ML
  • P(D) is the probability of observing the data D,
    regardless of which tree is correct

4
Posterior vs. likelihood probabilities: Bayesian
inference vs. maximum likelihood
  • 100 dice
  • some fair, some biased

Observation   Fair   Biased
1             1/6    1/21
2             1/6    2/21
3             1/6    3/21
4             1/6    4/21
5             1/6    5/21
6             1/6    6/21

(For the biased dice, P(face k) = k/21.)
5
Example continued
  • A die is drawn at random from the box
  • Rolling the die twice gives us a 4 and a 6
  • Using the ML approach we get:
  • P(4, 6 | Fair) = 1/6 × 1/6 ≈ 0.028
  • P(4, 6 | Biased) = 4/21 × 6/21 ≈ 0.054
  • ML conclusion: the die is biased

6
Example continued further
  • Assume we have prior knowledge about the
    distribution of dice inside the box
  • We know that the box contains 90 fair dice and
    10 biased dice

7
Example conclusion
  • Prior probabilities: fair 0.9, biased 0.1
  • Rolling the die twice gives us a 4 and a 6
  • Using the Bayesian approach we get:
  • P(Biased | 4, 6) = P(4, 6 | Biased) · P(Biased) / P(4, 6) ≈ 0.179
  • B.I. conclusion: the die is fair
  • Conclusion: ML and BI do not necessarily agree
  • How closely BI results resemble ML results
    depends on the strength of the prior assumptions
    we introduce (a sketch of this computation
    follows below)
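A minimal Python sketch of this comparison (the die
faces 4 and 6, the biased-die model P(face k) = k/21,
and the 90/10 prior all come from the example above):

    # Likelihoods of observing a 4 and a 6
    fair_lik = (1/6) * (1/6)        # P(4, 6 | Fair)   ~ 0.028
    biased_lik = (4/21) * (6/21)    # P(4, 6 | Biased) ~ 0.054
    # ML compares likelihoods only, so it concludes "biased"

    prior_fair, prior_biased = 0.9, 0.1    # 90 fair, 10 biased dice
    evidence = fair_lik * prior_fair + biased_lik * prior_biased  # P(4, 6)
    post_biased = biased_lik * prior_biased / evidence  # Bayes' theorem
    print(round(post_biased, 3))    # 0.179, so the die is probably fair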

8
Steps in B.I.
  • formulate a model of the problem
  • Formulate a prior distribution which captures
    your beliefs before seeing the data
  • Obtain posterior distribution for the model
    parameters

9
B.I. in phylogenetic reconstruction
  • Phylogenetic reconstruction:
  • Finding an evolutionary tree which explains the
    data (observed species)
  • Methods of phylogenetic reconstruction:
  • Using a model of sequence evolution, e.g. maximum
    likelihood
  • Not using a model of sequence evolution, e.g.
    maximum parsimony, neighbor joining, etc.
  • Bayesian inference belongs to the first category

10
Bayesian inference vs. Maximum likelihood
  • The basic question in Bayesian inference:
  • What is the probability that this model (T) is
    correct, given the data (D) that we have
    observed?
  • Maximum likelihood asks a different question:
  • What is the probability of seeing the observed
    data (D) given that a certain model (T) is true?
  • B.I. seeks P(T|D), while ML maximizes P(D|T)

11
Which priors should we assume?
  • Knowledge about a parameter can be used to
    approximate its prior distribution
  • Usually we don't have prior knowledge about a
    parameter's distribution. In this case a flat or
    vague prior is assumed (illustrated below).

12
[Figures: examples of a flat prior and a vague prior]
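A small sketch of what such priors might look like
(the specific choices here, a uniform density and a
very wide normal density, are my own illustration,
not from the slides):

    import numpy as np

    # Flat prior: every value in the allowed range is equally likely.
    def flat_prior(x, lo=0.0, hi=1.0):
        return np.where((x >= lo) & (x <= hi), 1.0 / (hi - lo), 0.0)

    # Vague prior: proper but weakly informative, e.g. a normal
    # density with a very large standard deviation.
    def vague_prior(x, mu=0.0, sigma=100.0):
        return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))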
13
How to find the posterior probability P(T|D)?
  • P(T) is the assumed prior
  • P(D|T) is the likelihood
  • Finding P(D) is infeasible: we would need to sum
    P(D|T) · P(T) over the entire tree space
  • Markov chain Monte Carlo (MCMC) gives us an
    indirect way of finding P(T|D) without having to
    calculate P(D)

14
MCMC Example
[Figure: a toy two-state chain; each move between the
two pictured states (Palestine, Tree) is proposed with
probability 1/2, and sampling the chain's path gives
P(Palestine) = 3/7, P(Tree) = 4/7]
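A tiny simulation in the same spirit (the Metropolis
proposal/acceptance scheme here is my own choice of a
chain whose stationary distribution matches the
frequencies 3/7 and 4/7 on the slide):

    import random
    random.seed(0)

    target = {"Palestine": 3/7, "Tree": 4/7}   # desired stationary distribution
    state = "Palestine"
    counts = {"Palestine": 0, "Tree": 0}
    for _ in range(100_000):
        if random.random() < 0.5:              # propose the other state with P = 1/2
            proposal = "Tree" if state == "Palestine" else "Palestine"
            if random.random() < min(1.0, target[proposal] / target[state]):
                state = proposal               # Metropolis acceptance
        counts[state] += 1
    print({s: c / 100_000 for s, c in counts.items()})
    # ~{'Palestine': 0.43, 'Tree': 0.57}, i.e. 3/7 and 4/7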
15
Symmetric simple random walk
  • Definition: a sequence of steps in ℤ, starting at
    0 and moving one step left or right with
    probability 1/2
  • Properties:
  • After n steps the average distance from 0 is of
    magnitude √n (simulated below)
  • A random walk in one or two dimensions is
    recurrent
  • A random walk in three dimensions or more is
    transient
  • Brownian motion is a limit of a random walk
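A quick check of the √n property (a sketch; "average
distance" is measured here as the root-mean-square
displacement over many simulated walks):

    import random
    random.seed(1)

    def rms_distance(n_steps, n_walks=2000):
        total = 0.0
        for _ in range(n_walks):
            pos = sum(random.choice((-1, 1)) for _ in range(n_steps))
            total += pos * pos
        return (total / n_walks) ** 0.5

    for n in (100, 400, 1600):
        # ~10, ~20, ~40: quadrupling n doubles the distance
        print(n, round(rms_distance(n)))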

16
Definition of a Markov chain
  • A special type of stochastic process
  • A sequence of random variables X0, X1, X2, ...
    such that:
  • Each Xi takes values in a state space S = {s1,
    s2, ...}
  • If x0, x1, ..., xn+1 are elements of S, then
    P(Xn+1 = xn+1 | Xn = xn, Xn-1 = xn-1, ..., X0 = x0)
      = P(Xn+1 = xn+1 | Xn = xn)
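In other words, the next state depends only on the
current state. A minimal sketch (this two-state
transition matrix is a made-up example):

    import random
    random.seed(2)

    # P[i][j] = P(next state = j | current state = i)
    P = [[0.9, 0.1],
         [0.3, 0.7]]
    state, path = 0, []
    for _ in range(10):
        # the transition law looks only at the current state
        state = random.choices((0, 1), weights=P[state])[0]
        path.append(state)
    print(path)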

17
Using MCMC to calculate posterior probabilities
  • Set S = the set of parameters (e.g. tree
    topology, mutation probability, branch lengths,
    etc.)
  • Construct an MCMC with a stationary distribution
    equal to the posterior probability of the
    parameters
  • Run the chain for a long time and sample from it
    regularly
  • Use the samples to estimate the stationary
    distribution

18
Constructing our MCMC
  • The state space S is defined as the parameter
    space
  • Start with a random tree and random parameter
    values
  • In each new generation, randomly propose either:
  • A new tree topology, or
  • A new value for a model parameter
  • If the proposed state has a higher posterior
    probability, πproposed, than the current state,
    πcurrent, the transition is accepted
  • Otherwise the transition is accepted with
    probability πproposed / πcurrent (see the sketch
    below)
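A minimal sketch of this acceptance rule (the target
here is a toy one-dimensional density standing in for
P(D|T) · P(T); a real phylogenetic state would bundle
a topology, branch lengths and model parameters):

    import math, random
    random.seed(3)

    def log_posterior(theta):
        # stand-in for log(P(D|T) * P(T)); note that P(D) never
        # appears: it cancels in the ratio pi_proposed / pi_current
        return -0.5 * theta ** 2

    theta, samples = 0.0, []
    for gen in range(50_000):
        cand = theta + random.uniform(-0.5, 0.5)   # symmetric proposal
        # accept if better; otherwise with probability pi_prop / pi_cur
        if math.log(random.random()) < log_posterior(cand) - log_posterior(theta):
            theta = cand
        if gen % 100 == 0:                         # sample the chain regularly
            samples.append(theta)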

19
Algorithm visualization
20
Convergence issues
  • An MCMC might run for a long time until its
    sampled distribution is close to the stationary
    distribution
  • The initial convergence phase is called the
    burn-in phase
  • We wish to minimize burn-in time

21
Avoiding getting stuck on local maxima
  • Assume our landscape looks like this:
[Figure: a posterior landscape with several local
maxima]

22
Avoiding local maxima (contd)
  • Descending a maximum can take a long time
  • MCMCMC (Metropolis-coupled MCMC) speeds up the
    chain's mixing rate
  • Instead of running a single chain, multiple
    chains are run simultaneously
  • The chains are heated to different degrees

23
Chain heating
The cold chain has stationary distribution P(T|D).
Heated chain number i has stationary distribution
P(T|D)^(1/i): raising the posterior to a power below
1 flattens the landscape, so hot chains cross valleys
more easily.
24
The MC3 algorithm
  • Run multiple heated chains
  • At each generation, attempt a swap between two
    chains
  • If the swap is accepted, the hotter and cooler
    chains swap states
  • Sample only from the cold chain (a sketch follows
    below)
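A compact sketch of MC3 on the same toy target as
before (the swap acceptance ratio follows from
detailed balance; chain i+1 uses the heated density
P(T|D)^(1/(i+1)) as on the previous slide):

    import math, random
    random.seed(4)

    def log_post(theta):               # toy stand-in for log P(T|D)
        return -0.5 * theta ** 2

    n_chains = 4
    heat = [1.0 / (i + 1) for i in range(n_chains)]  # chain 0 is the cold chain
    states = [0.0] * n_chains
    cold_samples = []
    for gen in range(20_000):
        for i in range(n_chains):      # one Metropolis update per chain
            cand = states[i] + random.uniform(-0.5, 0.5)
            if math.log(random.random()) < heat[i] * (log_post(cand) - log_post(states[i])):
                states[i] = cand
        i, j = random.sample(range(n_chains), 2)     # attempt one swap
        log_r = (heat[i] - heat[j]) * (log_post(states[j]) - log_post(states[i]))
        if math.log(random.random()) < log_r:
            states[i], states[j] = states[j], states[i]
        cold_samples.append(states[0]) # sample only from the cold chain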

25
Drawing conclusions
  • To decide the value of a parameter:
  • Draw a histogram showing the number of sampled
    values in each interval and calculate the mean,
    mode, credibility intervals, etc.
  • To find the most likely tree topologies:
  • Sort all sampled trees according to their
    posterior probabilities
  • Pick the most probable trees until the cumulative
    probability is 0.95
  • To check whether a certain group of organisms is
    monophyletic:
  • Find the number of sampled trees in which it is
    monophyletic
  • If it is monophyletic in 74% of the trees, it has
    a 74% probability of being monophyletic
  • (a sketch of these summaries follows below)
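A sketch of the last two summaries (the topology IDs
and the set of trees containing the clade are made-up
placeholders for real MCMC output):

    from collections import Counter

    sampled_trees = ["T1", "T2", "T1", "T3", "T1",
                     "T2", "T1", "T2", "T4", "T1"]   # hypothetical samples
    n = len(sampled_trees)
    counts = Counter(sampled_trees)

    # 95% credible set: most probable topologies until their
    # cumulative posterior probability reaches 0.95
    credible, cum = [], 0.0
    for topo, c in counts.most_common():
        credible.append(topo)
        cum += c / n
        if cum >= 0.95:
            break
    print("95% credible set:", credible)

    # Clade support = fraction of samples in which the clade is monophyletic
    monophyletic_in = {"T1", "T2"}                   # hypothetical
    print("P(monophyletic) =", sum(t in monophyletic_in for t in sampled_trees) / n)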

26
Summary
  • Bayesian inference is very popular in many fields
    that rely on statistical inference
  • The advent of fast computers gave rise to the use
    of MCMC in B.I., enabling multi-parameter
    analysis
  • Fields of genomics using Bayesian methods:
  • Identification of SNPs
  • Inferring levels of gene expression and
    regulation
  • Association mapping
  • Etc.

27
THE END
28
A sample histogram
[Figure: a histogram of sampled values]