Bioinformatics and Graphical Models: Computation, approximation, and their value

Bioinformatics and Graphical Models: Computation, approximation, and their value. MSR: Nebojsa Jojic, Vladimir Jojic, Chris Meek, David Heckerman

Slides: 20
Provided by: researchM6
1
Bioinformatics and Graphical Models: Computation,
approximation, and their value
  • MSR: Nebojsa Jojic, Vladimir Jojic, Chris Meek, David
    Heckerman
  • UW: Jim Mullins, Mark Jensen, Jerry Learn

2
Overview
  • Computational cost of usual algorithms
  • State of the art
  • Phylogeny alignment
  • Phylogeny sequence modeling
  • Approximations and their pitfalls
  • Recombination
  • Analogy to other ML domains
  • Graphical model
  • Experiments and computational cost
  • Value of the computation
  • Potential applications
  • Drug discovery cycle
  • Value of time and clinical success
  • Market size and growth
  • Discussion

3
Rational vaccine design (Jim Mullins et al.)
  • Rational design
    • Analysis of sequences to form a model of virus
      evolution (phylogenies, etc.)
    • Develop vaccines that target as much variability
      as possible
  • Traditional design
    • Trial and error
    • Educated guesses

4
State-of-the-art sequence analysis programs
  • Example
  • Rational AIDS vaccine design
  • Analysis of the envelope gene from a single
    patient in one visit
  • 200 sequences with 600 base pairs each
  • Overnight to align
  • 1-2 hours to 2-3 days to build a tree, depending
    on how much search you are willing to do
  • This does not include modeling the inter-sequence
    dependencies, coupling alignment and tree search,
    and it ignores recombination
  • The total length of the HIV genome is about 10,000
    bases, and the number of samples is practically
    limited only by cost

5
Computational cost of a slightly more detailed
analysis
  • Metropolis search over all trees on 400 sequences
    of the full genome (10k) would last around 2
    years on one machine
  • Exact search intractable!
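
A minimal sketch of why exhaustive tree search is out of reach, assuming the search space is the set of unrooted binary tree topologies on labelled sequences (the slides do not say which tree space was used). The count for n leaves is the standard double factorial (2n-5)!!; the function name is illustrative only.

```python
def num_unrooted_binary_trees(n_leaves: int) -> int:
    """Count unrooted binary tree topologies on n labelled leaves: (2n-5)!!."""
    if n_leaves < 3:
        return 1  # with fewer than three leaves there is a single topology
    count = 1
    # Multiply the odd numbers 3 * 5 * ... * (2n - 5).
    for k in range(3, 2 * n_leaves - 4, 2):
        count *= k
    return count


if __name__ == "__main__":
    # 200 and 400 are the sequence counts quoted on the slides.
    for n in (4, 5, 10, 200, 400):
        trees = num_unrooted_binary_trees(n)
        print(f"{n:>3} sequences: a {len(str(trees))}-digit number of topologies")
```

Under this assumption, even a sampler such as Metropolis search can visit only a vanishing fraction of the topologies, which is consistent with the slide's point that exact search is intractable.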

6
Approximation
  • Free energy as a bound on negative log-likelihood
  • Computation and approximation of the free energy
  • Iterative conditional modes
  • Mean-field method
  • Structured variational techniques
  • (Loopy) belief propagation
  • Sampling techniques
  • How tight is the bound?
  • What does the looseness translate to?
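
The bound can be illustrated on a toy problem. The sketch below is not the presenters' model: it uses a hypothetical 2x2 joint table over two coupled binary hidden variables (for one fixed observation x) and fits a fully factorised mean-field distribution by coordinate updates. The free energy F(q) = E_q[log q(h) - log p(h, x)] always stays above -log p(x), and the gap equals KL(q || p(h | x)).

```python
import math

# Hypothetical joint p(h1, h2, x) for one fixed observation x.
# The two hidden binary variables are strongly coupled given x.
JOINT = [[0.40, 0.02],
         [0.02, 0.16]]


def neg_log_likelihood(joint):
    """-log p(x), obtained by summing out both hidden variables."""
    return -math.log(sum(sum(row) for row in joint))


def free_energy(q1, q2, joint):
    """F(q) for a factorised q(h1, h2) = q1(h1) * q2(h2)."""
    total = 0.0
    for i in range(2):
        for j in range(2):
            q = q1[i] * q2[j]
            if q > 0.0:
                total += q * (math.log(q) - math.log(joint[i][j]))
    return total


def normalise(weights):
    s = sum(weights)
    return [w / s for w in weights]


def mean_field(joint, q1=(0.6, 0.4), q2=(0.6, 0.4), sweeps=50):
    """Coordinate updates: each factor is set to exp(E_other[log p]), renormalised."""
    q1, q2 = list(q1), list(q2)
    for _ in range(sweeps):
        q1 = normalise([math.exp(sum(q2[j] * math.log(joint[i][j]) for j in range(2)))
                        for i in range(2)])
        q2 = normalise([math.exp(sum(q1[i] * math.log(joint[i][j]) for i in range(2)))
                        for j in range(2)])
    return q1, q2


nll = neg_log_likelihood(JOINT)
f_init = free_energy([0.6, 0.4], [0.6, 0.4], JOINT)
q1_fit, q2_fit = mean_field(JOINT)
f_fit = free_energy(q1_fit, q2_fit, JOINT)

if __name__ == "__main__":
    print(f"-log p(x)            = {nll:.4f}")
    print(f"F at initial q       = {f_init:.4f}")
    print(f"F after mean field   = {f_fit:.4f}")
    print(f"gap (KL to posterior) = {f_fit - nll:.4f}")
```

Because the exact posterior in this table is not a product of two marginals, no factorised q can close the gap. That leftover gap is one concrete reading of the "looseness" asked about above; structured variational families trade extra computation for a smaller gap.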

7
An example of the approximation issues
8
An example of the approximation issues
9
An example of the approximation issues: Tightness
of the bounds
[Figure labels: Variational technique; Exact EM algorithm]
10
Recombination
  • In HIV, the rate of recombination has recently
    been estimated to be ¼ of the rate of mutation!
  • Combinatorial explosion in inference
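
A rough way to see the explosion, under a deliberately simplified assumption that is not taken from the slides: treat a recombinant sequence as a mosaic of segments, each copied from one of several candidate parent sequences, and count the possible mosaics with a bounded number of breakpoints. The function and parameter names are illustrative only.

```python
from math import comb


def mosaic_configurations(seq_len: int, n_parents: int, max_breakpoints: int) -> int:
    """Count parent assignments of a length-seq_len mosaic with at most
    max_breakpoints switches between n_parents candidate parents."""
    total = 0
    for b in range(max_breakpoints + 1):
        # Choose b breakpoint positions among the seq_len - 1 gaps between sites,
        # pick a parent for the first segment, then a different parent after
        # every breakpoint.
        total += comb(seq_len - 1, b) * n_parents * (n_parents - 1) ** b
    return total


if __name__ == "__main__":
    # 600 sites and 200 candidate parents echo the sizes quoted earlier in the deck.
    for b in (0, 1, 2, 5):
        n = mosaic_configurations(600, 200, b)
        print(f"up to {b} breakpoints: a {len(str(n))}-digit number of configurations")
```

With no breakpoints the count is just the number of parents. Allowing every gap to be a breakpoint recovers n_parents ** seq_len, so the space grows very quickly as more recombination events are permitted.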

11
Similar situations in other domains where
graphical models work well
  • Occlusion in video
  • Source interaction in audio
  • Composition of images

12
Occlusion in audio
[Figure labels: Speaker1, Speaker2; M, 1-M; Retrieved Speaker1, Retrieved Speaker2]
13
Epitome of an image
[Figure labels: A set of image patches; Input image; Epitome]
14
Layers from a single photograph
[Figure: graphical model with nodes es, em, s1, s2, M, x]
15
Modeling alignment and recombination by learning
a library of gene patterns
16
Experimental results
17
Value of computation (from Tufts Center)
18
Growth
  • Human viruses
    • West Nile
    • SARS
    • Hepatitis C
    • Polio
  • Animal viruses
    • FIV
    • Pig, chicken and cow viruses
  • Most bacterial diseases
  • Parasitic diseases
  • The first sign of success of rational design
    might trigger a great increase in the number of
    diseases tackled

19
How can MS/MSR be involved?
  • MS: Architecture, platform, tools
    • Storage, transmission, computation
    • E.g., parallelizable computation on a single
      machine; peer-to-peer networks for parallel
      computation on multiple machines
  • MSR
    • Helping to speed up the scientific progress
      that leads to new opportunities for growth
    • Advising MS on the research direction in the
      community and the future requirements for the
      platform