1
Parallel & Distributed Systems and Algorithms for
Inference of Large Phylogenetic Trees with
Maximum Likelihood
  • Alexandros Stamatakis
  • LRR, TU München
  • Contact: stamatak@cs.tum.edu

2
Outline
  • Motivation
  • Introduction to phylogenetic tree inference
  • Statistical inference methods
  • Maximum Likelihood & associated problems
  • Solutions
  • 2 simple heuristics
  • parallel & distributed implementation
  • Results
  • Conclusion
  • Availability & Future Work

3
Motivation: Towards a Tree of Life
  • 30,000 organisms available, current trees < 1,000

Where we are
4
Motivation: Towards a Tree of Life
  • 30,000 organisms available, current trees < 1,000

Where we want to get
5
Phylogenetic Tree Inference
  • Input: a good multiple alignment of a
    distinguished, highly conserved part of the DNA
    sequences
  • Output: an unrooted binary tree with the sequences
    at its leaves (all nodes of degree 1 or 3)
  • Various methods for phylogenetic tree inference
  • They differ in computational complexity and in the
    quality of the inferred trees
  • Most accurate methods: Maximum Likelihood (ML)
    and Bayesian phylogenetic inference
  • + most sound and flexible methods
  • + other methods not suited for
    large/complex trees
  • -- most computationally intensive methods

6
ML and Bayesian methods
  • T. Williams et al. (March 2003): comparative
    analysis with simulated data shows MrBayes is the
    best program
  • Guindon et al. (May 2003): PHYML, a very fast &
    accurate ML program for real & simulated data,
    faster than MrBayes
  • ML (PHYML, RAxML2)
  • + Significantly faster than MrBayes
  • + Reference/starting trees for Bayesian methods
  • -- Less powerful statistical model
  • Bayesian inference (MrBayes)
  • + Powerful statistical model
  • -- MCMC convergence problem
  • Memory requirements for 1000/10,000-taxon
    alignments
  • RAxML 200 MB / 750 MB
  • PHYML 900 MB / 8.8 GB
  • MrBayes 1150 MB / unknown

7
MCMC Convergence Problem
8
What does ML compute?
  • Maximum Likelihood calculates
  • Topologies
  • Branch lengths vi
  • Likelihood of the tree

[Figure: unrooted 5-taxon tree with leaves S1-S5 and branch lengths v1-v7]
Goal: find the tree topology which maximizes the
likelihood. Problem I: the number of possible
topologies is exponential in n. Problem II: the
computation of the likelihood value & branch length
optimization is expensive. Solution: algorithmic
optimizations (previous work), new heuristics & HPC.
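To make Problem I concrete: for n taxa there are (2n-5)!! = 3 · 5 · 7 · ... · (2n-5) distinct unrooted binary topologies. A minimal C sketch (illustration only, not RAxML code) that prints this count:

#include <stdio.h>

/* Number of distinct unrooted binary tree topologies for n labelled
 * taxa: (2n-5)!! = 3 * 5 * 7 * ... * (2n-5), defined for n >= 3.    */
static double num_unrooted_topologies(int n)
{
    double count = 1.0;
    for (int k = 3; k <= 2 * n - 5; k += 2)
        count *= k;
    return count;
}

int main(void)
{
    /* Already for 50 taxa the count exceeds 10^74,
     * so exhaustive search is hopeless.            */
    for (int n = 4; n <= 50; n += 2)
        printf("n = %2d  topologies ~ %.3e\n", n, num_unrooted_topologies(n));
    return 0;
}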
9
New Heuristics for RAxML
  • Two common methods to build a tree
  • Progressive addition of organisms, e.g. the stepwise
    addition algorithm
  • Use a (random, simple) starting tree containing
    all organisms and optimize the likelihood by
    applying topological changes
  • RAxML (Randomized Axelerated Maximum Likelihood)
    computes a parsimony starting tree with dnapars
  • → fast and relatively good initial likelihood
  • dnapars uses stepwise addition → randomized
    sequence input order to obtain distinct starting
    trees (see the sketch below)
  • Optimize the starting tree by applying
    rearrangements
  • Accelerate rearrangements by two simple ideas
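As referenced above, distinct parsimony starting trees are obtained by randomizing the order in which the sequences are fed to the stepwise addition procedure. A minimal, hypothetical C helper (not RAxML's actual code) sketching such a randomized input order:

#include <stdlib.h>

/* Hypothetical helper: permute the order in which taxa are handed to a
 * stepwise-addition (dnapars-style) starting-tree computation.  Distinct
 * random input orders yield distinct parsimony starting trees.          */
static void shuffle_taxon_order(int *order, int num_taxa, unsigned int seed)
{
    srand(seed);                        /* one seed per starting tree */
    for (int i = 0; i < num_taxa; i++)  /* identity permutation       */
        order[i] = i;
    for (int i = num_taxa - 1; i > 0; i--) {   /* Fisher-Yates shuffle */
        int j = rand() % (i + 1);
        int tmp = order[i];
        order[i] = order[j];
        order[j] = tmp;
    }
}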

10
Subtree Rearrangements
11
Subtree Rearrangements
[Figure: tree composed of subtrees ST1-ST6]
12
Subtree Rearrangements
[Figure: rearrangement step 1 (subtrees ST1-ST6)]
13
Subtree Rearrangements
[Figure: rearrangement step 1 (subtrees ST1-ST6)]
14
Subtree Rearrangements
[Figure: rearrangement step 1 (subtrees ST1-ST6)]
15
Subtree Rearrangements
[Figure: rearrangement step 1 (subtrees ST1-ST6)]
16
Subtree Rearrangements
[Figure: rearrangement step 2 (subtrees ST1-ST6)]
17
Subtree Rearrangements
[Figure: rearrangement step 2 (subtrees ST1-ST6)]
18
Subtree Rearrangements
Optimize all branches
[Figure: tree composed of subtrees ST1-ST6]
19
Subtree Rearrangements
Need to optimize all branches?
[Figure: tree composed of subtrees ST1-ST6]
20
Idea 1: Local Optimization of Branch Lengths
[Figure: tree composed of subtrees ST1-ST6]
21
Idea 1: Local Optimization of Branch Lengths
[Figure: tree composed of subtrees ST1-ST6]
22
Why is Idea 1 useful?
  • Local optimization of branch lengths
  • Updates fewer likelihood vectors → significantly
    faster
  • Allows higher rearrangement settings → better
    trees
  • Likelihood depends strongly on topology
  • Fast exploration of a large number of topologies
  • Straightforward parallelization
  • Store the best 20 trees from each rearrangement step
    (a minimal sketch of such a best-tree list follows below)
  • Branch length optimization of the best 20 trees only
  • Experimental results justify this mechanism
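A minimal sketch of the best-tree list referenced above, assuming trees are kept as Newick strings ranked by their (locally estimated) log likelihood; the names, list size constant, and buffer sizes are illustrative, not RAxML's actual data structures:

#include <string.h>

#define BEST_LIST_SIZE 20

/* Hypothetical best-tree list: keeps the BEST_LIST_SIZE topologies with the
 * highest (locally estimated) log likelihood seen during a rearrangement
 * step; only these later undergo full branch length optimization.          */
typedef struct {
    double likelihood[BEST_LIST_SIZE];
    char   topology[BEST_LIST_SIZE][4096];   /* e.g. Newick strings */
    int    entries;
} best_list_t;

/* Insert a candidate tree, keeping the list sorted by likelihood (descending). */
static void best_list_insert(best_list_t *list, double lh, const char *newick)
{
    int pos = list->entries < BEST_LIST_SIZE ? list->entries : BEST_LIST_SIZE - 1;
    if (list->entries == BEST_LIST_SIZE && lh <= list->likelihood[BEST_LIST_SIZE - 1])
        return;                                            /* worse than current worst */
    while (pos > 0 && lh > list->likelihood[pos - 1]) {    /* shift worse entries down */
        list->likelihood[pos] = list->likelihood[pos - 1];
        strcpy(list->topology[pos], list->topology[pos - 1]);
        pos--;
    }
    list->likelihood[pos] = lh;
    strncpy(list->topology[pos], newick, sizeof list->topology[pos] - 1);
    list->topology[pos][sizeof list->topology[pos] - 1] = '\0';
    if (list->entries < BEST_LIST_SIZE)
        list->entries++;
}

Initialize the structure to zero (best_list_t list = {0}) before the first insertion; after a rearrangement step only the stored topologies receive thorough branch length optimization.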

23
Idea 2: Subsequent Application of Topological
Changes
24
Idea 2: Subsequent Application of Topological
Changes
[Figure: subtree ST3]
25
Idea 2: Subsequent Application of Topological
Changes
[Figure: subtree ST3, shown at two positions]
26
Idea 2: Subsequent Application of Topological
Changes
[Figure: two trees composed of subtrees ST1-ST6, with subtree ST3 shown at two positions in each]
27
Why is Idea 2 useful?
  • During the initial 5-10 rearrangement steps many
    improved topologies are encountered
  • Accelerates the likelihood improvement in the initial
    optimization phase
  • Enables fast optimization of random starting trees
    (a toy illustration follows below)
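To isolate what "subsequent application" means: improving changes are kept as soon as they are found, rather than applying only the single best change after a full sweep. A toy, self-contained C illustration with a stand-in scoring function (not the likelihood, and not RAxML code):

#include <stdio.h>

#define N 8

/* Stand-in objective: a toy score over an arrangement of N items.  In RAxML
 * the objective would be the log likelihood of the tree topology.           */
static double score(const int a[N])
{
    double s = 0.0;
    for (int i = 0; i < N; i++)
        s -= (double)(a[i] - i) * (a[i] - i);   /* best when a[i] == i */
    return s;
}

/* "Subsequent application": scan candidate moves (here: adjacent swaps) and
 * keep every move that improves the score immediately, instead of applying
 * only the single best move after a full sweep.                             */
static double improve_subsequently(int a[N])
{
    double current = score(a);
    for (int i = 0; i + 1 < N; i++) {
        int tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;   /* try the move  */
        double candidate = score(a);
        if (candidate > current) {
            current = candidate;                           /* keep the move */
        } else {
            tmp = a[i]; a[i] = a[i + 1]; a[i + 1] = tmp;   /* undo the move */
        }
    }
    return current;
}

int main(void)
{
    int a[N] = {7, 6, 5, 4, 3, 2, 1, 0};
    printf("start   score = %.0f\n", score(a));
    for (int step = 1; step <= 10; step++)   /* a few sweeps suffice */
        printf("sweep %d score = %.0f\n", step, improve_subsequently(a));
    return 0;
}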

28
Remainder of this Talk
  • Motivation
  • Introduction to phylogenetic tree inference
  • Statistical inference methods
  • Maximum Likelihood & associated problems
  • Solutions
  • 2 simple heuristics
  • parallel & distributed implementation
  • Results
  • Conclusion
  • Availability & Future Work

29
Basic Parallel & Distributed Algorithm
  • Basic idea: distribute work by subtrees instead
    of by topologies (as in, e.g., parallel fastDNAml)
  • Simple master-worker architecture
  • Subsequent application of topological changes
    introduces non-determinism

[Figure: tree composed of subtrees ST1-ST6]
30
Basic Parallel & Distributed Algorithm
  • Basic idea: distribute work by subtrees instead
    of by topologies (as in, e.g., parallel fastDNAml)
  • Simple master-worker architecture
  • Subsequent application of topological changes
    introduces non-determinism

[Figure: tree composed of subtrees ST1-ST6; the master sends subtree ST3 to a worker via MPI_Send(ST3_ID, tree)]
31
Basic Parallel & Distributed Algorithm
  • Basic idea: distribute work by subtrees instead
    of by topologies (as in, e.g., parallel fastDNAml)
  • Simple master-worker architecture
  • Subsequent application of topological changes
    introduces non-determinism

[Figure: tree composed of subtrees ST1-ST6; the master sends subtrees ST2 and ST3 to workers via MPI_Send(ST2_ID, tree) and MPI_Send(ST3_ID, tree)]
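A minimal, self-contained MPI sketch of such a master-worker distribution, in the spirit of the MPI_Send(ST_ID, tree) annotation above; the tags, the placeholder Newick string, and the fake likelihood computation are illustrative assumptions, not RAxML's actual protocol:

/* Compile: mpicc sketch.c -o sketch ; run: mpirun -np 4 ./sketch */
#include <mpi.h>
#include <stdio.h>

#define TAG_WORK   1
#define TAG_STOP   2
#define TAG_RESULT 3
#define TREE_BUF   256

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    const int num_subtrees = 10;               /* subtrees ST0 .. ST9       */
    char tree[TREE_BUF] = "((A,B),(C,D),E);";  /* placeholder Newick string */

    if (rank == 0) {                           /* ----- master ------------ */
        int next = 0, outstanding = 0;
        /* initial round: one subtree ID (plus the current tree) per worker */
        for (int w = 1; w < size && next < num_subtrees; w++, next++) {
            MPI_Send(&next, 1, MPI_INT, w, TAG_WORK, MPI_COMM_WORLD);
            MPI_Send(tree, TREE_BUF, MPI_CHAR, w, TAG_WORK, MPI_COMM_WORLD);
            outstanding++;
        }
        /* steady state: collect a result, hand the next subtree to that worker */
        while (outstanding > 0) {
            double lh;
            MPI_Status st;
            MPI_Recv(&lh, 1, MPI_DOUBLE, MPI_ANY_SOURCE, TAG_RESULT,
                     MPI_COMM_WORLD, &st);
            outstanding--;
            printf("master: worker %d reported likelihood %f\n", st.MPI_SOURCE, lh);
            if (next < num_subtrees) {
                MPI_Send(&next, 1, MPI_INT, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                MPI_Send(tree, TREE_BUF, MPI_CHAR, st.MPI_SOURCE, TAG_WORK, MPI_COMM_WORLD);
                next++;
                outstanding++;
            }
        }
        for (int w = 1; w < size; w++)         /* tell all workers to stop  */
            MPI_Send(&next, 1, MPI_INT, w, TAG_STOP, MPI_COMM_WORLD);
    } else {                                   /* ----- worker ------------ */
        for (;;) {
            int subtree_id;
            MPI_Status st;
            MPI_Recv(&subtree_id, 1, MPI_INT, 0, MPI_ANY_TAG, MPI_COMM_WORLD, &st);
            if (st.MPI_TAG == TAG_STOP)
                break;
            MPI_Recv(tree, TREE_BUF, MPI_CHAR, 0, TAG_WORK, MPI_COMM_WORLD, &st);
            /* stand-in for: rearrange subtree 'subtree_id' within 'tree' and
             * evaluate the likelihoods of the resulting topologies          */
            double lh = -100000.0 - subtree_id;
            MPI_Send(&lh, 1, MPI_DOUBLE, 0, TAG_RESULT, MPI_COMM_WORLD);
        }
    }
    MPI_Finalize();
    return 0;
}

Because workers answer in arbitrary order and improved topologies are applied as they arrive (Idea 2), the parallel search is non-deterministic, as noted on the slide.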
32
Differences between the Parallel & Distributed
Algorithms
  • Parallel: a best-tree list of max(20, #workers) is
    maintained and merged at the master
  • Parallel: the master distributes the max(20, #workers)
    best trees as topology strings to the workers for
    branch length optimization
  • Distributed: each worker maintains a local best
    list of 20 trees
  • Distributed: each worker performs fast branch length
    optimization locally on all 20 trees → returns
    only the best topology to the master

33
Sequential Results
  • 50 distinct simulated 100-taxon alignments
  • Measured average execution times & topological
    distance (RF rate) from the true tree
  • PHYML: 35.21 seconds, RF rate 0.0796
  • MrBayes: 945.32 seconds, RF rate 0.0741
  • RAxML: 29.27 seconds, RF rate 0.0818
  • 9 distinct real alignments containing 101-1000
    taxa
  • Measured execution times & final likelihood
    values
  • RAxML yields the best-known likelihood for all data
    sets
  • RAxML faster than PHYML & MrBayes


34
Sequential Results: Real Data
data        PHYML ln L   secs   MrBayes ln L    secs    RAxML ln L   secs   R>PHY secs   PAxML ln L    hrs
101_SC        -74097.6    153      -77191.5    40527      -73919.3    617           31     -73975.9     47
150_SC        -44298.1    158      -52028.4    49427      -44142.6    390           33     -44146.9    164
150_ARB       -77219.7    313      -77196.7    29383      -77189.7    178           67     -77189.8    300
200_ARB      -104826.5    477     -104856.4   156419     -104742.6    272           99    -104743.3    775
250_ARB      -131560.3    787     -133238.3   158418     -131468.0   1067          249    -131469.0   1947
500_ARB      -253354.2   2235     -263217.8   366496     -252499.4  26124          493    -252588.1   7372
1000_ARB     -402215.0  16594     -459392.4   509148     -400925.3  50729         1893    -402282.1   9898
218_RDPII    -157923.1    403     -158911.6   138453     -157526.0   6774          244          n/a    n/a
500_ZILLA     -22186.8   2400      -22259.0    96557      -21033.9  29916           67          n/a    n/a
35
Sequential Results: Real Data
36
Sequential Results: Real Data
37
Sequential Results: Real Data
38
Sequential Results: Real Data
39
Parallel Results: Speedup on 1000_ARB
40
Distributed Results: First Tests
  • Platforms
  • Infiniband cluster: 10 Intel Xeon 2.4 GHz
  • Sunhalle: 50 Sun workstations used by CS students
  • Alignments
  • 1000_ARB
  • 2025_ARB
  • Larger trees to come ...
  • Results
  • Program executed & terminated correctly
  • RAxML@home yielded the best-known tree for 2025_ARB

41
Biological Results: first ML 10,000-taxon tree
  • Calculated 5 parsimony starting trees & 3-4
    initial rearrangement steps sequentially on a Xeon
    2.4 GHz
  • Further rearrangements of those 5 trees in
    parallel on 32 or 64 Xeons (2.66 GHz) at RRZE
  • Accumulated CPU hours per tree: 3200 hours
  • Best ln likelihood: -949539, worst: -950026
  • Problems
  • Quality assessment? Bootstrapping not feasible
  • Consense crashes for > 5 trees
  • MrBayes/PHYML crash on 32-bit machines with 4 GB RAM
  • MrBayes crashed on Itanium
  • Visualization?

42
(No Transcript)
43
Conclusion
  • RAxML not able to handle protein data
  • RAxML not able to perform model parameter
    optimization
  • BUT
  • RAxML easy to parallelize/distribute
  • Accurate & fast for large trees
  • Significantly lower memory requirements than
    MrBayes/PHYML
  • Conclusion: implement model parameter optimization &
    protein data support in RAxML

44
Availability & Future Work
  • Further development & distribution of RAxML@home
  • Big production runs with RAxML@home
  • Survey: ML supertrees vs. integral trees
  • Alignment split-up methods for ML supertrees
  • RAxML implementation on GPUs
  • RAxML2 download, benchmarks, code:
    wwwbode.in.tum.de/stamatak
  • RAxML@home development: www.sourceforge.com/projects/axml