Provided by: evolveZo
Learn more at: http://evolve.zoo.ox.ac.uk
Title: Correlating traits with phylogenies


1
Correlating traits with phylogenies
  • Using BaTS

2
Phylogeny and trait values
  • A phylogeny describes a hypothesis about the
    evolutionary relationship between individuals
    sampled from a population
  • Discrete character traits of interest can be
    mapped onto the phylogeny
  • A significant association between a particular
    trait value and its distribution on a phylogeny
    indicates a potential causative relationship

6
Phylogeny and trait values
  • Often, the phylogeny-trait relationship is not
    unequivocal by eye; an analytical framework may
    be needed.

[Figure: three example trees labelled "(clear association)", "(no association)" and "????"]

7
Phylogeny and trait values
  • The null hypothesis
    • The null hypothesis under test is one of random
      phylogeny-trait association; that is:
    • No single tip bearing a given character trait is
      any more likely to share that trait with
      adjoining taxa than we would expect due to chance

8
An example
  • Salemi et al. (2005): dataset of HIV sequences
    sampled from CNS tissues post mortem
  • Analysis by the Slatkin-Maddison (1989) method,
    reanalyzed in BaTS.
  • Compartmentalization by tissue type: circulating
    viral populations defined by location in the
    body
  • Salemi et al. (2005) J. Virol 79(17):
    11343-11352.
  • Parker, Rambaut & Pybus (2008) MEEGID
    8(3): 239-246.

9
Available methods
  • Non-phylogenetic: ANOVA
    • Ignores shared ancestry
  • Phylogenetic
    • Single-tree mapping
    • Slatkin-Maddison & AI
    • BaTS

10
Methods: Single-tree mapping
  • Method
    • Map traits onto a tree
    • Look for correlation
  • Pros
    • Fast
    • Simple
  • Cons
    • No indication of significance
    • Statistically weak (high Type II error)
    • Conditional on a single topology

11
Methods: Slatkin-Maddison & AI
  • Method
    • Map traits onto a tree by parsimony, then count
      migration events (Slatkin-Maddison) or measure
      the association index within clades recursively
      (AI)
    • Compare the observed value with a null (expected)
      value obtained by bootstrapping
  • Pros
    • Still reasonably fast
    • Indication of significance
  • Cons
    • Still conditional on a single topology
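
The single-tree approach above can be sketched in a few lines. This is a minimal illustration, not the code used by any of the programs named in these slides: the tree is a hypothetical nested tuple of tip names, the change count uses Fitch parsimony, and the null distribution is built by shuffling trait values among the tips (an assumed stand-in for the resampling step).

```python
import random

def fitch_changes(tree, states):
    """Return (state_set, n_changes) for a nested-tuple binary tree (Fitch parsimony)."""
    if isinstance(tree, str):                 # a tip carries its own trait value
        return {states[tree]}, 0
    left_set, left_n = fitch_changes(tree[0], states)
    right_set, right_n = fitch_changes(tree[1], states)
    shared = left_set & right_set
    if shared:                                # children agree: no extra change here
        return shared, left_n + right_n
    return left_set | right_set, left_n + right_n + 1

def tip_names(tree):
    """List the tip names of a nested-tuple tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    return tip_names(tree[0]) + tip_names(tree[1])

def null_scores(tree, states, reps, rng):
    """Change counts after shuffling trait values among tips (assumed null model)."""
    names = tip_names(tree)
    values = [states[name] for name in names]
    scores = []
    for _ in range(reps):
        shuffled = values[:]
        rng.shuffle(shuffled)
        scores.append(fitch_changes(tree, dict(zip(names, shuffled)))[1])
    return scores

# Hypothetical eight-tip tree in which each trait value forms its own clade.
tree = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
states = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1, "g": 1, "h": 1}

observed = fitch_changes(tree, states)[1]
null = null_scores(tree, states, 1000, random.Random(1))
# Fewer changes means stronger association, so count null scores at or below the observed.
p_value = sum(score <= observed for score in null) / len(null)
print(observed, p_value)
```

The result is still conditional on the one topology passed in, which is the limitation listed under Cons.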

12
Methods: BaTS
  • Method
    • See below(!)
  • Pros
    • Indication of significance
    • Statistically powerful, and Type I error is
      correct
    • Accounts for phylogenetic uncertainty
  • Cons
    • Requires Bayesian MCMC sequence analysis
    • Slower

13
BaTS under the bonnet
  • Use a posterior distribution of phylogenies from
    Bayesian MCMC analysis
  • Calculates migrations, AI and a variety of other
    measures of association
  • Posterior distributions are sampled for both the
    observed and the expected (null) values
  • Significance obtained by comparing observed vs.
    expected
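
The idea of scoring a whole posterior sample rather than one tree can be sketched as follows. This is an illustration under stated assumptions, not BaTS's actual implementation: trees are hypothetical nested tuples, the statistic is a Fitch parsimony change count, and the comparison rule (the share of null replicates whose mean over trees is at or below the observed mean) is an assumed choice made for the example.

```python
import random
import statistics

def parsimony_score(tree, states):
    """Fitch parsimony change count on a nested-tuple binary tree."""
    def visit(node):
        if isinstance(node, str):
            return {states[node]}, 0
        left_set, left_n = visit(node[0])
        right_set, right_n = visit(node[1])
        shared = left_set & right_set
        if shared:
            return shared, left_n + right_n
        return left_set | right_set, left_n + right_n + 1
    return visit(tree)[1]

def summarise(trees, states, reps, seed=0):
    """Return (observed mean, null mean, p-value) over a sample of trees."""
    rng = random.Random(seed)
    names = sorted(states)
    observed_mean = statistics.mean(parsimony_score(t, states) for t in trees)
    null_means = []
    for _ in range(reps):
        values = [states[name] for name in names]
        rng.shuffle(values)                   # one state randomization per replicate
        shuffled = dict(zip(names, values))
        null_means.append(
            statistics.mean(parsimony_score(t, shuffled) for t in trees))
    # Assumed rule: fewer changes is more extreme, so count replicates <= observed.
    p_value = sum(m <= observed_mean for m in null_means) / reps
    return observed_mean, statistics.mean(null_means), p_value

# Hypothetical "posterior sample" of three topologies for the same eight tips.
trees = [
    ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h"))),
    ((("a", "c"), ("b", "d")), (("e", "g"), ("f", "h"))),
    ((("a", "b"), ("c", "e")), (("d", "f"), ("g", "h"))),
]
states = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1, "g": 1, "h": 1}

observed_mean, null_mean, p_value = summarise(trees, states, reps=500)
print(observed_mean, null_mean, p_value)
```

Because every tree in the sample contributes to both the observed and the null summaries, topological uncertainty is carried through to the reported significance.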

14
BaTS analysis workflow
  • Preparation
    • Sequence alignment
    • Bayesian MCMC phylogeny reconstruction (BEAST,
      MrBayes) to obtain a posterior distribution of
      trees (PST)
    • Taxa in PST marked up with discrete traits
  • BaTS analysis
  • Interpretation

15
Workflow: Preparation (i)
  • Sequence alignment
    • CLUSTAL, BioEdit, Se-Al
  • Bayesian MCMC analysis
    • MrBayes, BEAST
  • Taxa marked up with traits

16
Workflow: Preparation (ii)
  • Taxa marked up with traits
  • Typical NEXUS format

17
Workflow: Preparation (iii)
  • Taxa marked up with traits

18
Workflow: BaTS analysis
  • To use BaTS from the command line, type:
  • java -jar BaTS_beta_build2.jar [single|batch]
    <treefile_name> <reps> <states>
  • Where:
    • [single|batch] asks BaTS to analyse either a
      single input file or a whole directory (batch
      analysis),
    • <treefile_name> is the name and full location of
      the treefile or directory to be analysed,
    • <reps> is the number (an integer > 1, typically
      at least 100) of state randomizations to perform
      to yield a null distribution, and
    • <states> is the number of different states seen.
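
When BaTS is called from a script, it can help to assemble and check the arguments before launching Java. The helper below is a hypothetical convenience wrapper, not part of BaTS; it only builds the argument list described above (jar name, mode, tree file, replicates, states) and leaves the actual launch to the caller.

```python
import shlex

def bats_command(mode, tree_path, reps, states, jar="BaTS_beta_build2.jar"):
    """Build the argument list for a BaTS run (hypothetical helper)."""
    if mode not in ("single", "batch"):
        raise ValueError("mode must be 'single' or 'batch'")
    if reps <= 1:
        raise ValueError("reps must be an integer greater than 1")
    if states < 1:
        raise ValueError("states must be a positive integer")
    return ["java", "-jar", jar, mode, tree_path, str(reps), str(states)]

cmd = bats_command("single", "example.trees", 100, 7)
print(shlex.join(cmd))
# To launch the run, pass the list to subprocess.run(cmd, check=True).
```

Passing the arguments as a list rather than a single string avoids quoting problems when the tree file path contains spaces.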

19
The analysis
  • C:\joeWork\apps\BaTS\BaTS_beta_build2\BaTS_beta_build2>java -jar BaTS_beta_build2.jar single example.trees 100 7
  • Performing single analysis.
  • File: example.trees
  • Null replicates: 100
  • Maximum number of discrete character
    states: 7
  • analysing... 30 trees, with 7 states
  • analysing observed (using obs state data)
  • 30 29
  • 30 29
  • 30 29
  • 30 29

Statistic    | observed mean      | lower 95% CI       | upper 95% CI       | null mean          | lower 95% CI       | upper 95% CI       | significance
AI           | 1.5555052757263184 | 1.1128820180892944 | 2.160351037979126  | 12.03488540649414  | 11.475320040039    | 12.6391201928711   | 0.0
PS           | 18.5               | 17.0               | 20.0               | 80.7713394165039   | 77.86666870117188  | 83.56666564941406  | 0.0
MC (state 0) | 12.633333206176758 | 9.0                | 16.0               | 1.7496669292449951 | 1.399999976158142  | 2.1666667461395264 | 0.009999990463256836
MC (state 1) | 19.0               | 19.0               | 19.0               | 1.7480005025863647 | 1.3333333730697632 | 2.0999999046325684 | 0.009999990463256836

20
Workflow: Interpretation
  • The null hypothesis
    • The null hypothesis under test is one of random
      phylogeny-trait association; that is:
    • No single tip bearing a given character trait is
      any more likely to share that trait with
      adjoining taxa than we would expect due to chance

21
Workflow: Interpretation
  • The statistics
  • Direction depends on the statistic: lower AI and
    PS values, and higher MC values, indicate
    stronger phylogeny-trait association
  • Significance indicated by p-value
  • In addition, observed posterior values are
    informative for some statistics
  • PS indicates migration events between trait
    values
  • MC(trait value) indicates the number of taxa in
    the largest clade monophyletic for that trait value
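
To make the MC statistic concrete, the sketch below finds the largest clade whose tips all share one trait value. It is an illustration only: the nested-tuple tree is hypothetical, and treating a lone tip as a clade of size 1 is an assumption of this example.

```python
def largest_monophyletic_clade(tree, states, value):
    """Size of the largest clade in which every tip has the given trait value."""
    best = 0

    def visit(node):
        nonlocal best
        if isinstance(node, str):              # a tip: pure if it carries the value
            pure = states[node] == value
            size = 1 if pure else 0
        else:
            children = [visit(child) for child in node]
            pure = all(child_pure for child_pure, _ in children)
            size = sum(child_size for _, child_size in children) if pure else 0
        if pure:
            best = max(best, size)
        return pure, size

    visit(tree)
    return best

states = {"a": 0, "b": 0, "c": 0, "d": 0, "e": 1, "f": 1, "g": 1, "h": 1}
clustered = ((("a", "b"), ("c", "d")), (("e", "f"), ("g", "h")))
mixed = ((("a", "b"), ("c", "e")), (("d", "f"), ("g", "h")))

mc_clustered = largest_monophyletic_clade(clustered, states, 0)
mc_mixed = largest_monophyletic_clade(mixed, states, 0)
print(mc_clustered, mc_mixed)
```

In the clustered tree all four tips with value 0 form one clade, while in the mixed tree the largest such clade holds only two tips.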

22
FAQs / common pitfalls
  • Java 1.5 or higher is required. See java.sun.com
    for more.
  • Large datasets can be slow, so down-sample input
    tree files (uniformly, not randomly) where
    necessary, or when checking that BaTS input files
    are marked up correctly.
  • A RAM (memory) shortage can slow the analysis;
    use the -Xmx switch to raise the memory available
    to Java.
  • Check input file mark-up carefully if in doubt.
  • See more: http://edocs.bea.com/wls/docs70/perform/JVMTuning.html
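
Uniform down-sampling of a tree file can be sketched as below. The function is a hypothetical helper, and it assumes each sampled tree sits on its own line beginning with the word "tree", as is typical of BEAST and MrBayes output; any other line (header, taxa and translate blocks, "end;") is kept unchanged.

```python
def thin_tree_lines(lines, keep_every):
    """Keep every keep_every-th tree line and all non-tree lines (uniform thinning)."""
    if keep_every < 1:
        raise ValueError("keep_every must be at least 1")
    kept = []
    tree_index = 0
    for line in lines:
        if line.lstrip().lower().startswith("tree "):
            if tree_index % keep_every == 0:   # evenly spaced, not random
                kept.append(line)
            tree_index += 1
        else:
            kept.append(line)
    return kept

# Hypothetical miniature tree file with five sampled trees.
lines = [
    "#NEXUS",
    "begin trees;",
    "tree STATE_0 = (a,b);",
    "tree STATE_1000 = (a,b);",
    "tree STATE_2000 = (a,b);",
    "tree STATE_3000 = (a,b);",
    "tree STATE_4000 = (a,b);",
    "end;",
]
thinned = thin_tree_lines(lines, 2)
print("\n".join(thinned))
```

Taking evenly spaced trees preserves the spread of the MCMC chain across the sample, which is why the advice above is to thin uniformly rather than at random.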

23
Author contact: Joe Parker, Department of Zoology, Oxford University, UK, OX1 3PS
joe_at_kitserve.org.uk
http://evolve.zoo.ox.ac.uk