Molecular Evolution NUI Maynooth June 2001 - PowerPoint PPT Presentation

Provided by: marti283
1
Molecular Evolution NUI Maynooth June 2001
2
Aims of the course
  • To introduce the theory and practice of
    phylogenetic inference from molecular data
  • To provide an introduction to some of the most
    useful methods and computer programmes

3
Introduction to the course
  • Some basic concepts e.g. phylogeny, monophyly,
    homology and analogy
  • Exploring patterns in sequence data
  • Alignment - ClustalW (done)
  • Phylogenetic analysis
  • Parsimony
  • Distance matrix analysis
  • Maximum likelihood
  • How robust are phylogenetic hypotheses?

4
Phylogenetics (Cladistics)
  • Based upon evolutionary relationships i.e. upon
    common ancestry
  • Cladogram is a tree diagram which depicts a
    hypothesised evolutionary history
  • A Phylogram is a tree which indicates by branch
    length the degree of change believed to have
    occurred along each lineage
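
The distinction can be made concrete with the Newick text format, which many phylogenetics programs read. The sketch below is only an illustration and not part of the course material: the nested-tuple tree layout, the taxon names A to C, and the branch lengths are all invented for the example.

```python
# Each node is (label_or_children, branch_length).
# Hypothetical three-taxon tree; names and lengths are made up.
tree = ([("A", 0.1), ([("B", 0.2), ("C", 0.3)], 0.05)], None)


def to_newick(node, with_lengths=True):
    """Return a Newick string (without the trailing ';') for one node."""
    content, length = node
    if isinstance(content, str):
        text = content  # a tip: just its label
    else:
        # an internal node: its children, comma-separated, in parentheses
        text = "(" + ",".join(to_newick(c, with_lengths) for c in content) + ")"
    if with_lengths and length is not None:
        text += f":{length}"
    return text


cladogram = to_newick(tree, with_lengths=False) + ";"  # branching order only
phylogram = to_newick(tree, with_lengths=True) + ";"   # order plus branch lengths
print(cladogram)
print(phylogram)
```

Both strings describe the same branching order; only the second carries information about the amount of change along each lineage.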

5
Cladograms and phylograms
[Figure: the same tree of Bacteria 1-3 and Eukaryote 1-4 drawn as a cladogram and as a phylogram]
Cladograms show branching order - branch lengths
are meaningless
Phylograms show branch order and branch lengths
6
Rooting using outgroups
[Figure: tree of Bacteria 1-3 and Eukaryote 1-4, rooted by an Archaea outgroup, with the root marked]
7
How do we construct a phylogeny?
  • What kind of data?
  • How to analyse it?

8
Richard Owen
9
Owen's definition of homology
  • Homologue: the same organ under every variety of
    form and function (true or essential
    correspondence)
  • Analogy: superficial or misleading similarity
  • Richard Owen 1843

10
Charles Darwin
11
Darwin and homology
  • 'The natural system is based upon descent with
    modification ... the characters that naturalists
    consider as showing true affinity (i.e.
    homologies) are those which have been inherited
    from a common parent, and, in so far as all true
    classification is genealogical, that community of
    descent is the common bond that naturalists have
    been seeking'
  • Charles Darwin, Origin of species
    1859 p. 413

12
Homology is...
  • Homology: similarity that is the result of
    inheritance from a common ancestor. The
    identification and analysis of homologies is
    central to phylogenetics

13
Phylogenetics
  • Sees homology as evidence of common ancestry
  • Uses tree diagrams to portray relationships based
    upon recency of common ancestry
  • Monophyletic groups (clades) - contain species
    which are more closely related to each other than
    to any outside of the group
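
One way to make the definition of a clade operational is to ask whether the members of a group are exactly the tips of a single subtree. The sketch below assumes a rooted tree stored as nested tuples; the taxon names follow the slides, but the branching order inside Bacteria and Eukaryotes is invented for illustration.

```python
# Hypothetical rooted tree as nested tuples of tip names.
# The internal branching order is an assumption made for this example.
tree = (
    "Archaea outgroup",
    (
        ("Bacteria 1", ("Bacteria 2", "Bacteria 3")),
        (("Eukaryote 1", "Eukaryote 2"), ("Eukaryote 3", "Eukaryote 4")),
    ),
)


def leaves(node):
    """Return the set of tip names below a node."""
    if isinstance(node, str):
        return frozenset([node])
    result = frozenset()
    for child in node:
        result |= leaves(child)
    return result


def clades(node):
    """Yield the tip set of every subtree, including single tips."""
    yield leaves(node)
    if not isinstance(node, str):
        for child in node:
            yield from clades(child)


def is_monophyletic(tree, group):
    """True if the group's members are exactly the tips of one subtree."""
    return frozenset(group) in set(clades(tree))


bacteria = {"Bacteria 1", "Bacteria 2", "Bacteria 3"}
print(is_monophyletic(tree, bacteria))                        # a clade in this tree
print(is_monophyletic(tree, {"Bacteria 3", "Eukaryote 1"}))   # not a clade
```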

14
Monophyletic groups
[Figure: tree of three Bacteria and four Eukaryotes rooted by an Archaea outgroup, with monophyletic groups (clades) marked]
15
How do we construct a phylogeny?
  • What kind of data?
  • How to analyse it?

16
Fossil primate skulls
17
Microbial morphologies - some are complex but
many are simple - for example look at a drop of
lake water
18
Linus Pauling
Linus Pauling and his co-workers asked the
questions: Where in organisms has the greatest
amount of information about their past history
survived? How can it be extracted?
19
Molecules can tell us about the past
The sequences of DNA, RNA and protein molecules
are documents of evolutionary history
20
What sequences should we use?
  • Choice of sequence - appropriate for question
    (fast or slow evolving - close or distant
    relationships).
  • Many sequences are a mosaic of different rates
  • 16S rRNA: different structural regions evolve at
    different rates
  • Proteins: the synonymous (silent) rate (codon
    position 3) is often faster than the
    nonsynonymous rate (positions 1 and 2, where
    changes alter the amino acid)
  • Transitions occur more readily than transversions
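
Transitions are purine-purine (A/G) or pyrimidine-pyrimidine (C/T) changes; transversions swap a purine for a pyrimidine. A minimal sketch of counting both kinds of difference between two aligned DNA sequences follows. The function name and the toy sequences are hypothetical, and positions containing anything other than A, C, G or T (gaps, ambiguity codes, RNA "U") are simply skipped.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq1, seq2):
    """Count (transitions, transversions) between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue  # identical site, gap, or ambiguity code
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Toy sequences invented for the example.
print(count_ts_tv("AGCT", "GGTA"))
```

These are observed differences only; they undercount the real number of changes wherever a site has changed more than once.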

21
16S rRNA structure
22
Exploring patterns in sequence data
  • Do the sequences contain phylogenetic signal for
    the relationships of interest? (too conserved or
    too variable)
  • Are sequences saturated for change at the level
    of relationship to be investigated?
  • Do sequences manifest biased base compositions
    (e.g. thermophilic convergence) or biased codon
    usage patterns which may obscure phylogenetic
    signal?

23
Saturation in sequence data
  • Saturation is due to multiple changes at the same
    site subsequent to lineage splitting
  • Models of evolution attempt to infer the missing
    information through correcting for multiple
    hits
  • Most data will contain some fast evolving sites
    which are potentially saturated (e.g. in proteins
    often position 3)
  • In severe cases the data becomes essentially
    random and all information about relationships
    can be lost
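
As an illustration of correcting for multiple hits, the sketch below uses the Jukes-Cantor model. The slides do not name a model here, so this choice is an assumption made for the example; it treats all substitutions as equally likely. Given an observed proportion of differing sites p, the corrected distance is d = -(3/4) ln(1 - 4p/3).

```python
import math


def jukes_cantor_distance(p):
    """Corrected distance (substitutions per site) from observed proportion p.

    Assumes the Jukes-Cantor model (equal base frequencies, equal rates).
    Returns infinity when p >= 0.75, where the sequences look fully
    saturated and the correction is undefined.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


for p in (0.05, 0.30, 0.60):
    print(p, round(jukes_cantor_distance(p), 3))
```

The corrected distance always exceeds the observed proportion, and the gap widens as p approaches 0.75, which is the behaviour the saturation bullets describe.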

24
Multiple changes at a single site
Seq 1 AGCGAG
Seq 2 GCGGAC
[Figure: number of changes at each site along the lineages leading to Seq 1 and Seq 2]
25
Biased base compositions?
  • Do sequences manifest biased base compositions
    (e.g. thermophilic convergence) or biased codon
    usage patterns which may obscure phylogenetic
    signal?

26
A case study in phylogenetic analysis: Deinococcus
and Thermus
  • Deinococcus are radiation resistant bacteria
  • Thermus are thermophilic bacteria
  • BUT
  • Both have the same very unusual cell wall based
    upon ornithine
  • Both have the same menaquinones (Mk 9)
  • Both have the same unusual polar lipids
  • Congruence between these complex characters
    supports a phylogenetic relationship between
    Deinococcus and Thermus

27
Guanine + Cytosine in 16S rRNA genes
Organism                   Lifestyle     % G+C at variable sites
Thermus thermophilus       Thermophile   72
Aquifex pyrophilus         Thermophile   73
Deinococcus radiodurans    Mesophile     52
Bacillus subtilis          Mesophile     50
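
Percentages like these can be computed from an alignment by first finding the columns that vary and then counting G and C in each sequence at those columns only. The sketch below is a minimal version under stated assumptions: the function names and the three toy sequences are hypothetical, and gaps or ambiguity codes are not handled.

```python
def variable_columns(sequences):
    """Return indices of alignment columns containing more than one base."""
    return [i for i, column in enumerate(zip(*sequences)) if len(set(column)) > 1]


def gc_percent_at(sequence, columns):
    """Percentage of G or C in a sequence, restricted to the given columns."""
    bases = [sequence[i].upper() for i in columns]
    if not bases:
        raise ValueError("no columns to count")
    return 100.0 * sum(base in "GC" for base in bases) / len(bases)


# Toy alignment invented for the example (not real 16S rRNA data).
alignment = {"taxon_1": "GGCA", "taxon_2": "GCCA", "taxon_3": "GATA"}
columns = variable_columns(list(alignment.values()))
for name, sequence in alignment.items():
    print(name, gc_percent_at(sequence, columns))
```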
28
A four taxon problem for Deinococcus and
Thermus (Thermus, Deinococcus, Bacillus, Aquifex)
  • Aquifex is a thermophile and Bacillus is a
    mesophile
  • No data suggest that Aquifex and Bacillus are
    specifically related to either Deinococcus or
    Thermus
  • If all four bacteria are included in an analysis
    the true tree should place Thermus and
    Deinococcus together

[Figure: the true tree, with Thermus and Deinococcus grouped together and Aquifex and Bacillus grouped together]
29
Most methods of analysis will be fooled by the
base compositional biases in the data
[Figure: the wrong tree, grouping Aquifex (73% G+C) with Thermus (72% G+C), and Bacillus (50% G+C) with Deinococcus (52% G+C)]
The wrong tree - places taxa which share similar
base compositions together
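
The pull of base composition can be seen with simple arithmetic on the G+C percentages quoted above. The sketch below is not a tree-building method; it only scores each of the three possible ways of pairing four taxa by how similar the paired taxa are in G+C content. The scoring function is an assumption made for illustration.

```python
# G+C percentages at variable sites, as given on the slides.
gc = {"Aquifex": 73, "Bacillus": 50, "Deinococcus": 52, "Thermus": 72}


def pairings(taxa):
    """Yield the three ways of splitting four taxa into two pairs."""
    first, *rest = taxa
    for partner in rest:
        others = tuple(t for t in rest if t != partner)
        yield ((first, partner), others)


def composition_score(pairing):
    """Sum of G+C differences within each pair (lower = more similar)."""
    return sum(abs(gc[a] - gc[b]) for a, b in pairing)


for pairing in pairings(sorted(gc)):
    print(pairing, composition_score(pairing))

most_similar = min(pairings(sorted(gc)), key=composition_score)
print("Most similar in composition:", most_similar)
```

On composition alone the closest pairing is Aquifex with Thermus and Bacillus with Deinococcus, which is the wrong tree, not the grouping supported by the cell wall, menaquinone and lipid characters.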
30
Is there a molecular clock?
  • The idea of a molecular clock was initially
    suggested by Zuckerkandl and Pauling in 1962
  • They noted that rates of amino acid replacements
    in animal haemoglobins were roughly proportional
    to time - as judged against the fossil record

31
The molecular clock for alpha-globin
Each point represents the number of substitutions
separating each animal from humans
[Figure: number of substitutions plotted against time to common ancestor (millions of years), with points for shark, carp, platypus, chicken and cow]
32
There is no universal molecular clock
  • The initial proposal saw the clock as a Poisson
    process with a constant rate
  • Now known to be more complex - differences in
    rates occur for
  • different sites in a molecule
  • different genes
  • different regions of genomes
  • different genomes in the same cell
  • different taxonomic groups for the same gene
  • there is no universal molecular clock

33
Rates of amino acid replacement in different
proteins
34
Are there local molecular clocks?
  • If there is no universal molecular clock are
    there local clocks?
  • Can individual molecular data sets yield useful
    estimates of times of divergence?
  • Requires
  • demonstration of rate constancy in the data set
    (some kind of relative rate test)
  • sufficient external data - fossils - to reliably
    calibrate the clock
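
If both requirements are met, the arithmetic of a local clock is simple. The sketch below assumes a strictly constant rate r per lineage, so that two sequences which diverged T million years ago are separated by a distance K = 2rT. The function names and the numbers are hypothetical and chosen only to show the calculation.

```python
def calibrate_rate(distance, time_my):
    """Rate per site per lineage per million years from one calibration point.

    Assumes K = 2 * r * T: both lineages accumulate change since the split.
    """
    if time_my <= 0:
        raise ValueError("calibration time must be positive")
    return distance / (2.0 * time_my)


def estimate_divergence_time(distance, rate):
    """Time since divergence (million years) under the same constant rate."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return distance / (2.0 * rate)


# Invented numbers: a fossil dates one split to 100 My, where K = 0.2.
rate = calibrate_rate(0.2, 100.0)
# Another pair in the same data set is separated by K = 0.1.
print(estimate_divergence_time(0.1, rate))
```

The estimate is only as good as the calibration date and the assumption of rate constancy, which is why both are listed as requirements.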

35
Relative rate test (Wilson & Sarich, 1973)
Under a molecular clock the distance (K) from A
to O (the common ancestor of A and B), and from B
to O, should be the same. We can measure relative
rates for A and B by reference to an outgroup C:
$K_{AC} - K_{BC} = 0$ is expected under a clock
$K_{AC} - K_{BC} > 0$ indicates rate A > rate B
$K_{AC} - K_{BC} < 0$ indicates rate B > rate A
[Figure: tree in which A and B share the common ancestor O, with C as the outgroup]
One can therefore exclude taxa which violate rate
constancy
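
The comparison itself is a single subtraction, sketched below. The function name and the `tolerance` parameter are hypothetical; a real analysis would judge the difference against its sampling error and not a fixed cut-off, so treat this as the logic of the test only.

```python
def relative_rate(k_ac, k_bc, tolerance=0.0):
    """Compare lineages A and B using their distances to an outgroup C.

    Returns (difference, interpretation), where difference = K_AC - K_BC.
    `tolerance` is an illustrative cut-off, not a statistical test.
    """
    difference = k_ac - k_bc
    if abs(difference) <= tolerance:
        return difference, "no rate difference detected"
    if difference > 0:
        return difference, "rate A > rate B"
    return difference, "rate B > rate A"


# Invented distances for the example.
print(relative_rate(0.30, 0.20))
print(relative_rate(0.25, 0.25))
```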
36
Some potential problems with clocks
  • Need a good fossil record to calibrate the clock
    - often missing (e.g. for bacteria?)
  • Windows on fossil time estimates are often large
  • How do we calculate the amount of divergence
    between two sequences? (use a model - the
    subject of much of the present course)

37
Using fossils to date splitting events
[Figure: two trees of A, B and C drawn against a time axis, marking the inferred timing of the split]
therefore estimates may be very imprecise
38
Phylogenetic inferences are premised on
  • Phylogenetic inferences are premised on the
    inheritance of ancestral characters, and on the
    existence of an evolutionary history defined by
    changes in these characters
  • A tree-like model of evolution (paralogy, lateral
    transfer?)

39
Gene trees and species trees
[Figure: gene trees and species trees - orthology]
40
Paralogy can produce misleading trees
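[Figure: gene phylogenies for the duplicated genes a1, b1, c1 and a2, b2, c2 compared with the organism phylogeny of A, B and C; a gene duplication gives rise to paralogy, and incomplete sampling of the gene copies produces a misleading tree]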
41
The malic enzyme tree contains paralogues
Anas - a duck!
42
Phylogenetic analysis requires careful thought
  • Phylogenetic analysis is frequently treated as a
    black box into which data are fed (often gathered
    at considerable cost) and out of which The Tree
    springs
  • (Hillis, Moritz & Mable 1996, Molecular
    Systematics)