Introduction to phylogeny

1
Modified Mincut Supertrees
Roderic Page, University of Glasgow
2
Tree of Life
  • About 1.7 million species described.
  • What we have so far
  • TreeBASE database (15,000 taxa)
  • Ribosomal Database Project (RDP II) (20,000
    sequences)
  • The Tree of Life Project (11,000 taxa)

3
Recent interest in the Tree of Life
  • NSF-sponsored Tree of Life workshops (2000-2001)
  • US$10 million to construct a phylogeny for the
    1.7 million described species of Life, announced
    February 15th 2002
  • Assembling the Tree of Life: Science, Relevance,
    and Challenges, AMNH, New York, May 2002
  • European initiative (ATOL) under FP6
4
Problem: how to build the tree of life
  • Solutions
  • Find one or more magic markers that will allow
    us to recover the whole tree in one go (problems:
    combinability and complexity)
  • Assemble big tree from many smaller trees derived
    from many kinds of data (supertrees)

5
Tree terminology
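[Figure: a rooted tree on leaves a, b, c, d, with a leaf, an edge, an internal node, a cluster, and the root labelled. The clusters shown are {a,b}, {a,b,c}, and {a,b,c,d}.]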
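The terminology above can be made concrete with a minimal sketch. It assumes a tree is encoded as nested Python tuples with leaf labels as strings (an encoding chosen here for illustration, not taken from the slides), and that the pictured tree is (((a,b),c),d), which is the shape implied by the three clusters shown. The helper names `leaves` and `clusters` are hypothetical.

```python
def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Return the cluster (leaf set) of every internal node, root included."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clusters(child)
    return found


# Assumed shape of the tree in the figure: (((a,b),c),d)
tree = ((("a", "b"), "c"), "d")
for cluster in sorted(clusters(tree), key=len):
    print(sorted(cluster))
```

Each internal node contributes one cluster, so a fully resolved rooted tree on four leaves has three of them.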

6
Nestings and triplets
[Figure: a rooted tree on leaves a, b, c, d.]
Nestings
{a,b} <_T {a,b,c,d}
{b,c} <_T {a,b,c,d}
Triplets
(bc)d
bc|d
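As a rough illustration of how triplets are read off a tree, the sketch below enumerates every rooted triplet (xy)z displayed by a tree. It assumes the nested-tuple encoding and the tree shape (((a,b),c),d); the function names are hypothetical and the approach is a plain enumeration, not an optimised method.

```python
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def triplets(tree, all_leaves=None):
    """Return the triplets of a tree as (frozenset({x, y}), z), read as (xy)z."""
    if all_leaves is None:
        all_leaves = leaves(tree)
    if isinstance(tree, str):
        return set()
    found = set()
    below = leaves(tree)
    child_sets = [leaves(child) for child in tree]
    # x and y meet at this node; any leaf z outside the node gives (xy)z.
    for left, right in combinations(child_sets, 2):
        for x in left:
            for y in right:
                for z in all_leaves - below:
                    found.add((frozenset([x, y]), z))
    for child in tree:
        found |= triplets(child, all_leaves)
    return found


tree = ((("a", "b"), "c"), "d")  # assumed shape of the tree in the figure
for pair, z in sorted(triplets(tree), key=lambda t: (sorted(t[0]), t[1])):
    print("(" + "".join(sorted(pair)) + ")" + z)
```

For a fully resolved tree every set of three leaves yields exactly one triplet, so this four-leaf tree displays four, including the (bc)d listed on the slide.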
7
Supertree
[Figure: input trees T1 (leaves a, b, c) and T2 (leaves b, c, d), and the supertree on leaves a, b, c, d.]
8
Some desirable properties of a supertree
method (Steel et al., 2000)
  • The supertree can be computed in polynomial time
  • A grouping in one or more trees that is not
    contradicted by any other tree occurs in the
    supertree

9
MRP (Matrix Representation Parsimony)
                   1  2  3
Homo sapiens       1  1  1
Pan paniscus       1  1  1
Gorilla gorilla    1  1  0
Pongo pygmaeus     1  0  0
Hylobates          0  0  0
[Figure: tree of the five taxa with internal nodes numbered 1, 2, and 3, matching the matrix columns.]
  • NP-hard
  • Can generate many solutions
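The matrix coding step of MRP (not the parsimony search itself) can be sketched as follows. The nested-tuple encoding, the function names, and the column order (largest cluster first) are assumptions made for illustration. Taxa missing from a tree are scored "?", the usual MRP convention.

```python
def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def clusters(tree):
    """Return the cluster (leaf set) of every internal node, root included."""
    if isinstance(tree, str):
        return set()
    found = {leaves(tree)}
    for child in tree:
        found |= clusters(child)
    return found


def mrp_matrix(trees):
    """Code each non-root cluster of each tree as one binary character."""
    all_taxa = sorted(set().union(*(leaves(t) for t in trees)))
    rows = {taxon: "" for taxon in all_taxa}
    for tree in trees:
        present = leaves(tree)
        columns = [c for c in clusters(tree) if c != present]
        columns.sort(key=lambda c: (-len(c), sorted(c)))
        for column in columns:
            for taxon in all_taxa:
                if taxon not in present:
                    rows[taxon] += "?"  # taxon absent from this tree
                else:
                    rows[taxon] += "1" if taxon in column else "0"
    return rows


apes = (((("Homo sapiens", "Pan paniscus"), "Gorilla gorilla"),
         "Pongo pygmaeus"), "Hylobates")
for taxon, row in mrp_matrix([apes]).items():
    print(f"{taxon:16} {row}")
```

With the single ape tree this reproduces the rows of the slide's matrix. With several input trees the columns are simply concatenated, and the combined matrix is then analysed by parsimony.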

10
Aho et al.'s algorithm (OneTree)
  • Aho, A. V., Sagiv, Y., Szymanski, T. G., and
    Ullman, J. D. 1981. Inferring a tree from lowest
    common ancestors with an application to the
    optimization of relational expressions. SIAM J.
    Comput. 10: 405-421.
  • Input: a set of rooted trees
  • 1. If the set is compatible (i.e., the trees can
    agree on a single tree), output that tree.
  • 2. If the set is not compatible, stop!
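The two steps above can be sketched on rooted triplets, assuming the input trees have already been broken down into triplets (x, y, z), read as (xy)z. This is a minimal illustration of the recursive idea (join x and y for every triplet, then recurse on the connected components), not Aho et al.'s published implementation. The function name `build` and the tuple encoding are choices made here.

```python
def build(taxa, triplets):
    """Return a nested-tuple tree displaying every triplet, or None if none exists."""
    taxa = set(taxa)
    if len(taxa) == 1:
        return next(iter(taxa))
    # Only triplets whose three leaves are all still present matter here.
    active = [(x, y, z) for (x, y, z) in triplets if {x, y, z} <= taxa]
    # Join x and y for every active triplet (xy)z and track the components.
    component = {t: {t} for t in taxa}
    for x, y, _ in active:
        if component[x] is not component[y]:
            merged = component[x] | component[y]
            for t in merged:
                component[t] = merged
    parts = {frozenset(c) for c in component.values()}
    if len(parts) == 1:
        return None  # graph is connected: the set is not compatible, stop
    subtrees = []
    for part in sorted(parts, key=sorted):
        subtree = build(part, active)
        if subtree is None:
            return None
        subtrees.append(subtree)
    return tuple(subtrees)


# Hypothetical inputs: T1 = ((a,b),c) gives (ab)c, T2 = ((b,c),d) gives (bc)d.
print(build("abcd", [("a", "b", "c"), ("b", "c", "d")]))
# Two triplets that resolve a, b, c differently are incompatible.
print(build("abc", [("a", "b", "c"), ("b", "c", "a")]))
```

The first call returns a tree containing both input triplets; the second returns None, which corresponds to step 2.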

11
Aho et al.'s OneTree algorithm
[Figure: input trees T1 (leaves a, b, c) and T2 (leaves b, c, d), and the resulting supertree.]
12
Mincut supertrees
  • Semple, C., and Steel, M. 2000. A supertree
    method for rooted trees. Discrete Appl. Math.
    105: 147-158.
  • Modifies OneTree by cutting graph
  • Requires rooted trees (no analogue of OneTree for
    unrooted trees)
  • Recursive
  • Polynomial time
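The graph that mincut cuts can be sketched as below. Two leaves are joined when they lie together in a proper cluster (a cluster below the root) of some input tree, and the edge weight counts the trees in which that happens. The nested-tuple encoding, the function name, and the two example trees are assumptions for illustration; the example trees are not the ones drawn on the slides.

```python
from collections import Counter
from itertools import combinations


def leaves(tree):
    """Return the set of leaf labels below a node (a leaf is a plain string)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaves(child) for child in tree))


def proper_cluster_graph(trees):
    """Weight each leaf pair by the number of trees in which it shares a proper cluster."""
    weights = Counter()
    for tree in trees:
        if isinstance(tree, str):
            continue
        # Two leaves share a proper cluster exactly when they sit
        # below the same child of the root.
        for child in tree:
            for pair in combinations(sorted(leaves(child)), 2):
                weights[frozenset(pair)] += 1
    return weights


# Hypothetical input trees on {a,b,c,d,e} and {a,b,c,d}.
t1 = ((("a", "b"), "c"), ("d", "e"))
t2 = (("a", "b"), ("c", "d"))
for edge, weight in sorted(proper_cluster_graph([t1, t2]).items(),
                           key=lambda item: sorted(item[0])):
    print("-".join(sorted(edge)), weight)
```

An edge whose weight equals the number of input trees records a grouping found in every tree; those are the edges that get collapsed before cutting.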

13
Semple and Steel (2000)
[Figure: input trees T1 (leaves a, b, c, d, e) and T2 (leaves a, b, c, d), and the graph S{T1,T2} built from them.]
14
Collapsing the graph (Semple and Steel mincut
algorithm)
[Figure: the graph S{T1,T2} with edge weights. The edge joining a and b has maximum weight ("This edge has maximum weight") and is collapsed into the single node a,b, giving S{T1,T2}/Emax{T1,T2}.]
15
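Cut the graph to get supertree
[Figure: the collapsed graph S{T1,T2}/Emax{T1,T2} on nodes a,b, c, d, and e with its edge weights, and the supertree on leaves a, b, c, d, e obtained by cutting it.]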
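The cutting step can be illustrated with a brute-force sketch: find every edge that lies in at least one minimum-weight cut of the collapsed graph, delete those edges, and take the remaining connected components as the groups to recurse on. Enumerating all bipartitions is exponential and only workable for tiny graphs, so this is not the polynomial-time procedure. The function names and the example weights are assumptions, not values read from the slide.

```python
from itertools import combinations


def min_cut_edges(nodes, weights):
    """Return (minimum cut weight, edges lying in at least one minimum cut)."""
    nodes = sorted(nodes)
    first, rest = nodes[0], nodes[1:]
    best, in_min_cut = None, set()
    # Keep `first` on one side so each bipartition is visited once.
    for size in range(len(rest)):
        for extra in combinations(rest, size):
            side = {first, *extra}
            cut = {e for e in weights if len(e & side) == 1}
            total = sum(weights[e] for e in cut)
            if best is None or total < best:
                best, in_min_cut = total, set(cut)
            elif total == best:
                in_min_cut |= cut
    return best, in_min_cut


def components_after_cut(nodes, weights):
    """Delete all minimum-cut edges and return the connected components."""
    _, doomed = min_cut_edges(nodes, weights)
    component = {n: {n} for n in nodes}
    for edge in weights:
        if edge in doomed:
            continue
        x, y = tuple(edge)
        if component[x] is not component[y]:
            merged = component[x] | component[y]
            for n in merged:
                component[n] = merged
    return {frozenset(c) for c in component.values()}


# Hypothetical collapsed graph: node "ab" stands for the merged leaves a and b.
nodes = ["ab", "c", "d", "e"]
weights = {
    frozenset(["ab", "c"]): 2,
    frozenset(["c", "d"]): 1,
    frozenset(["d", "e"]): 1,
}
print(min_cut_edges(nodes, weights)[0])
print(sorted(sorted(part) for part in components_after_cut(nodes, weights)))
```

In this example two different cuts tie at the minimum weight, and both of their edges are removed, which is why the result has three components rather than two.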
16
My mincut supertree implementation
darwin.zoology.gla.ac.uk/rpage/supertree
  • Written in C++
  • Uses GTL (Graph Template Library) to handle
    graphs (formerly a free alternative to LEDA)
  • Finds all mincuts of a graph faster than Semple
    and Steel's algorithm

17
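A counterexample: two input trees...
[Figure: two input trees that disagree about the relationships among a, b, and c. One tree also contains x1, x2, x3; the other contains y1, y2, y3, y4.]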
18
Mincut gives this (strange) result
  • Disputed relationships among a, b, and c are
    resolved
  • x1, x2, and x3 collapsed into polytomy

[Figure: the mincut supertree on leaves c, x1, x2, x3, b, a, y1, y2, y3, y4.]
19
Problem: cuts depend on connectivity (in this
example it is a function of tree size)
[Figure: the graph on nodes y4, x3, y1, x2, y2, b, x1, y3, c, a.]
20
So, mincut doesn't work
  • But, Semple and Steel said it did
  • My program seems to work
  • Argh!!! What is happening?

21
What mincut does and does not do
  • Mincut supertree is guaranteed to include any
    nesting which occurs in all input trees
  • Makes no claims about nestings which occur in
    only some of the trees
  • Does exactly what it says on the tin

22
Modifying mincut supertree
  • Can we incorporate more of the information in the
    input trees?
  • Three categories of information
  • Unanimous (all trees have that grouping)
  • Contradicted (trees explicitly disagree)
  • Uncontradicted (some trees have information that
    no other tree disagrees with)

23
Uncontradicted information
Assume we have k input trees
  • c: number of trees in which a and b co-occur
  • n: number of trees in which a and b are nested
  • c - n = 0 → uncontradicted (if c = k then
    unanimous)
  • c - n > 0 → contradicted
24
Uncontradicted information
Assume we have k input trees
  • f: number of trees in which a and b are in a fan
  • c: number of trees in which a and b co-occur
  • n: number of trees in which a and b are nested
  • c - n - f = 0 → uncontradicted (if c = k then
    unanimous)
  • c - n - f > 0 → contradicted
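The rule above amounts to a small decision function. The sketch below takes the counts c, n, f, and k as given and applies the two conditions literally. Computing the counts from the input trees is left out, and the function name and the sanity check on the counts are assumptions added here.

```python
def classify_pair(c, n, f, k):
    """Classify the pair (a, b) from its counts over k input trees.

    c: trees in which a and b co-occur
    n: trees in which a and b are nested
    f: trees in which a and b are in a fan
    """
    if not (n >= 0 and f >= 0 and n + f <= c <= k):
        raise ValueError("counts must satisfy 0 <= n + f <= c <= k")
    if c - n - f > 0:
        return "contradicted"
    return "unanimous" if c == k else "uncontradicted"


# With k = 2 input trees:
print(classify_pair(c=2, n=2, f=0, k=2))  # nested in both trees
print(classify_pair(c=1, n=1, f=0, k=2))  # nested in the only tree containing both
print(classify_pair(c=2, n=1, f=0, k=2))  # one tree separates a and b
```

Setting f = 0 gives the simpler c - n rule for input trees that contain no fans.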
25
Classifying edges
[Figure: the graph S{T1,T2} for the counterexample (nodes a, b, c, x1 to x3, y1 to y4), with each edge marked as one of: uncontradicted; uncontradicted but adjacent to contradicted; contradicted.]
26
Modified mincut
  • Species a, b, and c form a polytomy
  • x1, x2, and x3 resolved as per the input tree

[Figure: the modified mincut supertree on leaves a, b, c, x1, x2, x3, y1, y2, y3, y4.]
27
If no tree contradicts an item of information, is
that information always in the supertree?
[Figure: four input trees, displaying (23)5, (12)5, (45)1, and (34)1.]
28
No! (Steel, Dress, and Böcker 2000)
  • The four trees display (12)5, (23)5, (34)1, and
    (45)1
  • No tree displays (IK)J or (JK)I for any (IJ)K
    above
  • Triplets are uncontradicted, but cannot form a
    tree
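Both claims can be checked mechanically. The sketch below encodes each triplet (xy)z as the tuple (x, y, z), confirms that no two of the four triplets resolve the same three leaves differently, and then joins x and y for every triplet as in Aho et al.'s algorithm. If all five leaves end up in one component, no tree can display all four triplets. The function names are hypothetical.

```python
def aho_components(taxa, triplets):
    """Join x and y for every triplet (x, y, z) and return the components."""
    component = {t: {t} for t in taxa}
    for x, y, _ in triplets:
        if component[x] is not component[y]:
            merged = component[x] | component[y]
            for t in merged:
                component[t] = merged
    return {frozenset(c) for c in component.values()}


def conflicts(first, second):
    """True if two triplets resolve the same three leaves differently."""
    (a, b, c), (x, y, z) = first, second
    return {a, b, c} == {x, y, z} and {a, b} != {x, y}


# (12)5, (23)5, (34)1, (45)1
triplets = [(1, 2, 5), (2, 3, 5), (3, 4, 1), (4, 5, 1)]
taxa = {1, 2, 3, 4, 5}

pairwise_ok = not any(conflicts(s, t) for s in triplets for t in triplets)
parts = aho_components(taxa, triplets)
print(pairwise_ok, len(parts))
```

The triplets chain 1-2, 2-3, 3-4, 4-5 into a single component, so the set is incompatible even though every individual triplet is uncontradicted.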

29
Future directions
  • Improve handling of uncontradicted information
  • Add support for constraints
  • Visualising very big trees
  • Better integration into phylogeny
    databases (www.treebase.org)

darwin.zoology.gla.ac.uk/rpage/supertree