DNA variation in Ecology and Evolution IV- Clustering methods and Phylogenetic reconstruction - PowerPoint PPT Presentation

About This Presentation
Title:

DNA variation in Ecology and Evolution IV- Clustering methods and Phylogenetic reconstruction

Description:

DNA variation in Ecology and Evolution IV: Clustering methods and Phylogenetic reconstruction. Maria Eugenia D'Amato, BCB 705: Biodiversity

Date added: 28 August 2019
Slides: 21
Provided by: RichardK182
Learn more at: http://planet.uwc.ac.za
Transcript and Presenter's Notes

Title: DNA variation in Ecology and Evolution IV- Clustering methods and Phylogenetic reconstruction


1
DNA variation in Ecology and Evolution IV-
Clustering methods and Phylogenetic reconstruction
  • Maria Eugenia D'Amato

2
Organization of the presentation
  • Phylogenetic reconstruction: distance, maximum likelihood (ML), maximum parsimony (MP)
  • Networks
  • Multivariate analysis

3
Characters: independent, homologous
  • Continuous
  • Discrete: binary, multistate
4
DNA sequence characters
Alignment: hypothesizing a homology relationship for each site
Sequence comparison: BLAST search against GenBank
  • Coding sequence: blastn, blastx
  • Non-coding DNA: blastn
5
Blast search results
Sequences producing significant alignments                          Score (Bits)   E Value
gi|87299397|dbj|AB239568.1| Mantella baroni mitochondrial ND5...     101            3e-18
gi|343991|dbj|D10368.1|FRGMTURF2 Rana catesbeiana mitochondri...     97.6           5e-17
gi|14209845|gb|AF314017.1|AF314017 Rana sylvatica NADH dehydr...     93.7           8e-16
  • The lower the E-value, the better the alignment
  • First column: GenBank accession numbers for the sequence
  • Second column: species that match the query
6
Blast search results
[Figure: annotated BLAST alignment, with labels]
  • Description of the genes contained in the sequence with this accession number
  • Strands aligned
  • 5' end
  • Alignment
7
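Phylogenetic reconstruction: distance methods
  1. Data matrix (5 x 7): rows 1 to 5, characters C1 to C7 in columns
  2. Distance criterion (similarity / dissimilarity criterion) applied to every pair of rows, giving a 5 x 5 matrix
  3. Dendrogram built from the 5 x 5 matrix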
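The workflow on this slide (data matrix, distance matrix, dendrogram) can be sketched in Python. This is a minimal illustration under stated assumptions, not the presenter's method: the distance is the simple proportion of mismatching characters, the clustering is a hand-written UPGMA (average linkage), and the names `mismatch_distance`, `distance_matrix` and `upgma` are hypothetical.

```python
from itertools import combinations

def mismatch_distance(row_a, row_b):
    """Proportion of characters at which two rows differ."""
    return sum(1 for a, b in zip(row_a, row_b) if a != b) / len(row_a)

def distance_matrix(data):
    """Map every unordered pair of row labels to its distance."""
    return {frozenset((a, b)): mismatch_distance(data[a], data[b])
            for a, b in combinations(data, 2)}

def upgma(labels, dist):
    """Average-linkage clustering; returns the dendrogram as nested tuples."""
    sizes = {label: 1 for label in labels}   # cluster -> number of tips
    d = dict(dist)
    while len(sizes) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(sizes, 2), key=lambda p: d[frozenset(p)])
        merged = (a, b)
        # Distance from the new cluster to each remaining cluster is the
        # size-weighted average of the distances from its two parts.
        for c in sizes:
            if c != a and c != b:
                d[frozenset((merged, c))] = (
                    sizes[a] * d[frozenset((a, c))]
                    + sizes[b] * d[frozenset((b, c))]
                ) / (sizes[a] + sizes[b])
        sizes[merged] = sizes[a] + sizes[b]
        del sizes[a], sizes[b]
    return next(iter(sizes))
```

Calling `upgma(list(data), distance_matrix(data))` on a dictionary of labelled rows returns the joining order as nested tuples; branch heights are not tracked in this sketch.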
8
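Distance criteria for binary data
Jaccard's distance is based on the coefficient $J = \frac{a}{a + b + c}$ (the corresponding distance is $1 - J$)
  • a: bands common to A and B
  • b: bands exclusive to A
  • c: bands exclusive to B
For two points P1 and P2 with coordinates $(x_1, y_1)$ and $(x_2, y_2)$:
Manhattan distance: $M = |x_1 - x_2| + |y_1 - y_2|$
Euclidean distance: $E = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$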
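The criteria on this slide can be written as short Python functions. This is a minimal sketch: the function names are hypothetical, band profiles are assumed to be equal-length lists of 0/1 values, and points are assumed to be coordinate tuples.

```python
import math

def jaccard_coefficient(profile_a, profile_b):
    """J = a / (a + b + c) for two 0/1 band profiles of equal length."""
    pairs = list(zip(profile_a, profile_b))
    a = sum(1 for p, q in pairs if p and q)        # bands common to A and B
    b = sum(1 for p, q in pairs if p and not q)    # bands exclusive to A
    c = sum(1 for p, q in pairs if q and not p)    # bands exclusive to B
    if a + b + c == 0:
        raise ValueError("neither profile has any band")
    return a / (a + b + c)

def manhattan(p1, p2):
    """Sum of absolute coordinate differences."""
    return sum(abs(u - v) for u, v in zip(p1, p2))

def euclidean(p1, p2):
    """Square root of the sum of squared coordinate differences."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p1, p2)))
```

A dissimilarity suitable for clustering can then be taken as `1 - jaccard_coefficient(a, b)`.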
9
Distance criterion for DNA data: models of DNA substitution
$p = \frac{\text{number of different nucleotides}}{\text{total number of nucleotides}}$
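As a quick illustration, the proportion of differing sites between two aligned sequences can be computed as follows. The function name `p_distance` is hypothetical, and the sketch assumes the sequences are already aligned, of equal length, and free of gaps.

```python
def p_distance(seq_x, seq_y):
    """Number of differing nucleotides divided by the total number compared."""
    if len(seq_x) != len(seq_y):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_x, seq_y) if a != b)
    return differences / len(seq_x)
```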
 
10
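Models of DNA substitution
The letters a to p denote the 16 cells of the 4 x 4 table of nucleotide-pair frequencies between two sequences (rows and columns ordered A, C, G, T), so a, f, k and p are the identical pairs.
$D = 1 - (a + f + k + p)$
Jukes and Cantor (equal rates): $d_{xy} = -\frac{3}{4} \ln\left(1 - \frac{4}{3} D\right)$
F81 (unequal base frequencies): $B = 1 - (\pi_A^2 + \pi_C^2 + \pi_G^2 + \pi_T^2)$, $d_{xy} = -B \ln\left(1 - \frac{D}{B}\right)$
K2P: transitions $P = c + h + i + n$; transversions $Q = b + d + e + g + j + l + m + o$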
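The corrections on this slide can be sketched in Python. The Jukes and Cantor and F81 functions follow the formulas given on the slide. The slide only defines P and Q for K2P, so the K2P distance below uses the standard Kimura two-parameter formula, which is an addition here and not taken from the slide. Function names are hypothetical.

```python
import math

def jc69(D):
    """Jukes and Cantor distance from the proportion of differing sites D."""
    return -0.75 * math.log(1 - 4 * D / 3)

def f81(D, base_freqs):
    """F81 distance; base_freqs are the frequencies of A, C, G, T."""
    B = 1 - sum(f * f for f in base_freqs)
    return -B * math.log(1 - D / B)

def k2p(P, Q):
    """Kimura two-parameter distance from the proportions of
    transitions (P) and transversions (Q); standard formula, not on the slide."""
    return -0.5 * math.log(1 - 2 * P - Q) - 0.25 * math.log(1 - 2 * Q)
```

With equal base frequencies (0.25 each) B equals 3/4 and F81 reduces to Jukes and Cantor; likewise K2P reduces to Jukes and Cantor when P = D/3 and Q = 2D/3.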
11
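Distance criteria for diploid data
Here $x_i$ and $y_i$ are the frequencies of allele i in populations X and Y, and L is the number of loci.

Nei 1972
$J_x = \sum x_i^2$, $J_y = \sum y_i^2$, $J_{xy} = \sum x_i y_i$
$I = \frac{J_{xy}}{\sqrt{J_x J_y}}$
$D_N = -\ln \frac{J_{xy}}{\sqrt{J_x J_y}} = -\ln I$

Cavalli-Sforza 1967
$D_{arc} = \sqrt{\frac{1}{L} \sum (2\theta/\pi)^2}$, where $\theta = \cos^{-1} \sum \sqrt{x_i y_i}$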
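Both distances on this slide can be sketched in Python. The sketch assumes allele frequencies are supplied as one list per locus, with alleles in the same order in both populations; the function names are hypothetical. For Nei's distance the sums run over all alleles at all loci, which gives the same ratio as averaging each J over loci.

```python
import math

def nei_1972(x_loci, y_loci):
    """Nei's standard genetic distance D = -ln(Jxy / sqrt(Jx * Jy))."""
    jx = sum(xi * xi for locus in x_loci for xi in locus)
    jy = sum(yi * yi for locus in y_loci for yi in locus)
    jxy = sum(xi * yi
              for lx, ly in zip(x_loci, y_loci)
              for xi, yi in zip(lx, ly))
    return -math.log(jxy / math.sqrt(jx * jy))

def cavalli_sforza_arc(x_loci, y_loci):
    """Arc distance: sqrt of the mean over loci of (2 * theta / pi)^2."""
    total = 0.0
    for lx, ly in zip(x_loci, y_loci):
        # Clamp to 1.0 to guard against rounding pushing the sum above 1.
        cos_theta = min(1.0, sum(math.sqrt(xi * yi) for xi, yi in zip(lx, ly)))
        theta = math.acos(cos_theta)
        total += (2 * theta / math.pi) ** 2
    return math.sqrt(total / len(x_loci))
```

Nei's distance is undefined when the two populations share no alleles (the logarithm of zero), whereas the arc distance reaches its maximum of 1 in that case.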
12
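Phylogenetic reconstruction criterion for distance data

Ultrametric tree (UPGMA)
[Figure: rooted tree with tips A, B, C and branches v1 to v4]
Properties
  • dAB = v1 + v2 + v3
  • dAC = v1 + v2 + v4
  • dBC = v3 + v4
  • v3 = v4
  • v1 = v2 + v3 = v2 + v4

Additive tree (NJ)
[Figure: unrooted tree with tips A, B, C, D and branches v1 to v5]
Properties
  • dAB = v1 + v2
  • dAC = v1 + v3 + v4
  • dAD = v1 + v3 + v5
  • dBC = v2 + v3 + v4
  • dCD = v4 + v5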
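Whether a distance matrix fits an ultrametric or an additive tree can be checked with the standard three-point and four-point conditions, sketched below. These checks are an illustration added here, not something stated on the slide, and the function names and the `frozenset`-keyed distance lookup are assumptions of the sketch.

```python
from itertools import combinations

def make_dist(pairs):
    """Turn {('A', 'B'): 3.0, ...} into an order-independent lookup."""
    return {frozenset(k): v for k, v in pairs.items()}

def is_ultrametric(taxa, d, tol=1e-9):
    """Three-point condition: in every triple the two largest distances are equal."""
    for a, b, c in combinations(taxa, 3):
        x = sorted([d[frozenset((a, b))], d[frozenset((a, c))], d[frozenset((b, c))]])
        if x[2] - x[1] > tol:
            return False
    return True

def is_additive(taxa, d, tol=1e-9):
    """Four-point condition: of the three pairwise sums in every quartet,
    the two largest are equal."""
    for a, b, c, e in combinations(taxa, 4):
        s = sorted([
            d[frozenset((a, b))] + d[frozenset((c, e))],
            d[frozenset((a, c))] + d[frozenset((b, e))],
            d[frozenset((a, e))] + d[frozenset((b, c))],
        ])
        if s[2] - s[1] > tol:
            return False
    return True
```

For example, taking branch lengths v1 to v5 equal to 1, 2, 3, 4, 5 on a four-taxon tree with A and B on one side of the internal branch v3 gives dAB = 3, dAC = 8, dAD = 9, dBC = 9, dBD = 10, dCD = 9, which passes the additive check but fails the ultrametric one. Every ultrametric matrix is also additive.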
13
Maximum Likelihood
$L_D = \Pr(D \mid H)$
[Figure: unrooted tree, and the same tree after rooting at an internal node]
Alignment sites 1 ... j ... N:
  1. C.GGACACGTTTA.C
  2. C.AGACACCTCTA.C
  3. C.GGATAAGTTAA.C
  4. C.GGATAGCCTAG.C

$L = L_1 \times L_2 \times L_3 \times \dots \times L_N = \prod_{j=1}^{N} L_j$
$\ln L = \ln L_1 + \ln L_2 + \dots + \ln L_N = \sum_{j=1}^{N} \ln L_j$
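The sum of per-site log-likelihoods can be illustrated with the simplest possible case: two sequences and the Jukes and Cantor model, where the only parameter is the distance d between them. This is a minimal sketch, not the full tree likelihood shown on the slide; the function names and the grid search are assumptions chosen for clarity.

```python
import math

def site_likelihood(a, b, d):
    """Jukes and Cantor likelihood of one aligned site for two sequences
    separated by d expected substitutions per site."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e      # probability the nucleotide is unchanged
    p_diff = 0.25 - 0.25 * e      # probability of one specific different nucleotide
    return 0.25 * (p_same if a == b else p_diff)

def log_likelihood(seq_x, seq_y, d):
    """ln L as the sum of the per-site log-likelihoods."""
    return sum(math.log(site_likelihood(a, b, d)) for a, b in zip(seq_x, seq_y))

def ml_distance(seq_x, seq_y, step=0.001, d_max=2.0):
    """Grid search for the d that maximizes ln L."""
    grid = [i * step for i in range(1, int(d_max / step) + 1)]
    return max(grid, key=lambda d: log_likelihood(seq_x, seq_y, d))
```

For two sequences the maximum of ln L coincides, up to the grid resolution, with the Jukes and Cantor distance computed from the proportion of differing sites.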
14
Hypothesis testing: likelihood ratio test
$\Delta = \log L_1 - \log L_0$
$2\Delta$ follows a $\chi^2$ distribution
  • Rate variation: d.f. = N sequences in the tree - 2
  • Appropriate substitution model: d.f. = difference in number of parameters between H1 and H0
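The test can be sketched in Python using only the standard library. The chi-square tail probability is computed with a closed-form recurrence for integer degrees of freedom; the function names are hypothetical, and any log-likelihood values passed in are the user's own, not values from the slide.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-half)              # exact tail for df = 2
    else:
        a, q = 0.5, math.erfc(math.sqrt(half))   # exact tail for df = 1
    # Step df up by 2 at a time: Q(a + 1, h) = Q(a, h) + h^a e^(-h) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        q += math.exp(a * math.log(half) - half - math.lgamma(a + 1))
        a += 1.0
    return q

def likelihood_ratio_test(log_l0, log_l1, df):
    """Return the statistic 2 * (log L1 - log L0) and its p-value."""
    statistic = 2.0 * (log_l1 - log_l0)
    return statistic, chi2_sf(statistic, df)
```

The null hypothesis H0 (the simpler model) is rejected when the p-value falls below the chosen significance level.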
15
Bootstrapping: how well supported are the groups?
Example: trumpet fish
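The resampling step behind bootstrap support can be sketched as follows: alignment columns are drawn with replacement to build pseudo-replicates, and support for a group is the fraction of replicates in which it is recovered. The tree-building step is left to a user-supplied function; `group_recovered` is a hypothetical callback, not part of any library.

```python
import random

def bootstrap_alignment(alignment, rng):
    """Resample the alignment columns with replacement."""
    n_sites = len(alignment[0])
    columns = [rng.randrange(n_sites) for _ in range(n_sites)]
    return ["".join(seq[c] for c in columns) for seq in alignment]

def bootstrap_support(alignment, group_recovered, n_replicates=100, seed=0):
    """Fraction of pseudo-replicates for which group_recovered(replicate) is true.

    group_recovered is assumed to build a tree from the replicate and
    report whether the group of interest appears in it.
    """
    rng = random.Random(seed)
    hits = sum(
        1 for _ in range(n_replicates)
        if group_recovered(bootstrap_alignment(alignment, rng))
    )
    return hits / n_replicates
```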
16
Maximum Parsimony
Minimize tree length
To obtain rooted trees (and character polarity), use an outgroup. The ingroup is monophyletic.
Example sequences
  1. ATATT
  2. ATCGT
  3. GCAGT
  4. GCCGT

[Figure: trees for the first site, with tips 1 to 4 carrying the states A, A, G, G; panels labelled "5 changes" and "1 change"]
17
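Maximum parsimony: example
[Figure: the alternative trees drawn for site 2, site 3 and site 4, with the nucleotide states at the tips]
Site 5: no changes
Tree length: $L = \sum_{i=1}^{k} l_i$, where $l_i$ is the number of changes at site i and k is the number of sites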
18
Maximum parsimony: example
Tree             Site 1   Site 2   Site 3   Site 4   Site 5   Total
((1,2),(3,4))    1        1        2        1        0        5
((1,3),(2,4))    2        2        1        1        0        6
((1,4),(2,3))    2        2        2        1        0        7
Phylogenetically informative sites: 1, 2 and 3 (the sites whose number of changes differs among trees)
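The tree lengths in this example can be reproduced with Fitch's algorithm, sketched below for trees written as nested tuples. The sequences are the four from the example (ATATT, ATCGT, GCAGT, GCCGT); the function names are hypothetical, and the choice of Fitch's algorithm is an assumption of this sketch since the slides do not name a counting method.

```python
def fitch(tree, states):
    """Return (possible ancestral states, minimum changes) for one site.

    tree is a tip label or a pair of subtrees; states maps tip label -> nucleotide.
    """
    if not isinstance(tree, tuple):
        return {states[tree]}, 0
    left_set, left_cost = fitch(tree[0], states)
    right_set, right_cost = fitch(tree[1], states)
    common = left_set & right_set
    if common:
        return common, left_cost + right_cost
    # No shared state: one change is needed on this node's branches.
    return left_set | right_set, left_cost + right_cost + 1

def site_lengths(tree, seqs):
    """Number of changes at each site for the given tree."""
    n_sites = len(next(iter(seqs.values())))
    return [fitch(tree, {tip: s[j] for tip, s in seqs.items()})[1]
            for j in range(n_sites)]

seqs = {1: "ATATT", 2: "ATCGT", 3: "GCAGT", 4: "GCCGT"}
for tree in (((1, 2), (3, 4)), ((1, 3), (2, 4)), ((1, 4), (2, 3))):
    per_site = site_lengths(tree, seqs)
    print(tree, per_site, sum(per_site))
```

The tree with the smallest total, ((1,2),(3,4)) with length 5, is the most parsimonious of the three.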
19
Networks
  • Phylogenetic representation allowing reticulation
  • More appropriate for intraspecific data
  • The ancestor may still be alive
  • Sources of reticulation: hybridization, recombination, horizontal transfer, polyploidization

[Figure: network with nodes numbered 1 to 7 linking the haplotypes agct, acat and acct]
20
Multivariate clustering
  1. Data matrix (5 x 7): rows 1 to 5, characters C1 to C7 in columns
  2. Similarity criterion: correlations among the characters, giving a 7 x 7 matrix
  3. Calculate eigenvectors with highest eigenvalues
  4. Project data onto new axes (eigenvectors): X = 1st axis, Y = 2nd axis, Z = 3rd axis
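These steps amount to a principal component analysis on the correlation matrix, which can be sketched in plain Python. This is an illustration under stated assumptions: eigenvectors are found by power iteration with deflation (a simple technique swapped in here, not named on the slide), every character is assumed to vary (non-zero standard deviation), and the function names are hypothetical.

```python
import math

def standardize(data):
    """Center each column and scale it to unit standard deviation."""
    n, m = len(data), len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    sds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
           for j in range(m)]
    return [[(row[j] - means[j]) / sds[j] for j in range(m)] for row in data]

def correlation_matrix(z):
    """Correlations among the columns of standardized data z."""
    n, m = len(z), len(z[0])
    return [[sum(row[a] * row[b] for row in z) / (n - 1) for b in range(m)]
            for a in range(m)]

def leading_eigenpairs(matrix, k, iterations=1000):
    """The k largest eigenvalues and their eigenvectors of a symmetric matrix."""
    m = len(matrix)
    work = [row[:] for row in matrix]
    pairs = []
    for _ in range(k):
        # Arbitrary non-uniform start vector, normalized to unit length.
        start_norm = math.sqrt(sum((i + 1) ** 2 for i in range(m)))
        v = [(i + 1) / start_norm for i in range(m)]
        for _ in range(iterations):
            w = [sum(work[i][j] * v[j] for j in range(m)) for i in range(m)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm < 1e-12:      # nothing left to extract
                break
            v = [x / norm for x in w]
        value = sum(v[i] * sum(work[i][j] * v[j] for j in range(m))
                    for i in range(m))
        pairs.append((value, v))
        # Deflation: remove the component just found.
        work = [[work[i][j] - value * v[i] * v[j] for j in range(m)]
                for i in range(m)]
    return pairs

def project(z, eigenvectors):
    """Coordinates of each row of z on the new axes."""
    return [[sum(row[j] * vec[j] for j in range(len(vec))) for vec in eigenvectors]
            for row in z]
```

For a 5 x 7 data matrix, `correlation_matrix(standardize(data))` is the 7 x 7 matrix, and `project` applied to the first two or three eigenvectors gives the coordinates to plot on the X, Y and Z axes.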