Title: Molecular and statistical tools for studies of population structure
1Molecular and statistical tools for studies of population structure
- Guillermo Ortí
- University of Nebraska
- Lincoln, NE, USA
2PDF versions of papers and PowerPoints are (will be) downloadable from http://golab.unl.edu/teaching
3Comparison of some markers
Other criteria: Automation? Sequence info needed? Start-up costs? Radioactivity needed?
4Case studies
- 8 microsatellite loci
- Individuals (22-105 per locality) sampled from 19 localities upstream and downstream of a dam (constructed in 1952)
- Test 4 hypotheses about the origin of fish congregating at the dam site during the breeding season every year
5DATA ANALYSIS
- Basic population parameters: Hs (heterozygosity), allele frequencies, Fst, deviations from H-W, genetic distances among populations (GENEPOP)
- Assignment tests (Paetkau et al. 1997, online calculator) to determine the most probable origin of each individual based on the expected genotypic frequencies of each sample. Gives the proportion of individuals from each population (locality) assigned to every other population
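The basic parameters listed above can be illustrated for a single locus with a minimal Python sketch. The genotype data, the function names and the equal weighting of populations are assumptions made for illustration; this is a simple uncorrected Fst of the form (Ht - Hs) / Ht, not necessarily the estimator that GENEPOP reports.

```python
from collections import Counter

def allele_frequencies(genotypes):
    """Allele frequencies at one locus from diploid genotypes (a1, a2)."""
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}

def expected_heterozygosity(freqs):
    """Heterozygosity expected under Hardy-Weinberg: 1 - sum(p_i^2)."""
    return 1.0 - sum(p * p for p in freqs.values())

def observed_heterozygosity(genotypes):
    """Proportion of individuals carrying two different alleles."""
    return sum(a != b for a, b in genotypes) / len(genotypes)

def fst(populations):
    """Uncorrected single-locus Fst = (Ht - Hs) / Ht, populations weighted equally."""
    freqs = [allele_frequencies(genotypes) for genotypes in populations]
    hs = sum(expected_heterozygosity(f) for f in freqs) / len(freqs)
    alleles = set().union(*freqs)
    mean_freqs = {a: sum(f.get(a, 0.0) for f in freqs) / len(freqs) for a in alleles}
    ht = expected_heterozygosity(mean_freqs)
    return (ht - hs) / ht if ht > 0 else 0.0

# Hypothetical microsatellite genotypes (allele sizes) at one locus.
upstream = [(100, 100), (100, 102), (100, 100), (102, 100)]
downstream = [(102, 102), (102, 100), (102, 102), (102, 102)]

print(expected_heterozygosity(allele_frequencies(upstream)))
print(observed_heterozygosity(upstream))
print(fst([upstream, downstream]))
```

Comparing observed with expected heterozygosity is the simplest way to see a deviation from H-W; a formal test would also need the sampling variance.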
6Assignment tests are consistent with the result: dam fish are more likely to originate from upstream populations
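The logic of an assignment test can be sketched as follows: compute the probability of an individual's multilocus genotype from the allele frequencies of each predefined sample, assuming H-W proportions and independent loci, and assign the individual to the sample with the highest value. The data, the function names and the `floor` frequency used for alleles missing from a sample are illustrative assumptions, not the details of the online calculator.

```python
import math
from collections import Counter

def locus_frequencies(sample, locus):
    """Allele frequencies at one locus in a sample of diploid individuals."""
    counts = Counter(a for individual in sample for a in individual[locus])
    total = sum(counts.values())
    return {a: n / total for a, n in counts.items()}

def log_likelihood(individual, sample, floor=0.01):
    """Log-probability of a multilocus genotype under H-W and unlinked loci.

    Alleles absent from the sample receive the small frequency `floor`
    (an assumption made here to avoid taking log(0)).
    """
    total = 0.0
    for locus, (a, b) in enumerate(individual):
        freqs = locus_frequencies(sample, locus)
        p, q = freqs.get(a, floor), freqs.get(b, floor)
        total += math.log(p * p if a == b else 2 * p * q)
    return total

def assign(individual, samples):
    """Name of the sample under which the genotype is most probable."""
    return max(samples, key=lambda name: log_likelihood(individual, samples[name]))

# Hypothetical two-locus genotypes; each individual is a list of (a1, a2) per locus.
samples = {
    "upstream": [[(100, 100), (200, 202)], [(100, 102), (200, 200)], [(100, 100), (200, 200)]],
    "downstream": [[(102, 102), (202, 202)], [(102, 104), (202, 202)], [(102, 102), (200, 202)]],
}
dam_fish = [(100, 100), (200, 200)]

print(assign(dam_fish, samples))
```

Repeating this for every dam fish and tabulating the assignments gives the proportions per source population.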
7Assignment tests rely on previously defined populations, but this definition could be subjective or based on unreliable information (typically geographic location, morphotype, etc.). So, how well does this initial assignment correspond to genetic information? Some species may have cryptic populations, not easily defined by any visible trait(s), but with strong genetic signatures.
8- Clustering methods typically use a distance measure (e.g. chord distance) among individuals and a clustering algorithm, UPGMA or NJ. These are appealing and easy to apply, good exploratory tools, but have shortcomings:
- Clusters obtained may depend on the distance measure or clustering algorithm used
- Difficult to assess confidence (bootstrap?)
- Difficult to add other sources of information, such as geographical information of the samples
- Model-based methods assume that observations from each cluster are random draws from a parametric model. Inference of the parameters corresponding to each cluster is done jointly with inference of cluster membership for each individual (ML or MCMC approach). P(X | Z, P), where X is the genotype of an individual, Z the (unknown) population of origin, and P the (unknown) allelic freq of all populations.
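Under H-W equilibrium and unlinked loci, the probability of the genotype data given population membership and allele frequencies can be written out as a product over individuals, loci and allele copies. The indices below (i for individuals, l for loci, a for the two allele copies) are notation introduced here for illustration; this corresponds to a model without admixture.

```latex
\Pr(X \mid Z, P) \;=\; \prod_{i=1}^{N} \prod_{l=1}^{L} \prod_{a=1}^{2} p_{z_i,\, l,\, x_{i,l,a}}
```

Here $x_{i,l,a}$ is the a-th allele copy of individual $i$ at locus $l$, $z_i$ is the population of origin of individual $i$, and $p_{k,l,j}$ is the frequency of allele $j$ at locus $l$ in population $k$. Allele copies are treated as ordered, so no factor of 2 appears for heterozygotes; that factor does not depend on $Z$ and does not affect the inference of membership.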
9Taita thrush (Turdus helleri) in Africa
- Sampled from 4 locations
- 7 microsatellite loci used
- NJ tree of individuals based on pairwise distances
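A distance-based tree of individuals can be sketched in a few lines. Two substitutions are made here for brevity and should be read as assumptions: the distance is one minus the proportion of shared alleles (not necessarily the distance used in the study), and the clustering is UPGMA rather than NJ. The genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations

def allele_sharing_distance(ind1, ind2):
    """1 - proportion of alleles shared between two multilocus diploid genotypes."""
    shared = sum(
        sum((Counter(g1) & Counter(g2)).values()) for g1, g2 in zip(ind1, ind2)
    )
    return 1.0 - shared / (2 * len(ind1))

def upgma(names, dist):
    """Average-linkage clustering; returns the tree as nested tuples.

    `dist` maps frozenset({a, b}) to the distance between a and b.
    """
    clusters = {name: 1 for name in names}  # cluster label -> number of members
    d = dict(dist)
    while len(clusters) > 1:
        # Join the two closest clusters.
        a, b = min(combinations(clusters, 2), key=lambda pair: d[frozenset(pair)])
        merged = (a, b)
        size_a, size_b = clusters.pop(a), clusters.pop(b)
        # Distance to the new cluster is the size-weighted mean of the old distances.
        for other in clusters:
            d[frozenset((merged, other))] = (
                size_a * d[frozenset((a, other))] + size_b * d[frozenset((b, other))]
            ) / (size_a + size_b)
        clusters[merged] = size_a + size_b
    return next(iter(clusters))

# Hypothetical individuals scored at two loci.
individuals = {
    "A": [(100, 100), (200, 202)],
    "B": [(100, 102), (200, 202)],
    "C": [(104, 104), (204, 206)],
    "D": [(104, 106), (204, 204)],
}
dist = {
    frozenset((n1, n2)): allele_sharing_distance(individuals[n1], individuals[n2])
    for n1, n2 in combinations(individuals, 2)
}
tree = upgma(list(individuals), dist)
print(tree)
```

Changing the distance measure or the linkage rule can change the tree, which is the dependence on method that makes these tools exploratory.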
10- STRUCTURE
- Can accurately cluster individuals into their appropriate populations, even with a modest number of loci.
- May be applied to codominant markers such as microsatellites, RFLPs, SNPs.
- Assumes H-W equilibrium and unlinked loci
- Accuracy depends on number of individuals sampled, number of loci, degree of admixture and allelic frequency divergence among subpopulations
- Assigns individuals to subpopulations
- May identify migrants (with incorporation of information on geographic location) or hybrids
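One step of a model-based clustering can be sketched in Python: given current allele frequencies for each cluster, compute the posterior probability that an individual belongs to each cluster. This is only one conditional calculation under assumed values (uniform prior, hypothetical frequencies, a `floor` for unseen alleles); a full analysis would alternate it with re-estimating the allele frequencies, and this is not the STRUCTURE implementation.

```python
import math

def membership_posterior(individual, cluster_freqs, floor=1e-3):
    """Posterior probability of each cluster for one individual.

    Assumes H-W equilibrium, unlinked loci and a uniform prior over clusters.
    `cluster_freqs[k][l]` maps allele -> frequency at locus l in cluster k.
    """
    log_lik = []
    for freqs in cluster_freqs:
        ll = 0.0
        for locus, genotype in enumerate(individual):
            for allele in genotype:
                ll += math.log(freqs[locus].get(allele, floor))
        log_lik.append(ll)
    # Subtract the maximum before exponentiating, for numerical stability.
    top = max(log_lik)
    weights = [math.exp(ll - top) for ll in log_lik]
    total = sum(weights)
    return [w / total for w in weights]

# Hypothetical allele frequencies for two clusters at two loci.
cluster_freqs = [
    [{100: 0.9, 102: 0.1}, {200: 0.8, 202: 0.2}],
    [{100: 0.2, 102: 0.8}, {200: 0.3, 202: 0.7}],
]
individual = [(100, 100), (200, 202)]

print(membership_posterior(individual, cluster_freqs))
```

With more loci or more divergent frequencies the posterior concentrates on one cluster, which is why accuracy depends on those factors.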
11Study based on 18 microsatellite markers scored for 380 individuals
- Use of clustering methods and STRUCTURE to detect discrete genetic populations
- Inference of past bottleneck effects (demography) using three different approaches; extent and timing of demographic events estimated (coalescent methods)
- Good source of the analytical tools currently available for microsatellite data
12Comparison of some markers
Other criteria: Automation? Sequence info needed? Start-up costs? Radioactivity needed?
13DNA sequence analysis
- Homologous sections of the genome are amplified (PCR) and directly sequenced for comparison.
- Easily scored, high-quality information; sometimes alignment is a problem
- Automated procedures available (but expensive)
- Codominant marker, but cloning is necessary to separate alleles from heterozygotes
- Universal primers facilitate sequencing for many taxa without prior knowledge of their DNA sequence (not always true)
- Once polymorphism has been identified by sequencing, other approaches can be used to screen large numbers of individuals (PCR-RFLP, SNPs, SSCP)
14DNA sequence analysis
- APPLICATIONS
- Population diversity and structure (subdivision)
- Hybridization, introgression (among species or populations), gene flow
- Greatest value for phylogenetic and phylogeographic analysis
- ANALYSIS
- Population genetic parameters, genetic distances, Fst, AMOVA, etc.
- Neutrality tests (Fu, Fu and Li)
- Phylogenetic inference, interpretation of phylogeny
- Coalescent approaches becoming more common, increasing power of interpretation
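Some of the basic sequence-based parameters can be computed directly from an alignment. The sketch below uses a hypothetical alignment without gaps and calculates the number of segregating sites, nucleotide diversity and Watterson's estimator of theta. It does not implement the Fu or Fu and Li tests themselves; it only shows the kind of summary statistics that population genetic analyses of sequence data start from.

```python
from itertools import combinations

def segregating_sites(seqs):
    """Number of alignment columns with more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*seqs))

def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of sequences (pi)."""
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * len(seqs[0]))

def watterson_theta(seqs):
    """Watterson's estimator per site: S / (a_n * L), with a_n = sum of 1/i for i < n."""
    a_n = sum(1.0 / i for i in range(1, len(seqs)))
    return segregating_sites(seqs) / (a_n * len(seqs[0]))

# Hypothetical aligned sequences of equal length, no gaps or missing data.
alignment = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]

print(segregating_sites(alignment))
print(nucleotide_diversity(alignment))
print(watterson_theta(alignment))
```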
15mtDNA phylogeography...
16Interspecific phylogeny
- Foundations of cladistic analysis developed by Hennig 1966.
- Cladogenesis only above the species level
- Species are the minimal units for phylogenetic analysis
- Other (many) species concepts available...
17Difficulties of interspecific methods at the intraspecific level
- Few characters for analysis (less variation within species than between species)
- Ancestors not assumed to be extinct; most alleles exist as sets of multiple, identical copies because of past DNA replication
- As mutations occur to create new alleles, they rarely result in the extinction of the ancestral allele.
- Coalescent theory predicts that the most common allele in the gene pool will be the oldest, and most of these will be interior nodes of the haplotype tree
- Multifurcations (rather than strict bifurcations) are expected since common alleles will mutate many times
18Big problem with gene genealogies is recombination
- Traditional methods assume that recombination does not occur (at all levels)
- Need to test for recombination before attempting phylogenetic analysis
- A method that accounts for all these issues was developed to study intraspecific gene genealogies (TCS 1992)
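One simple way to screen for recombination is the four-gamete test, chosen here as an illustrative assumption since no specific test is named above. For two biallelic sites, finding all four two-site haplotypes in the sample cannot be explained by single mutations on one tree without recombination; recurrent mutation can produce the same pattern, so a positive result is a warning rather than proof. The sequences are hypothetical.

```python
from itertools import combinations

def four_gamete_violations(haplotypes):
    """Pairs of site indices at which all four two-site haplotypes are observed."""
    columns = list(zip(*haplotypes))
    # Only sites with exactly two states are informative for this test.
    biallelic = [i for i, col in enumerate(columns) if len(set(col)) == 2]
    violations = []
    for i, j in combinations(biallelic, 2):
        gametes = set(zip(columns[i], columns[j]))
        if len(gametes) == 4:
            violations.append((i, j))
    return violations

# Hypothetical aligned haplotypes.
with_recombination = ["AATT", "AGTT", "CATT", "CGTA"]
tree_like = ["AATT", "AGTT", "CGTT"]

print(four_gamete_violations(with_recombination))
print(four_gamete_violations(tree_like))
```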
19Intraspecific gene genealogy (parsimony network)
20Nested Clade Analysis...
- Nested clades in an mtDNA network for tiger salamanders collected in the central USA (Templeton et al. 1995)
- Circles encompassing letters a-v indicate observed haplotypes
- Branches connect haplotypes differing by a single mutational step
- Differences between subspecies are 14 steps.
- Nested clades of increasing hierarchical level are indicated by numbers
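The first ingredients of such a network can be sketched in Python: collapse sequences into haplotypes with their counts, then connect haplotypes that differ by a single mutational step. This is a deliberately reduced illustration with hypothetical sequences. It does not infer missing intermediate haplotypes, apply a parsimony connection limit or build the nesting levels, so it is not the published network or nesting algorithm.

```python
from collections import Counter
from itertools import combinations

def one_step_edges(haplotypes):
    """Pairs of haplotypes that differ at exactly one site."""
    edges = []
    for h1, h2 in combinations(sorted(haplotypes), 2):
        if sum(a != b for a, b in zip(h1, h2)) == 1:
            edges.append((h1, h2))
    return edges

def degree(edges):
    """Number of one-step connections per haplotype."""
    deg = Counter()
    for h1, h2 in edges:
        deg[h1] += 1
        deg[h2] += 1
    return deg

# Hypothetical sample of aligned sequences.
sequences = ["AAAA"] * 5 + ["AAAT"] * 2 + ["AATA", "CAAA", "CAAG"]

counts = Counter(sequences)           # haplotype -> number of copies sampled
edges = one_step_edges(counts)        # branches of one mutational step
connections = degree(edges)

print(counts.most_common(1))
print(edges)
print(connections)
```

In this toy sample the most frequent haplotype is also the one with the most connections, the pattern expected when common alleles are old and sit at interior nodes.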
21Nested Clade Analysis...
A recent paper criticized the strict interpretation of results obtained with this technique
22Nested Clade Analysis...
The lead author who proposed NCA (Alan Templeton) promptly replied
23Case study comparing mtDNA and nucDNA sequence
variation for phylogeography