An introduction to genetic analysis software: genotype data analysis

Slides: 51
Provided by: ecology
1
An introduction to genetic analysis software:
genotype data analysis
2
Topics in population genetics
  • Genetic diversity
  • Genetic distance
  • F-statistics
  • Population structure
  • Detection of new immigrants
  • Population size

3
Software for population genetics
  • Multi-purpose packages: Arlequin, DnaSP, FSTAT, Genepop, GDA,
    MEGA, MSA, SPAGeDi, GENETIX, Popgene, GeneStrut, TFPGA, GenAlEx
  • Individual-centred programs: BayesAss, BAPS, GeneClass, Geneland,
    NewHybrids, Structure
  • Specialized programs: BATWING, COLONISE, FDIST2, Hickory, IM,
    LAMARC, Migrate, MSVAR, Bottleneck
  • Tree construction programs: PHYLIP, PAUP, DISPAN
  • AFLP-specific programs: AFLPOP, AFLPdat, AFLP-SURV

4
How to find and choose software
  • Related research papers (methods sections)
  • Software reviews
  • e.g. Excoffier, L. and Heckel, G. (2006). Computer programs for
    population genetics data analysis: a survival guide.
    Nature Reviews Genetics 7: 745-758

5
(No Transcript)
6
(No Transcript)
7
(No Transcript)
8
  • http://evolution.genetics.washington.edu/phylip/software.html

9
(No Transcript)
10
(No Transcript)
11
Data tests
  • Tests on loci: null alleles (SSR), linkage disequilibrium,
    selective neutrality
  • Tests on populations: Hardy-Weinberg equilibrium

12
Null alleles (SSR)
  • Methods for estimating null allele frequencies: Chakraborty
    (Chakraborty et al. 1992) and Brookfield (Brookfield 1996)
  • Software: Micro-Checker, Genepop
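
The two estimators named above can be sketched from a locus's expected and observed heterozygosity. This is a minimal illustration, assuming the commonly quoted forms of the Chakraborty estimator and of Brookfield's first estimator; the function name and the example values are hypothetical, and dedicated programs such as Micro-Checker do considerably more than this.

```python
def null_allele_frequency(h_exp, h_obs):
    """Estimate the null allele frequency at one locus.

    h_exp: expected heterozygosity, h_obs: observed heterozygosity.
    Returns (chakraborty, brookfield1).
    """
    # Chakraborty et al. (1992): r = (He - Ho) / (He + Ho)
    chakraborty = (h_exp - h_obs) / (h_exp + h_obs)
    # Brookfield (1996), estimator 1: r = (He - Ho) / (1 + He)
    brookfield1 = (h_exp - h_obs) / (1.0 + h_exp)
    return chakraborty, brookfield1


# Hypothetical locus with a heterozygote deficit
chak, brook = null_allele_frequency(h_exp=0.80, h_obs=0.60)
print(round(chak, 4), round(brook, 4))
```

Both estimates are only meaningful when the heterozygote deficit is caused by null alleles rather than, for example, inbreeding or population substructure.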

13
Linkage disequilibrium
  • Exact tests for linkage disequilibrium (Guo and Thompson, 1992):
    GDA, Genepop, FSTAT
  • Likelihood ratio test for genotypic data (Slatkin and Excoffier,
    1996): Arlequin, Genepop
  • Popgene: χ² tests for significance (Weir 1979)
  • Ohta's D-statistics (Ohta 1982) for multiple populations

14
Selective neutrality tests
  • Tests based on the infinite-alleles model
  • Ewens-Watterson test (Ewens, 1972; Watterson, 1978): Popgene,
    Arlequin
  • Ewens-Watterson-Slatkin test (Slatkin 1994b, 1996): Arlequin
  • Chakraborty's test (Chakraborty, 1990): Arlequin
  • FDIST2 can also test for selection, but under which model?

15
Hardy-Weinberg equilibrium
  • Chi-square tests: Popgene
  • G-tests (Levene, 1949): Popgene
  • Exact tests (Guo and Thompson, 1992): Genepop, GDA, Arlequin,
    FSTAT
  • Tests for heterozygote excess or deficiency (Rousset and Raymond,
    1995): Genepop
  • Exact tests are generally more conservative than traditional
    chi-square tests and G-tests
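
As a rough illustration of the chi-square approach listed above, the sketch below tests a single biallelic locus against Hardy-Weinberg proportions. It assumes both alleles are present in the sample and uses one degree of freedom; the function name and genotype counts are hypothetical, and the listed packages should be preferred for real data, especially for exact tests.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test of Hardy-Weinberg proportions at a biallelic locus."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1.0 - p                       # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical sample showing a heterozygote deficit
chi2, p_value = hwe_chi_square(n_aa=30, n_ab=40, n_bb=30)
print(chi2, round(p_value, 4))
```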

16
Topics in genetic data analysis
  • Genetic diversity
  • Genetic distance: genetic distance trees, isolation by distance
  • F-statistics: population differentiation, migration rate
  • Population structure: AMOVA, clustering patterns
  • Immigrants: detection of new immigrants, hybridization

17
Genetic diversity
  • Percentage of polymorphic loci: Popgene, Arlequin, GDA
  • Expected and observed heterozygosity: almost all multi-purpose
    packages
  • Number of alleles: Popgene, MSA, FSTAT, GDA, SPAGeDi
  • Allele and/or genotype frequencies: Popgene, Genepop, MSA, FSTAT,
    SPAGeDi
  • Allelic richness: FSTAT, MSA
  • Private alleles: GDA
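
Several of the diversity measures above can be computed directly from diploid genotypes. The sketch below handles one locus and assumes genotypes are given as pairs of allele labels with no missing data; the function name, dictionary keys and example genotypes are illustrative only.

```python
from collections import Counter


def locus_diversity(genotypes):
    """Basic diversity statistics for one locus from diploid genotypes."""
    alleles = [a for genotype in genotypes for a in genotype]
    n_genes = len(alleles)                       # 2n gene copies
    counts = Counter(alleles)
    freqs = {a: c / n_genes for a, c in counts.items()}
    # Observed heterozygosity: share of individuals with two different alleles
    h_obs = sum(1 for a, b in genotypes if a != b) / len(genotypes)
    # Expected heterozygosity under Hardy-Weinberg: 1 - sum of squared frequencies
    h_exp = 1.0 - sum(f * f for f in freqs.values())
    # Small-sample correction 2n / (2n - 1)
    h_exp_unbiased = h_exp * n_genes / (n_genes - 1)
    return {
        "n_alleles": len(counts),
        "frequencies": freqs,
        "h_obs": h_obs,
        "h_exp": h_exp,
        "h_exp_unbiased": h_exp_unbiased,
    }


# Hypothetical microsatellite genotypes (allele sizes in bp)
stats = locus_diversity([(100, 100), (100, 102), (102, 104), (100, 104)])
print(stats["n_alleles"], stats["h_obs"], stats["h_exp"])
```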

18
genetic distances and F-statistics
19
Genetic distances
  • Standard genetic distance (Nei, 1978)

$D = -\ln\left( J_{XY} / \sqrt{J_X J_Y} \right)$
$j_X$: the probability of identity of two randomly
chosen genes in population X
$j_Y$: the probability of identity of two randomly
chosen genes in population Y
$j_{XY}$: the probability of identity of a gene from X and
a gene from Y
$J_X$, $J_Y$ and $J_{XY}$ are the arithmetic means of $j_X$, $j_Y$
and $j_{XY}$ over all loci
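
The definitions above translate into a short calculation on allele frequencies. This sketch assumes each population is given as a list with one dictionary per locus mapping alleles to frequencies, and it uses the plain identities without the sample-size correction that Nei (1978) applies; the function name and the example data are hypothetical.

```python
import math


def nei_standard_distance(pop_x, pop_y):
    """Nei's standard genetic distance from per-locus allele frequencies."""
    sum_jx = sum_jy = sum_jxy = 0.0
    for fx, fy in zip(pop_x, pop_y):
        sum_jx += sum(p * p for p in fx.values())                    # j_X
        sum_jy += sum(p * p for p in fy.values())                    # j_Y
        sum_jxy += sum(p * fy.get(a, 0.0) for a, p in fx.items())    # j_XY
    r = len(pop_x)
    big_jx, big_jy, big_jxy = sum_jx / r, sum_jy / r, sum_jxy / r
    return -math.log(big_jxy / math.sqrt(big_jx * big_jy))


# Hypothetical single-locus example
pop_x = [{"A": 1.0}]
pop_y = [{"A": 0.5, "B": 0.5}]
d_nei = nei_standard_distance(pop_x, pop_y)
print(round(d_nei, 4))
```

The distance is undefined when the two populations share no alleles at any locus, because the logarithm's argument is then zero.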
20
Genetic distances
  • Da (Nei et al. 1983)

$D_A = 1 - \frac{1}{r} \sum_{\text{loci}} \sum_{i=1}^{m} \sqrt{x_i y_i}$
Derived from Dc (Cavalli-Sforza and Edwards, 1967)
$x_i$: the frequency of the ith allele in one population
$y_i$: the frequency of the ith allele in the other population
$m$: number of alleles at each locus
$r$: number of loci
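
A minimal sketch of the Da calculation, assuming the usual form of the distance (one minus the per-locus average of the summed square roots of paired allele frequencies) and the same hypothetical input layout of one allele-frequency dictionary per locus:

```python
import math


def nei_da_distance(pop_x, pop_y):
    """Da distance (Nei et al. 1983) from per-locus allele frequencies."""
    r = len(pop_x)  # number of loci
    total = 0.0
    for fx, fy in zip(pop_x, pop_y):
        # Alleles absent from one population contribute zero to the sum
        total += sum(math.sqrt(p * fy.get(a, 0.0)) for a, p in fx.items())
    return 1.0 - total / r


# Hypothetical single-locus example
d_a = nei_da_distance([{"A": 1.0}], [{"A": 0.5, "B": 0.5}])
print(round(d_a, 4))
```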
21
Genetic distances
  • (δμ)², Ddm (Goldstein et al. 1995a)
  • D1, ASD, average square distance (Goldstein et al., 1995b;
    Slatkin, 1995)
  • Dps, proportion of shared alleles (Bowcock et al., 1994)
  • Dfs, fuzzy set similarity (Dubois and Prade, 1980)
  • Dkf, kinship coefficient (Cavalli-Sforza and Bodmer, 1971)
  • Dad, absolute difference
  • Coancestry distance (Reynolds et al., 1983)
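
Of the microsatellite distances listed, (δμ)² is the simplest to write down: the squared difference between the mean allele sizes of two populations, averaged over loci. The sketch below assumes allele sizes are expressed in repeat units and supplied as one list of gene copies per locus; the function name and data are illustrative.

```python
def delta_mu_squared(pop_a, pop_b):
    """(delta mu)^2: squared difference in mean allele size, averaged over loci."""
    def mean(values):
        return sum(values) / len(values)

    per_locus = [(mean(a) - mean(b)) ** 2 for a, b in zip(pop_a, pop_b)]
    return mean(per_locus)


# Two hypothetical loci, allele sizes in repeat units
pop_a = [[10, 12], [20, 20]]
pop_b = [[14, 14], [21, 21]]
ddm = delta_mu_squared(pop_a, pop_b)
print(ddm)
```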

22
F-statistics
  • θ, f, F: Weir BS, Cockerham CC (1984) Evolution, 38,
    1358-1370.
  • RST, RIT, RIS: Slatkin, M. (1995) Genetics 139,
    457-462.
  • NST: Pons O, Petit RJ (1996) Genetics 144,
    1237-1245.
  • ρST: Rousset, F. (1996) Genetics 142, 1357-1362.
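
To make the idea of a fixation index concrete, the sketch below computes a simple heterozygosity-based FST, (HT - HS) / HT, for one locus. This is deliberately not the Weir and Cockerham (1984) estimator cited above, which also corrects for sample sizes; populations are weighted equally, and the function name and frequencies are hypothetical.

```python
def fst_from_frequencies(pops):
    """Heterozygosity-based F_ST at one locus from allele frequencies."""
    k = len(pops)
    alleles = set().union(*pops)
    # H_S: mean expected heterozygosity within populations
    h_s = sum(1.0 - sum(f * f for f in p.values()) for p in pops) / k
    # H_T: expected heterozygosity of the pooled (mean) allele frequencies
    mean_freq = {a: sum(p.get(a, 0.0) for p in pops) / k for a in alleles}
    h_t = 1.0 - sum(f * f for f in mean_freq.values())
    return (h_t - h_s) / h_t


# Two hypothetical populations with strongly diverged frequencies
fst = fst_from_frequencies([{"A": 0.8, "B": 0.2}, {"A": 0.2, "B": 0.8}])
print(round(fst, 2))
```

The function divides by zero when the locus is monomorphic across all populations, so such loci would need to be filtered out first.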

23
Distance-based phylogenetic trees
  • Input: genetic distance matrix
  • UPGMA (unweighted pair group method with
    arithmetic averages)
  • NJ (the neighbor-joining method)
  • Software: PHYLIP, PAUP, DISPAN
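
UPGMA is simple enough to sketch in a few lines: repeatedly merge the two closest clusters and replace their distances to every other cluster with a size-weighted average. The version below is a bare-bones illustration that returns only the tree topology as nested tuples, with no branch lengths or bootstrap support; the function name and the distance matrix are hypothetical.

```python
def upgma(labels, dist):
    """Build a UPGMA topology from a symmetric distance matrix."""
    n = len(labels)
    clusters = {i: (labels[i], 1) for i in range(n)}          # id -> (tree, size)
    d = {(i, j): dist[i][j] for i in range(n) for j in range(i)}  # keys: (larger, smaller)
    next_id = n
    while len(clusters) > 1:
        i, j = min(d, key=d.get)                               # closest pair
        (tree_i, n_i), (tree_j, n_j) = clusters.pop(i), clusters.pop(j)
        new_d = {}
        for k in clusters:
            d_ik = d[(max(i, k), min(i, k))]
            d_jk = d[(max(j, k), min(j, k))]
            # Size-weighted average distance to the merged cluster
            new_d[(next_id, k)] = (n_i * d_ik + n_j * d_jk) / (n_i + n_j)
        d = {pair: v for pair, v in d.items() if i not in pair and j not in pair}
        d.update(new_d)
        clusters[next_id] = ((tree_j, tree_i), n_i + n_j)
        next_id += 1
    (tree, _size), = clusters.values()
    return tree


# Three hypothetical populations: A and B are close, C is distant
tree = upgma(["A", "B", "C"], [[0, 2, 6], [2, 0, 6], [6, 6, 0]])
print(tree)
```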

24
Genetic distance tree
UPGMA tree
(Muir, Fleming and Schlotterer 2000, Nature)
25
Genetic distance tree
NJ tree: dendrogram of 7 Quercus mongolica
populations based on standard genetic distance
(Nei 1978), computed using the NJ approach in PHYLIP.
Numbers are bootstrap support values.
26
Isolation by distance
  • Inputs: geographic distance matrix (km) and genetic distance
    matrix
  • Compute the correlation between the two distance matrices with a
    Mantel test (Mantel, 1967; Smouse et al., 1986)
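
A Mantel test can be sketched as a permutation test on the correlation between the two matrices: the rows and columns of one matrix are shuffled together and the correlation recomputed each time. The version below is a minimal one-sided test; the function names, the number of permutations, the fixed random seed and the example matrices are all assumptions made for illustration.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_test(a, b, permutations=999, seed=0):
    """One-sided Mantel test between two symmetric distance matrices."""
    n = len(a)
    pairs = [(i, j) for i in range(n) for j in range(i)]   # lower triangle
    x = [a[i][j] for i, j in pairs]

    def flatten_b(order):
        return [b[order[i]][order[j]] for i, j in pairs]

    r_obs = pearson(x, flatten_b(list(range(n))))
    rng = random.Random(seed)
    hits = 1                       # the observed value counts as one permutation
    for _ in range(permutations):
        order = list(range(n))
        rng.shuffle(order)         # permute rows and columns of b together
        if pearson(x, flatten_b(order)) >= r_obs:
            hits += 1
    return r_obs, hits / (permutations + 1)


# Hypothetical data: genetic distance exactly proportional to geographic distance
coords = [0, 1, 3, 7]
geo = [[abs(u - v) for v in coords] for u in coords]
gen = [[2 * dist for dist in row] for row in geo]
r_mantel, p_mantel = mantel_test(geo, gen, permutations=99)
print(round(r_mantel, 3), p_mantel)
```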

27
Mantel test based on 37 Quercus mongolica
populations
(Figure axes: genetic distance and geographic distance)
Software: FSTAT, Arlequin, SPAGeDi, Genepop
28
Population structure
  • AMOVA (analysis of molecular variance)
  • Population genetic structure inferred by
    analysis of variance
  • Software: Arlequin, AMOVA, GDA

29
Migration rate
  • Gene flow based on the island model

Wright, S. 1978. Evolution and the genetics of
populations. University of Chicago Press, Chicago.
Hartl, D. L., A. G. Clark. 1989. Principles of
population genetics, 2nd edition. Sinauer
Associates, Sunderland.
Software: Popgene; Genepop (private allele method)
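
The slide's own formula did not survive in the transcript. As a stand-in, the sketch below uses the widely quoted island-model approximation FST ≈ 1 / (4Nm + 1), rearranged to give the number of migrants per generation, Nm. The function name is hypothetical, and the relation holds only under the idealized assumptions of Wright's island model.

```python
def nm_from_fst(fst):
    """Migrants per generation under the island model: Nm = (1 - FST) / (4 FST)."""
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must lie strictly between 0 and 1")
    return (1.0 - fst) / (4.0 * fst)


nm = nm_from_fst(0.2)
print(nm)
```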
30
Migration rate
  • Immigrants identified from individual genetic distances

Neighbor-joining tree of individuals in the T.
helleri data set (Pritchard et al. 2000, Genetics).
The pairwise distance matrix was computed
following Mountain and Cavalli-Sforza (1997).
31
Detecting recent immigrants
  • Bayesian clustering approach
  • Markov chain Monte Carlo
  • Software: NewHybrids, Geneland, IM, BAPS, BayesAss,
    Structure

32
(No Transcript)
33
(No Transcript)
34
Data conversion software
  • SSR data
  • 1. CONVERT (Glaubitz, J.C. 2004, Molecular Ecology
    Notes 4: 309-310)
  • GDA
  • GENEPOP
  • ARLEQUIN
  • POPGENE
  • MICROSAT
  • PHYLIP
  • STRUCTURE
  • Table of allele frequencies

35
Data conversion software
  • 2. Formatomatic
  • Genepop
  • Arlequin
  • IMMANC
  • MSA
  • Msvar
  • Arlequin
  • Migrate
  • IM

36
Data conversion software
  • Arlequin
  • GenePop ver. 3.0
  • Biosys ver. 1.0
  • Phylip ver. 3.5
  • Mega ver. 1.0
  • WinAmova ver. 1.55

37
Flow chart of possible data exchange between
different population genetics programs
38
(No Transcript)
39
Data conversion software
  • AFLP data
  • AFLPdat (Ehrich, D. 2006, Mol. Ecol. Notes 6:
    603-604)
  • ARLEQUIN
  • STRUCTURE (version 2.1)
  • TREECON
  • PAUP
  • HICKORY

40
Software introduced
  • MSA (Microsatellite analysis)
  • Genepop
  • Structure

41
(No Transcript)
42
STRUCTURE: inference of population structure
using multilocus genotype data
43
Applications
  • Demonstrate the presence of population structure
  • Assign individuals to populations
  • Identify migrants and admixed individuals
  • Markers: microsatellites (SSR), SNPs, RFLP,
    sequence and AFLP

44
Models
  • Main modeling assumptions
  • Hardy-Weinberg equilibrium within populations
  • Complete linkage equilibrium between loci within
    populations

45
Inference for the number of populations (K)
Probability for K = 2
Simplify to
46
Easy to infer K
K = 3
47
Difficult to determine K
Fig. 3-B,F Log probability of data L(K) as a
function of K, from Evanno, G., Regnaut, S., and
Goudet, J. (2005)
48
ΔK to determine K
  • Based on the second-order rate of change of the
    likelihood function (ΔK)
  • $L'(K) = L(K) - L(K - 1)$
  • $L''(K) = L'(K + 1) - L'(K)$
  • $\Delta K = m(|L''(K)|) / s[L(K)]$
  • $\Delta K = m(|L(K + 1) - 2 L(K) + L(K - 1)|) / s[L(K)]$
  • Evanno, G., Regnaut, S., and Goudet, J. (2005).
    Detecting the number of clusters of individuals
    using the software STRUCTURE: a simulation study.
    Mol. Ecol., 14: 2611-2620
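
The ΔK calculation can be sketched from the log-probability values of replicate STRUCTURE runs. The version below takes the absolute second difference of the run means and divides by the standard deviation of L(K) across runs, which is one common reading of the formula; the function name, the input layout (a dictionary from K to a list of L(K) values) and the numbers are hypothetical.

```python
from statistics import mean, stdev


def evanno_delta_k(lnp):
    """Delta K for every K that has results for both K - 1 and K + 1.

    lnp maps K to a list of L(K) values, one per replicate run.
    """
    result = {}
    for k in sorted(lnp):
        if k - 1 not in lnp or k + 1 not in lnp:
            continue   # the second difference needs both neighbours
        second_diff = abs(mean(lnp[k + 1]) - 2 * mean(lnp[k]) + mean(lnp[k - 1]))
        result[k] = second_diff / stdev(lnp[k])
    return result


# Hypothetical L(K) values from two replicate runs per K
runs = {
    1: [-1000.0, -1002.0],
    2: [-900.0, -902.0],
    3: [-890.0, -894.0],
    4: [-885.0, -895.0],
}
delta_k = evanno_delta_k(runs)
best_k = max(delta_k, key=delta_k.get)
print(best_k, delta_k)
```

ΔK cannot be evaluated for the smallest or largest K tested, and it requires at least two replicate runs with non-identical L(K) at each K.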

49
(No Transcript)
50
Similarity coefficients to determine K
  • Similarity coefficients between runs of the same
    K value
  • The coefficient $C(Q_1, Q_2)$
  • $C(Q_1, Q_2) = 1 - \min_P \lVert Q_1 - P(Q_2) \rVert_F / \lVert Q_1 - 1/K \rVert_F$
  • Rosenberg et al., 2002. Genetic Structure of
    Human Populations. Science 298: 2381-2385