mStruct: A New Admixture Model for Inference of Population Structure in Light of Both Genetic Admixing and Allele Mutations

1
mStruct: A New Admixture Model for Inference of
Population Structure in Light of Both Genetic
Admixing and Allele Mutations
  • Suyash Shringarpure and Eric Xing
  • School of Computer Science
  • Carnegie Mellon University
  • ICML 2008

Presented by Haojun Chen
2
Outline
  • Background
  • Structure Model
  • mStruct Model
  • Experiment Results
  • Summary

3
Background
  • Allele: one member of a pair or series of
    different forms of a gene
  • Population structure analysis aims to shed light
    on the evolutionary history of modern human
    populations
  • Microsatellite and single nucleotide
    polymorphism (SNP) data: the basis of population
    structure analysis
  • State-of-the-art method: Structure

4
Structure Model
  • x: microsatellite alleles
  • Unique set of population-specific multinomial
    distributions
  • Vector of multinomial parameters, a.k.a. allele
    frequency profile (AP), of the allele
    distribution at locus i in ancestral population k
  • Total number of observed marker alleles at locus i
  • Total number of marker loci
  • Total number of individuals
  • Individual-specific admixing coefficient vector
5
Pitfall of Structure
  • There is no mutation model for modern individual
    alleles with respect to common prototypes in the
    ancestral populations
  • Every unique allele in the modern population is
    assumed to have a distinct ancestral frequency,
    rather than allowing for the possibility that it
    is just a descendant of some common ancestral allele

6
mStruct Model
  • Set of ancestral alleles
  • Mutation parameter associated with a locus
  • Frequencies of the ancestral alleles
  • Total number of ancestral alleles
  • Mutation models: microsatellite mutation model,
    SNP mutation model
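
As a rough illustration of what a microsatellite mutation model can look like, the sketch below assumes a symmetric stepwise kernel in which the probability of an observed allele decays geometrically with its repeat-count distance from the ancestral allele. The function name, the parameter `delta`, and the normalization over a finite candidate set are all assumptions made for this example; the exact form used in mStruct is not recoverable from this transcript.

```python
import numpy as np

def stepwise_mutation_probs(ancestral_allele, delta, candidate_alleles):
    """P(observed allele | ancestral allele) under an assumed geometric kernel.

    Each extra repeat unit of distance from the ancestral allele multiplies
    the weight by `delta` (0 < delta < 1); weights are then normalized over
    the finite set of candidate alleles.
    """
    candidates = np.asarray(candidate_alleles)
    distance = np.abs(candidates - ancestral_allele)
    weights = delta ** distance
    return weights / weights.sum()

# Illustrative values only: ancestral allele with 10 repeats, delta = 0.5.
probs = stepwise_mutation_probs(
    ancestral_allele=10, delta=0.5, candidate_alleles=[8, 9, 10, 11, 12]
)
```

A smaller `delta` concentrates the probability on the ancestral allele itself, which corresponds to a lower mutation rate at that locus.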
7
Generative Process
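  • Generative process for Structure, for each individual:
      1. sample the admixing coefficient vector
      2. for each marker allele:
         2.1 sample the ancestral population of origin
         2.2 sample the allele from that population's
             allele frequency profile at the locus
  • Generative process for mStruct: step 2.2 above is
    replaced by
         2.2a sample an ancestral allele from that
              population's ancestral allele frequencies
         2.2b sample the observed allele from the
              mutation model, given the ancestral allele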
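
The two generative processes can be sketched side by side as follows. This is a simplified reading, not the authors' implementation: it draws one allele copy per locus, assumes a Dirichlet prior on the admixing vector, uses one mutation parameter per locus, and plugs in an assumed geometric stepwise kernel for the mutation step. All function names, array shapes, and numeric settings are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_structure(theta, beta, rng):
    """Structure-style draw: theta is (K,), beta is (I, K, L) allele frequencies."""
    num_loci, num_pops, num_alleles = beta.shape
    alleles = np.empty(num_loci, dtype=int)
    for i in range(num_loci):
        z = rng.choice(num_pops, p=theta)                   # 2.1 ancestral population
        alleles[i] = rng.choice(num_alleles, p=beta[i, z])  # 2.2 allele from its AP
    return alleles

def mutation_probs(ancestor, delta, candidates):
    """Assumed geometric stepwise kernel, normalized over the candidates."""
    weights = delta ** np.abs(candidates - ancestor)
    return weights / weights.sum()

def sample_mstruct(theta, beta, ancestral_alleles, delta, candidates, rng):
    """mStruct-style draw: step 2.2 becomes ancestral allele plus mutation."""
    num_loci, num_pops, num_ancestral = beta.shape
    observed = np.empty(num_loci, dtype=int)
    for i in range(num_loci):
        z = rng.choice(num_pops, p=theta)             # 2.1 ancestral population
        c = rng.choice(num_ancestral, p=beta[i, z])   # pick an ancestral allele
        ancestor = ancestral_alleles[i, z, c]
        probs = mutation_probs(ancestor, delta[i], candidates)
        observed[i] = rng.choice(candidates, p=probs) # mutate to observed allele
    return observed

# Illustrative sizes: 3 populations, 50 loci, 4 (ancestral) alleles per locus.
K, I, A = 3, 50, 4
theta = rng.dirichlet(np.ones(K))                    # step 1: admixing vector
beta = rng.dirichlet(np.ones(A), size=(I, K))        # shape (I, K, A)
ancestral_alleles = rng.integers(5, 20, size=(I, K, A))
candidates = np.arange(1, 25)                        # possible repeat counts
delta = np.full(I, 0.3)

structure_draw = sample_structure(theta, beta, rng)
mstruct_draw = sample_mstruct(theta, beta, ancestral_alleles, delta, candidates, rng)
```

In the Structure draw every distinct observed allele needs its own frequency entry, whereas in the mStruct draw many observed alleles can descend from the same small set of ancestral alleles.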

8
mStruct Model Inference
  • MCMC: slow
  • Variational inference for hidden variables
  • Variational EM for hyperparameters

9
Synthetic Data
Twenty microsatellite genotype datasets with 100
individuals from 3 ancestral populations at 50
genotype loci
10
HGDP Microsatellite Data
  • Model selection by BIC (Bayesian Information
    Criterion) score
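
For reference, BIC-based selection of the number of ancestral populations can be sketched as below, using the standard definition BIC = k ln(n) - 2 ln(L), under which lower scores are better (some texts use the opposite sign). The log-likelihoods and parameter counts here are made-up placeholders for illustration, not results from the HGDP analysis.

```python
import math

def bic(log_likelihood, num_params, num_observations):
    """BIC = k * ln(n) - 2 * ln(L); lower is better under this sign convention."""
    return num_params * math.log(num_observations) - 2.0 * log_likelihood

# Hypothetical (log-likelihood, parameter count) pairs for K = 2, 3, 4.
fits = {2: (-1250.0, 40), 3: (-1180.0, 60), 4: (-1170.0, 80)}
scores = {k: bic(ll, p, num_observations=100) for k, (ll, p) in fits.items()}
best_k = min(scores, key=scores.get)
```

The penalty term grows with the number of parameters, so a larger K is chosen only when its gain in log-likelihood outweighs the added complexity.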

11
HGDP Microsatellite Data
1056 individuals from 52 populations at 377
autosomal microsatellite loci
am-spectrum: spectrums of different ancestral populations
gm-spectrum: spectrums of different geographical populations
12
Contour of Mutation Rates
13
Summary
  • mStruct accounts for both genetic admixture and
    allele mutation effects
  • mStruct extends LDA to allow noisy
    observations
  • A variational inference algorithm was developed
    for mStruct, making inference tractable
  • Other applications: images, text, and so on