1
IBD Estimation in Pedigrees
  • Gonçalo Abecasis
  • University of Oxford

2
3 Stages of Genetic Mapping
  • Are there genes influencing this trait?
  • Epidemiological studies
  • Where are those genes?
  • Linkage analysis
  • What are those genes?
  • Association analysis

3
(No Transcript)
4
Relationship Checking
5
Where are those genes?
6
Tracing Chromosomes
7
Sometimes it is easy
8
Sharing, or Not?
9
Data
  • Polymorphic markers
  • E.g., microsatellite repeats, SNPs
  • Allele frequency
  • Location
  • Task
  • Phase markers
  • Place recombinants

10
Complexity of the Problem
  • For each meiosis
  • In a pedigree with n non-founders, there are 2n
    meioses each with 2 possible outcomes
  • For each location
  • One for each of m markers
  • Up to $4^{nm}$ distinct outcomes
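To give a feel for how fast this count grows, here is a minimal sketch that multiplies out the slide's numbers. It reads the last bullet as 4^(nm), that is, 2^(2n) outcomes at each of m markers. The function name is made up for this illustration.

```python
def gene_flow_outcomes(n_nonfounders: int, n_markers: int) -> int:
    """Upper bound on distinct outcomes: 2 outcomes per meiosis,
    2n meioses per location, m locations."""
    meioses = 2 * n_nonfounders
    per_location = 2 ** meioses           # equals 4**n
    return per_location ** n_markers      # equals 4**(n*m)

# A sib pair (n = 2 non-founders) typed at 10 markers
print(gene_flow_outcomes(2, 10))
```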

11
Elston-Stewart Algorithm
  • Factorize likelihood by individual
  • Each step assigns phase
  • for all markers
  • for one individual
  • Complexity $\propto n \cdot 4^{m}$
  • Small number of markers
  • Large pedigrees
  • With little inbreeding

12
Lander-Green Algorithm
  • Factorize likelihood by marker
  • Each step assigns phase
  • For one marker
  • For all individuals in the pedigree
  • Complexity $\propto m \cdot 4^{n}$
  • Large number of markers
  • Assumes no interference
  • Relatively small pedigrees
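The trade-off between the two factorizations can be made concrete with a small sketch. It assumes the complexity bullets read as n·4^m for Elston-Stewart and m·4^n for Lander-Green, and it reports relative operation counts only, not run times. The function names are invented for this example.

```python
def elston_stewart_cost(n: int, m: int) -> int:
    """Relative cost when factorizing by individual: linear in n, exponential in m."""
    return n * 4 ** m

def lander_green_cost(n: int, m: int) -> int:
    """Relative cost when factorizing by marker: linear in m, exponential in n."""
    return m * 4 ** n

# (n non-founders, m markers): a large pedigree with few markers,
# then a small pedigree with many markers
for n, m in [(50, 3), (4, 400)]:
    print(n, m, elston_stewart_cost(n, m), lander_green_cost(n, m))
```

Under these assumptions, each algorithm is cheap exactly where the other is not, which matches the "small number of markers, large pedigrees" versus "large number of markers, relatively small pedigrees" bullets.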

13
Markov-Chain Monte-Carlo
  • Approximate solutions
  • Explore only most likely outcomes
  • Remove restrictions
  • Pedigree size
  • Number of markers
  • Inbreeding
  • Assuming no interference
  • Computationally intensive

14
Popular Packages
  • Elston-Stewart Algorithm
  • LINKAGE / FASTLINK (Lathrop et al, 1985)
  • VITESSE (O'Connell and Weeks, 1995)
  • Lander-Green Algorithm
  • Genehunter (Kruglyak et al, 1995)
  • Allegro (Gudbjartsson et al, 2000)
  • MCMC
  • Simwalk2 (Sobel et al, 1996)
  • LOKI (Heath, 1998)

15
1. Enumerate Possibilities
  • Enumerate gene-flow patterns
  • Gene-flow pattern
  • Sets transmitted allele for each meiosis
  • Implies founder allele for each individual
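A minimal sketch of this enumeration for a hypothetical nuclear family (two founders, two children). The pedigree layout, the bit convention (0 picks a parent's first allele, 1 the second) and all names are assumptions made for illustration.

```python
from itertools import product

# Hypothetical pedigree: founders first, then each child with (father, mother).
FOUNDERS = ["father", "mother"]
CHILDREN = {"sib1": ("father", "mother"), "sib2": ("father", "mother")}

def enumerate_patterns():
    """Every gene-flow pattern: one bit for each of the 2n meioses."""
    return list(product((0, 1), repeat=2 * len(CHILDREN)))

def founder_alleles(pattern):
    """Founder-allele labels (0 .. 2f-1) carried by each individual.

    Bits are ordered (paternal meiosis, maternal meiosis) for each child,
    in the order the children appear in CHILDREN.
    """
    carried = {name: (2 * i, 2 * i + 1) for i, name in enumerate(FOUNDERS)}
    bits = iter(pattern)
    for child, (dad, mom) in CHILDREN.items():
        carried[child] = (carried[dad][next(bits)], carried[mom][next(bits)])
    return carried

patterns = enumerate_patterns()
print(len(patterns))                     # 2**4 = 16 patterns
print(founder_alleles((0, 1, 0, 0)))
```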

16
2. Founder Allele Sets
  • For each gene flow pattern v
  • Enumerate set A(G,v)
  • All allele states $a = (a_1, \ldots, a_{2f})$
  • Compatible with both
  • Gene flow v
  • Genotypes G
  • The likelihood is $L(v \mid G) = 2^{-2n} \sum_{a \in A(G,v)} \prod_{i} f(a_i)$
  • $f(a_i)$ is the frequency of allele $a_i$
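The likelihood on this slide can be sketched by brute force for a tiny pedigree: loop over every assignment of allele states to the 2f founder alleles, keep those compatible with the gene flow and the genotypes, and sum the products of allele frequencies. The pedigree, the allele frequencies and the helper names below are made up, and a real implementation would avoid the exhaustive loop.

```python
from itertools import product

FOUNDERS = ["father", "mother"]
CHILDREN = {"sib1": ("father", "mother"), "sib2": ("father", "mother")}
FREQ = {"1": 0.3, "2": 0.7}          # made-up allele frequencies f(a)

def carried_slots(pattern):
    """Founder-allele slots (0 .. 2f-1) carried by each person under pattern v."""
    slots = {name: (2 * i, 2 * i + 1) for i, name in enumerate(FOUNDERS)}
    bits = iter(pattern)
    for child, (dad, mom) in CHILDREN.items():
        slots[child] = (slots[dad][next(bits)], slots[mom][next(bits)])
    return slots

def is_compatible(states, slots, genotypes):
    """True if allele states a reproduce every observed (unordered) genotype."""
    for person, observed in genotypes.items():
        first, second = slots[person]
        if sorted((states[first], states[second])) != sorted(observed):
            return False
    return True

def likelihood(pattern, genotypes):
    """Sum of allele-frequency products over compatible states, times 2^(-2n)."""
    slots = carried_slots(pattern)
    total = 0.0
    for states in product(FREQ, repeat=2 * len(FOUNDERS)):   # all a = (a_1..a_2f)
        if is_compatible(states, slots, genotypes):
            weight = 1.0
            for allele in states:
                weight *= FREQ[allele]
            total += weight
    return total / 2 ** (2 * len(CHILDREN))

genotypes = {"sib1": ("1", "1"), "sib2": ("1", "1")}   # parents untyped
print(likelihood((0, 0, 0, 0), genotypes))   # sibs share both founder alleles
print(likelihood((0, 0, 1, 1), genotypes))   # sibs share no founder alleles
```

Founder alleles that no typed individual carries are unconstrained, so their frequencies sum to one and drop out of the total.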

17
For example ...
Genotypes
Gene Flow
Founder Alleles
Four meioses. Three "1" alleles
required. Likelihood $(1/2)^{4} f(a_1)^{3}$
18
Single Marker Probabilities
  • We now have ...
  • Likelihood for each gene flow pattern
  • Conditional on genotypes
  • Conditional on allele frequencies
  • Conditional on a single marker
  • Probability for each gene-flow pattern
  • $P(v) = L(v) / \sum_{v'} L(v')$
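The normalization step is short enough to sketch directly. The likelihood values below are invented, and the function name is hypothetical.

```python
def pattern_probabilities(likelihoods):
    """Turn per-pattern likelihoods L(v) into probabilities P(v)."""
    total = sum(likelihoods.values())
    if total == 0:
        raise ValueError("genotypes are incompatible with every gene-flow pattern")
    return {v: value / total for v, value in likelihoods.items()}

# Made-up likelihoods for three gene-flow patterns
example = {(0, 0): 0.02, (0, 1): 0.06, (1, 1): 0.0}
print(pattern_probabilities(example))
```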

19
3. Allowing for Recombination
  • Transition Probability
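  • $T(v_a \to v_b, \theta) = (1-\theta)^{\mathrm{nr}(v_a,v_b)} \, \theta^{\mathrm{r}(v_a,v_b)}$, where r and nr count the recombinant and non-recombinant meioses between $v_a$ and $v_b$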
  • Transition Matrix
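One way to build this matrix, assuming each gene-flow pattern is stored as an integer with one bit per meiosis: the number of recombinants between two patterns is then the number of bits in which they differ. The encoding and function names are assumptions for this sketch.

```python
def transition_probability(v_a: int, v_b: int, n_meioses: int, theta: float) -> float:
    """(1 - theta)^nr * theta^r for the meioses that differ between v_a and v_b."""
    recombinants = bin(v_a ^ v_b).count("1")
    non_recombinants = n_meioses - recombinants
    return (1 - theta) ** non_recombinants * theta ** recombinants

def transition_matrix(n_meioses: int, theta: float):
    """Dense k x k matrix with k = 2**n_meioses gene-flow patterns."""
    k = 2 ** n_meioses
    return [[transition_probability(a, b, n_meioses, theta) for b in range(k)]
            for a in range(k)]

T = transition_matrix(n_meioses=2, theta=0.1)
for row in T:
    print(row)
```

Each row sums to one, since every pattern at location A must lead to some pattern at location B.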

Location A
Location B
20
Moving along chromosome
  • Input
  • Vector v of likelihoods at location A
  • Matrix T of transition probabilities A→B
  • Output
  • Vector v of likelihoods at location B
  • Conditional on likelihoods at A
  • For k gene-flow patterns (vector entries), requires $k^2$ operations
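A direct sketch of this step as a vector-matrix product, again assuming gene-flow patterns are encoded as integers with one bit per meiosis. The double loop makes the quadratic cost visible. Names are illustrative.

```python
def move_along(vector_a, n_meioses: int, theta: float):
    """Carry a likelihood vector from location A to location B.

    out[b] = sum over a of vector_a[a] * T(a -> b), which visits every
    (a, b) pair once: k * k operations for k gene-flow patterns.
    """
    k = len(vector_a)
    out = [0.0] * k
    for b in range(k):
        for a in range(k):
            r = bin(a ^ b).count("1")            # recombinant meioses
            out[b] += vector_a[a] * (1 - theta) ** (n_meioses - r) * theta ** r
    return out

# All the weight on pattern 0 at location A, two meioses, theta = 0.1
print(move_along([1.0, 0.0, 0.0, 0.0], n_meioses=2, theta=0.1))
```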

21
Elston and Idury Algorithm
  • Requires $k \log_2 k$ operations
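One way to reach this operation count is to use the fact that, without interference, each meiosis recombines independently, so the transition step splits into one 2×2 update per meiosis. The sketch below is written iteratively under that assumption and is not claimed to follow the published algorithm line for line. A dense version is included only as a check.

```python
def fast_move_along(vector_a, theta: float):
    """Apply the transition one meiosis (bit) at a time.

    Each of the log2(k) passes touches every entry once, so the total
    work is on the order of k * log2(k). len(vector_a) must be a power of 2.
    """
    out = list(vector_a)
    k = len(out)
    bit = 1
    while bit < k:
        for i in range(k):
            if i & bit == 0:
                j = i | bit                      # same pattern, this meiosis flipped
                x, y = out[i], out[j]
                out[i] = (1 - theta) * x + theta * y
                out[j] = theta * x + (1 - theta) * y
        bit <<= 1
    return out

def dense_move_along(vector_a, theta: float):
    """Reference k * k version, used here only for comparison."""
    k = len(vector_a)
    n_meioses = k.bit_length() - 1
    return [sum(vector_a[a]
                * (1 - theta) ** (n_meioses - bin(a ^ b).count("1"))
                * theta ** bin(a ^ b).count("1")
                for a in range(k))
            for b in range(k)]

v = [0.2, 0.0, 0.1, 0.05, 0.3, 0.0, 0.15, 0.2]   # made-up vector, 3 meioses
print(fast_move_along(v, 0.1))
print(dense_move_along(v, 0.1))
```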

22
Moving Along Chromosome
23
Markov-Chains
  • Single Marker
  • Left Conditional
  • Right Conditional
  • Full Likelihood
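The four quantities listed here can be combined in a forward and backward sweep along the chromosome. The sketch below assumes the single-marker vectors are given, that left and right conditionals start from vectors of ones, and that the full vector at a marker is the entry-wise product of left conditional, single-marker and right conditional vectors. Overall scaling is ignored, and all names and numbers are illustrative.

```python
def move_along(vector, theta):
    """Recombination step between adjacent markers, one meiosis (bit) at a time."""
    out = list(vector)
    bit = 1
    while bit < len(out):
        for i in range(len(out)):
            if i & bit == 0:
                j = i | bit
                x, y = out[i], out[j]
                out[i] = (1 - theta) * x + theta * y
                out[j] = theta * x + (1 - theta) * y
        bit <<= 1
    return out

def multiply(u, v):
    """Entry-wise product of two vectors."""
    return [a * b for a, b in zip(u, v)]

def full_likelihoods(single, thetas):
    """single[j]: single-marker vector at marker j.
    thetas[j]: recombination fraction between markers j and j + 1."""
    m, k = len(single), len(single[0])
    left = [[1.0] * k for _ in range(m)]
    right = [[1.0] * k for _ in range(m)]
    for j in range(1, m):                          # sweep left to right
        left[j] = move_along(multiply(left[j - 1], single[j - 1]), thetas[j - 1])
    for j in range(m - 2, -1, -1):                 # sweep right to left
        right[j] = move_along(multiply(right[j + 1], single[j + 1]), thetas[j])
    return [multiply(multiply(left[j], single[j]), right[j]) for j in range(m)]

# Made-up single-marker vectors for 2 meioses (k = 4) at 3 markers
single = [[0.2, 0.0, 0.1, 0.1], [0.05, 0.05, 0.0, 0.3], [0.1, 0.1, 0.1, 0.1]]
full = full_likelihoods(single, thetas=[0.1, 0.2])
for vector in full:
    print(sum(vector))   # the same total likelihood at every marker
```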

24
MERLIN
  • Fast multipoint calculations
  • Non-parametric linkage analyses
  • Error detection
  • e.g., unlikely obligate recombinants
  • Haplotyping
  • most likely, exhaustive lists, sampling

25
Sparse Gene Flow Trees
26
Dense maps
  • Computational challenge
  • Require more memory
  • Require Lander-Green algorithm
  • Limited pedigree size
  • Computational advantages
  • Reduced recombination between markers
  • Approximate solutions possible if steps with many
    recombinants are ignored
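A sketch of the approximation in the last bullet: when moving between two closely spaced markers, only transitions with at most a chosen number of recombinants are visited. Gene-flow patterns are assumed to be integers with one bit per meiosis, and the cut-off parameter name is invented here.

```python
from itertools import combinations

def approx_move_along(vector_a, n_meioses: int, theta: float, max_recombinants: int):
    """Transition step that ignores moves with more than max_recombinants."""
    k = len(vector_a)
    out = [0.0] * k
    for r in range(max_recombinants + 1):
        weight = (1 - theta) ** (n_meioses - r) * theta ** r
        for flipped in combinations(range(n_meioses), r):
            mask = sum(1 << bit for bit in flipped)   # meioses that recombine
            for a in range(k):
                out[a ^ mask] += vector_a[a] * weight
    return out

start = [1.0, 0.0, 0.0, 0.0]
print(approx_move_along(start, n_meioses=2, theta=0.01, max_recombinants=1))
print(approx_move_along(start, n_meioses=2, theta=0.01, max_recombinants=2))
```

With the cut-off equal to the number of meioses, the result is exact. With a smaller cut-off, the dropped terms carry a factor of theta raised to a higher power, so they are small when markers are close together.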

27
MERLIN Example Pedigrees
28
MERLIN Timings
29
MERLIN Memory Usage
30
Command Line Options
31
Effect of Genotyping Error
  • Modest levels are likely
  • Up to 1% may be typical
  • Mendelian inheritance checks
  • Detect up to 30% of errors for SNPs
  • Effect on power
  • Linkage vs. Association
  • SNPs vs. Microsatellites

32
Affected Sib Pair Sample
33
Unselected Sample
34
Association Analysis
35
Error Detection
  • Genotype errors can introduce unlikely
    recombinants
  • Change likelihood
  • Replace $(1-\theta)$ with $\theta$
  • Test sensitivity of likelihood to each genotype
  • Detects errors that have largest effect on linkage

36
Practical Exercise
  • Lon Cardon
  • Stacey Cherny

37
Tracing Chromosomes
  • Get Picture of Monogenic Pedigree
  • Comment how
  • Individuals sharing some variant are similar
  • Have disease
  • Effects might be more subtle
  • Mention challenge reconstructing chromosomes

38
Error Detection
  • Using all the data
  • $L_{\text{full}}$ using known recombination fractions
  • $U_{\text{full}}$ assuming unlinked markers
  • Excluding a single genotype
  • $L_{\text{no geno}}$ using known recombination fractions
  • $U_{\text{no geno}}$ assuming unlinked markers
  • Calculate $R = (L_{\text{no geno}} \, U_{\text{full}}) / (L_{\text{full}} \, U_{\text{no geno}})$
  • Large values of R imply recombinants
  • not supported by other data
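The ratio on this slide is a one-line calculation once the four likelihoods are available. The sketch below assumes they have already been computed elsewhere. The function name and the numbers are invented for illustration.

```python
def error_ratio(l_full: float, u_full: float, l_no_geno: float, u_no_geno: float) -> float:
    """R = (L_no_geno * U_full) / (L_full * U_no_geno).

    L values use the known recombination fractions; U values treat the
    markers as unlinked. "no_geno" means one genotype has been excluded.
    """
    return (l_no_geno * u_full) / (l_full * u_no_geno)

# Made-up likelihoods: dropping the genotype helps the linked model far more
# than the unlinked one, so the genotype was forcing unlikely recombinants.
print(error_ratio(l_full=1e-6, u_full=1e-5, l_no_geno=5e-5, u_no_geno=2e-5))
```

Dividing by the unlinked likelihoods removes the part of the change that comes only from having one genotype fewer, leaving the part that depends on linkage between markers.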