Association Analysis - PowerPoint PPT Presentation

Slides: 28
Provided by: LonCa3
1
Association Analysis
  • Spotted history
  • Many real and presumed false positives
  • Very difficult to know which results are real

3
Why so few successes in human complex trait
genetics?
  • Obvious explanations
  • Polygenic systems too complicated
  • GxE interaction
  • epistasis
  • too many genes of small effect
  • heterogeneity
  • Phenotypes poorly defined/unreliable, low
    validity
  • Too few markers available
  • Sample sizes (effect sizes) too small
  • Multiple testing problem unresolved

4
Genotyping Error
  • Genotyping accuracy is one of the most critical
    components of any mapping study
  • Small amounts of error can cause real findings to
    be missed or lead to false claims of real effects
  • Once genotyping completed, several main ways to
    detect errors
  • 1) Look at departures from Hardy-Weinberg
    Equilibrium (HWE)
  • 2) Look for sample mixups, incorrect
    relationships
  • 3) Identify Mendelian inconsistencies in
    families
  • (also can detect excess recombinants)
  • Note that (1) is at marker level (good SNP,
    bad SNP), (2) is at sample level while (3) is
    at level of individual genotype
  • None of these guaranteed to detect majority of
    errors
  • Best solution is to emphasise accuracy before
    analysis starts

5
Genotyping Error: Hardy-Weinberg Equilibrium
For a SNP with two alleles, A1 and A2, with
frequencies p = f(A1) and q = f(A2): if there is
no selection, excess mutation or nonrandom mating,
the genotype frequencies will be
  • Genotype A1A1: p^2
  • Genotype A1A2 or A2A1: 2pq
  • Genotype A2A2: q^2
Genotyping error perturbs these ratios
  • errors often have directional bias (e.g.,
    under-represent heterozygotes)
  • can have dramatic results: exaggerate false
    positives (esp. in homozygosity mapping), lose
    statistical power (esp. acute in complex traits)
The program pedstats tests for HWE deviations
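The HWE check described on this slide can be sketched as a 1-df chi-square test on genotype counts. This is a minimal illustration, not how pedstats implements its test; the function name `hwe_chi_square` and the genotype counts are hypothetical, and the sketch assumes a polymorphic marker (both alleles observed) with counts large enough for the chi-square approximation.

```python
import math

def hwe_chi_square(n_11, n_12, n_22):
    """Chi-square test (1 df) of Hardy-Weinberg proportions from genotype counts."""
    n = n_11 + n_12 + n_22
    p = (2 * n_11 + n_12) / (2 * n)   # estimated frequency of allele A1
    q = 1.0 - p                       # estimated frequency of allele A2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_11, n_12, n_22)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical counts exactly in HWE proportions for p = 0.5
print(hwe_chi_square(250, 500, 250))
# Same sample with 50 heterozygotes miscalled as A1A1 (hypothetical error model)
print(hwe_chi_square(300, 450, 250))
```

In the second call the under-representation of heterozygotes is enough to produce a clear departure from HWE, which is the directional bias the slide warns about.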
6
Are Pedigree Errors Still an Issue?
Excerpt from Am J Hum Genet, 2000
7
Pedigree Errors
  • Type I error increases come from, e.g.
  • MZ twins (who share 2 alleles IBD at all loci)
    coded as full-sibs
  • Full-siblings coded as half-sibs (expect ¼
    sharing, observe ½)
  • Any close relative coded as more distant
  • Power reduction comes from
  • Half-siblings coded as full-sibs
  • Any distant relative coded as more related than
    they are

How many studies have unknowingly suffered (Type
I or power loss) because of this?
8
How can this be fixed?
  • Different relative pairs are characterized by
    different patterns of allele sharing
  • Full sibs share more alleles on average (IBS)
    than half-sibs
  • Parent-offspring pairs share the same number of
    alleles on average as sib pairs, but with less
    variability (they always share one allele)
  • Unrelated pairs share less than relatives
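The sharing patterns listed above can be made concrete with the standard per-locus IBD probabilities for non-inbred pairs. The sketch below works with IBD rather than IBS as a simplifying assumption: observed IBS sharing is at least as high as IBD sharing, because alleles can also match by chance, so the ordering of relationships is what carries over. The names `IBD_PROBS` and `sharing_moments` are illustrative.

```python
# Probabilities of sharing 0, 1 or 2 alleles IBD at an autosomal locus
# (standard values for non-inbred pairs).
IBD_PROBS = {
    "MZ twins":         (0.0, 0.0, 1.0),
    "parent-offspring": (0.0, 1.0, 0.0),
    "full sibs":        (0.25, 0.5, 0.25),
    "half sibs":        (0.5, 0.5, 0.0),
    "unrelated":        (1.0, 0.0, 0.0),
}

def sharing_moments(probs):
    """Mean and variance of the number of alleles shared IBD at one locus."""
    mean = sum(k * pk for k, pk in enumerate(probs))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(probs))
    return mean, var

for relationship, probs in IBD_PROBS.items():
    mean, var = sharing_moments(probs)
    print(f"{relationship:17s} mean={mean:.2f} variance={var:.2f}")
```

Parent-offspring pairs and full sibs both have mean 1, but only the sibs have non-zero variance, which is why mean and variance together separate the two relationships.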

9
Identity by State
  • AA x AA, Aa x Aa, aa x aa: 2 alleles shared IBS
  • AA x Aa, Aa x aa: 1 allele shared IBS
  • AA x aa: 0 alleles shared IBS

With genome scan of G markers, can easily compute
mean and variance of genome-wide ibs sharing for
any pair of individuals i,j (the individuals need
not be in the same pedigree)
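A minimal sketch of the computation described above, assuming each genotype is stored as an unordered pair of allele labels and missing genotypes as `None`. The function names and the four-marker example are hypothetical; this is not the GRR implementation.

```python
from statistics import mean, pvariance

def ibs_count(g1, g2):
    """Number of alleles (0, 1 or 2) shared identical by state by two genotypes."""
    a, b = g1
    c, d = g2
    # Take the better of the two possible pairings of alleles
    return max((a == c) + (b == d), (a == d) + (b == c))

def ibs_summary(genos_i, genos_j):
    """Mean and variance of IBS sharing over markers typed in both individuals."""
    counts = [ibs_count(g1, g2)
              for g1, g2 in zip(genos_i, genos_j)
              if g1 is not None and g2 is not None]
    return mean(counts), pvariance(counts)

# Hypothetical genotypes at four markers for individuals i and j
ind_i = [("A", "A"), ("A", "a"), ("a", "a"), ("A", "A")]
ind_j = [("A", "A"), ("a", "A"), ("A", "a"), ("a", "a")]
print(ibs_summary(ind_i, ind_j))
```

Plotting the mean against the variance for every pair in a sample is what lets clusters of relationship types, and misplaced pairs, stand out.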
10
Pedigree errors amongst close relatives are easy
to detect in genome scans - data published in
last 2 years -
GRR (Abecasis et al., 2001); for other methods see
McPeek & Sun (2000), Epstein et al. (2000)
11
Mendelian Inheritance Errors
  • Modest levels are likely
  • Up to 1% may be typical
  • Mendelian inheritance checks
  • Can detect up to 30% of errors for SNPs
  • (Gordon, Heath, Ott, Hum Hered, 1998)
  • Large effect on power, accuracy
  • Linkage vs. Association
  • SNPs vs. Microsatellites
  • Pairwise LD
  • Haplotype estimation

(Abecasis et al., EJHG 2001; Akey et al., AJHG
2001; Kirk & Cardon, EJHG 2002)
12
Mendelian Error Detection
[Figure: pedigree diagram with genotype labels 11, 12, 12, 22]
13
Nuclear families individually consistent with
Mendelian inheritance
14
Consistent only if missing offspring has 22
genotype
Consistent only if missing parent has 12 genotype
Error detection by direct observation can miss
errors
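A small sketch of a Mendelian consistency check for a fully genotyped trio, assuming genotypes are pairs of allele labels such as `(1, 2)`. The function name `trio_consistent` is illustrative and does not correspond to a pedstats or merlin routine.

```python
from itertools import product

def trio_consistent(father, mother, child):
    """True if the child's genotype can be formed from one allele of each parent."""
    child_sorted = tuple(sorted(child))
    return any(tuple(sorted((pa, ma))) == child_sorted
               for pa, ma in product(father, mother))

print(trio_consistent((1, 1), (2, 2), (1, 2)))  # True: the only possible offspring
print(trio_consistent((1, 1), (2, 2), (1, 1)))  # False: a detectable error
print(trio_consistent((1, 2), (1, 2), (1, 1)))  # True: consistent by inspection
```

The third call shows why inspection misses errors: if the child's true genotype were 12 and it was miscalled as 11, the trio would still look consistent with Mendelian inheritance.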
15
Genotyping Error: Affected Sib Pair Sample
[Figure: linkage results with no error and with 0.5%, 1%, 2% and 5% error; λs = 1.5; LODs calculated using the Kong & Cox (signed) procedure]
16
Genotyping Error: Quantitative Trait Linkage
Analysis
[Figure: linkage results with 0.5%, 1%, 2%, 5% and 10% error; dense SNP map (1 SNP/2 cM)]
17
Association Analysis
Allele frequency differences
18
Genotype Error
  • Small error rates can have dramatic consequences
  • Effects depend on study design
  • ASPs lose power; DSPs inflate Type I error;
    common-allele association not greatly influenced,
    rare alleles worse
  • Crucial issue is detection
  • not essential that errors are resolved, just
    detected (LRC 2003: this may turn out to be
    wrong!)
  • What levels can be tolerated in pharmacogenetics,
    pooling or large-scale association studies?
  • Detection without families hard problem

Is genotype error partly responsible for
marginal linkage outcomes and/or unreplicable
associations?
19
Genotyping Error Effects on Haplotype Estimation
  • Estimating haplotypes important for LD,
    association studies
  • Several different methods available to estimate
    haplotypes
  • Families (segregation)
  • Molecular (haploid cell lines)
  • Unrelated individuals (if high LD)
  • What effect does genotyping error have on
    haplotype estimation?

Kirk & Cardon, Euro J Hum Genet 2002
20
[Figure panels: Unrelateds, Trios, 4-sibs]
21
Given methodological differences in haplotype
accuracy, what is influence of error on each
design?
22
Genotyping Error and Haplotype Estimation
  • At modest levels, genotyping error not great
    concern for family designs
  • Haplotype estimation in unrelateds is
    surprisingly robust when LD is high
  • But when LD low or many common alleles, serious
    consequences
  • Problem: generally don't know LD in advance, so
    can't predict outcome
  • Trios inefficient design
  • Perform slightly better than unrelateds, but too
    little power to detect many errors
  • With regard to error, trios least desirable
    approach
  • Conditional on baseline differences in haplotype
    estimation, individual haplotype estimation
    influenced about same in all designs
  • Genotyping error serious problem for linkage,
    association studies, but less so for estimation
    of haplotypes themselves

23
Simulation Study
  • Genome of 22 autosomes, each of 100 cM (a lie)
  • 10 markers/chromosome
  • 5 equifrequent alleles/marker
  • 252 unselected sib pairs
  • > 1 QTL somewhere in the genome
  • background h2 moderate (30%)
24
How many QTLs? Where are they?
25
Simulation Study Exercise
  • FILES F\lon\2003\scan?.ped, scan?.dat,
    scan.map
  • 1) Run pedstats to view HWE tests
  • pedstats -p scan1.ped -d scan1.dat --ignore
    --hardy | more
  • 2) Find the sample mixups using GRR. How many
    mixups are there? What family(ies) are involved?
  • 3) Check for Mendelian errors using pedstats or
    merlin. Are there any? What would you do about
    this?
  • pedstats -p scan1.ped -d scan1.dat | more
  • merlin -p scan1.ped -d scan1.dat -m scan.map
    | more
  • What differences do you see between the programs?
    Can you predict the impact on the results?

27
Clean Data
Mixed-up Data
28
Clean Data
Genotype-error Data