Association Analysis - PowerPoint PPT Presentation

About This Presentation

Title:

Association Analysis

Slides: 28

Provided by: LonCa3

Learn more at: http://ibgwww.colorado.edu

Transcript and Presenter's Notes

Title: Association Analysis

1
Association Analysis

Spotted history
Many real and presumed false positives
Very difficult to know which results are real

2
(No Transcript)
3
Why so few successes in human complex trait
genetics?

Obvious explanations
Polygenic systems too complicated
GxE interaction
epistasis
too many genes; genes of small effect
heterogeneity
Phenotypes poorly defined/unreliable; low validity
Too few markers available
Sample sizes (effect sizes) too small
Multiple testing problem unresolved

4
Genotyping Error

Genotyping accuracy is one of the most critical components of any mapping study
Small amounts of error cause real findings to be missed or lead to false claims of real effects
Once genotyping completed, several main ways to
detect errors
1) Look at departures from Hardy-Weinberg
Equilibrium (HWE)
2) Look for sample mixups, incorrect
relationships
3) Identify Mendelian inconsistencies in
families
(also can detect excess recombinants)
Note that (1) is at marker level (good SNP,
bad SNP), (2) is at sample level while (3) is
at level of individual genotype
None of these guaranteed to detect majority of
errors
Best solution is to emphasise accuracy before
analysis starts

5
Genotyping Error: Hardy-Weinberg Equilibrium
For a SNP with two alleles, A1 and A2, with frequencies p = f(A1) and q = f(A2): if there is no selection, excess mutation or nonrandom mating, the genotype frequencies will be
Genotype A1A1: p^2
Genotype A1A2, A2A1: 2pq
Genotype A2A2: q^2
Genotyping error perturbs these ratios
- errors often have directional bias (e.g., under-represent heterozygotes)
- can have dramatic results: exaggerate false positives (esp. in homozygosity mapping), lose statistical power (esp. acute in complex traits)
The program pedstats tests for HWE deviations
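The HWE check described on this slide can be sketched as a simple 1-df chi-square goodness-of-fit test. This is only an illustration of the idea, not the test pedstats actually implements; the function name `hwe_chi_square` and the example counts are made up for this sketch.

```python
import math

def hwe_chi_square(n_11, n_12, n_22):
    """Return (chi2, p_value) for a 1-df goodness-of-fit test of HWE.

    n_11, n_12, n_22 are observed counts of genotypes A1A1, A1A2, A2A2.
    """
    n = n_11 + n_12 + n_22
    p = (2 * n_11 + n_12) / (2 * n)   # estimated frequency of allele A1
    q = 1.0 - p                       # estimated frequency of allele A2
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_11, n_12, n_22)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df: erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value

# Hypothetical marker with a heterozygote deficit (under-called A1A2 genotypes)
chi2, p_value = hwe_chi_square(40, 20, 40)
print(f"chi2 = {chi2:.2f}, p = {p_value:.2e}")
```

A marker with too few heterozygotes, as in the made-up counts above, gives a large chi-square and would be flagged as a possible "bad SNP".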
6
Are Pedigree Errors Still an Issue?
Excerpt from Am J Hum Genet, 2000
7
Pedigree Errors

Type I error increases come from, e.g.
MZ twins coded as full-sibs, who share 2 alleles
IBD at all loci
Full-siblings coded as half-sibs (expect ¼
sharing, observe ½)
Any close relative coded as more distant
Power reduction comes from
Half-siblings coded as full-sibs
Any distant relative coded as more related than
they are

How many studies have unknowingly suffered (Type
I or power loss) because of this?
8
How can this be fixed?

Different relative pairs are characterized by
different patterns of allele sharing
Full sibs share more alleles on average (ibs) than half-sibs
Parent-offspring pairs share the same number of
alleles on average as sib pairs, but with less
variability (they always share one allele)
Unrelated pairs share less than relatives
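The sharing patterns above can be made concrete with the standard identity-by-descent (IBD) probabilities for each relative pair. The sketch below uses IBD rather than IBS as a simplifying assumption (IBS sharing also depends on allele frequencies); the names `IBD_PROBS` and `ibd_mean_variance` are illustrative only.

```python
# (k0, k1, k2): probabilities of sharing 0, 1 or 2 alleles IBD at a locus
IBD_PROBS = {
    "MZ twins":         (0.0, 0.0, 1.0),
    "full sibs":        (0.25, 0.5, 0.25),
    "parent-offspring": (0.0, 1.0, 0.0),
    "half-sibs":        (0.5, 0.5, 0.0),
    "unrelated":        (1.0, 0.0, 0.0),
}

def ibd_mean_variance(k0, k1, k2):
    """Mean and variance of the number of alleles shared IBD at one locus."""
    mean = k1 + 2 * k2
    variance = k1 + 4 * k2 - mean ** 2
    return mean, variance

for name, probs in IBD_PROBS.items():
    m, v = ibd_mean_variance(*probs)
    print(f"{name:16s} mean = {m:.2f}  variance = {v:.2f}")
```

Full sibs and parent-offspring pairs both have a mean of 1 allele shared, but the parent-offspring variance is 0 while the full-sib variance is 0.5; half-sibs have a mean of 0.5.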

9
Identity by State

AA x AA, Aa x Aa, aa x aa: 2 alleles shared ibs
AA x Aa, Aa x aa: 1 allele shared ibs
AA x aa: 0 alleles shared ibs
With genome scan of G markers, can easily compute
mean and variance of genome-wide ibs sharing for
any pair of individuals i,j (the individuals need
not be in the same pedigree)
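A minimal sketch of the genome-wide ibs summary for one pair follows. It assumes genotypes are stored as unordered allele pairs with `None` for missing data, and it uses the population variance; both are choices made for this illustration, not a description of how GRR stores or summarises data.

```python
from statistics import mean, pvariance

def ibs_shared(g1, g2):
    """Count alleles shared IBS (0, 1 or 2) between two unordered genotypes."""
    remaining = list(g2)
    shared = 0
    for allele in g1:
        if allele in remaining:
            remaining.remove(allele)   # each allele can be matched only once
            shared += 1
    return shared

def ibs_summary(person_i, person_j):
    """Mean and variance of IBS sharing across markers typed in both people."""
    scores = [
        ibs_shared(a, b)
        for a, b in zip(person_i, person_j)
        if a is not None and b is not None
    ]
    return mean(scores), pvariance(scores)

# Hypothetical genotypes at four markers for two individuals
person_i = [("A", "A"), ("A", "a"), ("A", "A"), ("A", "a")]
person_j = [("A", "A"), ("a", "a"), ("a", "a"), ("A", "a")]
print(ibs_summary(person_i, person_j))
```

Plotting the mean against the variance (or standard deviation) for every pair separates the relationship classes into clusters, so a mislabelled pair shows up in the wrong cluster.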
10
Pedigree errors amongst close relatives are easy
to detect in genome scans - data published in
last 2 years -
GRR (Abecasis et al., 2001); for other methods see
McPeek & Sun (2000), Epstein et al. (2000)
11
Mendelian Inheritance Errors

Modest levels are likely
Up to 1% may be typical
Mendelian inheritance checks
Can detect up to 30% of errors for SNPs
(Gordon, Heath, Ott, Hum Hered, 1998)
Large effect on power, accuracy
Linkage vs. Association
SNPs vs. Microsatellites
Pairwise LD
Haplotype estimation

(Abecasis et al., EJHG 2001; Akey et al., AJHG 2001; Kirk & Cardon, EJHG 2002)
12
Mendelian Error Detection
[Figure: pedigree with genotypes 11, 12, 12 and 22]
13
Nuclear families individually consistent with
Mendelian inheritance
14
Consistent only if missing offspring has 22
genotype
Consistent only if missing parent has 12 genotype
Error detection by direct observation can miss
errors
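The consistency checks on these slides can be sketched for a single biallelic marker. The helper names below are hypothetical, and the code only illustrates why direct observation misses errors; it is not how pedstats or merlin perform the check.

```python
from itertools import product

def trio_consistent(father, mother, child):
    """True if the child can be formed from one allele of each parent."""
    target = sorted(child)
    return any(sorted((f, m)) == target for f, m in product(father, mother))

def consistent_missing_parent(known_parent, children, alleles=(1, 2)):
    """Genotypes of an untyped parent that are compatible with every child."""
    candidates = [(a, b) for a in alleles for b in alleles if a <= b]
    return [
        g for g in candidates
        if all(trio_consistent(known_parent, g, child) for child in children)
    ]

# Typed parent 11 with children 11 and 12: the missing parent must be 12
print(consistent_missing_parent((1, 1), [(1, 1), (1, 2)]))

# Two 12 parents: a true 11 child miscalled as 12 is still consistent,
# so this genotyping error cannot be detected by a Mendelian check
print(trio_consistent((1, 2), (1, 2), (1, 2)))
```

With two alleles and heterozygous parents, every child genotype is possible, which is one reason Mendelian checks catch only a fraction of SNP errors.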
15
Genotyping Error: Affected Sib Pair Sample
[Figure legend: no error, 0.5% error, 1% error, 2% error, 5% error]
λs = 1.5; LODs calculated using Kong & Cox (signed) procedure
16
Genotyping Error: Quantitative Trait Linkage Analysis
[Figure legend: 0.5% error, 1% error, 2% error, 5% error, 10% error]
Dense SNP map (1 SNP/2 cM)
17
Association Analysis
Allele frequency differences
18
Genotype Error

Small error rates can have dramatic consequences
Effects depend on study design
ASPs lose power; DSPs inflate Type I error
Common allele association: not great influence; rare allele: worse
Crucial issue is detection
not essential that errors are resolved, just detected (LRC, 2003: this may turn out to be wrong!)
What levels can be tolerated in pharmacogenetics,
pooling or large-scale association studies?
Detection without families hard problem

Is genotype error partly responsible for
marginal linkage outcomes and/or unreplicable
associations?
19
Genotyping Error Effects on Haplotype Estimation

Estimating haplotypes important for LD,
association studies
Several different methods available to estimate
haplotypes
Families (segregation)
Molecular (haploid cell lines)
Unrelated individuals (if high LD)
What effect does genotyping error have on
haplotype estimation?

Kirk & Cardon, Eur J Hum Genet 2002
20
[Figure panels: Unrelateds, Trios, 4-sibs]
21
Given methodological differences in haplotype
accuracy, what is influence of error on each
design?
22
Genotyping Error and Haplotype Estimation

At modest levels, genotyping error not great
concern for family designs
Haplotype estimation in unrelateds is
surprisingly robust when LD is high
But when LD low or many common alleles, serious
consequences
Problem: generally don't know LD in advance, so can't predict outcome
Trios inefficient design
Perform slightly better than unrelateds, but too
little power to detect many errors
With regard to error, trios least desirable
approach
Conditional on baseline differences in haplotype
estimation, individual haplotype estimation
influenced about same in all designs
Genotyping error serious problem for linkage,
association studies, but less so for estimation
of haplotypes themselves

23
Simulation Study
Genome of 22 autosomes, each of 100 cM (a lie)
10 markers/chromosome
5 equifrequent alleles/marker
252 unselected sib pairs
> 1 QTL somewhere in the genome
Background h2 moderate (30%)
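The marker layout of this design can be sketched as below. This is a heavily simplified illustration, not the code used to generate the exercise data: it assumes markers are simulated independently (no recombination map), includes no QTL or phenotype, and uses an arbitrary random seed; all names are hypothetical.

```python
import random

N_CHROMOSOMES = 22
MARKERS_PER_CHROMOSOME = 10
ALLELES = (1, 2, 3, 4, 5)      # 5 equifrequent alleles per marker
N_SIB_PAIRS = 252

def random_genotype(rng):
    """Draw a founder genotype with equifrequent alleles."""
    return (rng.choice(ALLELES), rng.choice(ALLELES))

def transmit(father, mother, rng):
    """One child genotype: one random allele from each parent."""
    return (rng.choice(father), rng.choice(mother))

def simulate_sib_pair(rng):
    """Genotypes of two sibs at every marker (markers treated as unlinked)."""
    n_markers = N_CHROMOSOMES * MARKERS_PER_CHROMOSOME
    sib1, sib2 = [], []
    for _ in range(n_markers):
        father, mother = random_genotype(rng), random_genotype(rng)
        sib1.append(transmit(father, mother, rng))
        sib2.append(transmit(father, mother, rng))
    return sib1, sib2

rng = random.Random(2003)      # arbitrary seed, for reproducibility only
sample = [simulate_sib_pair(rng) for _ in range(N_SIB_PAIRS)]
print(len(sample), "sib pairs,", len(sample[0][0]), "markers each")
```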
24
How many QTLs? Where are they?
25
Simulation Study Exercise

FILES: F:\lon\2003\scan?.ped, scan?.dat, scan.map
1) Run pedstats to view HWE tests
pedstats -p scan1.ped -d scan1.dat --ignore --hardy | more
2) Find the sample mixups using GRR. How many mixups are there? What family(ies) are involved?
3) Check for Mendelian errors using pedstats or merlin. Are there any? What would you do about this?
pedstats -p scan1.ped -d scan1.dat | more
merlin -p scan1.ped -d scan1.dat -m scan.map | more
What differences do you see between the programs?
Can you predict the impact on the results?