Title: Basic Genetic Epidemiology Course (Corso di Epidemiologia Genetica di Base), 11-13 October 2004, Istituto Superiore di Sanità, Rome. Studies of
1Statistical genetic methods for disease gene
identification
modified from D. Altschuler
2Association studies
- Association between risk factor and disease: the risk factor is significantly more frequent among affected than among unaffected individuals
- In genetic epidemiology
- Risk factors: alleles/genotypes/haplotypes
3Association studies
- Candidate genes (functional or positional)
- Fine mapping in linkage regions
- Genome wide screen
4Candidate gene analysis
- Direct analysis
- Association studies between disease and
functional SNPs (causative of disease) of
candidate gene
5Candidate gene analysis
- Indirect analysis
- Association studies between disease and random SNPs within or near candidate gene
- Linkage Disequilibrium mapping
6Association studies
- POPULATION-BASED
- Case-control studies
- FAMILY-BASED
- Cases and related controls
- Nuclear or extended pedigrees
7Case-control studies: $\chi^2$ test
Risk factor contingency table
Test of independence: $\chi^2 = \sum (O - E)^2 / E$ with 1 df
8Case-control studies: $\chi^2$ test
2x3 contingency table (genotypes)
            AA      Aa      aa      Total
Cases       n_AA    n_Aa    n_aa    N
Controls    m_AA    m_Aa    m_aa    M
Total       t_AA    t_Aa    t_aa    N+M
Test of independence: $\chi^2 = \sum (O - E)^2 / E$ with 2 df
9Case-control studies: $\chi^2$ test
2x2 contingency table (alleles)
            A       a       Total
Cases       n_A     n_a     2N
Controls    m_A     m_a     2M
Total       t_A     t_a     2(N+M)
Test of independence: $\chi^2 = \sum (O - E)^2 / E$ with 1 df
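The test of independence on the genotype and allele tables above can be sketched in a few lines of Python. This is a minimal illustration, not a validated analysis tool: the function names `chi_square` and `allele_counts` and all counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def chi_square(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of row and column
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


def allele_counts(genotype_row):
    """Convert (n_AA, n_Aa, n_aa) genotype counts into (n_A, n_a) allele counts."""
    n_AA, n_Aa, n_aa = genotype_row
    return [2 * n_AA + n_Aa, 2 * n_aa + n_Aa]


# Hypothetical genotype counts (AA, Aa, aa), invented for illustration only
cases = [30, 50, 20]
controls = [20, 50, 30]

genotype_stat, genotype_df = chi_square([cases, controls])
allele_stat, allele_df = chi_square([allele_counts(cases), allele_counts(controls)])
print("genotype test:", genotype_stat, "with", genotype_df, "df")
print("allele test:", allele_stat, "with", allele_df, "df")
```

The same function applies to a 2xr haplotype table, where it returns r-1 degrees of freedom.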
10Hardy-Weinberg Equilibrium
- Biallelic locus: alleles A, a; genotypes AA, Aa, aa
- Allele frequencies
- A: $P(A) = p$
- a: $P(a) = q$
- Genotype frequencies are in HWE if
- AA: $P(AA) = p^2$
- Aa: $P(Aa) = 2pq$
- aa: $P(aa) = q^2$
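As a worked illustration of the HWE proportions, the sketch below estimates p from observed genotype counts, derives the expected counts, and compares them with a chi-square statistic. The function names and counts are hypothetical. The statistic is usually referred to a chi-square distribution with 1 df (three genotype classes, minus one, minus one estimated allele frequency); that convention is an addition here, not something stated on the slide.

```python
def hwe_expected(n_AA, n_Aa, n_aa):
    """Expected genotype counts (AA, Aa, aa) under Hardy-Weinberg equilibrium."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p                        # frequency of allele a
    return [n * p * p, 2 * n * p * q, n * q * q]


def hwe_chi_square(n_AA, n_Aa, n_aa):
    """Chi-square statistic comparing observed genotype counts with HWE expectations."""
    observed = [n_AA, n_Aa, n_aa]
    expected = hwe_expected(n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts: the first sample matches HWE, the second lacks heterozygotes
in_equilibrium = hwe_chi_square(25, 50, 25)
het_deficit = hwe_chi_square(40, 20, 40)
print(in_equilibrium, het_deficit)
```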
11Haplotypes
[Figure: unphased genotypes at Locus 1, Locus 2, ..., Locus N; identification of phase resolves them into haplotypes]
12Motivation for haplotype-based analysis
- Increased ability to identify regions that are shared identical by descent among affected individuals, and therefore more informative
- Haplotype may be the causative composite allele rather than a particular nucleotide at a particular SNP
- Haplotype analysis is meaningful only if SNPs are in linkage disequilibrium
13Haplotype determination options
- Collect and genotype family members
- Laboratory-based techniques
- Statistical estimation in unphased individuals
- Likelihood-based E-M algorithms
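To make the E-M option concrete, here is a minimal sketch for the simplest case: two biallelic SNPs in unrelated, unphased individuals. It is an illustrative assumption about how such an algorithm can be written, not the implementation of any particular program. Genotypes are coded as the number of copies (0, 1, 2) of allele A at SNP 1 and allele B at SNP 2, and only double heterozygotes are phase-ambiguous (AB/ab or Ab/aB). The function name, the coding scheme and the data are hypothetical.

```python
from collections import Counter


def em_haplotype_frequencies(genotypes, n_iter=50):
    """Estimate frequencies of haplotypes AB, Ab, aB, ab from unphased genotypes.

    genotypes: list of (g1, g2), copies of allele A at SNP 1 and allele B at SNP 2.
    """
    counts = Counter(genotypes)
    n_chrom = 2 * len(genotypes)

    # Haplotypes that can be counted directly (at most one heterozygous SNP)
    base = {"AB": 0.0, "Ab": 0.0, "aB": 0.0, "ab": 0.0}
    for (g1, g2), n in counts.items():
        if (g1, g2) == (1, 1):
            continue  # phase unknown, handled in the E-step
        first = ("A" if g1 >= 1 else "a") + ("B" if g2 >= 1 else "b")
        second = ("A" if g1 == 2 else "a") + ("B" if g2 == 2 else "b")
        base[first] += n
        base[second] += n

    n_double_het = counts[(1, 1)]
    freqs = {h: 0.25 for h in base}  # starting values
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries AB/ab rather than Ab/aB
        cis = freqs["AB"] * freqs["ab"]
        trans = freqs["Ab"] * freqs["aB"]
        share_cis = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: update frequencies from the expected haplotype counts
        expected = dict(base)
        expected["AB"] += n_double_het * share_cis
        expected["ab"] += n_double_het * share_cis
        expected["Ab"] += n_double_het * (1 - share_cis)
        expected["aB"] += n_double_het * (1 - share_cis)
        freqs = {h: c / n_chrom for h, c in expected.items()}
    return freqs


# Hypothetical sample: 4 AB/AB, 4 ab/ab and 2 double heterozygotes
sample = [(2, 2)] * 4 + [(0, 0)] * 4 + [(1, 1)] * 2
estimated = em_haplotype_frequencies(sample)
print(estimated)
```

A fixed number of iterations keeps the sketch short; a practical implementation would stop when the likelihood or the frequencies no longer change.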
14Case-control studies: $\chi^2$ test
2xr contingency table (haplotypes)
            1      2      3      ...    r      Total
Cases       n_1    n_2    n_3    ...    n_r    2N
Controls    m_1    m_2    m_3    ...    m_r    2M
Total       t_1    t_2    t_3    ...    t_r    2(N+M)
Test of independence: $\chi^2 = \sum (O - E)^2 / E$ with r-1 df
15Measures of association
- Relative risk (prospective studies)
- Odds ratio (retrospective studies)
16Measures of association
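Alleles A, a; genotypes AA, Aa, aa
- Genotype relative risk
- $\mathrm{GRR}_{AA} = (\text{risk for } AA) / (\text{risk for } aa)$
- $\mathrm{GRR}_{Aa} = (\text{risk for } Aa) / (\text{risk for } aa)$
- $\mathrm{GRR}_{aa} = 1$
- Allele relative risk
- $F_A = (\text{risk for } A) / (\text{risk for } a)$
- $F_a = 1$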
17Measures of association
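Genotype    Disease +    Disease -
A/A         n11          n12
A/a         n21          n22
a/a         n31          n32
$\mathrm{GRR}_{A/A} \approx \mathrm{OR}_{A/A} = n_{11} n_{32} / (n_{12} n_{31})$
$\mathrm{GRR}_{A/a} \approx \mathrm{OR}_{A/a} = n_{21} n_{32} / (n_{22} n_{31})$
Allele      Disease +    Disease -
A           n11          n12
a           n21          n22
$F_A \approx \mathrm{OR}_A = n_{11} n_{22} / (n_{12} n_{21})$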
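The cross-product odds ratios above can be computed directly from the count tables. The sketch below uses hypothetical counts and function names, and takes a/a (or allele a) as the reference category, as on the slide.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)


def genotype_odds_ratios(table):
    """Odds ratios for A/A and A/a relative to a/a.

    table rows: A/A, A/a, a/a; columns: (affected, unaffected).
    """
    (n11, n12), (n21, n22), (n31, n32) = table
    return {
        "A/A": odds_ratio(n11, n12, n31, n32),
        "A/a": odds_ratio(n21, n22, n31, n32),
    }


# Hypothetical counts, invented for illustration only
genotype_table = [(30, 20), (50, 50), (20, 30)]
genotype_or = genotype_odds_ratios(genotype_table)

# Allele table derived from the same individuals: each person carries two alleles
allele_A = (2 * 30 + 50, 2 * 20 + 50)  # (affected, unaffected) copies of A
allele_a = (2 * 20 + 50, 2 * 30 + 50)  # (affected, unaffected) copies of a
allele_or = odds_ratio(allele_A[0], allele_A[1], allele_a[0], allele_a[1])
print(genotype_or, allele_or)
```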
18Measures of association
- Genotypes
- Dominant/recessive/codominant
- e.g. $\mathrm{GRR}_{AA} = \mathrm{GRR}_{Aa}$: A dominant
- Alleles/haplotypes
- Multiplicative model: $\mathrm{GRR}_{ij} = F_i F_j$
- HWE
19Causes of genetic association
20Causes of genetic association
- Indirect association due to LD
[Diagram: marker locus -- LD -- functional variant -> disease]
21Linkage disequilibrium
- Non-random association between alleles at different loci
- Loci are in LD if alleles are present on haplotypes in different proportions than expected based on allele frequencies
22Linkage disequilibrium
- Locus 1: alleles A and a, frequencies $p_A$ and $p_a$
- Locus 2: alleles B and b, frequencies $p_B$ and $p_b$
Possible haplotypes: AB, Ab, aB, ab
$D = p_{AB} - p_A p_B \neq 0$
23Measures of LD
                Locus 2
                B               b
Locus 1   A     p_A p_B + D     p_A p_b - D     p_A
          a     p_a p_B - D     p_a p_b + D     p_a
                p_B             p_b
$D' = D / D_{max}$, with $D_{max} = \min(p_A p_b, p_a p_B)$ for $D > 0$
$r^2 = D^2 / (p_A p_a p_B p_b)$
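The three LD measures can be computed from one haplotype frequency and the two allele frequencies. This is a minimal sketch with hypothetical names and values. The slide gives D_max only for D > 0; the branch for D < 0 uses the standard companion definition min(p_A p_B, p_a p_b), which is an addition here. Both loci are assumed to be polymorphic, so the denominator of r^2 is not zero.

```python
def ld_measures(p_AB, p_A, p_B):
    """Return (D, D', r^2) for two biallelic loci.

    p_AB: frequency of haplotype AB; p_A, p_B: allele frequencies (both strictly between 0 and 1).
    """
    p_a, p_b = 1 - p_A, 1 - p_B
    D = p_AB - p_A * p_B
    if D >= 0:
        d_max = min(p_A * p_b, p_a * p_B)  # case given on the slide
    else:
        d_max = min(p_A * p_B, p_a * p_b)  # assumed standard definition for D < 0
    d_prime = D / d_max if d_max > 0 else 0.0
    r2 = D ** 2 / (p_A * p_a * p_B * p_b)
    return D, d_prime, r2


# Hypothetical frequencies, invented for illustration only
D, d_prime, r2 = ld_measures(p_AB=0.4, p_A=0.5, p_B=0.5)
print(D, d_prime, r2)
```

With these definitions D' for negative D comes out negative; its absolute value is what is usually reported.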
24Graphic representation of LD
[Figure: graphical displays of $r^2$ and $D'$ (GOLD)]
25Zondervan & Cardon, 2004
26Causes of genetic association
- Indirect association due to LD
- Spurious association due to confounding factors
(e.g., population stratification)
27Association due to population stratification
Marchini et al, 2004
28Population stratification
- Possible solutions
- Stratify sample based on confounding variable
- Apply test correction (Genomic Control)
- Use related controls
FAMILY-BASED association studies
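For the test-correction option (Genomic Control), one common formulation divides each 1-df chi-square statistic by an inflation factor estimated from markers assumed to be unlinked to the disease. The sketch below follows that formulation as an assumption; the slide does not give the formula. The function name and all statistics are hypothetical, and 0.4549 is the median of a chi-square distribution with 1 df.

```python
from statistics import median

CHI2_1DF_MEDIAN = 0.4549  # median of the chi-square distribution with 1 df


def genomic_control(test_stat, null_stats):
    """Return (corrected statistic, inflation factor lambda).

    test_stat: 1-df chi-square at the marker of interest.
    null_stats: 1-df chi-squares at markers assumed unlinked to the disease.
    """
    # Lambda below 1 is not used to inflate statistics, so it is floored at 1
    lam = max(1.0, median(null_stats) / CHI2_1DF_MEDIAN)
    return test_stat / lam, lam


# Hypothetical statistics, invented for illustration only
corrected, lam = genomic_control(8.0, [0.2, 0.5, 0.9, 1.4, 2.0])
print(corrected, lam)
```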
29Family-based association studies
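[Figure: parents with genotypes 1/2 and 3/4 and an affected child with genotype 1/4. Alleles 1 and 4 are transmitted; the non-transmitted alleles 2 and 3 form the control genotype 2/3.]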
30TDT Transmission Disequilibrium Test
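                   Non transmitted
                   G      g
Transmitted   G    a      b
              g    c      d
[Figure: pedigree with genotypes G/G, G/g and G/g, marking the transmitted alleles]
$\mathrm{TDT}_G = (T_G - NT_G)^2 / (T_G + NT_G) = (b - c)^2 / (b + c) \sim \chi^2_1$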
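The TDT statistic depends only on the two off-diagonal cells of the table above, that is, on transmissions from heterozygous parents. A minimal sketch with hypothetical counts follows; the p-value uses the identity that the upper tail of a chi-square with 1 df equals erfc(sqrt(x/2)).

```python
import math


def tdt(b, c):
    """Transmission Disequilibrium Test for allele G.

    b: heterozygous parents transmitting G and not g.
    c: heterozygous parents transmitting g and not G.
    Returns (chi-square statistic with 1 df, p-value).
    """
    stat = (b - c) ** 2 / (b + c)
    p_value = math.erfc(math.sqrt(stat / 2))  # upper tail of chi-square, 1 df
    return stat, p_value


# Hypothetical transmission counts, invented for illustration only
tdt_stat, tdt_p = tdt(b=60, c=40)
print(tdt_stat, tdt_p)
```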