Genetics for Epidemiologists Lecture 5: Analysis of Genetic Association Studies

Slides: 71
Provided by: jwito
Learn more at: https://www.genome.gov
1
Genetics for Epidemiologists
Lecture 5: Analysis of Genetic Association Studies
Teri A. Manolio, M.D., Ph.D.
Director, Office of Population Genomics, and Senior Advisor to the Director, NHGRI, for Population Genomics
National Human Genome Research Institute
National Institutes of Health
U.S. Department of Health and Human Services
2
Topics to be Covered
  • Discrete traits and quantitative traits
  • Measures of association
  • Detecting/correcting for false positives
  • Genotyping quality control
  • Quantile-quantile (Q-Q) plots
  • Odds ratios: allelic and genotypic
  • Models of genetic transmission
  • Interactions: gene-gene, gene-environment

3
Larson, G. The Complete Far Side. 2003.
7
Quantitative Genetics
"...concerned with the inheritance of those differences between individuals that are of degree rather than of kind"

Quantitative                              Qualitative
Continuous gradation among individuals    Sharply demarcated types with little
from one extreme to other                 connection by intermediates
Effects of genes are small                Effects of genes are large
Usually many genes                        Single genes inherited in Mendelian ratios

Falconer and Mackay, Quantitative Genetics 1996.
8
Inheritance Models in Single Gene Trait
13
Inheritance Models in Single Gene Trait
                   Genotype Group
Model              AA    Aa    aa
A is Dominant
A is Recessive
A is Co-Dominant
14
Inheritance Models in Quantitative Trait
19
Inheritance Models in Quantitative Trait
                           Population Mean
Model                      -x        0         x
A is Completely Dominant   aa                  AA, Aa
A is Partially Dominant    aa            Aa    AA
A is Not (Co-) Dominant    aa        Aa        AA
A is Over-Dominant         aa                  AA     Aa
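
The four models above can be read as different positions of the heterozygote mean relative to the two homozygote means. A minimal Python sketch, assuming the homozygotes sit at -x and x as on the slide and writing the heterozygote mean as d (the symbol d and both function names are illustrative, not taken from the lecture):

```python
def genotype_means(x, d):
    """Population means for each genotype: homozygotes at -x and x, heterozygote at d."""
    return {"aa": -x, "Aa": d, "AA": x}


def dominance_model(x, d):
    """Name the inheritance model implied by the heterozygote mean d.

    Assumes x > 0 and d >= 0, i.e. A is the allele that raises the trait.
    """
    if x <= 0 or d < 0:
        raise ValueError("this sketch assumes x > 0 and d >= 0")
    if d == 0:
        return "not dominant"        # heterozygote exactly midway
    if d < x:
        return "partially dominant"  # heterozygote between midpoint and AA
    if d == x:
        return "completely dominant" # heterozygote equals AA
    return "over-dominant"           # heterozygote beyond AA


for d in (0.0, 0.5, 1.0, 1.5):
    print(d, genotype_means(1.0, d), dominance_model(1.0, d))
```

With x = 1, the heterozygote means 0, 0.5, 1 and 1.5 map to the not dominant, partially dominant, completely dominant and over-dominant rows respectively.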
20
Quantitative Traits with Published GWA Studies
(16 - 34)
  • QT interval
  • Lipids and lipoproteins
  • Memory
  • Nicotine dependence
  • ORMDL3 expression
  • YKL-40 levels
  • Obesity, BMI, waist
  • Insulin resistance
  • Height
  • Bone mineral density
  • F-cell distribution
  • Fetal hemoglobin levels
  • C-Reactive protein
  • 18 groups of Framingham traits
  • Pigmentation
  • Uric Acid Levels
  • Recombination Rate

22
Association of Alleles and Genotypes of rs1333049 (3049) with Myocardial Infarction

            C, N (%)        G, N (%)        χ² (1 df)   P-value
Cases       2,132 (55.4)    1,716 (44.6)    55.1        1.2 x 10^-13
Controls    2,783 (47.4)    3,089 (52.6)
Allelic Odds Ratio: 1.38

            CC, N (%)     CG, N (%)       GG, N (%)     χ² (2 df)   P-value
Cases       586 (30.5)    960 (49.9)      378 (19.6)    59.7        1.1 x 10^-14
Controls    676 (23.0)    1,431 (48.7)    829 (28.2)
Heterozygote Odds Ratio: 1.47
Homozygote Odds Ratio: 1.90

Samani N et al, N Engl J Med 2007; 357:443-453.
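
The three odds ratios on this slide can be recomputed from the counts as simple cross-product ratios, with G (for alleles) and GG (for genotypes) as the reference category. A minimal Python sketch; the function name is illustrative, and the chi-squared statistics are deliberately not recomputed because the slide does not say which test produced them:

```python
def odds_ratio(case_exposed, case_reference, control_exposed, control_reference):
    """Cross-product odds ratio for a 2x2 table."""
    return (case_exposed * control_reference) / (case_reference * control_exposed)


# Allele counts from the slide: C is the risk allele, G the reference.
allelic_or = odds_ratio(2132, 1716, 2783, 3089)

# Genotype counts from the slide: GG is the reference genotype.
heterozygote_or = odds_ratio(960, 378, 1431, 829)  # CG versus GG
homozygote_or = odds_ratio(586, 378, 676, 829)     # CC versus GG

print(f"allelic OR      = {allelic_or:.2f}")
print(f"heterozygote OR = {heterozygote_or:.2f}")
print(f"homozygote OR   = {homozygote_or:.2f}")
```

Rounded to two decimals these reproduce the 1.38, 1.47 and 1.90 shown above.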
23
-Log10 P Values for SNP Associations with
Myocardial Infarction
Samani N et al, N Engl J Med 2007; 357:443-453.
24
Genome-Wide Scan for Type 2 Diabetes in a
Scandinavian Cohort
http://www.broad.mit.edu/diabetes/scandinavs/type2.html
25
GWA Study of Serum Uric Acid Levels
  • Linear regression of inverse normalized levels
    against number of alleles
  • Additive model
  • Sex, age, age² as covariates

Li S et al, PLoS Genet 2007; 3:e194.
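
The analysis on this slide can be sketched as two steps: a rank-based inverse normal transform of the trait, then a regression on the number of alleles (0, 1 or 2), which is the additive model. The Python below is a simplified illustration only: the data are made up, the (rank - 0.5)/n offset is one common convention rather than the one the study necessarily used, and the sex, age and age² covariates are omitted.

```python
from statistics import NormalDist, mean


def inverse_normal_transform(values):
    """Replace each value by the normal quantile of its rank (ties broken by order)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    transformed = [0.0] * n
    for rank, index in enumerate(order, start=1):
        transformed[index] = NormalDist().inv_cdf((rank - 0.5) / n)
    return transformed


def additive_slope(allele_counts, trait):
    """Least-squares slope of trait on allele count: the per-allele (additive) effect."""
    mean_x, mean_y = mean(allele_counts), mean(trait)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(allele_counts, trait))
    sxx = sum((x - mean_x) ** 2 for x in allele_counts)
    return sxy / sxx


# Hypothetical data: copies of the minor allele and uric acid in mg/dl.
allele_counts = [0, 0, 1, 1, 2, 2]
uric_acid = [5.1, 4.9, 4.6, 4.4, 4.0, 3.9]

raw_slope = additive_slope(allele_counts, uric_acid)
normalized = inverse_normal_transform(uric_acid)
normalized_slope = additive_slope(allele_counts, normalized)
print(raw_slope, normalized_slope)
```

A negative slope means each additional copy of the counted allele is associated with a lower trait value, on whichever scale the trait was analyzed.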
26
Association of rs6855911 and Uric Acid Levels
                               Genotype Means (mg/dl)
Cohort      Additive Effect    AA            AG            GG
SardiNIA    -0.317             4.66 (1.51)   4.48 (1.59)   4.02 (1.63)
InCHIANTI   -0.397             5.27 (1.44)   4.94 (1.31)   4.33 (1.37)
Li S et al, PLoS Genet 2007; 3:e194.
27
Association Methods for Quantitative Traits
  • Linear regression of multivariable adjusted
    residual against number of alleles
    (Kathiresan, Nat Genet 2008; 40:189-97)
  • Linear regression of log transformed or
    centralized BMI against genotype (Frayling,
    Science 2007; 316:889-94)
  • Variance components based Z-score analysis of
    quantile normalized height (Sanna, Nat Genet
    2008; 40:198-203)

28
Ways of Dealing with Multiple Testing
  • Control family-wise error rate (FWER): Bonferroni
    (α' = α/n) or Šidák (α' = 1 - (1 - α)^(1/n))
  • False discovery rate: proportion of significant
    associations that are actually false positives
  • False positive report probability: probability
    that the null hypothesis is true, given a
    statistically significant finding
  • Bayes factors: analysis avoids need for assessing
    genome-wide error rates but must identify
    reasonable alternative model

Hoggart CJ et al, Genet Epidemiol 2008; 32:179-85.
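
The first two approaches above reduce to short formulas. A minimal Python sketch of the Bonferroni and Šidák per-test thresholds follows, together with the Benjamini-Hochberg step-up rule as one common way to control the false discovery rate (the slide does not name a specific FDR procedure, so that choice is an assumption, as are the function names and the example p-values):

```python
import math


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance level alpha / n."""
    return alpha / n_tests


def sidak_threshold(alpha, n_tests):
    """Per-test level 1 - (1 - alpha)**(1/n), computed stably for large n."""
    return -math.expm1(math.log1p(-alpha) / n_tests)


def benjamini_hochberg(p_values, q=0.05):
    """Indices of tests declared significant by the Benjamini-Hochberg step-up rule."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    cutoff = 0
    for rank, index in enumerate(order, start=1):
        if p_values[index] <= rank * q / m:
            cutoff = rank  # largest rank whose p-value is under its threshold
    return sorted(order[:cutoff])


n_snps = 500_000  # hypothetical number of SNPs tested
print(bonferroni_threshold(0.05, n_snps))
print(sidak_threshold(0.05, n_snps))

example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
print(benjamini_hochberg(example_p, q=0.05))
```

The Šidák threshold is always slightly less strict than the Bonferroni threshold, and the two are nearly identical when the number of tests is large.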
29
Larson, G. The Complete Far Side. 2003.
30
Quality Control of SNP Genotyping: Samples
  • Identity with forensic markers (Identifiler)
  • Blind duplicates
  • Gender checks
  • Cryptic relatedness or unsuspected twinning
  • Degradation/fragmentation
  • Call rate (> 80-90%)
  • Heterozygosity outliers
  • Plate/batch calling effects

Chanock et al, Nature 2007; Manolio et al, Nat Genet 2007.
31
Quality Control of SNP Genotyping: SNPs
  • Duplicate concordance (CEPH samples)
  • Mendelian errors (typically < 1)
  • Hardy-Weinberg errors (often > 10^-5)
  • Heterozygosity (outliers)
  • Call rate (typically > 98%)
  • Minor allele frequency (often > 1%)
  • Validation of most critical results on
    independent genotyping platform

Chanock et al, Nature 2007; Manolio et al, Nat Genet 2007.
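
Two of the per-SNP checks listed above, call rate and minor allele frequency, can be computed directly from genotype calls. A minimal Python sketch, assuming genotypes are coded as the count of one allele (0, 1 or 2) with None for a failed call; the coding, the function name and the example data are illustrative, and the default thresholds are the "typical" values from the slide:

```python
def snp_qc(genotypes, min_call_rate=0.98, min_maf=0.01):
    """Call rate and minor allele frequency for one SNP, plus a pass/fail flag."""
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    if called:
        allele_freq = sum(called) / (2 * len(called))
        maf = min(allele_freq, 1 - allele_freq)
    else:
        maf = 0.0  # no usable calls at all
    passed = call_rate > min_call_rate and maf > min_maf
    return {"call_rate": call_rate, "maf": maf, "pass": passed}


# Hypothetical SNPs typed on 100 samples.
good_snp = [0] * 90 + [1] * 9 + [None]           # 99% called, MAF about 4.5%
low_call_snp = [0] * 85 + [1] * 10 + [None] * 5  # only 95% called
rare_snp = [0] * 99 + [None]                     # monomorphic in this sample

for name, snp in (("good", good_snp), ("low call", low_call_snp), ("rare", rare_snp)):
    print(name, snp_qc(snp))
```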
32
Hardy-Weinberg Equilibrium
  • Occurrences of the two alleles of a SNP in the same
    individual are two independent events
  • Ideal conditions:
      - random mating
      - no selection (equal survival)
      - no migration
      - no mutation
      - no inbreeding
      - large population sizes
      - gene frequencies equal in males and females
  • If alleles A and a of SNP rs1234 have frequencies
    p and 1-p, expected frequencies of the three
    genotypes are:

Freq AA = p^2
Freq Aa = 2p(1-p)
Freq aa = (1-p)^2
After G. Thomas, NCI
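
The expected frequencies above lead directly to a goodness-of-fit check. A minimal Python sketch, assuming a Pearson chi-squared test with 1 degree of freedom (exact tests are often preferred for rare alleles; the function name is illustrative). For illustration it borrows the control genotype counts from the myocardial infarction table earlier in this lecture (676 CC, 1,431 CG, 829 GG):

```python
import math


def hwe_chi2_test(n_AA, n_Aa, n_aa):
    """Allele frequency, expected HWE counts, chi-squared statistic and P-value (1 df).

    Assumes both alleles are observed, so that no expected count is zero.
    """
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-squared with 1 df, written with erfc.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return p, expected, chi2, p_value


freq_c, expected_counts, chi2_stat, hwe_p = hwe_chi2_test(676, 1431, 829)
print(f"p = {freq_c:.4f}")
print("expected counts:", [round(e, 1) for e in expected_counts])
print(f"chi2 = {chi2_stat:.2f}, P = {hwe_p:.3f}")
```

A SNP would be flagged only if this P-value fell below the chosen cutoff, such as the 10^-5 mentioned on the SNP quality control slide.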
33
Coverage, Call Rates, and Concordance of Perlegen and Affymetrix Platforms on HapMap Phase II

Metric                     Perlegen                        Affymetrix/Broad
Number of SNPs             480,744                         439,249
Coverage                   Single Marker   Multi-Marker    Single Marker   Multi-Marker
  CEU                      0.90            0.96            0.78            0.87
  CHB + JPT                0.87            0.93            0.78            0.86
  YRI                      0.64            0.78            0.63            0.75
Average call rate (%)      98.9                            99.3
Concordance (%)
  Homozygous genotypes     99.8                            99.9
  Heterozygous genotypes   99.8                            99.8

GAIN Collaborative Group, Nat Genet 2007; 39:1045-51.
35
Sample and SNP QC Metrics for Affymetrix 5.0 and 6.0 Platforms in GAIN

Metric                  5.0        % fail   6.0        % fail
Total Samples           1,829      --       2,289      --
  Passing QC            1,817      0.44     2,192      4.24
  > 98% call rate       1,815      0.55     2,257      1.40

Total SNPs              457,645    --       906,660    --
  Passing QC            429,309    6.19     845,814    6.70
  MAF > 1%              457,466    0.04     888,234    2.03
  > 98% call rate       419,810    8.27     821,942    9.34
  > 95% call rate       439,272    4.01     873,856    3.61
  HWE < 10^-6           455,899    0.38     904,275    0.26
  < 1 Mendel error      417,722    8.72     899,721    0.01
  < 1 Duplicate error   454,820    0.01     892,103    0.02

Courtesy, J Paschall, NCBI
36
Sample Heterozygosity in GAIN
Courtesy, J Paschall, NCBI
37
Sample Heterozygosity in GAIN
Courtesy, J Paschall, NCBI
38
Signal Intensity Plots for rs10801532 in AREDS
http://www.ncbi.nlm.nih.gov/sites/entrez
39
Signal Intensity Plots for rs4639796 in AREDS
http://www.ncbi.nlm.nih.gov/sites/entrez
40
Signal Intensity Plots for rs534399 in AREDS
http://www.ncbi.nlm.nih.gov/sites/entrez
41
Signal Intensity Plots for rs572515 in AREDS
http://www.ncbi.nlm.nih.gov/sites/entrez
42
Signal Intensity Plots for CD44 SNP rs9666607
Clayton DG et al, Nat Genet 2005; 37:1243-1246.
43
Principal Component Analysis of Structured Population: First to Third Components
Courtesy, G. Thomas, NCI
44
Principal Component Analysis of Structured Population: Fourth and Fifth Components
Courtesy, G. Thomas, NCI
45
Influence of Relatedness on Principal Component
Analysis
Courtesy, G. Thomas, NCI
46
Principal Component Analysis of Structured Population: Fourth and Fifth Components
Courtesy, G. Thomas, NCI
47
Principal Component Analysis of Structured Population: Fourth and Fifth Components
Courtesy, G. Thomas, NCI
48
Summary Points: Genotyping Quality Control
  • Sample checks for identity, gender error, cryptic
    relatedness
  • Sample handling differences can introduce
    artifacts but probably can be adjusted for
  • Association analysis is often quickest way to
    find genotyping errors
  • Low MAF SNPs are most difficult to call
  • Inspection of genotyping cluster plots is
    crucial!

49
Quantile-Quantile Plot for Test Statistics, 390 Breast Cancer Cases, 364 Controls
205,586 SNPs; λ = 1.03
Easton D et al, Nature 2007; 447:1087-1093.
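
Assuming the 1.03 on this slide is the genomic inflation factor, one common definition is the median observed 1-df chi-squared statistic divided by the median expected under the null (about 0.455). The Python sketch below uses that definition and also generates the expected quantiles that form the x-axis of a Q-Q plot. The simulated statistics and function names are illustrative; the study's exact method is not stated on the slide.

```python
import random
from statistics import NormalDist, median

_STANDARD_NORMAL = NormalDist()


def chi2_1df_quantile(prob):
    """Quantile of the chi-squared distribution with 1 df (square of a normal quantile)."""
    return _STANDARD_NORMAL.inv_cdf((1.0 + prob) / 2.0) ** 2


def expected_chi2(n_tests):
    """Expected ordered statistics under the null: the x-axis of a Q-Q plot."""
    return [chi2_1df_quantile((i - 0.5) / n_tests) for i in range(1, n_tests + 1)]


def genomic_inflation(chi2_stats):
    """Median observed statistic divided by the null median (about 0.455)."""
    return median(chi2_stats) / chi2_1df_quantile(0.5)


# Simulated null statistics, then the same statistics inflated by 3%.
rng = random.Random(0)
null_stats = [rng.gauss(0.0, 1.0) ** 2 for _ in range(20_000)]
inflated_stats = [1.03 * s for s in null_stats]

lambda_null = genomic_inflation(null_stats)
lambda_inflated = genomic_inflation(inflated_stats)
print(lambda_null, lambda_inflated)

# Pairs of (expected, observed) values are what a Q-Q plot displays.
qq_points = list(zip(expected_chi2(len(null_stats)), sorted(null_stats)))
```

Values of λ close to 1 suggest little systematic inflation of the test statistics, for example from population structure or genotyping artifacts.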
50
Observed and Expected Associations after Stage 2 of Breast Cancer GWA

Significance      Observed   Observed Adjusted   Expected   Ratio
0.01 - 0.05       1,239      1,162               934        1.24
10^-3 - 10^-2     574        517                 348        1.49
10^-4 - 10^-3     112        88                  53         1.65
10^-5 - 10^-4     16         12                  7          1.71
< 10^-5           15         13                  1          13.5
All p < 0.05      1,956      1,792               1,343      1.33

Easton D et al, Nature 2007; 447:1087-93.
51
Q-Q Plot for Multiple Sclerosis: Effect of MHC
Hafler D et al, N Engl J Med 2007; 357:851-862.
52
Q-Q Plot for Prostate Cancer, all SNPs
Gudmundsson J et al, Nat Genet 2007; 39:977-983.
53
Q-Q Plot for Prostate Cancer, excluding Chromosome 8
Gudmundsson J et al, Nat Genet 2007; 39:977-983.
54
Q-Q Plot for Myocardial Infarction
[Figure: observed chi-squared statistic (0 to 60) plotted against expected chi-squared statistic (0 to 25)]
Samani N et al, N Engl J Med 2007; 357:443-453.
55
-Log10 P Values for SNP Associations with
Myocardial Infarction
Samani N et al, N Engl J Med 2007; 357:443-453.
56
-Log10 P Values for SNP Associations with
Myocardial Infarction
Samani N et al, N Engl J Med 2007; 357:443-453.
57
SNP Associations with 1,928 MI Cases and 2,938
Controls from UK
Samani N et al, N Engl J Med 2007; 357:443-453.
58
Association Signal for Coronary Artery Disease on
Chromosome 9
3049
Samani N et al, N Engl J Med 2007; 357:443-453.
59
Winner's Curse: Odds Ratios for CHD Associated with LTA Genotypes in Multiple Studies
Clarke et al, PLoS Genet 2006; 2:e107.
60
Genome-Wide Scan for Alzheimer's Disease in 861 Cases and 550 Controls
Reiman E et al, Neuron 2007; 54:713-20.
61
Genome-Wide Scan for Alzheimer's Disease in APOE ε4 Carriers
Reiman E et al, Neuron 2007; 54:713-20.
62
LOAD Odds Ratios Associated with rs2373115 GG by APOE ε4 Status

APOE ε4 Group   APOE ε4 OR   95% CI      rs2373115 OR   95% CI
APOE ε4 -                                1.12           0.82, 1.53
APOE ε4 +                                2.88           1.90, 4.36
All             6.07         4.63-7.95   1.34           1.06, 1.70

Reiman et al, Neuron 2007; 54:713-720.
63
P Values of GWA Scan for Age-Related Macular
Degeneration
Klein et al, Science 2005; 308:385-389.
64
Odds Ratios and Population Attributable Risks for AMD

Attribute (SNP)                      rs380390 (C/G)   rs1329428 (C/T)
Risk allele                          C                C
Allelic association χ² P value       4.1 x 10^-8      1.4 x 10^-6
Odds ratio (dominant)                4.6 (2.0-11)     4.7 (1.0-22)
  Frequency in HapMap CEU            0.70             0.82
  Population Attributable Risk (%)   70 (42-84)       80 (0-96)
Odds ratio (recessive)               7.4 (2.9-19)     6.2 (2.9-13)
  Frequency in HapMap CEU            0.23             0.41
  Population Attributable Risk (%)   46 (31-57)       61 (43-73)

Klein et al, Science 2005; 308:385-389.
65
Risk of Developing AMD by CFH Y402H and Modifiable Risk Factors

                  CFH Y402H Genotype
Risk Factor       YY                 YH                 HH
BMI < 30 kg/m²    1.00               1.95 (1.42-2.67)   3.96 (2.69-5.82)
BMI > 30 kg/m²    1.98 (0.91-4.31)   2.19 (1.11-4.30)   12.28 (4.88-30.90)
Non-smoker        1.00               1.95 (1.41-2.71)   4.23 (2.86-6.27)
Current smoker    2.34 (1.20-4.55)   3.20 (1.85-5.55)   8.69 (3.86-19.57)

Schaumberg DA et al, Arch Ophthalmol 2007; 125:55-62.
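
One way to read a table like this is to compare the observed joint estimate (risk genotype plus risk factor) with what the two separate estimates would predict if they combined without interaction. The Python sketch below uses the point estimates from the slide and two standard no-interaction benchmarks: multiplicative (the product of the two estimates) and additive (their sum minus 1). It ignores the confidence intervals, so it illustrates the arithmetic only and is not the authors' analysis; the function name is illustrative.

```python
def expected_joint_effects(rr_gene, rr_environment):
    """Joint relative risk expected with no interaction on each scale."""
    multiplicative = rr_gene * rr_environment
    additive = rr_gene + rr_environment - 1.0
    return multiplicative, additive


# Point estimates from the slide; reference group is YY without the risk factor.
# (label, HH alone, risk factor alone, observed HH plus risk factor)
comparisons = [
    ("HH and BMI above 30", 3.96, 1.98, 12.28),
    ("HH and current smoking", 4.23, 2.34, 8.69),
]

results = {}
for label, rr_hh, rr_factor, observed in comparisons:
    multiplicative, additive = expected_joint_effects(rr_hh, rr_factor)
    results[label] = (multiplicative, additive, observed)
    print(f"{label}: observed {observed:.2f}, "
          f"multiplicative {multiplicative:.2f}, additive {additive:.2f}")
```

For obesity the observed joint estimate (12.28) exceeds both benchmarks, while for smoking it (8.69) lies between the additive and multiplicative expectations.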
66
Interaction: Is LIPC Genotype Related to HDL-C?
[Figure: lines labeled by LIPC genotype (CC, CT, TT)]
Ordovas et al, Circulation 2002; 106:2315-2321.
67
Inverse Relation between Endotoxin Exposure and
Allergic Sensitization by CD14 Genotype
Simpson A et al, Am J Respir Crit Care Med 2006; 174:386-392.
68
Challenges in Studying Gene-Environment Interactions

Challenge                      Genes         Environment
Ease of measure                Pretty easy   Often hard
Variability over time          Low/none      High
Recall bias                    None          Possible
Temporal relation to disease   Easy          Hard
69
Larson, G. The Complete Far Side. 2003.