Title: Genome-Wide Association Studies: Hunting for Genes in the New Millennium
1Genome-Wide Association Studies: Hunting for Genes in the New Millennium
Teri A. Manolio, M.D., Ph.D.
Director, Office of Population Genomics
Senior Advisor to the Director, NHGRI, for Population Genomics
November 20, 2008
National Human Genome Research Institute
National Institutes of Health
U.S. Department of Health and Human Services
2We Live in Interesting Times
- "May he live in interesting times." Like it or not, we live in interesting times. --Robert Kennedy, June 7, 1966
- May you come to the attention of those in authority.
- May you find what you are looking for.
Wikipedia, accessed 11Sep07
3Manolio, Brooks, Collins, J. Clin. Invest., May 2008
42007: The Year of GWA Studies
Pennisi E, Science 2007; 318:1842-43.
5Diseases and Traits with Published GWA Studies (n = 76, 11/17/08)
- Macular Degeneration
- Exfoliation Glaucoma
- Lung Cancer
- Prostate Cancer
- Breast Cancer
- Colorectal Cancer
- Bladder Cancer
- Neuroblastoma
- Melanoma
- TP53 Cancer Predisposn
- Chr. Lymph. Leukemia
- Inflamm. Bowel Disease
- Celiac Disease
- Gallstones
- Irritable Bowel Syndrome
- QT Prolongation
- Syst. Lupus Erythematosus
- Sarcoidosis
- Pulmonary Fibrosis
- Psoriasis
- HIV Viral Setpoint
- Childhood Asthma
- Type 1 Diabetes
- Type 2 Diabetes
- Diabetic Nephropathy
- End-St. Renal Disease
- Obesity, BMI, Waist, IR
- Height
- Osteoporosis
- Osteoarthritis
- Male Pattern Baldness
- F-Cell Distribution
- Fetal Hgb Levels
- Lipids and Lipoproteins
- Warfarin Dosing
- Ximelagatran Adv. Resp.
- Parkinson Disease
- Amyotrophic Lat. Sclerosis
- Multiple Sclerosis
- MS Interferon-β Response
- Prog. Supranuclear Palsy
- Alzheimer's Disease in APOE ε4 Carriers
- Cognitive Ability
- Memory
- Hearing
- Restless Legs Syndrome
- Nicotine Dependence
- Methamphetamine Depend.
- Neuroticism
- Schizophrenia
- Sz. Iloperidone Response
6"There have been few, if any, similar bursts of discovery in the history of medical research."
Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.
7What is a Genome-Wide Association Study?
- Method for interrogating all 10 million variable points across the human genome
- Variation is inherited in groups, or blocks, so not all 10 million points have to be tested
- Blocks are shorter (so more points need to be tested) the less closely people are related
- Technology now allows studies in unrelated persons, assuming 5,000-10,000 base pair lengths in common (300,000-1,000,000 markers)
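A back-of-envelope check of the marker counts above can be sketched as follows. It assumes a genome of roughly 3 billion base pairs and one marker per shared segment; both are illustrative assumptions, not figures from the slide, and `GENOME_BP` and `markers_needed` are hypothetical names.

```python
# Assumed genome length in base pairs (illustrative, not from the slide).
GENOME_BP = 3_000_000_000


def markers_needed(segment_bp, genome_bp=GENOME_BP):
    """Markers required if each shared segment of segment_bp gets one marker."""
    return genome_bp // segment_bp


for segment_bp in (10_000, 5_000):
    print(f"{segment_bp:>6} bp segments -> {markers_needed(segment_bp):,} markers")
```

Under these assumptions, 10,000-bp segments call for 300,000 markers and 5,000-bp segments for 600,000; 1,000,000 markers would correspond to about one marker per 3,000 base pairs.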
8DNA on Chromosome 7
GAAATAATTAATGTTTTCCTTCCTTCTCC
TATTTTGTCCTTTACTTCAATTTATTTATTTATTATTAATATTATTATTT
TTTGAGACGGAGTTTC/ACTCTTGTTGCCAACCTGGAGTGCAGTGGCGTG
ATCTCAGCTCACTGCACACTCCGCTTTCCTGGTTTCAAGCGATTCTCCTG
CCTCAGCCTCCTGAGTAGCTGGGACTACAGTCACACACCACCACGCCCGG
CTAATTTTTGTATTTTTAGTAGAGTTGGGGTTTCACCATGTTGGCCAGAC
TGGTCTCGAACTCCTGACCTTGTGATCCGCCAGCCTCTGCCTCCCAAAGA
GCTGGGATTACAGGCGTGAGCCACCGCGCTCGGCCCTTTGCATCAATTTC
TACAGCTTGTTTTCTTTGCCTGGACTTTACAAGTCTTACCTTGTTCTGCC
/TTCAGATATTTGTGTGGTCTCATTCTGGTGTGCCAGTAGCTAAAAATCC
ATGATTTGCTCTCATCCCACTCCTGTTGTTCATCTCCTCTTATCTGGGGT
CACA/CTATCTCTTCGTGATTGCATTCTGATCCCCAGTACTTAGCATGTG
CGTAACAACTCTGCCTCTGCTTTCCCAGGCTGTTGATGGGGTGCTGTTCA
TGCCTCAGAAAAATGCATTGTAAGTTAAATTATTAAAGATTTTAAATATA
GGAAAAAAGTAAGCAAACATAAGGAACAAAAAGGAAAGAACATGTATTCT
AATCCATTATTTATTATACAATTAAGAAATTTGGAAACTTTAGATTACAC
TGCTTTTAGAGATGGAGATGTAGTAAGTCTTTTACTCTTTACAAAATACA
TGTGTTAGCAATTTTGGGAAGAATAGTAACTCACCCGAACAGTG/TAATG
TGAATATGTCACTTACTAGAGGAAAGAAGGCACTTGAAAAACATCTCTAA
ACCGTATAAAAACAATTACATCATAATGATGAAAACCCAAGGAATTTTTT
TAGAAAACATTACCAGGGCTAATAACAAAGTAGAGCCACATGTCATTTAT
CTTCCCTTTGTGTCTGTGTGAGAATTCTAGAGTTATATTTGTACATAGCA
TGGAAAAATGAGAGGCTAGTTTATCAACTAGTTCATTTTTAAAAGTCTAA
CACATCCTAGGTATAGGTGAACTGTCCTCCTGCCAATGTATTGCACATTT
GTGCCCAGATCCAGCATAGGGTATGTTTGCCATTTACAAACGTTTATGTC
TTAAGAGAGGAAATATGAAGAGCAAAACAGTGCATGCTGGAGAGAGAAAG
CTGATACAAATATAAAT/GAAACAATAATTGGAAAAATTGAGAAACTACT
CATTTTCTAAATTACTCATGTATTTTCCTAGAATTTAAGTCTTTTAATTT
TTGATAAATCCCAATGTGAGACAAGATAAGTATTAGTGATGGTATGAGTA
ATTAATATCTGTTATATAATATTCATTTTCATAGTGGAAGAAATAAAATA
AAGGTTGTGATGATTGTTGATTATTTTTTCTAGAGGGGTTGTCAGGGAAA
GAAATTGCTTTTT
SNPs: about 1 per 300 bases
9Mapping the Relationships Among SNPs
Christensen and Murray, N Engl J Med 2007; 356:1094-97.
10Chromosome 9p21 Region Associated with MI
Samani N et al, N Engl J Med 2007; 357:443-453.
11Distances Among East Coast Cities
              Boston  Providence  New York  Philadelphia  Baltimore
Providence        59
New York         210         152
Philadelphia     320         237        86
Baltimore        430         325       173            87
Washington       450         358       206           120         34
12Distances Among East Coast Cities
              Boston  Providence  New York  Philadelphia  Baltimore
Providence        59
New York         210         152
Philadelphia     320         237        86
Baltimore        430         325       173            87
Washington       450         358       206           120         34
Legend: < 100, 101-200, 201-300, 301-400, > 400
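The legend groups pairwise distances into five bands, the same idea used when pairwise relationships among SNPs are shown as a shaded grid. A small sketch of that binning over the table's values follows; treating a distance of exactly 100 as part of the first band is an assumption, since the legend leaves it unassigned, and the function names are hypothetical.

```python
cities = ["Boston", "Providence", "New York",
          "Philadelphia", "Baltimore", "Washington"]

# Lower-triangle distances from the slide: rows[i] holds the distances
# from cities[i + 1] to cities[0], cities[1], ..., cities[i].
rows = [
    [59],
    [210, 152],
    [320, 237, 86],
    [430, 325, 173, 87],
    [450, 358, 206, 120, 34],
]


def band(distance):
    """Map a distance to its legend band (exactly 100 is assumed to fall in the first)."""
    if distance <= 100:
        return "< 100"
    if distance <= 200:
        return "101-200"
    if distance <= 300:
        return "201-300"
    if distance <= 400:
        return "301-400"
    return "> 400"


def banded_pairs():
    """Return {(city_a, city_b): band} for every pair in the table."""
    result = {}
    for i, row in enumerate(rows):
        for j, distance in enumerate(row):
            result[(cities[i + 1], cities[j])] = band(distance)
    return result


for pair, label in banded_pairs().items():
    print(pair, label)
```

Adjacent cities fall into the closest band, just as neighboring SNPs tend to show the strongest relationships.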
14Distances Among East Coast Cities
15Distances Among East Coast Cities
Boston, Providence, New York, Philadelphia, Baltimore, Washington
16Mapping the Relationships Among SNPs
Christensen and Murray, N Engl J Med 2007; 356:1094-97.
17One Tag SNP May Serve as Proxy for Many
Block 1
Block 2
Marked SNPs: SNP1, SNP2, SNP3, SNP4, SNP5, SNP6, SNP7, SNP8
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CGGATTGCTGCATGGATCGCATCTGTAAGCAC
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CAGATCGCTGGATGAATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAC
19One Tag SNP May Serve as Proxy for Many
Block 1
Block 2
Marked SNPs: SNP3, SNP5, SNP6, SNP7, SNP8
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CGGATTGCTGCATGGATCGCATCTGTAAGCAC
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CAGATCGCTGGATGAATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAC
20One Tag SNP May Serve as Proxy for Many
Block 1
Block 2
Marked SNPs: SNP3, SNP6, SNP8
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CGGATTGCTGCATGGATCGCATCTGTAAGCAC
- CAGATCGCTGGATGAATCGCATCTGTAAGCAT
- CAGATCGCTGGATGAATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAT
- CGGATTGCTGCATGGATCCCATCAGTACGCAC
21One Tag SNP May Serve as Proxy for Many
Block 1  Block 2  Singleton  Frequency (%)
G        T        T          35
C        T        C          30
G        T        T          10
G        A        T          8
C        A        T          7
C        A        C          6
other haplotypes             4
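The tag-SNP reasoning on the preceding slides can be sketched in a few lines: find the positions that vary among the six haplotypes, group the ones that always change together, and keep one SNP per group. This is an illustrative sketch using the slide's sequences, not the procedure used by HapMap or any genotyping vendor; the function names are hypothetical.

```python
# The six haplotypes shown on the tag-SNP slides.
haplotypes = [
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CGGATTGCTGCATGGATCGCATCTGTAAGCAC",
    "CAGATCGCTGGATGAATCGCATCTGTAAGCAT",
    "CAGATCGCTGGATGAATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAT",
    "CGGATTGCTGCATGGATCCCATCAGTACGCAC",
]


def variable_sites(haps):
    """0-based positions where the haplotypes are not all identical."""
    return [i for i in range(len(haps[0])) if len({h[i] for h in haps}) > 1]


def pattern(haps, site):
    """Relabel alleles by order of first appearance, e.g. A,G,A -> (0, 1, 0)."""
    seen = {}
    return tuple(seen.setdefault(h[site], len(seen)) for h in haps)


def correlated_groups(haps):
    """Group SNP numbers (1-based, left to right) that share a pattern."""
    groups = {}
    for number, site in enumerate(variable_sites(haps), start=1):
        groups.setdefault(pattern(haps, site), []).append(number)
    return list(groups.values())


def tag_haplotypes(haps, tag_snps):
    """Alleles at the chosen tag SNPs, one short string per haplotype."""
    sites = variable_sites(haps)
    return ["".join(h[sites[n - 1]] for n in tag_snps) for h in haps]


print(correlated_groups(haplotypes))
print(tag_haplotypes(haplotypes, [3, 6, 8]))
```

With these sequences, SNPs 1-4 vary together, SNPs 5-7 vary together, and SNP 8 stands alone, so three tags (the slides use SNP3, SNP6, and SNP8) distinguish the same haplotypes as all eight SNPs.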
22www.hapmap.org
Nature 2005; 437:1299-320.
Nature 2007; 449:851-61.
23A HapMap for More Efficient Association Studies
Goals
- Use just the density of SNPs needed to find associations between SNPs and diseases
- Do not miss chromosomal regions with disease association
- Produce a tool to assist in finding genes affecting health and disease
- Use more SNPs for complete genome coverage of populations of recent African ancestry, due to shorter LD
24Progress in Genotyping Technology
[Figure: cost per genotype (cents, USD; 1 to 10^2) versus number of SNPs (1 to 10^6), 2001 to 2005. Platforms shown: ABI TaqMan, ABI SNPlex, Illumina Golden Gate, Affymetrix MegAllele, Affymetrix 10K, Illumina Infinium/Sentrix, Perlegen, Affymetrix 100K/500K]
Courtesy S. Chanock, NCI
25Continued Progress in Genotyping Technology
[Figure: cost per person (USD), July 2005 to Oct 2006. Platforms shown: Affymetrix 500K, Illumina 550K, Illumina 650Y, Illumina 317K]
Courtesy S. Gabriel, Broad/MIT
26Association of Alleles and Genotypes of rs1333049 with Myocardial Infarction
          C, N (%)      G, N (%)      χ² (1 df)  P-value
Cases     2,132 (55.4)  1,716 (44.6)  55.1       1.2 x 10^-13
Controls  2,783 (47.4)  3,089 (52.6)
Allelic odds ratio: 1.38
Samani N et al, N Engl J Med 2007; 357:443-53.
27Association of Alleles and Genotypes of rs1333049 with Myocardial Infarction
          C, N (%)      G, N (%)      χ² (1 df)  P-value
Cases     2,132 (55.4)  1,716 (44.6)  55.1       1.2 x 10^-13
Controls  2,783 (47.4)  3,089 (52.6)
Allelic odds ratio: 1.38
          CC, N (%)   CG, N (%)     GG, N (%)   χ² (2 df)  P-value
Cases     586 (30.5)  960 (49.9)    378 (19.6)  59.7       1.1 x 10^-14
Controls  676 (23.0)  1,431 (48.7)  829 (28.2)
Heterozygote odds ratio: 1.47
Homozygote odds ratio: 1.90
Samani N et al, N Engl J Med 2007; 357:443-53.
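The three odds ratios on this slide follow from the counts by the usual cross-product rule. A minimal sketch, assuming G (and the GG genotype) is the reference category; the function and variable names are hypothetical:

```python
def odds_ratio(case_risk, case_ref, control_risk, control_ref):
    """Cross-product odds ratio for a 2x2 table of counts."""
    return (case_risk * control_ref) / (case_ref * control_risk)


# Allele counts from the slide: C versus G in cases and controls.
allelic = odds_ratio(2132, 1716, 2783, 3089)

# Genotype counts from the slide, each compared with the GG group.
heterozygote = odds_ratio(960, 378, 1431, 829)  # CG versus GG
homozygote = odds_ratio(586, 378, 676, 829)     # CC versus GG

print(f"allelic {allelic:.2f}, heterozygote {heterozygote:.2f}, "
      f"homozygote {homozygote:.2f}")
```

To two decimals these come out at 1.38, 1.47, and 1.90, matching the values reported on the slide.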
28P Values of GWA Scan for Age-Related Macular Degeneration
Klein et al, Science 2005; 308:385-389.
29Nicotine Dependence among Smokers
Bierut LJ et al, Hum Molec Genet 2007; 16:24-35.
30Genome-Wide Scan for Type 2 Diabetes in a Scandinavian Cohort
http://www.broad.mit.edu/diabetes/scandinavs/type2.html
31Genome-Wide Scan for Crohn Disease in Belgian Cases and Controls
Libioulle C et al, PLoS Genet 2007 Apr 20; 3(4):e58.
32Genome-Wide Scan for Type 2 Diabetes in French Case-Control Study
Sladek R et al, Nature 2007; 445:881-885.
33Wellcome Trust Genome-Wide Association Study of Seven Common Diseases
WTCCC, Nature 2007; 447:661-678.
34Genome-Wide Scan for Breast Cancer in Postmenopausal Women
Hunter DJ et al, Nat Genet 2007; 39:870-874.
35-Log10 P Values for SNP Associations with Myocardial Infarction
Samani N et al., N Engl J Med 2007; 357:443-53.
36Association Signal for Coronary Artery Disease on Chromosome 9
Samani N et al., N Engl J Med 2007; 357:443-53.
37Region of Chromosome 1 Showing Strong Association with Inflammatory Bowel Disease
Duerr R et al., Science 2006; 314:1461-63.
38Unique Aspects of GWA Studies
- Permit examination of inherited genetic variability at unprecedented level of resolution
- Permit "agnostic" genome-wide evaluation
- Once genome measured, can be related to any trait
- Most robust associations in GWA studies have not been with genes previously suspected of association with the disease
- Some associations in regions not even known to harbor genes
"The chief strength of the new approach also contains its chief problem: with more than 500,000 comparisons per study, the potential for false positive results is unprecedented."
Hunter DJ and Kraft P, N Engl J Med 2007; 357:436-439.
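The scale of the false-positive problem in the quotation can be made concrete with a short calculation. The sketch below assumes independent tests and a conventional significance level of 0.05; both are simplifying assumptions, as is the choice of a Bonferroni correction, which the slide does not name.

```python
def expected_false_positives(n_tests, alpha):
    """Expected number of tests passing p < alpha when no true effect exists."""
    return n_tests * alpha


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test p-value cutoff that holds the family-wise error rate at alpha."""
    return alpha / n_tests


n_comparisons = 500_000  # "more than 500,000 comparisons per study"
print(expected_false_positives(n_comparisons, 0.05))
print(bonferroni_threshold(n_comparisons))
```

With 500,000 comparisons, a nominal cutoff of 0.05 would let through about 25,000 chance findings, while the corrected per-test cutoff is about 1 x 10^-7.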
39Larson, G. The Complete Far Side. 2003.
40Number of New, Significant Gene-Disease Associations by Year, 1984 - 2000
Hirschhorn J et al, Genet Med 2002; 4:45-61.
41Of 600 Gene-Disease Associations, Only 6 Significant in > 75% of Identified Studies
Disease/Trait              Gene   Polymorphism   Frequency
DVT                        F5     Arg506Gln      0.015
Graves' Disease            CTLA4  Thr17Ala       0.62
Type 1 DM                  INS    5' VNTR        0.67
HIV/AIDS                   CCR5   32 bp Ins/Del  0.05-0.07
Alzheimer's                APOE   Epsilon 2/3/4  0.16-0.24
Creutzfeldt-Jakob Disease  PRNP   Met129Val      0.37
Hirschhorn J et al., Genet Med 2002; 4:45-61.
42Reports For and Against Associations of Variants with Carotid Atherosclerosis
POLYMORPHISM      PRESENT                  ABSENT  SUMMARY
ACE I/D           13 with D, 1 with I      18      favors none
APOE              8 with ε4, 2 with ε2     9       equivocal
AGT M235T         0                        8       none
AGTR1 A1166C      0                        7       none
MTHFR             7 with T, 1 with non-T   8       equivocal
PON1 Q192R        3 with R                 10      none
PON1 L55M         5 with L (subgroups)     1       weak
NOS3 G894T        1 with T                 4       none
MMP3 -1516 5A/6A  4 with 6A                0       association
IL-6 G-174C       1 with G                 3       none
Manolio et al., ATVB 2004; 24:1567-77.
44Chanock S, Manolio T, et al., Nature 2007; 447:655-60.
45Replication, Replication, Replication
- Initial study: sufficient description to permit replication
  - Sources of cases and controls
  - Participation rates and flow chart of selection
  - Methods for assessing affected status
  - Standard "Table 1" including rates of missing data
  - Assessment of population heterogeneity
  - Genotyping methods and QC metrics
- Replication study
  - Similar population, similar phenotype
  - Same genetic model, same SNP, same direction
  - Adequately powered to detect postulated effect
Chanock S, Manolio T, et al., Nature 2007; 447:655-60.
46Replication Strategy for Prostate Cancer Study in CGEMS
Initial Study: 1,150 cases / 1,150 controls; > 500,000 tag SNPs
Replication Study 1: 3,000 cases / 3,000 controls; 24,000 SNPs
Replication Study 2: 2,400 cases / 2,400 controls; 1,500 SNPs
200 new ht-SNPs
Replication Study 3: 2,500 cases / 2,500 controls; 25-50 loci
Hoover R, Epidemiology 2007; 18:13-17.
47Replication Strategy in Easton Breast Cancer Study
Stage  Cases   Controls  SNPs
1      408     400       266,722
Easton et al, Nature 2007; 447:1087-93.
48Replication Strategy in Easton Breast Cancer Study
Stage  Cases   Controls  SNPs
1      408     400       266,722
2      3,990   3,916     13,023
Easton et al, Nature 2007; 447:1087-93.
49Replication Strategy in Easton Breast Cancer Study
Stage  Cases   Controls  SNPs
1      408     400       266,722
2      3,990   3,916     13,023
3      23,734  23,639    31
Easton et al, Nature 2007; 447:1087-93.
50Replication Strategy in Easton Breast Cancer Study
Stage  Cases   Controls  SNPs
1      408     400       266,722
2      3,990   3,916     13,023
3      23,734  23,639    31
Final                    6
- ABCFS
- BCST
- COPS
- GENICA
- HBCS
- HBCP
- MEC-W
- MEC-J
- NHS
- PBCS
- RBCS
- SASBAC
- SEARCH2
- SEARCH3
- SBCP
- SBCS
- CNIOBCS
- USRT
- TBCS
- KConFab/AOCS
- KBCP
- LUMCBCS
- MCBCS
- MCCS
Easton et al, Nature 2007; 447:1087-93.
51Larson, G. The Complete Far Side. 2003.
52Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
Thomas et al, Nat Genet 2008; 40:310-15.
53Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
2      3,941  3,964     26,958
Thomas et al, Nat Genet 2008; 40:310-15.
54Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
2      3,941  3,964     26,958
Selected for p < 0.068
Thomas et al, Nat Genet 2008; 40:310-15.
55Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
2      3,941  3,964     26,958
Selected for p < 0.068
SNP         Gene   Stage 1+2 P-value
rs4962416   MSMB   7 x 10^-13
rs10896449  11q13  2 x 10^-9
rs10993994  CTBP2  2 x 10^-7
rs10486567  JAZF1  2 x 10^-6
Thomas et al, Nat Genet 2008; 40:310-15.
56Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
2      3,941  3,964     26,958
Selected for p < 0.068
SNP         Gene   Stage 1+2 P-value  Initial Rank
rs4962416   MSMB   7 x 10^-13         24,223
rs10896449  11q13  2 x 10^-9          2,439
rs10993994  CTBP2  2 x 10^-7          319
rs10486567  JAZF1  2 x 10^-6          24,407
Thomas et al, Nat Genet 2008; 40:310-15.
57Replication Strategy in CGEMS Prostate Cancer Study
Stage  Cases  Controls  SNPs
1      1,172  1,157     527,869
2      3,941  3,964     26,958
Selected for p < 0.068
SNP         Gene   Stage 1+2 P-value  Initial Rank  Initial P-value
rs4962416   MSMB   7 x 10^-13         24,223        0.042
rs10896449  11q13  2 x 10^-9          2,439         0.004
rs10993994  CTBP2  2 x 10^-7          319           4 x 10^-4
rs10486567  JAZF1  2 x 10^-6          24,407        0.042
Thomas et al, Nat Genet 2008; 40:310-15.
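The table's point is that none of these four SNPs would have survived a strict genome-wide cutoff in the initial scan, yet all passed the lenient stage-1 selection and went on to replicate. A small sketch of that comparison, using the initial p-values from the slide; the Bonferroni cutoff at a family-wise level of 0.05 is an assumption chosen for illustration, not a threshold stated in the study.

```python
N_STAGE1_SNPS = 527_869   # SNPs tested in stage 1 (from the slide)
STAGE1_CUTOFF = 0.068     # selection threshold for stage 2 (from the slide)

# Initial (stage 1) p-values as listed on the slide.
initial_p = {
    "rs4962416": 0.042,
    "rs10896449": 0.004,
    "rs10993994": 4e-4,
    "rs10486567": 0.042,
}

# Assumed strict alternative: Bonferroni at a family-wise level of 0.05.
strict_cutoff = 0.05 / N_STAGE1_SNPS

for snp, p in initial_p.items():
    print(f"{snp}: passes strict cutoff = {p < strict_cutoff}, "
          f"carried to stage 2 = {p < STAGE1_CUTOFF}")
```

Every SNP fails the strict cutoff (roughly 9.5 x 10^-8) but falls under 0.068, which is why a staged design with a permissive first stage retained them.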
58Published GWA Reports, 3/2005 - 9/2008
[Figure: total number of publications by calendar quarter; labeled value: 191]
59NHGRI Catalog of GWA Studies
http://www.genome.gov/gwastudies/
60NHGRI GWA Catalog - Objectives
- Identify and track all GWA publications attempting to assay > 100,000 SNPs
- Extract key information regarding associations
- Provide widely as scientific resource, including downloadable datafile
- Seek commonalities across associations genome-wide rather than disease by disease
- Describe approach clearly so others can replicate or expand upon it
- Maintain consistency in approach
- Adapt to evolving technologies: CNVs?
61NHGRI GWA Catalog - Methods
- Survey NIH e-clips daily, PubMed weekly
- Identify all GWA publications attempting to assay > 100,000 SNPs
- Describe top 5 novel associations significant at p < 10^-6
- Expand to all associations with p < 10^-6
- Extract information on
  - Disease/trait
  - Sample size
  - Genomic region
  - Reported genes
  - Rs number/risk allele
  - Risk allele frequency
  - P-value, OR, 95% CI
  - Platform, SNPs
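The selection rule above (keep associations with p < 10^-6, then report at most the five strongest) can be sketched as a simple filter. The records reuse the four prostate cancer SNPs and Stage 1+2 p-values from the CGEMS slide earlier in this deck; the record layout and the function name `top_associations` are hypothetical and do not describe the catalog's actual data format.

```python
P_CUTOFF = 1e-6  # significance threshold named on the slide
TOP_N = 5        # number of associations described per publication

# Sample records taken from the CGEMS prostate cancer slide.
associations = [
    {"rs": "rs4962416", "region": "MSMB", "p": 7e-13},
    {"rs": "rs10896449", "region": "11q13", "p": 2e-9},
    {"rs": "rs10993994", "region": "CTBP2", "p": 2e-7},
    {"rs": "rs10486567", "region": "JAZF1", "p": 2e-6},
]


def top_associations(records, cutoff=P_CUTOFF, limit=TOP_N):
    """Keep records with p below cutoff, strongest first, at most `limit`."""
    significant = [r for r in records if r["p"] < cutoff]
    return sorted(significant, key=lambda r: r["p"])[:limit]


for record in top_associations(associations):
    print(record["rs"], record["region"], record["p"])
```

Applied to these four records, the rule keeps three; the JAZF1 SNP at 2 x 10^-6 falls just outside the cutoff.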
62Reports Included in this Analysis
- 180 published papers through 9/18/2008
- 34 did not report SNP
- 1 reported haplotypes without specific SNPs
- 145 reports
- 782 unique (index) SNPs
- 3,841 unique perfect LD SNPs (linked)
- 4,623 index + linked
- 83 index SNPs reported 2-7 times
- 10 multiple reports in unrelated traits
63Functional Classification of 782 Index SNPs Associated with Complex Traits
[Figure: bar chart; axis: Percent, 0 to 60; bar labels: 37, 11, 340, 2, 11, 6, 22, 20, 354]
64Functional Classification of 782 Index SNPs and 4,623 Index + Linked SNPs
[Figure: bar chart; axis: Percent, 0 to 60; series: Index SNPs, Index + Linked SNPs]
65Odds Ratios of Discrete Associations
Median 1.28
[Figure: distribution of odds ratios; broken axis with ticks at 3, 4, 5, 6, 9, 13, 20, 30]
66Percent of Variance in Disease Risk Explained by
32 Established CD Risk Loci
Power to detect risk loci
Barrett et al., Nat Genet 2008 Jun 29.
67Odds Ratios of Discrete Associations
[Figure: distribution of odds ratios; broken axis with ticks at 3, 4, 5, 6, 9, 13, 20, 30]
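68Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits
[Figure: risk allele frequency by odds ratio; axis ticks: 30, 25, 20, 15, 10, 5, 4, 3, 2, 1]
69Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits
[Figure: risk allele frequency by odds ratio; axis ticks: 30, 25, 20, 15, 10, 5, 4, 3, 2, 1]
70Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits
[Figure: risk allele frequency by odds ratio; axis ticks: 30, 25, 20, 15, 10, 5, 4, 3, 2, 1]
71Reported Risk Allele Frequencies by Odds Ratios for Discrete Traits
[Figure: risk allele frequency by odds ratio; axis ticks: 30, 25, 20, 15, 10, 5, 4, 3, 2, 1]
Labeled points: Sarasquete (Osteonecrosis), Thorleifsson (Exfoliation Glaucoma), Hakonarson (Type 1 DM), van Heel (Celiac Disease), WTCCC (Type 1 DM)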
72Characteristics of SNPs Associated with Odds Ratios > 4.5
Author          Trait                   Ca/Co        RAF   OR     P-value
Thorleifsson    Exfoliation glaucoma    75/14,747    0.85  20.10  3 x 10^-21
Sarasquete      Osteonecrosis           21/64        0.12  12.75  1 x 10^-6
Hakonarson (M)  Type 1 diabetes         561/1,143    0.13  8.30   1 x 10^-16
van Heel (M)    Celiac disease          991/1,489    0.14  7.04   1 x 10^-19
Matarin         Stroke                  259/269      NR    5.62   6 x 10^-6
WTCCC (M)       Type 1 diabetes         1,963/2,938  0.39  5.49   5 x 10^-134
Behrens         Juvenile arthritis      130/1,952    NR    5.37   2 x 10^-10
Fung            Parkinson's disease     267/270      NR    5.00   7 x 10^-6
Klein           Macular degeneration    96/50        0.70  4.60   4 x 10^-8
SEARCH          Statin myopathy         85/90        0.13  4.50   2 x 10^-9
2 other SNPs in this study also associated, OR > 2.0.
(M) SNPs in MHC region.
73FST Values: Index SNPs and HapMap SNPs
Median 0.069
[Figure: distribution of FST values; axis: FST Values, 0.00 to 0.80]
74Phenotype Relationships of SNPs with Highest FST Values
[Figure: phenotype categories: Immune Related, Pigment Traits, Obesity Related, Neuro., Height, BMD, Cancer]
75Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Macular Degeneration  CFH
Coronary Disease      CDKN2A/2B
Childhood Asthma      ORMDL3
Type II Diabetes      CDKAL1
Crohn's Disease       ATG16L1
76Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes
Macular Degeneration  CFH
Coronary Disease      CDKN2A/2B
Childhood Asthma      ORMDL3
Type II Diabetes      CDKAL1
Crohn's Disease       ATG16L1
Signals in Gene Deserts
Prostate Cancer       8q24
Crohn's Disease       5p13.1, 1q31.2, 10p21
77Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes     Signals in Common
Macular Degeneration  CFH
Coronary Disease      CDKN2A/2B             Diabetes, Melanoma
Childhood Asthma      ORMDL3                Crohn's Disease
Type II Diabetes      CDKAL1                Prostate Cancer
Crohn's Disease       ATG16L1
Signals in Gene Deserts
Prostate Cancer       8q24
Crohn's Disease       5p13.1, 1q31.2, 10p21
78Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes     Signals in Common
Macular Degeneration  CFH
Coronary Disease      CDKN2A/2B             Diabetes, Melanoma
Childhood Asthma      ORMDL3                Crohn's Disease
Type II Diabetes      CDKAL1                Prostate Cancer
Crohn's Disease       ATG16L1
Signals in Gene Deserts                     Signals in Common
Prostate Cancer       8q24                  Breast, Colorectal Cancer; Crohn's
Crohn's Disease       5p13.1, 1q31.2, 10p21
79Lessons Learned from Initial GWA Studies
Signals in Previously Unsuspected Genes     Signals in Common
Macular Degeneration  CFH
Coronary Disease      CDKN2A/2B             Diabetes, Melanoma
Childhood Asthma      ORMDL3                Crohn's Disease
Type II Diabetes      CDKAL1                Prostate Cancer
Crohn's Disease       ATG16L1
Signals in Gene Deserts                     Signals in Common
Prostate Cancer       8q24                  Breast, Colorectal Cancer; Crohn's
Crohn's Disease       5p13.1, 1q31.2, 10p21
Signals in Common
Multiple Sclerosis    IL7R                  Type 1 Diabetes
Sarcoidosis           C10orf67              Celiac Disease
RA, T1DM              PTPN2, PTPN22         Crohn's
81Study Crohn's Disease!
Barrett et al., Nat Genet 2008 Jun 29.
82Conclusions
- Nearly half of GWA-identified SNPs are intergenic
- Only 8.4% of index SNPs are in coding regions, 5' or 3' UTR, or miRTS
- Potential selection bias in genotyped SNPs for excess of missense variants
- Most associated odds ratios are < 1.5
- Risk allele frequencies do not appear skewed toward rare alleles or large FST values
- Highly-differentiated SNPs enriched for immune-related, pigmentation, and obesity traits
- Examination of loci at extremes of these characteristics may yield interesting insights
83
- "The more we find, the more we see, the more we come to learn. The more that we explore, the more we shall return."
- Sir Tim Rice, Aida, 2000