Title: High-density SNP association and CNV analysis of two Autism Susceptibility Loci
Pagnamenta AT (1), Maestrini E (2), Lamb JA (3), Sykes NH (1), Sousa I (1), Toma C (2), Bacchelli E (2), Morris AP (1), Bailey AJ (4), Monaco AP (1), IMGSAC
(1) Wellcome Trust Centre for Human Genetics, Oxford, UK
(2) Dipartimento di Biologia, Università di Bologna, Italy
(3) Centre for Integrated Genomic Medical Research, Manchester, UK
(4) Department of Psychiatry, University of Oxford, UK
Introduction
- Autism is a neurodevelopmental disorder characterised by:
  - Impaired communication.
  - Impaired reciprocal social interaction.
  - Restricted interests and repetitive behaviour.
- The strong genetic component inferred from twin and family studies has led to numerous genetic linkage studies.
- In the first of these studies, the International Molecular Genetic Study of Autism Consortium (IMGSAC) identified a candidate region on chromosome 7q [1]. This region, termed Autism Susceptibility Locus 1 (AUTS1), has been further implicated in replication studies and meta-analyses [2]. However, screening of several AUTS1 candidate genes has thus far failed to identify the underlying genetic variants [3].
- IMGSAC later identified another Autism Susceptibility Locus (AUTS5) on chromosome 2q, which has also been replicated [4]. Screening of AUTS5 candidate genes has not identified any variants that can account for the relatively strong linkage signal at this locus [5].
- In order to gain a better understanding of the genetic variation that may underlie the AUTS1 and AUTS5 linkage signals, a high-density SNP association study and Copy Number Variation (CNV) analysis were undertaken in trios selected from 293 IMGSAC multiplex families based on identity-by-descent sharing at these loci.
Results
- Quality controls
  - Based on 20 SNPs present on both the AUTS1 and AUTS5 arrays, genotyping reproducibility was 99.99%.
  - Analysis of 50 autosomal SNPs showed no evidence for population stratification.
- SNP association
  - AUTS1 (see Figure 1A)
    - In the family-based analysis, PLXNA4 demonstrated the greatest significance (rs4731863).
    - The strongest signal in the Trend test analysis was from IMMP2L (rs12537269).
  - AUTS5 (see Figure 1B)
    - TDT analysis implicated UPP2 (rs6709528) and ZNF533 (rs11885327).
    - NOSTRIN provided the strongest association in the Trend test (rs7583629).
Figure 1: Family-based and case-control SNP association results. Negative log10 of uncorrected P-values plotted against chromosome position. A, AUTS1 data. B, AUTS5 data. Labelled SNPs: rs4731863 (P = 1.0 x 10^-4), rs12537269 (P = 1.2 x 10^-4), rs2217262 (P = 1.4 x 10^-2 and P = 4.2 x 10^-3), rs6709528 (P = 8.0 x 10^-4), rs11885327 (P = 8.0 x 10^-4), rs7583629 (P = 3.2 x 10^-5).
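As a small illustration of the y-axis used in Figure 1, the Python sketch below converts some of the P-values labelled on the figure to the negative log10 scale. The dictionary and function names are illustrative only, and only a subset of the labelled SNPs is included.

```python
import math

# Uncorrected P-values as labelled on Figure 1 (illustrative subset).
p_values = {
    "rs4731863": 1.0e-4,
    "rs12537269": 1.2e-4,
    "rs6709528": 8.0e-4,
    "rs7583629": 3.2e-5,
}


def neg_log10(p):
    """Return -log10(p), the quantity plotted on the y-axis of Figure 1."""
    return -math.log10(p)


# Larger scores correspond to smaller (more significant) P-values.
scores = {snp: neg_log10(p) for snp, p in p_values.items()}
strongest = max(scores, key=scores.get)
```

On this scale a P-value of 1.0 x 10^-4 sits at a height of 4, so the tallest point in a panel is the most significant SNP.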
- CNV analysis
  - QuantiSNP identified 17 putative CNVs in 7 AUTS1 regions and 6 putative CNVs in 5 AUTS5 regions.
  - No de novo events were detected.
  - A duplication of the EMID2 gene in AUTS1 was transmitted to the proband in 3/4 families (and 2/2 affected sibs).
  - An inherited duplication of IMMP2L-DOCK4 was detected (Figure 2A) and verified with qPCR and a genome-wide SNP platform (not shown). However, segregation analysis indicated that the other affected sib did not inherit this CNV.
Materials and methods
- SNP selection
  - Illumina GoldenGate 1536 arrays were designed for each locus, using Tagger (allowing aggressive haplotype tags), to optimally capture HapMap Phase II (v21a) common CEU genetic variation (MAF > 0.05) in all genes and other evolutionarily conserved sequences.
  - SNPs from a previous 1536 array for each locus (unpublished) and an Affymetrix 10K v2.0 linkage scan [6] were force-included in the selection procedure, such that the newly selected SNPs filled any gaps in our existing data.
  - Approximately 85% and 96% of intragenic variation was captured (r^2 > 0.8) with these combined sets of SNPs in AUTS1 and AUTS5, respectively.
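The r^2 > 0.8 criterion refers to pairwise linkage disequilibrium between a genotyped tag SNP and a captured variant. The Python sketch below shows how r^2 can be computed from haplotype and allele frequencies, and how a capture fraction follows from it. The function names and any frequencies passed in are hypothetical; this is not the Tagger implementation and does not use study data.

```python
def ld_r2(p_ab, p_a, p_b):
    """Squared correlation (r^2) between two biallelic SNPs.

    p_ab: frequency of the haplotype carrying allele A at SNP 1 and B at SNP 2
    p_a, p_b: frequencies of allele A (SNP 1) and allele B (SNP 2)
    """
    d = p_ab - p_a * p_b  # disequilibrium coefficient D
    return d * d / (p_a * (1 - p_a) * p_b * (1 - p_b))


def is_captured(r2, threshold=0.8):
    """Apply the r^2 > 0.8 capture criterion quoted in the text."""
    return r2 > threshold


def coverage(best_r2_values, threshold=0.8):
    """Fraction of variants whose best tag SNP exceeds the threshold."""
    captured = sum(is_captured(r2, threshold) for r2 in best_r2_values)
    return captured / len(best_r2_values)
```

For example, two SNPs whose minor alleles always occur on the same haplotype give r^2 = 1, whereas statistically independent SNPs give r^2 = 0.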
- Replication studies
  - Family-based analysis of 56 SNPs in two replication cohorts identified 11 SNPs with P < 0.05 (Table 1).
  - Only 3 of these SNPs were significant (P < 0.05) upon combined analysis of the replication cohorts, and 2 of these were associated with the opposite allele from the primary analysis.
  - The remaining SNP in DOCK4 (rs2217262, P = 9.2 x 10^-4) was significant after correction for multiple testing, with the A allele giving an odds ratio of 2.28.
  - QMPSF in 285 families detected a microdeletion of IMMP2L-DOCK4 that was transmitted to both affected boys. qPCR demonstrated that the DOCK4 deletion breakpoint lies between exons 14 and 31 (Figure 2B).
  - No structural variants encompassing IMMP2L-DOCK4 were seen in 475 UK controls.
- SNP genotyping
  - DNA was diluted to 100 ng/µl and analysed on the Illumina GoldenGate platform according to the manufacturer's instructions. BeadArrays were scanned using the BeadArray Reader at 532 nm and 647 nm. The samples typed consisted of:
    - 127/126 family trios, pre-selected for identity-by-descent sharing between affected sibs in the AUTS1/AUTS5 linkage regions.
    - 188 gender-matched controls from the European Collection of Cell Cultures.
  - 56 SNPs from AUTS1/AUTS5 with P < 0.005 in either TDT or Trend tests were selected for replication. These SNPs were genotyped in a cohort from the Netherlands (n = 96 families) plus an additional IMGSAC collection of samples (n = 294 families), using the Sequenom MassEXTEND platform.
Figure 2: SNP and qPCR data showing IMMP2L-DOCK4 CNVs. A, GoldenGate data for the proband of family 13-3023, showing 106-116 Mb of AUTS1 in the Illumina Genome Viewer. Increased log R ratios are indicated by arrows; B-allele frequencies consistent with AAB/ABB genotypes are boxed. SNPs within the boundaries of the detected region are shown in red. B, DOCK4/GAPDH copy number determined by qPCR for family 15-0084. The father (15-0084-001) was used as the non-deleted reference sample. A relative copy number of 0.7 was used as the threshold for identifying deletions.
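The deletion call in Figure 2B can be illustrated with a short Python sketch. The poster does not state how relative copy number was derived from the qPCR data; the sketch assumes the common 2^-ddCt method with equal amplification efficiency for both amplicons. All function names and Ct values are hypothetical. Only the 0.7 threshold, the DOCK4 target, the GAPDH control and the use of a non-deleted reference sample come from the caption.

```python
def relative_copy_number(ct_target, ct_control, ct_target_ref, ct_control_ref):
    """Relative copy number by the 2^-ddCt method (an assumption, see lead-in).

    ct_target, ct_control: Ct values for DOCK4 and GAPDH in the test sample
    ct_target_ref, ct_control_ref: the same two Ct values in the non-deleted
        reference sample (the father, in Figure 2B)
    """
    delta_sample = ct_target - ct_control
    delta_reference = ct_target_ref - ct_control_ref
    return 2.0 ** -(delta_sample - delta_reference)


def is_deletion(rcn, threshold=0.7):
    """Flag a deletion when relative copy number falls below the threshold."""
    return rcn < threshold
```

Under these assumptions, a heterozygous deletion delays the target amplicon by roughly one cycle relative to the reference sample, giving a relative copy number near 0.5, which falls below the 0.7 threshold.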
Statistical analysis
Analysis of population stratification was performed using STRUCTURE [7]. Single SNPs and SNP haplotypes were analysed using the standard transmission disequilibrium test (TDT) and the Cochran-Armitage Trend test. Because a higher proportion of families in the replication samples had missing parents (24% vs 7% in the primary cohort), association analysis of the replication samples was carried out using the UNPHASED application.
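For readers unfamiliar with the two single-SNP tests named above, the Python sketch below gives their textbook forms. These are the McNemar-type TDT statistic on transmissions from heterozygous parents, and the Cochran-Armitage trend statistic on case and control genotype counts with additive weights (0, 1, 2). It is an illustration only: the function names are hypothetical, the software used for the primary analysis is not specified on the poster, and the handling of missing parents (as in UNPHASED) is not reproduced.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def tdt(transmitted, untransmitted):
    """TDT statistic from counts of an allele transmitted / not transmitted
    by heterozygous parents to affected offspring."""
    b, c = transmitted, untransmitted
    stat = (b - c) ** 2 / (b + c)
    return stat, chi2_sf_1df(stat)


def trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts ordered (AA, AB, BB)."""
    n_cases, n_controls = sum(cases), sum(controls)
    n_total = n_cases + n_controls
    cols = [r + s for r, s in zip(cases, controls)]  # genotype column totals
    t = sum(w * (r * n_controls - s * n_cases)
            for w, r, s in zip(weights, cases, controls))
    var = (n_cases * n_controls / n_total) * (
        sum(w * w * n * (n_total - n) for w, n in zip(weights, cols))
        - 2 * sum(weights[i] * weights[j] * cols[i] * cols[j]
                  for i in range(3) for j in range(i + 1, 3))
    )
    stat = t * t / var
    return stat, chi2_sf_1df(stat)
```

Both functions return the chi-square statistic and its P-value on 1 degree of freedom. For instance, `tdt(60, 40)` gives a statistic of 4.0, with a P-value just under 0.05.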
Table 1: Family-based association analysis of replication samples. C.I., 95% confidence interval. Associations with the opposite allele from the primary analysis are flagged by an asterisk. Bonferroni correction was carried out for the combined replication samples, and the one significant result is shown in bold.
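The Bonferroni correction mentioned in the caption can be checked with a few lines of Python. The sketch uses figures stated elsewhere on the poster: 28 AUTS1 replication SNPs and P = 9.2 x 10^-4 for rs2217262. It also assumes a conventional family-wise alpha of 0.05, which the poster does not state explicitly. The variable and function names are illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


alpha = 0.05          # assumed family-wise error rate
n_tests = 28          # AUTS1 replication SNPs, as stated in the Conclusions
p_rs2217262 = 9.2e-4  # combined replication P-value for rs2217262

threshold = bonferroni_threshold(alpha, n_tests)
adjusted_p = min(1.0, p_rs2217262 * n_tests)  # Bonferroni-adjusted P-value
significant = p_rs2217262 < threshold
```

Under these assumptions the per-test threshold is about 0.0018 and the adjusted P-value is about 0.026, so the association remains significant after correction.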
- CNV analysis
  - Final reports were generated, combining B-allele frequency and log R ratio data from both GoldenGate arrays at each locus, using build 36 genome coordinates. No-calls were deleted.
  - These files were run on QuantiSNP [8] using the settings L = 1M, array type = 100k, maxcopy = 4, GC correction = ON.
  - Samples with high CNV counts (> 95th centile) were removed. CNVs with a log Bayes factor of less than 10 were then eliminated from further analysis.
  - Segregation was determined by qPCR and haplotype analysis using data from a previous Affymetrix 10K study [6].
  - Quantitative Multiplex PCR of Short Fluorescent Fragments (QMPSF) was used for the replication cohort.
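The two filtering steps described above (sample-level removal, then call-level removal) can be sketched in Python as follows. The record layout, function names and example data are hypothetical and do not reflect the QuantiSNP output format. The nearest-rank definition of the 95th centile is also an assumption, since the poster does not say how the centile was computed.

```python
from collections import Counter


def nearest_rank_percentile(values, pct):
    """Nearest-rank percentile; pct is an integer between 1 and 100."""
    ordered = sorted(values)
    rank = (pct * len(ordered) + 99) // 100  # integer ceiling of pct * n / 100
    return ordered[rank - 1]


def filter_cnv_calls(sample_ids, calls, centile=95, min_log_bf=10.0):
    """Drop samples with CNV counts above the centile, then drop weak calls.

    calls: list of dicts with keys 'sample' and 'log_bf' (hypothetical layout).
    """
    counts = Counter({s: 0 for s in sample_ids})  # include samples with no calls
    counts.update(c["sample"] for c in calls)
    cutoff = nearest_rank_percentile(list(counts.values()), centile)
    kept_samples = {s for s in sample_ids if counts[s] <= cutoff}
    return [c for c in calls
            if c["sample"] in kept_samples and c["log_bf"] >= min_log_bf]


# Hypothetical data: 19 samples with one call each, one sample with ten calls.
sample_ids = [f"S{i:02d}" for i in range(20)]
calls = [{"sample": s, "log_bf": 12.0} for s in sample_ids[:19]]
calls[0]["log_bf"] = 5.0  # a weak call that the Bayes factor filter removes
calls += [{"sample": "S19", "log_bf": 30.0} for _ in range(10)]
filtered = filter_cnv_calls(sample_ids, calls)
```

In this toy data set the sample with ten calls is removed as an outlier regardless of how strong its individual calls are, which mirrors the order of the two steps in the text.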
Conclusions
- A high-density SNP association study of AUTS1 and AUTS5 identified several novel genes showing genetic association in the primary IMGSAC autism cohort, including PLXNA4, IMMP2L, DOCK4, ZNF533 and NOSTRIN.
- In the replication study, the only SNP to show consistent association was rs2217262 in the first intron of DOCK4. This association retained significance after correction for multiple testing of the 28 AUTS1 replication SNPs.
- High-density SNP data from the GoldenGate platform can be used to interrogate the genome for CNVs.
- Although the region is heavily populated with CNVs in the Database of Genomic Variants (DGV), the overtransmission of EMID2 duplications in this sample suggests this locus may warrant further study.
- The coincident SNP association and CNV findings at the IMMP2L-DOCK4 locus are of particular interest.
- Unlike IMMP2L, the 3' end of DOCK4 is not represented in the DGV. Together with its recently described role in dendrite morphogenesis [9], this makes DOCK4 an excellent candidate gene for further genetic and functional analyses.
References
1. IMGSAC (1998). A full genome screen for autism with evidence for linkage to a region on chromosome 7q. Hum Mol Genet 7, 571-578.
2. Trikalinos TA, et al. (2006). A heterogeneity-based genome search meta-analysis for autism-spectrum disorders. Mol Psychiatry 11, 29-36.
3. Bonora E, et al. (2005). Mutation screening and association analysis of six candidate genes for autism on chromosome 7q. Eur J Hum Genet 13, 198-207.
4. Buxbaum JD, et al. (2001). Evidence for a susceptibility gene for autism on chromosome 2 and for genetic heterogeneity. Am J Hum Genet 68, 1514-1520.
5. Bacchelli E, et al. (2003). Screening of nine candidate genes for autism on chromosome 2q reveals rare nonsynonymous variants in the cAMP-GEFII gene. Mol Psychiatry 8, 916-924.
6. AGP (2007). Mapping autism risk loci using genetic linkage and chromosomal rearrangements. Nat Genet 39, 319-328.
7. Falush D, et al. (2003). Inference of population structure using multilocus genotype data: linked loci and correlated allele frequencies. Genetics 164, 1567-1587.
8. Colella S, et al. (2007). QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. NAR 35, 2013-2025.
9. Ueda S, et al. (2008). Dock4 regulates dendritic development in hippocampal neurons. J Neurosci Res 86, 3052-3061.
Acknowledgements
This work was funded by the NLM Family Foundation, the Simons Foundation, the EC 6th FP AUTISM MOLGEN and Telethon-Italy. Thanks to Gaby Barnby, Joseph Trakalo, Chris Allan and Laura Winchester for genotyping and QuantiSNP analysis, Tom Scerri for help with SNP selection and iPLEX design, and Erik Mulder for providing samples for replication analysis.