Title: Comparative network analysis of neurological disorders focuses the genome-wide search for autism genes
1Comparative network analysis of neurological
disorders focuses the genome-wide search for
autism genes.
Dennis P. Wall, PhD
Center for Biomedical Informatics
dpwall_at_hms.harvard.edu
http://wall.hms.harvard.edu
2Outline
- Rationale and Biological Significance (30 mins)
- Present status (5 mins)
- Project Plan (25 mins)
3Introduction
- Polygenic / Multigenic
- Many genes have been linked to autism
- Few genes have been replicated across studies
- Difficult for a single researcher to grasp the
complexity of the autism gene landscape
4Statistics: U.S. number of cases 1992-2006
http://www.fightingautism.org
5Behavioral overlap with other disorders
Schizophrenia
Angelman
Autism
Epilepsy
Fragile X
Seizure Disorder
Rett Syndrome
Mental Retardation
Tuberous Sclerosis
Others??
6Approach
- Build the network of all genes implicated in Autism to date
- Conduct large comparative analysis of Autism and other neurological disorders at the level of genes, biological processes, and networks
- Leverage existing research on Autism-related disorders to find new genetic leads
7Building Gene Lists for All Neurological
Disorders (433)
[Figure: gene lists are assembled per disorder (e.g. Ataxia, Epilepsy, Asperger, Fragile X, Tourette's, OCD, Autism) from gene-disease sources.]
Disease source: NINDS
Disease gene databases: OMIM, GeneCards
8Autism Cluster
[Figure: binary matrix relating Genes to Disorders (e.g. 1100100101, 1110101011, 1001010100, 1001011101, ..., 1101011101), used to identify the Autism Cluster.]
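The comparison behind a gene-by-disorder matrix can be sketched as below. This is a minimal illustration, assuming Jaccard similarity between disorder gene sets. The slides do not say which similarity measure or clustering method was used, and the gene names and lists are hypothetical toy values.

```python
# Hypothetical toy gene lists; real lists would come from the gene-disease sources.
disorder_genes = {
    "Autism": {"G1", "G2", "G3", "G4"},
    "Rett Syndrome": {"G2", "G3", "G5"},
    "Fragile X": {"G1", "G6"},
    "Unrelated Disorder": {"G7", "G8"},
}

def jaccard(a, b):
    """Jaccard similarity: shared genes divided by all genes in either set."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def similarity_to(target, gene_sets):
    """Rank every other disorder by gene-set similarity to the target disorder."""
    scores = {
        name: jaccard(gene_sets[target], genes)
        for name, genes in gene_sets.items()
        if name != target
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

ranking = similarity_to("Autism", disorder_genes)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Disorders that score high against Autism would be candidates for the same cluster; a threshold or a hierarchical clustering step would be needed to draw the actual cluster boundary.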
9Network Construction
- Data derived from STRING (http://string.embl.de/)
- Integration of p-p interaction (interactome), co-expression (transcriptome), orthology (orthologome), text (bibliome), and other lines of evidence
- Focus on creating a network of possible interactions within a normal cell using classification methods (random forests)
10Random Forest Decision
[Figure: for a gene pair (A, B), lines of evidence (Correlated Expression, Sequence coEvolution, P-P Interaction, Text (aka Bibliome)) are scored across sources D1-D5 as a feature vector (e.g. 1,0,2,1,0); decision trees over D1-D5 return a Yes/No call.]
Text (aka Bibliome) example: "FXYD1 is identified as a MeCP2 target gene whose de-repression may directly contribute to Rett syndrome neuronal pathogenesis"
http://www.ploscompbiol.org/article/info:doi/10.1371/journal.pcbi.0030043
11Networks for all Autism Cluster (AC) disorders (N = nodes, E = edges)
Fragile X (97N/100E)
Hypoxia (586N/4359E)
Microcephaly (135N/166E)
Rett (48N/74E)
Tuberous Sclerosis (110N/204E)
Angelman (51N/57E)
Inf. Hypotonia (29N/16E)
Mental Retardation (573N/1035E)
Hypotonia (154N/208E)
Autism (145N/164E)
Ataxia (428N/1489E)
Spasticity (62N/40E)
Seizure Disorder (35N/13E)
Asperger (15N/9E)
autworks.hms.harvard.edu
12Multi-disorder component of autism (MDAG)
- 66 out of 127 genes are involved in at least one other member of the autism cluster
- Highly connected component of the autism network
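One way to check both statements on this slide is sketched below: pick out the autism-network genes that also appear in other cluster disorders, then compare their connectivity to the remaining genes. The edge list, gene names, and the use of mean degree as the connectivity measure are assumptions made for illustration only.

```python
# Hypothetical autism network edges and genes reported for other cluster disorders.
autism_edges = [("A1", "A2"), ("A1", "A3"), ("A2", "A3"), ("A3", "A4"), ("A5", "A6")]
other_disorder_genes = {"A1", "A2", "A3", "X9"}

def degrees(edges):
    """Count the number of interaction partners of every gene in the network."""
    deg = {}
    for u, v in edges:
        deg[u] = deg.get(u, 0) + 1
        deg[v] = deg.get(v, 0) + 1
    return deg

def mean_degree(genes, deg):
    """Average degree over a non-empty collection of genes."""
    return sum(deg[g] for g in genes) / len(genes)

deg = degrees(autism_edges)
mdag = {g for g in deg if g in other_disorder_genes}  # multi-disorder genes
rest = set(deg) - mdag                                # autism-only genes

print(sorted(mdag), mean_degree(mdag, deg), mean_degree(rest, deg))
```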
14Significantly enriched MDAG processes
Cell Proliferation: P = 2.7E-02
CNS Development: P = 3.29E-11
Ion Transport: P = 7.68E-10
Synaptic Transmission: P = 2.45E-04
- Fisher's exact test
- Bonferroni adjustment
- 14648 biological processes from Gene Ontology tested
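The enrichment test named on this slide can be sketched as a one-sided Fisher's exact test (the upper tail of the hypergeometric distribution) followed by a Bonferroni correction. The function names and example counts are hypothetical; only the number of tested processes (14648) comes from the slide, and the code does not attempt to reproduce the reported P values.

```python
from math import comb

def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact test for over-representation.

    k: genes in the gene set annotated to the process
    n: size of the gene set
    K: genes in the background annotated to the process
    N: size of the background
    """
    total = comb(N, n)
    upper = min(n, K)
    # Sum the hypergeometric probabilities of seeing k or more annotated genes.
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total

def bonferroni(p, n_tests):
    """Bonferroni adjustment: multiply by the number of tests, cap at 1."""
    return min(1.0, p * n_tests)

N_PROCESSES = 14648  # number of Gene Ontology processes tested (from the slide)

# Hypothetical counts: 12 of 66 genes annotated, 300 of 20000 in the background.
p_raw = fisher_enrichment_p(12, 66, 300, 20000)
p_adj = bonferroni(p_raw, N_PROCESSES)
print(p_raw, p_adj)
```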
15Process-Driven Predictions
[Figure: Autism Cluster Disorders are linked through enriched Biological Processes to Putative New Genes.]
Autism Cluster Disorders: Fragile X, Tuberous Sclerosis, Seizure Disorder, Mental Retardation
Biological Processes: CNS Development, Synaptic Transmission, Ion Transport, Cell Proliferation
Putative New Genes: 64 new genes, all of which occur in 2 or more of the Autism Cluster Disorders
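A process-driven prediction step of this kind might look like the sketch below. It keeps genes that are not yet autism candidates, are annotated to an enriched process, and appear in at least two cluster disorders. The gene names, annotations, and function name are hypothetical; the process names and the two-disorder cutoff follow the slide.

```python
ENRICHED = {"CNS Development", "Synaptic Transmission", "Ion Transport", "Cell Proliferation"}

# Hypothetical toy inputs.
autism_genes = {"G1"}
disorder_genes = {
    "Fragile X": {"G2", "G3", "G4"},
    "Tuberous Sclerosis": {"G2", "G4"},
    "Seizure Disorder": {"G3", "G5"},
}
gene_processes = {
    "G1": {"CNS Development"},
    "G2": {"Ion Transport"},
    "G3": {"Lipid Metabolism"},
    "G4": {"Synaptic Transmission"},
    "G5": {"CNS Development"},
}

def process_driven_candidates(autism, disorders, annotations, enriched, min_disorders=2):
    """Return {gene: [disorders]} for new genes in enriched processes."""
    candidates = {}
    for gene, processes in annotations.items():
        if gene in autism or not (processes & enriched):
            continue  # skip known autism genes and genes outside enriched processes
        hits = sorted(d for d, genes in disorders.items() if gene in genes)
        if len(hits) >= min_disorders:
            candidates[gene] = hits
    return candidates

candidates = process_driven_candidates(autism_genes, disorder_genes, gene_processes, ENRICHED)
print(candidates)
```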
16Experimental Validation
- GEO6575 (from UC Davis M.I.N.D. institute)
- White blood cell Affymetrix U133plus2.0
- 17 samples of autistic children without regression
- 18 children with regression
- 9 children with mental retardation or developmental delay
- 12 typically developing children from the general population
17Blood for Brain
18Autism without regression (17)
Autism with regression (18)
19Experimental Validation
- GEO6575 (from U.C. Davis M.I.N.D. institute)
- White blood cell Affymetrix U133plus2.0
- 17 samples of autistic patients without regression
- 18 patients with regression
- 9 patients with mental retardation or developmental delay
- 12 typically developing children from the general population
20Data-driven approach to FDR detection can be ineffective
- Standard data-driven application of false discovery rate (FDR) control yields few genes below an FDR threshold of 0.05 (with these data, only 2 genes survive)
- This is common when the signal is weak and the background noise is large (e.g. microarray experiments)
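The effect described on this slide can be illustrated with the Benjamini-Hochberg procedure. The choice of procedure is an assumption, since the slides do not name the FDR method, and the p values below are made up. The same raw p value fails the 0.05 threshold when adjusted against a genome-wide list but passes when adjusted against a short, knowledge-driven candidate list.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical data: one weak true signal (p = 1e-4) among uninformative genes.
genome_wide = [1e-4] + [0.5] * 9999   # data-driven: test every probe
focused = [1e-4] + [0.5] * 49         # knowledge-driven: test 50 candidates

print(benjamini_hochberg(genome_wide)[0])  # adjusted against 10000 tests
print(benjamini_hochberg(focused)[0])      # adjusted against 50 tests
```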
21Results of process-driven search
- 43 process-derived gene predictions had FDR-adjusted p values < 0.05
- Highly significant rate of validation: 65% of predictions confirmed by expression data
22Network-Driven Predictions
23Results of network-driven search
- 267 occurred in 1 autism cluster disorder
- 58 occurred in 2
- 17 in 3
- 3 in 4 sibling disorders
- A total of 345 new predictions
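A tally like the one on this slide could be produced as sketched below. The sketch assumes that a network-driven prediction is a gene that directly interacts with a known autism gene inside a sibling-disorder network; the slides do not spell out the exact rule, and the networks and gene names here are hypothetical.

```python
from collections import Counter

# Hypothetical toy inputs.
autism_genes = {"A1", "A2"}
disorder_networks = {
    "Rett Syndrome": [("A1", "N1"), ("N1", "N2")],
    "Fragile X": [("A2", "N1"), ("A1", "N3")],
    "Tuberous Sclerosis": [("N4", "N5")],
}

def network_driven_predictions(autism, networks):
    """Map each new gene to the set of disorders in which it neighbors an autism gene."""
    support = {}
    for disorder, edges in networks.items():
        for u, v in edges:
            for known, partner in ((u, v), (v, u)):
                if known in autism and partner not in autism:
                    support.setdefault(partner, set()).add(disorder)
    return support

support = network_driven_predictions(autism_genes, disorder_networks)
# Number of predicted genes by how many sibling disorders support them.
tally = Counter(len(disorders) for disorders in support.values())
print(dict(tally), sum(tally.values()))
```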
24Results of network-driven search
- 301 had FDR-adjusted p values < 0.05
- 90% (!) of predictions verified by expression data
25Prior knowledge focuses whole-genomic search
- 43 process-derived gene predictions had FDR-adjusted p values < 0.05 (65%)
- 301 network-derived gene predictions had FDR-adjusted p values < 0.05 (90%)
The rate of validation in both cases is significantly non-random
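One simple way to ask whether a validation rate is non-random is a binomial tail test against an assumed background rate. The slides do not state how significance was assessed, so both the test and the background rate of 0.5 below are assumptions. The counts (301 validated out of 345 network-derived predictions) are taken from the slides.

```python
from math import comb

def binomial_tail(successes, trials, p_null):
    """P(X >= successes) for X ~ Binomial(trials, p_null)."""
    return sum(
        comb(trials, k) * p_null**k * (1 - p_null) ** (trials - k)
        for k in range(successes, trials + 1)
    )

# Assumed background: half of randomly chosen genes would pass the expression test.
ASSUMED_BACKGROUND_RATE = 0.5
p_network = binomial_tail(301, 345, ASSUMED_BACKGROUND_RATE)
print(p_network)
```

A more faithful null would estimate the background rate from the array itself, for example by drawing random gene sets of the same size and counting how many pass the same FDR threshold.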
26Top 20 genes occurring in 3 or more Autism
Sibling Disorders
For many of these candidates, their roles in
neurological impairment have been studied in
autism cluster disorders, but not in autism.
27Molecular Triangulation
Disorders: Mental Retardation, Fragile X, Hypotonia, Ataxia, Hypoxia, Microcephaly, Rett Syndrome, Spasticity, Tuberous Sclerosis
Genes: AR, SLC16A2, L1CAM, OPHN1, FXN, MYO5A, SLC6A8, FLNA, PAFAH1B1
28Conclusions
- Previous research has implicated between 100 and 1500 genes as contributors to the molecular physiology of Autism
- Our knowledge-driven approach provides a logical means to filter the genome-wide search
29Conclusions
- Global ask swamped by noisy signal
- Informed, knowledge-driven ask results in biologically significant gene predictions
- Comparative analysis of Autism with related neurological disorders provides a focused search for novel gene candidates
30Autworks
- Autworks is a web-driven navigation system that allows any researcher to view and search through the network of genes implicated in autism and related neurological disorders
- Built to aid and abet the role of serendipity and inspiration for researchers working on autism and other complex neuro diseases
- http://autworks.hms.harvard.edu
31Autworks now
32The Plan
- Bring our analytical strategies and Autworks to the cloud
- Beef up underbelly using AWS storage and the Amazon Turkforce
- Scale up comparative network analysis
- Enlarge validation database, verify/re-verify computational predictions, robustify the candidates
33Aim 1: Build the neurological disease gene core of the Autworks relational database
Can be queried with a disease or gene term
34Aim 1 Steps
- (1) Extract the entire set of neurological disorders listed by NINDS (currently 433) to ensure that we can find any and all commonalities to Autism.
- (2) Mine all databases in the above table that can be searched using a disease term as the query, specifically the Online Mendelian Inheritance in Man (OMIM), GeneCards, Chromosomal Variation in Man, the Human Gene Mutation Database (HGMD), and SNPedia.
- (3) Combine and import the features from each of the online resources into a relational database that will become the backend of Autworks, being careful to remove any redundancies.
- (4) Cross-reference resources to comprehensively populate the data model.
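Steps (3) and (4) can be sketched with an in-memory SQLite database. The schema, table names, and records below are hypothetical illustrations, not the actual Autworks data model. Uniqueness constraints stand in for the redundancy removal described in step (3), and the two query helpers show lookups by disease term and by gene term.

```python
import sqlite3

SCHEMA = """
CREATE TABLE gene (id INTEGER PRIMARY KEY, symbol TEXT UNIQUE NOT NULL);
CREATE TABLE disease (id INTEGER PRIMARY KEY, name TEXT UNIQUE NOT NULL);
CREATE TABLE gene_disease (
    gene_id INTEGER NOT NULL REFERENCES gene(id),
    disease_id INTEGER NOT NULL REFERENCES disease(id),
    source TEXT NOT NULL,
    UNIQUE (gene_id, disease_id, source)
);
"""

def build_gene_core(records):
    """Load (gene symbol, disease, source) records, ignoring redundant entries."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    for symbol, disease, source in records:
        symbol = symbol.upper()  # normalize case so Shank3 and SHANK3 merge
        conn.execute("INSERT OR IGNORE INTO gene (symbol) VALUES (?)", (symbol,))
        conn.execute("INSERT OR IGNORE INTO disease (name) VALUES (?)", (disease,))
        conn.execute(
            "INSERT OR IGNORE INTO gene_disease (gene_id, disease_id, source) "
            "SELECT g.id, d.id, ? FROM gene g, disease d "
            "WHERE g.symbol = ? AND d.name = ?",
            (source, symbol, disease),
        )
    conn.commit()
    return conn

def genes_for(conn, disease):
    """Query by disease term."""
    rows = conn.execute(
        "SELECT DISTINCT g.symbol FROM gene g "
        "JOIN gene_disease gd ON gd.gene_id = g.id "
        "JOIN disease d ON d.id = gd.disease_id "
        "WHERE d.name = ? ORDER BY g.symbol",
        (disease,),
    )
    return [r[0] for r in rows]

def diseases_for(conn, symbol):
    """Query by gene term."""
    rows = conn.execute(
        "SELECT DISTINCT d.name FROM disease d "
        "JOIN gene_disease gd ON gd.disease_id = d.id "
        "JOIN gene g ON g.id = gd.gene_id "
        "WHERE g.symbol = ? ORDER BY d.name",
        (symbol.upper(),),
    )
    return [r[0] for r in rows]

# Illustrative records only, not actual query results from these resources.
conn = build_gene_core([
    ("SHANK3", "Autism", "OMIM"),
    ("Shank3", "Autism", "OMIM"),       # redundant entry, ignored
    ("SHANK3", "Autism", "GeneCards"),
    ("MECP2", "Rett Syndrome", "OMIM"),
])
print(genes_for(conn, "Autism"), diseases_for(conn, "mecp2"))
```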
35Gene-disease data model Gene Core
This data model will share much in common with the Variome Project's database
36MeSH Major Topics
[Figure: abstracts are selected by MeSH Major Topics (MeSH term filtered), then passed through GeneTagger (candidate gene filtered).]
PMID 17304222: We identified an important component for controlled actin assembly, abelson interacting protein-1 (Abi-1), as a binding partner for the postsynaptic density (PSD) protein ProSAP2/Shank3. During early neuronal development, Abi-1 is localized in neurites and growth cones; at later stages, the protein is enriched in dendritic spines and PSDs
PMID 17173049: SHANK3 (also known as ProSAP2) regulates the structural organization of dendritic spines and is a binding partner of neuroligins; genes encoding neuroligins are mutated in autism and Asperger syndrome. Here, we report that a mutation of a single copy of SHANK3 on chromosome 22q13 can result in language and/or social communication disorders...
Can we Turkify this process???
Annotator checks accuracy through the BioNotate system
Results: Gene-Gene and Gene-Disease corpora
- Gene-Gene: ABI1, Shank3
- Gene-Disease: Shank3, Autism
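The extraction step on this slide can be approximated by sentence-level co-occurrence, sketched below. This is a simplification and not the GeneTagger or BioNotate implementation. The alias and disease lexicons are hypothetical, and the two sentences are taken from the abstracts quoted on the slide. Co-occurrence over-generates, which is why a human annotator check follows in the pipeline.

```python
import re
from itertools import combinations

# Hypothetical lexicons mapping surface forms to canonical names.
GENE_ALIASES = {"abi-1": "ABI1", "abi1": "ABI1", "shank3": "SHANK3", "prosap2": "SHANK3"}
DISEASE_TERMS = {"autism": "Autism", "asperger syndrome": "Asperger Syndrome"}

def tag(sentence, lexicon):
    """Return canonical names whose surface forms occur as whole tokens."""
    lowered = sentence.lower()
    found = set()
    for term, canonical in lexicon.items():
        pattern = r"(?<![a-z0-9])" + re.escape(term) + r"(?![a-z0-9])"
        if re.search(pattern, lowered):
            found.add(canonical)
    return found

def extract_relations(text):
    """Collect gene-gene and gene-disease pairs that share a sentence."""
    gene_gene, gene_disease = set(), set()
    for sentence in re.split(r"(?<=\.)\s+", text):
        genes = tag(sentence, GENE_ALIASES)
        diseases = tag(sentence, DISEASE_TERMS)
        gene_gene.update(combinations(sorted(genes), 2))
        gene_disease.update((g, d) for g in genes for d in diseases)
    return gene_gene, gene_disease

text = (
    "We identified an important component for controlled actin assembly, "
    "abelson interacting protein-1 (Abi-1), as a binding partner for the "
    "postsynaptic density (PSD) protein ProSAP2/Shank3. "
    "SHANK3 (also known as ProSAP2) regulates the structural organization of "
    "dendritic spines and is a binding partner of neuroligins; genes encoding "
    "neuroligins are mutated in autism and Asperger syndrome."
)
gene_gene, gene_disease = extract_relations(text)
print(gene_gene, gene_disease)
```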
37Aim 2: Build interaction network cores for Autworks
38Network core
[Figure: evidence sources (GO, Co-Ex, P-P intx, Bibliome, Phylo-profiles) feed a Classifier that produces the Interaction Core for each disorder (e.g. Ataxia, Mental Retardation, Autism).]
Can we cloud it up???
39Aim 3: Comparative network analysis on the cloud
- Find disease filtered interacting partners
- Find shortest paths between candidates
- Find minimal subnetworks
- Verify and reconstruct networks appropriately
Autism
Schizophrenia
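The shortest-path step listed above can be sketched with a breadth-first search over an unweighted interaction network. The edge list and gene names are placeholders, and treating every interaction as equal weight is an assumption; a network with confidence scores on its edges would call for a weighted method such as Dijkstra's algorithm.

```python
from collections import deque

def build_adjacency(edges):
    """Turn an undirected edge list into a {gene: set(neighbors)} map."""
    adjacency = {}
    for u, v in edges:
        adjacency.setdefault(u, set()).add(v)
        adjacency.setdefault(v, set()).add(u)
    return adjacency

def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if source not in adjacency or target not in adjacency:
        return None
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:  # walk back to the source
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):  # sorted for reproducible output
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)
    return None

# Placeholder network: two routes from GENE_A to GENE_C plus a detached pair.
edges = [
    ("GENE_A", "GENE_B"), ("GENE_B", "GENE_C"),
    ("GENE_A", "GENE_D"), ("GENE_D", "GENE_E"), ("GENE_E", "GENE_C"),
    ("GENE_X", "GENE_Y"),
]
network = build_adjacency(edges)
print(shortest_path(network, "GENE_A", "GENE_C"))
```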
40Genetic Landscape of Autism
Mental Retardation
Rett Syndrome
Angelman Syndrome
41Autism Diseaseome
42Acknowledgments
- Zak Kohane
- Matt Huyck
- Tom Monaghan
- Todd DeLuca
- Nieves Mendizabel
- Paco Esteban
- Joaquin Goni
- Alal Eran
- Michal Galdzicki
- Lou Kunkel
- Alexa McCray
- Leon Peshkin