BioRDF Breakout - PowerPoint PPT Presentation

About This Presentation
Title:

BioRDF Breakout

Description:

BioRDF Breakout. Introduction: Kei Cheung; MAGE-TAB: Michael Miller; vOID: Jun Zhao (remote); aTag: Matthias Samwald (remote); Discussion: All

Provided by: kc24
Learn more at: https://www.w3.org

Transcript and Presenter's Notes

1
BioRDF Breakout
  • Introduction: Kei Cheung
  • MAGE-TAB: Michael Miller
  • vOID: Jun Zhao (remote)
  • aTag: Matthias Samwald (remote)
  • Discussion: All

2
BioRDF Breakout Microarray Use Case
  • Kei Cheung, Ph.D.
  • Associate Professor
  • Yale Center for Medical Informatics

HCLS IG Face-to-Face Meeting, Santa Clara,
California, November 2-3, 2009
3
Introduction
  • Whole-genome expression profiling has created a
    revolution in the way we study disease and basic
    biology.
  • DNA microarrays allow scientists to quantify
    thousands of genomic features in a single
    experiment.
  • Since 1997, the number of published results based
    on an analysis of gene expression microarray data
    has grown from 30 to over 5,000 publications per
    year.
  • Major public microarray data repositories have
    been created in different countries (e.g., NCBI
    GEO, EBI ArrayExpress, and CIBEX).

4
Microarray Workflow
5
An Example of differentially expressed genes
6
Importance of Integrating Microarray Data
  • Due to the high cost and low reproducibility of
    many microarray experiments, it is not surprising
    to find a limited number of patient samples in
    each study.
  • Very few marker genes are identified in common
    across different studies involving patients with
    the same disease.
  • It is of great interest, and a challenge, to merge
    data sets from multiple studies to increase the
    sample size, which may in turn increase the power
    of statistical inferences.
  • The integration of external information resources
    is essential in interpreting intrinsic patterns
    and relationships in large-scale gene expression
    data.
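The merging step in the third bullet can be sketched in a few lines. This is a minimal illustration, not the method of any study cited in this deck: it assumes each study is a mapping from gene identifier to a list of per-sample expression values, keeps only the genes measured in both studies, and standardizes each gene within its own study before pooling so that platform-specific scales do not dominate. The gene names and values are invented for the example.

```python
from statistics import mean, pstdev

def zscore(values):
    """Standardize one gene's values within a single study."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant gene carries no signal; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

def merge_studies(study_a, study_b):
    """Pool samples from two studies over the genes they share."""
    shared = sorted(set(study_a) & set(study_b))
    return {g: zscore(study_a[g]) + zscore(study_b[g]) for g in shared}

# Hypothetical toy data: gene -> expression values, one per sample.
study_a = {"GENE1": [1.0, 2.0, 3.0], "GENE2": [5.0, 5.0, 5.0]}
study_b = {"GENE1": [10.0, 20.0], "GENE3": [7.0, 8.0]}

merged = merge_studies(study_a, study_b)
```

Only GENE1 survives the merge here, with five pooled samples instead of three or two. Real cross-platform integration usually needs probe-to-gene mapping and batch correction as well, which this sketch leaves out.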

7
Microarray Data Standards
  • MGED
  • MIAME
  • MAGE-ML
  • MAGE-TAB

8
Some Examples
  • Joint analysis of two microarray gene-expression
    data sets to select lung adenocarcinoma marker
    genes (Jiang et al. 2004 BMC Bioinformatics)
  • Large-scale integration of cancer microarray data
    identifies a robust common cancer signature (Xu
    et al. 2007 BMC Bioinformatics)
  • What about neurosciences?

9
Access to and Use of Microarray data in
Neuroscience
  • NIH Neuroscience Microarray Consortium
  • Public repositories such as GEO and ArrayExpress
    (including data generated from neuroscience
    microarray experiments)
  • Brain atlases (e.g., Allen Brain Atlas and GenSAT)

10
Ontology-Based Integration
Microarray experiment 1
Microarray experiment 2
11
Example Federated Queries
  • Retrieve a list of differentially expressed genes
    between different brain regions (e.g.,
    hippocampus and entorhinal cortex) for normally
    aged human subjects.
  • Retrieve a list of differentially expressed genes
    for the same brain region of normal human
    subjects and AD patients.
  • Using these lists of genes one can issue
    (federated) queries to retrieve additional
    information about the genes for various types of
    analyses (e.g., GO term enrichment).
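As a rough sketch of the third bullet, the snippet below builds a federated query that starts from a gene list and asks a second endpoint for annotations, using the SERVICE keyword from SPARQL 1.1 Federated Query. Every IRI and property name (the example.org endpoint, ex:hasMember, ex:annotatedWith) is a placeholder assumed for illustration; none of them comes from the presentation or from a real vocabulary.

```python
# All IRIs below are placeholders under example.org. They are
# assumptions for illustration, not real endpoints or vocabularies.
GENE_LIST_IRI = "http://example.org/genelists/hippocampus-vs-entorhinal"
ANNOTATION_ENDPOINT = "http://example.org/sparql/go-annotations"

def build_federated_query(gene_list_iri, annotation_endpoint):
    """Return a SPARQL query joining a local gene list with remote annotations."""
    return f"""
PREFIX ex: <http://example.org/vocab#>
SELECT ?gene ?goTerm
WHERE {{
  <{gene_list_iri}> ex:hasMember ?gene .
  SERVICE <{annotation_endpoint}> {{
    ?gene ex:annotatedWith ?goTerm .
  }}
}}
""".strip()

query = build_federated_query(GENE_LIST_IRI, ANNOTATION_ENDPOINT)
print(query)
```

The local pattern selects the genes in the list, and the SERVICE block delegates the annotation lookup to the remote endpoint, so the join happens on ?gene.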

12
Microarray Experiment Descriptions
E-GEOD-3296: Transcription profiling of primary
mouse embryonic fibroblasts (MEFs) from
C57Bl/6x129/Sv F2 e14.5 embryos that contain a
deletion in the CH1 domain of three of four
alleles of CBP and p300. The CH1 protein
interaction domain of the transcriptional
coactivators p300 and CBP is thought to interact
with HIF-1alpha, and this interaction is thought
to be critical to the expression of HIF-1alpha
target genes in response to hypoxia. Trichostatin
A (TSA), an inhibitor of histone deacetylases,
has been reported to repress the expression of
HIF-1alpha target genes. To test the requirement
of the CH1 domain and TSA for gene expression in
response to dipyridyl (a hypoxia mimetic),
primary mouse embryonic fibroblasts (MEFs) were
generated from C57Bl/6x129/Sv F2 e14.5 embryos
that contain a deletion in the CH1 domain of
three of four alleles of CBP and p300. The
remaining allele of p300 or CBP was a conditional
knockout allele. Control MEFs with only a single
conditional knockout allele of p300 or CBP were
also generated. At passage 3, MEFs were infected
with Cre adenovirus and grown until they had
expanded at least 100-fold. Subconfluent MEFs
were treated with ethanol vehicle or 100 ng/ml TSA
with 5% carbon dioxide at 37 °C in a humid chamber
for 30 min, followed by ethanol vehicle or 100
µM dipyridyl (DP) for an additional 3 hours.
Immediately after treatment, cells were lysed in
Trizol for RNA extraction.

E-GEOD-3327: Transcription profiling of different
regions of mouse brain to study adult mouse gene
expression patterns in common strains. Experiment
overall design: six mouse strains and seven brain
regions were analyzed.

E-GEOD-358: Transcription profiling of rat whole
brain samples from animals with repeated exposure
to the anaesthetic isoflurane. 12 controls, 3 with
5 exposures, and 3 with 10 exposures. Rats were
exposed to 90 minutes of 1.0% isoflurane twice a
day for a total of 5 or 10 exposures. Animals did
not require intubation. All exposures and
hybridizations were performed at the Univ. of
Pennsylvania.
13
Open Biomedical Annotator
14
Some Results
  • Two microarray experiments (E-GEOD-4034,
    E-GEOD-4035) contain the following set of terms:
    fear, hippocampus, mouse.
  • These microarray experiments study the role of
    hippocampus in fear using mouse as the model.
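The kind of result described above, finding experiments whose descriptions share a set of terms, can be approximated with a simple sketch. This is plain whole-word matching, an assumed stand-in for illustration only; it is not how the Open Biomedical Annotator works, since that service maps text to ontology concepts. The experiment IDs and descriptions below are invented.

```python
import re

def terms_in(text, vocabulary):
    """Return the vocabulary terms that occur as whole words in text."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return {term for term in vocabulary if term in words}

def experiments_with_all(descriptions, required):
    """Return IDs of experiments whose description mentions every required term."""
    required = set(required)
    return sorted(
        exp_id
        for exp_id, text in descriptions.items()
        if required <= terms_in(text, required)
    )

# Hypothetical descriptions, invented for illustration.
descriptions = {
    "EXP-1": "Fear conditioning and gene expression in mouse hippocampus",
    "EXP-2": "Gene expression across mouse brain regions",
    "EXP-3": "Hippocampus of the mouse after a fear memory task",
}

hits = experiments_with_all(descriptions, ["fear", "hippocampus", "mouse"])
```

An ontology-based annotator would also catch synonyms and more specific terms, which plain string matching misses.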

15
Analysis tools
  • BioConductor
  • GenePattern
  • GeneSpring

16
Intercommunity collaboration
  • HCLS (BioRDF)
  • MGED (ArrayExpress)
  • NIF (NeuroLex)
  • Ontology community (NCBO)

17
Web of silos
18
Semantic Web: Brilliant Web!
19
The End
20
Discussion
  • What should be the RDF structure?
  • Extension of SPARQL to empower data analysis
  • Workflow and provenance
  • Visualization
  • How to integrate database and literature
  • Integration of other types of data
  • Inter-community collaboration
  • Translational use cases

21
What should be the RDF structure?
  • Experiments
  • Samples
  • Experimental conditions/factors
  • Gene lists
  • Arrays/chips
  • Raw/processed data (e.g., CEL, GPR, gene matrix)
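One way to make the question concrete is to write a few triples linking the entities in this list. The sketch below is only one possible shape, offered for discussion rather than as a proposed model: the namespace and every class and property name (hasSample, hasFactorValue, hybridizedTo, hasGeneList, derivedFrom) are hypothetical and do not come from MAGE-TAB or any published vocabulary. Triples are held as plain tuples and serialized in N-Triples style to avoid any library dependency.

```python
EX = "http://example.org/microarray#"  # hypothetical namespace
RDF_TYPE = "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type>"

def iri(local):
    """Wrap a local name in the hypothetical namespace as an IRI term."""
    return f"<{EX}{local}>"

def literal(text):
    """Format a plain string literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

# Experiment -> sample -> factor value and array; gene list -> processed data.
triples = [
    (iri("experiment1"), RDF_TYPE, iri("Experiment")),
    (iri("experiment1"), iri("hasSample"), iri("sample1")),
    (iri("sample1"), RDF_TYPE, iri("Sample")),
    (iri("sample1"), iri("hasFactorValue"), literal("hippocampus")),
    (iri("sample1"), iri("hybridizedTo"), iri("array1")),
    (iri("array1"), RDF_TYPE, iri("Array")),
    (iri("experiment1"), iri("hasGeneList"), iri("geneList1")),
    (iri("geneList1"), iri("derivedFrom"), iri("processedData1")),
]

def to_ntriples(rows):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in rows)

print(to_ntriples(triples))
```

Whether factor values should be literals, as here, or ontology term IRIs is exactly the kind of design decision this slide raises.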

22
Extension of SPARQL
  • Hierarchical queries
  • Statistical analyses/tests
  • Enrichment analysis
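To show what an enrichment extension would have to compute, here is a minimal sketch of a one-sided hypergeometric test, a common choice for GO term enrichment. Standard SPARQL offers no such function, and the presentation does not say which statistic is intended, so the choice of test and the function name are assumptions.

```python
from math import comb

def enrichment_p_value(population, annotated, selected, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap).

    population: number of genes on the array
    annotated:  genes in the population carrying the term
    selected:   size of the differentially expressed gene list
    overlap:    selected genes that carry the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, selected - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers: 10 genes, 5 annotated, 5 selected, all 5 selected are annotated.
p = enrichment_p_value(10, 5, 5, 5)
```

In this toy case there is exactly one way out of 252 to pick all five annotated genes, so p equals 1/252. In practice the p-values would also need correction for testing many terms at once.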

23
Workflow and provenance
  • Taverna
  • BioMoby
  • GenePattern

24
Visualization
  • Cytoscape
  • TreeView

25
How to integrate database and literature
26
Inter-community Collaboration
  • NCBO
  • SWAN

27
What other types of data can be integrated with
microarray data?
28
Translational use cases