Title: THE INTERSPACE PROTOTYPE: An Analysis Environment for Semantic Interoperability
1 BeeSpace: Interactive Functional Analysis of Arthropod Genome Data
Bruce Schatz
Institute for Genomic Biology
University of Illinois at Urbana-Champaign
schatz_at_uiuc.edu, www.beespace.uiuc.edu
2nd Arthropod Genomics Symposium, Kansas City, April 11, 2008
3 BeeSpace FIBR Project
- The BeeSpace project is an NSF FIBR flagship
- Frontiers in Integrative Biological Research
- $5M for 5 years at University of Illinois
- Analyzing Nature and Nurture in Societal Roles, using the honey bee as model
- (Functional Analysis of Social Behavior)
- Genomic technologies in wet lab and dry lab
- Bee (Biology): gene expressions
- Space (Informatics): concept navigations
4 System Architecture
5 Informatics: From Bases to Spaces
- Data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by Gene Ontology and linked to biological literature
- Information Spaces support biological literature
- e.g. BeeSpace uses automatically generated conceptual relationships to navigate functions
7 Gene Summary (FlyBase)
8 Gene Summary (BeeSpace)
- Structured summary consists of relevant sentences covering 6 aspects of a gene
- Gene Products (GP)
- Expression Location (EL)
- Sequence Information (SI)
- Wild-type Function and Phenotypic Information (WFPI)
- Mutant Phenotype (MP)
- Genetical Interaction (GI)
9 Software Overview
- Two-stage summarization system
- Retrieve relevant sentences about gene
- Automated gene name recognition
- Synonym expansion using known synonym tables from FlyBase (MODs) and Entrez
- Extract most informative and relevant sentences for each of the 6 aspects
- Categorize relevant sentences about a gene into 6 predefined aspects
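The two stages can be illustrated with a minimal sketch. Everything below is an assumption for illustration: the synonym table, the per-aspect cue words, and the keyword-count scoring are hypothetical stand-ins, not the trained models or the FlyBase/Entrez tables the Gene Summarizer actually uses.

```python
import re

# Hypothetical synonym table; the real system draws on FlyBase (MODs) and Entrez.
SYNONYMS = {"Abl": ["Abl", "Abelson", "Abl tyrosine kinase"]}

# Hypothetical cue words per aspect, standing in for a trained classifier.
ASPECT_CUES = {
    "GP": ["encodes", "protein", "kinase"],
    "EL": ["expressed", "localized", "embryo"],
    "SI": ["sequence", "exon", "bp"],
    "WFPI": ["required", "function", "role"],
    "MP": ["mutant", "mutation", "allele"],
    "GI": ["interacts", "suppressor", "enhancer"],
}


def retrieve(gene, sentences):
    """Stage 1: keep sentences that mention the gene or a known synonym."""
    names = SYNONYMS.get(gene, [gene])
    alternation = "|".join(re.escape(n) for n in names)
    pattern = re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


def score(sentence, cues):
    """Count how many times the cue words occur in the sentence."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return sum(words.count(c) for c in cues)


def summarize(gene, sentences):
    """Stage 2: pick the best-scoring retrieved sentence for each aspect."""
    relevant = retrieve(gene, sentences)
    summary = {}
    for aspect, cues in ASPECT_CUES.items():
        best = max(relevant, key=lambda s: score(s, cues), default=None)
        # Leave an aspect empty when no retrieved sentence matches any cue.
        if best is not None and score(best, cues) > 0:
            summary[aspect] = best
    return summary
```

Calling `summarize("Abl", sentences)` on a list of abstract sentences returns a dictionary keyed by aspect code, with aspects omitted when no retrieved sentence matches.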
10 Gene Summarizer
- The generated summaries are directly useful to biologists, and also serve as entry points that let them quickly navigate the relevant literature via the BeeSpace analysis environment, available at www.beespace.uiuc.edu
11 Drosophila gene Abelson (Abl) tyrosine kinase
12 Tribolium gene Scr
13 Problems in Summarizing Organisms
- Lack of high-quality example sentences: training sentences are sentences written by the FlyBase curators to explain their database decisions, not sentences from articles.
- Domain bias: only sentences about Drosophila melanogaster are used for training the GS. Differences across organisms (e.g., terminology, writing styles) are not utilized in the current implementation.
14 Gene Summarizer New Aspects
- New categories (proposed by FlyBase curators)
- GP, SI -> PS (protein domain or structure)
- SI -> HO (homologs or orthologs)
- EL -> EP (spatial/temporal expression patterns)
- SI -> RE (regulatory element information)
- WFPI, MP -> PF (wild-type or mutant phenotype and function)
- GI -> IT (genetic or physical interaction)
- New (beyond FlyBase) -> PG (population genetics)
- Utilize cross-domain information for improving the GS on other organisms.
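The category remapping above can be written down as a lookup table. This is only a sketch: the dictionary layout and function names are hypothetical, and it assumes that an entry listing two old aspects (such as GP and SI for PS) means either old aspect may feed the new category.

```python
# Old aspect code -> new category codes it may map to, transcribed from the slide.
OLD_TO_NEW = {
    "GP": ["PS"],
    "SI": ["PS", "HO", "RE"],
    "EL": ["EP"],
    "WFPI": ["PF"],
    "MP": ["PF"],
    "GI": ["IT"],
}

# Category with no counterpart among the original six aspects.
NEW_ONLY = ["PG"]


def candidate_categories(old_aspect):
    """Return the new categories a sentence with this old label could fall under."""
    return OLD_TO_NEW.get(old_aspect, [])


def all_new_categories():
    """List every new category once, in order of first appearance."""
    seen = []
    for categories in OLD_TO_NEW.values():
        for category in categories:
            if category not in seen:
                seen.append(category)
    return seen + NEW_ONLY
```

A table like this makes the one-to-many case explicit: sentences previously labelled SI would need relabelling into PS, HO, or RE before they could serve as training data for the new categories.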
16 Trained Adaptability
- V3.5 trained on curator sentences (FlyBase)
- V3.6 trained on biologist sentences (BeetleBase)
- New specific Training Sessions incorporated
- BeetleBase (KSU) on beetle sentences
- Bee Lab (UIUC) on bee/beetle/fly sentences
- Abstracts from Medline and Biosis
- Believe this training suffices for ArthropodSpace!
17 Concept Navigation in BeeSpace
18 Analysis Environment Features
- SPACE is a Paradigm, not a Metaphor!
- Point of View for YOUR Problem
- Externally
- -Dynamically describe custom Region of Space
- -Merge Regions to form Hypothesis Space
- -Differentially express genes against Space
19 Analysis Environment System
- Concepts and Genes are Universal Entities!
- Uniformly Represented
- Uniformly Manipulated
- Internally
- -Extract and Index Concepts within Collections
- -Navigate Concepts within Documents
- -Follow Genes from Documents into Databases
20 BeeSpace v3.5 Session
- Refining and Merging Regions of Space
- Cross bee species differential gene expression for behavioral maturation into adult forager
- Functional Analysis for Similar Situation!
- Behavioral Maturation merge
- Cross-Species Comparisons merge
- Different Taxa (insect, fish, bird, rodent)
21 BeeSpace Semantic Operations
- Merge (S1,S2) into S3
- Summarize (S) into Gene classify
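As a rough illustration of the Merge operation, a space can be modeled as a named set of document identifiers, with merging taken to be set union. Both the `Space` type and the union semantics are assumptions made for this sketch; the slides do not specify how BeeSpace represents or combines spaces internally.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Space:
    """Hypothetical model of a space: a name plus the documents it covers."""
    name: str
    doc_ids: frozenset


def merge(s1, s2, name=None):
    """Merge (S1, S2) into S3, assuming a merge is the union of the documents."""
    merged_name = name or f"{s1.name}+{s2.name}"
    return Space(merged_name, s1.doc_ids | s2.doc_ids)
```

For example, merging a "behavior" space holding documents {1, 2} with a "maturation" space holding {2, 3} yields a third space covering {1, 2, 3}, which can then be refined or merged again.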
38 New Interface v4
- Single Window, Multiple Panes
- Email Model: Folders, Lists, Messages
- Coordinated Levels of Resolution
- SPACES: merge, switch
- MAPS: topic, cluster, ontology
- DOCUMENTS: summarize, annotate
40 Future Work
- Release v4 to Arthropod Base Consortium
- Target is September 2008 (pre-release during summer)
- Streamlined Interface within Single Window
- Space Navigation
- Topic Maps
- Gene Summarizer
- Gene Annotator
- Seek Website links from Bases!!
- Letters for continued Funding: USDA, NSF, NIH
41 Towards the Interspace
- The Analysis Environment technology is GENERAL!
- BirdSpace? BeeSpace?
- PigSpace? CowSpace?
- ArthropodSpace?
- BioSpace?
42 Faculty Investigators
- BeeSpace at Institute for Genomic Biology
- University of Illinois at Urbana-Champaign
- Biology
- Gene Robinson, Entomology (behavioral expression)
- Susan Fahrbach, Wake Forest (anatomical localization)
- Sandra Rodriguez-Zas, Animal Sciences (data analysis)
- Informatics
- Bruce Schatz, Medical Information Science (systems)
- ChengXiang Zhai, Computer Science (text analysis)
- Chip Bruce, Library Information Science (users)
43 Informatics Researchers I
- Bruce Schatz (faculty), systems
- ChengXiang Zhai (faculty), algorithms
- David Arcoleo (staff), research programmer
- Barry Sanders (staff), interface designer
- Moushumi Sen Sarma (postdoc), biology user
- Jim Buell (staff), project coordinator
- Faculty Collaborators
- Saurabh Sinha (CS), Sheng Zhong (BIOENG)
44 Informatics Researchers II
- Bioinformatics students
- Xu Ling: gene summarizer (cf. FlyBase)
- Xin He: gene annotator (meta-analysis)
- Yanen Li: gene classification (ontology)
- Computer Science students
- Yue Lu: entity recognizer (relations)
- Peixiang Zhao: concept switching (subgraphs)
- Yuanhua Lv: user personalization (spaces)
- Informatics students
- Brant Chee: semantic clusterer (small worlds)