THE INTERSPACE PROTOTYPE: An Analysis Environment for Semantic Interoperability




1
BeeSpace
  • Interactive Functional Analysis of Arthropod Genome Data

Bruce Schatz
Institute for Genomic Biology, University of Illinois at Urbana-Champaign
schatz_at_uiuc.edu, www.beespace.uiuc.edu
2nd Arthropod Genomics Symposium, Kansas City, April 11, 2008
2
(No Transcript)
3
BeeSpace FIBR Project
  • BeeSpace project is an NSF FIBR flagship
    (Frontiers in Integrative Biological Research),
    $5M for 5 years at University of Illinois
  • Analyzing Nature and Nurture in Societal Roles
    using the honey bee as a model
  • (Functional Analysis of Social Behavior)
  • Genomic technologies in wet lab and dry lab
  • Bee: Biology, gene expressions
  • Space: Informatics, concept navigations

4
System Architecture
  • (Architecture diagram; labels: BeeSpace, Concepts, SEQ, Expressions,
    Databases, Bees, Flies, Documents, Community)

5
Informatics: From Bases to Spaces
  • data Bases support genome data
  • e.g. FlyBase has sequences and maps;
    genes are annotated by Gene Ontology and
    linked to biological literature
  • information Spaces support biological literature
  • e.g. BeeSpace uses automatically generated
    conceptual relationships to navigate functions

6
  • v3.6

7
Gene Summary (FlyBase)
8
Gene Summary (BeeSpace)
  • Structured summary consists of relevant sentences
    covering 6 aspects of a gene
  • Gene Products (GP)
  • Expression Location (EL)
  • Sequence Information (SI)
  • Wild-type Function and Phenotypic Information
    (WFPI)
  • Mutant Phenotype (MP)
  • Genetical Interaction (GI)

9
Software Overview
  • Two-stage summarization system
  • Retrieve relevant sentences about a gene
  • Automated gene name recognition
  • Synonym expansion using known synonym tables from
    FlyBase (MODs) and Entrez
  • Extract most informative and relevant sentences
    for each of the 6 aspects
  • Categorize relevant sentences about a gene into 6
    predefined aspects
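
A minimal sketch of the two stages listed above, not the BeeSpace implementation. The synonym table, the cue words per aspect, and the function names `retrieve` and `categorize` are all hypothetical; the slides say only that synonyms come from known tables and that sentences are sorted into 6 predefined aspects. Simple cue-word counting stands in for whatever trained scoring the real system uses.

```python
import re

# Hypothetical synonym table; real tables would come from FlyBase or Entrez.
SYNONYMS = {"Abl": ["Abl", "Abelson", "Abl tyrosine kinase"]}

# Hypothetical cue words per aspect (assumption: the real system learns
# its scoring from training sentences rather than a fixed word list).
ASPECT_CUES = {
    "GP": ["protein", "kinase", "encodes"],
    "EL": ["expressed", "expression", "embryo"],
    "SI": ["sequence", "domain", "homolog"],
    "WFPI": ["required", "function", "role"],
    "MP": ["mutant", "mutation", "lethal"],
    "GI": ["interacts", "suppressor", "enhancer"],
}


def retrieve(gene, sentences):
    """Stage 1: keep sentences that mention the gene or a known synonym."""
    names = SYNONYMS.get(gene, [gene])
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"\b(" + alternatives + r")\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


def categorize(sentences):
    """Stage 2: for each aspect, keep the sentence with the most cue words."""
    summary = {}
    for aspect, cues in ASPECT_CUES.items():
        best, best_score = None, 0
        for sentence in sentences:
            words = re.findall(r"[a-z]+", sentence.lower())
            score = sum(words.count(cue) for cue in cues)
            if score > best_score:
                best, best_score = sentence, score
        if best is not None:
            summary[aspect] = best
    return summary


corpus = [
    "Abl encodes a tyrosine kinase protein.",
    "Abelson mutant embryos are lethal.",
    "The weather was fine.",
]
summary = categorize(retrieve("Abl", corpus))
```

Aspects with no matching sentence are simply left out of the summary, so a sparse literature yields a partial summary rather than an error.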

10
Gene Summarizer
  • The generated summaries are directly useful to
    biologists, and also serve as entry points to
    enable them to quickly navigate the relevant
    literature, via the BeeSpace analysis
    environment available at www.beespace.uiuc.edu

11
Drosophila gene Abelson (Abl) tyrosine kinase
12
Tribolium gene Scr
13
Problems in Summarizing Organisms
  • Lack of high-quality example sentences: the training
    sentences are written by the FlyBase curators to
    explain their database decisions; they are not
    sentences from articles.
  • Domain bias: only sentences about Drosophila
    melanogaster are used for training the Gene Summarizer (GS).
    Differences across organisms, e.g. terminology and
    writing styles, are not utilized in the current
    implementation.

14
Gene Summarizer New Aspects
  • New categories (proposed by FlyBase curators)
  • GP + SI => PS (protein domain or structure)
  • SI => HO (homologs or orthologs)
  • EL => EP (spatial/temporal expression patterns)
  • SI => RE (regulatory element information)
  • WFPI + MP => PF (wild-type or mutant phenotype
    and function)
  • GI => IT (genetic or physical interaction)
  • New (beyond FlyBase) => PG (population genetics)
  • Utilize cross-domain information for improving
    the GS on other organisms.
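
The category list above can be written down as a small lookup table. This is a sketch under one assumption: each line is read as "older aspects feed a new category", which is an interpretation of the slide rather than something it states outright. The names `NEW_CATEGORIES` and `new_categories_for` are hypothetical.

```python
# New category code -> (description, older aspects it draws on), as listed
# on the slide. The direction of the mapping is an assumption.
NEW_CATEGORIES = {
    "PS": ("protein domain or structure", ["GP", "SI"]),
    "HO": ("homologs or orthologs", ["SI"]),
    "EP": ("spatial/temporal expression patterns", ["EL"]),
    "RE": ("regulatory element information", ["SI"]),
    "PF": ("wild-type or mutant phenotype and function", ["WFPI", "MP"]),
    "IT": ("genetic or physical interaction", ["GI"]),
    "PG": ("population genetics", []),  # new, no older counterpart
}


def new_categories_for(old_aspect):
    """Return the new category codes that draw on a given older aspect."""
    return [
        code
        for code, (_description, sources) in NEW_CATEGORIES.items()
        if old_aspect in sources
    ]
```

Under this reading, Sequence Information (SI) is the aspect that splits the most, feeding three of the new categories.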

15
(No Transcript)
16
Trained Adaptability
  • v3.5: trained on curator sentences (FlyBase)
  • v3.6: trained on biologist sentences (BeetleBase)
  • New specific Training Sessions incorporated
  • BeetleBase (KSU) on beetle sentences
  • Bee Lab (UIUC) on bee/beetle/fly sentences
  • Abstracts from Medline and Biosis
  • Believe this training suffices for ArthropodSpace!

17
Concept Navigation in BeeSpace
18
Analysis Environment Features
  • SPACE is a Paradigm not a Metaphor!
  • Point of View for YOUR Problem
  • Externally
  • -Dynamically describe custom Region of Space
  • -Merge Regions to form Hypothesis Space
  • -Differentially express genes against Space

19
Analysis Environment System
  • Concepts and Genes are Universal Entities!
  • Uniformly Represented
  • Uniformly Manipulated
  • Internally
  • -Extract and Index Concepts within Collections
  • -Navigate Concepts within Documents
  • -Follow Genes from Documents into Databases

20
BeeSpace v3.5 Session
  • Refining and Merging Regions of Space
  • Cross bee species differential gene expression
    for behavioral maturation into adult forager
  • Functional Analysis for Similar Situation!
  • Behavioral Maturation merge
  • Cross-Species Comparisons merge
  • Different Taxa (insect, fish, bird, rodent)

21
BeeSpace Semantic Operations
  • Extract (S) into R
  • Map (R) into S
  • Merge (S1,S2) into S3
  • Summarize (S) into Gene classify
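
A minimal sketch of how the first three operations might compose, not the BeeSpace implementation. The slide gives only the operation names and the letters S and R; reading S as a space (an indexed document collection) and R as a region (a set of documents), with Extract going from a space to a region and Map from a region back to a space, is an assumption. All class and function names are hypothetical, and Summarize is left out because the slide does not say enough about it.

```python
from dataclasses import dataclass, field


@dataclass
class Space:
    """Assumed model: a document collection plus a term index over it."""
    docs: dict                                  # doc id -> text
    index: dict = field(default_factory=dict)   # term -> set of doc ids


def build_space(docs):
    """Index every whitespace-separated, lowercased term in each document."""
    index = {}
    for doc_id, text in docs.items():
        for term in set(text.lower().split()):
            index.setdefault(term, set()).add(doc_id)
    return Space(docs=dict(docs), index=index)


def extract(space, term):
    """Extract: space -> region (ids of documents mentioning the term)."""
    return set(space.index.get(term.lower(), set()))


def map_region(space, region):
    """Map: region -> space (re-index only the documents in the region)."""
    return build_space({doc_id: space.docs[doc_id] for doc_id in region})


def merge(space_a, space_b):
    """Merge: (S1, S2) -> S3, the union of both document collections."""
    return build_space({**space_a.docs, **space_b.docs})


literature = build_space({
    1: "forager bee behavior",
    2: "nurse bee brain",
    3: "fly wing",
})
bee_space = map_region(literature, extract(literature, "bee"))
fly_space = map_region(literature, extract(literature, "fly"))
merged = merge(bee_space, fly_space)
```

Because every operation returns either a plain set of ids or a freshly indexed space, the results can be fed straight back into the same operations, which is one way to read the "refine and merge" sessions described on the earlier slides.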

22-36
(No Transcript)
37
  • v4

38
New Interface v4
  • Single Window, Multiple Panes
  • Email Model: Folders, Lists, Messages
  • Coordinated Levels of Resolution
  • SPACES: merge, switch
  • MAPS: topic, cluster, ontology
  • DOCUMENTS: summarize, annotate

39
(No Transcript)
40
Future Work
  • Release v4 to Arthropod Base Consortium
  • Target is September 2008 (pre-release during
    summer)
  • Streamlined Interface within Single Window
  • Space Navigation
  • Topic Maps
  • Gene Summarizer
  • Gene Annotator
  • Seek Website links from Bases!!
  • Letters for continued Funding: USDA, NSF, NIH

41
Towards the Interspace
  • The Analysis Environment technology is GENERAL!
  • BirdSpace? BeeSpace?
  • PigSpace? CowSpace?
  • ArthropodSpace?
  • BioSpace?

42
Faculty Investigators
  • BeeSpace at Institute for Genomic Biology
  • University of Illinois at Urbana-Champaign
  • Biology
  • Gene Robinson, Entomology (behavioral
    expression)
  • Susan Fahrbach, Wake Forest (anatomical
    localization)
  • Sandra Rodriguez-Zas, Animal Sciences (data
    analysis)
  • Informatics
  • Bruce Schatz, Medical Information Science
    (systems)
  • ChengXiang Zhai, Computer Science (text
    analysis)
  • Chip Bruce, Library Information Science (users)

43
Informatics Researchers I
  • Bruce Schatz (faculty), systems
  • ChengXiang Zhai (faculty), algorithms
  • David Arcoleo (staff), research programmer
  • Barry Sanders (staff), interface designer
  • Moushumi Sen Sarma (postdoc), biology user
  • Jim Buell (staff), project coordinator
  • Faculty Collaborators
  • Saurabh Sinha (CS), Sheng Zhong (BIOENG)

44
Informatics Researchers II
  • Bioinformatics students
  • Xu Ling: gene summarizer (cf. FlyBase)
  • Xin He: gene annotator (meta-analysis)
  • Yanen Li: gene classification (ontology)
  • Computer Science students
  • Yue Lu: entity recognizer (relations)
  • Peixiang Zhao: concept switching (subgraphs)
  • Yuanhua Lv: user personalization (spaces)
  • Informatics students
  • Brant Chee: semantic clusterer (smallworlds)