Title: BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior
1 BeeSpace: An Interactive Environment for Functional Analysis of Social Behavior
- Bruce Schatz, Principal Investigator
- Graduate School of Library and Information Science (GSLIS)
- Department of Computer Science, Program in Neuroscience
- schatz_at_uiuc.edu, www.canis.uiuc.edu
- Theme for Genomics of Neural and Behavioral Plasticity
- www.beespace.uiuc.edu
- IGB Thematic Research Seminar, November 2, 2004
2 Bee Counted: Vote Today!
3 BeeSpace FIBR Project
- BeeSpace project is an NSF FIBR flagship
- Frontiers in Integrative Biological Research
- $5M for 5 years at University of Illinois
- Nature-Nurture using honey bee as model
- Genome technologies in wet lab and dry lab biology
- Localized Gene Expression for Normal Social Behavior
- Gene Robinson, Entomology (behavioral expressions)
- Susan Fahrbach, Entomology (anatomical localization)
- Sandra Rodriguez-Zas, Animal Sciences (data analysis)
- Interactive Information System for Functional Analysis
- Bruce Schatz, Library and Information Science (info systems)
- ChengXiang Zhai, Computer Science (text analysis)
- Chip Bruce, Library and Information Science (user support)
4 Post-Genome Informatics
- Classical Organisms have extensive Genetic Descriptions
- There will be NO more classical organisms beyond
- Mice and Men other than Worms and Flies, Yeasts and Weeds.
- So must use comparative genomics to classical organisms,
- Via sequence homologies and literature analysis.
- Automatic annotation of genes to standard classifications,
- Such as Gene Ontology via sequence homology.
- Automatic analysis of functions to scientific literature,
- Such as concept spaces via text mining.
- Descriptions in Literature MUST be used for future
- interactive environments for functional analysis!
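The idea of annotating genes through sequence homology can be illustrated with a minimal sketch. It assumes that similarity hits against a model organism have already been computed and that each model-organism gene carries a set of Gene Ontology terms. The function name, the input format, the gene names, and the e-value threshold are all hypothetical, chosen only for illustration; they are not taken from the BeeSpace project.

```python
from collections import defaultdict

def transfer_annotations(hits, go_terms, max_evalue=1e-10):
    """Copy GO terms from model-organism genes to homologous query genes.

    hits: iterable of (query_gene, model_gene, evalue) tuples, for example
          parsed from a sequence-similarity search (hypothetical format).
    go_terms: dict mapping model_gene -> set of GO term labels.
    max_evalue: hits weaker than this threshold are ignored (assumed value).
    """
    annotations = defaultdict(set)
    for query_gene, model_gene, evalue in hits:
        if evalue <= max_evalue:
            annotations[query_gene] |= go_terms.get(model_gene, set())
    return dict(annotations)

# Toy data, invented for illustration only.
hits = [
    ("bee_gene_1", "fly_gene_A", 1e-40),
    ("bee_gene_1", "fly_gene_B", 1e-3),   # too weak, ignored
    ("bee_gene_2", "fly_gene_B", 1e-25),
]
go_terms = {
    "fly_gene_A": {"learning or memory"},
    "fly_gene_B": {"circadian rhythm", "locomotory behavior"},
}
annotated = transfer_annotations(hits, go_terms)
```

In this toy run, `bee_gene_1` inherits only the terms of its strong hit, while the weak hit contributes nothing.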
5 Informational Science
- Computational Science is widely accepted as the
- Third Branch of Science (beyond Experimental and Theoretical)
- Genes are Computed, Proteins are Computed,
- Sequence equivalences are Computed.
- Informational Science is coming to be accepted as the
- Fourth Branch of Science
- Based on Information Science technologies for
- Functional Mining of Information Sources
- Comparative Analysis within the
- Dry Lab of Biological Knowledge
6 Conceptual Navigation in BeeSpace
7 Biology: The Model Organism
- The Western Honey Bee, Apis mellifera
- has become a primary model for social behavior
- Complex social behavior in controllable urban environment
- Normal Behavior: honey bees live in the wild
- Controllable Environment: hives can be modified
- Small size manageable with current genomic technology
- Capture bees on-the-fly during normal behavior
- Record gene expressions for whole-brain or brain-region
8 Informatics: From Bases to Spaces
- data Bases support genome data
- e.g. FlyBase has sequences and maps
- Genes annotated by Gene Ontology and linked to literature
- BeeBase (Christine Elsik, Texas A&M)
- Uses computed homologies to annotate genes
- information Spaces support biomedical literature
- e.g. BeeSpace uses automatically generated
- conceptual relationships to navigate functions
9 BeeSpace Software Environment
- Will build a Concept Space of Biomedical Literature for Functional Analysis of Bee Genes
- -Partition Literature into Community Collections
- -Extract and Index Concepts within Collections
- -Navigate Concepts within Documents
- -Follow Links from Documents into Databases
- Locate Candidate Genes in Related Literatures, then
- follow links into Genome Databases
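The four steps on this slide (partition, index, navigate, follow links) can be sketched in a few lines. This is only an illustration under stated assumptions: the document records, community labels, concept lists, and `GENE:` identifiers are invented toy data, and the function names `build_indexes` and `candidate_links` are hypothetical rather than part of any BeeSpace software.

```python
from collections import defaultdict

# Toy records, invented for illustration: each document has a community
# label, a list of already-extracted concepts, and links to database entries.
documents = [
    {"id": "doc1", "community": "behavior",
     "concepts": ["foraging", "gene expression"], "links": ["GENE:0001"]},
    {"id": "doc2", "community": "behavior",
     "concepts": ["foraging", "division of labor"], "links": ["GENE:0002"]},
    {"id": "doc3", "community": "anatomy",
     "concepts": ["mushroom body", "gene expression"], "links": []},
]

def build_indexes(docs):
    """Partition documents by community and index concepts within each."""
    index = defaultdict(lambda: defaultdict(set))
    for doc in docs:
        for concept in doc["concepts"]:
            index[doc["community"]][concept].add(doc["id"])
    return index

def candidate_links(docs, index, community, concept):
    """Navigate concept -> documents -> database links."""
    doc_ids = index[community].get(concept, set())
    by_id = {doc["id"]: doc for doc in docs}
    return sorted(link for d in doc_ids for link in by_id[d]["links"])

index = build_indexes(documents)
links = candidate_links(documents, index, "behavior", "foraging")
```

Keeping one concept index per community collection means the same concept can lead to different documents, and so to different candidate genes, depending on which literature is being navigated.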
10 BeeSpace Software Implementation
- Natural Language Processing
- Identify noun phrases
- Recognize biological entities
- Statistical Information Retrieval
- Compute statistical contexts
- Support conceptual navigation
- Network Information System
- Concept switch across community collections
- Semantic Links into biological databases
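A statistical context for a concept can be approximated by counting which other concepts appear in the same documents. The sketch below assumes concepts (for example, noun phrases) have already been extracted for each document, and it uses a simple conditional co-occurrence weight. That weighting is a simplification chosen for illustration, not necessarily the statistic used in BeeSpace, and the sample documents are invented.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_weights(doc_concepts):
    """Return {(a, b): weight}, where weight is the fraction of documents
    containing concept a that also contain concept b."""
    doc_freq = Counter()
    pair_freq = Counter()
    for concepts in doc_concepts:
        unique = set(concepts)
        doc_freq.update(unique)
        for a, b in combinations(sorted(unique), 2):
            pair_freq[(a, b)] += 1
            pair_freq[(b, a)] += 1
    return {(a, b): n / doc_freq[a] for (a, b), n in pair_freq.items()}

def related(weights, concept, top=3):
    """List the concepts most strongly associated with `concept`."""
    neighbours = [(b, w) for (a, b), w in weights.items() if a == concept]
    return sorted(neighbours, key=lambda item: (-item[1], item[0]))[:top]

# Toy concept lists, one per document (invented for illustration).
docs = [
    ["foraging", "juvenile hormone", "gene expression"],
    ["foraging", "gene expression"],
    ["nursing", "juvenile hormone"],
]
weights = cooccurrence_weights(docs)
suggestions = related(weights, "foraging")
```

The weights are asymmetric: every document mentioning "nursing" also mentions "juvenile hormone", but only half of the "juvenile hormone" documents mention "nursing". Suggesting the strongest neighbours of a term is one simple way to support conceptual navigation.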
11 BeeSpace Information Sources
- Biomedical Literature
- Medline (medicine)
- Biosis (biology)
- Agricola, CAB Abstracts, Agris (agriculture)
- Model Organisms (heredity)
- -Gene Descriptions (FlyBase, WormBase)
- Natural Histories (environment)
- -BeeKeeping Books (Cornell Library, Harvard Press)
12 Worm Community System (1991)
- WCS Information Sources
- Literature: Biosis, Medline, newsletters, meetings
- Data: Genes, Maps, Sequences, strains, cells
- WCS Interactive Environment
- Browsing: search, navigation
- Filtering: selection, analysis
- Sharing: linking, publishing
- WCS: 250 users at 50 labs across Internet (1991)
- Flagship in NSF National Collaboratory program
13 WCS Molecular
14 WCS Cellular
15 WCS PPCS demo
16 Medical Concept Spaces (1998)
- Obtain discipline-scale collection
- Medline from NLM, 10M bibliographic abstracts
- human classification: Medical Subject Headings
- Partition discipline into Community Repositories
- 4 core terms per abstract for MeSH classification
- 32K nodes with core terms (classification tree)
- Community is all abstracts classified by core term
- 40M abstracts containing 280M concepts
- computation took 2 days on NCSA Origin 2000
- Simulating World of Medical Communities
- 10K repositories with > 1K abstracts (1K with > 10K)
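The partitioning step can be sketched as follows, assuming each abstract carries a short list of core MeSH terms and is filed under every one of them. The three abstracts and their term assignments are toy data. Reading the 40M figure as 10M abstracts times 4 core terms is an interpretation of the slide's numbers, not something the slide states explicitly.

```python
from collections import defaultdict

def partition_by_core_terms(abstracts):
    """Map each core term to the list of abstract ids classified under it."""
    repositories = defaultdict(list)
    for abstract_id, core_terms in abstracts.items():
        for term in core_terms:
            repositories[term].append(abstract_id)
    return dict(repositories)

# Toy data: abstract id -> core terms (assignments invented for illustration).
abstracts = {
    "a1": ["Arthritis, Rheumatoid", "Analgesics"],
    "a2": ["Analgesics", "Gastrointestinal Hemorrhage"],
    "a3": ["Arthritis, Rheumatoid"],
}
repos = partition_by_core_terms(abstracts)
total_memberships = sum(len(ids) for ids in repos.values())

# Scale check using the slide's figures, under the assumption above.
abstracts_in_medline = 10_000_000
core_terms_per_abstract = 4
memberships = abstracts_in_medline * core_terms_per_abstract
concepts_per_membership = 280_000_000 / memberships
```

Because an abstract appears once per core term, the repositories overlap. Under this reading the partitioned collection holds 40M abstract copies, and 280M concepts corresponds to about 7 concepts per copy.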
17 Navigation in MedSpace
- For a patient with Rheumatoid Arthritis
- Find a drug that reduces the pain (analgesic)
- but does not cause stomach (gastrointestinal) bleeding
Choose Domain
18 Concept Search
19 Concept Navigation
20 Retrieve Document
21 Biomedical Session
22 Categories and Concepts
23 Concept Switching
24 Document Retrieval
25 Biological Concept Spaces (2005)
- Compute concept spaces for All of Biology
- BioSpace across entire biomedical literature
- 50M abstracts across 50K repositories
- Use Gene Ontology to partition literature into
- biological communities for functional analysis
- GO is the same scale as MeSH, but is its coverage adequate?
- GO is light on social behavior (biological process)
26 Interactive Functional Analysis
- BeeSpace will enable users to navigate a uniform space of diverse databases and literature sources for hypothesis development and testing, with a software system that goes beyond a searchable database, using statistical literature analyses to discover functional relationships between genes and behavior.
- Genes to Behaviors
- Behaviors to Genes
- Concepts to Concepts
- Clusters to Clusters
- Navigation across Sources
27 BeeSpace Information Sources
- General for All Spaces
- Scientific Literature
- -Medline, Biosis, Agricola, Agris, CAB Abstracts
- -partitioned by organisms and by functions
- Model Organisms
- -Gene Descriptions (FlyBase, WormBase, MGI, SGD, TAIR)
- Special Sources for BeeSpace
- -Natural History Books (Cornell Library, Harvard Press)
28 XSpace Information Sources
- Organize Genome Databases (XBase)
- Compute Gene Descriptions from Model Organisms
- Partition Scientific Literature for Organism X
- Compute XSpace using Semantic Indexing Technology
- Boost the Functional Analysis from Special Sources
- Collecting Useful Data about Natural Histories
- e.g. CowSpace: Leverage in AIPL Databases
29 Beyond BeeSpace
- The Analysis Environment technology is GENERAL!
- BirdSpace? BehaviorSpace? BrainSpace? SoySpace? CowSpace? IGBSpace?
- BioSpace
- Internet will evolve into Interspace