Towards BioMedSpace: Infrastructure Integrative Biomedical Informatics - PowerPoint PPT Presentation

1
Towards BioMedSpace: Infrastructure Integrative
Biomedical Informatics
Bruce R. Schatz
Department of Medical Information Science
Institute for Genomic Biology
University of Illinois at Urbana-Champaign
schatz_at_uiuc.edu, www.canis.uiuc.edu
Center for Computational Medicine and Bioinformatics
University of Michigan, November 4, 2009
2
The Grand Vision
  • Schatz, Jan 1997

3
Linguistics Levels and Universal Units
  • 1985: Syntax, Files (wholes)
  • 1995: Structure, Objects (parts)
  • 2005: Semantics, Concepts (meaning)
  • 2015: Pragmatics, Features (reality)

4
Building Analysis Environments
  • Biomedical Informatics
    • Bioinformatics: Concept Summarization
    • Medinformatics: Cohort Identification
  • Integrative Biology
    • Federation: search similarity
    • Integration: navigation links

5
BeeSpace Goals
  • Bioinformatics Flagship within NSF Frontiers in
    Integrative Biological Research
  • BEE: BIOLOGY
    • Experimentally measure gene expression in the
      brain for important societal roles during normal
      behavior, varying heredity (nature) and
      environment (nurture)
  • SPACE: INFORMATICS
    • Interactively annotate functions for differential
      expression using concept-based navigation of
      biological literature and gene-based
      summarization analysis

6
Conceptual Navigation in BeeSpace
7
From Bases to Spaces
  • Comparative Genomics using Classical Models
  • data Bases support genome data
    • Sequence-based gene analysis using FlyBase
    • To standard classifications such as Gene Ontology
    • Based on manual annotation by human curators
  • information Spaces support biological information
    • Literature-based gene analysis using BeeSpace
    • To computed classifications via extracted
      concepts
    • Based on automatic annotation by conceptual
      relationships
  • Descriptions in Literature MUST be used in future
    interactive environments for functional analysis!

8
(No Transcript)
9
System Versions
  • V1 Filter: Concept Graph
    • Search, Expand, Merge, Switch, Visualize
  • V2 Cluster: Conceptual Groupings
    • Small Worlds (Natural), Language Model
      (Steerable), Concepts/Documents
  • V3 Summarize: Gene Descriptions
    • Gene Extraction, Sentence Classification
  • V4 Analyze: Functional Concepts
    • Concept Identification, Category Grouping
  • V5 Answer: Entity Relationships
    • Entities, Relations, Templates

10
Automatic Categorization v2
  • Sorting of Spaces based on Metadata
  • Sorting of Spaces based on Ontology
    • MeSH for Medline Abstracts
    • Gene Ontology computed for documents
  • Sorting of Spaces based on Clustering
    • Natural Maps from Small Worlds
    • Steerable Maps from Language Models
  • Semantic Indexing of Dynamic Spaces
  • Fast System enables Interactive Sorting!

11
Small World Graph
12
Semantics Deeper and Faster
  • Semantic Indexing across all of Medline
  • Previous Attempts used Word Co-Occurrence
  • Now Phrase Parser works general-purpose
  • Now Mutual Information full differential
  • Parallel Optimization of MI Graph
  • Real-time Computation Shared Memory Cluster
  • Interactive on our 16-PC, 256 GB RAM workerbee
  • Dynamic Spaces then Dynamic Semantic Indexing
  • Interactive Clustering Natural Map
  • Heuristic Approximation Small Worlds Graphs
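
The mutual-information indexing named on this slide can be illustrated with a small sketch. It is not the BeeSpace implementation: it assumes documents are already reduced to sets of phrases, and it uses pointwise mutual information over document co-occurrence as a stand-in for whatever measure the system computes. The function name `pmi_graph` and the toy documents are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations

def pmi_graph(docs, min_count=1):
    """Return {(a, b): pmi} for phrase pairs that co-occur in a document.

    docs is a list of phrase sets; probabilities are document frequencies.
    """
    n = len(docs)
    single = Counter()
    pair = Counter()
    for phrases in docs:
        unique = sorted(set(phrases))
        single.update(unique)
        # Sorted input means each pair key is in alphabetical order.
        pair.update(combinations(unique, 2))
    graph = {}
    for (a, b), c in pair.items():
        if c < min_count:
            continue
        # PMI = log2( P(a, b) / (P(a) * P(b)) )
        graph[(a, b)] = math.log2((c / n) / ((single[a] / n) * (single[b] / n)))
    return graph

# Hypothetical toy "space" of four documents.
docs = [
    {"foraging", "brain", "gene expression"},
    {"foraging", "gene expression"},
    {"brain", "nurse bee"},
    {"nurse bee", "hive"},
]
graph = pmi_graph(docs)
```

Pairs that co-occur more often than chance get a positive weight, and the weighted pairs form the edges of a concept graph that could then be clustered.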

13
Dynamic Clustering
  • Community Structure enables Dynamic Clustering
    with Large Vectors

14
Automatic Curation v3
  • Automatic Summarization of Genes
    • Retrieve relevant sentences about gene
    • Classify sentences into important aspects
      • protein domain, homolog/ortholog
      • expression pattern, phenotype and function
      • regulatory element, genetic interaction
  • Generalizing to Biology Entities
    • Genes, anatomical, behavior, chemical
    • Question answering from biology factoids
  • Computed Curation from Literature

15
Gene Summary (FlyBase)
16
Gene Summary (BeeSpace)
  • Structured summary consists of relevant sentences
    covering 6 aspects of a gene
    • Gene Products (GP)
    • Expression Location (EL)
    • Sequence Information (SI)
    • Wild-type Function and Phenotypic Information
      (WFPI)
    • Mutant Phenotype (MP)
    • Genetic Interaction (GI)
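
A structured summary of this kind can be sketched as "retrieve sentences that mention the gene, then assign each to one of the six aspects". The transcript does not say how BeeSpace classifies sentences, so the keyword cues below are purely an assumption standing in for a trained classifier, and `geneX`, `ASPECT_CUES`, `classify_sentence` and `summarize` are hypothetical names.

```python
# Hypothetical keyword cues per aspect; a real system would learn these.
ASPECT_CUES = {
    "GP": ["encodes", "protein", "kinase"],
    "EL": ["expressed in", "localized"],
    "SI": ["sequence", "exon", "base pairs"],
    "WFPI": ["required for", "functions in"],
    "MP": ["mutant", "mutation", "allele"],
    "GI": ["interacts with", "suppressor", "enhancer of"],
}

def classify_sentence(sentence):
    """Return the aspect whose cues match most often, or None if none match."""
    text = sentence.lower()
    scores = {a: sum(cue in text for cue in cues) for a, cues in ASPECT_CUES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None

def summarize(gene, sentences):
    """Group the sentences that mention the gene by aspect."""
    summary = {}
    for s in sentences:
        if gene.lower() in s.lower():
            aspect = classify_sentence(s)
            if aspect:
                summary.setdefault(aspect, []).append(s)
    return summary

# Made-up sentences about a made-up gene, for illustration only.
sentences = [
    "geneX encodes a protein kinase.",
    "geneX is expressed in the embryonic nervous system.",
    "geneX mutant embryos show guidance defects.",
    "The weather was sunny.",
]
summary = summarize("geneX", sentences)
```

Each key of `summary` is an aspect code and each value is the list of supporting sentences, which mirrors the layout of the structured gene summary described above.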

17
Drosophila gene Abelson (Abl) tyrosine kinase
18
Tribolium gene Scr
19
Gene Summarizer New Aspects
  • New categories (proposed by FlyBase curators)
  • GP, SI -> PS (protein domain or structure)
  • SI -> HO (homologs or orthologs)
  • EL -> EP (spatial/temporal expression patterns)
  • SI -> RE (regulatory element information)
  • WFPI, MP -> PF (wild-type or mutant phenotype
    and function)
  • GI -> IT (genetic or physical interaction)
  • New (beyond FlyBase) -> PG (population genetics)
  • Use cross-domain information to improve the
    Gene Summarizer (GS) on other organisms.
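
The relationship between the original aspects and the new categories on this slide can be written down as a lookup table. This is only a sketch of the mapping as listed; the names `NEW_FROM_OLD` and `candidate_categories` are hypothetical, and how the summarizer itself re-files sentences is not described in the transcript.

```python
# New category -> original aspects it draws on, as listed on the slide.
NEW_FROM_OLD = {
    "PS": ["GP", "SI"],    # protein domain or structure
    "HO": ["SI"],          # homologs or orthologs
    "EP": ["EL"],          # spatial/temporal expression patterns
    "RE": ["SI"],          # regulatory element information
    "PF": ["WFPI", "MP"],  # wild-type or mutant phenotype and function
    "IT": ["GI"],          # genetic or physical interaction
    "PG": [],              # population genetics, new beyond FlyBase
}

def candidate_categories(old_aspect):
    """New categories that a sentence with the given old aspect could fall under."""
    return sorted(new for new, olds in NEW_FROM_OLD.items() if old_aspect in olds)

si_targets = candidate_categories("SI")
mp_targets = candidate_categories("MP")
```

The table makes it visible that Sequence Information is split three ways, while the two phenotype aspects collapse into one category.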

20
(No Transcript)
21
Semantics Deeper and Faster
  • Semantic Indexing across all of Medline
  • Previous Attempts used Word Co-Occurrence
  • Entity Recognition works general-purpose
  • Function Categorization works general-purpose
  • Parallel Optimization of Entity Summarization
  • Batch Computation on national Cloud Cluster
  • Yahoo/HP/Intel 1000-processor cloud computer
  • Largest job thus far (10 hrs, 512 cores)
  • Interactive Clustering underway Steerable Map
  • Hybrid of Language Model and Small Worlds

22
BeeSpace System v3
  • SPACES and REGIONS
  • Dynamic and Relative
  • Space is collection of documents
  • Region is collection of terms
  • Extract creates new Region from old Space
  • Map creates new Space from old Region
  • New from Old Spaces and Regions via merges
  • Summarize classifies Gene within Space
  • Annotate finds differential functional expression

23
BeeSpace Semantic Operations
  • Extract: S -> R
  • Map: R -> S
  • Merge: (S1, S2) into S3
  • Summarize: (S) into Gene classification
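
The Space and Region operations can be made concrete with a minimal sketch, assuming a Space is a set of document ids and a Region is a set of terms. The toy corpus, the term-frequency ranking inside `extract`, and the use of set union for `merge` are all assumptions chosen for illustration, not the BeeSpace implementation.

```python
from collections import Counter

# Hypothetical toy corpus: document id -> list of terms.
CORPUS = {
    "d1": ["foraging", "brain", "kinase"],
    "d2": ["foraging", "dance", "brain"],
    "d3": ["kinase", "domain"],
}

def extract(space, top_k=2):
    """Space (set of doc ids) -> Region (set of the top_k most frequent terms)."""
    counts = Counter(t for d in space for t in set(CORPUS[d]))
    # Sort by descending count, then alphabetically, so ties are deterministic.
    ranked = sorted(counts, key=lambda t: (-counts[t], t))
    return set(ranked[:top_k])

def map_region(region):
    """Region (set of terms) -> Space (doc ids containing any of the terms)."""
    return {d for d, terms in CORPUS.items() if region & set(terms)}

def merge(s1, s2):
    """Merge (S1, S2) into S3; set union is one possible choice."""
    return s1 | s2

s = {"d1", "d2"}
r = extract(s)           # Region derived from the Space
s_new = map_region(r)    # Space derived back from the Region
s3 = merge(s, {"d3"})
```

Because each operation returns a new Space or Region, the operations can be chained, which is what makes the collections dynamic and relative to one another.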

24-39
(No Transcript)
40
New Interface v4
  • Single Window, Multiple Panes
  • Space Panel, Service Tabs
  • SPACES: custom, system
  • FILTER: searching, sorting
  • CLUSTER: map natural and steerable
  • SUMMARIZE: categorize using space
  • ANALYZE: annotate using space

41
Functional Analysis v4
  • The software system goes beyond a searchable
    database, using statistical literature analyses
    to discover functional relationships between
    genes and behavior.
  • This research will enable all scientists who
    study bee genes to live on the frontier of
    integrative biology, where biotechnology enables
    routine expression analysis and bioinformatics
    enables functional analysis unconstrained by
    pre-existing categories.
  • Genelist Analyzer v4
    • Differential Expression of Gene Names against
      Space
    • Background is custom-made Literature Space
    • Produces Concept List from Gene List
    • Analyze using Concept Navigation and Gene
      Summarization
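
One way to read "produces Concept List from Gene List" is as an over-representation test: concepts that appear more often in documents mentioning the listed genes than in the background Space are ranked first. The sketch below assumes that reading. The smoothed log-ratio score, the function name `concept_list`, and the toy Space are hypothetical and are not taken from the Genelist Analyzer.

```python
import math
from collections import Counter

def concept_list(gene_list, space, top_k=3):
    """Rank concepts over-represented in documents that mention the genes.

    space: list of (genes_in_doc, concepts_in_doc) pairs, the background.
    Score is a smoothed log-ratio of foreground to background document rates.
    """
    genes = set(gene_list)
    fg = [concepts for g, concepts in space if genes & set(g)]
    fg_counts = Counter(c for concepts in fg for c in set(concepts))
    bg_counts = Counter(c for _, concepts in space for c in set(concepts))
    scores = {}
    for c, k in fg_counts.items():
        fg_rate = (k + 1) / (len(fg) + 2)               # add-one smoothing
        bg_rate = (bg_counts[c] + 1) / (len(space) + 2)
        scores[c] = math.log(fg_rate / bg_rate)
    ranked = sorted(scores, key=lambda c: (-scores[c], c))
    return ranked[:top_k]

# Hypothetical background Space: (genes mentioned, concepts mentioned) per document.
space = [
    ({"geneA"}, {"foraging", "brain"}),
    ({"geneB"}, {"foraging", "brain"}),
    ({"geneC"}, {"development", "brain"}),
    ({"geneD"}, {"development", "wing"}),
]
concepts = concept_list(["geneA", "geneB"], space)
```

Here "foraging" outranks "brain" because "brain" is also common in the background, which is the point of scoring against a custom literature Space rather than counting raw mentions.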

42-61
(No Transcript)
62
Towards the Interspace
  • The Analysis Environment technology is GENERAL!
  • BirdSpace? BeeSpace?
  • PigSpace? CowSpace?
  • ArthropodSpace? AnimalSpace?
  • BioSpace? MedSpace?

63
Question Answering v5
  • Entities and Relations
  • Question Answering templates
  • Entity
    • Gene, Anatomical
    • Behavior, Chemical
  • Relation
    • Regulation (Gene-Gene)
    • Expression (Gene-Anatomy)
    • Function (Gene-Behavior): Biological Process
    • Function (Gene-Chemical): Molecular Function

64-71
(No Transcript)
72
Towards Question Answering
  • Merging Filter and Summarize
  • Extract Entities from Literature
  • Generate Relations from Entities
  • Generate Answers from Relations
  • Question Answering is Multiple Steps of
    Entity Relation Semantics
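
The three steps (entities from literature, relations from entities, answers from relations) can be sketched end to end. This is a minimal illustration under strong assumptions: entities are found by dictionary lookup, a relation is proposed whenever two entities of a suitable type pair share a sentence, and the relation names follow the Question Answering v5 slide. The lexicon, gene names, and function names are hypothetical, and sentence co-occurrence does not establish the direction of a Gene-Gene relation.

```python
from itertools import combinations

# Hypothetical entity dictionary: lowercase surface form -> entity type.
LEXICON = {
    "genea": "Gene",
    "geneb": "Gene",
    "mushroom body": "Anatomical",
    "foraging": "Behavior",
}

# Relation name for each (Gene, other entity type) pair.
RELATION_TYPES = {
    ("Gene", "Gene"): "Regulation",
    ("Gene", "Anatomical"): "Expression",
    ("Gene", "Behavior"): "Function",
    ("Gene", "Chemical"): "Function",
}

def extract_entities(sentence):
    """Step 1: dictionary lookup of known entity names in one sentence."""
    text = sentence.lower()
    return [(form, etype) for form, etype in LEXICON.items() if form in text]

def generate_relations(sentences):
    """Step 2: propose a typed relation for co-occurring entity pairs."""
    relations = set()
    for s in sentences:
        for (a, ta), (b, tb) in combinations(extract_entities(s), 2):
            if (ta, tb) in RELATION_TYPES:
                relations.add((RELATION_TYPES[(ta, tb)], a, b))
            elif (tb, ta) in RELATION_TYPES:
                relations.add((RELATION_TYPES[(tb, ta)], b, a))
    return relations

def answer(relation, gene, relations):
    """Step 3: template 'which entities stand in <relation> to <gene>?'"""
    return sorted(b for r, a, b in relations if r == relation and a == gene)

# Made-up sentences for illustration only.
sentences = [
    "geneA is expressed in the mushroom body.",
    "geneA influences foraging.",
]
relations = generate_relations(sentences)
where_expressed = answer("Expression", "genea", relations)
what_function = answer("Function", "genea", relations)
```

Each step consumes only the output of the previous one, so a better entity recognizer or relation extractor could be swapped in without changing the answer templates.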

73-74
(No Transcript)
75
Healthcare Infrastructure
  • Provider Pyramids
  • Scale to Volumes for Chronic Illness
  • Population Health with Patient Tracking
  • Risk Assessment
  • Cohort Cluster by Treatment Effects
  • Simulate with Yahoo Health Messages

76
Symptom Clustering
77
Condition Clustering
78
Informatics Researchers (Faculty)
  • Investigators
  • Bruce Schatz, systems (Medical Information
    Science)
  • ChengXiang Zhai, algorithms (Computer Science)
  • Collaborators (students)
  • Saurabh Sinha, Computer Science
  • Jiawei Han, Computer Science
  • Sheng Zhong, Bioengineering
  • Nathan Price, Chemical and Biomolecular Engineering
  • Collaborators (advice)
  • John MacMullen, Library and Information Science
  • Dan Roth, Computer Science
  • Roxana Girju, Linguistics
  • Karrie Karahalios, Computer Science

79
Informatics Researchers (Staff)
  • V1-V3
  • Todd Littell, research programmer
  • Jim Buell, research coordinator
  • Nyla Ismail, biology postdoc
  • Moushumi Sen Sarma, biology postdoc
  • V4-V5
  • David Arcoleo, research programmer
  • Barry Sanders, research programmer
  • Moushumi Sen Sarma, biology postdoc
  • Radhika Khetani, biology postdoc

80
Informatics Researchers (Students)
  • V1 Filter (parse)
  • Jing Jiang, Azadeh Shakery, Yuanhua Lv
  • V2 Cluster (group)
  • Brant Chee, Qiaozhu Mei, Peixiang Zhao
  • V3 Summarize (classify)
  • Xu Ling, Jing Jiang, Qiaozhu Mei, Xin He
  • V4 Analyze (annotate)
  • Xin He, Brant Chee, Moushumi Sarma, Xu Ling
  • V5 Answer (extract)
  • Xu Ling, Xin He, Yanen Li, Yue Lu