Automatically Generating Gene Pathways from Biomedical Literature

1
Automatically Generating Gene Pathways from
Biomedical Literature
  • Bruce Schatz, Principal Investigator
  • Department of Medical Information Science
  • University of Illinois at Urbana-Champaign
  • schatz_at_uiuc.edu, www.canis.uiuc.edu
  • Institute for Genomic Biology
  • Theme for Genomics of Neural and Behavioral
    Plasticity
  • www.beespace.uiuc.edu
  • U. Michigan NCBC Collaboration Visit, February
    10, 2006

2
Institute for Genomic Biology (IGB)
  • Project cost: $75 million
  • Gross square feet (GSF): 174,485
  • Net assignable square feet (NASF): 100,816
  • Move-in headcount: 392
  • Completion date: October 2006

3
IGB Research Schematic
Host-Microbe Systems
Microbial Genome Informatics
Genomics of Neural and Behavioral Plasticity
Mining Microbial Genomes for Novel Antibiotics
Animal Genome Informatics
Genomic Ecology of Global Change
Regenerative Biology and Tissue Engineering
Biocomplexity
Molecular Bioengineering of Biomass Conversion
Research Core Facilities
Vivarium
Bioinformatics
Precision Proteomics
Program Area 1 - Systems Biology
Program Area 2 - Cellular and Metabolic Engineering
Program Area 3 - Genome Technology
4
BeeSpace FIBR Project
  • BeeSpace project is an NSF FIBR flagship
  • Frontiers in Integrative Biological Research,
  • $5M for 5 years at University of Illinois
  • Nature-Nurture using honey bee as model
  • Genome technologies in wet lab and dry lab
    biology
  • Localized Gene Expression for Normal Social
    Behavior
  • Gene Robinson, Entomology (behavioral
    expressions)
  • Susan Fahrbach, Entomology (anatomical
    localization)
  • Sandra Rodriguez-Zas, Animal Sciences (data
    analysis)
  • Interactive Information System for Functional
    Analysis
  • Bruce Schatz, Library and Information Science
    (info systems)
  • ChengXiang Zhai, Computer Science (text
    analysis)
  • Chip Bruce, Library and Information Science (user
    support)

5
Conceptual Navigation in BeeSpace
6
BeeSpace Prototype Collections
  • Organism
    • Bee: Apis mellifera
    • Fly: Fly Ecology, Evolution and Behavior
    • Bird: Bird Communication
  • Development
    • Behavioral: Maturation
    • Development: Development of insects
    • Communication: Communication by insects
  • Behavior
    • Agonistic: Agonistic and Territorial Behaviors
    • Forage: Behavior of Resource Acquisition
    • Nest: Home Maintenance and Defense
    • Social: Behavior of Social Integration in Insects

12
Pathway Extraction Prototype System
  • Systems Integration and Data Collections --
    Bruce Schatz
  • Text Documents producing Relationship Graphs
  • Language-Based Entity Summarization --
    ChengXiang Zhai (CS)
  • Frequent Subgraph Data Mining -- Jiawei Han (CS)
  • Ordering Graphs into Pathways using Biology
    Datasets
  • Microarray-Derived Signaling Pathway -- Sheng
    Zhong (Bioeng)
  • Sequence-Derived Gene Regulation -- Saurabh Sinha
    (CS)
  • Microarray-Derived Functional Clusters -- Ping Ma
    (Statistics)

13
From Text to Entity-Relation Graph
[Diagram: Biomedical Text is mapped to a graph of Gene nodes connected by an "Enhance" relation]
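
The text-to-graph step named on this slide can be illustrated with a minimal sketch. It is not the BeeSpace implementation: the gene lexicon (geneA, geneB, geneC), the relation verbs, and the "nearest gene on each side of the verb" rule are all assumptions made for illustration. A real system would use a trained entity recognizer rather than a fixed word list.

```python
import re
from collections import defaultdict

# Hypothetical vocabulary, for illustration only.
GENE_NAMES = {"geneA", "geneB", "geneC"}
RELATION_VERBS = {"enhances": "Enhance", "represses": "Repress"}


def extract_edges(sentence):
    """Return (source_gene, relation, target_gene) triples from one sentence."""
    tokens = re.findall(r"[A-Za-z0-9]+", sentence)
    edges = []
    for i, tok in enumerate(tokens):
        relation = RELATION_VERBS.get(tok.lower())
        if relation is None:
            continue
        # Assumed rule: the nearest gene before the verb is the source,
        # the nearest gene after it is the target.
        left = [t for t in tokens[:i] if t in GENE_NAMES]
        right = [t for t in tokens[i + 1:] if t in GENE_NAMES]
        if left and right:
            edges.append((left[-1], relation, right[0]))
    return edges


def build_graph(sentences):
    """Aggregate triples into a map: gene -> {(relation, gene): count}."""
    graph = defaultdict(lambda: defaultdict(int))
    for sentence in sentences:
        for src, rel, dst in extract_edges(sentence):
            graph[src][(rel, dst)] += 1
    return graph


sentences = [
    "We show that geneA enhances geneB in the adult brain.",
    "Expression of geneB represses geneC.",
    "In larvae, geneA enhances geneB as well.",
]
graph = build_graph(sentences)
for src, targets in graph.items():
    for (rel, dst), count in targets.items():
        print(f"{src} -[{rel} x{count}]-> {dst}")
```

Edge counts record how many sentences support each relation, which is one simple way to attach a confidence to an edge.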
14
Adapting Biological Named Entity Recognizer
[Diagram labels: test data E; training data T1 ... Tm]

15
Preliminary Evaluation Results
  • Recognizing gene names
  • Maximum entropy/Logistic regression recognizer
  • Text data from BioCreAtIvE (Medline)
  • 3 organisms (Fly, Mouse, Yeast), each contributing
    5,000 sentences, 2,500 of them with gene mentions
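
For a two-class decision (gene name or not), a maximum entropy model is the same thing as logistic regression. The sketch below trains one with plain stochastic gradient ascent on a handful of tokens. The feature set, the toy training tokens, and the learning settings are assumptions for illustration; they are not the recognizer or the BioCreAtIvE data used in the evaluation.

```python
import math


def features(token):
    """Sparse binary features for one token (hypothetical feature set)."""
    feats = {"bias": 1.0}
    if any(c.isdigit() for c in token):
        feats["has_digit"] = 1.0
    if any(c.isupper() for c in token[1:]):
        feats["inner_upper"] = 1.0
    if "-" in token:
        feats["has_hyphen"] = 1.0
    if token.isalpha() and token.islower():
        feats["plain_lower"] = 1.0
    return feats


def predict_proba(weights, feats):
    """Probability that the token is a gene name under the current weights."""
    z = sum(weights.get(name, 0.0) * value for name, value in feats.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(examples, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient ascent on the log-likelihood."""
    weights = {}
    for _ in range(epochs):
        for token, label in examples:
            feats = features(token)
            error = label - predict_proba(weights, feats)
            for name, value in feats.items():
                weights[name] = weights.get(name, 0.0) + lr * error * value
    return weights


# Toy training set: label 1 = gene name, 0 = ordinary word.
examples = [
    ("Abd-B", 1), ("the", 0), ("p53", 1), ("binds", 0),
    ("Cdc28", 1), ("protein", 0), ("Sox2", 1), ("expression", 0),
]
weights = train(examples)
for token in ["Pax6", "regulates"]:
    print(token, round(predict_proba(weights, features(token)), 3))
```

Surface features like these are largely organism-independent, which is one reason a recognizer trained on some organisms may transfer to another; gene names that look like ordinary lowercase words would need context features that this sketch omits.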

16
Scalable Mining and Searching of Graph Data
  • Efficient frequent and closed graph mining: gSpan
    (#1 in Google Scholar for "graph mining") and
    CloseGraph (#11)
  • gIndex: Graph indexing by a data mining approach
    (SIGMOD'04, invited to TODS'05)
  • Search graphs in massive biological and chemical
    databases
  • Grafil: Similarity search in massive graph data
    sets (SIGMOD'05, invited to TODS'06)
  • Pattern compression and profile exploration
    (KDD'05 award, invited to Machine Learning
    journal, VLDB'05)
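
To make "frequent graph mining" concrete, here is a minimal sketch under a simplifying assumption: every node carries a unique label (a gene name), so a pattern occurs in a graph exactly when all of its edges are present. This is a level-wise search over connected edge sets, not gSpan, which handles repeated labels through canonical DFS codes. The graphs and gene names are invented for illustration.

```python
def edge(u, v):
    """Undirected edge between two uniquely named genes, in canonical order."""
    return tuple(sorted((u, v)))


def support(pattern, graphs):
    """Number of graphs that contain every edge of the pattern."""
    return sum(1 for g in graphs if pattern <= g)


def is_connected(edges):
    """True if the edges form a single connected component."""
    edges = list(edges)
    reached = set(edges[0])
    pending = edges[1:]
    grew = True
    while grew and pending:
        grew = False
        for e in list(pending):
            if reached & set(e):
                reached |= set(e)
                pending.remove(e)
                grew = True
    return not pending


def frequent_connected_subgraphs(graphs, min_support):
    """Map each frequent connected edge set to its support."""
    all_edges = set().union(*graphs)
    frequent_edges = sorted(
        e for e in all_edges if support(frozenset([e]), graphs) >= min_support
    )
    level = {frozenset([e]) for e in frequent_edges}
    result = {}
    while level:
        for pattern in level:
            result[pattern] = support(pattern, graphs)
        # Grow each pattern by one frequent edge; support can only shrink,
        # so infrequent candidates are pruned immediately.
        next_level = set()
        for pattern in level:
            for e in frequent_edges:
                if e in pattern:
                    continue
                candidate = pattern | {e}
                if is_connected(candidate) and support(candidate, graphs) >= min_support:
                    next_level.add(candidate)
        level = next_level
    return result


AB, BC, CD = edge("geneA", "geneB"), edge("geneB", "geneC"), edge("geneC", "geneD")
graphs = [
    frozenset({AB, BC, CD}),
    frozenset({AB, BC, edge("geneA", "geneD")}),
    frozenset({AB, BC, CD, edge("geneD", "geneE")}),
]
patterns = frequent_connected_subgraphs(graphs, min_support=2)
for pattern, count in sorted(patterns.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(pattern), count)
```

The sketch recomputes support from scratch for every candidate, which is fine for a few small graphs but is exactly the cost that scalable miners are designed to avoid.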

17
CODENSE: Mine Coherent Dense Subgraphs (ISMB'05)
18
Our Recent Work on Graph/Network Mining
  • D. Cai, Z. Shao, X. He, X. Yan, and J. Han,
    Community Mining from Multi-Relational
    Networks, PKDD'05.
  • H. Hu, X. Yan, Y. Huang, J. Han, and X. J. Zhou,
    Mining Coherent Dense Subgraphs across Massive
    Biological Networks for Functional Discovery,
    ISMB'05.
  • C. Liu, X. Yan, H. Yu, J. Han, and P. S. Yu,
    Mining Behavior Graphs for "Backtrace" of
    Noncrashing Bugs, SDM'05
  • C. Liu, X. Yan, and J. Han, Mining Control Flow
    Abnormality for Logic Error Isolation, SDM'06.
  • X. Yan and J. Han, gSpan: Graph-Based
    Substructure Pattern Mining, ICDM'02
  • X. Yan and J. Han, CloseGraph: Mining Closed
    Frequent Graph Patterns, KDD'03
  • X. Yan, P. S. Yu, and J. Han, Graph Indexing: A
    Frequent Structure-based Approach, SIGMOD'04
  • X. Yan, X. J. Zhou, and J. Han, Mining Closed
    Relational Graphs with Connectivity Constraints,
    KDD'05
  • X. Yan, P. S. Yu, and J. Han, Substructure
    Similarity Search in Graph Databases, SIGMOD'05
  • X. Yan, F. Zhu, J. Han, and P. S. Yu, Searching
    Substructures with Superimposed Distance,
    ICDE'06
  • D. Xin, J. Han, X. Yan and H. Cheng, Mining
    Compressed Frequent-Pattern Sets, VLDB'05.
  • X. Yan, H. Cheng, J. Han, and D. Xin,
    Summarizing Itemset Patterns: A Profile-Based
    Approach, KDD'05.

19
From microarray to transcriptional network
http://bioinfor.bioen.uiuc.edu
Dannenberg, Zhong et al., Genes Dev. 2005
20
Revised confidence in Pathway
[Diagram labels: Pathway; Revised pathway; Tr. Factor; Gene; Promoters; Gene targets; Orthologs; Phylogenetic Motif finding; DNA motif(s); Scan Genome (Stubb software)]
Sinha (2006) unpublished
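
The "DNA motif, scan promoters, gene targets" flow on this slide can be sketched with a simple position weight matrix scan. This is not the Stubb software and does not reproduce its model; the count matrix, the promoter sequences, the gene names, and the score threshold are hypothetical values chosen for illustration. Sequences are assumed to contain only uppercase A, C, G, T.

```python
import math

# Hypothetical counts for a 4-bp motif with consensus TGCA (one dict per position).
COUNTS = [
    {"A": 1, "C": 0, "G": 1, "T": 8},
    {"A": 1, "C": 1, "G": 8, "T": 0},
    {"A": 0, "C": 8, "G": 1, "T": 1},
    {"A": 8, "C": 1, "G": 0, "T": 1},
]


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert base counts to log2 odds against a uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return matrix


def best_hit(sequence, matrix):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    best = (float("-inf"), -1)
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(matrix[i][base] for i, base in enumerate(window))
        if score > best[0]:
            best = (score, start)
    return best


def predict_targets(promoters, matrix, threshold):
    """Genes whose promoter contains a window scoring at or above the threshold."""
    return sorted(
        gene for gene, seq in promoters.items()
        if best_hit(seq, matrix)[0] >= threshold
    )


matrix = log_odds_matrix(COUNTS)
promoters = {"geneX": "AATGCATT", "geneY": "CCCCCCCC"}  # hypothetical promoters
print(predict_targets(promoters, matrix, threshold=4.0))
```

Predicted targets of a transcription factor could then be compared against the edges of a text-derived pathway to raise or lower confidence in each edge, which is the role the slide title suggests.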
21
Epigenetic regulatory network elucidation
through multiple information
integration
Ma et al (2006) unpublished data
22
Cancer Drug Response Prediction through Gene
Expression Profiling
Cancer drug GI50
Ma et al (2005) unpublished manuscript