Ontology Generation Based on a User-Specified Ontology Seed

Provided by: cui1
Learn more at: https://www.deg.byu.edu
Transcript and Presenter's Notes


1
Ontology Generation Based on a User-Specified
Ontology Seed
  • Cui Tao
  • Data Extraction Research Group
  • Department of Computer Science
  • Brigham Young University

Supported by NSF
2
Introduction
  • Motivation
  • Traditional search engines return documents
  • Ontology-based data extraction returns
    information
  • Problem
  • Build extraction ontologies that meet users' needs
  • Goal
  • Automatically build ontologies for users' needs

3
Example
  • Example: a biologist is interested in information
    about large proteins in humans and their
    functions
  • Possible queries
  • Find proteins in humans that are > 20 kDa
  • Find all the proteins in humans that serve as
    receptors
  • ...
  • Information sources: various online databases
  • NCBI
  • Gene Cards
  • The Gene Ontology
  • GPM Proteomics Database
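The first query above ("proteins in humans that are > 20 kDa") amounts to a unit-aware filter over extracted records. The following is a minimal Python sketch under stated assumptions: the record keys (`name`, `weight`, `unit`), the conversion table, and the second record are hypothetical and do not come from the slides. The first record uses the 14-3-3 protein epsilon weight (29175 Daltons) that appears on the form slide of this deck.

```python
# Hypothetical conversion table: daltons per unit, keyed by lower-cased unit name.
DALTONS_PER_UNIT = {
    "da": 1, "dalton": 1, "daltons": 1,
    "kda": 1000, "kd": 1000, "kilodalton": 1000, "kilodaltons": 1000,
}


def weight_in_kda(value, unit):
    """Normalize a molecular weight to kilodaltons."""
    return value * DALTONS_PER_UNIT[unit.lower()] / 1000


def larger_than(records, threshold_kda):
    """Return the names of records whose weight exceeds the threshold (in kDa)."""
    return [
        r["name"]
        for r in records
        if weight_in_kda(r["weight"], r["unit"]) > threshold_kda
    ]


RECORDS = [
    # Value taken from the form slide of this deck.
    {"name": "14-3-3 protein epsilon", "weight": 29175, "unit": "Daltons"},
    # Made-up record, included only to show a non-matching case.
    {"name": "hypothetical-small-protein", "weight": 5, "unit": "kDa"},
]

print(larger_than(RECORDS, 20))
```

Normalizing every weight to one unit before comparing keeps the query independent of how each source database reports the value.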

4
Extraction Ontology
Molecular Weight
  • Regular Expression: \d{1,5}(\.\d{1,2})?
  • Unit: kilodaltons?|kdas?|kds?|das?|daltons?
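As an illustration of how a data frame like this one might recognize values in text, here is a small Python sketch. It assumes the value pattern is `\d{1,5}(\.\d{1,2})?` and that the unit list is an alternation of the keywords shown on the slide. Joining value and unit with optional whitespace, matching case-insensitively, and the function name `find_molecular_weights` are assumptions of this sketch, not something the slides specify.

```python
import re

# Value pattern and unit keywords as read from the slide.
VALUE = r"\d{1,5}(?:\.\d{1,2})?"
UNIT = r"(?:kilodaltons?|kdas?|kds?|das?|daltons?)"

# Assumed combination: a value, optional whitespace, then a unit keyword.
MOLECULAR_WEIGHT = re.compile(rf"\b({VALUE})\s*({UNIT})\b", re.IGNORECASE)


def find_molecular_weights(text):
    """Return (value, unit) pairs for every molecular weight found in text."""
    return [(float(value), unit) for value, unit in MOLECULAR_WEIGHT.findall(text)]


print(find_molecular_weights("Molecular Weight 29175 Daltons"))
print(find_molecular_weights("Find proteins in humans that are > 20 kDa"))
```

Because the value pattern allows at most two decimal places, a value such as 29.175 would not be captured whole; a production pattern would likely need to be more permissive.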

5
User Interface
Select a title for the forms
6
User Interface: Binary Relationship
[Figure labels: Protein, Name]
7
User Interface: Binary Relationship
[Figure labels: Protein, Name, Molecular weight]
8
User Interface: N-ary Relationship
[Figure labels: Start, End, Orientation, Chromosome number, Chromosome location]
9
User Interface: N-ary Relationship
[Figure labels: GO phrase, GO, GO ID, GO term]

10
Overall Form
[Figure labels: GO ID, GO term]

11
Ontology View
[Figure labels: Protein, Name, Molecular weight, Chromosome location, Chromosome number, Start, End, Orientation, GO, GO ID, GO phrase]
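The forms on the preceding slides define binary and n-ary relationships among labeled fields, and the ontology view collects them into one structure. The sketch below shows one way such a seed could be held in memory. The class names `Relationship` and `OntologySeed`, and the way the labels are grouped into relationships, are assumptions made for illustration; the slides only list the labels.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Relationship:
    """A relationship set connecting two or more object sets."""
    object_sets: tuple

    @property
    def arity(self):
        return len(self.object_sets)


@dataclass
class OntologySeed:
    """Hypothetical container for the relationships a user defines in the forms."""
    title: str
    relationships: list = field(default_factory=list)

    def relate(self, *object_sets):
        """Record a binary (two names) or n-ary (more names) relationship."""
        rel = Relationship(tuple(object_sets))
        self.relationships.append(rel)
        return rel

    def object_sets(self):
        """All distinct object sets, in the order they were first mentioned."""
        seen = []
        for rel in self.relationships:
            for name in rel.object_sets:
                if name not in seen:
                    seen.append(name)
        return seen


seed = OntologySeed("Protein")
seed.relate("Protein", "Name")                      # binary
seed.relate("Protein", "Molecular weight")          # binary
# Assumed grouping of the chromosome-related labels into one n-ary relationship.
seed.relate("Chromosome location", "Chromosome number", "Start", "End", "Orientation")
print([rel.arity for rel in seed.relationships])
print(seed.object_sets())
```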
12
Fill in the Form
[Figure labels: GO ID, GO term]

13
Fill in the Form
Protein
Name: 14-3-3 protein epsilon; Mitochondrial import
stimulation factor L subunit; Protein kinase C
inhibitor protein-1; KCIP-1; 14-3-3E
Molecular Weight: 29175 Daltons
Chromosome location:
GO ID: GO:0019899; GO:0019904
GO term: enzyme binding; protein domain specific binding
14
Mapping
15
Mapping
16
Mapping
Name
17
Data Frame Generation
  • Choose from data frame library
  • Data frames for basic values
  • Numbers within different ranges
  • Integers, floats, etc
  • Emails, phone numbers, addresses, etc
  • Domain specific values (DNA sequences)
  • Units
  • Build lexicon files

18
Data Frame Generation
  • Find the best matched data frame from the
    library
  • Find the correct units
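The slides do not say how the best-matched data frame is chosen. One plausible reading is to score each library entry by the share of user-supplied sample values it recognizes. The Python sketch below takes that approach; the miniature library, its patterns, and the scoring rule are all assumptions of this example rather than the presenter's method.

```python
import re

# Hypothetical miniature data frame library: name -> value pattern.
LIBRARY = {
    "integer": r"\d+",
    "float": r"\d+\.\d+",
    "email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
    "dna_sequence": r"[ACGT]+",
}


def match_rate(pattern, samples):
    """Fraction of sample values that the pattern matches in full."""
    compiled = re.compile(pattern)
    hits = sum(1 for sample in samples if compiled.fullmatch(sample))
    return hits / len(samples)


def best_data_frame(samples, library=LIBRARY):
    """Return the name of the library entry that matches the most samples."""
    if not samples:
        raise ValueError("at least one sample value is required")
    return max(library, key=lambda name: match_rate(library[name], samples))


print(best_data_frame(["29175", "20480", "311"]))
print(best_data_frame(["ACGT", "GGCATT"]))
```

Ties go to whichever entry appears first in the library, so the order of entries matters in this sketch.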

19
Build Lexicon Files
Name
20
Contribution
  • Automatically generates ontologies depending on
    users' requests
  • Provides a tool for users to easily provide
    ontology seeds
  • Automatically generates ontology views from
    ontology seeds
  • Automatically maps ontology concepts to source
    databases