Title: WP4: Conceptual Mining from Text for Knowledge Engineering
1 WP4 Conceptual Mining from Text for Knowledge
Engineering
- State of the Art
- WP Coordinators: Alfonso Valencia, Carlos Rodriguez
2 Why Concept/Semantic Mining?
- Knowledge Acquisition Bottleneck
- Top-Down, manually-designed Ontologies are
- sparse (non-exhaustive)
- shallow (not fine-grained)
- not mappable (to terms or other ontologies)
- not easily updated or customized
- Text-based ontologies better reflect the diversity of
knowledge found in the literature and in
domain terminology
3 Information for Ontology Learning
4 State of the Art Methods
- implicit relations
- Corpus Distribution
- Machine Learning Algorithms
- explicit relations
- Symbolic (rule and syntax-based)
- Hybrid, combining some or all
- Bootstrap the ontology-learning process using
existing resources
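As a minimal sketch of the corpus-distribution route to implicit relations, the block below scores term pairs by pointwise mutual information (PMI) over sentence-level co-occurrence. PMI is one common choice, not necessarily the measure used in this work package. The function name `cooccurrence_pmi`, the toy sentences, and the whitespace tokenisation are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def cooccurrence_pmi(sentences, terms):
    """Score term pairs by pointwise mutual information over sentences.

    Hypothetical helper: sentences are lower-cased and split on whitespace,
    which is far cruder than real biomedical tokenisation.
    """
    term_counts = Counter()
    pair_counts = Counter()
    for sentence in sentences:
        tokens = set(sentence.lower().split())
        present = sorted(t for t in terms if t in tokens)
        term_counts.update(present)
        pair_counts.update(combinations(present, 2))

    n = len(sentences)
    scores = {}
    for (a, b), joint in pair_counts.items():
        # PMI = log2( P(a, b) / (P(a) * P(b)) ), with probabilities
        # estimated as sentence frequencies.
        scores[(a, b)] = math.log2((joint * n) / (term_counts[a] * term_counts[b]))
    return scores


# Invented toy corpus; only the gene names come from the slides.
toy_corpus = [
    "pcna binds msh2 during replication",
    "msh2 acts in mismatch repair with pcna",
    "cdc2 phosphorylates lbr in mitosis",
    "lbr is a substrate of cdc2",
]
toy_terms = {"pcna", "msh2", "cdc2", "lbr"}
scores = cooccurrence_pmi(toy_corpus, toy_terms)
```

Pairs that never share a sentence get no score at all, so the output only suggests that a relation may exist; naming the relation is left to the explicit (symbolic) methods.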
5 An example
Blaschke, et al., Funct. Integ. Genomics 2001
- Cluster "Cell cycle": 17 genes (PCNA, CDC2, MSH2, LBR, TOP2A, ...)
- Words: Meiosis, Cyclin, Checkpoint, Interphase, Nucleoplasma, Division, Histone, Replication, Chromatid
- GO codes: DNA replication, DNA metabolism, Cell Cycle control
- Sentences:
- PCNA-MSH2: The binding of PCNA to MSH2 may reflect linkage between mismatch repair and replication.
- LBR-CDC2: LBR undergoes mitotic phosphorylation mediated by p34(cdc2) protein kinase.
- Cluster "Unknown": 24 genes (ABCA5, CAT, ELF2, PIM1, WNT2, ...)
- Words: Dipeptidyl, Prolyl, nmr, Collagen-binding
6 Induce rules at different linguistic levels
7 Lexical- and syntax-derived relationships from
text
- Complex relationships in CCO
- degradates
- participate_in
- catalyses
- adjacent_to
- agent_in
- What new ones can be learnt?
- LBR undergoes mitotic phosphorylation mediated
by p34(cdc2) protein kinase.
- mitotic phosphorylation mediated_by protein
kinase
- Can it be subsumed by others?
- Are there other subcategories?
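The mediated_by example can be sketched as a single symbolic (rule-based) pattern. The regular expression and the function `extract_mediated_by` below are hypothetical and cover only the surface form "undergoes X mediated by Y."; a real system would presumably work on parsed syntax rather than raw strings.

```python
import re

# Hypothetical surface pattern: "... undergoes <process> mediated by <agent>."
MEDIATED_BY = re.compile(
    r"undergoes\s+(?P<process>[\w\s]+?)\s+mediated\s+by\s+(?P<agent>[^.]+)\."
)


def extract_mediated_by(sentence):
    """Return a (process, 'mediated_by', agent) triple, or None if no match."""
    match = MEDIATED_BY.search(sentence)
    if match is None:
        return None
    return (match.group("process"), "mediated_by", match.group("agent").strip())


sentence = (
    "LBR undergoes mitotic phosphorylation mediated by "
    "p34(cdc2) protein kinase."
)
triple = extract_mediated_by(sentence)
```

The pattern keeps the full agent phrase, "p34(cdc2) protein kinase". Reducing it to the head term "protein kinase", as in the bullet above, would need noun-phrase analysis that this sketch does not attempt.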
8 Beyond the State of the Art
- Optimal hybrid methodology for
- Extracting entities
- Discovering relations
- Providing ontology-relevant information (But what
and how?)
- Comparing top-down with bottom-up ontologies
- Providing definitional information
- Application to CC-cancer domains (and possibly
to gene regulation)
9 In the context of the project and other WPs
- Reasoning with text-generated ontologies:
competing or complementing?
- Reduction of lexical and semantic relationships
to an ontological relation inventory
- How to present and use text-mined information for
ontology design (especially for database
annotation)?
- How to curate, evaluate and compare ontologies?
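One way to picture the reduction of lexical relationships to an ontological relation inventory is a lookup table from surface phrases to the CCO relation names listed earlier. This is only a sketch: the phrase-to-relation mapping, the function names, and the toy triples are invented for illustration and are not taken from CCO.

```python
# Relation names as they appear on the CCO slide.
CCO_RELATIONS = {"degradates", "participate_in", "catalyses", "adjacent_to", "agent_in"}

# Hypothetical mapping from surface phrases to inventory relations.
LEXICAL_TO_ONTOLOGICAL = {
    "degrades": "degradates",
    "catalyzes": "catalyses",
    "catalyses": "catalyses",
    "participates in": "participate_in",
    "takes part in": "participate_in",
    "adjacent to": "adjacent_to",
}


def reduce_relation(lexical_relation):
    """Map a surface phrase to an inventory relation, or None if unknown."""
    key = " ".join(lexical_relation.lower().split())
    return LEXICAL_TO_ONTOLOGICAL.get(key)


def partition(triples):
    """Split triples into those reducible to the inventory and the rest."""
    mapped, unmapped = [], []
    for subject, relation, obj in triples:
        target = reduce_relation(relation)
        if target is None:
            unmapped.append((subject, relation, obj))
        else:
            mapped.append((subject, target, obj))
    return mapped, unmapped


# Toy triples; "enzyme_x" and "reaction_y" are placeholders.
toy_triples = [
    ("enzyme_x", "Catalyzes", "reaction_y"),
    ("mitotic phosphorylation", "mediated by", "protein kinase"),
]
mapped, unmapped = partition(toy_triples)
```

Triples left in `unmapped` are the interesting cases for curation: each one is either a candidate new relation or something an existing relation should subsume.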
10 Information for Ontology Engineers
- New Classes (ontology) and Instances (KB)
- Definitions and glosses
- Concept usage and entity examples
- Terms and synonyms
- Hierarchical and non-hierarchical relations
- Possible reasoning rules
11 To and From Other WPs