Background Knowledge for Ontology Construction - PowerPoint PPT Presentation

About This Presentation

Title:

Background Knowledge for Ontology Construction

Slides: 12

Provided by: www278

Tags: background | construction | knowledge | ontology | semiautomatic

Transcript and Presenter's Notes

Title: Background Knowledge for Ontology Construction

1
Background Knowledge for Ontology Construction

2
Bag-of-words

There exist various ways of selecting word
weights. In our paper we propose a method to
learn them!

Important
Word Weights
Noise
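
As a rough illustration of the bag-of-words representation, the sketch below builds one weighted word vector per document. It assumes TF-IDF weighting, which is only one of the "various ways" of choosing word weights and is not the learned weighting proposed in the paper; the function name and the sample documents are made up for the example.

```python
import math
from collections import Counter

def bag_of_words(docs):
    """Return one {word: weight} dict per document, scaled to unit length.

    Assumption: weights are TF-IDF (term count times log inverse
    document frequency); the slides do not prescribe this scheme.
    """
    tokenised = [doc.lower().split() for doc in docs]
    n_docs = len(tokenised)
    # Document frequency: in how many documents does each word occur?
    df = Counter(word for tokens in tokenised for word in set(tokens))
    vectors = []
    for tokens in tokenised:
        tf = Counter(tokens)
        vec = {w: tf[w] * math.log(n_docs / df[w]) for w in tf}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        if norm > 0:
            vec = {w: v / norm for w, v in vec.items()}
        vectors.append(vec)
    return vectors

# Hypothetical toy collection.
docs = ["stocks fall on bank news", "bank shares rise", "football news today"]
vectors = bag_of_words(docs)
```

Words that occur in fewer documents ("stocks") end up with a higher weight than words shared by several documents ("bank", "news"), which is the sense in which weights separate important words from noise.
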
3
SVM Feature selection

Basic algorithm
Learn linear SVM classifier for each of the
categories.
Word is important if it is important for
classification into any of the categories.
Reference
Brank J., Grobelnik M., Milic-Frayling N.,
Mladenic D. Feature selection using support
vector machines.
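
The basic algorithm can be sketched as follows. This is a minimal illustration, not the implementation from the cited paper: the linear SVM is trained with a plain batch subgradient method written for this example (a real system would use a dedicated SVM solver), the bias term is omitted, and the hyperparameters, function names and toy data are assumptions. A word's score is the largest absolute component it receives in any category's normal vector.

```python
def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Batch subgradient descent on the L2-regularised hinge loss.

    Returns the normal vector w (no bias term). Stand-in for a real
    SVM solver; lam, epochs and lr are arbitrary illustrative values.
    """
    n = len(X)
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [lam * wi for wi in w]
        for xi, yi in zip(X, y):
            margin = yi * sum(wj * xj for wj, xj in zip(w, xi))
            if margin < 1:  # example violates the margin
                for j, xj in enumerate(xi):
                    grad[j] -= yi * xj / n
        w = [wi - lr * gi for wi, gi in zip(w, grad)]
    return w

def rank_words(X, labels, vocab):
    """Score each word by its largest |weight| over all one-vs-rest SVMs."""
    scores = [0.0] * len(vocab)
    for category in sorted(set(labels)):
        y = [1 if label == category else -1 for label in labels]
        w = train_linear_svm(X, y)
        scores = [max(s, abs(wi)) for s, wi in zip(scores, w)]
    return sorted(zip(vocab, scores), key=lambda pair: -pair[1])

# Hypothetical toy data: rows are documents, columns follow vocab.
vocab = ["bank", "goal", "the"]
X = [[1, 0, 1], [1, 0, 1], [0, 1, 1], [0, 1, 1]]
labels = ["finance", "finance", "sport", "sport"]
ranked = rank_words(X, labels, vocab)
```

In this toy collection "the" occurs in every document and so cannot help any classifier, which places it last in the ranking, while "bank" and "goal" score high.
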

4
Word weight learning

Algorithm
Calculate linear SVM classifier for each category
Calculate word weights for each category from SVM
normal vectors. Weight for i-th word and j-th
category is
Final word weights are calculated separately for
each document

5
OntoGen system

System for semi-automatic ontology construction
Why semi-automatic? The system only gives
suggestions to the user; the user always makes
the final decision.
The system is data-driven and can scale to large
collections of documents.
The current version focuses on the construction of
topic ontologies; the next version will be able to
deal with more general ontologies.
Can import/export RDF.

There is a big divide between unsupervised and
fully supervised construction tools.
Both approaches have weak points:
it is difficult to obtain desired results using
unsupervised methods, e.g. limited background
knowledge;
manual tools (e.g. Protégé, OntoStudio) are
time-consuming, and the user needs to know the entire domain.
We combined these two approaches in order to
eliminate these weaknesses:
the user guides the construction process,
the system helps the user with suggestions based
on the document collection.

http://kt.ijs.si/blazf/examples/ontogen.html
6
How does OntoGen help?

By identifying the topics and
relations between them
using k-means clustering
cluster of documents -> topic
documents are assigned to clusters ->
subject-of relation
We can repeat clustering on a subset of documents
assigned to a specific topic -> identifies
subtopics and subtopic-of relation
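
The clustering step can be sketched as below. This is an illustrative k-means in plain Python, not OntoGen's code: squared Euclidean distance and the farthest-first initialisation are assumptions chosen to keep the example short and deterministic, and the document vectors are made up. Clustering the whole collection yields topics; clustering only the documents of one topic yields its subtopics.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def kmeans(docs, k, iterations=20):
    """Return a cluster index for each document vector."""
    # Assumption: farthest-first initialisation (deterministic).
    centroids = [docs[0]]
    while len(centroids) < k:
        centroids.append(
            max(docs, key=lambda d: min(sq_dist(d, c) for c in centroids))
        )
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda j: sq_dist(d, centroids[j])) for d in docs
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for j in range(k):
            members = [d for d, a in zip(docs, assignment) if a == j]
            if members:
                centroids[j] = mean(members)
    return assignment

# Hypothetical bag-of-words vectors for four documents.
docs = [[5, 0, 0, 0], [4, 1, 0, 0], [0, 0, 5, 0], [0, 0, 4, 1]]
topics = kmeans(docs, k=2)

# Repeat clustering on the documents of the first topic to get subtopics.
subset = [d for d, t in zip(docs, topics) if t == topics[0]]
subtopics = kmeans(subset, k=2)
```

Each document's cluster index plays the role of the subject-of relation, and the nesting of the second call inside one topic gives the subtopic-of relation.
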

By naming the topics
using centroid vector
A centroid vector of a given topic is the average
document from this topic (normalised sum of the
topic's documents)
Most descriptive keywords for a given topic are
the words with the highest weights in the
centroid vector.
using linear SVM classifier
SVM classifier is trained to separate documents
of the given topic from the other documents in
the context
Words that are found most important for the
classification are selected as keywords for the
topic
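
The centroid-based naming described above can be illustrated with a short sketch. It assumes documents are sparse {word: weight} dictionaries and that "normalised" means scaling the summed vector to unit length; the function names and sample weights are hypothetical.

```python
import math

def centroid(doc_vectors):
    """Sum sparse {word: weight} vectors and scale the result to unit length."""
    total = {}
    for vec in doc_vectors:
        for word, weight in vec.items():
            total[word] = total.get(word, 0.0) + weight
    norm = math.sqrt(sum(w * w for w in total.values()))
    if norm == 0:
        return total
    return {word: w / norm for word, w in total.items()}

def top_keywords(doc_vectors, n=3):
    """Words with the highest weights in the topic's centroid vector."""
    c = centroid(doc_vectors)
    ordered = sorted(c.items(), key=lambda pair: (-pair[1], pair[0]))
    return [word for word, _ in ordered[:n]]

# Hypothetical documents assigned to one topic.
topic_docs = [
    {"bank": 0.8, "loan": 0.5, "news": 0.1},
    {"bank": 0.6, "rate": 0.7, "news": 0.2},
]
keywords = top_keywords(topic_docs, n=2)
```

The SVM-based alternative would instead rank words by the weights of a classifier trained to separate the topic's documents from the rest.
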

7
(No Transcript)
8
Topic ontology of Yahoo! Finances
9
Background knowledge in OntoGen

All of the methods in OntoGen are based on
bag-of-words representation.
By using different word weights we can tune
these methods according to the user's needs.
The user needs to group the documents into
categories. This can be done efficiently using
active learning.
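
The slides do not say which active learning strategy is used. As one plausible reading, the sketch below assumes uncertainty sampling with a linear classifier: the system asks the user to label the document closest to the current decision boundary. The weight vector, the document vectors and the function name are hypothetical.

```python
def most_uncertain(w, unlabeled):
    """Index of the unlabeled vector with the smallest |w . x|.

    Assumption: uncertainty sampling; a small absolute score means the
    current linear classifier is least sure about that document.
    """
    def abs_score(x):
        return abs(sum(wj * xj for wj, xj in zip(w, x)))
    return min(range(len(unlabeled)), key=lambda i: abs_score(unlabeled[i]))

# Hypothetical classifier weights and unlabeled document vectors.
w = [1.0, -1.0]
unlabeled = [[3.0, 0.0], [1.0, 0.9], [0.0, 2.0]]
query = most_uncertain(w, unlabeled)  # document to show to the user next
```

After the user labels the selected document, the classifier would be retrained and the selection repeated, so only a small part of the collection needs manual labels.
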

http://kt.ijs.si/blazf/examples/ontogen.html
10
Influence of background knowledge