Towards the Semantic Web 6 Generating Ontologies for the Semantic Web: OntoBuilder R.H.P. Engles and T.Ch.Lech

1
Towards the Semantic Web 6 Generating Ontologies for the Semantic Web:
OntoBuilder
R.H.P. Engles and T.Ch.Lech
  • 2005. 2. 17

2
1. The overall
  • OntoBuilder
  • Extraction of information from texts for building
    knowledge bases.
  • Consists of two modules, OntoExtract and
    OntoWrapper.

3
1.1 The overall architecture

4
1.2 OntoExtract and OntoWrapper(1/2)
  • OntoExtract
  • Semi-automatic Ontology construction from
    unstructured information (natural language
    sources).
  • OntoWrapper
  • Semi-automatic Ontology construction from
    semi-structured and structured information
    sources.
  • Extracts information from specific places on given
    sites (e.g. names, email addresses, telephone numbers).
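
As a rough illustration of this kind of extraction, the sketch below pulls email addresses and telephone numbers out of page text with regular expressions. The patterns, the function name extract_contacts and the sample text are assumptions made for this example; they are not taken from OntoWrapper, which would rely on site-specific rules.

```python
import re

# Hypothetical, deliberately simple patterns (not OntoWrapper's own rules).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{6,}\d")


def extract_contacts(page_text: str) -> dict:
    """Return the email addresses and phone numbers found in page_text."""
    return {
        "emails": EMAIL_RE.findall(page_text),
        "phones": [m.strip() for m in PHONE_RE.findall(page_text)],
    }


# Made-up sample text standing in for a fragment of a structured page.
sample = "Contact: Jane Doe, jane.doe@example.org, tel. +47 22 33 44 55"
contacts = extract_contacts(sample)
print(contacts)
```

Names are harder to catch with a generic pattern, which is one reason to tie extraction to known places on a specific site.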

5
1.2 OntoExtract and OntoWrapper(2/2)
  • CORPORUM is dependent on a linguistic analysis of
    a given text, comprising normalization,
    tokenization and part-of-speech tagging.
  • Relations between concepts are defined
    (e.g. subClassOf relations, or InstanceOf
    relations).
  • Through semantic analysis of a domain, the tool
    can automatically generate relations between words
    within that domain.
  • Visualization of such semantic structures can
    then be used for navigation and browsing through
    document sets.

6
2. OntoExtract(1/3)
  • OntoExtract supports analysis of natural language
    texts and generates lightweight, domain specific
    ontologies of these texts (utilizing already
    existing knowledge from a central data
    repository).
  • OntoExtract is able to
  • analyse natural language,
  • provide initial ontologies,
  • refine existing ontologies,
  • find relations between key terms in documents,
  • find instances of concepts within documents,
  • find classes and sub-class relationships.

7
2. OntoExtract(2/3)
  • How OntoExtract currently works:
  • parses, tokenizes and analyses text,
  • generates nodes and relations between them,
  • enhances specific aspects of the discovered
    knowledge item using a background
    repository (containing general knowledge of the
    world, represented in Sesame),
  • and the final analysis results are submitted to
    the RDFS server Sesame.
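
The last two steps can be pictured as turning discovered classes, sub-class links and instances into RDF statements. The sketch below serializes such results as N-Triples lines using the standard RDF and RDFS vocabulary URIs. The namespace http://example.org/onto#, the function name to_ntriples and the sample terms are assumptions for illustration, and the upload to the Sesame server is left out because the slides do not describe that interface.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_CLASS = "http://www.w3.org/2000/01/rdf-schema#Class"
RDFS_SUBCLASS = "http://www.w3.org/2000/01/rdf-schema#subClassOf"
BASE = "http://example.org/onto#"  # hypothetical namespace for extracted terms


def to_ntriples(classes, subclass_pairs, instances):
    """Serialize extraction results as N-Triples text.

    classes: iterable of class names
    subclass_pairs: iterable of (child, parent) class names
    instances: iterable of (instance, class) names
    Names are assumed to be URI-safe (no spaces or special characters).
    """
    lines = []
    for cls in classes:
        lines.append(f"<{BASE}{cls}> <{RDF_TYPE}> <{RDFS_CLASS}> .")
    for child, parent in subclass_pairs:
        lines.append(f"<{BASE}{child}> <{RDFS_SUBCLASS}> <{BASE}{parent}> .")
    for inst, cls in instances:
        lines.append(f"<{BASE}{inst}> <{RDF_TYPE}> <{BASE}{cls}> .")
    return "\n".join(lines)


# Made-up extraction result: two classes, one sub-class link, one instance.
doc = to_ntriples(["Tool", "Editor"], [("Editor", "Tool")], [("MyEditor", "Editor")])
print(doc)
```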

8
Sesame domain knowledge
Sesame background knowledge

9
3. OntoWrapper
  • OntoWrapper
  • deals with the analysis of structured pages,
  • allows the user to define XML/RDF templates,
    variables and rule sets to perform a structured
    analysis of a specific domain,
  • generates the merged output and sends it to the
    Sesame repository as data statements about
    specific pages.
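
One way to read "templates, variables and rule sets" is sketched below: each rule binds a variable to a value found on the page, and the template is then filled with those values to give a statement about that page. The rule patterns, the ex: element names, the template layout and the function apply_rules are all hypothetical; the slides do not show OntoWrapper's actual template or rule syntax.

```python
import re
from string import Template
from xml.sax.saxutils import escape

# Hypothetical rule set: variable name -> regex with one capture group.
RULES = {
    "name": re.compile(r"Name:\s*(.+)"),
    "email": re.compile(r"E-mail:\s*(\S+)"),
}

# Hypothetical RDF/XML-style template; $page, $name and $email are variables.
TEMPLATE = Template(
    '<rdf:Description rdf:about="$page">\n'
    "  <ex:name>$name</ex:name>\n"
    "  <ex:email>$email</ex:email>\n"
    "</rdf:Description>"
)


def apply_rules(page_url: str, page_text: str) -> str:
    """Bind every rule variable from page_text and fill in the template."""
    values = {"page": escape(page_url)}
    for var, pattern in RULES.items():
        match = pattern.search(page_text)
        # Unmatched variables become empty strings instead of raising.
        values[var] = escape(match.group(1).strip()) if match else ""
    return TEMPLATE.substitute(values)


# Made-up page content for the example.
statement = apply_rules(
    "http://example.org/staff/jane",
    "Name: Jane Doe\nE-mail: jane@example.org",
)
print(statement)
```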

10
4.1 Generating Semantic Structures(1/2)
  • Generation of semantic knowledge in information
    extraction is based upon the result of parsing
    steps that can be of varying analysis depth.
  • Levels of Linguistic Analysis
  • Tokenization
  • Lexical/Morphological Analysis
  • POS tagging
  • Syntactic Analysis
  • Semantic/Pragmatic Analysis
  • Discourse Analysis
  • CORPORUM's lexical analysis includes
  • text normalization, tokenization, POS tagging
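
The three lexical steps can be sketched as a small pipeline. This is a toy version under stated assumptions: the normalization rules, the regex tokenizer and the lookup-based tagger with its tiny LEXICON are invented for illustration and say nothing about how CORPORUM implements these steps. A real tagger would use a trained model.

```python
import re
import unicodedata

# Toy lexicon standing in for a trained POS tagger (assumption).
LEXICON = {"the": "DET", "tool": "NOUN", "extracts": "VERB", "concepts": "NOUN"}


def normalize(text: str) -> str:
    """Unify Unicode forms and collapse runs of whitespace."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip()


def tokenize(text: str) -> list:
    """Split into word tokens and single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", text)


def tag(tokens: list) -> list:
    """Attach a part-of-speech label to each token by lexicon lookup."""
    tagged = []
    for tok in tokens:
        if not any(ch.isalnum() for ch in tok):
            label = "PUNCT"
        else:
            label = LEXICON.get(tok.lower(), "UNK")
        tagged.append((tok, label))
    return tagged


tagged = tag(tokenize(normalize("The   tool extracts\nconcepts.")))
print(tagged)
```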

11
4.1 Generating Semantic Structures(2/2)
  • In OntoExtract the initially analysed and annotated
    text is transformed into an internal
    representation that makes use of a variety of
    linguistic analysis steps to arrive at an initial
    interpretation of what is written.
  • Representation contains the original text, its
    annotations, but also the resolutions performed
    on it.
  • The semantic structures are then translated
    into a more formal representation.

12
4.2 Generating Ontologies from Textual Resources
  • How can the translation from linguistics into
    formalisms be done properly?
  • problem of representation level: what knowledge
    should be represented at the ontology level vs. the
    fact level (what represents an instance vs. a concept)
  • problem of dealing with inheritance
  • consistency between extracted ontologies and
    their truth within specific domains
  • Ontologies are extracted from single documents
    taken from the web (concepts are extracted and
    created). These are set into relation with each
    other, augmented with properties, and found
    instances are hooked up to them.
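
To make the instance/concept split and the inheritance issue concrete, the sketch below keeps sub-class links (ontology level) apart from instance links (fact level) and derives every class an instance belongs to by following the sub-class links transitively. The data structures, function names and example terms are assumptions chosen for illustration, not OntoExtract's internal representation.

```python
def ancestors(cls, parents):
    """Return all superclasses of cls, following sub-class links transitively."""
    seen = set()
    stack = [cls]
    while stack:
        for parent in parents.get(stack.pop(), ()):
            if parent not in seen:  # the seen set also guards against cycles
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(instance, instance_of, parents):
    """Return the direct class of an instance plus every inherited class."""
    direct = instance_of[instance]
    return {direct} | ancestors(direct, parents)


# Ontology level: concept -> set of direct superclasses (hypothetical terms).
parents = {"Editor": {"Tool"}, "Tool": {"Artifact"}}
# Fact level: instance -> its direct concept (hypothetical terms).
instance_of = {"MyEditor": "Editor"}

types = inferred_types("MyEditor", instance_of, parents)
print(sorted(types))
```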

13
4.3 Visualization and Navigation
  • The exported semantic network structures can be
    run through a graph layout algorithm in order to
    generate visualizations (with CCA viewer).
  • Intercluster relationships are used to navigate
    from one cluster to another by relevant concepts.
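
A minimal sketch of the navigation idea, assuming concepts have already been assigned to clusters: every relation whose two concepts lie in different clusters becomes a link from one cluster to the other, labelled with the concepts involved. The cluster assignment, the edge list and the function intercluster_links are invented for this example. The graph layout step and the CCA viewer are not modelled here.

```python
from collections import defaultdict


def intercluster_links(edges, cluster_of):
    """Map (source cluster, target cluster) to the concept pairs linking them."""
    links = defaultdict(set)
    for a, b in edges:
        ca, cb = cluster_of[a], cluster_of[b]
        if ca != cb:
            # Record both directions so navigation works either way.
            links[(ca, cb)].add((a, b))
            links[(cb, ca)].add((b, a))
    return dict(links)


# Hypothetical semantic network and cluster assignment.
edges = [("ontology", "RDF"), ("RDF", "Sesame"), ("ontology", "concept")]
cluster_of = {
    "ontology": "modelling",
    "concept": "modelling",
    "RDF": "storage",
    "Sesame": "storage",
}

links = intercluster_links(edges, cluster_of)
print(links)
```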

14
5. Issues in Using Automated Text Extraction for
Ontology Building using IE on Web Resources
  • The Internet poses an additional challenge:
    the multi-cultural background of the authors.
  • Generated ontologies can be used as seed
    ontologies, automatically generated from a
    variety of user-defined documents.
