Active Ontology (ActOn): A Semantic Information Integration Approach for Dynamic Distributed Information Sources

1
Active Ontology (ActOn): A Semantic Information
Integration Approach for Dynamic Distributed
Information Sources
  • Wei Xing, Oscar Corcho, Carole Goble, Marios
    Dikaiakos
  • Information Management Group
  • School of Computer Science
  • University of Manchester

2
Outline
  • Motivation
  • Introduction to ActOn
  • Overview of ActOn architecture
  • Prototype implementation
  • ActOn-based information service deployment on
    EGEE
  • Experimental evaluation
  • Conclusions and future work

3
Motivation
  • Providing just-enough, just-in-time information
    for a large-scale, dynamic distributed system:
  • Workflow management
  • Meta-scheduling
  • Resource discovery
  • Service level agreements
  • Problems with the existing information services:
  • Expensive
  • Incorrect
  • Lack of meaningful information
  • Poor support for complex queries

4
Challenges
  • Information about most Grid entities needs to
    be updated very frequently.
  • The latency of retrieving a piece of information
    may be longer than the interval between its
    changes.
  • Information about a Grid entity consists of
    multiple attributes, whose values are normally
    obtained from different, geographically
    distributed information sources.
  • Heterogeneous information sources become
    available and unavailable dynamically.

5
Proposed solution
  • Active Ontology (ActOn):
  • An ontology-based information integration
    approach that can be used to generate and
    maintain up-to-date metadata for a dynamic,
    large-scale distributed system.
  • Graph data model
  • Metadata cache
  • Intelligent semantic metadata management
  • Lifetime-control
  • Dynamic information source selection
  • Updating-on-demand

6
Related work (1)
  • Existing Grid Information Services
  • MDS
  • Monitoring-based information service
  • Expensive
  • LDAP data model (weak query capabilities)
  • RGMA
  • Relational database servlets
  • Poor performance in a large-scale, dynamic
    distributed environment
  • Rigid with respect to change (database schema)
  • BDII
  • MDS-based information service
  • Cached information in each service (local or
    regional)
  • High performance

7
Comparison
8
Related Work (2)
  • Limitations of the traditional ontology-based
    information integration approach
  • Not prepared for highly dynamic information
  • Not fault-tolerant and robust to changes in the
    information source availability
  • Examples
  • TSIMMIS, Information Manifold, InfoSleuth.

9
Comparison
10
Introduction to ActOn
  • An ontology-based meta-aggregation information
    service
  • Domain ontologies: Grid ontology
  • Service-oriented, within the S-OGSA architecture
  • Main functionalities:
  • Integrate diverse data sources
  • Extremely dynamic (appear, become temporarily
    unavailable, disappear) in a Grid environment
  • Generate and maintain up-to-date metadata
  • Dynamic, large-scale distributed systems
  • Provide a uniform interface
  • Access to and discovery of metadata

11
Overview of ActOn Architecture
12
System Components
  • ActOn comprises:
  • Software components
  • Metadata Scheduler (Msch)
  • Information Source Selector (ISS)
  • Metadata Cache Service (McS)
  • A set of Wrappers
  • Knowledge components (Ontologies)
  • Domain ontologies
  • Information source ontologies

13
Metadata Scheduler (Msch)
  • It is designed to apply an update-on-demand
    policy to cached metadata.
  • That is, cached metadata is updated only when a
    query finds it stale, which avoids unnecessary
    updates.
  • We adopt event-driven mechanisms to implement
    that policy.
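
The update-on-demand policy can be illustrated with a minimal sketch. This is not the ActOn implementation (the prototype is a Java application); the class and parameter names (`OnDemandCache`, `fetch`, `clock`) and the single fixed lifetime are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CachedValue:
    value: object
    fetched_at: float   # time the value was obtained from its source
    lifetime: float     # seconds the value is considered fresh


class OnDemandCache:
    """Refreshes an entry only when a query finds it missing or stale."""

    def __init__(self, fetch: Callable[[str], object], lifetime: float,
                 clock: Callable[[], float]):
        self._fetch = fetch        # hypothetical call to an information source
        self._lifetime = lifetime  # assumed fixed here, for simplicity
        self._clock = clock
        self._entries: Dict[str, CachedValue] = {}
        self.fetch_count = 0

    def query(self, key: str) -> object:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is None or now - entry.fetched_at >= entry.lifetime:
            # Stale or absent: this is the only place an update happens.
            entry = CachedValue(self._fetch(key), now, self._lifetime)
            self._entries[key] = entry
            self.fetch_count += 1
        return entry.value


# Usage with a fake clock so the behaviour is deterministic.
now = [0.0]
cache = OnDemandCache(fetch=lambda key: f"{key}@{now[0]}", lifetime=60.0,
                      clock=lambda: now[0])
first = cache.query("ce01.freeCPUs")
now[0] = 30.0
second = cache.query("ce01.freeCPUs")   # still fresh, served from the cache
now[0] = 90.0
third = cache.query("ce01.freeCPUs")    # stale, refetched on demand
```

No background refresh runs in this sketch: a value that nobody queries is never refetched, which is the saving the policy aims for.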

14
Defined events
  • We have defined three types of events that can
    trigger the update process:
  • Application-specific events
  • Specific information needs to be updated at a
    certain time.
  • Query events
  • A query arrives for metadata whose lifetime has
    expired.
  • System-related events
  • Related events occur that may cause the metadata
    to change.
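
As a rough sketch, the three event types could be modelled as an enumeration with one decision function. The names `EventType`, `Event`, and `triggers_update` are hypothetical, and treating every application-specific and system-related event as a trigger is a simplifying assumption rather than something the slides specify.

```python
from dataclasses import dataclass
from enum import Enum, auto


class EventType(Enum):
    APPLICATION_SPECIFIC = auto()  # an application needs fresh data at a set time
    QUERY = auto()                 # a query touches cached metadata
    SYSTEM_RELATED = auto()        # something happened that may change the data


@dataclass
class Event:
    type: EventType
    property_name: str
    expired: bool = False   # only meaningful for QUERY events


def triggers_update(event: Event) -> bool:
    """Return True if the event should start an update of the property."""
    if event.type is EventType.QUERY:
        # A query triggers an update only when the lifetime has run out.
        return event.expired
    # Assumption: the other two event types always trigger an update.
    return True
```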

15
Information Source Selector (ISS)
  • The Information Source Selector (ISS) is used to
    find the most suitable information source from
    the set of available sources, which are described
    as instances of the Information Source Ontology.
    Information sources can be any system (database,
    file, service, etc.) that contains relevant
    information.

16
Information source store
  • Define a class for each kind of information
    source in the information source ontology, along
    with its properties.
  • Generate instances of the information sources
    and add them to the information source store.
  • Query the store for an information source that
    can provide a specific piece of information.
  • Select the information source.
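
The four steps above can be sketched with plain Python objects standing in for ontology instances. The slides do not say how "most suitable" is ranked, so the `latency_ms` criterion is an assumption, as are the class name, field names, and sample values.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class InformationSource:
    name: str
    kind: str                  # class in the information source ontology
    provides: FrozenSet[str]   # properties this source can answer
    available: bool
    latency_ms: float          # hypothetical ranking criterion


def select_source(store: List[InformationSource],
                  prop: str) -> Optional[InformationSource]:
    """Pick the fastest available source that provides the property."""
    candidates = [s for s in store if s.available and prop in s.provides]
    return min(candidates, key=lambda s: s.latency_ms, default=None)


# Hypothetical store contents; the values are made up for illustration.
store = [
    InformationSource("bdii-1", "BDII", frozenset({"supportsVO", "freeCPUs"}), True, 50.0),
    InformationSource("rgma-1", "RGMA", frozenset({"freeCPUs", "runtimeEnv"}), True, 200.0),
    InformationSource("bdii-2", "BDII", frozenset({"freeCPUs"}), False, 1.0),
]
chosen = select_source(store, "freeCPUs")
```

Because availability is checked at selection time, a source that disappears is skipped without any change to the query that asked for the property.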

17
Metadata Cache Service (McS)
  • The Metadata Cache Service (McS) stores and
    manages the metadata obtained from the
    information sources, together with its timestamp
    and lifetime information, so that it can check
    whether property values are still valid (i.e.,
    lifetime control) when it receives a query event
    that involves them.

18
Lifetime control for the semantic metadata
  • Control the lifetime of metadata at the property
    level
  • No fixed lifetime
  • No single lifetime for metadata with multiple
    properties
  • Property lifetime control
  • House-keeping service
  • Managing the lifetime of the metadata
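
A minimal sketch of property-level lifetime control follows, in which each property of an entity carries its own timestamp and lifetime. Whether the real house-keeping service deletes expired values or only marks them is not stated in the slides; deletion is an assumption here, and all names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class PropertyValue:
    value: object
    timestamp: float   # when the value was obtained
    lifetime: float    # seconds; differs from property to property

    def is_valid(self, now: float) -> bool:
        return now - self.timestamp < self.lifetime


@dataclass
class MetadataRecord:
    entity: str
    properties: Dict[str, PropertyValue] = field(default_factory=dict)


def house_keeping(records: List[MetadataRecord], now: float) -> List[Tuple[str, str]]:
    """Drop expired property values and report which ones were dropped."""
    removed = []
    for record in records:
        expired = [n for n, p in record.properties.items() if not p.is_valid(now)]
        for name in expired:
            del record.properties[name]
            removed.append((record.entity, name))
    return removed


# One entity, two properties with very different lifetimes.
ce = MetadataRecord("ce01", {
    "supportsVO": PropertyValue("biomed", timestamp=0.0, lifetime=86400.0),
    "freeCPUs": PropertyValue(120, timestamp=0.0, lifetime=60.0),
})
dropped = house_keeping([ce], now=120.0)
```

After the sweep only the slowly changing property remains cached, so a later query for `freeCPUs` would have to go back to an information source.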

19
The house-keeping service
20
Wrappers
  • The query translator
  • General query issued through the GUI
  • Translation into a source-specific query language
  • Ontology-based
  • Schema mapping
  • Define the access API and access point (instances)
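
To make the idea of a query translator concrete, the sketch below turns one general query (a set of equality conditions on ontology properties) into an LDAP-style filter and a parameterised SQL statement. The attribute, column, and table names are placeholders, not the real schemas of any EGEE information service, and the functions are illustrative rather than part of ActOn.

```python
from typing import Dict, List, Tuple

# Hypothetical schema mappings: ontology property -> source-specific name.
LDAP_MAPPING = {"supportsVO": "voAttribute", "freeCPUs": "freeCpuAttribute"}
SQL_MAPPING = {"supportsVO": "vo_column", "freeCPUs": "free_cpu_column"}


def _escape_ldap(value: str) -> str:
    # Escape the characters that RFC 4515 reserves inside filter values.
    for char, code in (("\\", r"\5c"), ("*", r"\2a"), ("(", r"\28"), (")", r"\29")):
        value = value.replace(char, code)
    return value


def to_ldap_filter(conditions: Dict[str, str]) -> str:
    """Build a conjunctive LDAP filter from equality conditions."""
    parts = [f"({LDAP_MAPPING[prop]}={_escape_ldap(value)})"
             for prop, value in conditions.items()]
    return "(&" + "".join(parts) + ")"


def to_sql(conditions: Dict[str, str], table: str) -> Tuple[str, List[str]]:
    """Build a parameterised SELECT; values travel separately as parameters."""
    where = " AND ".join(f"{SQL_MAPPING[prop]} = ?" for prop in conditions)
    return f"SELECT * FROM {table} WHERE {where}", list(conditions.values())


query = {"supportsVO": "biomed"}
ldap_filter = to_ldap_filter(query)
sql, params = to_sql(query, "ce_table")
```

Only the two mapping dictionaries change from one source to the next, which mirrors the role the slides give to ontology-based schema mapping.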

21
Knowledge components
  • Domain ontology
  • Define the concepts and properties of a specific
    domain
  • In our example: the Grid ontology
  • Information source ontology
  • Define the concepts and properties of each kind
    of information source
  • In our example: the EGEE information source
    ontology

22
Overview of the ontologies
23
Example 1: Grid domain ontology
  • The Core Grid ontology

24
Example 2: the information source ontology
  • EGEE information source ontology

25
Prototype Implementation
  • Java application using the Jena RDF API

26
ActOn Deployment on EGEE
  • The EGEE production testbed includes 200 sites,
    35K CPUs, and 13 petabytes of storage, and runs
    30K-70K jobs per day on behalf of 100 VOs.
  • The User Interface (UI) that we used to access
    the EGEE Grid is ui.tier2.hep.manchester.ac.uk
    at the University of Manchester.

27
Experimental evaluation
  • Objectives of the experimental evaluation:
  • A fair, systematic approach to measuring the
    information quality of different Grid
    information services
  • Evaluate the information quality of two EGEE
    information services (BDII and RGMA) and ActOn.

28
Experiment set-up
  • Selected two metrics commonly used in information
    retrieval:
  • Precision: the proportion of relevant information
    retrieved, out of all the information retrieved.
  • Recall: the proportion of relevant information
    that is retrieved, out of all the relevant
    information available.
  • EGEE testbed
  • Three Grid nodes: Manchester, UCY, and Belgrade
  • A set of Unix shell scripts
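
The two metrics defined above can be computed from sets of result identifiers, as in this small sketch. The function name and the sample CE names are made up for illustration, and returning 0.0 for an empty set is a convention chosen here, not one taken from the slides.

```python
from typing import Set, Tuple


def precision_recall(retrieved: Set[str], relevant: Set[str]) -> Tuple[float, float]:
    """Precision = hits / retrieved, recall = hits / relevant."""
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical result sets for one query against one information service.
retrieved = {"ce-a", "ce-b", "ce-c", "ce-d"}   # what the service returned
relevant = {"ce-a", "ce-b", "ce-e"}            # what actually matches
precision, recall = precision_recall(retrieved, relevant)
```

In this made-up case two of the four returned CEs are correct and one matching CE is missed, so precision is 2/4 and recall is 2/3.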

29
Queries used in Evaluation
  • Query 1: Find all the Computing Elements (CEs)
    that support the BIOMED Virtual Organisation
    (VO).
  • Query 2: Find all the CEs that support the BIOMED
    VO and have more than 100 CPUs available.
  • Query 3: Find all the CEs that support the MPI
    running environment.
  • Query 4: Find all the CEs that support the BIOMED
    VO, have more than 100 CPUs available, and
    support the MPI running environment.
  • Query 5: Find all the CEs where GATE (Geant4
    Application for Tomographic Emission) can be run.
  • Query 6: Find all the CEs that support the BIOMED
    VO, have more than 100 CPUs available, and where
    GATE can be run.
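
Queries 2, 4, and 6 are conjunctions of the simpler conditions, which the following sketch expresses as composable predicates over in-memory records. The record layout, the helper names, and the four sample CEs are hypothetical; the real evaluation ran against live information services, not a local list.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class ComputingElement:
    name: str
    vos: FrozenSet[str]
    free_cpus: int
    runtime_envs: FrozenSet[str]


Predicate = Callable[[ComputingElement], bool]


def supports_vo(vo: str) -> Predicate:
    return lambda ce: vo in ce.vos


def more_free_cpus_than(n: int) -> Predicate:
    return lambda ce: ce.free_cpus > n


def supports_env(env: str) -> Predicate:
    return lambda ce: env in ce.runtime_envs


def find(ces: List[ComputingElement], *predicates: Predicate) -> List[str]:
    """Return the names of the CEs that satisfy every predicate."""
    return [ce.name for ce in ces if all(p(ce) for p in predicates)]


# Made-up sample data, for illustration only.
ces = [
    ComputingElement("ce-a", frozenset({"biomed"}), 150, frozenset({"MPI"})),
    ComputingElement("ce-b", frozenset({"biomed"}), 40, frozenset({"MPI"})),
    ComputingElement("ce-c", frozenset({"atlas"}), 300, frozenset({"MPI"})),
    ComputingElement("ce-d", frozenset({"biomed"}), 200, frozenset()),
]
query1 = find(ces, supports_vo("biomed"))
query2 = find(ces, supports_vo("biomed"), more_free_cpus_than(100))
query4 = find(ces, supports_vo("biomed"), more_free_cpus_than(100), supports_env("MPI"))
```

Each condition may be answerable by a different information source, which is why the compound queries are the harder cases for a single service.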

30
Query Translation
31
Results of Query 1 in different ISs
32
Results
33
Remarks
  • BDII has weak query capabilities
  • RGMA is not able to relate information available
    in different tables
  • RGMA is very sensitive to the registration and
    availability of information providers at a given
    point in time
  • Some complex queries cannot be answered by one
    type of information service in isolation

34
Conclusions and future work
  • Active Ontology (ActOn)
  • An ontology-based information integration
    approach
  • ActOn overcomes some of the limitations of
    existing information services:
  • Information quality
  • Availability and robustness
  • Response time
  • Future work
  • Information quality validation
  • Algorithm for dynamic lifetime prediction

35
ActOn publications
  • W. Xing, O. Corcho, C. Goble, and M. D.
    Dikaiakos, "A Grid Information Service based
    on an Intelligent Information Integration
    Architecture." European Semantic Web Conference
    2007 (ESWC-2007), poster, June 2007.
  • W. Xing, O. Corcho, C. Goble, and M. D.
    Dikaiakos, "Active Ontology: An Information
    Integration Approach for Dynamic Information
    Sources." 2nd International Workshop on
    Semantic and Grid Computing (SGC-07), in
    conjunction with the International Symposium on
    Parallel and Distributed Processing and
    Applications (ISPA-2007) (accepted for
    publication).
  • W. Xing, O. Corcho, C. Goble, and M. D.
    Dikaiakos, "Information Quality Evaluation for
    Grid Information Services." CoreGRID Symposium
    2007, in conjunction with Euro-Par 2007 (accepted
    for publication).
  • W. Xing, O. Corcho, C. Goble, and M. D.
    Dikaiakos, "ActOn: A Semantic Information
    Service for EGEE." 8th IEEE/ACM International
    Conference on Grid Computing (Grid 2007)
    (accepted for publication; 22% acceptance rate).

36
  • Thank you for your attention!
  • Comments and questions!
