Title: Active Ontology (ActOn): A Semantic Information Integration Approach for Dynamic Distributed Information Sources
1. Active Ontology (ActOn): A Semantic Information Integration Approach for Dynamic Distributed Information Sources
- Wei Xing, Oscar Corcho, Carole Goble, Marios Dikaiakos
- Information Management Group
- School of Computer Science
- University of Manchester
2. Outline
- Motivation
- Introduction to ActOn
- Overview of the ActOn architecture
- Prototype implementation
- ActOn-based information service deployment on EGEE
- Experiment evaluation
- Conclusions and future work
3. Motivation
- Providing just enough, just-in-time information for a large-scale, dynamic distributed system
  - Workflow management
  - Meta-scheduling
  - Resource discovery
  - Service level agreement
- Problems with the existing information services
  - Expensive
  - Incorrect
  - Meaningful
  - Complex query
4. Challenges
- Information about most Grid entities needs to be updated very frequently
- The latency of retrieving a piece of information may be longer than the interval between its changes
- Information about a Grid entity consists of multiple attributes, whose values are normally obtained from different, geographically distributed information sources
- Heterogeneous information sources become available and unavailable dynamically
5. Proposed solution
- Active Ontology (ActOn)
  - An ontology-based information integration approach that can be used to generate and maintain up-to-date metadata for a dynamic, large-scale distributed system
- Graphic data model
- Metadata cache
- Intelligent semantic metadata management
- Lifetime control
- Dynamic information source selection
- Updating on demand
6. Related work (1)
- Existing Grid information services
  - MDS
    - Monitoring-based information service
    - Expensive
    - LDAP data model (weak query support)
  - RGMA
    - Relational database servlets
    - Poor performance in a large-scale, dynamic distributed environment
    - Rigid with respect to change (database schema)
  - BDII
    - MDS-based information service
    - Cached information in each service (local or regional)
    - High performance
7. Comparison
8. Related Work (2)
- Limitations of the traditional ontology-based information integration approach
  - Not prepared for highly dynamic information
  - Not fault-tolerant or robust to changes in information source availability
- Examples
  - TSIMMIS, Information Manifold, InfoSleuth
9. Comparison
10. Introduction to ActOn
- An ontology-based meta-aggregation information service
  - Domain ontologies: Grid ontology
  - Service-oriented, within the S-OGSA architecture
- Main functionalities
  - Integrate diverse data sources
    - Extremely dynamic (appear, become temporarily unavailable, disappear) in a Grid environment
  - Generate and maintain up-to-date metadata
    - Dynamic, large-scale distributed systems
  - Provide a uniform interface
    - Access to and discovery of metadata
11. Overview of ActOn Architecture
12. System Components
- ActOn comprises
  - Software components
    - Metadata Scheduler (Msch)
    - Information Source Selector (ISS)
    - Metadata Cache Service (McS)
    - A set of Wrappers
  - Knowledge components (ontologies)
    - Domain ontologies
    - Information source ontologies
13. Metadata Scheduler (Msch)
- It is designed to apply an update-on-demand policy to cached metadata
- That is, cached metadata is refreshed only when a query finds it stale, which avoids unnecessary updates
- We adopt event-driven mechanisms to implement that policy
14. Defined events
- We have defined three types of events that can trigger the update process:
  - Application-specific events
    - An application needs specific information to be updated at a certain time
  - Query events
    - A query involves metadata whose lifetime has expired
  - System-related events
    - Related events occur in the system that may change the information
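The update-on-demand policy and the three event types can be sketched roughly as follows. This is only an illustration under stated assumptions, not the ActOn implementation: the names `EventType`, `CachedValue`, and `MetadataScheduler` are hypothetical, and the choice that application-specific and system-related events always force a refresh is an assumption made for the example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict


class EventType(Enum):
    APPLICATION = auto()  # an application needs fresh information at a given time
    QUERY = auto()        # a query involves a cached value
    SYSTEM = auto()       # something happened in the system that may change the value


@dataclass
class CachedValue:
    value: object
    fetched_at: float  # seconds, on the same clock as `now`
    lifetime: float    # seconds for which the value is considered valid

    def is_stale(self, now: float) -> bool:
        return now - self.fetched_at > self.lifetime


class MetadataScheduler:
    """Hypothetical scheduler that refreshes a cached value only when an event requires it."""

    def __init__(self, fetch: Callable[[str], object], lifetime: float):
        self._fetch = fetch          # callable that retrieves a value from an information source
        self._lifetime = lifetime
        self._cache: Dict[str, CachedValue] = {}
        self.refresh_count = 0

    def _refresh(self, key: str, now: float) -> None:
        self._cache[key] = CachedValue(self._fetch(key), now, self._lifetime)
        self.refresh_count += 1

    def handle(self, event: EventType, key: str, now: float) -> object:
        entry = self._cache.get(key)
        if event is EventType.QUERY:
            # Update on demand: refresh only if the value is missing or stale.
            if entry is None or entry.is_stale(now):
                self._refresh(key, now)
        else:
            # Assumption: application and system events always force a refresh.
            self._refresh(key, now)
        return self._cache[key].value
```

With a lifetime of 10 seconds, two query events 5 seconds apart would reach the information source only once; the second query is served from the cache.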
15. Information Source Selector (ISS)
- The Information Source Selector (ISS) is used to
find the most suitable information source from
the set of available sources, which are described
as instances of the Information Source Ontology.
Information sources can be any system (database,
file, service, etc.) that contains relevant
information.
16. Information source store
- Define a class for each kind of information source in the information source ontology, along with its properties
- Generate the instances of information sources in the information source store
- Query the store for an information source that can provide a specific piece of information
- Select the information source
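The query-and-select steps above might look something like the following sketch. It uses a plain in-memory list in place of an ontology store, and every name in it (`InformationSource`, `select_source`, the `latency_ms` ranking criterion) is a hypothetical stand-in; the slides do not say how the ISS ranks candidate sources.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Optional


@dataclass(frozen=True)
class InformationSource:
    """Hypothetical stand-in for an instance of the information source ontology."""
    name: str
    kind: str                 # e.g. "BDII" or "RGMA"
    provides: FrozenSet[str]  # property names this source can supply
    available: bool           # sources may appear and disappear dynamically
    latency_ms: float         # assumed ranking criterion, not taken from the slides


def select_source(store: Iterable[InformationSource], prop: str) -> Optional[InformationSource]:
    """Return the available source that provides `prop` with the lowest latency, if any."""
    candidates = [s for s in store if s.available and prop in s.provides]
    return min(candidates, key=lambda s: s.latency_ms, default=None)
```

Because availability is checked at selection time, a source that disappears is simply skipped on the next request, and the caller gets `None` when no source can currently supply the property.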
17. Metadata Cache Service (McS)
- The Metadata Cache Service (McS) stores and manages the metadata obtained from the information sources, together with its timestamp and lifetime information, so that it can check whether the cached property values are still valid (lifetime control) when it receives a query event that involves them.
18. Lifetime control for the semantic metadata
- Control the lifetime of metadata at the property level
  - No fixed lifetime
  - No single lifetime shared by all properties of a multi-property metadata item
  - Per-property lifetime control
- House-keeping service
  - Manages the lifetime of the metadata
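Property-level lifetime control could be sketched as below. This is an illustrative assumption rather than the ActOn design: `PropertyEntry`, `MetadataCache`, and `house_keep` are hypothetical names, and the house-keeping step is assumed here to evict expired entries, which the slides do not specify.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class PropertyEntry:
    value: object
    timestamp: float  # when the value was obtained
    lifetime: float   # validity period for this property only

    def is_valid(self, now: float) -> bool:
        return now - self.timestamp <= self.lifetime


class MetadataCache:
    """Hypothetical cache keyed by (subject, property), each with its own lifetime."""

    def __init__(self) -> None:
        self._entries: Dict[Tuple[str, str], PropertyEntry] = {}

    def put(self, subject: str, prop: str, value: object, now: float, lifetime: float) -> None:
        self._entries[(subject, prop)] = PropertyEntry(value, now, lifetime)

    def get_valid(self, subject: str, prop: str, now: float) -> Optional[object]:
        """Return the cached value if it is still within its lifetime, else None."""
        entry = self._entries.get((subject, prop))
        if entry is not None and entry.is_valid(now):
            return entry.value
        return None

    def house_keep(self, now: float) -> List[Tuple[str, str]]:
        """Assumed house-keeping behaviour: evict expired entries and report their keys."""
        expired = [key for key, entry in self._entries.items() if not entry.is_valid(now)]
        for key in expired:
            del self._entries[key]
        return expired
```

Giving a slowly changing property (such as a total CPU count) a long lifetime and a fast-changing one (such as free CPUs) a short lifetime lets the two expire independently, even though they describe the same entity.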
19. The house-keeping service
20. Wrappers
- The query translator
  - General query from the GUI
  - Translation to a particular query language
- Ontology-based
  - Schema mapping
  - Define the access API and access point (instances)
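As a rough illustration of query translation through a schema mapping, the sketch below turns one generic query into an LDAP search filter (for an LDAP-based service) and into a parameterised SQL statement (for a relational one). All property names, attribute names, and the table name are hypothetical placeholders; they are not the actual EGEE schemas, and the wrappers in ActOn may work quite differently.

```python
from typing import Dict, List, Sequence, Tuple

# A generic query is a conjunction of (ontology property, operator, value) conditions.
Condition = Tuple[str, str, object]

# Hypothetical schema mappings from ontology properties to source-specific names.
LDAP_MAPPING: Dict[str, str] = {"supportsVO": "voAttr", "freeCPUs": "freeCpuAttr"}
SQL_MAPPING: Dict[str, str] = {"supportsVO": "vo", "freeCPUs": "free_cpus"}

# LDAP filters have no strict "<" or ">", so only these operators are supported here.
_SUPPORTED_OPS = {"=", ">=", "<="}


def _escape_ldap(value: object) -> str:
    """Escape characters that are special in LDAP filter values (RFC 4515)."""
    text = str(value)
    for char, code in (("\\", r"\5c"), ("*", r"\2a"), ("(", r"\28"), (")", r"\29"), ("\0", r"\00")):
        text = text.replace(char, code)
    return text


def _check(conditions: Sequence[Condition]) -> None:
    if not conditions:
        raise ValueError("at least one condition is required")
    for _, op, _ in conditions:
        if op not in _SUPPORTED_OPS:
            raise ValueError(f"unsupported operator: {op}")


def to_ldap_filter(conditions: Sequence[Condition], mapping: Dict[str, str]) -> str:
    _check(conditions)
    parts = [f"({mapping[prop]}{op}{_escape_ldap(value)})" for prop, op, value in conditions]
    return parts[0] if len(parts) == 1 else "(&" + "".join(parts) + ")"


def to_sql(conditions: Sequence[Condition], mapping: Dict[str, str], table: str) -> Tuple[str, List[object]]:
    """Return a statement with '?' placeholders plus its parameters.

    The table and column names come from the trusted mapping, never from user input.
    """
    _check(conditions)
    clauses = [f"{mapping[prop]} {op} ?" for prop, op, _ in conditions]
    params = [value for _, _, value in conditions]
    return f"SELECT * FROM {table} WHERE " + " AND ".join(clauses), params
```

A condition such as "more than 100 CPUs" would be written as `("freeCPUs", ">=", 101)` in this sketch, since integer-valued LDAP comparisons only offer `>=` and `<=`.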
21. Knowledge components
- Domain ontology
  - Defines the concepts and properties of a specific domain
  - In our example: the Grid ontology
- Information source ontology
  - Defines the concepts and properties of each kind of information source
  - In our example: the EGEE information source ontology
22. Overview of the ontologies
23. Example 1: Grid domain ontology
24. Example 2: the information source ontology
- EGEE information source ontology
25. Prototype Implementation
- Java application using the Jena RDF API
26. ActOn Deployment on EGEE
- The EGEE production testbed includes 200 sites, 35K CPUs, and 13 petabytes of storage, and runs 30K-70K jobs per day on behalf of 100 VOs
- The User Interface (UI) that we used to access the EGEE Grid is ui.tier2.hep.manchester.ac.uk at the University of Manchester
27. Experiment evaluation
- Objectives of the experiment evaluation
  - A fair, systematic approach to measuring the information quality of different Grid information services
  - Evaluate the information quality of two EGEE information services (BDII and RGMA) and ActOn
28. Experiment set-up
- Selected two metrics commonly used in information retrieval
  - Precision: the proportion of relevant information retrieved, out of all the information retrieved
  - Recall: the proportion of relevant information that is retrieved, out of all the relevant information available
- EGEE testbed
  - Three Grid nodes: Manchester, UCY, and Belgrade
- A set of Unix shell scripts
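The two metrics defined above can be computed over sets of result identifiers, as in this small sketch. The function name `precision_recall` and the convention of returning 0.0 for an empty set are choices made for the example; the evaluation itself used Unix shell scripts, which are not reproduced here.

```python
from typing import Hashable, Iterable, Tuple


def precision_recall(retrieved: Iterable[Hashable], relevant: Iterable[Hashable]) -> Tuple[float, float]:
    """Compute (precision, recall) for one query.

    precision = |retrieved AND relevant| / |retrieved|
    recall    = |retrieved AND relevant| / |relevant|
    An empty denominator yields 0.0 (a convention chosen for this sketch).
    """
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall
```

For instance, if a service returns four CEs of which two are among three truly relevant CEs, precision is 2/4 and recall is 2/3.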
29. Queries used in Evaluation
- Query 1: Find all the Computing Elements (CEs) that support the BIOMED Virtual Organisation (VO).
- Query 2: Find all the CEs that support the BIOMED VO and have more than 100 CPUs available.
- Query 3: Find all the CEs that support the MPI running environment.
- Query 4: Find all the CEs that support the BIOMED VO, have more than 100 CPUs available, and support the MPI running environment.
- Query 5: Find all the CEs where GATE (Geant4 Application for Tomographic Emission) can be run.
- Query 6: Find all the CEs that support the BIOMED VO, have more than 100 CPUs available, and where GATE can be run.
30. Query Translation
31. Results of Query 1 in different ISs
32. Results
33. Remarks
- BDII has weak query capabilities
- RGMA is not able to relate information available in different tables
- RGMA is very sensitive to the registration and availability of information providers at a given point in time
- Some complex queries cannot be answered by one type of information service in isolation
34. Conclusions and future work
- Active Ontology (ActOn)
  - An ontology-based information integration approach
  - ActOn overcomes some of the limitations
    - Information quality
    - Availability and robustness
    - Response time
- Future work
  - Information quality validation
  - Algorithm for dynamic lifetime prediction
35. ActOn publications
- W. Xing, O. Corcho, C. Goble, and M. D. Dikaiakos, "A Grid Information Service based on an Intelligent Information Integration Architecture." European Semantic Web Conference 2007 (ESWC-2007), poster, June 2007.
- W. Xing, O. Corcho, C. Goble, and M. D. Dikaiakos, "Active Ontology: An Information Integration Approach for Dynamic Information Sources", in the 2nd International Workshop on Semantic and Grid Computing (SGC-07), in conjunction with the International Symposium on Parallel and Distributed Processing and Applications (ISPA-2007) (accepted for publication).
- W. Xing, O. Corcho, C. Goble, and M. D. Dikaiakos, "Information Quality Evaluation for Grid Information Services." CoreGRID Symposium 2007, in conjunction with EuroPar 2007 (accepted for publication).
- W. Xing, O. Corcho, C. Goble, and M. D. Dikaiakos, "ActOn: A Semantic Information Service for EGEE", the 8th IEEE/ACM International Conference on Grid Computing (Grid 2007) (accepted for publication; 22% acceptance rate).
36. Thank you for your attention!
- Comments and questions!