Title: Semantic Grid: From Concepts to Implementation
1Semantic Grid: From Concepts to Implementation
- Nguyen Thanh Vu
- Hoang Song Cam Thach
- Cu Nguyen Phuong Ha
2Outline
- Introduction
- Semantic Web
- S-OGSA
- Implementation (e-Science, myGrid)
3What is the Semantic Grid?
- An extension of the current Grid in which
information and services are given well-defined
and explicitly represented meaning, so that it
can be shared and used by humans and machines,
better enabling them to work in cooperation.
4Why do we need the Semantic Grid?
- "It is a truth universally acknowledged, that an application in possession of good middleware, must be in want of meaningful metadata."
- - Prof. C. Goble
5Why do we need the Semantic Grid?
- Example: consider a machine's operating system that is described as SunOS or Linux. To query for a machine that is Unix compatible, a user has to either
- - 1. Explicitly incorporate the Unix compatibility concept into the request requirements by requesting a disjunction of all Unix-variant operating systems, e.g., (OpSys = SunOS || OpSys = Linux), or
- - 2. Wait for all interesting resources to advertise their operating system as Unix as well as either Linux or SunOS, e.g., (OpSys = {SunOS, Unix}), and then express a match as set-membership of the desired Unix value in the OpSys value set, e.g., hasMember(OpSys, Unix).
6Why do we need the Semantic Grid?
- Example (cont.)
- Apply semantics
- - Knowledge base: SunOS and Linux are types of Unix operating system
- - Request: an operating system that is Unix compatible
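The effect of applying semantics to this example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `KNOWLEDGE_BASE` dictionary, the `is_a` and `match` helpers and the node names are hypothetical and do not belong to any Grid middleware.

```python
# Hypothetical knowledge base: each OS type maps to its direct parent type.
KNOWLEDGE_BASE = {
    "SunOS": "Unix",
    "Linux": "Unix",
    "Unix": "OperatingSystem",
    "Windows": "OperatingSystem",
}


def is_a(os_type, concept):
    """Return True if os_type equals concept or is a (transitive) subtype of it."""
    current = os_type
    while current is not None:
        if current == concept:
            return True
        current = KNOWLEDGE_BASE.get(current)
    return False


def match(resources, requested_concept):
    """Return names of resources whose advertised OpSys satisfies the request."""
    return [name for name, op_sys in resources.items()
            if is_a(op_sys, requested_concept)]


# Resources advertise only their concrete operating system (made-up names).
resources = {"node-a": "SunOS", "node-b": "Linux", "node-c": "Windows"}

# The request names the concept "Unix" directly; no disjunction is needed.
unix_nodes = match(resources, "Unix")
print(unix_nodes)
```

With the knowledge base in place, neither the request nor the advertisements have to enumerate Unix variants; adding a new variant only requires one new entry in the knowledge base.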
7Semantic Web
- Current Web (WWW)
- - Is a huge library of interlinked documents that are transferred by computers and presented to people.
- - Anyone can contribute to it.
- - Quality of information or even the persistence of documents cannot be generally guaranteed.
- - Contains a lot of information and knowledge, but machines usually serve only to deliver and present the content of documents describing the knowledge.
- - People have to connect all the sources of relevant information and interpret them themselves.
- Machines can process the content, but machines can't understand the content.
8Semantic Web
- Definition
- The Semantic Web is an extension of the
current web in which the semantics of information
and services on the web is defined, making it
possible for the web to understand and satisfy
the requests of people and machines to use the
web content.
- - Tim Berners-Lee
9Semantic Web
- Definition ( cont )
- The Semantic Web is an effort to enhance the current web so that computers can process the information presented on the WWW, interpret and connect it, and help humans find the knowledge they need.
10Semantic Web
- The Semantic Web is a project intended to provide a common framework that allows data to be shared and reused across application, enterprise, and community boundaries.
- It is led by the World Wide Web Consortium (W3C).
11Semantic Web Architecture (1)
- URI (Uniform Resource Identifier) is a string in a standardized form that uniquely identifies a resource.
- Unicode is a standard for encoding international character sets; it allows all human languages to be used (written and read) on the web in one standardized form.
12Semantic Web Architecture (2)
- The XML (Extensible Markup Language) layer ensures that a common syntax is used in the semantic web.
13Semantic Web Architecture (3)
- RDF stands for Resource Description Framework.
- RDF is a graphical formalism (with an XML syntax and semantics)
- - for representing metadata
- - for describing the semantics of information in a machine-accessible way
- Provides a simple data model based on subject-predicate-object triples
14RDF Data model
- Statements are <subject, predicate, object> triples
- - <Joe, hasFamilyName, Smith>
- Can be represented as a graph.
- Statements describe properties of resources
- A resource is any object that can be pointed to by a URI
- Properties themselves are also resources (URIs)
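The triple model can be illustrated without any RDF library. The sketch below stores statements as plain Python tuples and answers pattern queries over them; the `EX` namespace, the `find` helper and the second resource (Ann) are assumptions made for the illustration, not part of the slides.

```python
from collections import namedtuple

# A statement is a <subject, predicate, object> triple.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical namespace used only for this illustration.
EX = "http://www.example.org/"

graph = {
    Triple(EX + "Joe", EX + "hasFamilyName", "Smith"),
    Triple(EX + "Joe", EX + "worksWith", EX + "Ann"),
    Triple(EX + "Ann", EX + "hasFamilyName", "Jones"),
}


def find(graph, subject=None, predicate=None, obj=None):
    """Return the triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    )


# All properties of the resource Joe: the outgoing edges of that graph node.
joe_statements = find(graph, subject=EX + "Joe")

# Every family name in the graph, whichever resource it describes.
family_names = [t.object for t in find(graph, predicate=EX + "hasFamilyName")]
```

Because subjects and predicates are URIs, the same property name can be reused across independent data sets without ambiguity.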
15RDF Syntax
- RDF has an XML syntax that has a specific meaning
- - Every Description element describes a resource
- - Every attribute or nested element inside a Description is a property of that resource
- - We can refer to resources by URIs
16RDF Example
- English statement: http://www.example.org/index.html has a creation-date whose value is August 16, 1999
- Triple representation: ex:index.html exterms:creation-date "August 16, 1999"
- RDF graph representation
17RDF Example (cont)
RDF/XML syntax
18Semantic Web Architecture (4)
- RDFS (RDF Schema) extends the RDF vocabulary to allow describing taxonomies of classes and properties.
19RDFS ( cont)
- RDF does not give any special meaning to vocabulary such as subClassOf or type (supporting OO-style modelling).
- RDF Schema extends RDF with a schema vocabulary that allows you to define basic vocabulary terms and the relations between those terms
- - Class, type, subClassOf,
- - Property, subPropertyOf, range, domain
- It gives extra meaning to particular RDF predicates and resources
- This extra meaning, or semantics, specifies how a term should be interpreted.
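What "extra meaning" amounts to in practice can be sketched as a small forward-chaining loop. The code below applies three RDFS entailment patterns (subClassOf transitivity, type propagation to superclasses, and domain typing) to plain tuples. It is a simplified illustration, not a complete RDFS reasoner, and the `ex:` vocabulary is hypothetical.

```python
SUBCLASS = "rdfs:subClassOf"
TYPE = "rdf:type"
DOMAIN = "rdfs:domain"


def rdfs_closure(triples):
    """Apply three RDFS entailment rules until no new triple is produced."""
    inferred = set(triples)
    while True:
        new = set()
        for (s, p, o) in inferred:
            for (s2, p2, o2) in inferred:
                # subClassOf is transitive.
                if p == SUBCLASS and p2 == SUBCLASS and o == s2:
                    new.add((s, SUBCLASS, o2))
                # An instance of a class is also an instance of its superclasses.
                if p == TYPE and p2 == SUBCLASS and o == s2:
                    new.add((s, TYPE, o2))
                # Using a property types the subject with the property's domain.
                if p2 == DOMAIN and p == s2:
                    new.add((s, TYPE, o2))
        if new <= inferred:
            return inferred
        inferred |= new


# Hypothetical vocabulary and one asserted fact about Joe.
facts = {
    ("ex:Student", SUBCLASS, "ex:Person"),
    ("ex:Person", SUBCLASS, "ex:Agent"),
    ("ex:hasFamilyName", DOMAIN, "ex:Person"),
    ("ex:Joe", "ex:hasFamilyName", "Smith"),
}

closure = rdfs_closure(facts)
```

Nothing in `facts` says that Joe is a Person or an Agent; both statements follow only from the meaning RDFS assigns to `domain` and `subClassOf`.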
20Semantic Web Architecture (5)
- OWL stands for Web Ontology Language.
- OWL is a language derived from description logics.
- OWL provides additional standardized vocabulary.
- OWL provides reasoning support.
21Semantic Web Architecture (6)
- RIF/SWRL: rule languages are being standardized for the semantic web.
- They provide rules beyond the constructs available from RDFS and OWL.
22Semantic Web Architecture (7)
- SPARQL stands for SPARQL Protocol and RDF Query Language (a recursive acronym).
- SPARQL is used to query RDF data as well as RDFS and OWL ontologies with knowledge bases.
23S-OGSA
- Why
- What
- How
- Design Principles
- S-OGSA
- Conclusions and future works
- Reference
- Q&A
24Why Semantic Grid?
- Currently, Grid metadata is generated and used in an ad hoc fashion and represented in different formats.
- - It's hard to share
- - It's hard to reuse
- - It's hard to reinterpret
- The Semantic Grid is an extension of the Grid that increases interoperability and flexibility
25What is Semantic Grid
- An extension of the Grid
- Rich metadata is exposed and handled explicitly,
shared, and managed via Grid protocols
26What is Semantic Grid
- The Semantic Grid uses metadata to describe information in the Grid.
- Turning information into something more than just a collection of data means understanding the context, format, and significance of the data.
- Therefore
- - Understand information
- - Discovery and reuse
27Semantic?
- Semantic metadata meaning
- Metadata explicitly exposed as a first class object in a machine processable form.
- Controlled vocabularies or knowledge models (aka Ontologies) for describing metadata in a machine processable form.
- Schemas for structuring metadata in a machine processable form.
- Rules over metadata.
- Possibly using Semantic Web technologies
- For people and machines
28Design Principles for a Reference Semantic Grid
Architecture
- Parsimony
- - lightweight
- - minimize the impact on legacy Grid infrastructure and tooling.
- Extensibility
- Uniformity (of the mechanisms)
- - manageability of S-OGSA entities
- - Have both stateless and stateful Grid services like OGSA
- - S-OGSA services are OGSA-observant Grid services.
29Design Principles for a Reference Semantic Grid
Architecture
- Diversity
- - Mixed ecosystem of Grid and Semantic Grid services
- - Services ignorant of semantics
- - Services aware of semantics but unable to process them
- - Services aware of semantics and able to process (part of) them
30Design Principles for a Reference Semantic Grid
Architecture
- Heterogeneity (of semantic representation)
- - Any resource's property may have many different semantic descriptions
- - captured (or not) in different representational forms (text, logic, ontology, rule).
31Design Principles for a Reference Semantic Grid
Architecture
- Enlightenment
- - minimal impact of adding explicit semantics to current Grid entities
- - Grid entities should not break if they can consume and process Grid resources but cannot consume and process the associated semantics
- - Grid entities can incrementally acquire, lose and reacquire explicit semantics during their lifetime
32S-OGSA
- Defined by
- - Information model: new entities
- - Capabilities: new functionalities
- - Mechanisms: how it is delivered
[Figure: Model, Capabilities and Mechanisms, with relations labelled provide/consume, expose and use]
33S-OGSA
- How to provide
- - Just give the semantic metadata to those services
- - Or S-OGSA can provide semantic services of its own.
34S-OGSA
- There are no big differences
- - If a service can understand semantics (e.g., it supports a semantic API), then it can itself be an S-OGSA service.
35S-OGSA
- A Grid usually consists of several different services defined by OGSA
- - VO management service
- - Resource discovery and management service
- - Job management service
- - Security service
- - Data management service
- S-OGSA should (will) provide semantic metadata services to those services.
36S-OGSA
- The Solution
- - Attach semantics to Grid entities.
- - Bind them together with a semantic binding service.
- - Normal Grid services can become semantic through the semantic binding service.
37S-OGSA Model. Semantic Bindings
38S-OGSA
[Figure: layered architecture diagram. Labels: Application 1 to Application N; OGSA; Semantic-OGSA; Optimization; Security; Data; Execution Management; Resource management; Information Management; Infrastructure Services; Semantic Provisioning Services]
39S-OGSA
40S-OGSA Model and Capabilities
[Figure: S-OGSA model and capabilities diagram spanning Grid, Semantic Grid and Knowledge. Entities: Grid Entity, Grid Resource, Grid Service, Knowledge Entity, Knowledge Resource, Knowledge Service, Semantic Binding, Semantic Provisioning Service, Semantic Binding Provisioning Service, Semantic aware Grid Service. Examples: WebMDS, OGSA-DAI, CAS, Annotation Service, Metadata Service, Ontology Service, Reasoning Service, Ontology, Rule set, SAML file, DFDL file, JSDL file. Relations: Is-a, uses, consume, produce, with multiplicities 0..m and 1..m]
41S-OGSA Model and Capabilities
- Grid Entities
- - Resources and services
- Knowledge Entities
- - Grid Entities that represent or could operate with some form of knowledge (e.g., ontologies, rules, knowledge bases)
- Semantic Bindings
- - Entities that represent the association of a Grid Entity with one or more Knowledge Entities
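The three kinds of entity can be pictured as a small data model. The sketch below is an assumption-laden illustration, not the S-OGSA schema: the class and field names are invented, and modelling a Semantic Binding as a subclass of Grid Entity is a choice made here for simplicity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class GridEntity:
    """A Grid resource or service."""
    name: str


@dataclass
class KnowledgeEntity(GridEntity):
    """A Grid Entity that represents some form of knowledge."""
    kind: str = "ontology"  # e.g. "ontology", "rule set", "knowledge base"


@dataclass
class SemanticBinding(GridEntity):
    """Associates one Grid Entity with one or more Knowledge Entities."""
    subject: Optional[GridEntity] = None
    knowledge: List[KnowledgeEntity] = field(default_factory=list)

    def __post_init__(self):
        # Enforce the "one or more Knowledge Entities" multiplicity.
        if self.subject is None or not self.knowledge:
            raise ValueError(
                "a binding needs a Grid Entity and at least one Knowledge Entity")


# Hypothetical entities used only to exercise the model.
job_service = GridEntity("job-management-service")
ontology = KnowledgeEntity("grid-resource-ontology", kind="ontology")
binding = SemanticBinding("binding-1", subject=job_service, knowledge=[ontology])
```

The Grid Entity itself carries no semantic fields; everything semantic lives in the binding, so an entity can gain or lose bindings without being modified.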
42S-OGSA Model and Capabilities
- Semantic Grid Entities (all entities in the binding model)
- Semantic Provisioning Services
- - provisioning and management of explicit semantics and its association with Grid entities
- - creation, storage, update, removal and access of different forms of knowledge and metadata
- Knowledge provisioning services
- - ontology services, reasoning services.
- Semantic binding provisioning services
- - metadata services, annotation services.
43S-OGSA Model and Capabilities
- Semantically Aware Grid Services
- - Able to consume Semantic Bindings and to take actions based on knowledge and metadata.
- Sample actions
- - Metadata-aware authorization of a given identity by a VO Manager service.
- - Execution of a search request over entries in a semantic resource catalogue.
- - Incorporation of a new concept into an ontology hosted by an ontology service.
- - Reduction of an annotated scientific data set to a smaller subset by a scientist.
44S-OGSA Mechanisms
- Treating Knowledge Entities and Semantic Bindings as Grid Resources
- Common Information Model (CIM) Resource Model
- - Grid Entities: class CIM-ManagedElement in the CIM Model.
- - Knowledge Entities: class S-OGSA-KnowledgeEntity
- - Semantic Bindings: class S-OGSA-SemanticBinding, the association between a Grid Entity (CIM-ManagedElement) and a Knowledge Entity (S-OGSA-KnowledgeEntity).
45S-OGSA Mechanisms
46S-OGSA Mechanisms
- S-Stateful Services: mechanisms for the delivery of Semantic Bindings for resources
- Based on the Web Services Resource Framework (WSRF)
47Retrieving and Querying Semantic Bindings of
Resources
[Figure: interaction diagram. Labels: Metadata Seeking Client; Service; Resource (Resource Specific; Lifetime; State/properties/metadata access port); Metadata Service; Ontology Service; Semantic Binding Ids Retrieval Request; Semantic Binding Ids; Metadata Retrieval/Query Request; Query/Retrieval Result; Obtain schema for Semantic Bindings]
- A Feta ODE-SGS, OWL-S, WSMO service desc
- FOAF Profile
- Deliver metadata pointers through resource properties
- Zero impact on existing protocols
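The idea of delivering metadata pointers through resource properties can be sketched as follows. This is a plain Python stand-in, not WSRF code: the `Resource` and `MetadataService` classes, the `semanticBindingIds` property name and the stored metadata are all hypothetical.

```python
class MetadataService:
    """Hypothetical store mapping Semantic Binding ids to metadata."""

    def __init__(self):
        self._bindings = {}

    def store(self, binding_id, metadata):
        self._bindings[binding_id] = metadata

    def retrieve(self, binding_id):
        return self._bindings[binding_id]


class Resource:
    """A stateful resource exposing its state as named properties."""

    def __init__(self, properties):
        self._properties = dict(properties)

    def get_property(self, name, default=None):
        return self._properties.get(name, default)


def legacy_client(resource):
    """Ignorant of semantics: reads only the properties it already knows."""
    return resource.get_property("status")


def metadata_seeking_client(resource, metadata_service):
    """Follows the binding ids found among the resource properties."""
    ids = resource.get_property("semanticBindingIds", [])
    return [metadata_service.retrieve(binding_id) for binding_id in ids]


service = MetadataService()
service.store("sb-1", {"format": "FOAF Profile"})

annotated = Resource({"status": "active", "semanticBindingIds": ["sb-1"]})
plain = Resource({"status": "active"})
```

The legacy client works unchanged against both resources, which is the sense in which the approach has zero impact on existing protocols; only the metadata-seeking client looks at the extra property.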
48Conclusions and future works
- Extensions to current Grid models to deal with flexible forms of explicit metadata
- - The central component: Semantic Binding
- Define a set of services (Semantic Provisioning Services) that play an important role in the exposure, delivery and generation of metadata
- - ontology management and reasoning services, metadata services and annotation services.
- The actual mechanisms to be used for treating the new components as Grid entities and for delivering them as part of existing Grid service frameworks.
49Conclusions and future works
- Design principles
- The Semantic Grid is the Grid.
- The Semantic Grid has a spectrum of Semantic Capabilities.
- Painless migration to the Semantic Grid.
- Semantic Grid lifecycle.
- Multiple semantics.
50Conclusions and future works
- Challenges
- Technical
- - architectural or theoretical foundations, the maturity of Semantic and Grid technologies,
- - improving the performance of creating and retrieving semantically-encoded metadata
- Operational
- - gathering and maintaining the semantic content
- Sociological and political
- - legal, security and privacy implications of clearly exposed metadata and automated reasoning
51Q&A
52Implementation
53e-Science
- "e-Science is about global collaboration in key areas of science, and the next generation of infrastructure that will enable it."
- "e-Science will change the dynamic of the way science is undertaken."
- - John Taylor, DG of UK OST
- "The Grid intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information."
- - Tony Blair, 2002
54UK e-Science Grid
55UK e-Science Initiative
- £180M Programme over 3 years
- £130M is for Grid Applications in all areas of science and engineering
- - Particle Physics and Astronomy (PPARC)
- - Engineering and Physical Sciences (EPSRC)
- - Biology, Medical and Environmental Science
- £50M Core Program to encourage development of generic industrial strength Grid middleware
56Some UK e-Science Projects
- GRIDPP (PPARC)
- ASTROGRID (PPARC)
- Comb-e-Chem (EPSRC)
- DAME (EPSRC)
- DiscoveryNet (EPSRC)
- GEODISE (EPSRC)
- myGrid (EPSRC)
- RealityGrid (EPSRC)
- Climateprediction.com (NERC)
- Oceanographic Grid (NERC)
- Molecular Environmental Grid (NERC)
- NERC DataGrid (NERC OST-CP)
- Biomolecular Grid (BBSRC)
- Proteome Annotation Pipeline (BBSRC)
- High-Throughput Structural Biology (BBSRC)
- Global Biodiversity (BBSRC)
57Some UK e-Science Projects
- Biology of Ageing (BBSRC MRC)
- Sequence and Structure Data (MRC)
- Molecular Genetics (MRC)
- Cancer Management (MRC PPARC)
- Clinical e-Science Framework (MRC)
- Neuroinformatics Modeling Tools (MRC)
- Interdisciplinary Research Collaborations Grand Challenge
- - Advanced Knowledge Technologies
- - Medical Images and Signals
- - Equator
- - DIRC (Dependability)
58Content
- e-Science
- myGrid
- Context
- Workflows, repository, registry and provenance
- Concept services
- Using concepts
- Discovering workflows and services
- Workflow composition support
- Discovering and linking experimental components
- Linking provenance logs
- Remarks
59myGrid
- EPSRC UK e-Science pilot project
- Open Source Upper Middleware for Bioinformatics
- Knowledge-driven Middleware for data intensive in silico experiments in biology
- (Web) Service-based architecture -> OGSA Grid services
- Targeted at Tool Developers, Bioinformaticians and Service Providers
- http://www.mygrid.org.uk
60Data intensive bioinformatics
61Graves' Disease: Autoimmune disease of the thyroid
62Services and toolkit
63Workflows as in silico experiments
- Freefluo workflow enactment engine
- - WSFL
- - Scufl
- Workflow discovery
- - Finding workflows that others have done, and that I have done myself
- Workflow creation
- - Finding classes of services
- - Guiding service composition
- - We don't do automated composition
- Dynamic workflow enactment: service discovery and invocation
- - Choose service instances when running the workflow
- - User involvement
64FreeFluo and Taverna environments
- Freefluo workflow enactment engine
- WSFL
- Scufl
- Taverna development environment
65Investigation: set of experiments and metadata
- Experimental design components
- - workflow specifications; query specifications; notes describing objectives; applications; databases; relevant papers; the web pages of important workers
- Experimental instances that are records of enacted experiments
- - data results; a history of services invoked by a workflow engine; instances of services invoked; parameters set for an application; notes commenting on the results
- Experimental glue that groups and links design and instance components
- - a query and its results; a workflow linked with its outcome; links between a workflow and its previous and subsequent versions; a group of all these things linked to a document discussing the conclusions of the biologist
- Life Science IDs, URIs
- RDF-based annotations
- DAML+OIL -> OWL ontologies
66Bio in silico experiments service types
- Making in silico experiments
- - workflow
- - distributed database query processing.
- Managing experimental outcomes
- - information management
- - managing metadata
- Scientific method
- - provenance management
- - change notification
- - personalisation
- Sharing experiments
- - semantic services for discovering services and workflows, and managing metadata
- - third party service registries and federated personalised views over those registries,
- - ontologies and ontology management.
- The base services and tools that will constitute the experiments
- - third party services such as databases, computational analyses, simulations.
- - specialised services such as AMBIT text extraction.
67Experiment life cycle
68Sharing info ? Sharing meaning
- Metadata
- - Data describing the content and meaning of resources and services.
- - But everyone must speak the same language
- Terminologies
- - Shared and common vocabularies
- - For search engines, agents, curators, authors and users
- - But everyone must mean the same thing
- Ontologies
- - Shared and common understanding of a domain
- - Essential for search, exchange and discovery
- - A common vocabulary of terms
- - Some specification of the meaning of the terms
- - A shared understanding for people and machines
69myGrid Service Stack
70myGrid Service Stack
71W3C Ontology and Metadata languages
- OWL (and DAML+OIL)
- - The Web Ontology Language OWL
- - Family of languages: OWL Lite, OWL DL, OWL Full
- - OWL DL, DAML+OIL
- - Expressive language for describing concepts, relationships, constraints and axioms
- - Sound, complete and efficient reasoning over expressions to infer relationships between concepts rather than assert them (including the hierarchy).
- - OWL is a W3C Candidate Recommendation.
- RDF
- - Resource Description Framework
- - W3C language for describing metadata on the Web
- - Triples (subject, predicate, object) forming graphs
- - Associate URIs (LSIDs) with other URIs (LSIDs)
- - Associate URIs with OWL concepts (which are URIs)
- - RDQL
- - Triple store RDF implementations (e.g. Jena)
- - http://www.w3.org/RDF
72Concept services: Ontology Services
- Ontology server for concept expressions
- Ontology development environments
- - OilEd
- FaCT reasoner for inferring over concept expressions
- - Imprecise matchmaking for best effort substitutability
- - Reasoning over descriptions
- - Generating classification structures
- Matchmaker and ranking for matching concept expressions
- Instance store for indexing instances of concept expressions in registries and databases
73Concept services: Annotation services
- RDF repositories
- - Jena Toolkit
- - RDF query languages: RDQL
- myGrid Information Repository
- - Version 1: Relational (DB2)
- - Version 2: Federated architecture.
- Browsers for annotating objects and viewing annotations
- Automated tools for marking up objects with annotations.
74myGrid Information Repository
- Stores experimental components
- Workflow specs as XML Scufl docs
- Data
- XML notes
- Types
- XML docs
- Relational
- RDF (like)
- Every entry has Dublin Core provenance attributes
- Every entry can have (multiple) OWL concept expressions
- Multiple mIRs
75Registries
- Publish experimental components: services, workflows (and distributed query plans in the future?)
- Multiple 3rd party registries
- Multiple 3rd party metadata
76Using Concepts
- Controlled vocabulary for advertisements for workflows and services
- Indexes into registries and mIR
- - Semantic discovery of services and workflows
- - Semantic discovery of repository entries
- Type management for composition
- - Semantic workflow construction guidance and validation
- Navigation paths between data and knowledge holdings
- - Semantic glue between repository entries
- - Semantic annotation and linking of workflow provenance logs
77Semantic discovery: services and workflows
- Services and workflows in registry have RDF and OWL descriptions
- Selection by the types of inputs they use, outputs they produce, the bioinformatics tasks they perform
- Querying using RDQL over RDF UDDI registry for operational metadata
- Matching using FaCT OWL classification for concept-based metadata
[Figures: a registry browser; a workflow wizard]
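Selection by input, output and task concepts can be sketched without a reasoner. The code below replaces FaCT classification with a hand-written parent table and replaces the registry with a list of dictionaries; the concept names, service names and the `discover` function are hypothetical and chosen only for illustration.

```python
# Hypothetical concept hierarchy (child -> parent), standing in for an OWL ontology.
PARENT = {
    "ProteinSequence": "Sequence",
    "DNASequence": "Sequence",
    "BlastReport": "AlignmentReport",
}

# Hypothetical registry entries annotated with concepts.
REGISTRY = [
    {"name": "protein-search", "uses": "ProteinSequence",
     "produces": "BlastReport", "task": "SimilaritySearch"},
    {"name": "dna-search", "uses": "DNASequence",
     "produces": "BlastReport", "task": "SimilaritySearch"},
    {"name": "translate", "uses": "DNASequence",
     "produces": "ProteinSequence", "task": "Translation"},
]


def subsumed_by(concept, ancestor):
    """Return True if concept is ancestor or one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


def discover(registry, uses=None, produces=None, task=None):
    """Return names of entries whose annotations fall under the requested concepts."""
    hits = []
    for entry in registry:
        if uses is not None and not subsumed_by(entry["uses"], uses):
            continue
        if produces is not None and not subsumed_by(entry["produces"], produces):
            continue
        if task is not None and not subsumed_by(entry["task"], task):
            continue
        hits.append(entry["name"])
    return hits


similarity_services = discover(REGISTRY, uses="Sequence", task="SimilaritySearch")
```

A request phrased with the general concept `Sequence` finds services advertised with more specific input types, which a plain keyword match on the advertisement would miss.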
78Find Components
79Workflow construction
- Outputs and inputs of chained services are compatible
- - OWL Concept
- - XSD Type
- - Data Format
- Workflows are constructed in collaboration with the scientist
- - No automated workflow creation
- Find service is being embedded into Taverna by the end of October, like the Geodise approach
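The compatibility check between chained services can be sketched as a validation pass over a linear workflow. This is a simplified illustration: the `Port` and `Service` classes and the example services are hypothetical, and concepts are compared by equality, whereas a reasoner would also accept a more specific concept.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Port:
    concept: str      # OWL concept name
    xsd_type: str     # XSD type
    data_format: str  # data format


@dataclass(frozen=True)
class Service:
    name: str
    consumes: Port
    produces: Port


def mismatches(producer, consumer):
    """List the facets on which producer output and consumer input disagree."""
    problems = []
    for facet in ("concept", "xsd_type", "data_format"):
        produced = getattr(producer.produces, facet)
        expected = getattr(consumer.consumes, facet)
        if produced != expected:
            problems.append(
                f"{producer.name} -> {consumer.name}: {facet} {produced!r} != {expected!r}")
    return problems


def validate(chain):
    """Check every adjacent pair of services in a linear workflow."""
    problems = []
    for producer, consumer in zip(chain, chain[1:]):
        problems.extend(mismatches(producer, consumer))
    return problems


# Hypothetical services for illustration.
fetch = Service("fetch",
                Port("AccessionNumber", "xsd:string", "text"),
                Port("ProteinSequence", "xsd:string", "FASTA"))
blast = Service("blast",
                Port("ProteinSequence", "xsd:string", "FASTA"),
                Port("BlastReport", "xsd:string", "text"))
render = Service("render",
                 Port("BlastReport", "xsd:string", "XML"),
                 Port("Image", "xsd:base64Binary", "PNG"))

report = validate([fetch, blast, render])
```

Returning the list of mismatches rather than a yes/no answer fits the guidance role described on the slide: the scientist stays in charge of construction and is told which link needs attention.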
80Linking objects to objects via concepts
81Reference
- Professor Carole Goble and the myGrid consortium, Knowledge-based Middleware for BioGrid services from the myGrid Project
- Professor Carole Goble and the myGrid consortium, The Role of Concepts in myGrid
- http://www.mygrid.org.uk
- http://www.semanticgrid.org
- http://www.w3.org
- Oscar Corcho, Pinar Alper, Ioannis Kotsiopoulos, Paolo Missier, Sean Bechhofer and Carole Goble, An overview of S-OGSA: a Reference Semantic Grid Architecture, School of Computer Science, The University of Manchester, Manchester, UK
- Wei Xing (School of Computer Science, University of Manchester) and Marios Dikaiakos (Department of Computer Science, University of Cyprus), The Semantic Grid