Title: Using RDF/OWL Technologies for Discovery and Use Metadata
1Using RDF/OWL Technologies for Discovery and Use Metadata
M. Benno Blumenthal, Michael Bell, John del Corral, and Emily Grover-Kopec
International Research Institute for Climate and Society
Columbia University
http://iridl.ldeo.columbia.edu/
2Definitions
- Resource Description Framework (RDF)
- Web Ontology Language (OWL)
3Why RDF?
- Web-based system for interoperating semantics
- A key part of the Semantic Web
- RDF/OWL is an interesting technology, but it is
even more interesting when it is clear that it
can help solve our problems
4The Data Problem
Datasets
Users
5The Tool Interface
Datasets
Tools
Users
6Standard Metadata
Standard Metadata Schema/Data Services
Datasets
Tools
Users
7Many Data Communities
8Super Schema
Standard metadata schema
9Super Schema direct
Standard metadata schema/data service
10Flaws
- A lot of work
- Super Schema/Service is the Lowest-Common-Denominator
- Science keeps evolving, so that standards either fall behind or constantly change
11RDF Standard Data Model Exchange
Standard metadata schema
RDF (label repeated six times in the diagram)
12RDF Data Model Exchange
Standard metadata schema
RDF
13RDF Architecture
Virtual (derived) RDF
14Why is this better?
- Maps the original dataset metadata into a standard format that can be transported and manipulated
- Still the same impedance mismatch when mapped to the least-common-denominator standard metadata, but
- When a better standard comes along, the original complete-but-nonstandard metadata is already there to be remapped, and late semantic binding means everyone can use the new semantic mapping
- Can use enhanced mappings between models that are close
- EASIER: these are tools to enhance the mapping process
15Sample Tool: Faceted Search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
16Distinctive Features of the search
- Search terms are interrelated
- Terms that describe the set of returns are displayed (spanning and not)
- Returned items also have structure (sub-items and superseded items are not shown)
17Architectural Features of the search
http://iridl.ldeo.columbia.edu/ontologies/query2.pl
- Multiple search structures possible
- Multiple languages possible
- Search structure is kept in the database, not in
the code
18Cast of RDF Characters
Semantic Layers: RDFS, OWL, SKOS, SWRL
Query Language: SPARQL, SeRQL
Tools and Frameworks: Protégé, Sesame, Redland, Jena
Reasoners
19RDF framework for writing connections
- Triplets of
- Subject
- Property (or Predicate)
- Object
- URIs identify things, i.e. most of the above
- Namespaces are used as a convenient shorthand for
the URIs
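A minimal sketch of the triple model and the namespace shorthand, in plain Python. The `dc` URI is the published Dublin Core namespace; the `ex` namespace, the `Triple` type and the helper names are hypothetical and only serve the illustration.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


# Prefix -> namespace URI. "ex" is a made-up placeholder namespace.
NAMESPACES = {
    "dc": "http://purl.org/dc/elements/1.1/",
    "ex": "http://example.org/data/",
}


def expand(name: str) -> str:
    """Expand a prefixed name such as 'dc:title' into a full URI."""
    prefix, sep, local = name.partition(":")
    if sep and prefix in NAMESPACES:
        return NAMESPACES[prefix] + local
    return name  # unknown prefix or already a full URI: leave untouched


def make_triple(subject, predicate, obj, literal=False):
    """Build a triple; a literal object is kept as plain text, not expanded."""
    return Triple(expand(subject), expand(predicate),
                  obj if literal else expand(obj))


title = make_triple("ex:WOA", "dc:title", "NOAA NODC WOA01", literal=True)
contains = make_triple("ex:WOA", "ex:isContainerOf", "ex:Grid-1x1")
```

Subject and predicate are always URIs here; only the object may be a literal, which is why the helper takes an explicit `literal` flag instead of guessing.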
20Datatype Properties
- WOA dc:title "NOAA NODC WOA01"
- WOA dc:description "NOAA NODC WOA01: World Ocean Atlas 2001, an atlas of objectively analyzed fields of major ocean parameters at monthly, seasonal, and annual time scales. Resolution: 1x1; Longitude: global; Latitude: global; Depth: 0 m,5500 m; Time: Jan,Dec; monthly"
21Object Properties
WOA iridl:isContainerOf Grid-1x1,
Grid-1x1 iridl:isContainerOf Monthly
22WOA01 diagram
23Standard Properties
- WOA dcterm:hasPart Grid-1x1,
- Grid-1x1 dcterm:hasPart MONTHLY
- Alternatively
- WOA iridl:isContainerOf Grid-1x1,
- iridl:isContainerOf rdfs:subPropertyOf dcterm:hasPart
24netcdf/CF in RDF
SST rdf:type cfatt:non_coordinate_variable,
SST cfatt:standard_name cf:sea_surface_temperature,
SST netcdf:hasDimension longitude
- Object properties provide a framework for explicitly writing down relationships between data objects/components, e.g. vague meaning of nesting is made explicit
- Properties also can be related, since they are objects too
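As a rough illustration of the mapping above, the following sketch turns the metadata of one netCDF variable into triples. The property names `rdf:type`, `cfatt:standard_name` and `netcdf:hasDimension` come from the slide; the function name, the `cfatt:units` property, the `latitude` dimension and the assumption that every variable passed in is a non-coordinate variable are illustrative guesses, not part of any published mapping.

```python
def variable_to_triples(name, attributes, dimensions):
    """Return (subject, predicate, object) triples for one netCDF variable.

    Assumes the variable is a non-coordinate (data) variable.
    """
    triples = [(name, "rdf:type", "cfatt:non_coordinate_variable")]
    for key, value in attributes.items():
        if key == "standard_name":
            # CF standard names become resources in the cf: vocabulary
            triples.append((name, "cfatt:standard_name", "cf:" + value))
        else:
            # Assumed fallback: other attributes are stored as plain literals
            triples.append((name, "cfatt:" + key, value))
    for dim in dimensions:
        # Nesting made explicit: the variable is linked to each dimension
        triples.append((name, "netcdf:hasDimension", dim))
    return triples


sst = variable_to_triples(
    "SST",
    {"standard_name": "sea_surface_temperature", "units": "degree_Celsius"},
    ["longitude", "latitude"],
)
```

The point of the sketch is that the implicit structure of the file (which attribute belongs to which variable, which dimensions a variable uses) ends up as explicit, named relationships.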
25Noncontextual Modeling
- Noncontextual modeling makes RDF the perfect glue between systems and fixed data models
The Semantic Web
26RDF Level
- Transport/Exchange (RDF/XML)
- Storage
- RDF APIs (Redland, Jena, Sesame)
- Query (SPARQL, SeRQL, ...)
- Basic Semantics
27RDF Semantics (RDF Primer)
Truly useful property: rdf:type (a)
Underlying Class: rdf:Property
Organizational Classes: rdf:Bag, rdf:Alt, rdf:Seq, rdf:List
Structured values: rdf:value
Reification: rdf:Statement, rdf:subject, rdf:predicate, rdf:object
Bag Properties: rdf:_1, rdf:_2
List Properties: rdf:first, rdf:rest, rdf:nil
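To make the list properties concrete, here is a small sketch of how an ordered sequence can be written as an rdf:first / rdf:rest chain ending in rdf:nil, and read back. The blank-node naming scheme (`_:b1`, `_:b2`, ...) and the function names are assumptions made for the example.

```python
from itertools import count


def to_rdf_list(items, new_node):
    """Encode a sequence as rdf:first/rdf:rest triples.

    Returns (head, triples); the head of an empty sequence is rdf:nil.
    """
    items = list(items)
    if not items:
        return "rdf:nil", []
    nodes = [new_node() for _ in items]
    triples = []
    for i, (node, item) in enumerate(zip(nodes, items)):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((node, "rdf:first", item))
        triples.append((node, "rdf:rest", rest))
    return nodes[0], triples


def from_rdf_list(head, triples):
    """Walk the chain from head until rdf:nil and collect the members."""
    first = {s: o for s, p, o in triples if p == "rdf:first"}
    rest = {s: o for s, p, o in triples if p == "rdf:rest"}
    members = []
    node = head
    while node != "rdf:nil":
        members.append(first[node])
        node = rest[node]
    return members


_ids = count(1)


def blank_node():
    """Hypothetical blank-node generator: _:b1, _:b2, ..."""
    return f"_:b{next(_ids)}"


head, list_triples = to_rdf_list(["longitude", "latitude", "depth"], blank_node)
```

Each list cell costs two triples, which is why rdf:List is mostly used where order genuinely matters, such as the dimension order of a variable.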
28RDF-Schema (RDFS)
Transitive Properties: rdfs:subClassOf (is a), rdfs:subPropertyOf
rdfs:Class, rdfs:Resource
rdfs:member, rdfs:domain, rdfs:range
rdfs:Datatype, rdfs:Literal, rdfs:Container
Referring to other RDF documents: rdfs:seeAlso, rdfs:isDefinedBy
Basic documentation: rdfs:label, rdfs:comment
29Gazetteer Classes
30Gazetteer Individuals
31Search Interface Term
- http://iri.columbia.edu/benno/sampleterm.pdf
32Semantics lead to Virtual Triples
- Transitive
- a rdfs:subClassOf b rdfs:subClassOf c
- implies a rdfs:subClassOf c
- i.e. semantics of rdfs:subClassOf imply additional triples not explicitly stated
- Likewise
- a rdfs:subPropertyOf b rdfs:subPropertyOf c
- implies a rdfs:subPropertyOf c
- More interestingly,
- a myprop b, myprop rdfs:subPropertyOf prop2 implies a prop2 b
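The three rules on this slide can be sketched as a fixed-point computation over a set of triples. This is only a small subset of RDFS entailment, written as a naive loop for readability; the function name is hypothetical, and the sample data reuses the WOA example from the earlier slides.

```python
SUBCLASS = "rdfs:subClassOf"
SUBPROP = "rdfs:subPropertyOf"


def rdfs_closure(triples):
    """Return the stated triples plus the virtual triples implied by
    rdfs:subClassOf and rdfs:subPropertyOf."""
    known = set(triples)
    while True:
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                # Transitivity: s p o, o p x  =>  s p x
                if p in (SUBCLASS, SUBPROP) and p2 == p and s2 == o:
                    new.add((s, p, o2))
                # Property inheritance: s p o, p subPropertyOf q  =>  s q o
                if p2 == SUBPROP and s2 == p:
                    new.add((s, o2, o))
        if new <= known:  # nothing new was derived: fixed point reached
            return known
        known |= new


stated = {
    ("a", SUBCLASS, "b"),
    ("b", SUBCLASS, "c"),
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("iridl:isContainerOf", SUBPROP, "dcterm:hasPart"),
}
virtual = rdfs_closure(stated) - stated
```

Here `virtual` holds exactly the triples that were never written down: `a rdfs:subClassOf c` and `WOA dcterm:hasPart Grid-1x1`.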
33Subcategories are not subClasses
- So carelessly translating existing conceptual
organizations can get one into trouble
34Domain and Range are inherited
- Since the domain and range of a property are
classes, then subclasses inherit properties (in
this sense)
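One way to read "inherit" here: a property declared with a given class as its domain also applies to every subclass of that class. The sketch below computes that set of applicable properties. All class and property names (`ex:Place`, `ex:WaterBody`, `ex:River`, `ex:hasName`, `ex:hasDepth`) are invented for the example. Note that in RDFS itself, rdfs:domain and rdfs:range work as inference rules (using the property implies the type) rather than as constraints.

```python
def superclasses(cls, triples):
    """Return cls together with every class reachable via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "rdfs:subClassOf" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def properties_for(cls, triples):
    """Properties whose rdfs:domain is cls or one of its superclasses."""
    classes = superclasses(cls, triples)
    return {s for s, p, o in triples if p == "rdfs:domain" and o in classes}


schema = [
    ("ex:River", "rdfs:subClassOf", "ex:WaterBody"),
    ("ex:WaterBody", "rdfs:subClassOf", "ex:Place"),
    ("ex:hasName", "rdfs:domain", "ex:Place"),
    ("ex:hasDepth", "rdfs:domain", "ex:WaterBody"),
]
river_properties = properties_for("ex:River", schema)
place_properties = properties_for("ex:Place", schema)
```

The same walk over rdfs:range would give the properties whose values may be instances of a class.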
35UML/RDFS
- Unified Modeling Language
- Base concepts are the same (RDFS lacks methods), so one can export the underlying structure of the code as the underlying structure for the metadata
- See Representing UML in RDF
36Ontologies
- Use Conventions to connect concepts to established sets of concepts
- Generate additional virtual triples from the original set and semantics
- RDFS: some property/class semantics
- OWL: additional property/class semantics, more sophisticated (ontological) relationships
37OWL
- Language for expressing ontologies, i.e. the semantics are very important. However, even without a reasoner to generate the implied RDF statements, OWL classes and properties represent a sophistication of the RDF Schema
- However, there is a serious split in world view from what we have been talking about: concepts as classes vs concepts as individuals
38OWL
rdf:Property: owl:DatatypeProperty, owl:ObjectProperty, owl:AnnotationProperty
owl:FunctionalProperty, owl:InverseFunctionalProperty, owl:TransitiveProperty, owl:SymmetricProperty
rdfs:seeAlso: owl:imports
owl:Ontology
39Protégé
- Tool for editing/displaying Ontologies
- Different tabs display different perspectives
- http://protege.stanford.edu/
40Cast of RDF Characters II
Semantic Layers: RDFS, OWL, SKOS, SWRL
Query Language: SPARQL, SeRQL
Tools and Frameworks: Protégé, Sesame, Redland, Jena
Reasoners
41Query Language SPARQL
- (quick reference at http://www.dajobe.org/2005/04-sparql/)
- Supported by Redland, Jena, Sesame-2.0 (alpha)
- Jena implementation supports URL source of triples, i.e. do not even need a triple store
- The standard
42Query Language SeRQL
- Older than SPARQL
- Implemented on top of Sesame
- Currently more powerful than SPARQL, i.e. has
nested queries
43SeRQL Details (copied from on-line tutorial)
- Syntax
- Select
- Construct
- Where
- From
44SeRQL basic syntax
- {Person} foo:worksFor {Company} rdf:type {foo:ITCompany}
45SeRQL multiple statements
- {subj1} pred1 {obj1}; pred2 {obj2}
- Or
- {subj1} pred1 {obj1}, {subj1} pred2 {obj2}
46SeRQL short cuts
- {subj1} pred1 {obj1, obj2, obj3}
- (also implies obj1, obj2, obj3 are distinct)
47SeRQL Select
Output as table (XML)
- SELECT dataset, dlabel
- FROM {dataset} rdf:type {iridl:dataset},
- {dataset} rdfs:label {dlabel}
- USING NAMESPACE
- iridl = <http://iridl.ldeo.columbia.edu/ontologies/iridl.owl#>
48SeRQL Construct
Output as RDF (RDF/XML)
CONSTRUCT {dataset} rdf:type {foo:LabelledDatasets}
FROM {dataset} rdf:type {iridl:dataset};
rdfs:label {dlabel}
USING NAMESPACE iridl = <http://iridl.ldeo.columbia.edu/ontologies/iridl.owl#>
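The difference between the two query forms can be sketched with a tiny pattern matcher: SELECT returns a table of variable bindings, CONSTRUCT returns new triples built from those bindings. This is not how Sesame evaluates SeRQL; it is a naive in-memory illustration. Variables are written with a leading `?` purely as a convention of this sketch, and the sample resources (`ex:WOA01`, `ex:unlabelled`) are hypothetical.

```python
def match(patterns, triples, binding=None):
    """Yield every variable binding that satisfies all triple patterns."""
    binding = binding or {}
    if not patterns:
        yield binding
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for term, value in zip(first, triple):
            if term.startswith("?"):
                # Bind the variable, or check it against an earlier binding
                if candidate.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield from match(rest, triples, candidate)


def select(variables, patterns, triples):
    """Table output: one row of values per matching binding."""
    return [tuple(b[v] for v in variables) for b in match(patterns, triples)]


def construct(template, patterns, triples):
    """RDF output: the template triple instantiated for each binding."""
    return {tuple(b.get(t, t) for t in template)
            for b in match(patterns, triples)}


store = [
    ("ex:WOA01", "rdf:type", "iridl:dataset"),
    ("ex:WOA01", "rdfs:label", "NOAA NODC WOA01"),
    ("ex:unlabelled", "rdf:type", "iridl:dataset"),
]
where = [("?dataset", "rdf:type", "iridl:dataset"),
         ("?dataset", "rdfs:label", "?dlabel")]

rows = select(["?dataset", "?dlabel"], where, store)
new_triples = construct(("?dataset", "rdf:type", "foo:LabelledDatasets"),
                        where, store)
```

The dataset without a label drops out of both results because the second pattern cannot be satisfied for it.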
49Faceted Search Explicated
50Search Interface
- Items (datasets/maps)
- Terms
- Facets
- Taxa
51Search Interface Semantic API
- item dc:title dc:description rss:link iridl:icon
- dcterm:isPartOf item2
- dcterm:isReplacedBy item2
- item trm:isDescribedBy term
- term a facet of taxa of trm:Term,
- facet a trm:Facet, taxa a trm:Taxa,
- term trm:directlyImplies term2
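A rough sketch of how a search could be driven by these properties alone. The property names come from the list above; the hiding rule (an item is dropped when it is part of, or replaced by, another item that also matches) is an assumption about the interface's behaviour, and the items and terms in the sample data are invented.

```python
HIDING = ("dcterm:isPartOf", "dcterm:isReplacedBy")


def implied_terms(term, triples):
    """Return term plus everything reachable via trm:directlyImplies."""
    seen = {term}
    stack = [term]
    while stack:
        current = stack.pop()
        for s, p, o in triples:
            if s == current and p == "trm:directlyImplies" and o not in seen:
                seen.add(o)
                stack.append(o)
    return seen


def search(selected_terms, triples):
    """Items described, directly or by implication, by every selected term."""
    described = {}
    for s, p, o in triples:
        if p == "trm:isDescribedBy":
            described.setdefault(s, set()).update(implied_terms(o, triples))
    matches = {item for item, terms in described.items()
               if set(selected_terms) <= terms}
    # Assumed rule: hide sub-items and superseded items of other matches
    hidden = {s for s, p, o in triples if p in HIDING and o in matches}
    return sorted(matches - hidden)


catalog = [
    ("ex:WOA01", "trm:isDescribedBy", "ex:sea_surface_temperature"),
    ("ex:WOA01-monthly", "trm:isDescribedBy", "ex:sea_surface_temperature"),
    ("ex:WOA01-monthly", "dcterm:isPartOf", "ex:WOA01"),
    ("ex:WOA98", "trm:isDescribedBy", "ex:sea_surface_temperature"),
    ("ex:WOA98", "dcterm:isReplacedBy", "ex:WOA01"),
    ("ex:rainfall_map", "trm:isDescribedBy", "ex:precipitation"),
    ("ex:sea_surface_temperature", "trm:directlyImplies", "ex:temperature"),
]
temperature_hits = search(["ex:temperature"], catalog)
```

Because the term relationships live in the data, changing the search structure means changing triples, not the code.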
52Faceted Search w/Queries
http://iridl.ldeo.columbia.edu/ontologies/query2.pl?...
53RDF Architecture
Virtual (derived) RDF
54IRI RDF Architecture
Data Servers
MMI
Ontologies
JPL
Start Point
bibliography
Standards Organizations
RDF Crawler
Location Canonicalizer
RDFS Semantics, OWL Semantics, SWRL Rules, SeRQL CONSTRUCT
Time Canonicalizer
Sesame
Search Queries
Search Interface
55Creating Virtual Triples from Semantic Layers
Semantic Layers: RDFS, OWL, SKOS, SWRL
Query Language: SPARQL, SeRQL
Tools and Frameworks: Protégé, Sesame, Redland, Jena
Reasoners
56SWRL
- SWRL: A Semantic Web Rule Language Combining OWL and RuleML
- A language for writing rules in RDF/OWL, i.e. RDF statements that are rules for creating new RDF statements
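The idea of rules that create new statements can be sketched with rules held as data (a body of triple patterns and one head pattern) and applied until nothing new appears. This does not use SWRL syntax or any SWRL engine. The `?` variable convention and the sample rule, which treats iridl:isContainerOf as transitive, are assumptions made for the illustration.

```python
from itertools import product


def apply_rule(body, head, triples):
    """One pass: emit the head for every consistent match of the body."""
    derived = set()
    for combo in product(triples, repeat=len(body)):
        binding = {}
        consistent = True
        for atom, triple in zip(body, combo):
            for term, value in zip(atom, triple):
                if term.startswith("?"):
                    if binding.setdefault(term, value) != value:
                        consistent = False
                elif term != value:
                    consistent = False
        if consistent:
            derived.add(tuple(binding.get(t, t) for t in head))
    return derived


def forward_chain(rules, triples):
    """Apply all rules repeatedly until no new triples are produced."""
    known = set(triples)
    while True:
        new = set().union(*(apply_rule(b, h, known) for b, h in rules)) - known
        if not new:
            return known
        known |= new


# Hypothetical rule: containment chains collapse into direct containment.
container_rule = (
    [("?a", "iridl:isContainerOf", "?b"), ("?b", "iridl:isContainerOf", "?c")],
    ("?a", "iridl:isContainerOf", "?c"),
)
facts = [
    ("WOA", "iridl:isContainerOf", "Grid-1x1"),
    ("Grid-1x1", "iridl:isContainerOf", "Monthly"),
]
all_facts = forward_chain([container_rule], facts)
```

Trying every combination of triples is exponential in the body length, so this only suits tiny examples; real rule engines index the store and join on shared variables.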
57Simple Knowledge Organization System (SKOS)
- Schema for relating concepts
58Simple Knowledge Organization System (SKOS)
- So, for a resource of type skos:Concept, any properties of that resource (such as creator, date of modification, source etc.) should be interpreted as properties of a concept, and not as properties of some 'real world thing' that that resource may be a conceptualisation of.
- This layer of indirection allows thesaurus-like data to be expressed as an RDF graph. The conceptual content of any thesaurus can of course be remodelled as an RDFS/OWL ontology. However, this remodelling work can be a major undertaking, particularly for large and/or informal thesauri. A SKOS Core representation of a thesaurus maps fairly directly onto the original data structures, and can therefore be created without expensive remodelling and analysis.
59RDF Frameworks
Protégé API
Redland: Bindings in many languages, supports several triple stores, some with context
Jena: Java API, some cmd line utilities, supports inference layers
Sesame: HTTP server, Java API, supports inference, version 2 alpha has context
60Sesame
- SAIL: Storage and Inference Layer
- i.e. you can write down rules that imply virtual triples so that triples are generated as they are put into the store
RDF: No inference
RDFS: RDFS inference
OWLIM: Some OWL inference
Custom
61Jena
- Java framework
- In-memory and persistent stores
- Inference API
62Topics/Issues
- OpenDAP and RDF: can we transport data semantics without fixing the entire schema?
- netcdf/HDF and RDF: do we need non-contextual modeling in our metadata transport/storage?
- Concepts as classes vs concepts as individuals
- Sub-classes vs sub-categories
- OWL in detail
- Protégé demo
63RDF Cast of Characters
Semantic Layers: RDFS, OWL, SKOS, SWRL
Query Language: SPARQL, SeRQL
Tools and Frameworks: Protégé, Sesame, Redland, Jena
Reasoners