Title: SKOS, XSLT, and Transforming RDF Resources for use in a Production Semantic Technologies Application
- Bettina K. Schimanski, PhD
- John M. Linebarger, PhD
- May 23, 2007
- Semantic Technology Conference 2007
- San Jose, California, USA
2. Outline
- Introduction: Our First Production Semantic Technologies Application
  - Motivation
  - Objectives
  - Overview and Demo
- Technologies Used
  - SKOS/RDF/RDFS
  - XSLT/XPath
  - SPARQL
- Lessons Learned
- Tools and Support
- Introductory Resources
- Moral
  - Complexity Increased Quickly
  - Integrated Many Technologies
3. Introduction: The NISAC Program
- The National Infrastructure Simulation and Analysis Center (NISAC) program
  - NISAC is often called upon to quickly analyze the impact on critical infrastructures of a potential future event
  - Fast Analysis and Simulation Team (FAST) exercises
    - Time-limited (from four hours to several days)
4. NISAC CIP KM Portal
- Critical Infrastructure Protection (CIP) Knowledge Management (KM) Portal
  - Supports rapid access to information during a FAST exercise
    - Documents
    - Presentations
    - Media files
    - Links to external Web pages
5. Motivation
- Problem
  - Need a better search method for the CIP KM Portal
- Solution
  - Widen the search to retrieve additional related documents by expanding keywords or phrases to include related synonyms
  - Filter the results down to only the most relevant documents
- Examples
  - corn -> maize, ...
  - MI -> Michigan, mile, ...
  - bird flu -> avian flu, avian influenza, h5n1, a/h5, ...
8. Synonym Expansion Process Overview
1. Enter keyword in search box
2. Expand keyword into all synonyms (from: Synonyms)
3. Create a request for the most important synonyms
4. Retrieve documents containing these synonyms (from: Documents in Database)
9. Synonym Expansion Demo
12. Objectives
- What are the requirements for Synonym Expansion?
  - Need to link the synonyms to our existing CIP KM domain ontology (in OWL) and vice versa
    - Otherwise simply adding them to our database would have been sufficient
  - Need a simple language for expressing vocabularies of concepts in a machine-understandable way
  - Need a source of synonyms
    - Domain independent
    - Domain specific
14. Possible Representations
OWL and SKOS are both RDF-based; therefore both can link to the CIP KM OWL Ontology
15. According to Alistair Miles
However, the two approaches do complement each other and can be used in combination
16. Layers of Representation
Layers, top to bottom:
- SKOS, OWL
- RDF Schema
- RDF
- XML (eXtensible Markup Language)
18. SKOS
- An emerging standard for representing thesauri, simple taxonomies, glossaries and controlled vocabularies
- SKOS Core Vocabulary is based on RDF/RDFS
  - RDF (Resource Description Framework)
    - a general-purpose language for representing information on the Web
    - provides a network representation of resources
  - RDFS (RDF Schema)
    - describes how to use RDF to describe RDF vocabularies on the Web
    - provides class and property hierarchies
- Being RDF-based allows easy linking to other RDF-based formats (such as OWL)
- Can express the content and structure of a concept scheme as an RDF graph
19. RDF Graph Notation
- An RDF statement is a triple
  - object-attribute-value
  - or subject-predicate-object
  - or resource-property-<resource or literal>
Image Source: http://www.w3.org/TR/swbp-skos-core-guide/
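As an illustration of the triple notation, here is a minimal Python sketch that stores statements as (subject, predicate, object) tuples. The identifier `ex:avianinfluenza` and the helper `objects` are hypothetical names chosen for this sketch; the labels are borrowed from the avian influenza example used in this deck.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    obj: str        # another resource, or a literal value


# Hypothetical identifier for the concept (an assumption of this sketch).
CONCEPT = "ex:avianinfluenza"

graph = [
    Triple(CONCEPT, "rdf:type", "skos:Concept"),
    Triple(CONCEPT, "skos:prefLabel", "avian influenza"),
    Triple(CONCEPT, "skos:altLabel", "avian flu"),
    Triple(CONCEPT, "skos:altLabel", "bird flu"),
]


def objects(triples, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]
```

Calling `objects(graph, CONCEPT, "skos:altLabel")` walks the graph and collects both alternate labels, which is the kind of lookup a query language performs over a much larger graph.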
20. SKOS Core Overview
- Basic description
  - Concept, ConceptScheme
- Labelling
  - prefLabel, altLabel, hiddenLabel
- Documentation
  - definition, scopeNote, changeNote, historyNote, editorialNote, publicNote, privateNote
- Semantic relations
  - broader, narrower, related
- Grouping
  - Collection, OrderedCollection, member, memberList
- RDF/RDFS
  - label, about, type, nodeID, Description
Derived in part from Alistair Miles, "SKOS: A language to describe simple knowledge structures for the Web", XTech 2005
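21. SKOS Core Example (cont'd)
[Figure: RDF graph of the resource cipkmconceptsavianinfluenza, a skos:Concept identified by rdf:about]
- rdfs:label -> "avian influenza" (literal)
- skos:definition -> "Avian influenza is ..." (literal)
- skos:prefLabel -> "avian influenza" (literal)
- skos:altLabel -> "avian flu" (literal)
- skos:altLabel -> "bird flu" (literal)
- skos:broader -> cipkmconceptspandemicinfluenza (resource)
Legend: Resource, Literal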
22. SKOS Core Example (cont'd)
<skos:Concept rdf:about="cipkmconceptsavianinfluenza">
  <rdfs:label>avian influenza</rdfs:label>
  <skos:definition>Avian influenza is flu from viruses adapted to birds, and is noninfectious for most species.</skos:definition>
  <skos:prefLabel>avian influenza</skos:prefLabel>
  <skos:altLabel>avian flu</skos:altLabel>
  <skos:altLabel>bird flu</skos:altLabel>
  <skos:altLabel>ai</skos:altLabel>
  <skos:altLabel>h5n1</skos:altLabel>
  <skos:altLabel>a/h5n1</skos:altLabel>
  <skos:altLabel>a(h5n1)</skos:altLabel>
  <skos:altLabel>a(h5)</skos:altLabel>
  <skos:altLabel>a/h5</skos:altLabel>
  <skos:broader rdf:resource="cipkmconceptspandemicinfluenza"/>
  <cipkm:frequency>0</cipkm:frequency>
  <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>
</skos:Concept>
Basic description: skos:Concept, rdf:about, skos:ConceptScheme
23. SKOS Core Example (cont'd)
Same avian influenza concept as on the previous slide, highlighting the labelling properties:
- rdfs:label
- skos:prefLabel
- skos:altLabel
- skos:hiddenLabel (not used in this concept)
24. SKOS Core Example (cont'd)
Same avian influenza concept, highlighting documentation and semantic relations:
- skos:definition
- skos:broader
- skos:narrower (not used in this concept)
- skos:related (not used in this concept)
25. Adding to the SKOS Core
- cipkm is a user-defined namespace with its own vocabulary
  - <cipkm:frequency>0</cipkm:frequency>: used for prioritizing the concept
  - <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>: can be used to disambiguate keywords
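To show how a program might consume a SKOS concept like the one in this example, here is a minimal Python sketch using only the standard library. The concept URI `http://example.org/concepts#avianinfluenza` and the function `read_labels` are assumptions made for illustration; the production application used Jena, not this code.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Abbreviated copy of the example concept; the rdf:about URI is hypothetical.
SKOS_XML = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/concepts#avianinfluenza">
    <skos:prefLabel>avian influenza</skos:prefLabel>
    <skos:altLabel>avian flu</skos:altLabel>
    <skos:altLabel>bird flu</skos:altLabel>
    <skos:altLabel>h5n1</skos:altLabel>
  </skos:Concept>
</rdf:RDF>
"""


def read_labels(xml_text):
    """Map each concept's preferred label to its list of alternate labels."""
    root = ET.fromstring(xml_text)
    labels = {}
    for concept in root.findall("skos:Concept", NS):
        pref = concept.findtext("skos:prefLabel", namespaces=NS)
        alts = [e.text for e in concept.findall("skos:altLabel", NS)]
        labels[pref] = alts
    return labels
```

The resulting dictionary keeps the alternate labels in document order, which matters later if listing order is used as a priority.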
26. Synonym Expansion Process Overview
- One-time conversion: Source of synonyms -> Synonyms in SKOS
1. Enter keyword in search box
2. Expand keyword into all synonyms (from: Synonyms in SKOS)
3. Create a request for the most important synonyms
4. Retrieve documents containing these synonyms (from: Documents in Database)
29. Categories of Synonyms
- Domain-independent synonyms
  - Use WordNet
    - a large lexical database of English
    - developed under the direction of George A. Miller at Princeton University (Cognitive Science Lab)
    - consists of nouns, verbs, adjectives and adverbs
  - We only used nouns from WordNet
    - the keywords analysts use tend to be concrete things
  - Obtained an RDF-based version of WordNet 2.0 created by Mark van Assem (Vrije Universiteit Amsterdam), available on the W3C website
  - Required a SKOS version of WordNet
    - Solution: XSLT stylesheets to transform the RDF version of WordNet into a SKOS version of WordNet
- Domain-specific synonyms
  - drawn from the NISAC analyst community
30. RDF to SKOS
- Most SKOS is created
  - automatically from another format via some transformation program
  - or using a text editor (i.e., by hand)
  - or using an IDE (Integrated Development Environment)
    - ThManager
31. RDF to SKOS
Synonyms in RDF -> Transformer -> Synonyms in SKOS
32. RDF to SKOS
Synonyms in RDF -> XSLT Processor -> Synonyms in SKOS
33. XSLT/XPath
- XSLT (eXtensible Stylesheet Language Transformations)
  - W3C Recommendation
  - is a language for transforming XML documents into other XML or arbitrary text documents (HTML or text)
  - recursive functional programming language
- XPath (XML Path Language)
  - W3C Recommendation
  - is a language for extracting parts of an XML document
  - designed to be used by XSLT
- Saxon
  - XSLT and XPath processor
Sources: "XSLT: Working with XML and HTML" by Khun Yee Fung, 2001, and http://www.w3.org/TR/xpath
34. XSLT Overview
- XSLT is a language designed to transform
  - XML input
  - into XML or arbitrary text output
- Recursion is an integral part of all advanced uses of XSLT
INPUT (XML Document) -> TRANSFORM (XSLT Processor) -> OUTPUT (Transformed XML, HTML, or text Document)
35. XSLT Overview (cont'd)
Input:
<rdf:Description rdf:about="&wn20instances;synset-grave-noun-2">
  <wn20schema:senseLabel>grave</wn20schema:senseLabel>
  <wn20schema:senseLabel>tomb</wn20schema:senseLabel>
</rdf:Description>
-> XSLT Processor ->
Output:
<skos:Concept rdf:about="http://www.w3.org/2006/03/wn/wn20/instances/synset-grave-noun-2">
  <rdfs:label>grave</rdfs:label>
  <skos:definition>(a place for the burial of a corpse (especially beneath the ground and marked by a tombstone); "he put flowers on his mother's grave")</skos:definition>
  <skos:prefLabel rdf:nodeID="grave-noun-2"/>
  <cipkm:frequency>2</cipkm:frequency>
  <skos:altLabel rdf:nodeID="grave-noun-2-tomb"/>
</skos:Concept>
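The shape of this transformation can be sketched in Python with the standard library. This is not the XSLT 2.0 stylesheet that was run through Saxon; it is a swapped-in illustration of the same input-to-output mapping. The function name `synset_to_concept` is hypothetical, and treating the first sense label as the preferred label is an assumption read off the example output.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
WN = "http://www.w3.org/2006/03/wn/wn20/schema/"
INSTANCES = "http://www.w3.org/2006/03/wn/wn20/instances/"

SOURCE = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:wn20schema="{WN}">
  <rdf:Description rdf:about="{INSTANCES}synset-grave-noun-2">
    <wn20schema:senseLabel>grave</wn20schema:senseLabel>
    <wn20schema:senseLabel>tomb</wn20schema:senseLabel>
  </rdf:Description>
</rdf:RDF>
"""


def synset_to_concept(description):
    """Build a skos:Concept element from one WordNet synset description."""
    about = description.get(f"{{{RDF}}}about")
    senses = [e.text for e in description.findall(f"{{{WN}}}senseLabel")]
    node_id = about.rsplit("synset-", 1)[1]  # e.g. "grave-noun-2"
    concept = ET.Element(f"{{{SKOS}}}Concept", {f"{{{RDF}}}about": about})
    # Assumption: the first sense label stands for the preferred label.
    ET.SubElement(concept, f"{{{SKOS}}}prefLabel", {f"{{{RDF}}}nodeID": node_id})
    for sense in senses[1:]:
        ET.SubElement(concept, f"{{{SKOS}}}altLabel",
                      {f"{{{RDF}}}nodeID": f"{node_id}-{sense}"})
    return concept


root = ET.fromstring(SOURCE)
concepts = [synset_to_concept(d) for d in root.findall(f"{{{RDF}}}Description")]
```

The sketch omits the definition and frequency elements, which in the real conversion come from other WordNet files.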
36. XSLT Overview (cont'd)
- XSLT stylesheets are XML documents themselves
- They generally start with an xsl:stylesheet element, with version and namespace declarations
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:cipkm="https://cip.sandia.gov/CIPKM.owl"
    xmlns:wn20instances="http://www.w3.org/2006/03/wn/wn20/instances/"
    xmlns:wn20schema="http://www.w3.org/2006/03/wn/wn20/schema/"
    version="2.0">
Highlighted: the xsl:stylesheet element, its namespace declarations, and the version attribute
37. XSLT Structure
- XSLT
  - is a recursive functional programming language
  - like programs in other languages, stylesheets can become quite long and complex
  - uses XPath to extract parts of the XML source document
- XSLT stylesheets can be viewed as stylesheet programs containing stylesheet modules
  - External modules: including zero or more additional stylesheets using <xsl:include> or <xsl:import>
  - Internal modules: providing one or more templates in a stylesheet using <xsl:template>
- Templates provide a way to modularize and reuse code
Source: "XSLT 2.0 Programmer's Reference" by Michael Kay, 2004
38. Simple XSLT Overview
- Basic Elements
  - stylesheet, template, version, type
- Control Flow
  - choose, when, otherwise, for-each, apply-templates, call-template
- Elements
  - variable, element, attribute
- Matching
  - match, name, test, select, value-of, analyze-string, matching-substring
39. XSLT Example
- Pseudocode
  - Define a template
  - Match part of the source XML doc
    - Specifically match any element tag rdf:RDF
  - Gather together all nodes under this tag
    - Specifically those that match rdf:Description
  - Find another template that matches each of these nodes
<xsl:template match="rdf:RDF">
  <xsl:apply-templates select="rdf:Description"/>
</xsl:template>
Highlighted elements: xsl:template, xsl:apply-templates, xsl:call-template
41. XSLT Example (cont'd)
<xsl:template match="rdf:Description">
  <xsl:choose>
    <xsl:when test="@wn20schema:senseLabel"/>
    <xsl:otherwise>
      <xsl:choose>
        <xsl:when test="matches(@rdf:about, '-noun-')">
          <xsl:element name="skos:Concept">
            <xsl:variable name="synsetName">
              <xsl:value-of select="@rdf:about"/>
            </xsl:variable>
            <xsl:attribute name="rdf:about">
              <xsl:value-of select="$synsetName"/>
            </xsl:attribute>
            <xsl:variable name="senseName">
              <xsl:analyze-string select="@rdf:about" regex="(.*synset-)(.*)(-.*-.*)">
                <xsl:matching-substring>
                  <xsl:value-of select="translate(regex-group(2),'_',' ')"/>
                </xsl:matching-substring>
              </xsl:analyze-string>
            </xsl:variable>
            ...
Highlighted elements: xsl:choose, xsl:when, xsl:otherwise, xsl:element, xsl:for-each
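The string handling in this template can be tried out in Python. The sketch below assumes the stylesheet's regular expression is `(.*synset-)(.*)(-.*-.*)`; the function names `is_noun_synset` and `sense_name` are hypothetical, and the second URI in the usage note is made up for illustration.

```python
import re

# Group 2 captures the word(s) between "synset-" and the "-<pos>-<number>" tail.
SYNSET_PATTERN = re.compile(r"(.*synset-)(.*)(-.*-.*)")


def is_noun_synset(about):
    """Mirror the matches(@rdf:about, '-noun-') test."""
    return "-noun-" in about


def sense_name(about):
    """Extract the readable sense name, turning underscores into spaces."""
    match = SYNSET_PATTERN.search(about)
    if match is None:
        return None
    return match.group(2).replace("_", " ")
```

For the synset URI ending in `synset-grave-noun-2` this yields `grave`, and a hypothetical `synset-bird_flu-noun-1` would yield `bird flu`. Greedy matching backtracks until the tail still contains two hyphens, so the part-of-speech and sense number never leak into the name.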
42. XPath Example
- General idea
  - Here XPath is being used in an XSLT stylesheet
  - It is extracting the definition (or gloss) of a term's concept from another XML document
<xsl:value-of select="document('wordnet-glossary.rdf', document(''))//rdf:Description[@rdf:about = $synsetName]/wn20schema:gloss"/>
- document('wordnet-glossary.rdf', ...): another XML file on the file system
- $synsetName: variable representing the concept name
- wn20schema:gloss: extract definition / glossary
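The same lookup can be approximated in Python for readers without an XSLT processor at hand. This is a sketch only: the glossary content below is an abbreviated stand-in for `wordnet-glossary.rdf`, and `find_gloss` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
WN = "http://www.w3.org/2006/03/wn/wn20/schema/"
INSTANCES = "http://www.w3.org/2006/03/wn/wn20/instances/"

# Stand-in for the glossary file; the gloss text is shortened for this sketch.
GLOSSARY = f"""\
<rdf:RDF xmlns:rdf="{RDF}" xmlns:wn20schema="{WN}">
  <rdf:Description rdf:about="{INSTANCES}synset-grave-noun-2">
    <wn20schema:gloss>a place for the burial of a corpse</wn20schema:gloss>
  </rdf:Description>
</rdf:RDF>
"""


def find_gloss(glossary_root, synset_name):
    """Return the gloss of the description whose rdf:about equals synset_name."""
    for description in glossary_root.iter(f"{{{RDF}}}Description"):
        if description.get(f"{{{RDF}}}about") == synset_name:
            return description.findtext(f"{{{WN}}}gloss")
    return None


glossary_root = ET.fromstring(GLOSSARY)
```

A linear scan like this is fine for a demonstration; for a glossary the size of WordNet, building a dictionary keyed by `rdf:about` once would avoid rescanning the file for every concept.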
43. Details of Conversion
- Performance considerations
  - need to limit the number of synonyms used in building the Oracle SQL statement that returns documents
  - Solution: prioritize the synonyms and return the best ones
- Problem
  - attaching arbitrary priority attributes to SKOS elements causes XML parsing errors because it does not fit in with the SKOS schema
  - changing the SKOS schema is not a desirable option because it is an emerging standard
- Led to a two-pass approach in creating the SKOS result
  - added on extra priority-related elements
44. Details of Conversion (cont'd)
From previous examples:
<skos:prefLabel>avian_influenza</skos:prefLabel>
First pass of conversion:
<skos:Concept rdf:about="cipkmconceptsavianinfluenza">
  <rdfs:label>avian influenza</rdfs:label>
  <skos:definition>Avian influenza is flu from viruses adapted to birds, and is noninfectious for most species.</skos:definition>
  <skos:prefLabel rdf:nodeID="avian_influenza"/>
  <skos:altLabel rdf:nodeID="avian_influenza-avian_flu"/>
  <skos:altLabel rdf:nodeID="avian_influenza-bird_flu"/>
  <skos:altLabel rdf:nodeID="avian_influenza-ai"/>
  <skos:altLabel rdf:nodeID="avian_influenza-h5n1"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a/h5n1"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a(h5n1)"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a(h5)"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a/h5"/>
  <skos:broader rdf:resource="cipkmconceptspandemicinfluenza"/>
  <cipkm:frequency>0</cipkm:frequency>
  <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>
</skos:Concept>
45. Details of Conversion (cont'd)
Second pass:
<rdf:Description rdf:nodeID="avian_influenza">
  <rdfs:label>AVIAN INFLUENZA</rdfs:label>
  <cipkm:priority>0</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-avian_flu">
  <rdfs:label>AVIAN FLU</rdfs:label>
  <cipkm:priority>1</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-bird_flu">
  <rdfs:label>BIRD FLU</rdfs:label>
  <cipkm:priority>2</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-ai">
  <rdfs:label>AI</rdfs:label>
  <cipkm:priority>3</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-h5n1">
  <rdfs:label>H5N1</rdfs:label>
  <cipkm:priority>4</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a(h5n1)">
  <rdfs:label>A(H5N1)</rdfs:label>
  <cipkm:priority>5</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a/h5n1">
  <rdfs:label>A/H5N1</rdfs:label>
  <cipkm:priority>6</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a(h5)">
  <rdfs:label>A(H5)</rdfs:label>
  <cipkm:priority>7</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a/h5">
  <rdfs:label>A/H5</rdfs:label>
  <cipkm:priority>8</cipkm:priority>
</rdf:Description>
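The naming and numbering pattern visible in the two passes can be sketched as a small Python function. The function `label_nodes` is hypothetical, and assigning priorities by listing order is an assumption inferred from the example output, not a documented rule of the conversion.

```python
def label_nodes(pref_label, alt_labels):
    """Return (node_id, display_label, priority) for each label of a concept.

    The preferred label gets priority 0; alternate labels are numbered
    in the order given (an assumption based on the example output).
    """
    pref_id = pref_label.replace(" ", "_")
    nodes = [(pref_id, pref_label.upper(), 0)]
    for priority, alt in enumerate(alt_labels, start=1):
        alt_id = f"{pref_id}-{alt.replace(' ', '_')}"
        nodes.append((alt_id, alt.upper(), priority))
    return nodes
```

Each tuple corresponds to one rdf:Description in the second pass: the node ID links it back to the skos:prefLabel or skos:altLabel written in the first pass, so the priority lives outside the SKOS elements themselves.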
46. Synonym Expansion Process Overview
- One-time conversion: Source of synonyms -> Synonyms in SKOS
1. Enter keyword in search box
2. Use SPARQL to expand keyword into all synonyms (from: Synonyms in SKOS)
3. Create a request for the most important synonyms
4. Retrieve documents containing these synonyms (from: Documents in Database)
48. SPARQL
- Query Language for RDF
  - W3C Working Draft
- "SQL for the Semantic Web"
  - SQL
    - SELECT * FROM tablename
  - SPARQL
    - SELECT ?x ?y ?z WHERE { ?x ?y ?z }
- Reference implementation is currently ARQ, which is bundled with Jena
  - a Java API for building Semantic Web applications
  - provides programmatic access to RDF, RDFS, OWL
  - includes SPARQL and a rule-based inference engine
- Used to return the synonym expansion of keywords represented in SKOS
49. SPARQL Implementation Issues
- Stored SKOS and ontologies in Oracle using Jena
  - As with SQL queries, performance is best when the most specific items are searched for before less specific ones
  - When reloading the database an index is lost
    - Jena's default index is not optimized for our SPARQL query
    - Must manually optimize the database table whenever the ontology model is reloaded
- Output is a union of three queries, covering three cases
  - when the search keyword matches a preferred label
    - all alternate labels are returned
  - when the search keyword matches an alternate label
    - preferred label is returned
  - when the search keyword matches an alternate label
    - all other alternate labels of the same concept are returned
- Other fields are returned
  - allows the calling program to sort the synonyms in priority order
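The three cases can be illustrated without a triple store. The Python sketch below works on an in-memory dictionary rather than SPARQL over Jena; the function `expand`, the dictionary layout, and the rule that a lower number means a higher priority are all assumptions made for this illustration.

```python
def expand(keyword, concepts):
    """Return the synonyms of keyword, best priority first.

    concepts maps a preferred label to its ordered alternate labels.
    Priority 0 is the preferred label; alternates count up from 1.
    """
    key = keyword.lower()
    ranked = []  # (priority, label) pairs
    for pref, alts in concepts.items():
        if key == pref:
            # Case 1: keyword is the preferred label -> all alternate labels.
            ranked += [(i, alt) for i, alt in enumerate(alts, start=1)]
        elif key in alts:
            # Case 2: keyword is an alternate label -> the preferred label.
            ranked.append((0, pref))
            # Case 3: keyword is an alternate label -> the other alternates.
            ranked += [(i, alt) for i, alt in enumerate(alts, start=1)
                       if alt != key]
    ranked.sort(key=lambda pair: pair[0])
    return [label for _, label in ranked]


CONCEPTS = {"avian influenza": ["avian flu", "bird flu", "h5n1"]}
```

With this data, searching for `bird flu` returns the preferred label first, followed by the remaining alternates, while an unknown keyword returns an empty list so the caller can fall back to the original search term.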
50. SPARQL Query Example
SELECT ?synonym ?frequency ?priority ?prefLabel ?altLabel
WHERE {
  { ?y rdfs:label searchString .
    ?x skos:prefLabel ?y .
    ?x skos:altLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?prefLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
  UNION
  { ?y rdfs:label searchString .
    ?x skos:altLabel ?y .
    ?x skos:prefLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?altLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
  UNION
  { ?y rdfs:label searchString .
    ?x skos:altLabel ?y .
    ?x skos:altLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?altLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
}
- First branch: find synonyms in which searchString is the prefLabel -> return all alternate labels
- Second branch: find synonyms in which searchString is an altLabel -> return its prefLabel
- Third branch: find synonyms in which searchString is an altLabel -> return all its altLabels
- SELECT clause: return all synonyms in priority order (based on prefLabel or altLabel branching)
52. Synonym Expansion Process Overview
- One-time conversion: Source of synonyms -> Synonyms in SKOS
1. Enter keyword in search box
2. Use SPARQL to expand keyword into all synonyms (from: Synonyms in SKOS)
3. Create SQL statement for the top (highest priority) synonyms
4. Pass SQL statement to Oracle to retrieve documents (from: Documents in Oracle)
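Step 3 can be sketched as follows. The table name `documents`, the columns `doc_id` and `doc_text`, the `LIKE` matching, and the default limit of five synonyms are all hypothetical; the slides do not show the actual Oracle schema or statement. Bind variables are used so that synonym text is never pasted into the SQL string.

```python
def build_document_query(synonyms, limit=5):
    """Build a SQL statement and bind values for the top `limit` synonyms.

    `synonyms` is expected to be sorted already, best priority first.
    """
    top = synonyms[:limit]
    if not top:
        raise ValueError("at least one synonym is required")
    clauses = " OR ".join(
        f"UPPER(doc_text) LIKE :s{i}" for i in range(len(top)))
    sql = f"SELECT doc_id FROM documents WHERE {clauses}"
    binds = {f"s{i}": f"%{syn.upper()}%" for i, syn in enumerate(top)}
    return sql, binds
```

Capping the list is what keeps the statement small: however many synonyms the expansion returns, only the first `limit` of them reach the database.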
56. Lessons Learned (Good & Bad)
- WordNet
  - original RDF version (obtained from W3C) was incomplete
    - Mark van Assem corrected it (Prolog program)
  - we posted the SKOS version, along with the XSLT stylesheets used to create it, back to the community
    - Posting to Semantic Technology Research Community
  - contains many undesirable (unprofessional) words that we filtered out
- SKOS & SPARQL
  - we had to escape certain characters
    - in SKOS: < > &
      - R & D -> R &amp; D
    - in SPARQL query: '
      - President's Directive -> President\\'s Directive
- Jena support group was very helpful
  - modified Jena to add extra log4j debug statements for us
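Both kinds of escaping can be sketched in Python. The helper names are hypothetical. The slide shows a doubled backslash, presumably because the query was embedded in a Java string literal (an assumption); the function below produces the single backslash that the SPARQL string itself needs.

```python
from xml.sax.saxutils import escape


def escape_for_xml(text):
    """Escape &, < and > so the text is safe inside an XML element."""
    return escape(text)


def escape_for_sparql(text):
    """Escape backslashes and single quotes for a '...' SPARQL literal."""
    return text.replace("\\", "\\\\").replace("'", "\\'")
```

Backslashes are handled first so that the backslash added in front of an apostrophe is not itself doubled.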
57. Tools and Support
- Jena and SPARQL (API for Programmatic Access & Query)
  - Yahoo Groups support site
    - Free membership in Yahoo Groups required to post
  - Timely support, RSS feeds, user conference
- SKOS (Lightweight Taxonomy Framework)
  - Mailing list
  - Wiki
  - Posting of SKOS version of WordNet to Semantic Technology Research Community
  - ThManager
- XML Processors
  - XSLT / XPath
  - XSLT Processors
    - Saxon
      - Commercial and open-source versions available
      - Needed an XSLT 2.0 processor
      - Michael Kay's Blog
    - XMLSpy
      - Has a nice XSLT debugger
      - Fixed stack size for XSLT recursion
58. Links for previous slide (Tools and Support)
- Jena (http://jena.sourceforge.net/)
- SPARQL (http://www.w3.org/TR/rdf-sparql-query/)
- Jena Yahoo Groups support site (http://groups.yahoo.com/group/jena-dev/)
- Jena User Conference (http://jena.hpl.hp.com/juc2006/index.html)
- SKOS (http://www.w3.org/2004/02/skos/)
- Mailing list (http://www.w3.org/2004/02/skos/mail)
- Wiki (http://esw.w3.org/topic/SkosDev)
- Posting of SKOS version of WordNet to Semantic Technology Research Community (http://esw.w3.org/topic/SkosDev/DataZone)
- ThManager (http://thmanager.sourceforge.net/)
- XSLT (http://www.w3.org/TR/xslt)
- XPath (http://www.w3.org/TR/xpath)
- XSLT Processors
  - Saxon (http://saxon.sourceforge.net/)
  - Michael Kay's Blog (http://saxonica.blogharbor.com/blog)
  - XMLSpy (http://www.altova.com/products/xmlspy/xml_editor.html)
  - Xalan (http://xalan.apache.org/)
  - MSXML (http://msdn2.microsoft.com/en-us/xml/default.aspx)
59. Introductory Resources
- Introductory Overviews to the Semantic Web
  - The 2006 IEEE Intelligent Systems article "The Semantic Web Revisited", by Nigel Shadbolt, Wendy Hall, and Tim Berners-Lee
    - A five-year reassessment and follow-up to the famous 2001 Scientific American article "The Semantic Web", which inaugurated the field of the Semantic Web
  - A Semantic Web Primer by Grigoris Antoniou and Frank van Harmelen
- Specific Technical Sources
  - SKOS
    - W3C Tutorial
    - SKOS Introduction by Peter Mikhalenko on www.xml.com
  - XSLT/XPath
    - Beginning XSLT by Jeni Tennison
    - XSLT 2.0 and XPath 2.0 by Michael Kay
  - Other
    - Saxon website
60. Links for previous slide (Introductory Resources)
- Introductory Overviews to the Semantic Web
  - The Semantic Web Revisited (http://eprints.ecs.soton.ac.uk/12614/01/Semantic_Web_Revisted.pdf)
  - The Semantic Web (http://www.sciam.com/print_version.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21)
  - A Semantic Web Primer (http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10140)
- Specific Technical Sources
  - SKOS W3C Tutorial (http://www.w3.org/2004/02/skos)
  - SKOS Introduction (http://www.xml.com/pub/a/2005/06/22/skos.html)
  - Beginning XSLT (ISBN-10: 1861005946; ISBN-13: 978-1861005946)
  - XSLT 2.0 (ISBN-10: 0764569090; ISBN-13: 978-0764569098)
  - XPath 2.0 (ISBN-10: 0764569104; ISBN-13: 978-0764569104)
  - Saxon (http://saxon.sourceforge.net/)
61. Acknowledgements
- The National Infrastructure Simulation and Analysis Center (NISAC) is a program under the Department of Homeland Security's (DHS) Preparedness Directorate. Sandia National Laboratories (SNL) and Los Alamos National Laboratory (LANL) are the prime contractors for NISAC under the programmatic direction of DHS's Infrastructure Protection/Risk Management Division.
- Sandia is a multiprogram laboratory operated by Sandia Corporation, a Lockheed Martin Company, for the United States Department of Energy's National Nuclear Security Administration under contract DE-AC04-94AL85000.
62. Question & Answer