SKOS, XSLT, and Transforming RDF Resources for use in a Production Semantic Technologies Application

1
SKOS, XSLT, and Transforming RDF Resources for
use in a Production Semantic Technologies
Application
  • Bettina K. Schimanski, PhD
  • John M. Linebarger, PhD
  • May 23, 2007
  • Semantic Technology Conference 2007
  • San Jose, California, USA

2
Outline
  • Introduction: Our First Production Semantic
    Technologies Application
  • Motivation
  • Objectives
  • Overview and Demo
  • Technologies Used
  • SKOS/RDF/RDFS
  • XSLT/XPath
  • SPARQL
  • Lessons Learned
  • Tools and Support
  • Introductory Resources
  • Moral
  • Complexity Increased Quickly
  • Integrated Many Technologies

3
Introduction: The NISAC Program
  • The National Infrastructure Simulation and
    Analysis Center (NISAC) program
  • NISAC is often called upon to quickly analyze the
    impact on critical infrastructures of a potential
    future event
  • Fast Analysis and Simulation Team (FAST)
    exercises
  • Time-limited (from four hours to several days)

4
NISAC CIP KM Portal
  • Critical Infrastructure Protection (CIP)
    Knowledge Management (KM) Portal
  • Supports rapid access of information during a
    FAST exercise
  • Documents
  • Presentations
  • Media files
  • Links to external Web pages

6
Motivation
  • Problem
  • Need better search method for the CIP KM Portal
  • Solution
  • Widen the search to retrieve additional related
    documents by expanding keywords or phrases to
    include related synonyms
  • Filter out only the most relevant documents

7
Motivation
  • Examples
  • corn → maize, ...
  • MI → Michigan, mile, ...
  • bird flu → avian flu, avian influenza, h5n1,
    a/h5, ...

8
Synonym Expansion Process Overview
Enter keyword in search box
Synonyms
Expand keyword into all synonyms
Create a request for the most important synonyms
Documents in Database
Retrieve documents containing these synonyms
9
Synonym Expansion Demo
12
Objectives
  • What are the requirements for Synonym Expansion?
  • Need to link the synonyms to our existing CIP KM
    domain ontology (in OWL) and vice versa
  • Otherwise simply adding them to our database
    would have been sufficient
  • Need a simple language for expressing
    vocabularies of concepts in a
    machine-understandable way
  • Need a source of synonyms
  • Domain independent
  • Domain specific

14
Possible Representations
OWL and SKOS are both RDF-based, therefore both
can link to the CIP KM OWL Ontology
15
According to Alistair Miles
However, the two approaches do complement each
other and can be used in combination
17
Layers of Representation
SKOS
OWL
RDF SCHEMA
RDF
XML
(EXtensible Markup Language)
18
SKOS
  • An emerging standard for representing thesauri,
    simple taxonomies, glossaries and controlled
    vocabularies
  • SKOS Core Vocabulary is based on RDF/RDFS
  • RDF (Resource Description Framework)
  • a general-purpose language for representing
    information on the Web
  • provides a network representation of resources
  • RDFS (RDF Schema)
  • describes how to use RDF for RDF vocabularies on
    the Web
  • provides class and property hierarchies
  • Being RDF-based allows easy linking to other
    RDF-based formats (such as OWL)
  • Can express content and structure of a concept
    scheme as an RDF graph

19
RDF Graph Notation
  • An RDF Statement is a triple
  • object-attribute-value
  • or subject-predicate-object
  • or resource-property-<resource or literal>

Image Source: http://www.w3.org/TR/swbp-skos-core-guide/
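The triple structure above can be sketched in a few lines of Python. This is only an illustrative in-memory model, not how Jena or any RDF store represents data. The `Triple` type, the `match` helper, and the `ex:` identifiers are hypothetical names introduced for this sketch; the labels come from the avian influenza example used in this talk.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str    # the resource being described
    predicate: str  # the property
    obj: str        # another resource, or a literal value


# A tiny graph: one concept with a preferred and two alternate labels.
graph = [
    Triple("ex:avianinfluenza", "skos:prefLabel", "avian influenza"),
    Triple("ex:avianinfluenza", "skos:altLabel", "avian flu"),
    Triple("ex:avianinfluenza", "skos:altLabel", "bird flu"),
    Triple("ex:avianinfluenza", "skos:broader", "ex:pandemicinfluenza"),
]


def match(triples: List[Triple],
          s: Optional[str] = None,
          p: Optional[str] = None,
          o: Optional[str] = None) -> List[Triple]:
    """Return the triples that agree with every position that is not None."""
    return [
        t for t in triples
        if (s is None or t.subject == s)
        and (p is None or t.predicate == p)
        and (o is None or t.obj == o)
    ]


alt_labels = [t.obj for t in match(graph, p="skos:altLabel")]
```

Leaving a position as `None` plays the role of a variable, which is the same idea a SPARQL triple pattern expresses with `?x`.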
20
SKOS Core Overview
  • Basic description
  • Concept, ConceptScheme
  • Labelling
  • prefLabel, altLabel, hiddenLabel
  • Documentation
  • definition, scopeNote, changeNote, historyNote,
    editorialNote, publicNote, privateNote
  • Semantic relations
  • broader, narrower, related
  • Grouping
  • Collection, OrderedCollection, member, memberList
  • RDF/RDFS
  • Label, about, type, nodeID, Description

Derived in part from Alistair Miles, "SKOS: A
language to describe simple knowledge structures
for the Web", XTech 2005
21
SKOS Core Example (contd)
[Figure: RDF graph of the avian influenza concept]
  • Resource cipkmconceptsavianinfluenza (rdf:about),
    a skos:Concept
  • rdfs:label: avian influenza
  • skos:definition: Avian influenza is ...
  • skos:prefLabel: avian influenza
  • skos:altLabel: avian flu
  • skos:altLabel: bird flu
  • skos:broader: cipkmconceptspandemicinfluenza
  • Legend: Resource, Literal
22
SKOS Core Example (contd)
Callouts: skos:Concept, rdf:about, skos:ConceptScheme
<skos:Concept rdf:about="cipkmconceptsavianinfluenza">
  <rdfs:label>avian influenza</rdfs:label>
  <skos:definition>Avian influenza is flu from viruses adapted to birds, and is noninfectious for most species.</skos:definition>
  <skos:prefLabel>avian influenza</skos:prefLabel>
  <skos:altLabel>avian flu</skos:altLabel>
  <skos:altLabel>bird flu</skos:altLabel>
  <skos:altLabel>ai</skos:altLabel>
  <skos:altLabel>h5n1</skos:altLabel>
  <skos:altLabel>a/h5n1</skos:altLabel>
  <skos:altLabel>a(h5n1)</skos:altLabel>
  <skos:altLabel>a(h5)</skos:altLabel>
  <skos:altLabel>a/h5</skos:altLabel>
  <skos:broader rdf:resource="cipkmconceptspandemicinfluenza"/>
  <cipkm:frequency>0</cipkm:frequency>
  <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>
</skos:Concept>
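As a rough illustration of how such a concept can be read programmatically, the sketch below parses a cut-down SKOS document with Python's standard `xml.etree.ElementTree`. The production system described in this talk used Jena, not this code. The `read_labels` helper and the `http://example.org/` identifier are assumptions made for the example, and only the SKOS label elements are included.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}

# Abbreviated sample data; the rdf:about value is a placeholder.
SKOS_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#">
  <skos:Concept rdf:about="http://example.org/concepts#avianinfluenza">
    <skos:prefLabel>avian influenza</skos:prefLabel>
    <skos:altLabel>avian flu</skos:altLabel>
    <skos:altLabel>bird flu</skos:altLabel>
    <skos:altLabel>h5n1</skos:altLabel>
  </skos:Concept>
</rdf:RDF>"""


def read_labels(xml_text):
    """Map each concept's preferred label to its list of alternate labels."""
    root = ET.fromstring(xml_text)
    labels = {}
    for concept in root.findall("skos:Concept", NS):
        pref = concept.findtext("skos:prefLabel", namespaces=NS)
        alts = [e.text for e in concept.findall("skos:altLabel", NS)]
        labels[pref] = alts
    return labels


labels = read_labels(SKOS_XML)
```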
23
SKOS Core Example (contd)
Labelling properties highlighted in the avian
influenza XML shown earlier:
  • rdfs:label
  • skos:prefLabel
  • skos:altLabel
  • skos:hiddenLabel
24
SKOS Core Example (contd)
Documentation and semantic-relation properties
highlighted in the avian influenza XML shown earlier:
  • skos:definition
  • skos:broader
  • skos:narrower
  • skos:related
25
Adding to the SKOS Core
  • cipkm is a user-defined namespace with its own
    vocabulary

In the avian influenza XML shown earlier:
  • <cipkm:frequency>0</cipkm:frequency>
    Used for prioritizing the concept
  • <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>
    Can be used to disambiguate keywords
26
Synonym Expansion Process Overview
Enter keyword in search box
Synonyms in SKOS
Expand keyword into all synonyms
Create a request for the most important synonyms
Documents in Database
Retrieve documents containing these synonyms
27
Synonym Expansion Process Overview
Source of synonyms
Enter keyword in search box
one time conversion
Synonyms in SKOS
Expand keyword into all synonyms
Create a request for the most important synonyms
Documents in Database
Retrieve documents containing these synonyms
29
Categories of Synonyms
  • Domain independent synonyms
  • Use WordNet
  • a large lexical database of English
  • developed under direction of George A. Miller at
    Princeton University (Cognitive Science Lab)
  • consists of nouns, verbs, adjectives and adverbs
  • We only used nouns from WordNet
  • the keywords analysts use tend to be concrete
    things
  • Obtained an RDF-based version of WordNet 2.0
    created by Mark van Assem (Vrije Universiteit
    Amsterdam), available on the W3C website
  • Required SKOS version of WordNet
  • Solution: XSLT stylesheets to transform RDF
    version of WordNet into SKOS version of WordNet
  • Domain specific synonyms
  • drawn from NISAC analyst community

30
RDF to SKOS
  • Most SKOS is created
  • automatically from another format via some
    transformation program
  • or using a text editor (i.e. by hand)
  • or using an IDE (Integrated Development
    Environment)
  • ThManager

31
RDF to SKOS
Transformer
Synonyms in RDF
Synonyms in SKOS
32
RDF to SKOS
XSLT Processor
Synonyms in RDF
Synonyms in SKOS
33
XSLT/XPath
  • XSLT (EXtensible Stylesheet Language
    Transformations)
  • W3C Recommendation
  • is a language for transforming XML documents into
    other XML or arbitrary text documents (HTML or
    text)
  • recursive functional programming language
  • XPath (XML Path Language)
  • W3C Recommendation
  • is a language for extracting parts of an XML
    document
  • designed to be used by XSLT
  • Saxon
  • XSLT and XPath processor

Sources: XSLT: Working with XML and HTML by
Khun Yee Fung, 2001, and http://www.w3.org/TR/xpath
34
XSLT Overview
  • XSLT is a language designed to transform
  • XML input
  • into XML or arbitrary text output
  • Recursion is an integral part of all advanced
    uses of XSLT

INPUT
TRANSFORM
OUTPUT
XSLT Processor
XML Document
Transformed XML, HTML, or text Document
35
XSLT Overview (contd)
<rdf:Description rdf:about="&wn20instances;synset-grave-noun-2">
  <wn20schema:senseLabel>grave</wn20schema:senseLabel>
  <wn20schema:senseLabel>tomb</wn20schema:senseLabel>
</rdf:Description>
XSLT Processor
<skos:Concept rdf:about="http://www.w3.org/2006/03/wn/wn20/instances/synset-grave-noun-2">
  <rdfs:label>grave</rdfs:label>
  <skos:definition>(a place for the burial of a corpse (especially beneath the ground and marked by a tombstone); "he put flowers on his mother's grave")</skos:definition>
  <skos:prefLabel rdf:nodeID="grave-noun-2"/>
  <cipkm:frequency>2</cipkm:frequency>
  <skos:altLabel rdf:nodeID="grave-noun-2-tomb"/>
</skos:Concept>
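The shape of this transformation can be imitated outside XSLT. The sketch below does so in Python with `xml.etree.ElementTree`, purely to make the input-to-output mapping concrete; it is not the authors' stylesheet. It simplifies in two ways that are assumptions of this example: labels are written as literal text rather than as `rdf:nodeID` references, and the first `senseLabel` is treated as the preferred label.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
WN = "http://www.w3.org/2006/03/wn/wn20/schema/"

SOURCE = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:wn20schema="http://www.w3.org/2006/03/wn/wn20/schema/">
  <rdf:Description rdf:about="http://www.w3.org/2006/03/wn/wn20/instances/synset-grave-noun-2">
    <wn20schema:senseLabel>grave</wn20schema:senseLabel>
    <wn20schema:senseLabel>tomb</wn20schema:senseLabel>
  </rdf:Description>
</rdf:RDF>"""


def to_skos(xml_text):
    """Turn each rdf:Description with sense labels into a skos:Concept."""
    root = ET.fromstring(xml_text)
    out = ET.Element(f"{{{RDF}}}RDF")
    for desc in root.findall(f"{{{RDF}}}Description"):
        labels = [e.text for e in desc.findall(f"{{{WN}}}senseLabel")]
        if not labels:
            continue  # nothing to convert for this node
        about = desc.get(f"{{{RDF}}}about")
        concept = ET.SubElement(out, f"{{{SKOS}}}Concept",
                                {f"{{{RDF}}}about": about})
        # Assumption: the first sense label becomes the preferred label.
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = labels[0]
        for alt in labels[1:]:
            ET.SubElement(concept, f"{{{SKOS}}}altLabel").text = alt
    return out


result = to_skos(SOURCE)
concept = result[0]
```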
36
XSLT Overview (contd)
  • XSLT stylesheets are XML documents themselves
  • They generally start with an xsl:stylesheet
    element, with version and namespace declarations

Callouts: xsl:stylesheet, namespaces, version
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:skos="http://www.w3.org/2004/02/skos/core#"
    xmlns:cipkm="https://cip.sandia.gov/CIPKM.owl"
    xmlns:wn20instances="http://www.w3.org/2006/03/wn/wn20/instances/"
    xmlns:wn20schema="http://www.w3.org/2006/03/wn/wn20/schema/"
    version="2.0">
37
XSLT Structure
  • XSLT
  • is a recursive functional programming language
  • like programs in other languages, stylesheets can
    become quite long and complex
  • uses XPath to extract parts of the XML source
    document
  • XSLT stylesheets can be viewed as stylesheet
    programs containing stylesheet modules
  • External modules including zero or more
    additional stylesheets using <xsl:include> or
    <xsl:import>
  • Internal modules providing one or more templates
    in a stylesheet using <xsl:template>
  • Templates provide a way to modularize and reuse
    code

Source: XSLT 2.0 Programmer's Reference by
Michael Kay, 2004
38
Simple XSLT Overview
  • Basic Elements
  • stylesheet, template, version, type
  • Control Flow
  • choose, when, otherwise, for-each,
    apply-templates, call-template
  • Elements
  • variable, element, attribute
  • Matching
  • match, name, test, select, value-of,
    analyze-string, matching-substring

39
XSLT Example
  • Pseudocode
  • Define a template
  • Match part of the source XML doc
  • Specifically match any element tag rdf:RDF
  • Gather together all nodes under this tag
  • Specifically those that match rdf:Description
  • Find another template that matches each of these
    nodes

Callouts: xsl:template, xsl:apply-templates
<xsl:template match="rdf:RDF">
  <xsl:apply-templates select="rdf:Description"/>
</xsl:template>
41
XSLT Example (contd)
Callouts: xsl:choose, xsl:when, xsl:otherwise, xsl:element, xsl:for-each
<xsl:template match="rdf:Description">
  <xsl:choose>
    <xsl:when test="@wn20schema:senseLabel"/>
    <xsl:otherwise>
      <xsl:choose>
        <xsl:when test="matches(@rdf:about, '-noun-')">
          <xsl:element name="skos:Concept">
            <xsl:variable name="synsetName">
              <xsl:value-of select="@rdf:about"/>
            </xsl:variable>
            <xsl:attribute name="rdf:about">
              <xsl:value-of select="$synsetName"/>
            </xsl:attribute>
            <xsl:variable name="senseName">
              <xsl:analyze-string select="@rdf:about"
                                  regex="(.*synset-)(.*)(-.*-.*)">
                <xsl:matching-substring>
                  <xsl:value-of select="translate(regex-group(2),'_',' ')"/>
                </xsl:matching-substring>
              </xsl:analyze-string>
            </xsl:variable>
            ...
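The string handling in this excerpt (keep only noun synsets, capture the word between `synset-` and the part-of-speech suffix, turn underscores into spaces) can be tried out in Python. The pattern below is a reconstruction: the `*` characters are missing from the slide text, so `(.*synset-)(.*)(-.*-.*)` is an assumption about what the original regex was. The function names are hypothetical.

```python
import re

# Assumed reconstruction of the slide's regex; group 2 is the sense name.
SYNSET_RE = re.compile(r"(.*synset-)(.*)(-.*-.*)")


def is_noun_synset(about):
    """Mirror of the matches(@rdf:about, '-noun-') test."""
    return "-noun-" in about


def sense_name(about):
    """Extract the human-readable sense name, or None if nothing matches."""
    m = SYNSET_RE.search(about)
    if m is None:
        return None
    # Same effect as translate(regex-group(2), '_', ' ') in the stylesheet.
    return m.group(2).replace("_", " ")


uri = "http://www.w3.org/2006/03/wn/wn20/instances/synset-grave-noun-2"
name = sense_name(uri)
```

Because the middle group is greedy, it backs off only far enough to leave two hyphen-separated parts for the final group, which is why the part of speech and sense number are not included in the name.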
42
XPath Example
  • General idea
  • Here XPath is being used in an XSLT stylesheet
  • It is extracting the definition (or gloss) of a
    concept of a term from another XML document

<xsl:value-of select="document('wordnet-glossary.rdf', document(''))//rdf:Description[@rdf:about=$synsetName]/wn20schema:gloss"/>
  • document('wordnet-glossary.rdf', document('')):
    another XML file on the file system
  • $synsetName: variable representing the concept name
  • wn20schema:gloss: extract definition / glossary
Details of Conversion
  • Performance considerations
  • need to limit the number of synonyms used in
    building the Oracle SQL statement that returns
    documents
  • Solution: prioritize the synonyms and return the
    best ones
  • Problem
  • attaching arbitrary priority attributes to SKOS
    elements causes XML parsing errors because it
    does not fit in with the SKOS schema
  • changing the SKOS schema is not a desirable
    option because it is an emerging standard
  • Led to a two-pass approach in creating the SKOS
    result
  • added on extra priority-related elements

44
Details of Conversion (contd)
From previous examples
<skos:prefLabel>avian_influenza</skos:prefLabel>
First Pass of conversion
<skos:Concept rdf:about="cipkmconceptsavianinfluenza">
  <rdfs:label>avian influenza</rdfs:label>
  <skos:definition>Avian influenza is flu from viruses adapted to birds, and is noninfectious for most species.</skos:definition>
  <skos:prefLabel rdf:nodeID="avian_influenza"/>
  <skos:altLabel rdf:nodeID="avian_influenza-avian_flu"/>
  <skos:altLabel rdf:nodeID="avian_influenza-bird_flu"/>
  <skos:altLabel rdf:nodeID="avian_influenza-ai"/>
  <skos:altLabel rdf:nodeID="avian_influenza-h5n1"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a/h5n1"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a(h5n1)"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a(h5)"/>
  <skos:altLabel rdf:nodeID="avian_influenza-a/h5"/>
  <skos:broader rdf:resource="cipkmconceptspandemicinfluenza"/>
  <cipkm:frequency>0</cipkm:frequency>
  <cipkm:hasOntologyCategory rdf:resource="hazardAvian_Flu"/>
</skos:Concept>
45
Details of Conversion (contd)
Second Pass
<rdf:Description rdf:nodeID="avian_influenza">
  <rdfs:label>AVIAN INFLUENZA</rdfs:label>
  <cipkm:priority>0</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-avian_flu">
  <rdfs:label>AVIAN FLU</rdfs:label>
  <cipkm:priority>1</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-bird_flu">
  <rdfs:label>BIRD FLU</rdfs:label>
  <cipkm:priority>2</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-ai">
  <rdfs:label>AI</rdfs:label>
  <cipkm:priority>3</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-h5n1">
  <rdfs:label>H5N1</rdfs:label>
  <cipkm:priority>4</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a(h5n1)">
  <rdfs:label>A(H5N1)</rdfs:label>
  <cipkm:priority>5</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a/h5n1">
  <rdfs:label>A/H5N1</rdfs:label>
  <cipkm:priority>6</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a(h5)">
  <rdfs:label>A(H5)</rdfs:label>
  <cipkm:priority>7</cipkm:priority>
</rdf:Description>
<rdf:Description rdf:nodeID="avian_influenza-a/h5">
  <rdfs:label>A/H5</rdfs:label>
  <cipkm:priority>8</cipkm:priority>
</rdf:Description>
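The second-pass records follow a visible pattern: a node ID built from the preferred label plus the synonym, an upper-cased label, and an increasing priority. The Python sketch below reproduces that pattern as plain dictionaries. It assumes priority is simply the position in the list, which matches the numbers shown but is not stated in the talk as the actual rule; `second_pass` is a hypothetical helper name.

```python
def second_pass(pref_label, alt_labels):
    """Build one record per label: node ID, upper-cased label, priority."""
    pref_id = pref_label.replace(" ", "_")
    # The preferred label always gets priority 0.
    nodes = [{"nodeID": pref_id, "label": pref_label.upper(), "priority": 0}]
    # Assumption: alternate labels are prioritized in listing order.
    for priority, alt in enumerate(alt_labels, start=1):
        nodes.append({
            "nodeID": f"{pref_id}-{alt.replace(' ', '_')}",
            "label": alt.upper(),
            "priority": priority,
        })
    return nodes


nodes = second_pass("avian influenza", ["avian flu", "bird flu", "ai"])
```

Keeping the priority on a separate node, rather than as an attribute on the SKOS label itself, is what lets the first-pass output stay within the SKOS schema.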
47
Synonym Expansion Process Overview
Source of synonyms
Enter keyword in search box
one time conversion
Synonyms in SKOS
Use SPARQL to expand keyword into all synonyms
Create a request for the most important synonyms
Documents in Database
Retrieve documents containing these synonyms
48
SPARQL
  • Query Language for RDF
  • W3C Working Draft
  • SQL for the Semantic Web
  • SQL: SELECT * FROM tablename
  • SPARQL: SELECT ?x ?y ?z WHERE { ?x ?y ?z }
  • Reference implementation is currently ARQ, which
    is bundled with Jena
  • a Java API for building Semantic Web applications
  • provides programmatic access to RDF, RDFS, OWL
  • includes SPARQL and a rule-based inference engine
  • Used to return the synonym expansion of keywords
    represented in SKOS

49
SPARQL Implementation Issues
  • Stored SKOS and ontologies in Oracle using Jena
  • As with SQL queries, performance is best when the
    most specific items are searched for before less
    specific ones
  • When reloading the database an index is lost
  • Jena's default index is not optimized for our
    SPARQL query
  • Must manually optimize the database table
    whenever the ontology model is reloaded
  • Output is a union of three queries, covering
    three cases
  • when the search keyword matches a preferred
    label: all alternate labels are returned
  • when the search keyword matches an alternate
    label: the preferred label is returned
  • when the search keyword matches an alternate
    label: all other alternate labels of the same
    concept are returned
  • Other fields are returned: this allows the
    calling program to sort the synonyms in
    priority order

50
SPARQL Query Example
SELECT ?synonym ?frequency ?priority ?prefLabel ?altLabel
WHERE {
  { ?y rdfs:label searchString .
    ?x skos:prefLabel ?y .
    ?x skos:altLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?prefLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
  UNION
  { ?y rdfs:label searchString .
    ?x skos:altLabel ?y .
    ?x skos:prefLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?altLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
  UNION
  { ?y rdfs:label searchString .
    ?x skos:altLabel ?y .
    ?x skos:altLabel ?a .
    ?a rdfs:label ?synonym .
    ?a rdfs:label ?altLabel .
    ?a cipkm:priority ?priority .
    ?x cipkm:frequency ?frequency }
}
51
SPARQL Query Example
Annotations on the three branches of the query:
Find synonyms in which searchString is prefLabel
→ return all alternate labels
Find synonyms in which searchString is altLabel
→ return its prefLabel
Find synonyms in which searchString is altLabel
→ return all its altLabels
Return all synonyms in priority order (based on
prefLabel or altLabel branching)
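The three cases can be walked through with a small in-memory stand-in for the SKOS store. The Python below is only a sketch of the logic, not the SPARQL/Jena implementation. The data layout, the `expand` function, and the rule that priority equals list position are assumptions made for the example, and following the wording "all other alternate labels" the matched keyword itself is left out of the result.

```python
CONCEPTS = [
    {
        "pref": "avian influenza",
        "alts": ["avian flu", "bird flu", "h5n1"],
        "frequency": 0,
    },
]


def expand(keyword, concepts):
    """Return (synonym, priority, frequency) tuples in priority order."""
    keyword = keyword.lower()
    results = []
    for c in concepts:
        # Assumed priorities: preferred label 0, alternates 1, 2, ...
        priority = {c["pref"]: 0}
        priority.update({alt: i for i, alt in enumerate(c["alts"], start=1)})
        if keyword == c["pref"]:
            # Case 1: keyword is the preferred label.
            found = list(c["alts"])
        elif keyword in c["alts"]:
            # Cases 2 and 3: keyword is an alternate label.
            found = [c["pref"]] + [a for a in c["alts"] if a != keyword]
        else:
            continue
        results.extend((syn, priority[syn], c["frequency"]) for syn in found)
    return sorted(results, key=lambda r: r[1])


from_alt = expand("bird flu", CONCEPTS)
from_pref = expand("avian influenza", CONCEPTS)
```

Returning the priority and frequency alongside each synonym is what lets the caller keep only the best few before building the document query.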
53
Synonym Expansion Process Overview
Source of synonyms
Enter keyword in search box
one time conversion
Synonyms in SKOS
Use SPARQL to expand keyword into all synonyms
Create SQL statement for highest priority synonyms
Documents in Database
Retrieve documents containing these synonyms
55
Synonym Expansion Process Overview
Source of synonyms
Enter keyword in search box
one time conversion
Synonyms in SKOS
Use SPARQL to expand keyword into all synonyms
Create SQL statement for top synonyms
Documents in Oracle
Pass SQL statement to Oracle to retrieve
documents
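The last step, turning the highest-priority synonyms into one SQL statement, might look roughly like the Python sketch below. The table name `documents`, the columns `doc_id` and `doc_text`, the `LIKE` matching, and the default limit are all hypothetical; the talk does not show the real schema or statement. Named bind parameters (`:s0`, `:s1`, ...) are used so that synonym text is never pasted into the SQL string.

```python
def build_query(synonyms, limit=3):
    """Build a parameterized SQL statement for the top-priority synonyms.

    synonyms: iterable of (text, priority) pairs; lower priority is better.
    Returns (sql, params), where params maps bind names to LIKE patterns.
    """
    ranked = sorted(synonyms, key=lambda pair: pair[1])
    top = [text for text, _ in ranked[:limit]]
    if not top:
        raise ValueError("at least one synonym is required")
    clauses = " OR ".join(
        f"LOWER(doc_text) LIKE :s{i}" for i in range(len(top))
    )
    sql = f"SELECT doc_id FROM documents WHERE {clauses}"
    params = {f"s{i}": f"%{text.lower()}%" for i, text in enumerate(top)}
    return sql, params


sql, params = build_query(
    [("bird flu", 2), ("avian influenza", 0), ("avian flu", 1), ("h5n1", 4)],
    limit=2,
)
```

Capping the number of synonyms keeps the statement short, which is the performance concern raised under "Details of Conversion".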
56
Lessons Learned (Good & Bad)
  • WordNet
  • original RDF version (obtained from W3C) was
    incomplete
  • Mark van Assem corrected it (Prolog program)
  • we posted the SKOS version, along with the XSLT
    stylesheets used to create it, back to the
    community
  • Posting to Semantic Technology Research Community
  • contains many undesirable (unprofessional) words
    that we filtered out
  • SKOS & SPARQL
  • we had to escape certain characters
  • in SKOS: & < >
  • in SPARQL query: '
  • Jena support group was very helpful
  • modified Jena to add extra log4j debug statements
    for us

R & D → R &amp; D
President's Directive → President\\'s Directive
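Both kinds of escaping can be shown with a short Python sketch. `xml.sax.saxutils.escape` is a standard-library function that replaces `&`, `<` and `>`; the SPARQL helper is a hypothetical function written for this example that backslash-escapes backslashes and single quotes for use inside a single-quoted literal. The doubled backslash in the slide's example presumably comes from writing the escape inside a Java string literal; that reading is an assumption.

```python
from xml.sax.saxutils import escape


def escape_for_xml(text):
    """Escape &, < and > so the text is safe as XML element content."""
    return escape(text)


def escape_for_sparql(text):
    """Escape backslashes and single quotes for a '...' SPARQL literal."""
    # Backslashes first, so the ones added for quotes are not doubled.
    return text.replace("\\", "\\\\").replace("'", "\\'")


xml_safe = escape_for_xml("R & D")
sparql_safe = escape_for_sparql("President's Directive")
```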
57
Tools and Support
  • Jena and SPARQL (API for Programmatic Access &
    Query)
  • Yahoo Groups support site
  • Free membership in Yahoo Groups required to post
  • Timely support, RSS feeds, user conference
  • SKOS (Lightweight Taxonomy Framework)
  • Mailing list
  • Wiki
  • Posting of SKOS version of WordNet to Semantic
    Technology Research Community
  • THManager
  • XML Processors
  • XSLT / XPath
  • XSLT Processors
  • Saxon
  • Commercial and open-source versions available
  • Needed an XSLT 2.0 processor
  • Michael Kay's Blog
  • XMLSpy
  • Has a nice XSLT debugger
  • Fixed stack size for XSLT recursion

58
Links for previous slide (Tools and Support)
  • Jena (http://jena.sourceforge.net/)
  • SPARQL (http://www.w3.org/TR/rdf-sparql-query/)
  • Jena Yahoo Groups support site (http://groups.yahoo.com/group/jena-dev/)
  • Jena User Conference (http://jena.hpl.hp.com/juc2006/index.html)
  • SKOS (http://www.w3.org/2004/02/skos/)
  • Mailing list (http://www.w3.org/2004/02/skos/mail)
  • Wiki (http://esw.w3.org/topic/SkosDev)
  • Posting of SKOS version of WordNet to Semantic
    Technology Research Community (http://esw.w3.org/topic/SkosDev/DataZone)
  • THManager (http://thmanager.sourceforge.net/)
  • XSLT (http://www.w3.org/TR/xslt)
  • XPath (http://www.w3.org/TR/xpath)
  • XSLT Processors
  • Saxon (http://saxon.sourceforge.net/)
  • Michael Kay's Blog (http://saxonica.blogharbor.com/blog)
  • XMLSpy (http://www.altova.com/products/xmlspy/xml_editor.html)
  • Xalan (http://xalan.apache.org/)
  • MSXML (http://msdn2.microsoft.com/en-us/xml/default.aspx)

59
Introductory Resources
  • Introductory Overviews to the Semantic Web
  • The 2006 IEEE Intelligent Systems article "The
    Semantic Web Revisited" by Nigel Shadbolt,
    Wendy Hall, and Tim Berners-Lee
  • A five-year reassessment and follow-up to the
    famous 2001 Scientific American article "The
    Semantic Web" which inaugurated the field of the
    Semantic Web
  • A Semantic Web Primer by Grigoris Antoniou and
    Frank van Harmelen
  • Specific Technical Sources
  • SKOS
  • W3C Tutorial
  • SKOS Introduction by Peter Mikhalenko on
    www.xml.com
  • XSLT/XPath
  • Beginning XSLT by Jeni Tennison
  • XSLT 2.0 and XPath 2.0 by Michael Kay
  • Other
  • Saxon website

60
Links for previous slide (Introductory Resources)
  • Introductory Overviews to the Semantic Web
  • The Semantic Web Revisited (http://eprints.ecs.soton.ac.uk/12614/01/Semantic_Web_Revisted.pdf)
  • The Semantic Web (http://www.sciam.com/print_version.cfm?articleID=00048144-10D2-1C70-84A9809EC588EF21)
  • A Semantic Web Primer (http://mitpress.mit.edu/catalog/item/default.asp?ttype=2&tid=10140)
  • Specific Technical Sources
  • SKOS W3C Tutorial (http://www.w3.org/2004/02/skos)
  • SKOS Introduction (http://www.xml.com/pub/a/2005/06/22/skos.html)
  • Beginning XSLT (ISBN-10: 1861005946, ISBN-13:
    978-1861005946)
  • XSLT 2.0 (ISBN-10: 0764569090, ISBN-13:
    978-0764569098)
  • XPath 2.0 (ISBN-10: 0764569104, ISBN-13:
    978-0764569104)
  • Saxon (http://saxon.sourceforge.net/)

61
Acknowledgements
  • The National Infrastructure Simulation and
    Analysis Center (NISAC) is a program under the
    Department of Homeland Security's (DHS)
    Preparedness Directorate. Sandia National
    Laboratories (SNL) and Los Alamos National
    Laboratory (LANL) are the prime contractors for
    NISAC under the programmatic direction of DHS's
    Infrastructure Protection/Risk Management
    Division.
  • Sandia is a multiprogram laboratory operated by
    Sandia Corporation, a Lockheed Martin Company, for
    the United States Department of Energy's National
    Nuclear Security Administration under contract
    DE-AC04-94AL85000.

62
Question & Answer