Semantic Web Technologies: A Tutorial

1
Semantic Web Technologies: A Tutorial
  • Li Ding
  • University of Maryland Baltimore County
  • Joint work with Deborah McGuinness, Tim Finin and
    Anupam Joshi
  • Presented at Kodak Research Laboratories,
    Rochester, New York
  • 18 July 2006

2
The Web has made people smarter
[Figure labels: WWW, surfing, craigslist, search, bag-of-words, tagging, del.icio.us]
3
But what about machines?
  • Machines still have a very minimal understanding
    of text and images.

4
Motivation: machine-friendly data
  • Natural language
  • XML: represents structure
  • Semantic Web: represents more semantics
    • represents structure
    • enables a common vocabulary
    • associates symbols with a logical interpretation for
      inference

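Natural language, as seen by a person: Li Ding is a person
Natural language, as seen by a machine: [the same sentence rendered as opaque symbols]
XML, as seen by a person: <person>Li Ding</person>
XML, as seen by a machine: [the same markup with the tag name rendered as opaque symbols]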
5
Semantic Web Technologies
6
Semantic Web Layers
[Figure: the Semantic Web layer stack, with labels "Semantic Aspect", "Web Aspect" and "HTTP"]
"The Semantic Web is an extension of the current
web in which information is given well-defined
meaning, better enabling computers and people to
work in cooperation." Berners-Lee, Hendler and
Lassila, Scientific American, 2001
Image source: http://en.wikipedia.org/wiki/Image:W3c_semantic_web_stack.jpg
7
The Semantic Web is simple
  • Each URI denotes a concept
  • URIs are connected by triples
  • Machines read the data as a directed RDF graph

Don't say "colour", say <http://example.com/2002/std6#col>
RDF (Resource Description Framework)
Relational database
Source: Tim Berners-Lee, Putting the Web back
into Semantic Web, ISWC 2005 keynote
8
Example: RDF graph and syntax
  • RDF graph
  • URI, literal, BNode (blank node)
  • Triple

[Figure: an RDF graph with two triples, t1 and t2, leaving one blank node:
http://xmlns.com/foaf/0.1/name -> "Li Ding"
http://www.w3.org/1999/02/22-rdf-syntax-ns#type -> http://xmlns.com/foaf/0.1/Person]
The entire graph means: there exists a person
whose name is Li Ding.
<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:foaf="http://xmlns.com/foaf/0.1/"
         xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <foaf:Person>
    <foaf:name>Li Ding</foaf:name>
  </foaf:Person>
</rdf:RDF>
  • XML
  • unicode
  • Namespace
  • URI as tag

Data encoded in RDF/XML syntax
Alternative RDF syntaxes: N3 (Notation 3),
N-Triples, Turtle
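As a rough illustration of the graph on this slide, the Python sketch below stores the two triples as tuples and prints them in N-Triples, one of the alternative syntaxes. The class names, the helper functions and the blank-node label b0 are assumptions made for this example; they do not come from the slides or from any RDF library.

```python
# Namespace URIs as used on the slide.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FOAF = "http://xmlns.com/foaf/0.1/"


class URI(str):
    """A resource identified by a URI."""


class BNode(str):
    """A blank node: a resource that only has a local label."""


class Literal(str):
    """A plain string value."""


def to_ntriples_term(term):
    """Format one term in N-Triples syntax."""
    if isinstance(term, URI):
        return f"<{term}>"
    if isinstance(term, BNode):
        return f"_:{term}"
    # Anything else is treated as a literal; escape backslashes and quotes.
    escaped = term.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(graph):
    """Serialise a list of (subject, predicate, object) triples."""
    return "\n".join(
        " ".join(to_ntriples_term(t) for t in triple) + " ."
        for triple in graph
    )


person = BNode("b0")  # hypothetical label for the unnamed person
graph = [
    (person, URI(RDF + "type"), URI(FOAF + "Person")),
    (person, URI(FOAF + "name"), Literal("Li Ding")),
]
print(to_ntriples(graph))
```

Each printed line is one triple, which makes the "directed graph of triples" reading of RDF easier to see than the nested RDF/XML form.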
9
Example: surfing RDF graphs
[Figure: three linked RDF graphs.
G1: http://cs.umbc.edu/dingli1/foaf.rdf
G2: http://cs.umbc.edu/finin/foaf.rdf
G3: http://xmlns.com/foaf/1.0/
Labels in G1: http://cs.umbc.edu/dingli1/foaf.rdf#dingli, foaf:name "Li Ding", rdf:type foaf:Person, foaf:knows, foaf:mbox mailto:finin@umbc.edu, rdfs:seeAlso http://cs.umbc.edu/finin/foaf.rdf
Labels in G2: foaf:mbox mailto:finin@umbc.edu, rdf:type, foaf:firstName "Tim", foaf:surname "Finin"
Labels in G3: foaf:Person, wordNet:Agent, rdfs:subClassOf, rdf:type rdfs:Class, foaf:mbox, rdfs:domain, rdf:type rdf:Property
"Surf to definition" leads from G1 to G3; "Surf to another instance" leads from G1 to G2.]
Prefixes: rdf = http://www.w3.org/1999/02/22-rdf-syntax-ns#, rdfs = http://www.w3.org/2000/01/rdf-schema#, foaf = http://xmlns.com/foaf/1.0/
10
Example: serving humans and machines
The original RDF/XML is for machines
The HTML is generated by applying XSLT to the RDF/XML
11
Ontology Spectrum
[Figure: the ontology spectrum, from simple taxonomies to expressive ontologies:
Catalog/ID; Terms/glossary; Thesauri ("narrower term" relation); Informal is-a; Formal is-a; Formal instance; Frames (properties); Value restriction; Disjointness, inverse, part-of; General logical constraints.
Systems placed along the spectrum: DB Schema, UMLS, WordNet, OO, RDF, RDFS, DAML, OWL, CYC, IEEE SUO. A "space of interest" region is marked.]
Source: originally by Deborah L. McGuinness
(KSL, Stanford), modified by Tim Finin
12
Ontology languages: RDFS and OWL
  • RDFS
    • Set theory: rdfs:Class
    • Relation: rdf:Property, rdfs:domain, rdfs:range
    • Hierarchy: rdfs:subClassOf, rdfs:subPropertyOf
    • Built-in datatypes: xsd:string, xsd:dateTime
  • OWL
    • Description logic
    • Class, Thing, Nothing
    • DatatypeProperty, ObjectProperty, AnnotationProperty
    • Class axioms: oneOf, disjointWith, unionOf, complementOf,
      intersectionOf; Restriction, onProperty, cardinality, hasValue
    • Property axioms: inverseOf, TransitiveProperty,
      SymmetricProperty, FunctionalProperty, InverseFunctionalProperty
    • Equality: equivalentClass, sameAs, differentFrom
    • Ontology annotation: Ontology, imports, versionInfo

13
Example: inference using ontologies
  • Ontology languages (RDFS, OWL) have formal
    foundations that allow us to infer additional
    (implicit) statements
    • RDFS provides basic ones, e.g. sub-class,
      sub-property, domain
    • OWL adds many more axioms, e.g. inverse-property,
      equality
  • SWRL (Semantic Web Rule Language) enables a
    general-purpose solution
    • Supports rule representation
    • But also requires inference support beyond RDFS
      and OWL

hasBrother rdfs:subPropertyOf hasSibling
hasChild owl:inverseOf hasParent
SWRL: (x hasParent y) and (y hasBrother z) => (x hasUncle z)
[Figure: Deborah, Louise and Joe connected by hasParent, hasChild, hasBrother, hasSibling and hasUncle edges]
Source: Semantic Web tutorial (AAAI 2005) by
Deborah L. McGuinness
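The three kinds of inference on this slide can be sketched as a small forward-chaining loop in Python. This is only an illustration: the rules are hard-coded rather than read from an ontology, and the two input facts are assumptions, since the figure's exact edges do not survive in this transcript.

```python
def infer(facts):
    """Apply the slide's three rules until no new triple appears."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            # RDFS: hasBrother rdfs:subPropertyOf hasSibling
            if p == "hasBrother":
                new.add((s, "hasSibling", o))
            # OWL: hasChild owl:inverseOf hasParent (works in both directions)
            if p == "hasChild":
                new.add((o, "hasParent", s))
            if p == "hasParent":
                new.add((o, "hasChild", s))
                # SWRL: (x hasParent y) and (y hasBrother z) => (x hasUncle z)
                for s2, p2, o2 in facts:
                    if s2 == o and p2 == "hasBrother":
                        new.add((s, "hasUncle", o2))
        if new <= facts:      # fixpoint reached
            return facts
        facts |= new


# Assumed input facts, chosen to exercise all three rules.
asserted = {
    ("Louise", "hasChild", "Deborah"),
    ("Louise", "hasBrother", "Joe"),
}
derived = infer(asserted) - asserted
for triple in sorted(derived):
    print(triple)
```

The uncle statement only appears in the second pass, after the inverse rule has produced the hasParent triple; that dependency is why the loop runs to a fixpoint instead of making a single pass.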
14
More languages and more ontologies
  • Languages (require special inference engines)
    • Trust/uncertainty: BayesOWL
    • Proof: PML (Proof Markup Language)
    • Query/data access: SPARQL Query Language for RDF
    • Rule: SWRL (Semantic Web Rule Language)
    • Policy: Rei, A Policy Specification Language
    • Service: OWL-S by DAML (1.2 preview available)
    • Service: SAWSDL (Semantic Annotations for WSDL)
    • Thesauri: SKOS (Simple Knowledge Organization
      System)
  • Ontologies (only need RDFS and/or OWL inference)
    • Upper ontologies: OpenCyc, WordNet, OntoSem,
      SUO
    • Specialized common ontologies: FOAF, Dublin
      Core, RSS
    • Domain ontologies: bibtex, biology, and many

Li Ding, Pranam Kolari, Zhongli Ding, and
Sasikanth Avancha, Using Ontologies in the
Semantic Web: A Survey, in Ontologies in the
Context of Information Systems (book chapter),
2005. http://ebiquity.umbc.edu/paper/html/id/257/

15
Semantic Web Tools
[Figure: tools arranged around the tasks of managing ontologies: create, publish, browse, update, extend, integrate, inference, instance]
  • Editor: Protégé, Swoop
  • Reasoner: Pellet (DL), Racer (DL), FACT (DL), Jena, JTP, F-OWL,
    Euler, CWM
  • Online registry: DAML Ontology Library, Schema Web
  • Triple store: Jena (SPARQL), KAON, Kowari, Sesame, OWLIM, 3store,
    Instance store, Redland, Tap, RDF store, Yars, IBM IODT, RDFLib,
    RDF gateway, allegro, Oracle 10
  • Search engine: Swoogle, Semantic Web Search
  • Browser: Tabulator, IsaViz, Piggybank, Arago, Horus, Mspace, Magpie
  • Mapping tools: ONION, PROMPT, OntoMapper, Glue, OntoMerge, Ontomorph

Source 1: http://ebiquity.umbc.edu/paper/html/id/257/Using-Ontologies-in-the-Semantic-Web-A-Survey
Source 2: http://www.wiwiss.fu-berlin.de/suhl/bizer/toolkits/
16
Semantic Web data
17
Semantic Web data sources
  • Text editor: write RDF/XML manually
  • Semantic Web editors: Protégé, Swoop
  • Information extraction (consumer side)
    • NLP (hard), e.g. SemNews
    • heuristic scraping (regular expressions), e.g. Semagix
      Freedom
  • Wrapped database content (publisher side)
    • blogs and social network websites, e.g.
      livejournal.com
    • academic interests: http://www.mindswap.org/,
      http://ebiquity.umbc.edu
  • Generated by software
    • Creative Commons license embedded in HTML
    • embedded metadata: JPEG, PDF (XMP)
    • agent communication messages

18
The Scale of the Semantic Web
Statistics based on Semantic Web data indexed by
Swoogle
Year  Terms (million)  Documents (million)  Individuals (million)  Triples (million)  Bytes (billion)
2004  0.15             0.33                 7.3                    48                 4.3
2006  1.9              1.6                  16                     276                47
2008  10               100                  1000                   20,000             3000
Estimated number of documents based on Google
queries
Estimate      Docs  Corresponding Google query
Optimistic    10^9  rdf OR inurl:rss OR inurl:foaf -filetype:html
Conservative  10^5  rdf filetype:rdf
19
Where is the data from?
  • com has contributed the largest portion of
    websites (71%) and pure SWDs (39%) because
    industry has adopted virtual hosting technology
    as well as ontologies such as RSS and FOAF
  • most SWOs are from org (46%, e.g. www.w3.org)
    and edu (14%, e.g. spire.umbc.edu) because of
    the deep interest in developing ontologies in
    academia and non-profit organizations.

SWD: Semantic Web document. SWO: Semantic Web
ontology. Pure SWD: not embedded.
Note: statistics on top-level domains are also used
in characterizing the Web (Henzinger and Lawrence
2004)
20
Source websites of SWD
[Plots: two crawl periods, Jan 2005 to Aug 2005 and Jan 2005 to Mar 2006]
  • Invariant found!
  • The number of websites hosting more than m SWDs
    follows a power-law distribution
  • Similar to the Web
  • Head: virtual hosting
  • Tail: crawling strategy
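One common way to check a claim like this is to plot "number of websites hosting more than m SWDs" against m on log-log axes and look for a straight line. The Python sketch below does the counting and fits the slope by least squares. The per-site counts are synthetic and purely illustrative (they are not Swoogle's data), and the function names are assumptions for this example.

```python
import math


def sites_hosting_more_than(counts, thresholds):
    """For each m, count the websites that host more than m documents."""
    return [(m, sum(1 for c in counts if c > m)) for m in thresholds]


def loglog_slope(points):
    """Least-squares slope of log(y) against log(x), skipping empty bins."""
    usable = [(math.log(x), math.log(y)) for x, y in points if x > 0 and y > 0]
    n = len(usable)
    mean_x = sum(x for x, _ in usable) / n
    mean_y = sum(y for _, y in usable) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in usable)
    var = sum((x - mean_x) ** 2 for x, _ in usable)
    return cov / var


# Synthetic data for illustration only: site k hosts about 1000/k documents.
counts = [1000 // k for k in range(1, 1001)]
points = sites_hosting_more_than(counts, [1, 2, 4, 8, 16, 32, 64, 128])
print(points)
print(round(loglog_slope(points), 2))
```

A slope that stays roughly constant across the range of m is what "follows a power-law distribution" means in practice; a least-squares fit on log-log data is a quick diagnostic rather than a rigorous estimator.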

21
Size of SWD
  • Embedded SWDs are small
    • 69% have 3 triples
    • 96% have <10 triples
  • Pure SWDs
    • 60% have 5 to 1000 triples
    • Special size of RSS: 130
      • 17 triples for channel
      • 7 triples for each of the 15 items
  • SWOs
    • Biased by PML
    • Small ones from RDF test
    • Largest is 1M

[Plot axes: number of SWDs and number of SWOs against # of triples]
22
Age of SWD
  • Measured by the last-modified time of SWD
  • PSWD (pure SWD): exponential distribution
  • SWO: flat tail. Is interest in ontology development
    decreasing?

23
How are Semantic Web terms used?
  • All usage distributions follow a power-law distribution
  • Few SWTs (Semantic Web terms) are well populated
    • 371 have >100 class instances
    • 1208 have >100 property instances

24
Swoogle Rank (citation based)
[Figure: citation graph of ten highly ranked namespaces, with rank labels 1 to 10 on the nodes and flow weights between 0.03 and 0.51 on the edges. Node annotations, paired in order of appearance:]
Namespace                                     indegree   mean(inflow)
http://www.w3.org/2000/01/rdf-schema          432,984    0.039
http://www.w3.org/1999/02/22-rdf-syntax-ns    1,077,768  0.100
http://purl.org/rss/1.0                       270,178    0.168
http://www.w3.org/2002/07/owl                 86,959     0.069
http://web.resource.org/cc                    57,066     0.195
http://www.w3.org/2001/vcard-rdf/3.0          155,949    0.036
http://purl.org/dc/elements/1.1               861,416    0.096
http://www.hackcraft.net/bookrdf/vocab/0_1/   16,380     0.167
http://purl.org/dc/terms                      54,909     0.042
http://xmlns.com/foaf/0.1/index.rdf           512,790    0.217
Computed using Swoogle metadata as of May 2006
25
Semantic Web Applications
26
TAGA: Travel Agent Game in Agentcities
  • Technologies
    • FIPA (JADE, April Agent Platform)
    • Semantic Web (RDF, OWL)
    • Web (SOAP, WSDL, DAML-S)
    • Internet (Java Web Start)
  • Features
    • Open market framework
    • Auction services
    • OWL message content
    • OWL ontologies
    • Global agent community
  • Motivation
    • Market dynamics
    • Auction theory (TAC)
    • Semantic Web
    • Agent collaboration (FIPA Agentcities)
  • Ontologies: http://taga.umbc.edu/ontologies/
    • travel.owl: travel concepts
    • fipaowl.owl: FIPA content language
    • auction.owl: auction services
    • tagaql.owl: query language

OWL for representation and reasoning
OWL for protocol description
OWL as a content language
OWL for service descriptions
FIPA platform infrastructure services, including
directory facilitators enhanced to use OWL-S for
service discovery
http://taga.umbc.edu (offline now)
27
Semantic Content Publishing
http://ebiquity.umbc.edu/person/html/Li/Ding/
  • data stored in database
  • PHP generates both HTML and OWL
  • HTML pages link to corresponding OWL
  • no more web scraping
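The idea of generating both views from one record can be sketched as follows. The site itself uses PHP; this sketch uses Python for illustration, the record is a hypothetical stand-in for a database row, and the function names and the example.org URL are assumptions. The `<link rel="alternate">` element is one conventional way for an HTML page to point at its RDF counterpart.

```python
from html import escape
from xml.sax.saxutils import escape as xml_escape


def render_html(record, rdf_url):
    """Human-readable page that links to its machine-readable twin."""
    name = escape(record["name"])
    return (
        "<html><head>"
        f"<title>{name}</title>"
        f'<link rel="alternate" type="application/rdf+xml" href="{escape(rdf_url)}"/>'
        "</head>"
        f"<body><h1>{name}</h1></body></html>"
    )


def render_rdf(record):
    """Machine-readable FOAF description of the same record, as RDF/XML."""
    return (
        '<?xml version="1.0" encoding="utf-8"?>\n'
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"\n'
        '         xmlns:foaf="http://xmlns.com/foaf/0.1/">\n'
        "  <foaf:Person>\n"
        f"    <foaf:name>{xml_escape(record['name'])}</foaf:name>\n"
        "  </foaf:Person>\n"
        "</rdf:RDF>\n"
    )


# Hypothetical row, as it might come back from a database query.
row = {"name": "Li Ding"}
print(render_html(row, "http://example.org/person/foaf.rdf"))
print(render_rdf(row))
```

Because both outputs are rendered from the same row, a consumer that follows the link gets the data directly and never has to scrape the HTML.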

http://ebiquity.umbc.edu/person/foaf/Li/Ding/foaf.rdf
[Figure: a MySQL database feeds two PHP scripts, one producing the HTML page and one producing the FOAF document]
http://ebiquity.umbc.edu/ (ebiquity group
website)
28
Rei Policy Language
  • Rei is a declarative policy language for
    describing policies over actions
  • Reasons over domain-dependent information
  • Currently represented in OWL + logical variables
  • Based on deontic concepts
    • Permission, Prohibition, Obligation, Dispensation
  • Models speech acts
    • Delegation, Revocation, Request, Cancel
  • Meta policies
    • Priority, modality preference
  • Policy engineering tools
    • Reasoner, IDE for Rei policies in Eclipse

http://rei.umbc.edu/
29
Example: enforcing a privacy policy
  • The speaker doesn't want others to know the
    specific room that he's in, but is willing for
    others to know he's on campus
  • He defines the following privacy policy
    • Share my location with a granularity >= State
  • The broker
    • isLocated(US) => Yes!
    • isLocated(Maryland) => Yes!
    • isLocated(UMBC) => Uncertain..
    • isLocated(ITE-RM210) => Uncertain..
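The broker's four answers above can be reproduced with a small granularity check. This Python sketch is not Rei or CoBrA code: the level names, the containment table and the "No" answer for a permitted but wrong place are all assumptions made for illustration.

```python
# Assumed granularity levels, coarsest first.
LEVELS = ["Country", "State", "Campus", "Room"]

# Assumed level of each place and its containing region.
PLACE_LEVEL = {"US": "Country", "Maryland": "State", "UMBC": "Campus", "ITE-RM210": "Room"}
CONTAINED_IN = {"ITE-RM210": "UMBC", "UMBC": "Maryland", "Maryland": "US"}


def regions_of(place):
    """Return the place itself plus every region that contains it."""
    regions = []
    while place is not None:
        regions.append(place)
        place = CONTAINED_IN.get(place)
    return regions


def is_located(actual, asked, finest_shared="State"):
    """Answer a location query without revealing more than the policy allows."""
    limit = LEVELS.index(finest_shared)
    if LEVELS.index(PLACE_LEVEL[asked]) > limit:
        return "Uncertain"  # the question is finer-grained than the policy permits
    return "Yes" if asked in regions_of(actual) else "No"


# The speaker is actually in room ITE-RM210.
for place in ["US", "Maryland", "UMBC", "ITE-RM210"]:
    print(place, is_located("ITE-RM210", place))
```

The broker knows the exact room, but the policy check runs before the lookup, so questions below state level get the same "Uncertain" answer whether or not they are true.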

30
CoBrA: Context Broker Architecture
  • Ontology
  • Agents
  • Service
  • Inference
  • Policy

http://cobra.umbc.edu/
31
Web-scale semantic web data access
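[Figure: an agent, a data access service and the Web. The service indexes RDF data from the Web.]
  • Search vocabulary: the agent asks (person); the service searches
    URIrefs in the SW vocabulary and informs (foaf:Person)
  • Compose query: the agent asks (?x rdf:type foaf:Person); the service
    searches URLs in the SWD index and informs (doc URLs)
  • Populate RDF database: the agent fetches the docs
  • Query local RDF database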
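The interaction above can be walked through with in-memory stand-ins. In this Python sketch the two indexes and the "Web" are hypothetical dictionaries with made-up example.org documents; a real agent would call a search service and fetch documents over HTTP instead.

```python
RDF_TYPE = "rdf:type"

# Hypothetical stand-ins for the service's indexes and for the Web.
VOCABULARY_INDEX = {"person": ["foaf:Person"]}
DOCUMENT_INDEX = {
    "foaf:Person": ["http://example.org/a.rdf", "http://example.org/b.rdf"],
}
WEB = {
    "http://example.org/a.rdf": [("ex:alice", RDF_TYPE, "foaf:Person")],
    "http://example.org/b.rdf": [
        ("ex:bob", RDF_TYPE, "foaf:Person"),
        ("ex:bob", "foaf:name", "Bob"),
    ],
}


def find_instances(keyword):
    """Follow the four steps: vocabulary, documents, fetch, local query."""
    # 1. Search vocabulary: keyword -> URIrefs of matching terms.
    terms = VOCABULARY_INDEX.get(keyword, [])
    local_store = set()
    for term in terms:
        # 2. Search the document index for documents that use the term.
        for url in DOCUMENT_INDEX.get(term, []):
            # 3. Fetch each document and populate the local RDF database.
            local_store.update(WEB[url])
    # 4. Query the local database: ?x rdf:type <term>.
    return sorted(s for s, p, o in local_store if p == RDF_TYPE and o in terms)


print(find_instances("person"))
```

The search service only returns pointers (terms and document URLs); the triples themselves are fetched from their sources and queried locally.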
32
Swoogle: Semantic Web Search Engine
  • Harvests Semantic Web data from the Web
  • Provides search/navigation services for machines
    (via REST, RDF/XML)
    • Digest: doc, term, namespace
    • Links
  • Also serves human users
  • Status
    • Running since summer 2004
    • 1.6M RDF documents, 300M RDF triples, 10K
      ontologies

http://swoogle.umbc.edu/
33
Ontology Dictionary
  • From web of documents to web of data
  • Aggregate from multiple sources
  • Inductively learned definition

[Figure: term definitions aggregated from two ontologies (Onto 1, Onto 2) and one data document (SWD3).
Labels in Onto 1 and Onto 2: foaf:Person, foaf:Agent, owl:Class, foaf:name, rdf:type, rdfs:subClassOf, rdfs:domain.
Labels in SWD3: an instance with rdf:type foaf:Person, foaf:name "Tim Finin", dc:title "Dr.".
Labels in the aggregated definition: wob:hasInstanceDomain, foaf:Person, foaf:Agent, dc:title, rdfs:subClassOf.]
http://swoogle.umbc.edu/2005/modules.php?name=Ontology_Dictionary
34
Semantic Web Challenge: winners
  • 2003: "CS AKTive Space (CAS) is an integrated Semantic
    Web application which provides a way to explore
    the UK Computer Science Research domain across
    multiple dimensions for multiple stakeholders,
    from funding agencies to individual researchers."
  • 2004: "Flink itself is also likely to be unique as a
    crossover between a social experiment and a
    semantic application."
  • 2005: "CONFOTO is a browsing and annotation service for
    conference photos."
http://challenge.semanticweb.org/
35
Triple Shop: SPARQL dataset finder
"Who knows Anupam Joshi? Show me their names,
email addresses and pictures."
1. Compose a SPARQL query without a FROM clause
2. Parse the SPARQL query, search Swoogle for
related URLs, and compose a dataset
3. Run the SPARQL query on the dataset
http://sparql.cs.umbc.edu/tripleshop2/
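A query for the question above might look like the one embedded in this Python sketch. The sketch then extracts the vocabulary terms that a dataset finder could use as search keys (step 2) and adds FROM clauses for whatever URLs come back (before step 3). The query text, the helper names and the example.org URL are assumptions; this is not Triple Shop's actual code, and the Swoogle search itself is left out.

```python
import re

# A SPARQL query with no FROM clause (assumed wording of the question).
QUERY = """
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
SELECT ?name ?mbox ?depiction
WHERE {
  ?person foaf:knows ?joshi .
  ?joshi  foaf:name "Anupam Joshi" .
  ?person foaf:name ?name .
  OPTIONAL { ?person foaf:mbox ?mbox }
  OPTIONAL { ?person foaf:depiction ?depiction }
}
"""


def vocabulary_terms(query):
    """Expand the prefixed names used in the WHERE clause into full URIs."""
    prefixes = dict(re.findall(r"PREFIX\s+(\w+):\s*<([^>]+)>", query))
    body = query[query.index("WHERE"):]
    terms = set()
    for prefix, local in re.findall(r"\b(\w+):(\w+)", body):
        if prefix in prefixes:
            terms.add(prefixes[prefix] + local)
    return sorted(terms)


def with_dataset(query, urls):
    """Insert one FROM clause per document URL before the WHERE clause."""
    from_clauses = "".join(f"FROM <{url}>\n" for url in urls)
    head, where, tail = query.partition("WHERE")
    return head + from_clauses + where + tail


print(vocabulary_terms(QUERY))
print(with_dataset(QUERY, ["http://example.org/foaf.rdf"]))
```

The regular expressions are a shortcut that works for this simple query; a production tool would use a real SPARQL parser.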
36
Integrating Social Networks
  • Data
    • FOAF: knows, RDF, RDF/XML
    • DBLP: coauthor database, HTML
  • Trust
    • Reputation
    • Trust network
  • Computation
    • Entity mapping
    • Tie strength
    • Trust aggregation

[Figure: a FOAF network and reputation systems (Google PageRank, Citeseer Rank) combined with Golbeck's trust network and the DBLP coauthor network.
Golbeck's trust network: J. Golbeck, L. Ding, J. Hendler, H. Chen, P. Kolari, F. Perich, A. Joshi, T. Finin, Kagal, connected by knows edges; node roles marked source, sink, hub and island.
DBLP coauthor network: Y. Peng, L. Ding, A. Sheth, T. Finin, A. Joshi, L. Kagal, M. P. Singh, H. Chen, F. Perich, connected by co-author edges with weights such as 1, 5, 6 and 28.
The two networks are linked by sameName.]
37
Inference Web Infrastructure
[Figure: Inference Web infrastructure on the WWW, built around the Proof Markup Language (PML), which covers trust, justification and provenance.
Toolkit: IWTrust (trust computation), IW Explainer/Abstractor (end-user friendly visualization), IWBrowser (expert friendly visualization), IWSearch (search engine based publishing), IWBase (provenance registration).
Connected systems: SDS (OWL-S/BPEL; DAML/SNRC), CWM (N3; TAMI), JTP (KIF; DAML/NIMD), SPARK (SPARK-L; CALO), UIMA (text analytics; NIMD/Exp Agg).]
Inference Web: a framework for explaining question
answering tasks by abstracting, storing,
exchanging, combining, annotating, filtering,
segmenting, comparing, and rendering proofs and
proof fragments provided by question answerers.
38
PML: Proof Markup Language
[Figure: PML concepts and the properties that link them, forming a justification trace.
Concepts: Question, Query, NodeSet, InferenceStep, InferenceRule, InferenceEngine, Language, Source, SourceUsage, Mapping, IWBase.
Properties: isQueryFor, hasAnswer, hasLanguage, hasInferenceEngine, fromQuery, fromAnswer, isConsequentOf, hasRule, hasAntecedent, hasVariableMapping, hasSourceUsage, hasSource, usageTime.
Example instances: Question foo:question1 (what is Tony's specialty); Query foo:query1 (type TonysSpecialty ?x); NodeSet foo:ns1 (hasConclusion ...); NodeSet foo:ns2 (hasConclusion ...).]
39
IWBrowser: Justification and Provenance
40
Tracking Provenance via RDF Molecule
[Figure: an RDF graph G with triples t1 to t4 is decomposed into RDF molecules; the molecules are then matched as sub-graphs against Web pages, discovered by Swoogle, that contain one or more molecules.
Labels in G: http://www.cs.umbc.edu/dingli1, foaf:knows, foaf:name "Li Ding", foaf:name "Tim Finin", foaf:mbox mailto:finin@umbc.edu]
Ding, L., Finin, T., Peng, Y., Pinheiro da Silva,
P., McGuinness, D.L. Tracking RDF Graph
Provenance using RDF Molecules. Proceedings of
the Fourth International Semantic Web Conference
(poster), November 2005.
http://www-ksl.stanford.edu/KSL_Abstracts/KSL-05-06.html
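One simple way to decompose a graph is to keep triples that share a blank node together and let every other triple stand alone. The Python sketch below implements only that basic grouping; the cited paper's own decomposition rules may be finer-grained, and the example graph (including the blank-node label _:b1 and the assignment of triples) is an assumption loosely based on the labels in the figure.

```python
def is_blank(term):
    """Blank nodes are written with a leading '_:' in this sketch."""
    return term.startswith("_:")


def decompose(graph):
    """Group triples so that triples sharing a blank node stay together."""
    molecules = []  # list of (set of blank nodes, list of triples)
    for triple in graph:
        s, _, o = triple
        bnodes = {t for t in (s, o) if is_blank(t)}
        merged_nodes, merged_triples = set(bnodes), [triple]
        remaining = []
        for nodes, triples in molecules:
            if nodes & bnodes:
                # This molecule shares a blank node with the new triple: merge.
                merged_nodes |= nodes
                merged_triples = triples + merged_triples
            else:
                remaining.append((nodes, triples))
        molecules = remaining + [(merged_nodes, merged_triples)]
    return [triples for _, triples in molecules]


# Assumed example graph: Li Ding knows an unnamed person with a name and mailbox.
graph = [
    ("http://www.cs.umbc.edu/dingli1", "foaf:name", "Li Ding"),
    ("http://www.cs.umbc.edu/dingli1", "foaf:knows", "_:b1"),
    ("_:b1", "foaf:name", "Tim Finin"),
    ("_:b1", "foaf:mbox", "mailto:finin@umbc.edu"),
]
for molecule in decompose(graph):
    print(molecule)
```

Splitting at this granularity keeps each blank node's description intact, so a molecule can be matched against other documents without losing the meaning of the blank node.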
41
Conclusion
  • The Semantic Web
    • simple but powerful
    • Standardized by W3C: RDF, RDFS, OWL
  • Current focuses
    • Query: SPARQL
    • Rules: SWRL, RIF
    • Web services: OWL-S, WSDL-S, SAWSDL
    • Best practice and deployment
  • but cannot do everything
  • Open questions
    • Business model, industry adoption?
    • Privacy?

42
Recommended Readings
  • Tutorials
    • Semantic Web Road map (since 1998), Tim
      Berners-Lee
    • The Semantic Web, Scientific American, May 2001,
      Tim Berners-Lee, James Hendler and Ora Lassila
    • Ontology Development 101: A Guide to Creating
      Your First Ontology, 2001, Natalya F. Noy and
      Deborah L. McGuinness
    • Semantic Web Tutorials,
      http://www.w3.org/2001/sw/BestPractices/Tutorials
  • Starting points
    • W3C Semantic Web activity, http://www.w3.org/2001/sw/
    • W3C Semantic Web Interest Group,
      http://www.w3.org/2001/sw/interest/
    • W3C Semantic Web News, http://www.w3.org/2001/sw/news
    • Planet RDF (aggregated blogs), http://planetrdf.com/
    • Dave Beckett's Resource Description Framework
      (RDF) Resource Guide
    • Swoogle Semantic Web Search Engine,
      http://swoogle.umbc.edu
    • Semantic Web reference card,
      http://ebiquity.umbc.edu/resource/html/id/94/
  • Conferences and journals
    • International Semantic Web Conference (ISWC)
    • European Semantic Web Conference (ESWC)
    • Semantic Technology Conference (SemTech)
    • Journal of Web Semantics

43
Ongoing W3C Semantic Web Activity
  • RDF Data Access Working Group
    • RDQL => SPARQL
  • Rules Interchange Working Group
    • RuleML => SWRL => RIF
  • Best Practices Working Group
    • Vocabulary management, e.g. WordNet
    • Thesauri: SKOS (Simple Knowledge Organization
      System)
    • Image annotation
    • DOAP (Description of a Project)
    • Many tutorials and demos
  • Semantic Annotations for Web Services Description
    Language Working Group
    • OWL-S and WSDL-S
    • WSDL 2.0