SW Application Development with the Sesame Framework - PowerPoint PPT Presentation

Slides: 63
Provided by: jeenbro
Learn more at: http://www.openrdf.org
Transcript and Presenter's Notes

Title: SW Application Development with the Sesame Framework


1
SW Application Development with the Sesame Framework
  • Jeen Broekstra, Aduna
  • Atanas Kiryakov, OntoText
  • James Leigh, Workbrain
2
Schedule for the afternoon
  • 14:00 - 14:10 Introduction
  • RDF and RDF Schema
  • The Sesame project: history, contributors, future
  • A running example for the rest of the afternoon
  • 14:10 - 14:45 The Sesame Framework
  • overview of architecture and features
  • how to use Sesame to build apps
  • 14:45 - 15:00 Querying RDF with SeRQL / SPARQL
  • 15:00 - 15:30 Using Context for Provenance and
    Time tracking
  • 15:30 - 15:50 Coffee break
  • 15:50 - 16:35 Elmo
  • 16:35 - 17:15 OWLIM
  • 17:15 - 17:30 Discussion / wrap-up

3
Tutorial Materials online
  • Location: http://openrdf.org/conferences/eswc2006/
  • Links to Sesame 2 user, system, api documentation
  • Query Language documentation
  • Supportive material for this tutorial
  • example queries and an example Sesame server
  • a CVS repository containing some simple,
    runnable, code examples
  • example Elmo applications

4
Introduction
5
RDF in one slide
  • Data model for expressing knowledge
  • basic building block: the statement
  • Jeen .
  • groups of statements form graphs

[Figure: example RDF graph. Nodes: person001, project001, "Jeen", "Sesame", j.broekstra@tue.nl. Edge labels: name, email, worksIn, projectMemberEmail.]
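As a minimal sketch of the idea that statements are triples and that a set of triples forms a graph, the Java below models a few statements and follows two edges. The class and method names are hypothetical (they are not Sesame classes), and the edge directions are an assumed reading of the figure labels.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration only; these are not Sesame classes. */
class TripleGraphSketch {
    /** One RDF statement: subject, predicate, object. */
    record Statement(String subject, String predicate, String object) {}

    /** A tiny graph; the edges are an assumed reading of the slide's figure. */
    static List<Statement> exampleGraph() {
        List<Statement> g = new ArrayList<>();
        g.add(new Statement("person001", "name", "Jeen"));
        g.add(new Statement("person001", "worksIn", "project001"));
        g.add(new Statement("project001", "name", "Sesame"));
        return g;
    }

    /** Returns the objects of all statements with the given subject and predicate. */
    static List<String> objects(List<Statement> g, String subject, String predicate) {
        List<String> result = new ArrayList<>();
        for (Statement st : g) {
            if (st.subject().equals(subject) && st.predicate().equals(predicate)) {
                result.add(st.object());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Statement> g = exampleGraph();
        // Follow two edges: person001 -worksIn-> project -name-> literal
        for (String project : objects(g, "person001", "worksIn")) {
            System.out.println(objects(g, project, "name"));
        }
    }
}
```

Because statements share node identifiers such as project001, walking from one statement to the next is what turns a flat list of triples into a graph.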
6
RDF Schema in one more slide
  • RDF Schema is a Vocabulary Description Language
  • it allows specification of domain vocabulary and
    a way to structure it
  • Class, Property, subClassOf, subPropertyOf,
    domain, range
  • Formal semantics add simple reasoning
    capabilities
  • class and property subsumption
  • domain and range inference

[Figure: example RDF Schema graph. Person rdf:type rdfs:Class; Researcher rdfs:subClassOf Person; name rdf:type rdf:Property; name rdfs:domain Person; person001 rdf:type Researcher.]
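The two kinds of inference named on this slide (class subsumption and domain inference) can be sketched as a naive fixed-point computation. This is an illustrative sketch in plain Java, not Sesame's reasoner; the class name and the use of prefixed names as plain strings are assumptions, and only three of the RDFS entailment rules are covered.

```java
import java.util.HashSet;
import java.util.Set;

/** Hypothetical sketch of RDFS-style inference; not Sesame's implementation. */
class RdfsClosureSketch {
    record T(String s, String p, String o) {}

    /** Repeatedly applies three rules until no new statement is derived. */
    static Set<T> closure(Set<T> input) {
        Set<T> all = new HashSet<>(input);
        boolean changed = true;
        while (changed) {
            Set<T> derived = new HashSet<>();
            for (T a : all) {
                for (T b : all) {
                    // class subsumption: x type C, C subClassOf D => x type D
                    if (a.p().equals("rdf:type") && b.p().equals("rdfs:subClassOf")
                            && a.o().equals(b.s())) {
                        derived.add(new T(a.s(), "rdf:type", b.o()));
                    }
                    // transitivity: C subClassOf D, D subClassOf E => C subClassOf E
                    if (a.p().equals("rdfs:subClassOf") && b.p().equals("rdfs:subClassOf")
                            && a.o().equals(b.s())) {
                        derived.add(new T(a.s(), "rdfs:subClassOf", b.o()));
                    }
                    // domain inference: p domain C, x p y => x type C
                    if (a.p().equals("rdfs:domain") && b.p().equals(a.s())) {
                        derived.add(new T(b.s(), "rdf:type", a.o()));
                    }
                }
            }
            changed = all.addAll(derived);
        }
        return all;
    }

    public static void main(String[] args) {
        Set<T> g = new HashSet<>();
        g.add(new T("person001", "rdf:type", "Researcher"));
        g.add(new T("Researcher", "rdfs:subClassOf", "Person"));
        g.add(new T("name", "rdfs:domain", "Person"));
        g.add(new T("person002", "name", "Jeen"));
        closure(g).forEach(System.out::println);
    }
}
```

The nested loop is quadratic per pass, which is fine for a handful of statements but is not how a scalable store would do it.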
7
Turtle Syntax
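    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix movie: <...> .
    @prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
    @prefix foaf: <http://xmlns.com/foaf/0.1/> .
    @prefix um: <...> .

    <...> a um:User ;
        foaf:firstName "Jeen" ;
        foaf:familyName "Broekstra" ;
        foaf:mbox <...om> ;
        rdfs:seeAlso <...jeen.rdf> .

    <...> um:rating [
        a um:ActorRating ;
        um:onActor <...4/> ;
        rdf:value "6"^^xsd:integer
    ] .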

8
The Sesame Project
  • 1999-2001: EU IST On-To-Knowledge project
  • development of a research prototype RDF query
    engine for use in the project
  • Aduna developed a query engine (RQL) and an RDBMS
    backend
  • result: Sesame 0.1
  • 2001-2003: Open Sesame
  • sponsored by the NLNet foundation
  • two-year open source development project
  • end result: Sesame 1.0
  • 2004-2006: Open Sesame 2
  • follow-up project
  • goal: to increase adoption of Sesame, and to
    further develop the framework
  • result: Sesame 2.0
  • 2006 onwards
  • Aduna is committed to further developing Sesame
    as part of its product suite
  • Dedicated to open source
  • Open invitation for more participation in
    development: we'd like to have your contributions!

9
Tutorial example
  • A community system for Movie and Actor ratings
  • user model
  • user profile
  • ratings of actors and movies
  • movie ontology
  • movies with genres, descriptions, etc.
  • actors with foaf profiles

10
The Ontology (or Ontologies)
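[Figure: the ontology. movie:Actor and um:User are rdfs:subClassOf foaf:Person; um:ActorRating and um:MovieRating are rdfs:subClassOf um:Rating; um:rating has rdfs:domain um:User and rdfs:range um:Rating; um:onActor has rdfs:domain um:ActorRating and rdfs:range movie:Actor; um:onMovie has rdfs:domain um:MovieRating.]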
11
User/Rating Model
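[Figure: user/rating instance data. user1 rdf:type um:User, foaf:firstName "John", foaf:familyName "Doe"; user1 um:rating a rating node with rdf:type um:ActorRating, um:onActor actor1, rdf:value "5"^^xsd:int; actor1 rdf:type movie:Actor, foaf:firstName "Johnny", foaf:familyName "Depp".]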
12
Movie Model
[Figure: movie instance data. movie1 rdf:type movie:Movie, movie:title "Edward ScissorHands", movie:year 1990, movie:genre movie:Romance and movie:Comedy (both rdf:type movie:Genre), movie:hasPart r1; r1 rdf:type movie:Role, movie:characterName "Edward ScissorHands", movie:playedBy actor1.]
13
The Sesame Framework
14
What is Sesame?
  • A framework for storage, querying and inferencing
    of RDF and RDF Schema
  • A Java Library for handling RDF
  • A Database Server for (remote) access to
    repositories of RDF data

15
Sesame features
  • Light-weight yet powerful Java API
  • Highly expressive query and transformation
    languages
  • SeRQL, SPARQL
  • High scalability (O(10^7) triples on desktop
    hardware)
  • Various backends
  • Native Store
  • RDBMS (MySQL, Oracle 10, DB2, PostgreSQL)
  • main memory
  • Reasoning support
  • RDF Schema reasoner
  • OWL DLP (OWLIM)
  • domain reasoning (custom rule engine)
  • Transactional support
  • Context support
  • Rio Toolkit: parsers and writers for different
    RDF syntaxes
  • RDF/XML, Turtle, N3, N-Triples

16
Sesame architecture
[Figure: layered architecture diagram. Components, in the order shown: application; HTTP / SPARQL protocol; HTTP Server; application; Repository Access API; SeRQL, SPARQL, Rio; SAIL API, SAIL Query Model; RDF Model.]
17
Sesame architecture
  • application (remote): remote apps can communicate over
    the Web with a Sesame server and update data or do queries
  • HTTP / SPARQL protocol
  • HTTP Server: allows deployment of Sesame as a web-enabled
    database server (e.g. in Tomcat). Implements a
    superset of the SPARQL protocol (HTTP REST)
  • application (local): local apps can just include (parts of)
    Sesame as a Java library and use it to process RDF data
    efficiently
  • Repository Access API: main access API of Sesame. Offers
    developer-friendly methods for manipulating RDF
    data (query, adding, removing, updating)
  • SeRQL, SPARQL: declarative querying and other higher-level
    functions on SAILs
  • SAIL API, SAIL Query Model: Storage And Inference Layer.
    System API for wrapping storage backends
  • Rio: RDF I/O. Set of parsers and writers for RDF/XML,
    Turtle, N3, N-Triples. Can be used separately
  • RDF Model: the core RDF model, containing objects and
    interfaces for URIs, blank nodes, literals,
    statements
18
The SAIL API
  • Storage And Inferencing Layer
  • Abstraction from physical storage
  • allows other Sesame components to function on any
    type of store
  • can be used as a wrapper layer for a particular
    data source
  • System Internal API
  • application developers typically do not use it
    directly

19
The Repository Access API
  • A single Java object representation for a Sesame
    database, offering methods for
  • evaluating a query and retrieving the result
  • adding RDF data from local file, from the web, as
    a text string, etc.
  • adding/removing (sets of) RDF statements
  • starting/stopping transactions

20
Installing a Server
  • Prepare the environment
  • install a Java Servlet Container (we recommend
    Apache Tomcat 5.x)
  • install a full Java 5.0 environment (we
    recommend Sun J2SE 1.5.x)
  • deploy the sesame.war web application
  • TOMCAT/webapps/sesame
  • configure the Sesame server
  • SESAME/WEB-INF/server.conf.example
  • edit as XML file in your favourite editor

21
A repository configuration
  • id is the string identifier for the repository
  • title is a human-readable title
  • sailstack contains a list of sails
    (top-to-bottom)
  • the bottom sail is the actual storage layer
  • each layered sail adds functionality as a
    filter on top of the store (e.g. inferencing,
    caching, etc.)

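[Example configuration: a repository titled "Main Memory RDF Schema repository". Sail stack, top to bottom: class="org.openrdf.sesame.sailimpl.memory.MemoryStoreRDFSInferencer", then class="org.openrdf.sesame.sailimpl.memory.MemoryStore" with a parameter value of /data/mem-rdfs.dat.]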

22
Reasoning
  • Sesame supports RDF Schema entailment
  • a reasoner is a stacked SAIL and applied to a
    storage backend at configuration time
  • current implementation computes the RDFS closure
    at upload-time and stores it
  • main advantage
  • faster querying
  • main disadvantage
  • slower update speed
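The trade-off above can be sketched with a stacked layer that materialises inferred statements when data is added, so that a query is only a plain lookup. The interface and class names below are hypothetical stand-ins (they are not the Sesame SAIL API), and only one rule (rdf:type through rdfs:subClassOf) is implemented.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a stacked storage layer; not the Sesame SAIL API. */
class SailStackSketch {
    interface Sail {
        void add(String s, String p, String o);
        boolean contains(String s, String p, String o);
    }

    /** Bottom of the stack: the actual storage. */
    static class MemorySail implements Sail {
        private final Set<List<String>> data = new HashSet<>();
        public void add(String s, String p, String o) { data.add(List.of(s, p, o)); }
        public boolean contains(String s, String p, String o) {
            return data.contains(List.of(s, p, o));
        }
    }

    /** Stacked layer: computes inferred rdf:type statements at add time. */
    static class SubclassInferencer implements Sail {
        private final Sail base;
        private final Set<List<String>> types = new HashSet<>();      // [instance, class]
        private final Set<List<String>> subClassOf = new HashSet<>(); // [sub, super]

        SubclassInferencer(Sail base) { this.base = base; }

        public void add(String s, String p, String o) {
            base.add(s, p, o);
            if (p.equals("rdf:type")) types.add(List.of(s, o));
            if (p.equals("rdfs:subClassOf")) subClassOf.add(List.of(s, o));
            materialise(); // the extra work that slows down updates
        }

        private void materialise() {
            boolean changed = true;
            while (changed) {
                changed = false;
                for (List<String> t : new ArrayList<>(types)) {
                    for (List<String> sc : subClassOf) {
                        if (t.get(1).equals(sc.get(0))
                                && types.add(List.of(t.get(0), sc.get(1)))) {
                            base.add(t.get(0), "rdf:type", sc.get(1));
                            changed = true;
                        }
                    }
                }
            }
        }

        /** Querying needs no reasoning: inferred statements are already stored. */
        public boolean contains(String s, String p, String o) {
            return base.contains(s, p, o);
        }
    }

    public static void main(String[] args) {
        Sail stack = new SubclassInferencer(new MemorySail());
        stack.add("person001", "rdf:type", "Researcher");
        stack.add("Researcher", "rdfs:subClassOf", "Person");
        System.out.println(stack.contains("person001", "rdf:type", "Person"));
    }
}
```

Wrapping the store rather than subclassing it mirrors the top-to-bottom sail stack in the configuration: the inferencer can sit on any store that implements the same interface.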

23
Querying with SeRQL / SPARQL
  • play along at
  • http://www.openrdf.org/sesame2-webclient

24
Querying RDF
  • RDF is a labeled, directed graph of
    semistructured data
  • no rigid schema
  • An RDF query language needs to be able to address
    this
  • graph path expressions
  • dealing with semistructured nature of RDF
  • flexible querying of both data and schema

25
SeRQL vs. SPARQL
  • Both are expressive query and transformation
    languages for RDF
  • SELECT and CONSTRUCT
  • optional path expressions
  • support for context/named graphs
  • SeRQL (pronounced "circle")
  • nested queries (IN, EXISTS operators)
  • user-friendly syntax (a matter of taste, of
    course)
  • efficient Sesame implementation
  • SPARQL (pronounced "sparkle")
  • W3C Standard (in progress)
  • tool interoperability: Jena, Redland, 3Store,
    Sesame, ...

26
SeRQL vs. SPARQL example
SeRQL:

    SELECT DISTINCT X, T
    FROM {X} movie:title {T};
             movie:hasPart {Y} movie:characterName {Z}
    WHERE Z = "Edward Scissorhands"@en
    USING NAMESPACE movie = <...s/>

SPARQL:

    PREFIX movie: <...>
    SELECT DISTINCT ?x ?t
    WHERE { ?x movie:title ?t ;
               movie:hasPart ?y .
            ?y movie:characterName ?z .
            FILTER (?z = "Edward Scissorhands"@en) }
27
SeRQL path expressions
  • {X} movie:hasPart {role1}
  • {X} movie:hasPart {Y}
  • {X} P {Y}

28
Chaining, branching and comparing
  • Chaining
  • {X} movie:hasPart {Y} movie:characterName {Z}
  • Branching
  • {Y} rdf:type {movie:Role}; movie:characterName {Z}
  • Comparison operators
  • String comparison
  • Z LIKE "*Hands"
  • boolean comparison
  • X

29
SeRQL query composition
  • Using the building blocks, we can compose complex
    queries.
  • SeRQL uses a select-from-where syntax (like SQL)
  • select the variables that you want to return
  • from the path in the graph that you want to get
    the information from
  • where additional constraints on the values using
    operators

    SELECT X, Y, Z
    FROM {X} movie:hasPart {Y} movie:characterName {Z}
    WHERE Z LIKE "edward scissorhands" IGNORE CASE
    USING NAMESPACE movie = <...>
30
Optional path expressions
  • RDF is semi-structured
  • Even when the schema says some object should have
    a particular property, it may not always be
    present in the data
  • Users have names and email addresses, but
    Geert-Jan is a user without a known email address

[Figure: person001 and person002 are both of type um:User. person001 has foaf:firstName "Jeen" and foaf:mbox j.broekstra@tue.nl; person002 has foaf:firstName "Geert-Jan" and no foaf:mbox.]
31
Optional path expressions (2)
  • To be able to query for all users, their first
    names, and if known their email address, SeRQL
    introduces optional path expressions

    SELECT DISTINCT Person, Name, Email
    FROM {Person} rdf:type {um:User};
                  foaf:firstName {Name};
                  [foaf:mbox {Email}]
    USING NAMESPACE foaf = <...>, um = <...>
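An optional path expression behaves much like a left outer join: a user is returned even when the optional part has no match, with the corresponding variable left unbound. The Java below is a minimal sketch of that behaviour over an in-memory list of triples; the class name is hypothetical, null stands in for an unbound variable, and the mailbox value in the example is an assumed form of the address in the figure.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of optional matching; not Sesame's query engine. */
class OptionalMatchSketch {
    record T(String s, String p, String o) {}

    /** Rows of [person, firstName, mbox]; mbox is null when none is known. */
    static List<String[]> usersWithOptionalMbox(List<T> g) {
        List<String[]> rows = new ArrayList<>();
        for (T type : g) {
            if (!(type.p().equals("rdf:type") && type.o().equals("um:User"))) continue;
            for (T name : g) {
                if (!(name.s().equals(type.s()) && name.p().equals("foaf:firstName"))) continue;
                boolean found = false;
                for (T mbox : g) {
                    if (mbox.s().equals(type.s()) && mbox.p().equals("foaf:mbox")) {
                        rows.add(new String[] {type.s(), name.o(), mbox.o()});
                        found = true;
                    }
                }
                // The optional part did not match: keep the row anyway.
                if (!found) rows.add(new String[] {type.s(), name.o(), null});
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        List<T> g = List.of(
            new T("person001", "rdf:type", "um:User"),
            new T("person001", "foaf:firstName", "Jeen"),
            new T("person001", "foaf:mbox", "mailto:j.broekstra@tue.nl"),
            new T("person002", "rdf:type", "um:User"),
            new T("person002", "foaf:firstName", "Geert-Jan"));
        for (String[] row : usersWithOptionalMbox(g)) {
            System.out.println(String.join(" | ", row[0], row[1], String.valueOf(row[2])));
        }
    }
}
```

Without the fallback row, Geert-Jan would disappear from the result, which is exactly what happens when the mailbox pattern is written as a mandatory path expression.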

32
CONSTRUCT queries
  • CONSTRUCT-queries return RDF statements
  • each RDF statement matching the query pattern is
    returned
  • The query result is
  • a subgraph of the original graph, or
  • a transformed graph
  • This mechanism allows formulation of simple rules

33
CONSTRUCTing subgraphs
  • Retrieve for each movie the title and year as RDF
    Statements

    CONSTRUCT *
    FROM {M} rdf:type {movie:Movie};
             movie:year {Y};
             movie:title {T}
    USING NAMESPACE movie = <.../movies/>
34
Movie Model (repeat)
[Figure: the same movie instance graph as on slide 12.]
35
Graph Transformations
  • Create a graph of actors and relate them to the
    movies they play in (through a new playsInMovie
    relation)
    CONSTRUCT
        {A} foaf:firstName {FN};
            foaf:familyName {LN};
            my:playsInMovie {M} movie:title {T}
    FROM
        {M} movie:title {T};
            movie:hasPart {} movie:playedBy {A} foaf:firstName {FN};
                                                foaf:familyName {LN}
    USING NAMESPACE
        movie = <...>,
        foaf = <...>,
        my = <...>
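The effect of such a graph transformation can be sketched outside the query language: match a pattern in the source graph and emit new statements built from the matched values. The Java below is a hypothetical illustration of that idea for the playsInMovie relation only (it leaves out the names and titles); it is not how Sesame evaluates CONSTRUCT queries.

```java
import java.util.Collection;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch of a CONSTRUCT-style transformation. */
class ConstructSketch {
    record T(String s, String p, String o) {}

    /** For every movie -hasPart-> role -playedBy-> actor, emit actor -playsInMovie-> movie. */
    static Set<T> playsInMovie(Collection<T> g) {
        Set<T> out = new LinkedHashSet<>();
        for (T part : g) {
            if (!part.p().equals("movie:hasPart")) continue;
            for (T played : g) {
                if (played.p().equals("movie:playedBy") && played.s().equals(part.o())) {
                    out.add(new T(played.o(), "my:playsInMovie", part.s()));
                }
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<T> g = List.of(
            new T("movie1", "movie:hasPart", "r1"),
            new T("r1", "movie:playedBy", "actor1"));
        playsInMovie(g).forEach(System.out::println);
    }
}
```

The result is a new graph rather than a table of bindings, which is why the slide describes CONSTRUCT as a way to formulate simple rules.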

36
Nested Queries
  • often necessary for conditions on a set (rather
    than a single value)
  • if a value x exists such that
  • if the property p is in the set
  • SeRQL has three nested query forms
  • IN operator
  • ANY and ALL operator quantification
  • EXISTS operator

37
Using nested queries (1)
  • Using EXISTS to retrieve all movies for which no
    rating is known

    SELECT DISTINCT movie, mtitle
    FROM {movie} rdf:type {movie:Movie};
                 movie:title {mtitle}
    WHERE NOT EXISTS (SELECT rating
                      FROM {rating} rdf:type {um:MovieRating};
                                    um:onMovie {movie})
    USING NAMESPACE movie = <...>, um = <...>

38
Using nested queries (2)
  • Using the ALL modifier to retrieve the highest
    actor rating for each user
    SELECT DISTINCT user, rating, fname, lname
    FROM {user} rdf:type {um:User};
                um:rating {} rdf:value {rating};
                             um:onActor {} foaf:firstName {fname};
                                           foaf:familyName {lname}
    WHERE rating >= ALL
        (SELECT otherRating
         FROM {user} um:rating {} rdf:type {um:ActorRating};
                                  rdf:value {otherRating})
    USING NAMESPACE
        foaf = <...>,
        um = <...>

39
Using nested queries (3)
  • Using the IN operator to find all movies which
    share at least one genre with Gone with the Wind

    SELECT DISTINCT movie, mtitle
    FROM {movie} rdf:type {movie:Movie};
                 movie:title {mtitle};
                 movie:genre {genre}
    WHERE genre IN (SELECT otherGenre
                    FROM {} rdf:type {movie:Movie};
                            movie:title {gwtw};
                            movie:genre {otherGenre}
                    WHERE label(gwtw) LIKE "gone with the wind" IGNORE CASE)
    USING NAMESPACE movie = <...>, um = <...>

40
Using the Sesame API
  • example applications in Sesame CVS

41
Using Sesame as a library
  • Include Sesame jar files in your classpath
  • sesame.jar, openrdf-util.jar, openrdf-model.jar,
    rio.jar
  • sparql-sesame.jar, sparql-core.jar (optional)
  • Use the Sesame Repository API to create, access,
    query, etc. RDF models in Sesame repositories.

42
Creating a Repository object
    import org.openrdf.sesame.repository.*;
    import org.openrdf.sesame.sailimpl.memory.*;

    // first repository: an in-memory store
    Repository rep = new Repository(
        new MemoryStore());
    rep.initialize();

    // second repository: an in-memory store
    // with RDFS inferencing enabled
    Repository rep2 = new Repository(
        new MemoryStoreRDFSInferencer(
            new MemoryStore()));
    rep2.initialize();

43
Querying a Sesame Repository
    String query = "SELECT X, Y FROM {X} P {Y}";

    // execute the query and give me the result.
    QueryResult result =
        rep.evaluateTupleQuery(QueryLanguage.SERQL, query);

    // a query result is a set of solutions.
    for (Solution solution : result) {
        // each solution is a set of variable bindings.
        Value x = solution.getValue("X");
        Value y = solution.getValue("Y");
        // do something interesting with the values here
    }
    result.close();

44
Transactions
  • Sesame repositories have full transaction support
  • By default, the repository runs in autoCommit
    mode
  • every add or remove operation is treated as a
    single transaction
  • Explicit Transaction objects can be used to group
    operations into transactions

45
Using Transactions
    Transaction txn = rep.startTransaction();
    try {
        // add the first file
        txn.add(inputFile1, baseURI1, RDFFormat.RDFXML);
        // add the second file
        txn.add(inputFile2, baseURI2, RDFFormat.RDFXML);
        txn.commit();
    }
    finally {
        if (txn.isActive()) {
            // something went wrong during the transaction,
            // so we want to cancel it completely, and return to
            // the state before the transaction started
            txn.rollback();
        }
    }
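The commit-or-rollback pattern on this slide can be tried without a Sesame installation. The sketch below uses hypothetical Store and Txn classes (stand-ins, not the Sesame Repository and Transaction classes) in which changes are buffered until commit, so a failure part-way through leaves the store untouched.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical stand-ins used to illustrate the try/finally rollback pattern. */
class TransactionSketch {
    static class Store {
        private final Set<String> statements = new HashSet<>();
        Txn startTransaction() { return new Txn(this); }
        int size() { return statements.size(); }
    }

    static class Txn {
        private final Store store;
        private final List<String> pending = new ArrayList<>();
        private boolean active = true;

        Txn(Store store) { this.store = store; }

        void add(String statement) {
            if (statement == null) throw new IllegalArgumentException("null statement");
            pending.add(statement); // buffered, not yet visible in the store
        }
        void commit() { store.statements.addAll(pending); active = false; }
        void rollback() { pending.clear(); active = false; }
        boolean isActive() { return active; }
    }

    /** Adds all statements atomically; returns false if the transaction was cancelled. */
    static boolean addAll(Store store, String... statements) {
        Txn txn = store.startTransaction();
        try {
            for (String s : statements) txn.add(s);
            txn.commit();
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        } finally {
            // Still active here means commit() was never reached.
            if (txn.isActive()) txn.rollback();
        }
    }

    public static void main(String[] args) {
        Store store = new Store();
        System.out.println(addAll(store, "a", "b") + " " + store.size());
        System.out.println(addAll(store, "c", null) + " " + store.size());
    }
}
```

Checking isActive() in the finally block, rather than rolling back in every catch clause, keeps the cleanup in one place regardless of which exception interrupted the work.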

46
Context support
  • Mechanism to identify groups of statements
  • Each statement gets an (optional) extra context
    identifier (a URI)
  • Instead of triples we now have quads
  • We can make additional statements about the group
    by using the context identifier

47
How context works
[Figure: a Sesame repository containing two named contexts, context1 and context2.]
48
Some Use Cases for Context
  • Provenance tracking
  • allows easy updates when a source document has a
    new version
  • allows querying of particular sources within one
    repository
  • Versioning and Time Tracking
  • the context identifier can be used to indicate
    different versions of the same information
  • we can also use it to identify which information
    is valid at which period in time

49
Default vs. Named Context
  • A named context is a context with an associated
    context identifier.
  • Each repository can have any number of named
    contexts
  • The default context is that part of the store
    that is queried when no named context is
    specified in the query
  • Each repository has exactly one default context
  • Depending on configuration, the default context
    can contain
  • only the statements which have no associated
    named context (exclusive mode)
  • all statements, including those in all named
    contexts (inclusive mode)
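The difference between the two modes can be sketched by storing statements as quads and filtering on the fourth element. The class and method names below are hypothetical, a null context stands for "no named context", and the file names reused as context identifiers are an assumption for the example.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of named and default contexts; not Sesame's implementation. */
class ContextSketch {
    /** A statement plus an optional context identifier (null = no named context). */
    record Quad(String s, String p, String o, String context) {}

    /** Statements visible when no named context is specified in the query. */
    static List<Quad> defaultContext(List<Quad> store, boolean inclusive) {
        List<Quad> out = new ArrayList<>();
        for (Quad q : store) {
            // inclusive mode: everything; exclusive mode: only context-less statements
            if (inclusive || q.context() == null) out.add(q);
        }
        return out;
    }

    /** Statements in one specific named context. */
    static List<Quad> namedContext(List<Quad> store, String context) {
        List<Quad> out = new ArrayList<>();
        for (Quad q : store) {
            if (context.equals(q.context())) out.add(q);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Quad> store = List.of(
            new Quad("jeen", "foaf:firstName", "Jeen", "foaf-jeen.rdf"),
            new Quad("chris", "foaf:firstName", "Chris", "foaf-chris.rdf"),
            new Quad("foaf-jeen.rdf", "ex:version", "1", null));
        System.out.println(defaultContext(store, true).size());
        System.out.println(defaultContext(store, false).size());
        System.out.println(namedContext(store, "foaf-jeen.rdf").size());
    }
}
```

The third quad is a statement about a context identifier itself, which is how provenance or version information gets attached to a whole group of statements.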

50
Inclusive vs. Exclusive
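[Figure: a Sesame repository in inclusive mode. The default context encloses the named contexts context1 and context2; each named context has a source edge to one of the documents foaf-chris.rdf and foaf-jeen.rdf.]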
51
Querying Context in SeRQL
  • The FROM CONTEXT clause can be used to query a
    specific named context

    SELECT DISTINCT firstname, lastname, mbox
    FROM CONTEXT <...>
         {x} foaf:firstName {firstname};
             foaf:familyName {lastname};
             foaf:mbox {mbox}
    USING NAMESPACE foaf = <...>
  • or to retrieve the context associated with
    certain data

    SELECT DISTINCT c, firstname, lastname, mbox
    FROM CONTEXT c
         {x} foaf:firstName {firstname};
             foaf:familyName {lastname};
             foaf:mbox {mbox}
    USING NAMESPACE foaf = <...>

52
Querying Context in SeRQL
  • Combinations of contextualized and unrestricted
    clauses can be used to do mixed querying of
    domain information and context information

    SELECT DISTINCT c, version, firstname, lastname, mbox
    FROM {c} ex:version {version}
    FROM CONTEXT c
         {x} foaf:firstName {firstname};
             foaf:familyName {lastname};
             foaf:mbox {mbox}
    USING NAMESPACE foaf = <...>, ex = <...>

53
Context vs. Reification
  • Reification identifies individual triples,
    context identifies groups
  • RDF Reification has practical drawbacks
  • reifying a single triple takes at least four
    additional triples inefficient, non-scalable
  • there is no conceptual link between the reified
    object and the actual statement
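To make the "four additional triples" concrete, the sketch below spells out the standard RDF reification vocabulary for one statement. The class name and the statement identifier are hypothetical; the four properties (rdf:type rdf:Statement, rdf:subject, rdf:predicate, rdf:object) are the ones defined by RDF.

```java
import java.util.List;

/** Hypothetical sketch showing what reifying one statement costs. */
class ReificationSketch {
    record T(String s, String p, String o) {}

    /** Returns the four triples that describe statement st under identifier id. */
    static List<T> reify(String id, T st) {
        return List.of(
            new T(id, "rdf:type", "rdf:Statement"),
            new T(id, "rdf:subject", st.s()),
            new T(id, "rdf:predicate", st.p()),
            new T(id, "rdf:object", st.o()));
    }

    public static void main(String[] args) {
        T original = new T("person001", "foaf:firstName", "Jeen");
        reify("stmt1", original).forEach(System.out::println);
    }
}
```

Note that nothing in these four triples asserts the original statement; that is the missing conceptual link the slide refers to. With contexts, one identifier is shared by a whole group of statements instead.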

54
Reasoning and Context
  • Current RDFS reasoner works on the default
    context
  • all entailed triples are asserted without
    associated context
  • Other possible modes of operation
  • per-context, entailed triples asserted in context
  • default context, entailed triples asserted in
    context(s) of premises

55
Reasoning for OWL
  • OWLIM plugin support (by OntoText)
  • inductive, scalable reasoning over a pragmatic
    subset of OWL
  • Custom reasoner
  • rule-based reasoner with user-defined rules
  • can be used to capture (part of) the semantics of
    OWL Lite / DL.

56
Remote Access
  • HTTP REST API
  • extension of the SPARQL protocol
  • each repository and each context has a URI
  • operations on the repository by HTTP methods
    (GET, POST, PUT, DELETE) and parameters

    GET /sesame/testdb?query=select * from {x} p {y} HTTP/1.1
    Host: localhost
    Accept: application/sparql-results+xml

    POST /sesame/testdb HTTP/1.1
    Host: localhost
    Content-Type: application/rdf+xml

    [Transaction data]
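In a real request the query string has to be URL-encoded before it goes into the request line. The sketch below builds such a GET request with the JDK's own HTTP client classes and does not send it. The class name is hypothetical, and the http scheme and default port for localhost are assumptions; the repository path, query parameter, and Accept header are taken from the example above.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch: building (not sending) a query request. */
class QueryRequestSketch {
    /** Appends the URL-encoded query as the "query" parameter. */
    static String queryPath(String repositoryPath, String query) {
        return repositoryPath + "?query=" + URLEncoder.encode(query, StandardCharsets.UTF_8);
    }

    /** Builds a GET request asking for SPARQL XML results. */
    static HttpRequest buildRequest(String query) {
        // Assumption: server on localhost, plain http, default port.
        URI uri = URI.create("http://localhost" + queryPath("/sesame/testdb", query));
        return HttpRequest.newBuilder(uri)
                .header("Accept", "application/sparql-results+xml")
                .GET()
                .build();
    }

    public static void main(String[] args) {
        HttpRequest request = buildRequest("select * from {x} p {y}");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

URLEncoder uses form encoding, so spaces become "+" and the braces become %7B and %7D; a server that decodes query parameters in the usual way recovers the original query text.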
57
Remote Access Client API
  • A repository object can have a HTTPSail backend
  • The (clientside) repository behaves as any Sesame
    repository, but all operations are forwarded to a
    remote server through HTTP by the HTTPSail

    HTTPSail remoteStore = new HTTPSail("http://openrdf.org/sesame/testdb");
    Repository rep = new Repository(remoteStore);
    rep.initialize();
58
Sesame Native Store
  • Binary On-Disk persistence
  • Streaming access
  • B-Tree indexing
  • user configurable to optimize performance for
    particular access patterns
  • Design Goals
  • fast performance
  • high scalability
  • no installation hassle

59
Native Store Performance
  • LUBM Benchmark
  • generated data set and test queries
  • we used LUBM-500: about 69 million triples
  • hardware: 2.8 GHz P4 with 1 GB RAM, running Linux
    SuSE 10.0 (kernel 2.6), J2SE 1.5.0_06
  • native store setup: no RDFS entailment, SPOC and
    POSC indexes
  • Results
  • upload time: approx. 3 hours (avg. 5,700
    triples/s)
  • query performance on test queries
  • most queries evaluate within 10-50 ms or 1-5
    secs.
  • some (two or three) queries take long however
  • worst case we encountered was 1.5 hours
  • cause is typically many unrestricted conjunctions
  • Detailed results available on Sesame forum

60
Development status
  • Sesame 2.0-alpha-3
  • API tuned and (almost) stable
  • Some important features still missing
  • HTTPSail
  • Web Client Interface
  • MySQL storage backend
  • OWLIM plugin, custom reasoner
  • Final release expected end of July 2006

61
Pointers
  • Sesame website: http://www.openrdf.org/
  • Sesame 2 documentation
  • User Documentation: http://www.openrdf.org/doc/sesame2/users/
  • System Design Documentation: http://www.openrdf.org/doc/sesame2/system/
  • JavaDoc API: http://www.openrdf.org/doc/sesame2/api/
  • RDF(S) specs at W3C: http://www.w3.org/RDF
  • SPARQL / Data Access Working Group: http://www.w3.org/2001/sw/DataAccess/

62
Questions?