Title: The Policy-Aware Web: Privacy and Transparency on the Semantic Web
1The Policy-Aware Web: Privacy and Transparency on the Semantic Web
- Jim Hendler
- Hendler@cs.umd.edu
- http://www.cs.umd.edu/hendler
2004 NSF National Priorities ITR to UMCP and MIT (Hendler, Berners-Lee, Weitzner, PIs)
2Outline
- Motivation
- Example
- Digression
- Content
- Challenge(s)
- Summary
3"Because it's there"
5Access and Privacy Control
6As we publish more info, how do we control access?
Who can see what?
7Current Policy Languages
- A number of languages being explored
- P3P (data-centric relational semantics -> relational database)
- WS-Policy (propositional: and, or, but weak not)
- Features and Properties (no operators, easier to map to RDF)
- Combinators (choose one/all, similar to WS-Policy)
- KaOS Policy and Domain Services
- WSPL and EPAL (subsets of XACML)
- XACML (and, or, not, first and higher order bag functions)
- Rei (OWL-Lite, logic-like variables)
- A lot of ambiguity about exact expressivity and computational properties (or even the semantics!)
8An example WS-Policy
- WS-Policy provides a flexible grammar for expressing C&C of web services
- Normalized form (maybe to do non normalized)
- Two translation approaches
- Policies as Instances
- Readable, but hard to capture semantics
- Available at
- http://mindswap.org/dav/ontologies/ws-policy_instance.owl
- Policies as Classes
- Translate WS-Policy constructs into OWL constructs
- E.g., wsp:All --> owl:intersectionOf
9WS-Policy Example
<wsp:Policy>
  <wsp:ExactlyOne>
    <wsp:All>
      <wsse:SecurityToken>
        <wsse:TokenType>wsse:Kerberosv5TGT</wsse:TokenType>
      </wsse:SecurityToken>
    </wsp:All>
    <wsp:All>
      <wsse:SecurityToken>
        <wsse:TokenType>wsse:X509v3</wsse:TokenType>
      </wsse:SecurityToken>
    </wsp:All>
    <wsp:All>
      <wsse:SecurityToken>
        <wsse:TokenType>wsse:UserNameToken</wsse:TokenType>
      </wsse:SecurityToken>
    </wsp:All>
  </wsp:ExactlyOne>
</wsp:Policy>
10Mapping WS-Policy to OWL
- All is easy: it's logical conjunction (i.e., intersectionOf)
- exactlyOne is harder, two readings
- Older version: oneOrMore
- Inclusive OR, maps to owl:unionOf
- exactlyOne suggests XOR
- Have to map to a disjunction of conjunctions
- Quadratic increase in size of disjuncts
- Ontology: http://www.mindswap.org/dav/ontologies/policytest.owl
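The XOR reading can be sketched outside OWL to see where the quadratic growth comes from. The snippet below is only an illustration in Python, not the mindswap translation itself: it builds "union of all alternatives AND NOT union of every pairwise intersection" as nested tuples, checks it against a truth table, and counts the pairwise conjunctions. The function names and the tuple encoding are assumptions made for this sketch.

```python
from itertools import combinations, product

def exactly_one_expr(names):
    """Union of all names, minus the union of every pairwise intersection."""
    pairs = [("and", a, b) for a, b in combinations(names, 2)]
    return ("and", ("or", *names), ("not", ("or", *pairs)))

def evaluate(expr, env):
    """Evaluate a nested-tuple expression against a dict of truth values."""
    if isinstance(expr, str):
        return env[expr]
    op, *args = expr
    if op == "and":
        return all(evaluate(a, env) for a in args)
    if op == "or":
        return any(evaluate(a, env) for a in args)
    if op == "not":
        return not evaluate(args[0], env)
    raise ValueError(f"unknown operator: {op}")

def pairwise_count(names):
    """Number of pairwise conjunctions: n * (n - 1) / 2."""
    return len(list(combinations(names, 2)))

tokens = ["UsernameToken", "X509", "Kerberos"]
expr = exactly_one_expr(tokens)

# The expression should hold exactly when one alternative is chosen.
matches_exactly_one = all(
    evaluate(expr, dict(zip(tokens, values))) == (sum(values) == 1)
    for values in product([False, True], repeat=len(tokens))
)
print(matches_exactly_one, pairwise_count(tokens))
```

With three token types there are three pairwise conjunctions; with ten alternatives there would be 45, which is the quadratic increase noted on the slide.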
11Example
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix policytest: <http://www.mindswap.org/kolovski/policytest.owl#> .
policytest:TestPolicy a owl:Class ;
  owl:intersectionOf (
    [ owl:unionOf ( policytest:SecurityTokenTypeUsernameToken
                    policytest:SecurityTokenTypeX509
                    policytest:SecurityTokenTypeKerberos ) ]
    [ owl:complementOf [ owl:unionOf (
        [ owl:intersectionOf ( policytest:SecurityTokenTypeUsernameToken
                               policytest:SecurityTokenTypeX509 ) ]
        [ owl:intersectionOf ( policytest:SecurityTokenTypeUsernameToken
                               policytest:SecurityTokenTypeKerberos ) ]
        [ owl:intersectionOf ( policytest:SecurityTokenTypeX509
                               policytest:SecurityTokenTypeKerberos ) ] ) ] ]
  ) .
12Use OWL tools
13Digression
14SWOOP OWL ontology tool
16Ontology Debugging Service
- Example taken from Sweet-JPL OWL Ontology, where
13 out of 3000 axioms make one class
unsatisfiable
17Under the hood
- The Semantic Web vision requires "plumbing" that lives on the Web, but provides support for
- Ontologies linked together
- Reasoning that can scale
- Limited expressivity (OWL)
- Mixed Logics and Rules (RIF)
- Open World reasoning (CW is key to many algorithms' performance)
- "Hidden" logic: users want results, not symbols
- Modularity and collaboration
- Teams of people creating teams of ontologies
- And much more
- Triple store scaling, HTTP embedding (state free), URIs
18Pellet: a reasoner for the SemWeb
19Pellet OWL reasoner
- Description Logic reasoner based on tableaux algorithms
- Specifically designed for OWL
- Primarily for OWL-DL ontologies
- Heuristics to repair OWL Full ontologies
- Research extensions to OWL Full
- First reasoner to support all of OWL-DL
- Implements SHOIQ algorithm by Horrocks and Sattler
- Provides all the standard reasoning services
- KB consistency, concept satisfiability, classification, realization
- Plus
20Special Features
- Query Answering
- Conjunctive ABox queries expressed in RDQL or SPARQL
- Datatype Reasoning
- Check if the intersection of XML Schema datatypes is satisfiable
- Support reasoning with user-defined derived datatypes
- e.g. numeric or time intervals
- Multi-Ontology Reasoning using E-Connections
- Defining and instantiating combinations of OWL-DL ontologies
- An alternative to owl:imports
- Ontology Debugging
- Explaining the cause of unsatisfiable concepts
- Relations between unsatisfiable concepts
- Non-monotonic Reasoning with K-operator
- Closed-world queries using ALCK
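The datatype check mentioned above (is the intersection of two derived datatypes satisfiable?) can be illustrated for the simplest case of closed numeric intervals. This is a minimal sketch in Python, not Pellet's datatype reasoner; the `Interval` class and the example ranges are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Interval:
    """A closed numeric interval [low, high], standing in for a derived datatype."""
    low: float
    high: float

def intersect(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the overlap of two intervals, or None if it is empty (unsatisfiable)."""
    low, high = max(a.low, b.low), min(a.high, b.high)
    return Interval(low, high) if low <= high else None

# Hypothetical user-defined ranges.
adult = Interval(18, 120)
teen = Interval(13, 19)
child = Interval(0, 12)

print(intersect(adult, teen))   # overlapping ranges: satisfiable
print(intersect(adult, child))  # disjoint ranges: unsatisfiable
```

A class whose datatype restrictions intersect to nothing can have no instances, which is how an empty interval surfaces as an unsatisfiable concept.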
21Pellet (more)
- Coerces DL-izable OWL Full ontologies into OWL DL
- OWL Full and OWL DL can be unified
- Inverse functional properties on datatype properties
- Punning: metaclasses allowed
- Type assignment for untyped classes
- Combines inverse and nominal correctly (decidably)
- Extended datatype support (more built in and user defined datatypes)
- Incremental reasoning through update of the KB
- Optimized classification and realization (50 to order of magnitude improvements)
- Working on updating the completion graph to speed initial consistency check
22Performance
- Dynamic completion strategy selection based on the ontology expressivity
- Nominals (oneOf, hasValue), Inverse Properties (inverseOf), Individuals
- Includes standard optimization techniques
- Normalization, simplification, absorption, semantic branching, dependency directed backjumping, caching, model merging, binary instance retrieval
- Several novel optimizations (see KR 06 paper)
- Nominal absorption, learning-based disjunct selection, partial backjumping, nominal-based model merging, lazy forest generation, forest caching
23Applications using Pellet
- Ontology editing and management
- Available as a Swoop plug-in
- DIG interface to support Protégé
- Web Service composition
- Matchmaking for Web Services
- Reasoning about preconditions and effects
- Fujitsu Task Computing Environment
- Interacting with devices and Web Services
- Reasoning about policies
- Policy consistency, policy containment, etc.
- Process WS-Policy descriptions
24Policy Aware Web
(NSF ITR Hendler, Berners-Lee, Weitzner 2005)
25PAW demo
27Use case: A Web browser requests the home page for a Girl Scout troop and is given it by a Web server.
(Diagram labels: Web Server, Content)
28However, requests for images result in HTTP Error 401, Unauthorized
(Diagram labels: Web Server, Content, 401)
29The 401 Unauthorized response has been modified to provide a URL to a policy
HTTP/1.1 401 Not authorized
Date: Sat, 03 Dec 2005 15:32:18 GMT
Server: TwistedWeb/2.0.1
Policy: http://groups.csail.mit.edu/dig/2005/09/rein/examples/troop42-policy.n3
Content-type: text/html; charset=UTF-8
Connection: close
10:32:20 ERROR 401: Not authorized.
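A client that wants to act on this response first has to pull the policy URL out of the headers. The following is a rough Python sketch of that step, assuming the response uses ordinary `Name: value` header syntax and the non-standard `Policy` header shown on the slide; the function name is hypothetical and this is not the PAW proxy's code.

```python
from email.parser import Parser

# The response from the slide, written out with conventional header syntax.
RAW_RESPONSE = (
    "HTTP/1.1 401 Not authorized\r\n"
    "Date: Sat, 03 Dec 2005 15:32:18 GMT\r\n"
    "Server: TwistedWeb/2.0.1\r\n"
    "Policy: http://groups.csail.mit.edu/dig/2005/09/rein/examples/troop42-policy.n3\r\n"
    "Content-type: text/html; charset=UTF-8\r\n"
    "Connection: close\r\n"
    "\r\n"
)

def policy_url_from_response(raw):
    """Return the Policy header of a 401 response, or None for any other status."""
    status_line, _, header_block = raw.partition("\r\n")
    status = int(status_line.split(" ", 2)[1])
    if status != 401:
        return None
    # HTTP headers share the RFC 822 style that the email parser understands.
    headers = Parser().parsestr(header_block, headersonly=True)
    return headers.get("Policy")

print(policy_url_from_response(RAW_RESPONSE))
```

The returned URL is what a proof-generating client would fetch next in order to learn which rules it has to satisfy.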
30Policies use linked rules
- Example policies
- Photos taken at meetings of the troop can be shared with any current member of the troop.
- Photos taken at a jamboree can be shared with anyone in the troop or with anyone who attended the jamboree.
- Photos of any girl in the troop can be shared with the world if that girl's parent has given permission
REQ a rein:Request. REQ rein:resource PHOTO.
?F a TroopStuff log:includes PHOTO a t:Photo t:location LOC. LOC a t:Meeting .
REQ rein:requester WHO. WHO session:secret ?S. ?S crypto:md5 TXT.
?F a TroopStuff log:includes t:member is foaf:maker of PG .
LOC t:attendee is foaf:maker of PG .
PG log:semantics log:includes PG foaf:maker session:hexdigest TXT .
=> WHO http:can-get PHOTO .
31Rein "ontology"
32Rein example
<http://dig.csail.mit.edu/2005/09/rein/examples/troop42.rdf> log:semantics ?F => ?F a TroopStuff .

# Photos taken at meetings of the troop can be shared with any current member of the troop
REQ a rein:Request. REQ rein:resource PHOTO.
?F a TroopStuff log:includes PHOTO a t:Photo t:location LOC. LOC a t:Meeting .
REQ rein:requester WHO. WHO session:secret ?S. ?S crypto:md5 TXT.
?F a TroopStuff log:includes t:member is foaf:maker of PG .
LOC t:attendee is foaf:maker of PG .
PG log:semantics log:includes PG foaf:maker session:hexdigest TXT .
=> WHO http:can-get PHOTO .

# Photos taken at a jamboree can be shared with anyone in the troop or with anyone who attended the jamboree.
# (i) anyone who is in the troop
REQ a rein:Request. REQ rein:resource PHOTO.
?F a TroopStuff log:includes PHOTO a t:Photo t:location LOC. LOC a t:Jamboree .
REQ rein:requester WHO. WHO session:secret ?S. ?S crypto:md5 TXT.
?F a TroopStuff log:includes t:member is foaf:maker of PG . .
PG log:semantics log:includes PG foaf:maker session:hexdigest TXT .
=> WHO http:can-get PHOTO .

# (ii) anyone who attended the jamboree
REQ a rein:Request. REQ rein:resource PHOTO.
?F a TroopStuff log:includes PHOTO a t:Photo t:location LOC. LOC a t:Jamboree .
REQ rein:requester WHO. WHO session:secret ?S. ?S crypto:md5 TXT.
?F a TroopStuff log:includes LOC t:attendee is foaf:maker of PG . .
PG log:semantics log:includes PG foaf:maker session:hexdigest TXT .
=> WHO http:can-get PHOTO .
The RDF/XML syntax is even worse
Authorability/Editability are important issues
Specialized use (cf. Creative Commons) a
partial out.
33Use of the PAW proof-generation proxy results in a proof which satisfies the policy
(Diagram labels: Web Server, Proof)
Third-party services may be consulted to help construct the proof.
34- The proxy
- Uses Rein, a policy engine, to specify rules which match a given policy.
- The Rein rules are run in Cwm, a forward-chaining reasoner for the Semantic Web. This generates a proof.
- The proof is HTTP-PUT on the server, and an HTTP-GET on the same document is then invoked (requires HTTP 1.1)
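To make "forward-chaining" and "generates a proof" concrete, here is a toy Python sketch. It is not Cwm or Rein: facts are plain triples, variables start with `?`, and the single rule is a simplified stand-in for the meeting-photo policy (the session-secret check is left out). All names in it are assumptions for illustration.

```python
def unify(pattern, fact, bindings):
    """Extend bindings so that pattern equals fact, or return None."""
    extended = dict(bindings)
    for term, value in zip(pattern, fact):
        if term.startswith("?"):
            if extended.setdefault(term, value) != value:
                return None
        elif term != value:
            return None
    return extended

def match_all(premises, facts, bindings):
    """Yield every set of bindings that satisfies all premises."""
    if not premises:
        yield bindings
        return
    for fact in facts:
        extended = unify(premises[0], fact, bindings)
        if extended is not None:
            yield from match_all(premises[1:], facts, extended)

def substitute(pattern, bindings):
    return tuple(bindings.get(term, term) for term in pattern)

def forward_chain(facts, rules):
    """Apply rules until nothing new is derived; record how each fact was derived."""
    known, proof = set(facts), {}
    changed = True
    while changed:
        changed = False
        for name, premises, conclusion in rules:
            for bindings in list(match_all(premises, known, {})):
                derived = substitute(conclusion, bindings)
                if derived not in known:
                    known.add(derived)
                    proof[derived] = (name, [substitute(p, bindings) for p in premises])
                    changed = True
    return known, proof

facts = {
    ("photo1", "location", "meeting7"), ("meeting7", "type", "Meeting"),
    ("alice", "memberOf", "troop42"),
    ("req1", "resource", "photo1"), ("req1", "requester", "alice"),
    ("req2", "resource", "photo1"), ("req2", "requester", "mallory"),
}
rules = [("meeting-photo-rule",
          [("?req", "resource", "?photo"), ("?photo", "location", "?loc"),
           ("?loc", "type", "Meeting"), ("?req", "requester", "?who"),
           ("?who", "memberOf", "troop42")],
          ("?who", "can-get", "?photo"))]

known, proof = forward_chain(facts, rules)
print(proof[("alice", "can-get", "photo1")][0])
```

The `proof` dictionary records, for each derived permission, the rule and the instantiated premises that produced it; that record is the kind of artifact a proxy could hand to a server for checking.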
35The Web server checks the proof and serves the
content if it is valid.
Web Server
Content
36- The server
- Uses Cwm to validate the proof.
- Takes action based on validation (serves content
or denies).
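Checking a proof is cheaper than finding one: the server only has to replay each step. The sketch below shows that idea in Python under simplifying assumptions (ground triples, a proof given as a list of steps with explicit bindings). It is not Cwm's proof format, and the step layout, rule name, and facts are hypothetical.

```python
def instantiate(pattern, bindings):
    """Replace variables in a triple pattern with their bound values."""
    return tuple(bindings.get(term, term) for term in pattern)

def check_step(step, rules, known):
    """A step is valid if its rule exists, its instantiated premises are all
    known, and its claimed conclusion is what the rule yields."""
    rule = rules.get(step["rule"])
    if rule is None:
        return False
    premises, conclusion = rule
    if any(instantiate(p, step["bindings"]) not in known for p in premises):
        return False
    return instantiate(conclusion, step["bindings"]) == step["concludes"]

def check_proof(proof, rules, trusted_facts, goal):
    """Replay the proof step by step; accept only if the goal is reached."""
    known = set(trusted_facts)
    for step in proof:
        if not check_step(step, rules, known):
            return False
        known.add(step["concludes"])
    return goal in known

rules = {"member-may-get": ([("?who", "memberOf", "troop42"),
                             ("?photo", "takenAt", "meeting")],
                            ("?who", "can-get", "?photo"))}
trusted = {("alice", "memberOf", "troop42"), ("photo1", "takenAt", "meeting")}

good = [{"rule": "member-may-get",
         "bindings": {"?who": "alice", "?photo": "photo1"},
         "concludes": ("alice", "can-get", "photo1")}]
bad = [{"rule": "member-may-get",
        "bindings": {"?who": "mallory", "?photo": "photo1"},
        "concludes": ("mallory", "can-get", "photo1")}]

print(check_proof(good, rules, trusted, ("alice", "can-get", "photo1")))
print(check_proof(bad, rules, trusted, ("mallory", "can-get", "photo1")))
```

The server serves the content only when the replay succeeds; a proof that leans on a fact the server does not trust fails at the first offending step.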
37- Current demo work
- Make use of multiple distributed authentication systems (instead of holding secrets in the proxy).
- Associate content with RDF metadata and base policy decisions on the RDF
- Address issues of eventual integration of the proxy with a Web browser (e.g. cookie storage).
- Extend system to "distributed" scenarios (different authorities hold parts of policy, may have own rules on access)
- Attack user interface issues
38Open, Distributed Policy Challenges
- Identity vs. privacy
- How do you identify yourself w/o violating the very privacy concerns we hope to address?
- Current identity schemes are centralized and universal
- Can we do a distributed ID model (maybe email based)?
- Inconsistency
- In logic "P & -P => Q"
- On Web it better not!
- (Supports(Hillary) & -Supports(Hillary)) => you owe me 1000
- Can we use a "non-standard" logic solution?
- Provenance and downstream tracking
- As information flows through the system, later access may depend on earlier decisions
- Policies often dependent on use context
- Policies may change depending on how information was acquired
39Provenance Tracking on the Semantic Web
- Provenance of Data
- Who or what services created/input the data
- Files on which the data depends
- Date and time of creation
- Steps taken to compute / produce the data
- "recursively" ground to the above
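The "recursively ground" step can be pictured as a walk over dependency links. This is a small Python sketch under assumed names (the `Provenance` record, its fields, and the example URIs are all hypothetical), not the provenance ontology used in the talk.

```python
from dataclasses import dataclass, field

@dataclass
class Provenance:
    """One provenance record: who made the data, when, how, and from what."""
    uri: str
    creator: str
    created: str
    steps: list = field(default_factory=list)
    depends_on: list = field(default_factory=list)  # other Provenance records

def ground(record, seen=None):
    """Collect the URIs of a record and everything it transitively depends on."""
    seen = set() if seen is None else seen
    if record.uri in seen:
        return seen  # already visited; also guards against cycles
    seen.add(record.uri)
    for dependency in record.depends_on:
        ground(dependency, seen)
    return seen

raw = Provenance("http://example.org/data/raw", "sensor-service", "2006-01-01T10:00:00")
atlas = Provenance("http://example.org/data/atlas", "align-service", "2006-01-01T11:00:00",
                   steps=["align", "reslice"], depends_on=[raw])
image = Provenance("http://example.org/data/image", "convert-service", "2006-01-01T12:00:00",
                   steps=["convert"], depends_on=[atlas])

print(sorted(ground(image)))
```

Because every record is named by a URI, the same walk works whether the dependencies live in one file or are fetched from several places on the Web.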
40Producing Provenance Data
- On the Semantic Web
- Provenance can be stored and tracked
- Services represented by Service Descriptions
- All files created and referenced by URIs
- Web service executes and also outputs an OWL model of the service execution, including all provenance data
- Service outputs a file with provenance for each output file
- Semantic Web triple stores maintain mapping to this file from triples or subgraphs
41"Magic" is in URIs
Every piece of data gets its own "web page"
42Ontology for provenance
The "Web page" itself is machine-readable (OWL)
43Validation: IPAW Provenance Challenge
- E.g. A user has run the workflow twice, in the second instance replacing each procedure (convert) in the final stage with two procedures: pgmtoppm, then pnmtojpeg. Find the differences between the two workflow runs.
Answered every query successfully
44Dana's Challenge
- All data directly output from a Predator UAV is classified.
- Classified data combined with unclassified data is considered classified.
- Classified data can only be viewed by persons with top secret clearance, with the following exceptions:
- In warfare conditions, unclassified persons may view perishable data that is classified if the person's life is threatened due to lack of that data and if the person's superior has top secret clearance and has approved such viewing.
Can we apply PAW to Army policies w/in B3AN?
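The rules in this challenge are compact enough to write down directly, which helps show what a policy engine would need to track. The Python sketch below is an illustrative encoding only: the class and field names are hypothetical, and two points the slide does not state are assumptions (unclassified data is viewable by anyone, and whether combined data is perishable is supplied by the caller).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Data:
    classified: bool
    perishable: bool = False

@dataclass(frozen=True)
class Person:
    top_secret: bool = False
    life_threatened_without_data: bool = False
    superior_top_secret: bool = False
    superior_approved: bool = False

def from_predator_uav(perishable=False):
    """All data directly output from a Predator UAV is classified."""
    return Data(classified=True, perishable=perishable)

def combine(a, b, perishable=False):
    """Classified combined with unclassified is classified.
    Perishability of the result is an input here (assumption)."""
    return Data(classified=a.classified or b.classified, perishable=perishable)

def may_view(person, data, warfare=False):
    if not data.classified:
        return True  # assumption: no restriction on unclassified data
    if person.top_secret:
        return True
    # The warfare exception: every condition must hold.
    return (warfare and data.perishable
            and person.life_threatened_without_data
            and person.superior_top_secret
            and person.superior_approved)

feed = from_predator_uav(perishable=True)
merged = combine(feed, Data(classified=False))
soldier = Person(life_threatened_without_data=True,
                 superior_top_secret=True, superior_approved=True)

print(may_view(soldier, feed, warfare=True))    # exception applies
print(may_view(soldier, feed, warfare=False))   # no exception in peacetime
print(may_view(soldier, merged, warfare=True))  # merged data not marked perishable
```

Even this small encoding depends on provenance (where the data came from), context (warfare or not), and third-party approval, which are the same ingredients the PAW proof approach has to handle.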
45Conclusions
- Information lives in specific contexts
- The Semantic Web helps us place information into these (multiple) contexts.
- Control of information requires control of contexts
- Explication of policies
- Linked in a Web-like way
- Integrated directly into the Web
- With extensions for rules and proofs
- Is really hard
- Issues of identity, inconsistency, provenance, change over time
- But holds great potential
- Flexible and adaptive
- "Policy-Aware" Web project (joint between UMCP and MIT)
- First step towards "Semantic Accountability" applications
http://www.policyawareweb.org/
47Another Cool thing
- What is a rule of logic?
- In traditional philosophy it relates to "Truth"
- What is truth on the Web?
- Ex: How many cows are in Texas?
- On the Web, we could use an idea of agreed upon rules, grounded at URI
- Social definition of truth via shared contexts
- Ex: Because Mom said so
48Truth on Web Pages (based on Heflin et al., 1998)
- Inference rules could be used to determine the credibility of claims
- I might believe the claims made by a reliable newspaper
- Trustable(x) :- x reliableNewspaper.
- And I could establish the Washington Post as reliable...
- i.e. I assert
- http://www.washingtonpost.com owl:class reliableNewspaper.
- or if I infer it
- ReliableNewspaper(X) ->
- X owl:class ReliableNewspaper http://MediaWatchList.
- (?) reliableNewspaper(X) :-
- X owl:class ReliableNewspaper src trusted(src).
- The rules are "grounded" in a testable way
- cf. If I can HTTP-get the fact, then it is asserted
49Rule Sets could be shared
- You can ground your sources
- X :- X src src owl:class TrustedSource http:///myMomSet.rdf
- Or infer trusted sources based on other rule sets
- X :- X src src owl:class TrustedSource http://ex.com/RushLimbaughSet.rdf
- X :- X src src owl:class TrustedSource http://ex.com/UnabomberRules.rdf
- --( X http://www.rushLimbaugh.com/truths.rdf)
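The idea that belief depends on which shared rule set you adopt can be shown with a very small Python sketch. The claims, source URLs, and set names below are made up for illustration and do not come from any real trust list.

```python
def believed(claims, trusted_sources):
    """Keep only the claims whose source is in the adopted trusted-source set."""
    return {claim for claim, source in claims if source in trusted_sources}

# (claim, source) pairs; every value here is hypothetical.
claims = [
    ("bedtime is 9pm", "http://ex.com/mom.rdf"),
    ("taxes are too high", "http://ex.com/talkradio.rdf"),
    ("it will rain tomorrow", "http://ex.com/weather.rdf"),
]

# Two shareable rule sets that ground trust in different sources.
my_mom_set = {"http://ex.com/mom.rdf", "http://ex.com/weather.rdf"}
talk_radio_set = {"http://ex.com/talkradio.rdf"}

print(sorted(believed(claims, my_mom_set)))
print(sorted(believed(claims, talk_radio_set)))
```

Two readers holding the same claims end up with different belief sets, and each can say exactly which published rule set its conclusions are grounded in.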
50Annotated Logic (in 25 words or less)
- Traditional Logic
- P & -P => Q (P and -P are inconsistent)
- Annotated Logic
- P:X & -P:Y are not inconsistent
- P:X & -P:X => Q:X but not Q:Y
- P:X & -(P:X) is inconsistent and must be avoided (but this is easily checked if inference of RHS is restricted)
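The practical effect of annotation is that a contradiction stays inside the context that contains it. A minimal Python sketch of that bookkeeping follows; the tuple encoding of annotated statements is an assumption for illustration and does not implement a full annotated logic.

```python
from collections import defaultdict

def inconsistent_contexts(statements):
    """statements: iterable of (proposition, truth_value, context).
    A context is inconsistent if it asserts some proposition both ways."""
    values_seen = defaultdict(set)
    for proposition, value, context in statements:
        values_seen[(proposition, context)].add(value)
    return {context for (_, context), values in values_seen.items()
            if len(values) == 2}

# P:X and -P:Y disagree, but in different contexts: nothing is inconsistent.
across_contexts = [("P", True, "X"), ("P", False, "Y")]

# P:X and -P:X clash inside X; Y is left untouched.
within_context = [("P", True, "X"), ("P", False, "X"), ("R", True, "Y")]

print(inconsistent_contexts(across_contexts))
print(inconsistent_contexts(within_context))
```

Anything derived from the clash would carry the X annotation, so conclusions annotated with Y remain usable.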
51On the Web
<foaf:Person>
  <foaf:name>Jim Hendler</foaf:name>
  <foaf:title>Dr</foaf:title>
  <foaf:firstName>Jim</foaf:firstName>
  <foaf:surname>Hendler</foaf:surname>
  <foaf:mbox_sha1sum>be972c7a602683f7cf3c7a1fd0949c565debe4d3</foaf:mbox_sha1sum>
  <foaf:homepage rdf:resource="http://www.cs.umd.edu/hendler"/>
  <foaf:depiction rdf:resource="http://www.semanticgrid.org/q-iantbljim.jpg"/>
  <foaf:workplaceHomepage rdf:resource="http://owl.mindswap.org"/>
</foaf:Person>
<foaf:name>Jim Hendler</foaf:name>
http://www.cs.umd.edu/hendler/2003/foaf.rdf
http://www.cs.umd.edu/hendler/2003/foaf.rdf
- Annotations represent document contexts
- X:Y and -(X:Y) cannot co-occur (unless Web is broken)
- (modulo temporal change, but that's another talk)
52Leveraging Work in
Link-mining in PiT data (w/Getoor)
Policy Aware Web (w/Berners-Lee)
Incremental OWL reasoning
Ontology debugging