Finding and Ranking Knowledge on the Semantic Web

1
Finding and Ranking Knowledge on the Semantic Web
  • Li Ding, Rong Pan, Tim Finin, Anupam Joshi, Yun
    Peng and Pranam Kolari
  • University of Maryland, Baltimore County

http://creativecommons.org/licenses/by-nc-sa/2.0/
This work was partially supported by DARPA
contract F30602-97-1-0215, NSF grants CCR007080
and IIS9875433, and grants from IBM, Fujitsu and
HP.
2
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

3
Google has made us smarter
4
But what about our agents?
  • A Google for knowledge on the Semantic Web is
    needed by people and software agents

5
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

7
Swoogle Architecture
[Figure: Swoogle architecture. Layers: SWD discovery, metadata creation, data analysis, interface. Components: the Web, Web Crawler, Candidate URLs, SWD Reader, SWD Cache, SWD Metadata, IR analyzer, SWD analyzer, Web Server, Web Service, Agent Service.]
Swoogle 2 (4/05): 340K SWDs, 48M triples, 5K SWOs, 97K classes, 55K properties, 7M individuals
Swoogle 3 (11/05): 700K SWDs, 135M triples, 7.7K SWOs
8
Find Time Ontology
Demo1
We can use a set of keywords to search for an ontology.
For example, time, before and after are basic
concepts for a Time ontology.
9
Digest Time Ontology (document view)
Demo2(a)
10
Digest Time Ontology (term view)
Demo2(b)
Terms shown: TimeZone, before, ..., intAfter
11
Find Term Person
Demo3
Not capitalized! URIref is case sensitive!
12
Digest Term Person
Demo4
167 different properties
562 different properties
13
Demo5(a)
Swoogle Today
14
Demo5(b)
Swoogle Statistics
FOAF
Trustix
W3C
Stanford
15
Swoogle's Triple Store lets you shop
and check out your triples into any of several
reasoners
16
Summary
2004
Swoogle (Mar, 2004)
  • Automated SWD discovery
  • SWD metadata creation and search
  • Ontology rank (rational surfer model)
  • Swoogle watch
  • Web Interface

Swoogle2 (Sep, 2004)
  • Ontology dictionary
  • Swoogle statistics
  • Web service interface (WSDL)
  • Bag of URIref IR search
  • Triple shopping cart

2005
Swoogle3 (July 2005)
  • Better (re-)crawling strategies
  • Better navigation models
  • Index instance data
  • More metadata (ontology mapping and OWL-S
    services)
  • Better web service interfaces
  • IR component for string literals
17
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

18
The Semantic Web Onion
  • Universal RDF Graph: the Semantic Web (about 10M documents)
  • RDF Document: physically hosts knowledge (about 100 triples
    per SWD on average)
  • Class-instance: triples modifying the same subject
  • Molecule: finest lossless set of triples
  • Triple: atomic knowledge block

Swoogle maintains metadata about objects in
different layers of the Semantic Web Onion.
19
Semantic Web Navigation Model
Navigating the HTML web is simple: there's just
one kind of link. The SW has more kinds of links
and hence more navigation paths.
20
Semantic Web Navigation Model
[Figure: navigation model. Nodes: the Web, RDF graph, Resource, literal, SWT, SWD, SWO. Entry points: Term Search and Document Search. Link groups, numbered 1 to 7 in the figure: sameNamespace, sameLocalname, extends, class-property bond; uses, populates; defines, officialOnto, isDefinedBy; isUsedBy, isPopulatedBy; rdfs:subClassOf; rdfs:seeAlso, rdfs:isDefinedBy; owl:imports.]
Relations in 1 and 3 and parts of 4 require a
global view to discover
21
An Example
[Figure: example RDF graph spanning several documents.
Documents: http://xmlns.com/foaf/0.1/index.rdf, http://www.w3.org/2002/07/owl, http://www.cs.umbc.edu/finin/foaf.rdf, http://www.cs.umbc.edu/dingli1/foaf.rdf.
Terms: owl:Class, owl:InverseFunctionalProperty, owl:Thing, foaf:Person, foaf:Agent, foaf:mbox.
Edge labels: owl:imports, rdf:type, rdfs:subClassOf, rdfs:domain, rdfs:range, rdfs:seeAlso, foaf:mbox.
Literal and link targets: mailto:finin_at_umbc.edu, http://www.cs.umbc.edu/finin/foaf.rdf.]
We navigate the Semantic Web via links in the
physical layer of RDF documents and also via
links in the logical layer defined by the
semantics of RDF and OWL.
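The two kinds of navigation can be sketched in a few lines of Python. This is only an illustration, not Swoogle's implementation: the triples are one plausible reading of the figure, the blank-node names (`_:p1`, `_:p2`) are hypothetical, and the helper names `linked_documents` and `populated_by` are invented here.

```python
# Illustrative triples per document. Which triple lives in which document is an
# assumption; "_:p1" and "_:p2" are hypothetical blank-node names.
DOCS = {
    "http://xmlns.com/foaf/0.1/index.rdf": [
        ("http://xmlns.com/foaf/0.1/index.rdf", "owl:imports",
         "http://www.w3.org/2002/07/owl"),
        ("foaf:Person", "rdf:type", "owl:Class"),
        ("foaf:Person", "rdfs:subClassOf", "foaf:Agent"),
        ("foaf:mbox", "rdf:type", "owl:InverseFunctionalProperty"),
    ],
    "http://www.cs.umbc.edu/finin/foaf.rdf": [
        ("_:p1", "rdf:type", "foaf:Person"),
        ("_:p1", "foaf:mbox", "mailto:finin_at_umbc.edu"),
    ],
    "http://www.cs.umbc.edu/dingli1/foaf.rdf": [
        ("_:p2", "rdf:type", "foaf:Person"),
        ("_:p2", "rdfs:seeAlso", "http://www.cs.umbc.edu/finin/foaf.rdf"),
    ],
}

# Properties whose objects point at other documents (physical layer).
DOCUMENT_LINKS = {"owl:imports", "rdfs:seeAlso", "rdfs:isDefinedBy"}


def linked_documents(docs, url):
    """Physical layer: documents that `url` points to directly."""
    return sorted({o for _, p, o in docs[url] if p in DOCUMENT_LINKS})


def populated_by(docs, cls):
    """Logical layer: documents containing an instance of class `cls`.

    Answering this needs a view over all documents, not just one.
    """
    return sorted(
        url
        for url, triples in docs.items()
        if any(p == "rdf:type" and o == cls for _, p, o in triples)
    )
```

Calling `populated_by(DOCS, "foaf:Person")` scans every document, which is why such relations need a global index rather than a single fetch.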
22
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

23
Rank has its privilege
  • Google introduced a new approach to ranking query
    results using a simple popularity metric.
  • It was a big improvement!
  • Swoogle also ranks its query results
  • When searching for an ontology, class or
    property, wouldn't one want to see the most used
    ones first?
  • Ranking SW content requires different algorithms
    for different kinds of SW objects
  • For SWDs, SWTs, individuals, assertions,
    molecules, etc.

24
Google's PageRank
  • A page's rank is a function of how many links
    point to it and the rank of the pages hosting
    those links.
  • The random surfer model provides the intuition:
    1. Jump to a random page
    2. Select and follow a random link on the page and
       repeat until bored
    3. If bored, go to (1)
  • Pages are ranked by the relative frequency with which
    they are visited.
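The random surfer intuition can be sketched as a short power iteration. This is a minimal illustration rather than Google's implementation: the damping factor 0.85 (the chance of following a link instead of getting bored) and the function name `pagerank` are assumptions, not values from the slides.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Estimate how often a random surfer visits each page.

    `links` maps each page to the list of pages it links to.
    `damping` is the assumed probability of following a link;
    with probability 1 - damping the surfer is bored and jumps
    to a random page.
    """
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page receives the "bored, jump anywhere" share.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Follow one of the outgoing links uniformly at random.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # No outgoing links: treat it as a forced random jump.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank
```

For a toy graph such as `{"a": ["c"], "b": ["c"], "c": ["a"]}`, page `c` ends up with the highest rank because two pages link to it, and `b` ends up lowest because nothing links to it.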
25
Ranking Semantic Web Documents
  • Target a pure SW dataset
  • Nodes: a collection of online SWDs (330K SWDs,
    1.5% are labeled as ontologies)
  • Links: in addition to hyperlinks, term-level
    relations are generalized into TM, EX, IM.
  • Rational surfer model (extension of weighted
    PageRank)
  • Semantic content (term-level relations) is encoded
    into links
  • The rank of a node is iteratively spread via links
  • The weight/capacity of a link varies according to link
    semantics
  • Weight is propagated to imported ontologies
  • Evaluation
  • Method: compare OntoRank with PageRank for
    promoting ontologies, using the same pure SW
    dataset
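A weighted PageRank in which the link type sets the link's capacity might look like the sketch below. The numeric weights in `LINK_WEIGHTS`, the damping factor and the function name are all hypothetical; the slides say only that weights vary with link semantics, not what the values are.

```python
# Hypothetical capacities per link type; the slides do not give values.
LINK_WEIGHTS = {"hyperlink": 1.0, "TM": 1.0, "EX": 2.0, "IM": 3.0}


def weighted_pagerank(links, damping=0.85, iterations=50):
    """Weighted PageRank over typed links.

    `links` is a list of (source, target, link_type) tuples. A surfer
    leaving `source` picks a link with probability proportional to the
    weight of its type instead of uniformly at random.
    """
    nodes = {node for source, target, _ in links for node in (source, target)}
    n = len(nodes)
    out_weight = {node: 0.0 for node in nodes}
    for source, _, kind in links:
        out_weight[source] += LINK_WEIGHTS[kind]

    rank = {node: 1.0 / n for node in nodes}
    for _ in range(iterations):
        new_rank = {node: (1.0 - damping) / n for node in nodes}
        # Spread each node's rank along its links, proportional to weight.
        for source, target, kind in links:
            fraction = LINK_WEIGHTS[kind] / out_weight[source]
            new_rank[target] += damping * rank[source] * fraction
        # Nodes without outgoing links spread their rank evenly.
        dangling = sum(rank[node] for node in nodes if out_weight[node] == 0.0)
        for node in nodes:
            new_rank[node] += damping * dangling / n
        rank = new_rank
    return rank
```

With `[("doc", "onto_a", "TM"), ("doc", "onto_b", "IM")]`, `onto_b` receives three quarters of what `doc` passes on under these assumed weights, so it outranks `onto_a`.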

26
An Example
  • http://www.w3.org/2000/01/rdf-schema: wPR 300, OntoRank 403
  • http://xmlns.com/wordnet/1.6/: wPR 3, OntoRank 103
  • http://xmlns.com/foaf/1.0/: wPR 100, OntoRank 100
  • http://www.cs.umbc.edu/finin/foaf.rdf: wPR 0.2, OntoRank 0.2
[Figure: the four documents are connected by links labeled TM, TM, EX and TM.]
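The numbers in this example are consistent with a simple accumulation rule, sketched below: an ontology's OntoRank is its own wPR plus the wPR of every ontology that links to it, directly or transitively. Both the rule and the link endpoints in `LINKS` are assumptions chosen to reproduce the figure's values; the short names stand in for the four URLs.

```python
# wPR values from the example; short names stand in for the URLs.
WPR = {"rdfs": 300.0, "wordnet": 3.0, "foaf": 100.0, "finin_foaf": 0.2}

# Assumption: the three vocabularies are ontologies, the FOAF file is not.
ONTOLOGIES = {"rdfs", "wordnet", "foaf"}

# Assumed link endpoints (source, target, type); the figure only shows labels.
LINKS = [
    ("wordnet", "rdfs", "TM"),
    ("foaf", "rdfs", "TM"),
    ("foaf", "wordnet", "EX"),
    ("finin_foaf", "foaf", "TM"),
]


def onto_rank(wpr, links, ontologies):
    """Add to each document the wPR of all ontologies that reach it."""
    # dependents[t] = ontologies with a direct link to t
    dependents = {}
    for source, target, _ in links:
        if source in ontologies:
            dependents.setdefault(target, set()).add(source)

    result = {}
    for doc in wpr:
        seen = set()
        stack = list(dependents.get(doc, ()))
        while stack:  # walk the links backwards, transitively
            current = stack.pop()
            if current in seen or current == doc:
                continue
            seen.add(current)
            stack.extend(dependents.get(current, ()))
        result[doc] = wpr[doc] + sum(wpr[x] for x in seen)
    return result
```

Under these assumptions `onto_rank(WPR, LINKS, ONTOLOGIES)` gives 403 for `rdfs` (300 + 3 + 100), 103 for `wordnet` (3 + 100), and leaves `foaf` at 100 because the only document linking to it is not an ontology.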
27
Ontology Dictionary
  • Motivation
  • One ontology does not always provide all the needed
    vocabulary
  • There could be many scenarios that require
    assembling terms from multiple ontologies
  • DIY ontology engineering
    1. Search for an appropriate class C
    2. Search for popular properties used for modifying
       instances of class C
    3. Go back to step 1 if more classes are needed

28
Ranking Semantic Web Terms
  • Pr(Term | Doc) can be measured by the normalized
    value of the product of the term's
  • Popularity: how many SWDs use the term
  • Frequency: how many times the term is used in
    the SWD
  • SWDs are accessed non-uniformly, according to OntoRank
  • TermRank estimates a term's importance as
  • TermRank(Term) = Σ_Doc Pr(Term | Doc) × OntoRank(Doc)
  • Evaluation
  • Compare TermRank with term popularity for the
    top 10 highest rated terms and provide an analytical
    evaluation.
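Read literally, the bullets above describe the following computation. This is a sketch under assumptions: normalizing within each document is my reading of "normalized value", and the function name `term_rank` and the input layout are invented for illustration.

```python
def term_rank(usage, onto_rank):
    """Estimate term importance from per-document usage counts.

    `usage[doc][term]` is how many times `term` is used in `doc`
    (frequency). Popularity is the number of documents using the term.
    Pr(term | doc) is taken as frequency * popularity, normalized so
    that the values for one document sum to 1 (an assumption).
    """
    popularity = {}
    for counts in usage.values():
        for term, count in counts.items():
            if count > 0:
                popularity[term] = popularity.get(term, 0) + 1

    ranks = {}
    for doc, counts in usage.items():
        weights = {t: c * popularity[t] for t, c in counts.items() if c > 0}
        total = sum(weights.values())
        if total == 0:
            continue
        for term, weight in weights.items():
            # Each document hands out its OntoRank in proportion to Pr(term | doc).
            share = (weight / total) * onto_rank.get(doc, 0.0)
            ranks[term] = ranks.get(term, 0.0) + share
    return ranks
```

Because each document distributes exactly its own OntoRank among its terms, the TermRank values sum to the total OntoRank of the documents that use at least one term.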

29
Class-Property Bonds
  • Class-Property Bond (introduced by ontology)
  • foaf:mbox
  • foaf:name
  • Class Definition
  • rdfs:subClassOf -- foaf:Agent
  • rdfs:label -- Person
  • Class-Property Bond (introduced by instances)
  • foaf:name
  • dc:title

[Figure: foaf:Person as described across three documents, SWD1, SWD2 and SWD3. Edge labels: rdf:type, rdfs:subClassOf, rdfs:domain, rdfs:comment, foaf:name, foaf:mbox, dc:title. Other nodes and literals: owl:Class, foaf:Agent, "Tim Finin", "Tim's FOAF File", "a human being".]
30
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

31
Supporting Semantic Web Developers
  • Finding SW content
  • Ontologies, classes, properties, molecules,
    triples, partial ontology mappings, authoritative
    copies
  • Ad hoc data collection
  • Exploring how the SW is being used, e.g.
  • Computing basic statistics
  • Ranking properties used with foaf:person
  • And misused
  • Finding common typos

32
Applications and use cases
  • Supporting Semantic Web developers, e.g.,
  • Ontology designers
  • Vocabulary discovery
  • Who's using my ontologies or data?
  • Etc.
  • Searching specialized collections, e.g.,
  • Proofs in Inference Web
  • Text Meaning Representations of news stories in
    SemNews
  • Supporting SW tools, e.g.,
  • Discovering mappings between ontologies

36
This talk
  • Motivation
  • Swoogle overview
  • Bots navigate the Semantic Web
  • Ranking Semantic Web content
  • Use cases and applications
  • Conclusions

37
Will it Scale? How?
  • Here's a rough estimate of the data in RDF
    documents on the semantic web based on Swoogle's
    crawling

We think Swoogle's centralized approach can be
made to work for the next few years if not longer.
38
How much reasoning?
  • SwoogleN (N<3) does limited reasoning
  • It's expensive
  • It's not clear how much should be done
  • More reasoning would benefit many use cases
  • e.g., type hierarchy
  • Recognizing specialized metadata
  • e.g., that ontology A maps some terms from B to C

39
Conclusion
  • The web will contain the world's knowledge in
    forms accessible to people and computers
  • We need better ways to discover, index, search
    and reason over SW knowledge
  • SW search engines address different tasks than
    HTML search engines
  • So they require different techniques and APIs
  • Swoogle-like systems can help create consensus
    ontologies and foster best practices

40
For more information
http://ebiquity.umbc.edu/
Annotated in OWL