Exploiting large scale web semantics to build end user applications


1
Exploiting large scale web semantics to build
end user applications

Enrico Motta
Professor of Knowledge Technologies
Knowledge Media Institute
The Open University

2
Aims of the Talk

What is the Semantic Web
Perspectives
The SW as a web of data
The SW as a new context in which to build
semantic applications and an unprecedented
opportunity in which to address some classic AI
problems
Typical misconceptions
What the SW is not!
Semantic Web for Users
Applications that do something interesting and
useful to users, by exploiting available web
semantics

3
The Semantic Web as a Web of Data

Making data available to SW-aware software

6
<foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
  <foaf:name>Enrico Motta</foaf:name>
  <foaf:firstName>Enrico</foaf:firstName>
  <foaf:surname>Motta</foaf:surname>
  <foaf:phone rdf:resource="tel:+44-(0)1908-653506"/>
  <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
  <foaf:workplaceHomepage rdf:resource="http://kmi.open.ac.uk/"/>
  <foaf:depiction rdf:resource="http://kmi.open.ac.uk/img/members/enrico.jpg"/>
  <foaf:topic_interest>Knowledge Technologies</foaf:topic_interest>
  <foaf:topic_interest>Semantic Web</foaf:topic_interest>
  <foaf:topic_interest>Ontologies</foaf:topic_interest>
  <foaf:topic_interest>Problem Solving Methods</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Modelling</foaf:topic_interest>
  <foaf:topic_interest>Knowledge Management</foaf:topic_interest>
  <foaf:based_near>
    <geo:Point>
      <geo:lat>52.024868</geo:lat>
      <geo:long>-0.707143</geo:long>
  <contact:nearestAirport>
    <airport:name>London Luton Airport</airport:name>
    <airport:iataCode>LTN</airport:iataCode>
    <airport:location>Luton, United Kingdom</airport:location>
    <geo:lat>51.866666666667</geo:lat>
    <geo:long>-0.36666666666667</geo:long>
    <rdfs:seeAlso rdf:resource="http://www.daml.org/cgi-bin/airport?LTN"/>
  <foaf:currentProject>
    <foaf:Project>
      <foaf:name>AquaLog</foaf:name>
  </foaf:currentProject>
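
To make "SW-aware software" concrete, here is a minimal sketch of a program reading a FOAF description like the one on this slide. It is an illustration, not the tooling used in the talk: `FOAF_DOC` is a trimmed, well-formed excerpt assembled for the example, `describe_people` is a hypothetical helper, and the standard-library XML parser stands in for a proper RDF parser, which a real application would normally use.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and FOAF.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}

# Trimmed excerpt of the slide's description, wrapped in rdf:RDF so that
# it is well-formed XML (the wrapper is an assumption of this sketch).
FOAF_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://identifiers.kmi.open.ac.uk/people/enrico-motta/">
    <foaf:name>Enrico Motta</foaf:name>
    <foaf:homepage rdf:resource="http://kmi.open.ac.uk/people/motta/"/>
    <foaf:topic_interest>Semantic Web</foaf:topic_interest>
    <foaf:topic_interest>Ontologies</foaf:topic_interest>
  </foaf:Person>
</rdf:RDF>"""


def describe_people(rdf_xml):
    """Return one dict per foaf:Person element found in the RDF/XML text."""
    root = ET.fromstring(rdf_xml)
    rdf_about = "{%s}about" % NS["rdf"]
    rdf_resource = "{%s}resource" % NS["rdf"]
    people = []
    for person in root.findall("foaf:Person", NS):
        homepage = person.find("foaf:homepage", NS)
        people.append({
            "uri": person.get(rdf_about),
            "name": person.findtext("foaf:name", namespaces=NS),
            "homepage": None if homepage is None else homepage.get(rdf_resource),
            "interests": [e.text for e in person.findall("foaf:topic_interest", NS)],
        })
    return people


if __name__ == "__main__":
    for entry in describe_people(FOAF_DOC):
        print(entry["name"], "is interested in", ", ".join(entry["interests"]))
```

The point of the sketch is that the software needs no page scraping: names, homepages and interests are addressed by their vocabulary terms.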
7
The web of SW documents
8
Current status of the semantic web

10-20 million semantic web documents
Expressed in RDF, OWL, DAML+OIL
7K-10K ontologies
These cover a variety of domains - music,
multimedia, computing, management, bio-medical
sciences, upper level concepts, etc
Hence
To a significant extent the semantic web is
already in place
However, domain coverage is very uneven
Still primarily a research enterprise, but
interest is rapidly increasing in both
governmental and business organizations
Early-adopter phase

The above figures refer to resources which are
publicly accessible on the web
9
<data data data>
<data data data>
<data data data>
<data data data>
<data data data>
<data data data>
11
Bibliographic Data
CS Dept Data
Geography
AKT Reference Ontology
RDF Data
13
Corporate Semantic Webs

A corporate ontology is used to provide a
homogeneous view over heterogeneous data sources.
Often tackle Enterprise Information Integration
scenarios
Hailed by Gartner as one of the key emerging
strategic technology trends
E.g., Garlik is a multi-million startup recently
set up in the UK to support personal information
management; it uses an ontology to integrate
data mined from the web on a large scale

16
AquaLog
17
Applications that exploit large scale semantic
content
18
The web of data
19
Gateways to the SW
Semantic Web
Application
Semantic Web Gateway
20

Sophisticated quality control mechanism
Detects duplications
Fixes obvious syntax problems
E.g., duplicated ontology IDs, namespaces, etc.
Structures ontologies in a network
Using relations such as extends,
inconsistentWith, duplicates
Provides interfaces for both human users and
software programs
Provides efficient API
Supports formal queries (SPARQL)
Variety of ontology ranking mechanisms
Modularization/Combination support
Plug-ins for Protégé and NeOn Toolkit
Very cool logo!

23
Case Study 1: Automatic Alignment of Thesauri in
the Agricultural/Fishery Domain
24
Method

SCARLET - matching by Harvesting the SW
Automatically select and combine multiple online
ontologies to derive a relation

Access
Semantic Web
Scarlet
Deduce
Concept_A (e.g., Supermarket)
Concept_B (e.g., Building)
Semantic Relation ( )
25
Two strategies
Figure labels:
(A) Supermarket, Shop, PublicBuilding, Building
(B) Cholesterol, Steroid (appears twice), Lipid, OrganicChemical
In both cases Scarlet queries the Semantic Web: Supermarket vs Building in (A), Cholesterol vs OrganicChemical in (B).
Deriving relations from (A) one ontology and (B)
across ontologies.
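
The two strategies can be sketched as follows. This is a minimal illustration, not Scarlet's implementation: the toy `ONTOLOGIES` dictionary, the assignment of concepts to ontologies, and the function names are all assumptions made for the example, and only subsumption is handled, with concepts matched by identical labels.

```python
from collections import deque

# Hypothetical toy ontologies: each maps a concept to its direct superclasses.
ONTOLOGIES = {
    "buildings": {
        "Supermarket": ["Shop"],
        "Shop": ["PublicBuilding"],
        "PublicBuilding": ["Building"],
    },
    "food": {"Cholesterol": ["Steroid"]},
    "chemistry": {"Steroid": ["Lipid"], "Lipid": ["OrganicChemical"]},
}


def subsumed_by(edges, a, b):
    """True if b can be reached from a by following superclass edges."""
    seen, queue = {a}, deque([a])
    while queue:
        node = queue.popleft()
        if node == b:
            return True
        for parent in edges.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return False


def strategy_a(ontologies, a, b):
    """Strategy (A): names of single ontologies that alone relate a to b."""
    return [name for name, edges in ontologies.items() if subsumed_by(edges, a, b)]


def strategy_b(ontologies, a, b):
    """Strategy (B): relate a to b by chaining edges across all ontologies."""
    merged = {}
    for edges in ontologies.values():
        for child, parents in edges.items():
            merged.setdefault(child, []).extend(parents)
    return subsumed_by(merged, a, b)


if __name__ == "__main__":
    print(strategy_a(ONTOLOGIES, "Supermarket", "Building"))
    print(strategy_a(ONTOLOGIES, "Cholesterol", "OrganicChemical"))
    print(strategy_b(ONTOLOGIES, "Cholesterol", "OrganicChemical"))
```

In this toy data no single ontology relates Cholesterol to OrganicChemical, so strategy (A) finds nothing and only the cross-ontology chain through Steroid succeeds.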
26
Experiment

Matching
AGROVOC
UN's Food and Agriculture
Organisation (FAO) thesaurus
28,174 descriptor terms
10,028 non-descriptor terms
NALT
US National Agricultural
Library Thesaurus
41,577 descriptor terms
24,525 non-descriptor terms

27
226 Used Ontologies
http://139.91.183.30:9090/RDF/VRP/Examples/tap.rdf
http://reliant.teknowledge.com/DAML/SUMO.daml
http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
http://reliant.teknowledge.com/DAML/Economy.daml
http://gate.ac.uk/projects/htechsight/Technologies.daml
28
Evaluation 1 - Precision

Manual assessment of 1000 mappings (15)
Evaluators
Researchers in the area of the Semantic Web
6 people split into two groups
Results
Comparable to best results for background
knowledge based matchers.

29
Evaluation 2 - Error Analysis
30
Case Study 2: Folksonomy Tagspace Enrichment
31
Features of Web 2.0 sites

Tagging as opposed to rigid classification
Dynamic vocabulary does not require much
annotation effort and evolves easily
Shared vocabularies emerge over time
Certain tags become particularly popular

32
Limitations of tagging

Different granularity of tagging
rome vs colosseum vs roman monument
flower vs tulip
etc.
Multilinguality
Spelling errors, different terminology, plural vs
singular, etc.
This has a number of negative implications for
the effective use of tagged resources
e.g., Search exhibits very poor recall

33
Giving meaning to tags
34
What does it mean to add semantics to tags?

1. Mapping a tag to a SW element
"japan"
<akt:Country Japan>

35
Applications of the approach

To improve recall in keyword search
To support annotation by dynamically suggesting
relevant tags or visualizing the structure of
relevant tags
To enable formal queries over a space of tags
Hence, going beyond keyword search
To support new forms of intelligent navigation
i.e., using the 'semantic layer' to support
navigation
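
As an illustration of the first application, the sketch below shows how a semantic layer over tags could improve recall in keyword search. It is a toy under stated assumptions: the `BROADER` mapping, the `RESOURCES` data and the function names are hypothetical, and the tag pairs reuse the granularity examples from the "Limitations of tagging" slide.

```python
# Hypothetical semantic layer: each tag points to a broader tag.
BROADER = {"tulip": "flower", "colosseum": "roman monument"}

# Hypothetical tagged resources.
RESOURCES = {
    "photo1": {"tulip"},
    "photo2": {"flower"},
    "photo3": {"colosseum"},
}


def expand(tag, broader):
    """Return the tag plus every tag that is directly or indirectly narrower."""
    expanded = {tag}
    changed = True
    while changed:
        changed = False
        for narrow, broad in broader.items():
            if broad in expanded and narrow not in expanded:
                expanded.add(narrow)
                changed = True
    return expanded


def keyword_search(tag, resources):
    """Plain tag search: only resources carrying exactly this tag."""
    return sorted(r for r, tags in resources.items() if tag in tags)


def semantic_search(tag, resources, broader):
    """Search that also returns resources tagged with narrower tags."""
    wanted = expand(tag, broader)
    return sorted(r for r, tags in resources.items() if tags & wanted)


if __name__ == "__main__":
    print(keyword_search("flower", RESOURCES))
    print(semantic_search("flower", RESOURCES, BROADER))
```

A plain search for "flower" misses the photo tagged only "tulip"; the expanded search retrieves it, which is the recall gain the slide refers to.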

36
Folksonomy
Clustering
Analyze co-occurrence of tags
Co-occurrence matrix
Cluster tags
Cluster 1
Cluster 2
Cluster n

Concept and relation identification
Yes
2 related tags
SW search engine
Remaining tags?
Wikipedia
Find mappings and relation for pair of tags
No
Google
END
<concept, relation, concept>
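
The clustering stage can be sketched as below. The slide does not say which clustering algorithm was used, so this example swaps in a deliberately simple one: tags are linked when they co-occur at least `min_count` times, and clusters are the connected components of that graph. The function names, the threshold and the sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(tagged_resources):
    """Count how often each unordered pair of tags appears on one resource."""
    counts = Counter()
    for tags in tagged_resources:
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1
    return counts


def cluster_tags(counts, min_count=2):
    """Group tags into connected components of the thresholded co-occurrence graph."""
    neighbours = {}
    for (a, b), n in counts.items():
        if n >= min_count:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)
    clusters, seen = [], set()
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            tag = stack.pop()
            if tag in component:
                continue
            component.add(tag)
            stack.extend(neighbours[tag] - component)
        seen |= component
        clusters.append(sorted(component))
    return clusters


if __name__ == "__main__":
    posts = [
        ["school", "student", "learning"],
        ["school", "student"],
        ["admin", "application"],
        ["admin", "application", "school"],
        ["learning", "student"],
    ]
    print(cluster_tags(cooccurrence(posts)))
```

Tags that never reach the threshold with any other tag are left out of every cluster. In the pipeline on the slide, pairs of tags from a cluster would then be passed on to concept and relation identification.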
37
Examples
Cluster_1: admin application archive collection component control developer dom example form innovation interface layout planning program repository resource sourcecode
38
Examples
Cluster_2: college commerce corporate course education high instructing learn learning lms school student
1 http://gate.ac.uk/projects/htechsight/Employment.daml
2 http://reliant.teknowledge.com/DAML/Mid-level-ontology.daml
3 http://www.mondeca.com/owl/moses/ita.owl
4 http://www.cs.utexas.edu/users/mfkb/RKF/tree/CLib-core-office.owl
39
Faceted Ontology

Ontology creation and maintenance is automated
Ontology evolution is driven by task features and
by user changes
Large scale integration of ontology elements from
massively distributed online ontologies
Very different from traditional top-down-designed
ontologies

40
Case Study 3: Reviewing and Rating on the Web
41
Revyu.com
47
Trust Factors
expertise: the source has relevant expertise of the domain of the recommendation-seeking; this may be formally validated through qualifications or acquired over time.
experience: the source has experience of solving similar scenarios in this domain, but without extensive expertise.
impartiality: the source does not have vested interests in a particular resolution to the scenario.
affinity: the source has characteristics in common with the recommendation seeker, such as shared tastes, standards, values, viewpoints, interests, or expectations.
track record: the source has previously provided successful recommendations to the recommendation seeker.
48
solution
subjective
objective
affinity
expertise, experience
factors emphasised
49
Applying the framework to revyu.com

Affinity
Operationalised as the degree of overlap in items
reviewed, and in ratings given
Experience
Proxy metric: usage of particular tags (as
proxies for topics)
Experience scores based on tagging data
Also integrates data from del.icio.us for those
users who have chosen to publish their
del.icio.us account in FOAF
Expertise
Proxy metric: credibility
Captures the social aspect of expertise
endorsement
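
The slide describes affinity as the degree of overlap in items reviewed and in ratings given, without giving a formula. The sketch below is one possible reading, not revyu.com's actual metric: the product of item overlap (Jaccard) and rating agreement, the 1 to 5 rating scale, and the function name `affinity` are all assumptions.

```python
def affinity(reviews_a, reviews_b, max_rating=5):
    """Score in [0, 1] combining overlap in items reviewed and in ratings given.

    Each argument maps an item identifier to a rating from 1 to max_rating.
    """
    shared = set(reviews_a) & set(reviews_b)
    if not shared:
        return 0.0
    all_items = set(reviews_a) | set(reviews_b)
    # Overlap in items reviewed: shared items over all items (Jaccard index).
    item_overlap = len(shared) / len(all_items)
    # Overlap in ratings: 1 minus the mean rating gap, scaled to [0, 1].
    mean_gap = sum(abs(reviews_a[i] - reviews_b[i]) for i in shared) / len(shared)
    rating_agreement = 1 - mean_gap / (max_rating - 1)
    return item_overlap * rating_agreement


if __name__ == "__main__":
    alice = {"film1": 5, "pub1": 4, "book1": 2}
    bob = {"film1": 5, "pub1": 2, "cafe1": 3}
    print(affinity(alice, bob))
```

A score like this could then be used as one input when ranking reviews for a particular reader, alongside the experience and expertise proxies.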

50
Using trust factors for ranking reviews
52
PowerAqua and PowerMagpie
53
How does the Semantic Web relate to Artificial
Intelligence research?
54
AI as Heuristic Search
55
The knowledge-based paradigm in AI

Today there has been a shift in paradigm. The
fundamental problem of understanding intelligence
is not the identification of a few powerful
techniques, but rather the question of how to
represent large amounts of knowledge in a fashion
that permits their effective use
Goldstein and Papert, 1977

57
Knowledge Representation Hypothesis in AI

Any mechanically embodied intelligent process
will be comprised of structural ingredients that
we as external observers naturally take to
represent a propositional account of the
knowledge that the overall process exhibits, and
independent of such external semantic
attribution, play a formal but causal and
essential role in engendering the behaviour that
manifests that knowledge
Brian Smith, 1982

58
Knowledge-Based Systems
Intelligent Behaviour
59
The Knowledge Acquisition Bottleneck
KA Bottleneck
Intelligent Behaviour
60
The Cyc project
61
Structured libraries of reusable components
Problem Solving Method
Generic Task
Parametric Design
Library of PSMs
Classification
Mapping Knowledge
Application-specific Problem-Solving Knowledge
Scheduling
Ontology
Mapping Ontology
Etc
Application Configuration
Domain Model
62
The next knowledge medium

However, our approach based on structured
libraries of problem solving components only
addressed the economic cost of KBS development

63
SW as Enabler of Intelligent Behaviour
Both a platform for knowledge publishing and a
large scale source of knowledge
Intelligent Behaviour
64
KBS vs SW Systems
                 Classic KBS    SW Systems
Provenance       Centralized    Distributed
Size             Small/Medium   Extra Huge
Repr. Schema     Homogeneous    Heterogeneous
Quality          High           Very Variable
Degree of trust  High           Very Variable
65
Key Paradigm Shift
Intelligence in Classic KBS: a function of sophisticated, logical, task-centric problem solving
Intelligence in SW Systems: a side-effect of being able to integrate different types of reasoning to handle size and heterogeneous quality and representation
66
Conclusions
67
Typical misconceptions

The SW is a long-term vision
Ehm... actually it already exists
The SW will never work because nobody is going
to annotate their web pages
The SW is not about annotating web pages, the SW
is a web of data, most of which are generated
from DBs, or from web mining software, or from
applications which produce SW technology
The idea of a universal ontology has failed
before and will fail again. Hence the SW is
doomed
The SW is not about a single universal ontology.
Already there are around 10K ontologies and the
number is growing
SW applications may use 1, 2, 3, or even hundreds
of ontologies.

68
Large Scale Distributed Semantics

Widespread production of formalised knowledge
models (ontologies and metadata), from a variety
of different groups and individuals
E.g., legal, bio-medical, governmental,
environmental, music, art, multimedia, computing,
etc.
Knowledge modelling to become a new form of
literacy?
Stutt and Motta, 1997
This large scale heterogeneous resource will
enable a new generation of semantic-aware
technologies
These developments may provide a new context in
which to address the economic barriers to KBS
development
The SW already exists to some extent; however,
there is still a way to go before it reaches
the required degree of maturity

69
Large Scale Distributed Semantics

Much like AI, the semantic web will only succeed
if it becomes ubiquitous and hidden

There's this stupid myth out there that A.I. has
failed, but A.I. is everywhere around you every
second of the day. People just don't notice it.
You've got A.I. systems in cars, tuning the
parameters of the fuel injection systems. When
you land in an airplane, your gate gets chosen by
an A.I. scheduling system. Every time you use a
piece of Microsoft software, you've got an A.I.
system trying to figure out what you're doing,
like writing a letter, and it does a pretty
damned good job. Every time you see a movie with
computer-generated characters, they're all little
A.I. characters behaving as a group. Every time
you play a video game, you're playing against an
A.I. system.
Rodney Brooks