Accessing Cultural Heritage Collections using Semantic Web Techniques

1
Accessing Cultural Heritage Collections using
Semantic Web Techniques
  • Antoine ISAAC
  • STITCH Project
  • SIKS Semantic Web Seminar, Utrecht
  • April 11th, 2007

2
Background
  • CATCH @ NWO
  • Continuous Access To Cultural Heritage
  • 10 computer science projects applied to the CH
    field
  • Personalization of access, image/text/audio
    analysis
  • Integration of projects in CH institutes
    (museums, archives)
  • STITCH
  • SemanTic Interoperability To access Cultural
    Heritage
  • Exchanging and integrating metadata
  • Vrije Universiteit, Koninklijke Bibliotheek,
    Max Planck Institute

3
Agenda
  • Cultural Heritage and Semantic Web
  • Two important issues
  • Representing Cultural Heritage vocabularies on
    the Semantic Web
  • Vocabulary alignment
  • Demo

4
Some Needs for CH Collections
  • Representation of objects and knowledge about
    them
  • Pointing at collection artifacts: books
  • Describing them: creating metadata
  • Specific metadata structures (metadata schemes)
  • Controlled expert vocabularies (e.g. thesauri)
  • Accessing artifacts using metadata
  • E.g. search using information contained in
    thesauri

5
KB Illustrated Manuscripts: Iconclass vocabulary
6
KB Illustrated Manuscripts
7
Some Needs for CH Collections (2)
  • Communicating data to the outside world
  • Web portals
  • Integrating different collections
  • Virtual collections
  • The European Library,
    http://www.theeuropeanlibrary.org
  • Geheugen van Nederland,
    http://www.geheugenvannederland.nl

8
(Biased) Semantic Web
  • Pointing at resources: documents, knowledge
    objects
  • Enabling structured assertions
  • Metadata about entities present on the Web
  • Using vocabularies with defined semantics
  • Ontologies: formal definitions of shared
    conceptual vocabularies
  • RDF Schema / OWL

    <owl:Class rdf:about="Bird">
      <owl:disjointWith>
        <owl:Class rdf:about="Mammals"/>
      </owl:disjointWith>
      <rdfs:subClassOf>
        <owl:Class rdf:ID="Animals"/>
      </rdfs:subClassOf>
    </owl:Class>
    <Bird rdf:about="tweety"/>
9
(Biased) Semantic Web
  • Web-based resources allow division/sharing of
  • document
  • vocabulary
  • metadata

[Diagram: the statement (doc3, hasSubject, Amsterdam), with the URLs
http://www.geo.org/voc/, http://www.kb.nl/eDepot and
http://www.ned.nl/doc3: different owners, different locations]
10
Cultural Heritage Collections and Semantic Web
  • Categorizing/classifying things
  • Structuring descriptions
  • Web-based approach
  • Semantic Web techniques are good candidates for
    representing and exploiting Cultural Heritage
    metadata

11
Important line of research
  • Long-term projects
  • MuseumFinland, http://www.museosuomi.fi/
  • eCulture, http://e-culture.multimedian.nl/
  • Common portals to (many) collections
  • Exploiting the data found in the original systems
  • Metadata content: place, date, creator
  • Semantics of vocabularies used to create this
    information
  • E.g. hierarchical information
  • A picture featuring a crow features a bird
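
The hierarchical inference in the last bullet can be sketched in a few lines. This is a minimal illustration, not code from MuseumFinland or eCulture: the `BROADER` table, the picture ids and the annotations are all hypothetical.

```python
# Toy broader-term table (hypothetical data): each concept maps to its parent.
BROADER = {"crow": "bird", "sparrow": "bird", "bird": "animal"}

# Hypothetical picture annotations: picture id -> concepts it features.
ANNOTATIONS = {"pic1": {"crow"}, "pic2": {"sparrow"}, "pic3": {"horse"}}


def ancestors(concept):
    """Return all concepts reachable by following broader links upward."""
    found = set()
    while concept in BROADER:
        concept = BROADER[concept]
        found.add(concept)
    return found


def features(picture, concept):
    """True if the picture is annotated with the concept or a narrower one."""
    return any(concept == c or concept in ancestors(c)
               for c in ANNOTATIONS[picture])


def search(concept):
    """Return the sorted ids of pictures that feature the concept."""
    return sorted(p for p in ANNOTATIONS if features(p, concept))
```

Searching for "bird" then returns the crow picture even though no cataloguer ever typed "bird" for it.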

12
(No Transcript)
13
Agenda
  • Cultural Heritage and Semantic Web
  • Two important issues
  • Representing Cultural Heritage vocabularies on
    the Semantic Web
  • Vocabulary alignment
  • Demo

14
Representing CH vocabularies on the Semantic Web
- Similarities
  • Both ontologies and thesauri bring concept
    hierarchies
  • giving the intended meaning of a vocabulary
    through links between its items
  • concept/term → owl:Class
  • broader → rdfs:subClassOf
  • scope notes → rdfs:comment
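
The correspondences listed on this slide suggest a direct, naive translation. A minimal sketch, assuming a hypothetical record layout (`term`, `broader`, `scope_note`) and plain string tuples in place of a real RDF library:

```python
def thesaurus_to_owl(records):
    """Translate thesaurus records into (subject, predicate, object) triples.

    Applies the naive mapping: term -> owl:Class,
    broader -> rdfs:subClassOf, scope note -> rdfs:comment.
    """
    triples = []
    for rec in records:
        term = rec["term"]
        triples.append((term, "rdf:type", "owl:Class"))
        if rec.get("broader"):
            triples.append((term, "rdfs:subClassOf", rec["broader"]))
        if rec.get("scope_note"):
            triples.append((term, "rdfs:comment", rec["scope_note"]))
    return triples


# Hypothetical two-term thesaurus, for illustration only.
records = [
    {"term": "bird"},
    {"term": "crow", "broader": "bird",
     "scope_note": "Use for birds of the genus Corvus"},
]
triples = thesaurus_to_owl(records)
```

The translation is purely syntactic: it turns every broader link into a subclass statement without checking whether that reading is justified.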

15
Representing CH vocabularies on the Semantic Web
- Problems
  • Thesauri designed for humans, no formal
    interpretation
  • How to interpret a thesaurus in RDFS/OWL?
  • If "(Story of) Hercules" is a class, what are
    its instances?
  • Is "Hercules shooting Nessus" a subclass of
    "Love-affairs of Hercules"?
  • Thesaurus hierarchy: subsumption, mereological
    relation, ...

16
Representing CH vocabularies on the Semantic Web
- Different approaches
  • Ontologising
  • Cleaning thesaurus by distinguishing roles,
    kinds, etc.
  • Cleaning the hierarchical links
  • Representing knowledge found in sources as such
  • Informal knowledge represented in RDF/OWL formal
    framework

17
SKOS
  • Simple Knowledge Organization System
  • (Future) W3C standard
  • Model to represent controlled and structured
    vocabularies on the Semantic Web
  • Compatible with community needs
  • Core model for representing thesauri,
    classification schemes, etc.

18
SKOS
  • Building blocks (ontology) to create XML/RDF data
    about controlled vocabularies
  • Classes: Concept and ConceptScheme
  • Lexical properties: prefLabel, altLabel
  • Semantic properties: broader, narrower, related
  • Properties for notes and comments: scopeNote,
    definition

19
SKOS: Brinkman Trefwoorden (KB)
  • 075607204 geneeskunde (medicine):
    RT geneesmiddelen; NT kindergeneeskunde
  • 075607220 geneesmiddelen (drugs):
    UF medicijnen
  • 075611791 kindergeneeskunde (pediatrics):
    BT geneeskunde; note: kinderen ouder dan 12
    vallen niet onder kindergeneeskunde (children
    older than 12 do not fall under pediatrics)
  • medicijnen (medicines): USE geneesmiddelen

20
SKOS: Brinkman Trefwoorden (KB)
  • skos: http://www.w3.org/2004/02/skos/core#
  • bk: http://www.kb.nl/brinkman/
21
SKOS: Brinkman Trefwoorden (KB)
    <skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075607204">
      <skos:prefLabel>geneeskunde</skos:prefLabel>
      <skos:related rdf:resource="http://www.kb.nl/brinkman/bk075607220"/>
    </skos:Concept>
    <skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075607220">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
      <skos:prefLabel>geneesmiddelen</skos:prefLabel>
      <skos:altLabel>medicijnen</skos:altLabel>
    </skos:Concept>
    <skos:Concept rdf:about="http://www.kb.nl/brinkman/bk075611791">
      <rdf:type rdf:resource="http://www.w3.org/2004/02/skos/core#Concept"/>
      <skos:prefLabel>kindergeneeskunde</skos:prefLabel>
      <skos:broader rdf:resource="http://www.kb.nl/brinkman/bk075607204"/>
      <skos:scopeNote>kinderen ouder dan 12 vallen niet onder kindergeneeskunde</skos:scopeNote>
    </skos:Concept>
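
A conversion from the thesaurus records to SKOS RDF/XML like the one shown on this slide can be sketched with the Python standard library. This is a minimal sketch under stated assumptions: the record layout (keys `id`, `label`, `BT`, `NT`, `RT`, `UF`, `note`) is hypothetical, and the URI pattern (namespace plus `bk` plus record number) is simply copied from the slide.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
BK = "http://www.kb.nl/brinkman/"

ET.register_namespace("rdf", RDF)
ET.register_namespace("skos", SKOS)

# Thesaurus relation codes -> SKOS property names.
RELATIONS = {"BT": "broader", "NT": "narrower", "RT": "related"}


def to_skos(records):
    """Build an rdf:RDF element with one skos:Concept per record."""
    ids = {rec["label"]: rec["id"] for rec in records}
    root = ET.Element(f"{{{RDF}}}RDF")
    for rec in records:
        concept = ET.SubElement(root, f"{{{SKOS}}}Concept",
                                {f"{{{RDF}}}about": BK + "bk" + rec["id"]})
        ET.SubElement(concept, f"{{{SKOS}}}prefLabel").text = rec["label"]
        # UF ("used for") terms become alternative labels.
        for alt in rec.get("UF", []):
            ET.SubElement(concept, f"{{{SKOS}}}altLabel").text = alt
        for code, prop in RELATIONS.items():
            for target in rec.get(code, []):
                ET.SubElement(concept, f"{{{SKOS}}}{prop}",
                              {f"{{{RDF}}}resource": BK + "bk" + ids[target]})
        if "note" in rec:
            ET.SubElement(concept, f"{{{SKOS}}}scopeNote").text = rec["note"]
    return root


records = [
    {"id": "075607204", "label": "geneeskunde",
     "RT": ["geneesmiddelen"], "NT": ["kindergeneeskunde"]},
    {"id": "075607220", "label": "geneesmiddelen", "UF": ["medicijnen"]},
    {"id": "075611791", "label": "kindergeneeskunde", "BT": ["geneeskunde"],
     "note": "kinderen ouder dan 12 vallen niet onder kindergeneeskunde"},
]
root = to_skos(records)
xml_text = ET.tostring(root, encoding="unicode")
```

Unlike the slide, this sketch also emits skos:narrower for NT entries; the non-preferred term "medicijnen" gets no concept of its own and only appears as an altLabel.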
22
Agenda
  • Cultural Heritage and Semantic Web
  • Two important issues
  • Representing Cultural Heritage vocabularies on
    the Semantic Web
  • Vocabulary alignment
  • Demo

23
Cultural Heritage Interoperability Problems
  • Problem: integrating different databases/metadata
    schemes/vocabularies
  • Syntactic interoperability can be solved
  • Common format: XML (RDF)
  • Common vocabulary model (SKOS)
  • How about conceptual heterogeneity?

24
The semantic interoperability problem
  • There is no standard thesaurus
  • We don't really want one
  • different vocabularies for different expertise
    domains, traditions, tasks
  • Consequence
  • "klassieke ruïnes" (classical ruins) vs.
    "landschap met ruïnes" (landscape with ruins)
  • "maagd Maria" (Virgin Mary) vs.
    "Heilige Moeder" (Holy Mother)
  • Practical problem
  • Searching for "Heilige Moeder" misses
    "maagd Maria"
  • Unless we know both vocabularies

25
Old situation
26
Vocabulary alignment
  • STITCH aim: find correspondences between
    vocabulary elements
  • klassieke ruïnes ↔ landschap met ruïnes
  • maagd Maria ↔ Heilige Moeder
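
Once such correspondences exist, a search can be widened across vocabularies. The following is a minimal sketch, not the STITCH implementation: the `ALIGNMENT` list reuses the two term pairs from this slide, while the `INDEX` of object ids is hypothetical.

```python
# Correspondences between terms of two vocabularies (pairs from the slide).
ALIGNMENT = [
    ("klassieke ruïnes", "landschap met ruïnes"),
    ("maagd Maria", "Heilige Moeder"),
]

# Hypothetical index: object id -> term assigned by its own institution.
INDEX = {
    "ms-1": "maagd Maria",
    "ms-2": "Heilige Moeder",
    "ms-3": "landschap met ruïnes",
}


def equivalents(term):
    """Return the term plus every term aligned with it, in either direction."""
    result = {term}
    for left, right in ALIGNMENT:
        if term == left:
            result.add(right)
        elif term == right:
            result.add(left)
    return result


def search(term, use_alignment=True):
    """Return sorted ids of objects indexed with the term or an aligned term."""
    wanted = equivalents(term) if use_alignment else {term}
    return sorted(obj for obj, t in INDEX.items() if t in wanted)
```

With `use_alignment=False` a query for "Heilige Moeder" finds only ms-2; with the alignment it also finds ms-1, without the user knowing the other vocabulary.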

27
New situation
28
Automatic alignment techniques
  • Lexical: labels of entities and textual
    definitions
  • Structural: structure of the formal definitions
    of entities, position in the hierarchy
  • Statistical: object information (e.g. book
    indexing)
  • Background knowledge: using a shared conceptual
    reference to find links

29
Lexical alignment
  • Use preferred labels, synonyms, notes
  • Heuristic methods to discover equivalence and
    specialization relations
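
A minimal sketch of what such label-based heuristics might look like. The two rules used here (a shared label suggests equivalence, a label that ends with the other label suggests specialization) are illustrative assumptions, not the heuristics actually used in STITCH, and the vocabularies are hypothetical.

```python
def normalize(label):
    """Lower-case a label and collapse whitespace."""
    return " ".join(label.lower().split())


def lexical_align(vocab_a, vocab_b):
    """Propose correspondences between two vocabularies from their labels.

    Each vocabulary maps a concept id to a list of labels (preferred
    label first, then synonyms). Returns (id_a, relation, id_b) tuples.
    """
    proposals = []
    for id_a, labels_a in vocab_a.items():
        norm_a = {normalize(label) for label in labels_a}
        for id_b, labels_b in vocab_b.items():
            norm_b = {normalize(label) for label in labels_b}
            if norm_a & norm_b:
                # A shared label suggests equivalence.
                proposals.append((id_a, "equivalent", id_b))
            elif any(a.endswith(" " + b) for a in norm_a for b in norm_b):
                # "oil painting" ends with "painting": A is likely narrower.
                proposals.append((id_a, "narrower-than", id_b))
    return proposals


# Hypothetical vocabularies, for illustration only.
vocab_a = {"a1": ["Oil painting"], "a2": ["Medication", "Drugs"]}
vocab_b = {"b1": ["painting"], "b2": ["drugs"]}
proposals = lexical_align(vocab_a, vocab_b)
```

The suffix rule relies on English head-final compounds; other languages would need different string heuristics.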

30
Automatic Alignment Techniques
  • Lexical: labels of entities and textual
    definitions
  • Structural: structure of the formal definitions
    of entities, position in the hierarchy
  • Statistical: object information (e.g. book
    indexing)
  • Shared background knowledge: using a conceptual
    reference to deduce correspondences

31
Statistical alignment
32
Statistical approach: Koninklijke Bibliotheek case
  • Situation: two overlapping collections indexed
    with different thesauri
  • Comparison means measuring overlap between
    concepts from the thesauri
  • Using the sets of books indexed by these concepts
  • Results (rank, score, concept pair)
  • 1   9132.9   Schilderijen (paintings) -
    schilderkunst (art of painting)
  • 2   8088.5   Kwaliteitszorg (quality assurance) -
    kwaliteitsmanagement (quality management)
  • 3   6232.7   Personeelsmanagement (personnel
    management) - personeelsbeleid (personnel policy)
  • ...
  • 17  3421.8   Diabetes mellitus - suikerziekte
    (diabetes)
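
The idea of comparing concepts through the books they index can be sketched as follows. The slides do not name the measure behind the scores listed on this slide, so this sketch substitutes the Jaccard coefficient as a stand-in; its values are not comparable to those scores, and the book ids are hypothetical.

```python
from itertools import product


def jaccard(books_a, books_b):
    """Overlap of two book sets: |A & B| / |A | B| (0.0 if both are empty)."""
    union = books_a | books_b
    return len(books_a & books_b) / len(union) if union else 0.0


def rank_pairs(index_a, index_b):
    """Rank concept pairs from two thesauri by the overlap of their books.

    Each index maps a concept to the set of book ids indexed with it.
    Returns (score, concept_a, concept_b) tuples, best first, skipping
    pairs that share no book.
    """
    scored = [(jaccard(index_a[a], index_b[b]), a, b)
              for a, b in product(index_a, index_b)]
    return sorted((s for s in scored if s[0] > 0), reverse=True)


# Hypothetical book ids; the concept names are taken from the slide.
index_a = {"Schilderijen": {1, 2, 3, 4}, "Diabetes mellitus": {5, 6}}
index_b = {"schilderkunst": {1, 2, 3, 7}, "suikerziekte": {5, 6}}
ranking = rank_pairs(index_a, index_b)
```

A plain ratio like this favours small concepts with identical book sets, which is one reason a real system may prefer a measure that also accounts for how many books support the overlap.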

33
Agenda
  • Cultural Heritage and Semantic Web
  • Two important issues
  • Representing Cultural Heritage vocabularies on
    the Semantic Web
  • Vocabulary alignment
  • Demo

34
Demo
  • KB Illuminated Manuscripts
  • French National Library Mandragore Manuscripts

35
Manuscripts, 2nd Collection: BNF Mandragore
36
Manuscripts, 2nd Collection: BNF Mandragore
37
Demo
  • http://stitch.cs.vu.nl/rp33333/MANDRA-SV-ICE-mandraNewNONE,
    amphibians
  • http://stitch.cs.vu.nl/rp33333/MANDRA-SV-MANDRA-mandraNewNONE,
    wheat

38
Conclusion: Semantic Web can help Cultural
Heritage
  • Representation of collections and associated
    expert vocabularies
  • Semantic integration through correspondences
    between different vocabularies
  • New opportunities for exploiting cultural
    heritage information

39
Thanks!
40
Links
  • Semantic Web at Vrije Universiteit
  • http://www.cs.vu.nl/ai/kr/
  • http://www.cs.vu.nl/bi/
  • SKOS
  • http://www.w3.org/2004/02/skos/
  • Other Cultural Heritage and Semantic Web projects
  • MuseumFinland, http://www.museosuomi.fi/
  • eCulture, http://e-culture.multimedian.nl/