Controlled Vocabularies in TELPlus

1
Controlled Vocabularies in TELPlus
  • Antoine ISAAC
  • Vrije Universiteit Amsterdam
  • EDLProject Workshop
  • 22-23 November 2007

2
Agenda
  • TELPlus Context
  • Improving subject access
  • 3 sub-tasks
  • Services for TEL

3
TELPlus Context
  • Started October 2007
  • Running 27 months
  • Content WPs
  • OCRing previously digitised material
  • Improving the usability of TEL through OAI-PMH
    compliance
  • Improving Access
  • Integrating services with TEL portal
  • User personalisation services
  • Extending TEL to Bulgaria and Romania

4
WP3: Improving Access
  • Task 1: Indexing for usability
  • Review/test state-of-the-art semantic search
    engines
  • On content of documents
  • Task 2: Improving subject access
  • Task 3: FRBR aggregation, search and browsing
  • Create/exploit FRBR metadata repositories
  • Task 4: Focus on users
  • Focus groups on prototypes

5
WP 3 Task 2: Improving Subject Access
  • Improving subject access via semantic alignment
    between subjects
  • Search through collections
  • Using metadata
  • In a controlled setting
  • Paving the way for enhanced usages
  • Advanced treatments mentioned in TELplus need
    conceptual structures and links between these
    structures
  • E.g. clustering

6
WP 3 Task 2: Improving Subject Access
  • Improving subject access via semantic alignment
    between subjects
  • Reference: MACS project
  • Manually-built semantic equivalences between
    Rameau, SWD and LCSH headings

7
MACS: Querying Collections
8
MACS: Query Reformulation Options
9
WP 3 Task 2: Improving Subject Access
  • Improving subject access via semantic alignment
    between subjects
  • Reference: MACS project
  • Manual equivalences between Rameau, SWD, LCSH
    headings
  • Here: an experiment on deploying automatic
    alignment techniques
  • Determining possible strategies
  • Assessing feasibility and usefulness
  • MACS context

10
WP3.2 Sub-tasks
  • 3.2.1. Converting the subjects to standard
    representation language
  • Semantic web format (SKOS)
  • 3.2.2. Aligning the vocabularies
  • Semantic correspondences between subjects
  • 3.2.3. Deploying the alignment knowledge obtained
    into TEL framework
  • E.g. using links to reformulate queries from one
    subject list to the other

11
Converting subjects to standard representation
language
  • Goal: solving syntactic heterogeneity between
    vocabularies
  • Enabling the use of standard tools
  • E.g. for query (re)formulation
  • Paving the way for dealing with semantic
    heterogeneity
  • Definitions of concepts expressed according to a
    common model

12
Converting subjects to standard representation
language
  • Approach: Semantic Web and SKOS
  • Semantic Web
  • Knowledge objects as web resources (URIs)
  • Description by linking resources (RDF)
  • Description using shared formal vocabularies
    (ontologies)
  • SKOS
  • A standard Semantic Web model (ontology)
  • For knowledge organization systems (thesauri,
    subject heading lists)

13
SKOS Example
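  • http://www.iconclass.nl/ rdf:type skos:ConceptScheme
  • http://www.iconclass.nl/s_11F rdf:type skos:Concept
  • http://www.iconclass.nl/s_11F skos:inScheme http://www.iconclass.nl/
  • http://www.iconclass.nl/s_11F skos:prefLabel "the Virgin Mary"@en
  • http://www.iconclass.nl/s_11F skos:prefLabel "la Vierge Marie"@fr
  • http://www.iconclass.nl/s_11F skos:broader http://www.iconclass.nl/s_11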
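
The graph on this slide can be written down as plain subject-predicate-object triples. The following is a minimal sketch using only the Python standard library. The RDF and SKOS namespace URIs are the standard ones; the helper names (`literal`, `to_ntriples`) are illustrative assumptions and not part of the presentation, and a real project would more likely rely on an RDF library.

```python
# Standard RDF and SKOS namespaces; all helper names below are illustrative.
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
SKOS = "http://www.w3.org/2004/02/skos/core#"
ICONCLASS = "http://www.iconclass.nl/"


def literal(text, lang):
    """Represent a language-tagged literal as a (text, lang) pair."""
    return (text, lang)


# The six statements shown on the slide.
triples = [
    (ICONCLASS, RDF + "type", SKOS + "ConceptScheme"),
    (ICONCLASS + "s_11F", RDF + "type", SKOS + "Concept"),
    (ICONCLASS + "s_11F", SKOS + "inScheme", ICONCLASS),
    (ICONCLASS + "s_11F", SKOS + "prefLabel", literal("the Virgin Mary", "en")),
    (ICONCLASS + "s_11F", SKOS + "prefLabel", literal("la Vierge Marie", "fr")),
    (ICONCLASS + "s_11F", SKOS + "broader", ICONCLASS + "s_11"),
]


def to_ntriples(statements):
    """Serialise triples in an N-Triples-like syntax.

    Simplification: literal text is not escaped, so this only works for
    labels without quotes or backslashes.
    """
    lines = []
    for subject, predicate, obj in statements:
        if isinstance(obj, tuple):
            rendered = '"%s"@%s' % obj
        else:
            rendered = "<%s>" % obj
        lines.append("<%s> <%s> %s ." % (subject, predicate, rendered))
    return "\n".join(lines)


print(to_ntriples(triples))
```
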
14
Converting subjects to standard representation
language - Process
  • Getting processable versions from owners
  • E.g. XML
  • Analyzing the models
  • Converting to SKOS

15
WP3.2 Sub-tasks
  • 3.2.1. Converting the subjects to standard
    representation language
  • Semantic web format (SKOS)
  • 3.2.2. Aligning the vocabularies
  • Semantic correspondences between subjects
  • 3.2.3. Deploying the alignment knowledge obtained
    into TEL framework
  • E.g. using links to reformulate queries from one
    subject list to the other

16
Vocabulary Alignment
  • Specifying required alignment format (links)
  • Type of mapping links: equivalence, broader
  • Cardinality: one-to-one, one-to-many
  • Taking application context (TEL) into account
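
One way to make the alignment format concrete is a small record type for mapping links plus a check of the resulting cardinality. This is a sketch under assumptions: the class name `Mapping`, the concept identifiers and the restriction to the two relation types named on the slide are illustrative choices, not a format defined by the project.

```python
from collections import defaultdict
from dataclasses import dataclass

RELATIONS = {"equivalence", "broader"}  # the two link types named on the slide


@dataclass(frozen=True)
class Mapping:
    source: str    # concept in the first subject heading list
    target: str    # concept in the second subject heading list
    relation: str  # one of RELATIONS

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError("unknown relation: %s" % self.relation)


def cardinality(mappings):
    """Report 'one-to-many' if any source concept has several targets.

    Only the source side is inspected; several sources sharing one target
    are still reported as 'one-to-one' in this simplified check.
    """
    targets_per_source = defaultdict(set)
    for mapping in mappings:
        targets_per_source[mapping.source].add(mapping.target)
    if any(len(targets) > 1 for targets in targets_per_source.values()):
        return "one-to-many"
    return "one-to-one"


links = [Mapping("shl1:a", "shl2:x", "equivalence")]
print(cardinality(links))
links.append(Mapping("shl1:a", "shl2:y", "broader"))
print(cardinality(links))
```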

17
Vocabulary Alignment
  • Specifying required alignment format (links)
  • Selecting (and running) alignment techniques/tools
  • Inspired by semantic web approaches

18
Vocabulary Alignment Techniques
  • Similar to ontology alignment problem
  • Existing approaches for (semi-) automatic
    ontology alignment
  • Using techniques from linguistics, computer
    science, statistics
  • Problem: performance does not allow 100% automatic
    alignment
  • Problem: multilingual case
  • Some techniques cannot be used

19
Technique: Using Background Knowledge
  • Using a shared conceptual reference to find links

[Diagram labels: Publication, Calendar, SHL 1, SHL 2]
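
A rough sketch of how a shared conceptual reference might be used: each subject heading is first anchored to a concept in a background resource, and links are then derived from the relations inside that resource. The background hierarchy, the anchors and the assumed relation between "calendar" and "publication" are invented for illustration; the slide does not specify them.

```python
def ancestors(concept, parent_of):
    """Return all broader concepts of `concept` in the background hierarchy."""
    seen = []
    while concept in parent_of and parent_of[concept] not in seen:
        concept = parent_of[concept]
        seen.append(concept)
    return seen


def align(anchors1, anchors2, parent_of):
    """Derive links between two lists from their anchors in the background resource.

    A link (c1, "broader", c2) means that c2 is broader than c1.
    """
    links = []
    for c1, a1 in anchors1.items():
        for c2, a2 in anchors2.items():
            if a1 == a2:
                links.append((c1, "equivalence", c2))
            elif a2 in ancestors(a1, parent_of):
                links.append((c1, "broader", c2))
    return links


# Hypothetical background knowledge: a calendar is a kind of publication.
background = {"calendar": "publication"}
shl1_anchors = {"shl1:Calendar": "calendar"}
shl2_anchors = {"shl2:Publication": "publication"}

background_links = align(shl1_anchors, shl2_anchors, background)
print(background_links)
```
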
20
Technique: Statistical Alignment
  • Object information (book indexing)

[Diagram labels: Dutch Literature, Dutch, SHL 1, SHL 2, dually-indexed books]
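
Statistical alignment on dually-indexed books can be illustrated by comparing the sets of books that two subjects index. The sketch below uses the Jaccard overlap as the similarity measure and a fixed threshold; both are assumptions chosen for simplicity, since the slide does not name a specific measure, and the book identifiers are made up.

```python
from itertools import product


def jaccard(books_a, books_b):
    """Share of books indexed with both subjects among books indexed with either."""
    union = books_a | books_b
    return len(books_a & books_b) / len(union) if union else 0.0


def statistical_alignment(index1, index2, threshold=0.5):
    """index1 and index2 map each subject to the set of dually-indexed books it indexes."""
    links = []
    for (c1, books1), (c2, books2) in product(index1.items(), index2.items()):
        score = jaccard(books1, books2)
        if score >= threshold:
            links.append((c1, c2, score))
    # Strongest candidates first.
    return sorted(links, key=lambda link: -link[2])


# Hypothetical indexing data for books present in both collections.
shl1_index = {"shl1:Dutch literature": {1, 2, 3}}
shl2_index = {"shl2:Dutch": {1, 2, 3, 4}, "shl2:Calendar": {5}}

statistical_links = statistical_alignment(shl1_index, shl2_index)
print(statistical_links)
```
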
21
Vocabulary Alignment
  • Specifying required alignment format (links)
  • Selection (and running) of tool/method
  • Evaluation (and cleaning)
  • Considering application

22
Evaluation of Alignments
  • MACS has produced mappings!
  • Possible gold standard
  • But has MACS produced all mappings?
  • Which proportion of the SHLs is covered?
  • Taking into account all indexing strings?
  • Are MACS mappings the only interesting ones?
  • Serendipity mappings
  • Concepts that are not equivalent but could bring
    useful results when added to queries
  • Compensating for indexing variability
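
If the MACS mappings are taken as a gold standard, an automatic alignment can be scored with precision and recall, and the coverage question above becomes a simple ratio. This is a sketch with made-up identifiers. As the slide points out, a candidate mapping that is missing from the gold standard is not necessarily wrong, so these figures are at best a lower bound on usefulness.

```python
def evaluate(candidate, gold):
    """Precision and recall of candidate (source, target) pairs against the gold pairs."""
    candidate, gold = set(candidate), set(gold)
    correct = candidate & gold
    precision = len(correct) / len(candidate) if candidate else 0.0
    recall = len(correct) / len(gold) if gold else 0.0
    return precision, recall


def coverage(gold, vocabulary):
    """Proportion of the vocabulary's concepts that are the source of a gold mapping."""
    vocabulary = set(vocabulary)
    mapped = {source for source, _ in gold}
    return len(mapped & vocabulary) / len(vocabulary) if vocabulary else 0.0


# Hypothetical data: two manual mappings, two automatic candidates.
gold_mappings = {("a", "x"), ("b", "y")}
automatic_mappings = {("a", "x"), ("c", "z")}

print(evaluate(automatic_mappings, gold_mappings))
print(coverage(gold_mappings, ["a", "b", "c", "d"]))
```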

23
Evaluation of Alignments
  • Several scenarios for using and evaluating
    alignments
  • Concept-based search
  • Re-indexing
  • Integration of one SHL into the other
  • SHL Merging
  • Free-text search
  • Navigation

24
Evaluation of Alignments
  • Several scenarios for using and evaluating
    alignments
  • Concept-based search
  • Retrieving books indexed by SHL1 using SHL2
    concepts
  • Re-indexing
  • Integration of one SHL into the other
  • SHL Merging
  • Free-text search
  • Matching user search terms to both SHL1 and SHL2
    concepts
  • Navigation
  • Browsing several collections using one SHL
    structure
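
The concept-based search scenario, retrieving books indexed with SHL1 from a query expressed in SHL2 concepts, can be sketched as a query reformulation step followed by a lookup. The mapping table, catalogue and identifiers below are invented for illustration.

```python
def reformulate(query_concepts, mappings):
    """Replace each SHL2 query concept by the SHL1 concepts it is mapped to."""
    expanded = set()
    for concept in query_concepts:
        expanded.update(mappings.get(concept, ()))
    return expanded


def search(query_concepts, mappings, catalogue):
    """Return the books whose SHL1 subjects match the reformulated query."""
    targets = reformulate(query_concepts, mappings)
    return sorted(book for book, subjects in catalogue.items() if subjects & targets)


# Hypothetical alignment (SHL2 -> SHL1) and a catalogue indexed with SHL1 only.
shl2_to_shl1 = {"shl2:Dutch": {"shl1:Dutch literature"}}
shl1_catalogue = {
    "book1": {"shl1:Dutch literature"},
    "book2": {"shl1:Calendars"},
}

search_hits = search(["shl2:Dutch"], shl2_to_shl1, shl1_catalogue)
print(search_hits)
```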

25
Evaluation of Alignments
  • Several settings for a single scenario
  • Fully automatic reformulation vs assisted
    reformulation (candidates)
  • Different evaluation measures
  • Good mappings vs acceptable ones
  • Number of candidates for reformulation
  • Semantic closeness to original query
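
The difference between the two settings can be illustrated with scored mapping candidates: fully automatic reformulation keeps only the best candidate, while assisted reformulation offers the user a short ranked list. The scores, the `max_candidates` parameter and the identifiers are assumptions made for this sketch.

```python
def reformulation_candidates(concept, scored_links, mode="assisted", max_candidates=3):
    """Return target concepts for `concept`, best score first.

    mode="automatic": only the top candidate is used, without user input.
    mode="assisted":  up to `max_candidates` are proposed to the user.
    """
    ranked = sorted(scored_links.get(concept, []), key=lambda pair: pair[1], reverse=True)
    limit = 1 if mode == "automatic" else max_candidates
    return [target for target, _score in ranked[:limit]]


# Hypothetical scored candidates for one SHL2 concept.
scored = {
    "shl2:Dutch": [("shl1:Dutch language", 0.4), ("shl1:Dutch literature", 0.75)],
}

print(reformulation_candidates("shl2:Dutch", scored, mode="automatic"))
print(reformulation_candidates("shl2:Dutch", scored, mode="assisted"))
```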

26
Vocabulary Alignment
  • Specifying required alignment format (links)
  • Selection (and running) of tool/method
  • Evaluation (and cleaning)
  • Assessment of the approach
  • Efforts required, quality, extendibility

27
WP3.2 Sub-tasks
  • 3.2.1. Converting the subjects to standard
    representation language
  • Semantic web format (SKOS)
  • 3.2.2. Aligning the vocabularies
  • Semantic correspondences between subjects
  • 3.2.3. Deploying the alignment knowledge obtained
    into TEL framework
  • E.g. using links to reformulate queries from one
    subject list to the other

28
Deploying the alignment knowledge obtained into
TEL framework
  • Observing integration of MACS data into TEL
  • Conceptual input for alignment requirements
  • Integration of the obtained alignment in TEL
  • Assessment of the alignment integration
  • Technical aspects, usage aspects

29
Reminder
  • Alignment is a difficult problem
  • Application-specific alignment pretty much
    unexplored in Semantic Web research
  • More a feasibility study than a complete solution
    to the problem
  • Practical goal: investigate how automatic
    techniques could help MACS-like initiatives
  • Manual mapping is labour-intensive

30
Agenda
  • TELPlus Context
  • Improving subject access
  • 3 sub-tasks
  • Services for TEL

31
WP4: Integrating services with the European
Library portal
  • Theo van Veen (KB)
  • Tasks
  • Identifying services that are going to give the
    user the greatest return
  • Creating new services
  • Integrating services within TEL

32
WP4: Some Services Mentioned
  • Preliminary inventory: no official commitment!
  • Services based on controlled vocabularies
  • Thesaurus and name authority service
  • Providing terms linked to query terms
  • Semantic enrichment service
  • Users can annotate search results with terms
  • Distance between terms and related terms

33
WP4: Some Services Mentioned
  • Preliminary inventory: no official commitment!
  • Services based on controlled vocabularies
  • Thesaurus and name authority service
  • Semantic enrichment service
  • Distance between terms and related terms
  • Adding more value from controlled vocabularies
    and alignments between them

34
Thanks!
35
WP 3 Task 2: Improving Subject Access
  • Participants
  • Vrije Universiteit Amsterdam
  • STITCH project team
  • National Library of the Netherlands
  • TEL Office
  • French National Library
  • German National Library