Title: Terminology mapping for subject cross-browsing in distributed information environments
1. Terminology mapping for subject cross-browsing in
distributed information environments
- Libo Si
- PhD student in the Department of Information
Science, Loughborough University
2. Background
- Users have to deal with different information resources that use different schemes.
- Library portal systems, such as MetaLib and SirSi Room.
- These provide a single access point.
3. Background
- Keyword cross-searching
  - Mapping different metadata schemes.
  - Make them interoperable.
- Subject cross-browsing
  - Integrate different KOSs together into a hierarchical tree.
- Issues
  - Interoperability between different knowledge organisation systems
  - Interoperability between metadata standards
4. My Research
- Aim
  - To develop methods to facilitate both subject cross-browsing and cross-searching for library portal systems.
- Objectives
  - To investigate different methods of developing a cross-search service in a library portal product
  - To investigate different methods of making different metadata standards interoperable
  - To investigate different methods of making different knowledge organisation systems interoperable
  - To identify trends in establishing ontologies that facilitate both cross-searching and cross-browsing by subject in the development of library portal systems.
5. Methodology
- Case studies: HILT, Renardus, MetaNet, ABC Ontology, OpenCyc Ontology, ePrint UK, and UMLS.
- Investigate the different methods used by these projects to facilitate subject cross-browsing and cross-searching services.
6. Methods to cross-search (1)
Figure: Federated search (Sadeh 2006)
7. Methods to cross-search (1)
- A cross-search service can create and maintain its own repository of resource metadata (Sadeh 2004).
- Issues
  - Loss of data value
  - Cannot capture the rich knowledge organisation systems used by different online databases, because methods for reusing different metadata schemes and controlled vocabularies are lacking (Hughs and Kamat 2005).
8. Methods to cross-search (2)
- An alternative:
  - In the semantic web community, ontologies can be constructed to maximise the use of both subject classification systems and metadata schemes across different collections.
  - Each participating resource provider can offer its metadata and classification systems to any cross-search service.
9. Mapping semantics of different metadata standards
- Derivation
- Application profile
- Crosswalk (one-to-one, and switch)
- Metadata registry
- Data reuse and integration (RDF)
- Aggregation
(Chan and Zeng 2006)
10. Derivation
- One metadata scheme can be developed based on the principles and structure of an existing one (Chan and Zeng 2006a).
- Example: TEI Lite is derived from the full Text Encoding Initiative (TEI).
11. Application profile
- An application profile can be defined by
combining a selected range of metadata elements
from different metadata schemes for some
application-specific purpose (Heery and Patel
2004).
12. Project using Application Profile
- Five namespaces used by the Renardus application profile (http://renardus.sub.unigoettingen.de/renap/renap.html):
  - Renardus Metadata Element Set (rmes)
  - Renardus Metadata Element Set Qualifiers (rmesq)
  - Dublin Core Metadata Element Set, version 1.1 (dc 1.1)
  - Dublin Core Metadata Element Set Qualifiers (dcterms)
  - DCMI Type Vocabulary (dcmitype)
13. Crosswalk
- A crosswalk is a specification for mapping one metadata standard to another (St. Pierre and LaPlant 1998).
- One-to-one
- Many-to-many (switch scheme)
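The two crosswalk styles can be sketched in a few lines of Python. This is only a minimal illustration: every scheme name and element name below (`scheme_a`, `ttl`, `maker`, and so on) is hypothetical, and real crosswalks also have to handle qualifiers, repeated elements and value conversions. The one-to-one style uses a direct table between two schemes, while the switch style maps each scheme once to a common hub vocabulary and converts through it.

```python
# All scheme and element names below are hypothetical, for illustration only.

# One-to-one crosswalk: a direct table from one scheme to another.
SCHEME_A_TO_B = {"ttl": "name", "auth": "maker"}

# Switch scheme: each scheme is mapped once, to a common hub vocabulary.
TO_SWITCH = {
    "scheme_a": {"ttl": "title", "auth": "creator"},
    "scheme_b": {"name": "title", "maker": "creator"},
    "scheme_c": {"heading": "title", "by": "creator"},
}


def apply_crosswalk(record, crosswalk):
    """Rename a record's elements; elements with no mapping are dropped."""
    return {crosswalk[k]: v for k, v in record.items() if k in crosswalk}


def convert_via_switch(record, source, target, to_switch=TO_SWITCH):
    """Convert source -> hub -> target using only the per-scheme hub tables."""
    hub_record = apply_crosswalk(record, to_switch[source])
    # Invert the target's table so it maps hub elements back to target elements.
    from_switch = {hub: local for local, hub in to_switch[target].items()}
    return apply_crosswalk(hub_record, from_switch)


record_a = {
    "ttl": "Patrologia Latina Database",
    "auth": "Jacques Paul Migne",
    "shelf": "X1",  # has no equivalent in the other schemes
}
direct = apply_crosswalk(record_a, SCHEME_A_TO_B)
switched = convert_via_switch(record_a, "scheme_a", "scheme_c")
```

With n schemes, the switch approach needs n tables, whereas direct crosswalks need one table per ordered pair of schemes. The `shelf` element is silently dropped in both cases, which is one way data value can be lost during mapping.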
14. Metadata scheme registry
- A metadata registry refers to an application that provides services based on information about 'metadata terms' and about related resources (Johnston 2005).
- Example: the CORES registry (http://www.cores-eu.net/registry/) lists more than 40 metadata schemes, and supports searching and browsing by metadata scheme developer, maintenance agency, element sets, elements, encoding schemes, application profiles and element usages.
15. Data reuse and integration
- This refers to describing information objects by using different elements from different metadata schemes or application profiles (Chan and Zeng 2006b).
- The Resource Description Framework (RDF) provides a basic platform for integrating different metadata schemes to describe web resources (Heery and Patel 2004).
- RDF can facilitate the use of different application profiles.
16. An RDF example
<?xml version="1.0"?>
<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:dc="http://purl.org/dc/elements/1.0/"
    xmlns:bc="http://www.schemas-forum.org/registry/schemas/BIBLINK/1.0/bc-ap">
  <rdf:Description about="urn:isbn:0-89887-113-1">
    <dc:title>Patrologia Latina Database</dc:title>
    <dc:creator>Jacques Paul Migne</dc:creator>
    <dc:date>1993</dc:date>
    <dc:language>la</dc:language>
    <bc:extent>2 computer laser optical disks; 4 3/4 in</bc:extent>
    <bc:systemRequirements>Multimedia PC 486x or higher, 8mb memory, CD-ROM drive, sound card, SVGA 256-colour monitor, Windows 95 or Windows 3.1</bc:systemRequirements>
    <dc:subject rdf:value="Christian literature, Early" bc:subjectScheme="LCSH" />
    <dc:identifier rdf:value="isbn:0-89887-113-1" bc:identifierScheme="URN" />
    <bc:placePublication>Cambridge</bc:placePublication>
    <dc:publisher>Chadwyck-Healey</dc:publisher>
  </rdf:Description>
</rdf:RDF>
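To show how elements from several schemes sit side by side in one RDF description, the following sketch parses a shortened copy of the record with Python's standard `xml.etree.ElementTree` module and groups the element names by namespace. The function name `elements_by_scheme` and the shortened record are assumptions made for this illustration; a production system would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.0/"
BC_NS = "http://www.schemas-forum.org/registry/schemas/BIBLINK/1.0/bc-ap"

# Prefixes used only for reporting; the parser works with full namespace URIs.
PREFIXES = {DC_NS: "dc", BC_NS: "bc"}

# Shortened copy of the record from the slide.
RDF_XML = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}" xmlns:bc="{BC_NS}">
  <rdf:Description about="urn:isbn:0-89887-113-1">
    <dc:title>Patrologia Latina Database</dc:title>
    <dc:creator>Jacques Paul Migne</dc:creator>
    <bc:extent>2 computer laser optical disks</bc:extent>
    <dc:publisher>Chadwyck-Healey</dc:publisher>
  </rdf:Description>
</rdf:RDF>"""


def elements_by_scheme(xml_text):
    """Group the child elements of the first rdf:Description by namespace."""
    root = ET.fromstring(xml_text)
    description = root.find(f"{{{RDF_NS}}}Description")
    grouped = defaultdict(list)
    for child in description:
        # ElementTree stores qualified names as "{namespace-uri}localname".
        uri, _, local = child.tag[1:].partition("}")
        grouped[PREFIXES.get(uri, uri)].append(local)
    return dict(grouped)


schemes = elements_by_scheme(RDF_XML)
```

Here `schemes` maps each prefix to the element names drawn from that scheme, which makes the mixing of Dublin Core and BIBLINK elements in a single description explicit.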
17. Aggregation
- This refers to
  - Employing a central knowledge base to gather metadata records from different online databases that use different metadata standards
  - Converting heterogeneous metadata records into a consistent form
  - Developing a range of enhancement services to enrich the gathered metadata records.
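The three steps above (gather, convert, enrich) can be sketched as a small pipeline. This is a toy illustration under stated assumptions, not a description of how any particular aggregator works: the source names, scheme names, element names and the language-code table are all hypothetical.

```python
# Hypothetical sources, each exposing records in its own (invented) scheme.
SOURCES = {
    "repo_a": {
        "scheme": "scheme_a",
        "records": [{"ttl": "Thesaurus mapping", "lang": "en"}],
    },
    "repo_b": {
        "scheme": "scheme_b",
        "records": [{"name": "Ontologies in libraries", "language": "English"}],
    },
}

# Hypothetical per-scheme tables mapping local elements to the common form.
TO_COMMON = {
    "scheme_a": {"ttl": "title", "lang": "language"},
    "scheme_b": {"name": "title", "language": "language"},
}

# Hypothetical lookup used by the enhancement step.
LANGUAGE_CODES = {"en": "en", "english": "en"}


def harvest(sources):
    """Step 1: gather records from every source into one stream."""
    for source_id, source in sources.items():
        for record in source["records"]:
            yield source_id, source["scheme"], record


def normalise(record, scheme):
    """Step 2: convert a record into the common form."""
    table = TO_COMMON[scheme]
    return {table[k]: v for k, v in record.items() if k in table}


def enrich(record, source_id):
    """Step 3: add provenance and normalise the language value."""
    enriched = dict(record, source=source_id)
    language = enriched.get("language", "").lower()
    enriched["language"] = LANGUAGE_CODES.get(language, language)
    return enriched


def aggregate(sources):
    """Run the three steps and return the central collection of records."""
    return [
        enrich(normalise(record, scheme), source_id)
        for source_id, scheme, record in harvest(sources)
    ]


central_store = aggregate(SOURCES)
```

Keeping the three steps as separate functions means further enhancement services could be added to `enrich` without touching harvesting or conversion.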
18. Project using Aggregation: ePrint UK
(Powell 2001)
19. Mapping semantics of different KOSs
- Derivation
- Direct mapping
- Switch language
- Merging
- Co-occurrence mapping
- Satellite and leaf node linking
20. Derivation
- A subject-specific vocabulary is developed based on widely used general vocabularies.
- Example: MeSH was developed based on the structure of LCSH.
21. Direct mapping
(Chan and Zeng 2004)
22. Switch language
(Mai 2003)
23. Projects using a switch language
- The HILT Project
  - Uses DDC as a switch language to navigate users to relevant information.
- The Renardus Project
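The switch-language idea can be sketched as a lookup through DDC class numbers. This is a minimal illustration, not the HILT or Renardus implementation: the vocabularies `kos_a` and `kos_b` and their terms are hypothetical, and only the two DDC classes used (610 for medicine and health, 020 for library and information sciences) are real.

```python
from collections import defaultdict

# Hypothetical mappings from (vocabulary, term) to a DDC class number.
TERM_TO_DDC = {
    ("kos_a", "Medicine"): "610",
    ("kos_b", "Medical sciences"): "610",
    ("kos_a", "Library science"): "020",
    ("kos_b", "Librarianship"): "020",
}


def build_ddc_index(term_to_ddc):
    """Invert the mappings: DDC class -> list of (vocabulary, term)."""
    index = defaultdict(list)
    for (kos, term), ddc in term_to_ddc.items():
        index[ddc].append((kos, term))
    return index


def switch_term(term, source_kos, target_kos, term_to_ddc=TERM_TO_DDC):
    """Translate a term between vocabularies by going through DDC."""
    ddc = term_to_ddc.get((source_kos, term))
    if ddc is None:
        return []  # the term has not been mapped to the switch language
    index = build_ddc_index(term_to_ddc)
    return [t for kos, t in index[ddc] if kos == target_kos]


equivalents = switch_term("Medicine", "kos_a", "kos_b")
```

Because every vocabulary is mapped only to DDC, the same index also gives a single subject tree under which resources indexed with either vocabulary can be browsed.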
24. Co-occurrence mapping
(Zeng and Chan 2004)
25. Merging
- Different vocabularies in the same domain can be merged into a super-thesaurus.
- Example: the Unified Medical Language System (UMLS) merges concepts from about fifty medical controlled vocabularies into a metathesaurus.
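As a rough illustration of merging, the sketch below groups terms from several vocabularies under one shared entry when their labels match after simple normalisation. This is a deliberate simplification and an assumption of this example: it is not how UMLS builds its metathesaurus, which involves much more than string matching, and the vocabulary names and terms are hypothetical.

```python
def normalise_label(label):
    """Lower-case a label and collapse runs of whitespace."""
    return " ".join(label.lower().split())


def merge_vocabularies(vocabularies):
    """Merge {vocabulary: [terms]} into {concept key: {vocabulary: term}}.

    Terms whose normalised labels are identical are treated as one concept.
    """
    merged = {}
    for vocabulary, terms in vocabularies.items():
        for term in terms:
            entry = merged.setdefault(normalise_label(term), {})
            entry[vocabulary] = term
    return merged


# Hypothetical source vocabularies.
VOCABULARIES = {
    "vocab_a": ["Heart Attack", "Asthma"],
    "vocab_b": ["heart  attack", "Influenza"],
}

super_thesaurus = merge_vocabularies(VOCABULARIES)
```

Each merged entry keeps the original term from every contributing vocabulary, so a search on one vocabulary's wording can be expanded to the others.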
26. Satellite and leaf node linking
- Editors can select and adapt parts of a general vocabulary as a subject-specific vocabulary to meet particular requirements.
- Example: a number of domain-specific controlled vocabularies have been developed by selecting parts of LCSH.
27. Ontology mapping for subject cross-search and browsing
- Current efforts within the digital library community include developing ways to map different metadata schemes, and ways to map different knowledge organisation systems.
- In the semantic web community, the ways to improve semantic interoperability include ontology construction and ontology mapping.
- There is much in common between the methods used by these two communities.
28. What is an ontology?
- Definition: an ontology is a formal (explicit) specification of a conceptualization shared by a community of people (Studer 1998).
- The difference between an ontology and other knowledge organisation systems.
29. Types of ontologies in digital libraries
- Upper level ontology
- Domain ontology
30. Upper level ontology
- Refers to a common vocabulary including the basic concepts, such as things, space, events, time, behaviour, etc., and the relations between them (Gomez-Perez and Benjamins 1999; Ding and Foo 2004a).
- Examples: OpenCyc, WordNet, and the ABC Ontology.
31. ABC Ontology
- It provides the notional basis for developing domain-, role-, or community-specific ontologies, and it incorporates a number of basic entities and relationships common across other metadata ontologies, including time and object modification, agency, places, concepts, and tangible objects. Communities wishing to build their own metadata ontologies and models may then extend the ABC entities and relationships as needed (Lagoze and Hunter 2001).
- The ABC Ontology is designed to incorporate basic entities and relationships common across different metadata standards, and to provide a basis for creating metadata ontologies into which different metadata schemes can be mapped.
32. OpenCyc Ontology
- This is a universal ontology, in which "every concept one can imagine can be correctly linked into the OpenCyc Ontology in appropriate places, no matter how general or specific, no matter how arcane or prosaic, no matter what the context (nationality, age, native language, epoch, childhood experiences, current goals, etc.) of the imaginer" (Stubkjar 2001).
- It provides a framework for further establishing custom and domain-specific ontologies.
33. WordNet Ontology
- This is a manually constructed online lexical reference system (Noy and Hafner 1997). In WordNet, lexical objects are organised systematically, with a basic distinction between nouns, verbs, adjectives, and adverbs. Nouns are grouped by concept, and concepts are organised hierarchically. A verb is related to a concept's function, and an adjective is related to a concept's property.
- The WordNet ontology is often applied to offer a taxonomic tree, and also to support natural language processing.
34. Domain ontology
- A domain-specific vocabulary that encompasses the concepts in a given domain (such as medicine, agriculture, computer science, etc.) and their relationships (Gomez-Perez and Benjamins 1999; Uschold and Gruninger 1996; Guarino 1997).
- In some cases, traditional KOSs can potentially be integrated to form the basis for a domain ontology.
35. Use of ontologies
- MetaNet
  - Different metadata elements from different metadata schemes have been mapped to the ABC ontology.
- Mappings between e-learning object metadata and the OpenCyc ontology
- Mappings between MeSH and the OpenCyc ontology
- Mappings between different subject classification systems and OpenCyc
36. An Ontology Library System
- An ontology library system is a library system that offers various functions for managing, adapting and standardizing groups of different ontologies (Ding and Fensel 2001).
- It supports searching and browsing across different ontologies.
37. Conclusion (1)
- A library portal system should be able to maximise the reuse of existing library resources, such as metadata schemes and knowledge organisation systems.
- In order to improve semantic interoperability, each resource provider is expected to publish its metadata schemes and knowledge organisation systems in a semantic-web-enabled format, so that these resources can be reused.
  - RDF, XML
38. Conclusion (2)
- In order to facilitate cross-searching
  - Develop or apply a common metadata scheme, into which different metadata elements from different metadata schemes can be mapped.
  - Different metadata schemes can also be mapped into an upper level ontology.
  - These two approaches can be developed together.
39. Conclusion (3)
- To facilitate cross-browsing by subject
  - Different knowledge organisation systems can be mapped into DDC, which serves as a subject navigation tree.
  - In order to support more powerful computational semantics, all concepts, intra-relationships, and inter-relationships in different knowledge organisation systems can be mapped into an upper level ontology.
40. Conclusion (4)
- A variety of mappings have been developed.
- Each type of mapping is designed to offer specific capabilities for improving semantic interoperability, along with limited search or browsing functions.
- A combination of the different types of mapping is required.
41. Thank you and questions!
Libo Si l.si_at_lboro.ac.uk