SKOS Ecoterm 2006 Alistair Miles CCLRC Rutherford Appleton Laboratory - PowerPoint PPT Presentation

1
SKOS
Ecoterm 2006
Alistair Miles
CCLRC Rutherford Appleton Laboratory
Semantic Web Best Practices and Deployment

2
Reminder: what is it?
  • Simple Knowledge Organisation System
  • Formal language for representing controlled
    structured vocabularies (thesauri, classification
    schemes, ...)
  • Subject metadata for information retrieval
  • "This document is about romantic love."
  • "This document is about the cure of tuberculosis
    by x-ray in India in the 1950s."
  • Application of RDF
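
As a rough illustration of "application of RDF", the sketch below uses plain Python tuples as stand-ins for RDF triples. The skos: and dcterms: property names come from the published vocabularies; using dcterms:subject to link a document to a concept, the ex: identifiers, and the helper subjects_of are assumptions made for this example only.

```python
# Namespace prefixes. SKOS and DCTERMS are published vocabularies;
# EX is a made-up example namespace.
SKOS = "http://www.w3.org/2004/02/skos/core#"
DCTERMS = "http://purl.org/dc/terms/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
EX = "http://example.org/"

# Each triple is (subject, predicate, object).
triples = {
    (EX + "love", RDF_TYPE, SKOS + "Concept"),
    (EX + "love", SKOS + "prefLabel", "love"),
    (EX + "romanticLove", RDF_TYPE, SKOS + "Concept"),
    (EX + "romanticLove", SKOS + "prefLabel", "romantic love"),
    (EX + "romanticLove", SKOS + "broader", EX + "love"),
    # Subject metadata: "this document is about romantic love".
    (EX + "doc1", DCTERMS + "subject", EX + "romanticLove"),
}


def subjects_of(document, graph):
    """Return the preferred labels of the concepts a document is about."""
    concepts = {o for s, p, o in graph
                if s == document and p == DCTERMS + "subject"}
    return sorted(o for s, p, o in graph
                  if s in concepts and p == SKOS + "prefLabel")


print(subjects_of(EX + "doc1", triples))
```

The vocabulary (concepts, labels, broader links) and the subject metadata live in the same set of triples, which is what lets a retrieval application follow a document's subject into the vocabulary.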

3
Since Ecoterm 2005
  • SKOS Core Guide and SKOS Core Vocabulary
    Specification
  • First Working Draft May 2005
  • Second Working Draft October 2005
  • Minor changes
  • Quick Guide to Publishing a Thesaurus on the
    Semantic Web
  • First Working Draft May 2005

4
What comes next?
  • Life after SWBPD-WG?
  • Plans for next phase of W3C Semantic Web Activity
  • New WG?
  • SKOS W3C Recommendation by end 2007?
  • N.B. Not yet approved!

5
If Rec, then ...
  • What is the scope? What is the fundamental design
    goal?
  • First part of SKOS Rec would be a requirements
    specification.
  • Between now and Sept/Oct 2006: define scope and
    requirements.

6
What I'd like to do here
  • Talk about some of the assumptions behind SKOS.
  • Sketch some ideas on how to define scope and
    requirements for SKOS.
  • Get your feedback.
  • public-esw-thes@w3.org
  • SKOS Requirements for Standardization
  • isegserv.itd.rl.ac.uk/public/skos/press/dc2006/paper.pdf

7
Brief history of scope
  • 2003-04: SWAD-Europe
  • ISO 2788 thesauri
  • Non-standard thesauri via extensibility, e.g.
    GEMET
  • Classification scheme (PACS)
  • Multilingual thesauri
  • Semantic mapping
  • 2004: W3C Glossaries
  • 2005: Discussion re terminologies
  • Subject headings? Gazetteers? Folksonomies?
    Taxonomies?

8
Assumptions: purpose
  • Formal representation of controlled structured
    vocabularies intended for use in information
    retrieval applications.

9
Assumptions: workflow
  • Build a vocabulary
  • Build an index
  • Retrieve

10
Assumptions: components
  • Vocabulary Development Application
  • Something to help build a vocabulary
  • Indexing Application
  • Something to help build an index
  • Retrieval Application
  • Something to help retrieve things
  • SKOS ultimately designed to support
    interoperation of these three key components.
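
The three components can be sketched as three small pieces of code that share one vocabulary. This is only an illustration under simple assumptions: the concept identifiers, the sample data, and the function names build_index and retrieve are all invented here, and a real system would exchange the vocabulary in SKOS rather than as a Python dict.

```python
from collections import defaultdict

# 1. Vocabulary development: concept id -> preferred label (made-up data).
vocabulary = {"c1": "tuberculosis", "c2": "radiotherapy", "c3": "India"}


# 2. Indexing: an indexer assigns concept ids to documents.
def build_index(assignments, vocabulary):
    """Invert document -> concepts into concept -> documents.

    Concepts that are not in the controlled vocabulary are rejected.
    """
    index = defaultdict(set)
    for doc, concepts in assignments.items():
        for concept in concepts:
            if concept not in vocabulary:
                raise KeyError(f"{concept} is not in the controlled vocabulary")
            index[concept].add(doc)
    return index


# 3. Retrieval: return documents indexed with every requested concept.
def retrieve(index, concepts):
    sets = [index.get(c, set()) for c in concepts]
    return sorted(set.intersection(*sets)) if sets else []


index = build_index({"doc1": ["c1", "c2", "c3"], "doc2": ["c1"]}, vocabulary)
print(retrieve(index, ["c1", "c3"]))
```

The only thing the three steps have in common is the set of concept identifiers, which is why a shared representation of the vocabulary is the interoperation point.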

11
Proposed scope
  • SKOS is a formal language for representing
    controlled structured vocabularies intended for
    use within information retrieval applications.
  • SKOS is required to support the interoperation of
    these three key components.
  • I.e. define the requirements for SKOS by
    describing a set of functionalities that must be
    enabled.

12
Other components
  • Vocabulary mapping?
  • Metadata registries?
  • ...

13
Component specs
  • First discuss social and technological context,
    then return to component specs.

14
Context
  • What is the social and technological context in
    which controlled structured vocabs are used?
  • Assume two basic needs
  • Locate something I already know about.
  • Discover something new.
  • N.B. a good location service is not necessarily a
    good discovery service.
  • Cf. Google and del.icio.us

15
Strategies
  • Basic strategies for implementing retrieval
    services
  • Statistical text analysis
  • Analysis of user behaviour
  • Index with controlled vocab
  • Other strategies
  • KOS-assisted text analysis?

16
Cost problem
  • Given that applying controlled structured vocab
    for retrieval involves significant initial and
    ongoing investment,
  • and given that other strategies are cheaper,
  • there is huge pressure to drive down cost and
    increase utility.
  • Requirement for seamless integration.
  • I.e. controlled vocab is seldom used in
    isolation, most applications will combine
    strategies.

17
Use case
  • Search portal
  • Use combined strategies.
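
One way a search portal might combine strategies is to blend a text-based score with a controlled-index score. The sketch below is an assumption-laden toy: the term-overlap text score is a crude stand-in for statistical text analysis, and the function names, the weight parameter, and the sample documents are all invented for illustration.

```python
def text_score(query_terms, text):
    """Fraction of query terms that occur as words in the document text."""
    if not query_terms:
        return 0.0
    words = set(text.lower().split())
    return sum(term in words for term in query_terms) / len(query_terms)


def combined_score(query_terms, query_concepts, doc, weight=0.5):
    """Blend the text score with the share of query concepts in the index entry."""
    hits = len(set(query_concepts) & set(doc["concepts"]))
    concept_score = hits / len(query_concepts) if query_concepts else 0.0
    return (1 - weight) * text_score(query_terms, doc["text"]) + weight * concept_score


docs = {
    "doc1": {"text": "X-ray treatment of tuberculosis",
             "concepts": {"tuberculosis", "radiotherapy"}},
    "doc2": {"text": "A history of tuberculosis sanatoria",
             "concepts": {"hospitals"}},
}

ranked = sorted(
    docs,
    key=lambda d: combined_score(["tuberculosis"], ["tuberculosis"], docs[d]),
    reverse=True,
)
print(ranked)
```

Both documents mention the query term, so the text score alone cannot separate them; the controlled index entry is what moves doc1 ahead.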

18
Component specs
  • Important factors
  • Minimise cost.
  • Decentralisation.
  • Assistance.
  • Maximise utility.
  • Query expansion.
  • Smart ranking.
  • Maximize lifetime.
  • Use the Semantic Web!
  • Situation A. search across many collections,
    where indexers use same controlled vocab.
  • Situation B. search across many collections,
    where indexers use different controlled vocabs.
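
Situation B can be sketched as a search that translates the query concept through a mapping before consulting the second collection. Everything named here is hypothetical: the two indexes, the a: and b: concept identifiers, the a_to_b mapping table, and the function search_both.

```python
# Two collections indexed with different (made-up) vocabularies A and B.
index_a = {"a:tuberculosis": {"docA1", "docA2"}}
index_b = {"b:tb": {"docB1"}, "b:cholera": {"docB2"}}

# Hypothetical mapping from concepts in A to equivalent concepts in B.
a_to_b = {"a:tuberculosis": "b:tb"}


def search_both(concept_a):
    """Search collection A directly and collection B via the mapping."""
    hits = set(index_a.get(concept_a, set()))
    mapped = a_to_b.get(concept_a)
    if mapped is not None:
        hits |= index_b.get(mapped, set())
    return sorted(hits)


print(search_both("a:tuberculosis"))
```

In Situation A the mapping step disappears, because every collection can be queried with the same concept identifier.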

19
Focus areas
  • Decentralisation requires different models of
    collaboration and change.
  • Representing change is a key factor in keeping a
    vocab applicable.
  • Ranking and scoring are well understood for text,
    less so for a controlled index.
  • Theory of query expansion? Field trials of query
    expansion?
  • Strategies for providing assistance?

20
Change and collaboration
  • Continuum of collaboration models: centralized
    <-> decentralised
  • Continuum of change management models: continuous
    <-> discrete
  • Decentralization can reduce cost of development
    and maintenance
  • Change management can ensure continued utility
    and maximize ROI
  • Support for declarative representation of change
    is a requirement for SKOS.
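
To make "declarative representation of change" concrete, here is one possible shape for it: changes are recorded as data, and any past state of the vocabulary is recovered by replaying them. The Change record, its action names, and apply_changes are assumptions made for this sketch, not a format proposed in these slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One declared change to a vocabulary (hypothetical record format)."""
    date: str      # ISO date, so string order is chronological order
    action: str    # "add", "relabel" or "deprecate"
    concept: str
    label: str = ""


def apply_changes(changes, up_to):
    """Replay all changes dated on or before `up_to`; return concept -> label."""
    vocab = {}
    for change in sorted(changes, key=lambda c: c.date):
        if change.date > up_to:
            break
        if change.action in ("add", "relabel"):
            vocab[change.concept] = change.label
        elif change.action == "deprecate":
            vocab.pop(change.concept, None)
    return vocab


log = [
    Change("2005-05-10", "add", "c1", "consumption"),
    Change("2005-10-02", "relabel", "c1", "tuberculosis"),
    Change("2006-03-01", "deprecate", "c1"),
]
print(apply_changes(log, "2005-12-31"))
```

Because the log is plain data, an indexing or retrieval application could inspect it to decide how to treat documents indexed against an older version of the vocabulary.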

21
Semantic Web architecture
  • Exploit Semantic Web facility to distribute and
    merge data.
  • However, best practices for publishing data on the
    Semantic Web need work.
  • See Best Practice Recipes for Publishing RDF
    Vocabularies, W3C Working Draft (search Google for
    "publishing RDF").

22
Semantic Web architecture

23
Direct interaction

24
Information retrieval
  • Indexing and query evaluation are well understood
    for text content.
  • They are less well understood for controlled
    metadata.
  • Query types?
  • Query evaluation strategies, e.g. query
    expansion?
  • Ranking?
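
Query expansion over a controlled vocabulary might look like the sketch below: a query for a concept is widened to include every narrower concept before the index is consulted. The hierarchy, the index, and the function names expand and search are invented for illustration, and expanding downwards only is just one of several possible evaluation strategies.

```python
# Hypothetical concept hierarchy: concept -> directly narrower concepts.
narrower = {
    "disease": ["infectious disease"],
    "infectious disease": ["tuberculosis", "cholera"],
}

# Hypothetical controlled index: concept -> documents indexed with it.
index = {"tuberculosis": {"doc1"}, "cholera": {"doc2"}, "disease": {"doc3"}}


def expand(concept):
    """Return the concept plus all transitively narrower concepts."""
    seen, stack = set(), [concept]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(narrower.get(current, []))
    return seen


def search(concept):
    """Union of documents indexed with the concept or any narrower concept."""
    hits = set()
    for expanded in expand(concept):
        hits |= index.get(expanded, set())
    return sorted(hits)


print(search("infectious disease"))
```

Without expansion, a query for "infectious disease" would return nothing here, since no document is indexed with that exact concept.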

25
Assistance for indexers
  • Provide suggestions
  • Comparison of labels and annotations
  • Machine learning
  • Exploit lexical resources
  • ...
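
The simplest form of assistance, comparison of labels, can be sketched as matching vocabulary labels against the text to be indexed. The label table and suggest_concepts are made up for this example, and the matching is deliberately naive (whole words or phrases, case-insensitive, commas and full stops ignored).

```python
# Made-up vocabulary: concept id -> all labels (preferred and alternative).
labels = {
    "c1": ["tuberculosis", "TB"],
    "c2": ["x-ray", "radiography"],
    "c3": ["romantic love"],
}


def suggest_concepts(text):
    """Suggest concepts whose labels appear in the text as whole words or phrases."""
    cleaned = text.lower().replace(",", " ").replace(".", " ")
    # Normalise whitespace and pad with spaces so labels match on word boundaries.
    padded = " " + " ".join(cleaned.split()) + " "
    return sorted(
        concept
        for concept, names in labels.items()
        if any(f" {name.lower()} " in padded for name in names)
    )


print(suggest_concepts("The cure of TB by x-ray in India in the 1950s."))
```

The suggestions are only candidates for a human indexer to accept or reject; machine learning or lexical resources would replace the label test with something less literal.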

26
Assistance for mappers
  • Provide suggestions
  • Analysis of labels and annotations
  • Exploit lexical resources
  • ...
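
For mappers, analysis of labels could start from string similarity between the labels of two vocabularies. The sketch below uses difflib.SequenceMatcher from the Python standard library; the two vocabularies, the 0.7 threshold, and suggest_mappings are assumptions chosen for illustration.

```python
from difflib import SequenceMatcher

# Two made-up vocabularies: concept id -> preferred label.
vocab_a = {"a1": "Tuberculosis", "a2": "Radiotherapy"}
vocab_b = {"b1": "tuberculosis (TB)", "b2": "Radio therapy", "b3": "Cholera"}


def suggest_mappings(source, target, threshold=0.7):
    """Pair each source concept with its most similar target label, if similar enough."""
    suggestions = []
    for s_id, s_label in source.items():
        best_id, best_ratio = None, 0.0
        for t_id, t_label in target.items():
            ratio = SequenceMatcher(None, s_label.lower(), t_label.lower()).ratio()
            if ratio > best_ratio:
                best_id, best_ratio = t_id, ratio
        if best_id is not None and best_ratio >= threshold:
            suggestions.append((s_id, best_id))
    return suggestions


print(suggest_mappings(vocab_a, vocab_b))
```

Label similarity says nothing about whether two concepts actually mean the same thing, so the output is a list of suggestions for a human mapper to review.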

27
Summary
  • SKOS fundamental requirement: to support
    information retrieval using controlled structured
    vocabularies.
  • Define requirements by describing information
    retrieval functionalities.
  • Divide functionalities into:
  • Presentation styles
  • Query types, e.g. compound queries, coordination
  • Query evaluation strategies
  • Assumptions:
  • Key components
  • Semantic Web interaction
  • Context: pressure to make vocabularies
    profitable
  • Issues: change, assistance, theory