Enhancing social tagging with a knowledge organization system

1
Enhancing social tagging with a knowledge
organization system
  • Brian Matthews
  • STFC

2
Outline
  • Who are STFC?
  • Controlled Vocabulary
  • Social Tagging
  • EnTag
  • Aims
  • Glamorgan/UKOLN/Intute Experiment
  • STFC Experiment
  • SKOS

3
Science and Technology Facilities Council
  • Provide large-scale scientific facilities for UK
    Science
  • particularly in physics and astronomy
  • E-Science Centre at RAL and DL
  • Provides advanced IT development and services to
    the STFC Science Programme
  • Also includes library and institutional
    repository
  • Strong interest in Digital Curation of our
    science data
  • Keep the results alive and available
  • R&D Programme
  • DCC, CASPAR
  • EnTag

4
Controlled Vocabulary
  • Traditional way of providing subject
    classification
  • For shelf-marking
  • For searching
  • For association of resources
  • Several different types used, such as
  • Subject Classification
  • Keyword lists
  • Thesaurus
  • Each has different characteristics

5
HASSET (I)
  • UK Data Archive, Univ of Essex
  • Humanities and Social Science Electronic
    Thesaurus
  • Some 1000s of terms
  • Structure based on British Standard
    5723:1987 / ISO 2788:1986 (Establishment and
    development of monolingual thesauri).
  • preferred terms, broader-narrower relations,
    associated terms
  • http://www.data-archive.ac.uk/search/hassetSearch.asp

6
HASSET (II)

7
HASSET (III)

8
Observations on using controlled vocabularies
  • Precise classification of resources
  • Good for precision and recall
  • Can exploit the hierarchy to modify query
  • Using the broader/narrower/related terms
  • Highly expensive
  • Requires investment in specialist expertise to
    devise the vocabulary
  • Requires investment in specialist expertise to
    classify resources.
  • Hard to maintain currency
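
The point about exploiting the hierarchy to modify a query can be sketched in a few lines. This is a minimal illustration, not how HASSET or any particular search system works: the `Term` structure, the `expand_query` helper and the sample entries are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    """One preferred term in a toy thesaurus (hypothetical structure)."""
    label: str
    broader: set = field(default_factory=set)
    narrower: set = field(default_factory=set)
    related: set = field(default_factory=set)


# Illustrative entries only; these are NOT taken from HASSET.
THESAURUS = {
    "libraries": Term("libraries", narrower={"digital libraries"}),
    "digital libraries": Term(
        "digital libraries",
        broader={"libraries"},
        narrower={"institutional repositories"},
        related={"digital preservation"},
    ),
    "institutional repositories": Term(
        "institutional repositories", broader={"digital libraries"}
    ),
    "digital preservation": Term(
        "digital preservation", related={"digital libraries"}
    ),
}


def expand_query(term, thesaurus, mode="narrower"):
    """Return the query term plus its neighbours along one relation.

    Narrower terms pick up more specific resources, broader terms
    generalise the query, related terms pull in associated topics.
    """
    if mode not in ("broader", "narrower", "related"):
        raise ValueError(f"unknown relation: {mode}")
    entry = thesaurus.get(term)
    if entry is None:
        return [term]  # unknown term: search on it unchanged
    return [term] + sorted(getattr(entry, mode))


print(expand_query("digital libraries", THESAURUS, "narrower"))
print(expand_query("digital libraries", THESAURUS, "related"))
```

A search front end could OR the returned terms together; which relation to follow would depend on whether the user wants a wider or a more specific result set.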

9
Social Tagging
  • The Web 2.0 way of providing search terms
  • People tag resources with free-text terms of
    their own choosing
  • Tags used to associate resources together
  • del.icio.us, flickr
  • Folksonomy
  • the terms a community chooses to use to tag its
    resources.

10
Connotea

11
Connotea: sharing tags

12
Connotea: Tag Cloud

13
Observations on Social Tagging
  • People often use the same tags or keywords (e.g.
    Preservation, Digital Library)
  • this makes things which mean the same thing to
    people easier to find
  • Cheap way of getting a very large number of
    resources marked up and classified
  • Represents the community consensus in some
    sense
  • The Wisdom Of Crowds
  • Has currency as people update
  • Tag clouds of popular tags
  • However, people often use similar but not the
    same tags
  • e.g. Semantic Web, SemanticWeb, SemWeb, SWeb
  • People make mistakes in tags
  • misspellings, using spaces incorrectly.
  • Some tags are more specific than others
  • e.g. controlled vocabulary, thesaurus, HASSET
  • People often associate the same words together
    with particular ideas in images
  • these are captured in clusters
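
The "similar but not the same tags" problem above can be partly handled by simple string clean-up. A rough sketch follows, using the four variants from the slide; the `normalise` and `group_variants` helpers are illustrative assumptions, not part of any tagging system mentioned here.

```python
import re
from collections import defaultdict


def normalise(tag):
    """Lower-case a tag and drop spaces, hyphens and underscores."""
    return re.sub(r"[\s\-_]+", "", tag.strip().lower())


def group_variants(tags):
    """Group raw tags that share the same normalised form."""
    groups = defaultdict(list)
    for tag in tags:
        groups[normalise(tag)].append(tag)
    return dict(groups)


raw = ["Semantic Web", "SemanticWeb", "SemWeb", "SWeb"]
groups = group_variants(raw)
for key, variants in groups.items():
    print(key, variants)
```

"Semantic Web" and "SemanticWeb" collapse to one key, but "SemWeb" and "SWeb" stay separate. Abbreviations of this kind need a mapping to a shared term, which is where a controlled vocabulary could help.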

14
EnTag Project
  • Enhanced tagging for discovery
  • JISC funded project
  • Partners
  • UKOLN
  • University of Glamorgan
  • STFC
  • Intute
  • Non-funded
  • OCLC Office of Research, USA
  • Danish Royal School of Library and Information
    Science
  • Period: 1 Sep 2007 to 30 Sep 2008
  • http://www.ukoln.ac.uk/projects/enhanced-tagging/

15
EnTag Background
  • Controlled vocabularies
  • Improve information retrieval and discovery
  • But costly to index with, especially given the
    volume of digital documents
  • Require subject and classification experts
  • Social tagging
  • Holds the promise of reducing indexing costs
  • Uses terms describing how people see the resource
  • Serendipity
  • But tags are uncontrolled
  • Missed associations
  • Relating different views
  • Highly personal (me, important),
  • Quality and ranking
  • Depth of term


16
EnTag Purpose
  • Investigate the combination of controlled and
    social tagging approaches to support resource
    discovery in repositories and digital collections
  • Aim to investigate
  • whether use of an established controlled
    vocabulary can help move social tagging beyond
    personal bookmarking to aid resource discovery

17
EnTag Objectives
  • Investigate indexing aspects when using only
    social tagging versus when using social tagging
    in combination with a controlled vocabulary
  • In particular, does this lead to
  • Improved tagging
  • Relevance of tags (perspective, aspects,
    specificity, exhaustivity, terminology
    (linguistic level, semantic level, contextual
    level))
  • Consistency
  • Efficiency (time used, user satisfaction)
  • Use (tags selected, clouds consulted, order of
    consultation)
  • Improved retrieval
  • Effectiveness (degree of match between user and
    system terminology)
  • In two different contexts
  • Tagging by readers
  • Tagging by authors

18
Testing Approach
  • Main focus
  • free tagging with no instructions
  • Versus
  • tagging using a combined system and guidance for
    users
  • Two demonstrators
  • Intute digital collection: http://www.intute.ac.uk
  • Major development
  • Tagging by reader
  • DDC
  • STFC repository: http://epubs.cclrc.ac.uk/
  • Complementary development
  • Tagging by author
  • A more qualitative approach

19
Intute

20
Intute demonstrator: searching

21
Intute demonstrator: basic tagging

22
Intute demonstrator: enhanced tagging

23
EnTag Intute user study (II)
  • Test setting
  • 50 graduate students in political science
  • 60 documents, covering up to four topics of
    relevance for the students
  • Data collection
  • Logging time spent, selection patterns
  • Pre- and post-questionnaires

24
EnTag Intute user study (I)
  • Test: comparison of basic and advanced systems
  • Indexing
  • Perspective, specificity, exhaustivity
  • Linguistics (word class, single word/compound,
    spelling, language)
  • Consistency
  • Efficiency (time used, user satisfaction)
  • Use (tags selected, clouds consulted, order of
    consultation)
  • Retrieval efficiency
  • Degree of match between user and system
    terminology
  • user tags, DDC tags, controlled Intute keywords,
    title terms, text terms
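
One simple way to put a number on "degree of match between user and system terminology" is set overlap. The slides do not say which measure the study used, so the Jaccard coefficient below is an assumption, and the sample tags and title terms are invented for illustration.

```python
def jaccard(a, b):
    """Jaccard overlap of two term collections, compared case-insensitively."""
    set_a = {t.strip().lower() for t in a}
    set_b = {t.strip().lower() for t in b}
    if not set_a and not set_b:
        return 0.0  # two empty collections: treat as no match
    return len(set_a & set_b) / len(set_a | set_b)


# Invented sample data, not taken from the Intute study.
user_tags = ["elections", "voting", "UK politics"]
title_terms = ["Elections", "voting", "turnout", "Britain"]

match = jaccard(user_tags, title_terms)
print(f"user tags vs title terms: {match:.2f}")
```

The same function could compare user tags against DDC tags, controlled keywords or text terms, or compare two taggers with each other as a rough consistency measure.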

25
STFC Case Study: EPubs

26
STFC demonstrator

27
STFC Author study
  • A study of authors of papers
  • Smaller number: c. 10-12
  • Regular depositors (> 10 papers each)
  • Subject experts
  • Expect that they would want their papers
    accurately tagged so that they are precisely
    found
  • A more qualitative study

28
Expected Feedback
  • Relative value of tagging vs. controlled terms
  • Does it give more satisfactory (accurate,
    consistent) tags?
  • Does it lead to the consideration of tags they
    would not have thought of?
  • Do they select deeply in the hierarchy?
  • Is this something they would like to see
    supported more, and would use?
  • Is it worth the overhead?
  • How should we use a combination of tagging and
    controlled vocabulary in our system?
  • To be continued...

29
Building a Web of Knowledge
  • Social tagging and controlled vocabulary
    complement each other
  • Tagging: entry level, quick, does the job, but
    error prone, fuzzy
  • Controlled vocabulary: accurate, but slow and
    expensive
  • Use one to leverage the other
  • Use both to build a Web of knowledge
  • The things in the world and their links via
    their subjects
  • Get the users to build the means of organising
    the knowledge

30
http://purl.org/net/aliman

31
SKOS: Simple conceptual relationships

32
Conclusions
  • Controlled vocabulary and Tags complement each
    other
  • Hope to get some interesting evidence over the
    next month as the studies are completed.
  • Web 2.0 world offers the possibility of
    combining these results
  • SKOS: a format for using both tags and controlled
    vocabulary as part of the Web of Linked Data
  • Also use Web 2.0 to build the vocabularies
    themselves.
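
As a rough illustration of the SKOS point, the sketch below writes one concept as a Turtle snippet using the standard SKOS properties `skos:prefLabel`, `skos:altLabel` and `skos:broader`. Recording user tags as alternative labels is an assumption made here for illustration, not something the slides prescribe; the `ex:` namespace, the concept names and the `concept_to_turtle` helper are hypothetical.

```python
PREFIXES = (
    "@prefix skos: <http://www.w3.org/2004/02/skos/core#> .\n"
    "@prefix ex: <http://example.org/vocab/> .\n"
)


def concept_to_turtle(local_name, pref_label, alt_labels=(), broader=()):
    """Serialise one concept as SKOS Turtle.

    Labels are assumed to contain no double quotes; a real serialiser
    would escape them.
    """
    statements = ["a skos:Concept", f'skos:prefLabel "{pref_label}"@en']
    # Assumption: free-text tags are recorded as alternative labels.
    statements += [f'skos:altLabel "{label}"@en' for label in alt_labels]
    statements += [f"skos:broader ex:{parent}" for parent in broader]
    body = " ;\n    ".join(statements)
    return f"ex:{local_name} {body} ."


ttl = concept_to_turtle(
    "semanticWeb",
    "Semantic Web",
    alt_labels=["SemWeb", "SWeb"],
    broader=["worldWideWeb"],
)
print(PREFIXES)
print(ttl)
```

Keeping the preferred term and the tag variants on one concept lets a search on any variant resolve to the same resource set, while `skos:broader` carries the hierarchy.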

33
  • Questions?
  • b.m.matthews_at_rl.ac.uk

34
EnTag: Enhanced tagging for discovery
  • Research collaboration between Glamorgan
    University, UKOLN, INTUTE, CCLRC, OCLC, and DB
  • Financed by JISC Capital Programme
  • Research goal
  • Investigation of the combination and comparison
    of controlled and folksonomy approaches to
    semantic interoperability supporting resource
    discovery in repositories and digital collections
  • Evaluation in two communities of use: at Intute
    (social science), focussing on tagging by readers
    (postgraduate users), and at CCLRC, focussing on
    tagging by authors
  • The two studies are carried out as separate
    projects
  • Intute project uses DDC as controlled vocabulary
  • Evaluation by quantitative and qualitative
    measures

35
Evaluation: Intute focus and objective
  • Context: tagging as part of information
    searching and relevance assessment, tagging for
    recommendation and sharing
  • Hybrid system: investigate whether tagging can
    be improved by a combination of traditional tag
    clouds and clouds of controlled descriptors,
    including interactive tools such as tag
    suggestions, access to browsing of DDC, etc.
  • Improve tagging
  • Relevance of tags (perspective, aspects,
    specificity, exhaustivity, terminology
    (linguistic level, semantic level, contextual
    level))
  • Consistency
  • Efficiency (time used, user satisfaction)
  • Use (tags selected, clouds consulted, order of
    consultation)
  • Improve retrieval
  • Effectiveness (degree of match between user and
    system terminology)
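
The tag suggestions mentioned for the hybrid system could work in many ways. One possible sketch, using fuzzy string matching from the Python standard library, is shown below. This is an assumption about how suggestions might be generated, not a description of the EnTag demonstrator; the descriptor list is invented and does not reproduce DDC captions.

```python
import difflib

# Hypothetical controlled descriptors; not actual DDC captions.
CONTROLLED_TERMS = [
    "political science",
    "elections",
    "political parties",
    "international relations",
]


def suggest_terms(free_tag, controlled_terms, limit=3, cutoff=0.6):
    """Suggest controlled terms whose spelling is close to a free-text tag."""
    by_lower = {term.lower(): term for term in controlled_terms}
    hits = difflib.get_close_matches(
        free_tag.strip().lower(), list(by_lower), n=limit, cutoff=cutoff
    )
    return [by_lower[hit] for hit in hits]


suggestions = suggest_terms("electons", CONTROLLED_TERMS)
print(suggestions)
```

String similarity only catches spelling variants. Suggesting semantically related descriptors would need the vocabulary's own broader, narrower and related links.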