An introduction to metadata. Metadata: from soup to nuts, NOF-digitise programme seminar, London, 5 February 2002

1
An introduction to metadata
Metadata: from soup to nuts, NOF-digitise programme seminar, London,
5 February 2002
Email: p.johnston@ukoln.ac.uk URL: http://www.ukoln.ac.uk/
  • Pete Johnston
  • UKOLN, University of Bath
  • Bath, BA2 7AY

UKOLN is supported by
2
An introduction to metadata
  • What is metadata? What is it used for?
  • Metadata for resource discovery: introducing the
    Dublin Core
  • How is metadata created?
  • How is metadata shared?
  • Resource discovery metadata in the NOF-digitise
    programme

3
What is metadata? What is it used for?
4
What is metadata?
  • Metadata creation: the art formerly known as
    cataloguing?
  • delivery of resources by resource creators/owners
    rather than (or as well as) by intermediary
  • remote access to resources for all (potentially)
  • emphasis on customer/user
  • information overload
  • quantity vs. quality?
  • the Google effect

5
What is metadata? (2)
  • "Data associated with objects which relieves
    their potential users of having to have full
    advance knowledge of their existence or
    characteristics. A user might be a program or a
    person." (Dempsey and Heery, 1998)
  • "Machine understandable information about web
    resources or other things." (Berners-Lee, 1997)
  • Structured data about resources that can be used
    to help support a wide range of operations

6
Who/what uses metadata?
  • Human agents
  • owner managing resources
  • researcher seeking resources
  • third party services
  • Software agents
  • aggregators
  • portals presenting landscape to user
  • brokers performing query tasks on behalf of user

7
What resources, objects, things?
  • HTML documents
  • digital images
  • databases
  • books
  • museum objects
  • archival records
  • metadata records
  • collections
  • services
  • physical places
  • people
  • abstract 'works'
  • concepts
  • events

8
What operations?
  • Different flavours of metadata serve different
    purposes
  • simple, generic vs. rich, specific
  • published widely vs. shared within community vs.
    used by resource owner/manager
  • Owner / manager / provider wants to
  • establish control of resources
  • administer/manage resources (through time)
  • disclose/promote resources
  • enable and control access/use of resources
  • contextualise resources

9
What operations? (2)
  • End user wants to
  • find
  • identify
  • select
  • obtain/use
  • interpret
  • Third party service may want to
  • disclose/promote
  • enable and control access/use
  • annotate
  • re-contextualise

10
What information required in metadata?
  • No one-size-fits-all solution
  • Depends on operation which metadata supports
  • Refer to standards
  • Benefit of others' experience, expertise
  • Provide basis for good practice
  • Reflect consensus, so facilitate exchange,
    access, interoperability
  • May have support in software tools
  • Administrative metadata

11
Metadata for resource discovery: introducing the
Dublin Core
12
Resource discovery metadata
  • Resource users may wish to
  • search across descriptions from different
    providers
  • compare/combine descriptions from different
    providers
  • Resource providers may wish to
  • disseminate descriptions widely
  • share descriptions with other providers, 3rd
    parties
  • describe relationships between resources
  • Third parties may wish to
  • build services on descriptions prepared by
    others
  • annotate descriptions prepared by others

13
Resource discovery metadata
  • Metadata for resource discovery is
  • used beyond its creator community
  • combined with metadata from other communities
  • Metadata is aggregated or cross-searched
  • challenge of semantic interoperability
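
As a rough illustration of aggregation and cross-searching, the sketch below merges records from two providers and searches them by element name. The provider names, the sample records, and the `aggregate` and `search` helpers are all invented for the example. The search only works across providers because both happen to use the same element names with the same meaning, which is the semantic interoperability problem in miniature.

```python
def aggregate(*provider_batches):
    """Merge (provider, records) batches, tagging each record with its provider."""
    merged = []
    for provider, records in provider_batches:
        for record in records:
            merged.append({"provider": provider, **record})
    return merged


def search(records, element, term):
    """Return records where any value of `element` contains `term` (case-insensitive)."""
    term = term.lower()
    return [
        record for record in records
        if any(term in value.lower() for value in record.get(element, []))
    ]


# Invented sample descriptions: element name -> list of values.
museum = [{"title": ["Roman coin"], "subject": ["coins"]}]
library = [{"title": ["Coins of Roman Britain"], "creator": ["A Author"]}]

all_records = aggregate(("museum", museum), ("library", library))
hits = search(all_records, "title", "roman")
```

If one provider recorded the same information under a different name (say "name" instead of "title"), the second record would silently drop out of the results.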

14
Resource discovery metadata
  • Typically covers
  • description of resource content
  • what is it?
  • may include some description of context
  • description of resource form
  • how is it constructed?
  • description of resource use
  • what tools do I need to use it?
  • can I afford it?

15
Introducing the Dublin Core
  • Initiative to improve resource discovery on Web
  • not for complex resource description
  • simple document-like objects
  • extended to other classes of resource
  • Interdisciplinary consensus on simple element set
  • 15 elements
  • all optional
  • all repeatable

16
Introducing the Dublin Core
  • Title
  • Subject
  • Description
  • Creator
  • Publisher
  • Contributor
  • Date
  • Type
  • Format
  • Identifier
  • Source
  • Language
  • Relation
  • Coverage
  • Rights
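
One possible way to picture a simple DC record in code is as a mapping from element name to a list of values. The sketch below is only an illustration: the 15 element names come from the slide, while the `make_record` helper and the sample values are assumptions made for the example.

```python
# The 15 element names are taken from the slide; everything else is illustrative.
DC_ELEMENTS = (
    "title", "subject", "description", "creator", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
)


def make_record(**elements):
    """Build a simple DC record as a dict mapping element name -> list of values.

    Every element is optional (it may simply be absent) and repeatable
    (its list may hold several values). Unknown element names are rejected.
    """
    record = {}
    for name, value in elements.items():
        if name not in DC_ELEMENTS:
            raise ValueError(f"not a simple DC element: {name}")
        # Accept a single string or a list of strings for convenience.
        values = [value] if isinstance(value, str) else list(value)
        record[name] = values
    return record


report = make_record(
    title="Report",
    creator=["J Smith"],
    date="2001-11-05",
    subject=["metadata", "resource discovery"],  # a repeated element
)
```

A record with no elements at all is still valid under "all optional", which is one reason content rules matter as much as the element set itself.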

17
Introducing the Dublin Core
  • Simplicity of semantics, ease of use
  • Provides basic semantic interoperability
  • across domains
  • across language communities
  • Allows for extensibility
  • but tension between extending DC and choosing
    other, richer schema

18
Introducing the Dublin Core
  • Interoperability requires
  • use of content rules/standards
  • clarity about resource being described
  • e.g. work, expression, manifestation, item
  • Real resources more complex than (stable)
    document-like object?
  • characteristics of resources change through time
  • agents perform actions which produce changes

19
Using the Dublin Core
  • Not a replacement for richer descriptive
    standards
  • A pidgin language for use by tourists on the
    Internet commons
  • Tom Baker, A Grammar of Dublin Core
  • Can provide 15 windows into richer resource
    descriptions
  • disclose rich description in simple form
  • semantic cross-walks, mappings
  • export rather than create?
  • NOF-digitise guidelines (5.2.1) mandate
    generation of simple DC records at item-level
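
The "export rather than create" idea can be sketched as a small cross-walk: a table that maps fields of a richer local record onto simple DC elements. The rich field names below are hypothetical, not taken from any particular standard, and the mapping is an assumption made for the example.

```python
# Hypothetical field names for a richer, local catalogue record.
RICH_TO_DC = {
    "object_name": "title",
    "maker": "creator",
    "production_date": "date",
    "physical_description": "description",
    "copyright_statement": "rights",
}


def export_simple_dc(rich_record):
    """Map a rich record (dict) onto simple DC, skipping unmapped fields."""
    dc = {}
    for field, value in rich_record.items():
        element = RICH_TO_DC.get(field)
        if element is None or value in (None, ""):
            continue  # detail with no DC equivalent stays in the rich record only
        dc.setdefault(element, []).append(value)
    return dc


rich = {
    "object_name": "Report",
    "maker": "J Smith",
    "production_date": "2001-11-05",
    "conservation_notes": "Spine repaired",  # no simple DC equivalent here
}
simple = export_simple_dc(rich)
```

The export is lossy by design: the rich record remains the master copy, and the simple DC record is one window onto it.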

20
Using the Dublin Core
[Diagram: a rich description exposed through a simple DC description (title, creator, date, description, rights).]
21
How is metadata created?
22
How is metadata created?
  • By software tools
  • indexing robots, web crawlers
  • from resource content, from server info
  • By human agents
  • description by resource creator/owner
  • description by third party services
  • Creating (and maintaining) good quality metadata
    is not cheap
  • rights issues for metadata as well as for
    resources?

23
Where is metadata stored?
  • Embedded in resource
  • depends on format of resource
  • can metadata be extracted from resource?
  • Linked to resource
  • Created as record in database
  • may be remote database
  • Adopt approach which offers most flexibility
  • may need to present different subsets of full
    metadata in different contexts
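
Whether embedded metadata can be extracted depends on the format. For an HTML page, one common convention (an assumption here, not something the slide specifies) is to embed Dublin Core values in `<meta>` tags whose names start with "DC.". The sketch below pulls such values out with Python's standard HTML parser; the sample page and the `DCMetaExtractor` class are invented for the example.

```python
from html.parser import HTMLParser


class DCMetaExtractor(HTMLParser):
    """Collect <meta name="DC.xxx" content="..."> pairs from an HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name") or ""
        content = attributes.get("content")
        if name.lower().startswith("dc.") and content is not None:
            element = name[3:].lower()  # "DC.title" -> "title"
            self.record.setdefault(element, []).append(content)


SAMPLE_PAGE = """<html><head>
<title>Report</title>
<meta name="DC.title" content="Report">
<meta name="DC.creator" content="J Smith">
<meta name="DC.date" content="2001-11-05">
</head><body><p>...</p></body></html>"""

extractor = DCMetaExtractor()
extractor.feed(SAMPLE_PAGE)
embedded = extractor.record
```

The extracted values could then be loaded into a metadata database, which keeps the option of presenting different subsets in different contexts.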

24
Metadata embedded in resource
[Diagram: Resource 1 carries its own embedded metadata (Creator: J Smith; Date: 2001-11-05; Title: Report). The metadata database holds the same values as a row: Doc 1, Creator J Smith, Date 2001-11-05, Title Report.]
25
Metadata record as linked resource
[Diagram: Doc 1 (Resource 1) is linked to a separate record, Metadata rec 1 (Creator: J Smith; Date: 2001-11-05; Title: Report). The metadata database holds the same values as a row: Doc 1, Creator J Smith, Date 2001-11-05, Title Report.]
26
Metadata record created in database
[Diagram: the description of Resource 1 is created directly as a row in the metadata database: Doc 1, Creator J Smith, Date 2001-11-05, Title Report.]
27
How is metadata shared?
28
How is metadata shared?
  • How does a data provider make metadata records
    available in a commonly understood form?
  • How does a service provider obtain these metadata
    records from data providers?

29
How is metadata shared?
  • Metadata as language: metadata records as sets of
    statements
  • Effective transmission of information requires
    agreement on
  • semantics
  • what terms mean
  • e.g. cat, to sit, mat
  • structure
  • significance of arrangement of terms
  • e.g. sentence: subject -> verb -> object (in
    English)
  • syntax
  • rules of expression
  • The cat sat on the mat.

30
How is metadata shared?
  • A resource description community is defined by
    consensus on conventions
  • Consensus on syntax
  • use of XML
  • Consensus on semantics of terms
  • meaning of (uniquely named through XML namespace)
    elements/attributes
  • Consensus on meaning of structure
  • use of community standard XML DTD/Schema
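
To make "uniquely named through XML namespace" concrete, the sketch below parses a small XML fragment containing two elements called "title" and shows that a namespace-aware parser keeps them apart. The Dublin Core namespace URI is the published one for the element set; the `<record>` wrapper and the unqualified `<title>` element are made up for the example.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element set namespace

# The <record> wrapper is a made-up container, not a community standard.
SAMPLE = f"""<record xmlns:dc="{DC_NS}">
  <dc:title>Report</dc:title>
  <dc:creator>J Smith</dc:creator>
  <title>Dr</title>
</record>"""

root = ET.fromstring(SAMPLE)

# ElementTree expands each prefixed name to "{namespace}local-name",
# so dc:title and the unqualified title are different element names.
dc_titles = [el.text for el in root.findall(f"{{{DC_NS}}}title")]
plain_titles = [el.text for el in root.findall("title")]
```

Here `dc_titles` holds the name of the resource and `plain_titles` holds an honorific: same local name, different agreed meaning.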

31
Introducing XML
  • Extensible Markup Language
  • Recommendation of W3C, 1998, 2000
  • Defines means of describing tree-structured data
    in text-based format
  • embedded markup delimits and describes data
  • Simple, platform-independent syntax
  • Standard programming interfaces
  • reusable software components
  • Widely adopted for transferring data between
    programs, systems

32
Doc  Creator  Date        Title
1    J Smith  2001-11-05  Report

<table>
  <record>
    <doc>1</doc>
    <creator>J Smith</creator>
    <date>2001-11-05</date>
    <title>Report</title>
  </record>
</table>
33
[Diagram: a row of the table (Doc, Creator, Date, Title) is serialised as <record> ... </record>, transmitted, and de-serialised by the remote application.]
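
The serialise, transmit, de-serialise round trip can be sketched with Python's standard XML library. This is a minimal illustration, assuming the table/record element names used on the previous slide; the `serialise` and `deserialise` helpers are invented for the example, and the "transmission" step is just passing a string.

```python
import xml.etree.ElementTree as ET


def serialise(row):
    """Turn one table row (a dict) into <table><record>...</record></table> text."""
    table = ET.Element("table")
    record = ET.SubElement(table, "record")
    for column, value in row.items():
        ET.SubElement(record, column).text = str(value)
    return ET.tostring(table, encoding="unicode")


def deserialise(xml_text):
    """Rebuild the row from the XML text received by the remote application."""
    record = ET.fromstring(xml_text).find("record")
    return {child.tag: child.text for child in record}


row = {"doc": "1", "creator": "J Smith", "date": "2001-11-05", "title": "Report"}
wire_text = serialise(row)          # what would be transmitted
received = deserialise(wire_text)   # what the remote application rebuilds
```

The round trip only works because both ends agree that a record is a flat list of child elements, one per column.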
34
Introducing XML (2)
  • Support from major software vendors
  • Use of XML
  • invisible to end-user
  • increasingly invisible to information manager?
  • generated and consumed by software
  • requires consensus on structure amongst
    communication partners
  • Use XML for exchange when
  • partners (humans, applications) both know
    semantics conveyed by structure of (meta)data
  • Use RDF/XML for exchange when
  • (meta)data potentially used by applications
    without prior knowledge of specific schema
  • (meta)data incorporates overlapping structures
    from different domains

35
Introducing OAI
  • Open Archives Initiative
  • develops/promotes interoperability standards to
    facilitate dissemination of content
  • roots in e-prints community
  • "Archive" means repository, not archive in the
    archival sense
  • "Open" refers to architecture, not
    free/unlimited access to repository

36
Introducing OAI MHP
  • OAI Metadata Harvesting Protocol
  • lightweight protocol which allows data providers
    to expose metadata records for retrieval by
    service providers
  • built on HTTP, XML
  • requests from service provider to data provider
    sent using HTTP GET/POST
  • Six verbs
  • responses from data provider to service provider
    as XML documents
  • Must provide simple DC (OAI provides XML Schema)
  • May provide other metadata formats (in XML)

37
Introducing OAI MHP (2)
  • Supports selective harvesting
  • by sets
  • by datestamps
  • Example
  • http://www.myarchive.org/cgi-bin/oai?verb=ListRecords&from=2002-01-01&metadataPrefix=oai_dc
  • List all records added since Jan 1 2002 in oai_dc
    format (simple DC)
  • Returns XML document containing records
  • OAI MHP is not a distributed search protocol
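
A harvesting request like the example above is an ordinary HTTP GET with query parameters, so it can be assembled with standard URL tools. The sketch below only builds the URL and makes no network call. The base URL is the slide's example address; the `build_request` helper is invented for the example, and the six verb names are taken from the OAI protocol specification rather than from the slide.

```python
from urllib.parse import urlencode

# The six verbs, as named in the OAI protocol specification.
VERBS = {
    "Identify", "ListMetadataFormats", "ListSets",
    "ListIdentifiers", "ListRecords", "GetRecord",
}


def build_request(base_url, verb, arguments=None):
    """Return the GET URL for one harvesting request (no request is sent)."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    query = {"verb": verb}
    query.update(arguments or {})
    return f"{base_url}?{urlencode(query)}"


# Selective harvesting by datestamp: records added since 1 January 2002,
# returned in the oai_dc (simple DC) format.
url = build_request(
    "http://www.myarchive.org/cgi-bin/oai",
    "ListRecords",
    {"from": "2002-01-01", "metadataPrefix": "oai_dc"},
)
```

A service provider would run requests like this on a schedule and store the returned records locally, which is why harvesting is not a distributed search: queries are answered from the harvested copy.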

38
Resource discovery metadata in the NOF-digitise
programme
39
Resource discovery within a project
Resources
40
Resource discovery across the programme
41
Resource discovery: the larger context
42
N.B.
  • N.B. Previous diagrams should be treated as
    illustration of potential, not description of
    architecture!
  • Role of collection-level description in
    disclosing existence of collections/repositories
    to portals

43
Summary
  • Metadata for resource discovery and resource
    management
  • Resource discovery metadata made to be shared
  • Communication: syntax, semantics, structure
  • Role of standards
  • Lightweight protocols for metadata exchange
  • balance functionality and cost
  • Enhance access to your project's resources

44
Acknowledgements
  • UKOLN is funded by Resource: The Council for
    Museums, Archives and Libraries, the Joint
    Information Systems Committee (JISC) of the UK
    higher and further education funding councils, as
    well as by project funding from the JISC and the
    European Union. UKOLN also receives support from
    the University of Bath where it is based.
  • http://www.ukoln.ac.uk/