Metadata Modularization Concepts and Tools

Transcript and Presenter's Notes

1
Metadata Modularization Concepts and Tools
  • Carl Lagoze
  • CS502
  • 2001-03-14

2
Metadata
Structured data about data.
3
Why is Metadata important?
  • Key to organizing, managing, preserving, and
    locating content and services in digital libraries

4
Why is Metadata difficult?
  • Cost
  • Interoperability
  • Syntax
  • Semantics
  • Customizability
  • Extensibility
  • Distribution
  • Integrity, Authenticity, Quality
  • Human and Machine Factors
  • Naming

5
Metadata Thoughts
  • Metadata takes a variety of forms
    • descriptive cataloging
    • specialized
    • terms and conditions
    • administrative
    • content ratings
    • provenance
    • linkage

6
More Metadata Thoughts
  • New metadata sets will continually evolve
  • Many metadata sets are community-specific
  • administration
  • use
  • Human and machine use

7
Dublin Core
  • Metadata Set for Simple Resource Discovery
  • 15 elements allowing simple descriptive sentences
    about document-like objects
    • Document has title "Hamlet"
    • Document has creator "William Shakespeare"
    • Document has subject "love and anguish"
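
The "simple descriptive sentences" idea can be sketched in a few lines of Python. This is only an illustration: the dictionary layout and the helper name `to_sentences` are assumptions made here, not part of Dublin Core; the element names and values come from the slide.

```python
# Hypothetical in-memory record: element names and values are taken from the slide.
record = {
    "title": "Hamlet",
    "creator": "William Shakespeare",
    "subject": "love and anguish",
}


def to_sentences(record):
    """Render each element/value pair as 'Document has <element> <value>'."""
    return [f"Document has {element} {value}" for element, value in record.items()]


sentences = to_sentences(record)
for sentence in sentences:
    print(sentence)
```

Each element/value pair stands alone, which is what keeps the set simple and also what limits it once descriptions need more structure.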

8
The Dublin Core 15
  • Title
  • Creator
  • Subject/Keywords
  • Description
  • Publisher
  • Other Contributor
  • Date
  • Resource Type
  • Format
  • Resource Identifier
  • Source
  • Language
  • Relation
  • Coverage
  • Rights Management

9
A Scope for the Dublin Core
  • Increase or decrease number of elements?
  • Structured or Unstructured value syntax?
  • Accommodate community extensions?

10
Warwick Framework
  • Provide context for Dublin Core effort
  • Integrate multiple sets of metadata, addressing
    issues of
    • individual integrity
    • distinct audiences
    • separate realms of responsibility and management

11
Warwick Framework Design
  • Containers for aggregating
  • Packages of typed metadata sets
  • General principles - information hiding
    • the only operation defined at the container level
      returns the sequence of contained packages
    • packages are opaque at the container level
    • access to package contents is subject to terms and
      conditions

12
Package Types
  • Simple metadata set
    • segregating distinct metadata into separate
      packages
  • Recursive container
    • nesting semantically related metadata sets
  • Indirect reference
    • allowing distribution and sharing of metadata sets
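
A minimal sketch of the container and the three package types, assuming a plain in-memory representation. The class names, the `add` helper, and the example URI are hypothetical; the slides define only one container-level operation, returning the sequence of contained packages.

```python
class MetadataSet:
    """Simple package: one typed metadata set, opaque to the container."""

    def __init__(self, type_name, payload):
        self.type_name = type_name  # e.g. "Dublin Core"
        self.payload = payload      # never interpreted by the container


class IndirectReference:
    """Package that points at metadata held elsewhere."""

    def __init__(self, uri):
        self.uri = uri


class Container:
    """Aggregates packages; a container can itself be used as a package."""

    def __init__(self):
        self._packages = []

    def add(self, package):
        # Construction convenience assumed for this sketch.
        self._packages.append(package)

    def packages(self):
        """Return the sequence of contained packages, without opening them."""
        return list(self._packages)


inner = Container()
inner.add(MetadataSet("Terms and Conditions", b"..."))

outer = Container()
outer.add(MetadataSet("Dublin Core", b"..."))
outer.add(IndirectReference("http://example.org/metadata/marc"))  # hypothetical URI
outer.add(inner)  # recursive container

kinds = [type(p).__name__ for p in outer.packages()]
print(kinds)
```

The container never inspects a payload, so access rules such as terms and conditions can be enforced by whatever component understands a given package type.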

13
Metadata Container
[Figure: a Container holding four packages: Dublin Core,
MARC record, Indirect Reference, and Terms and Conditions;
a URI is also labeled]
14
Open Implementation Issues
  • Data encoding
  • Semantic interaction of overlapping sets
    • between semantically-related packages
    • between semantically distinct packages
  • Type registry

15
Modeling and Encoding Metadata Components: XML
Namespaces
  • Prevent term clash
    • record?, creator?
  • Establish concept spaces through URIs

xmlns:dc="http://purl.org/dc"
xmlns:abc="http://ilrt.ac.uk/abc"
<dc:creator>Herbert Van de Sompel</dc:creator>
<abc:organization>Cornell University</abc:organization>
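
To see how namespaces prevent term clash, the fragment can be parsed with Python's standard `xml.etree.ElementTree`. The wrapping `record` element is an assumption added here so that the fragment is well-formed; the namespace URIs and element names are those on the slide.

```python
import xml.etree.ElementTree as ET

# The <record> wrapper is hypothetical; the slide shows only the declarations
# and the two child elements.
xml_text = """
<record xmlns:dc="http://purl.org/dc" xmlns:abc="http://ilrt.ac.uk/abc">
  <dc:creator>Herbert Van de Sompel</dc:creator>
  <abc:organization>Cornell University</abc:organization>
</record>
"""

root = ET.fromstring(xml_text)

# ElementTree expands each prefix to its URI, so names are globally qualified.
tags = [child.tag for child in root]
print(tags)

# Look up an element by prefix, supplying the prefix-to-URI mapping explicitly.
ns = {"dc": "http://purl.org/dc"}
creator = root.find("dc:creator", ns).text
print(creator)
```

A `creator` element from another vocabulary would expand to a different qualified name, so the two terms could sit in one record without colliding.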
16
Modeling and Encoding Metadata Components: RDF
  • RDF (Resource Description Framework)
  • The instantiation of the Warwick Framework on the
    Web
  • Provides enabling technology for
    richly-structured metadata
  • Rich data model supporting notions of distinct
    entities and properties
  • Syntax expressed in XML

17
RDF Components
  • Formal data model
  • Syntax for interchange of data
  • Schema Type system (schema model)

18
RDF Data Model
  • Directed labeled graphs
  • Model elements
    • Resource
    • Property
    • Value
    • Statement
    • Containers

19
RDF Model Primitives
[Figure: a Resource node connected by a Property arc to a
Value node]
20
RDF Syntax Example
[Figure: resource URI:R with property Title = "CIMI
Presentation" and property Creator = "Eric Miller"]

<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:R">
    <dc:Title>CIMI Presentation</dc:Title>
    <dc:Creator>Eric Miller</dc:Creator>
  </Description>
</RDF>
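
The correspondence between this syntax and the (resource, property, value) model can be sketched with the standard library. This is a toy reader under stated assumptions, not a conforming RDF parser: it handles only flat `Description` elements with literal values, and the function name `extract_triples` is made up for this example.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/TR/WD-rdf-syntax"  # working-draft namespace from the slide

rdf_xml = """
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:R">
    <dc:Title>CIMI Presentation</dc:Title>
    <dc:Creator>Eric Miller</dc:Creator>
  </Description>
</RDF>
"""


def extract_triples(xml_text):
    """Return (resource, property, value) tuples for flat literal properties."""
    root = ET.fromstring(xml_text)
    triples = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        resource = description.get("about")
        for prop in description:
            triples.append((resource, prop.tag, prop.text.strip()))
    return triples


statements = extract_triples(rdf_xml)
for statement in statements:
    print(statement)
```

Each tuple is one statement in the data model: the `about` attribute names the resource, the child element's qualified name is the property, and its text is the value.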
21
RDF Model Example 2
[Figure: resource URI:R with properties Title ("CIMI
Presentation") and Creator ("Eric Miller")]
22
RDF Syntax Example 2
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons">
  <Description about="URI:R">
    <dc:Title>CIMI Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Eric Miller</bib:Name>
        <bib:Email>emiller@oclc.org</bib:Email>
        <bib:Aff resource="http://www.oclc.org"/>
      </Description>
    </dc:Creator>
  </Description>
</RDF>
23
RDF Containers
  • Permit the aggregation of several values for a
    property
  • Express multiple aggregation semantics
    • unordered
    • sequential or priority order
    • alternative
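
The three aggregation semantics can be illustrated with ordinary Python collections. This sketch does not use any RDF library, and the helper names and sample values are assumptions. Treating the first alternative as the default follows the RDF Model and Syntax convention for alternative containers.

```python
from collections import Counter


def bag_equal(a, b):
    """Unordered: same members, order not significant, duplicates allowed."""
    return Counter(a) == Counter(b)


def seq_equal(a, b):
    """Sequential: same members in the same order."""
    return list(a) == list(b)


def alt_pick(alternatives, acceptable):
    """Alternative: pick the first acceptable choice, else the default (first)."""
    for value in alternatives:
        if value in acceptable:
            return value
    return alternatives[0]


authors_a = ["Lagoze", "Miller"]
authors_b = ["Miller", "Lagoze"]

same_as_bag = bag_equal(authors_a, authors_b)
same_as_seq = seq_equal(authors_a, authors_b)
chosen = alt_pick(["en", "fr", "de"], {"de", "fr"})
fallback = alt_pick(["en", "fr", "de"], {"it"})
print(same_as_bag, same_as_seq, chosen, fallback)
```

The same two lists are equal as an unordered collection and different as a sequence, which is why the container type has to travel with the values.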

24
RDF Schemas
  • Declaration of vocabularies
    • properties defined by a particular community
    • characteristics of properties and/or constraints
      on corresponding values
  • Schema Type System - Basic Types
    • Property, Class, SubClassOf, Domain, Range
  • Minimal (but extensible) at this time
    • minimize significant clashes with the typing system
      designed by the XML Schema WG
  • Expressible in the RDF model and syntax
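
How Domain, Range, and SubClassOf constrain a property can be sketched as a small check. The class names (`Document`, `Play`, `Person`, `Playwright`), the `creator` declaration, and the function names are hypothetical; this is not RDF Schema syntax, only an illustration of the idea.

```python
# Hypothetical vocabulary: child class -> parent class.
SUBCLASS_OF = {"Play": "Document", "Playwright": "Person"}

# Hypothetical property declaration with a domain and a range.
PROPERTIES = {"creator": {"domain": "Document", "range": "Person"}}


def is_subclass(cls, ancestor):
    """True if cls is ancestor or reaches it by following SubClassOf links."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = SUBCLASS_OF.get(cls)
    return False


def conforms(subject_class, prop, value_class):
    """Check one statement's classes against the property's domain and range."""
    declaration = PROPERTIES[prop]
    return (is_subclass(subject_class, declaration["domain"])
            and is_subclass(value_class, declaration["range"]))


ok = conforms("Play", "creator", "Playwright")
reversed_ok = conforms("Playwright", "creator", "Play")
print(ok, reversed_ok)
```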

25
Relationships among vocabularies
[Figure: related properties from different vocabularies:
dc:Creator, marc:100, ms:director, bib:Author]
26
Bringing it together
  • RDF Data Model
    • Supports consistent encoding, exchange, and
      processing of metadata - critical when aggregating
      data from multiple sources
  • RDF Schema
    • Declare, define, reuse vocabularies
  • RDF Metadata transmission
    • XML encoding

27
Interoperability among Metadata Vocabularies
[Figure: core classes, with projections to
application-specific metadata vocabularies]
28
Attribute/Value approaches to metadata...
"The playwright of Hamlet was Shakespeare"
[Figure: Hamlet has a creator Shakespeare]
29
...run into problems for richer descriptions...
"The playwright of Hamlet was Shakespeare, who was
born in Stratford"
[Figure: Hamlet has a creator Shakespeare]
30
...because of their failure to model entity
distinctions
[Figure: graph with resources R1 and R2. R1 has title
"Hamlet" and creator R2; R2 has name "Shakespeare" and
birthplace "Stratford"]
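
Once the work and the person are modeled as distinct resources, the richer description becomes a short graph traversal. A minimal sketch: the wiring of R1 and R2 is one reading of the figure's labels, and the function names are hypothetical.

```python
# Statements as (resource, property, value); R1 is the work, R2 the person.
TRIPLES = [
    ("R1", "title", "Hamlet"),
    ("R1", "creator", "R2"),
    ("R2", "name", "Shakespeare"),
    ("R2", "birthplace", "Stratford"),
]


def values(resource, prop):
    """All values of a property on a given resource."""
    return [v for r, p, v in TRIPLES if r == resource and p == prop]


def resources(prop, value):
    """All resources that carry a given property value."""
    return [r for r, p, v in TRIPLES if p == prop and v == value]


def creator_birthplaces(title):
    """Where were the creators of the work with this title born?"""
    found = []
    for work in resources("title", title):
        for person in values(work, "creator"):
            found.extend(values(person, "birthplace"))
    return found


answer = creator_birthplaces("Hamlet")
print(answer)
```

With a flat attribute/value record, "birthplace" would have to attach to the document itself, which is the modeling failure this slide points at.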
31
Understanding Metadata based on Query Capabilities
  • Simple boolean tags?
  • Agent, time, place questions?
    • Who was responsible for what, and when?

32
Applying a Model-Centric Approach
  • Formally define common entities and relationships
    underlying multiple metadata vocabularies
  • Describe them (and their inter-relationships) in
    a simple logical model
  • Provide the framework for extending these common
    semantics to domain and application-specific
    metadata vocabularies.

33
Conceptual Basis: Evolution of Content over Time
[Figure: IFLA Entity Model]
From Bearman et al., D-Lib Magazine, January
1999.
34
Events are key to understanding metadata relationships?
  • Recognizing inherent lifecycle aspects of digital
    content - transformation of input resources to
    output resources and of their descriptions
    (e.g., IFLA model).
  • Modeling implied events as first-class objects
    provides attachment points for common entities,
    e.g., agents, contexts (times, places), roles.
  • Clarifying attachment points facilitates mapping
    across common entities in different vocabularies.
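
One way to picture events as first-class objects is a record that carries the agent, role, time, and place, and links input resources to output resources. This is a sketch under assumptions: the `Event` fields, the helper names, and every identifier and date in `EVENTS` are placeholders invented for illustration, not data from the lecture.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Event:
    kind: str
    agent: str
    role: str
    time: str
    place: str
    inputs: Tuple[str, ...]   # resources consumed by the event
    outputs: Tuple[str, ...]  # resources produced by the event


# Placeholder data only.
EVENTS = [
    Event("creation", "agent-A", "author", "1999-01", "place-X",
          (), ("report-v1",)),
    Event("translation", "agent-B", "translator", "2000-06", "place-Y",
          ("report-v1",), ("report-v1-fr",)),
]


def responsible_for(resource):
    """Who was responsible for producing this resource, in what role, and when?"""
    return [(e.agent, e.role, e.time) for e in EVENTS if resource in e.outputs]


def lineage(resource):
    """Walk back from a resource through the events that produced it."""
    chain = []
    frontier = [resource]
    while frontier:
        current = frontier.pop()
        for e in EVENTS:
            if current in e.outputs:
                chain.append(e.kind)
                frontier.extend(e.inputs)
    return chain


who = responsible_for("report-v1-fr")
history = lineage("report-v1-fr")
print(who, history)
```

Because agent, time, and place hang off the event rather than the resource, a vocabulary's "creator" or "translator" element can be mapped to the same attachment point regardless of which metadata set it comes from.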

35
Content, Events, Descriptions
36
Museum Data