Encoding DC in XHTML, XML and RDF - PowerPoint PPT Presentation

1
Encoding DC in (X)HTML, XML and RDF
Tutorial at ECDL 2004, Bath September 2004
  • Andy Powell
  • a.powell_at_ukoln.ac.uk
  • UKOLN, University of Bath, UK
  • http://www.ukoln.ac.uk/

UKOLN is supported by
2
Contents
  • an abstract model for DC (30 mins)
  • encoding DC in XHTML (30 mins)
  • encoding DC in XML (30 mins)
  • encoding DC in RDF/XML (30 mins)
  • practical examples
  • OAI Protocol for Metadata Harvesting and RSS (20
    mins)
  • assigning identifiers (20 mins)

Note: you are going to see lots of angle brackets
but no XML schemas!
3
Important
  • this is a tutorial
  • please feel free to ask questions as we go
    through!

4
Important DCMI documents
  • DCMI Abstract Model DRAFT
    http://www.ukoln.ac.uk/metadata/dcmi/abstract-model/
  • Expressing Dublin Core in HTML/XHTML meta and
    link elements
    http://dublincore.org/documents/dcq-html/
  • Guidelines for implementing Dublin Core in XML
    http://dublincore.org/documents/dc-xml-guidelines/
  • Expressing Simple Dublin Core in RDF/XML
    http://dublincore.org/documents/dcmes-xml/
  • Expressing Qualified Dublin Core in RDF/XML
    http://dublincore.org/documents/dcq-rdf-xml/
  • Namespace Policy for the DCMI
    http://dublincore.org/documents/dcmi-namespace/
  • DCMI Metadata Terms
    http://dublincore.org/documents/dcmi-terms/

5
Implementing DC
  • this tutorial is about the mechanics of
    implementing DC in HTML, XML and RDF
  • it doesn't really consider which implementation
    strategy is the best!
  • ask yourself two questions
  • what am I trying to achieve?
  • does using HTML, XML or RDF help me achieve it?
  • do software and services exist that will support
    the creation and use of my metadata?

6
Abstract models for DC
7
Why abstract models?
  • the first part of this tutorial isn't going to
    show any angle brackets!
  • why?
  • because before we start creating DCMI
    descriptions we need to understand
  • the DCMI view of the world/resources we want to
    describe (the DCMI resource model)
  • the DCMI view of the descriptions we make about
    that world (the DCMI description model)

8
DCMI resource model
  • each resource that we want to describe has zero
    or more properties
  • a property is a specific aspect, characteristic,
    attribute or relation used to describe a resource
  • each property has one or more values
  • each value is a resource (the physical or
    conceptual entity that is associated with a
    property when it is used to describe a resource)

9
Err but what is a resource?
  • W3C/IETF definition of resource is:
  • anything that has identity. Familiar examples
    include an electronic document, an image, a
    service (e.g., "today's weather report for Los
    Angeles"), and a collection of other resources.
    Not all resources are network "retrievable";
    e.g., human beings, corporations, and bound books
    in a library can also be considered resources.
  • i.e. a resource is anything:
  • physical things (books, cars, people)
  • digital things (Web pages, digital images)
  • conceptual things (colours, points in time)

10
Yeah, but no, but
  • but this seems to be too wide for the things we
    can describe with DC!
  • can we really describe people using DC?
  • do people have titles and subjects?
  • no, in general we only use DC to describe a
    sub-set of all resources
  • anything covered by the DCMIType list
  • Collection, Dataset, Event, Image (Still or
    Moving), Interactive Resource, Service, Software,
    Sound, Text, Physical Object

11
DCMI resource model (2)
  • each resource may be a member of one or more
    classes
  • each class may be related to one or more other
    classes by a refines (sub-class) relationship
  • the two classes share some semantics such that
    all resources that are members of the sub-class
    are also members of the related class

where the resource is the value of a property,
the class is referred to as a vocabulary encoding
scheme
12
DCMI resource model (3)
  • each property may be related to exactly one other
    property by a refines (sub-property) relationship
  • the two properties share some semantics such that
    all valid values of the sub-property are also
    valid values of the related property
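The refines relationship between properties can be sketched as a simple parent lookup. This is only an illustration: the table is a small sample that assumes the refinements published in DCMI Metadata Terms (for example, modified refines date), and the function name `ancestors` is hypothetical.

```python
# Sample of sub-property ("refines") relationships. Each property refines
# at most one other property, so a plain dict is enough.
REFINES = {
    "dcterms:modified": "dc:date",
    "dcterms:available": "dc:date",
    "dcterms:extent": "dc:format",
    "dcterms:isPartOf": "dc:relation",
}


def ancestors(prop, refines=REFINES):
    """Return the chain of properties that `prop` refines, nearest first."""
    chain = []
    seen = {prop}
    while prop in refines:
        prop = refines[prop]
        if prop in seen:  # guard against an accidental cycle in the table
            break
        seen.add(prop)
        chain.append(prop)
    return chain


print(ancestors("dcterms:modified"))
print(ancestors("dc:title"))
```

Because every valid value of a sub-property is also a valid value of the property it refines, a value recorded against dcterms:modified can also be read as a value of dc:date.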

13
DCMI description model
  • a description is made up of
  • one or more statements (about one, and only one,
    resource) and
  • zero or one resource URI (a URI reference that
    identifies the resource being described)
  • each statement is made up of
  • a property URI (that identifies a property),
  • zero or one value URI (that identifies a value of
    the property),
  • zero or one encoding scheme URI (that identifies
    the class of the value) and
  • zero or more value representations of the value

14
DCMI description model (2)
  • each property is an attribute of the resource
    being described
  • each property URI may be repeated in multiple
    statements
  • the value representation may take the form of a
    value string, a rich value or a related
    description

15
DCMI description model (3)
  • each value string is a simple, human-readable
    string that represents the value of the property
  • each value string may have an associated encoding
    scheme URI that identifies a syntax encoding
    scheme
  • each value string may have an associated value
    string language that is an ISO language tag (e.g.
    en-GB)

16
DCMI description model (4)
  • each rich value is some marked-up text, an image,
    a video, some audio, etc. or some combination
    thereof that represents the resource that is the
    value of the property
  • each related description is a description of
    (i.e. some metadata about) the resource that is
    the value of the property
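The description model on the last four slides can be written down as a handful of data classes. This is a sketch only: the class and field names are invented for illustration, rich values are reduced to raw bytes, and the cardinalities in the comments are the ones stated on the slides.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ValueString:
    text: str
    language: Optional[str] = None                  # e.g. "en-GB"
    syntax_encoding_scheme_uri: Optional[str] = None


@dataclass
class Statement:
    property_uri: str                               # exactly one
    value_uri: Optional[str] = None                 # zero or one
    encoding_scheme_uri: Optional[str] = None       # zero or one
    # Zero or more value representations, in their three forms:
    value_strings: List[ValueString] = field(default_factory=list)
    rich_values: List[bytes] = field(default_factory=list)
    related_descriptions: List["Description"] = field(default_factory=list)


@dataclass
class Description:
    statements: List[Statement]                     # one or more
    resource_uri: Optional[str] = None              # zero or one

    def __post_init__(self):
        if not self.statements:
            raise ValueError("a description needs at least one statement")


DC = "http://purl.org/dc/elements/1.1/"
d = Description(
    resource_uri="http://example.org/",
    statements=[
        Statement(DC + "title",
                  value_strings=[ValueString("A document", "en-GB")]),
        Statement(DC + "creator",
                  value_strings=[ValueString("Andy Powell")]),
    ],
)
print(len(d.statements))
```

A property URI may be repeated across statements, which is why `statements` is a list and not a mapping keyed by property.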

17
The 1:1 principle
  • notice that the model indicates that each
    property used in a description must be an
    attribute of the resource being described
  • this is commonly referred to as the 1:1 principle
    - the principle that a DCMI metadata description
    describes one, and only one, resource
  • however

18
Description sets
  • real-world metadata applications tend to be based
    on loosely grouped sets of descriptions (where
    the described resources are typically related in
    some way)
  • known here as description sets
  • for example, a description set might comprise
    descriptions of both a painting and the artist

19
DCMI records
  • description sets are instantiated, for the
    purposes of exchange between software
    applications, in the form of metadata records
  • each record conforms to one of the DCMI encoding
    guidelines (XHTML meta tags, XML, RDF/XML, etc.)

<dc:title>a document</dc:title>
<dc:creator>andy powell</dc:creator>
20
Values (again!)
  • a value is the physical or conceptual entity that
    is associated with a property when it is used to
    describe a resource
  • the value of the DC Creator property is a person,
    organisation or service - a physical entity
  • the value of the DC Date property is a point in
    time - a conceptual entity
  • the value of the DC Coverage property may be a
    geographic region or country - a physical entity
  • the value of the DC Subject property may be a
    concept - a conceptual entity - or a physical
    object or person - a physical entity
  • each of these entities is a resource

21
Simple vs. qualified DC?
  • the notions of simple DC and qualified DC are
    often referred to in DCMI discussions
  • a view of what these two terms mean is presented
    here
  • note that not everyone agrees with my definitions!

22
Simple DC record
  • a simple DC record is a record that
  • conforms to the abstract model,
  • comprises only a single description,
  • uses only the 15 properties in the Dublin Core
    Metadata Element Set,
  • makes no use of value URIs, encoding schemes,
    rich values or related descriptions.
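The four conditions above translate fairly directly into a check. The sketch below assumes a hypothetical in-memory layout (a record is a list of descriptions, each a list of statement dicts); only the rules themselves come from the slide.

```python
DC_NS = "http://purl.org/dc/elements/1.1/"

# The 15 properties of the Dublin Core Metadata Element Set.
DCMES = {
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
}


def is_simple_dc(descriptions):
    """Return True if the record satisfies the simple DC rules."""
    if len(descriptions) != 1:        # only a single description
        return False
    statements = descriptions[0]
    if not statements:                # the model needs at least one statement
        return False
    for st in statements:
        prop = st["property"]
        if not prop.startswith(DC_NS) or prop[len(DC_NS):] not in DCMES:
            return False              # only the 15 DCMES properties
        if st.get("value_uri") or st.get("encoding_scheme"):
            return False
        if st.get("rich_values") or st.get("related_descriptions"):
            return False
    return True


record = [[{"property": DC_NS + "title", "value_strings": ["A document"]}]]
print(is_simple_dc(record))
```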

23
A couple of notes
  • there is no guaranteed linkage between a simple
    DC record and the resource being described
    because the resource URI is optional
  • such a linkage may be made by encoding the URI of
    the resource as the value string of the DC
    Identifier element, however this is not mandatory:
    everything in DC is optional
  • while the value string of a property may look
    like a URI, there is nothing in the simple DC
    model that indicates this is the case

at their own risk, implementations may choose to
guess which value strings are URIs and which are
not
24
Qualified DC model
  • a qualified DC record is a record that
  • conforms to the DCMI abstract model,
  • contains at least one property taken from the
    DCMI Metadata Terms recommendation

25
A couple of notes
  • it is still the case that there is no guaranteed
    linkage between a qualified DC record and the
    resource being described!
  • a linkage may be made by encoding the URI of the
    resource as the value string of the DC Identifier
    element, however this is not mandatory:
    everything in DC is optional

where the value of a property is a URI, we can
now indicate that it is a URI by using the URI
encoding scheme
26
Dumb-down
  • the process of translating a qualified DC
    metadata record into a simple DC metadata record
    is normally referred to as dumbing-down
  • can be separated into two parts: property
    dumb-down and value dumb-down.
  • each of these processes can be approached in
    one of two ways
  • informed dumb-down
  • uninformed dumb-down

27
Dumb and dumberer
  • informed dumb-down takes place where the software
    performing the dumb-down algorithm has knowledge
    built into it about the property relationships
    and values being used within a specific DCMI
    metadata application
  • uninformed dumb-down takes place where the
    software performing the dumb-down algorithm has
    no prior knowledge about the properties and
    values being used

28
Dumb-down algorithm
[Table: rows element and value; columns uninformed and informed]
  • and in all cases
  • ignore any related descriptions and rich values,
  • ignore any encoding scheme URIs.
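One possible reading of the dumb-down process is sketched below. Several things here are assumptions rather than the algorithm from the slide: statements are plain dicts, the refinement table is a two-entry sample, uninformed dumb-down is taken to mean dropping any statement whose property is not already one of the 15 DCMES elements, and value URIs are simply not carried over. Only the "ignore" rules in the bullets above are taken directly from the text.

```python
DCMES = {
    "dc:" + name
    for name in (
        "title creator subject description publisher contributor date "
        "type format identifier source language relation coverage rights"
    ).split()
}

# Sample refinement table (the "informed" knowledge); an assumption, not an
# exhaustive list.
REFINES = {"dcterms:modified": "dc:date", "dcterms:extent": "dc:format"}


def dumb_down(statements, refines=None):
    """Reduce qualified DC statements to (DCMES property, value string) pairs."""
    refines = refines or {}
    simple = []
    for st in statements:
        prop = st["property"]
        seen = set()
        # Informed property dumb-down: follow "refines" up to a DCMES element.
        while prop not in DCMES and prop in refines and prop not in seen:
            seen.add(prop)
            prop = refines[prop]
        if prop not in DCMES:
            continue  # no DCMES equivalent known: drop the statement
        # Related descriptions, rich values and encoding scheme URIs are
        # ignored in all cases; only value strings are carried over.
        for text in st.get("value_strings", []):
            simple.append((prop, text))
    return simple


record = [
    {"property": "dc:title", "value_strings": ["A document"]},
    {"property": "dcterms:modified", "value_strings": ["2001-07-18"],
     "encoding_scheme": "dcterms:W3CDTF"},
    {"property": "dcterms:audience", "value_strings": ["software developers"]},
]
print(dumb_down(record))           # uninformed
print(dumb_down(record, REFINES))  # informed
```

With the sample record, the uninformed call keeps only the title, while the informed call also recovers the modification date as a plain dc:date.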

29
Encoding DC in XHTML (and HTML!)
30
What is being described?
  • a DC record embedded in an (X)HTML document
    describes that document
  • if you want to describe something else, don't
    embed it in the (X)HTML document!
  • not everyone would agree with this

31
Simple DC elements
  • use the name and content attributes of the
    XHTML <meta> element to encode the DC element
    (one of the 15 DCMES elements) and its value
    string. Use the following pattern:
    <meta name="DC.element" content="Value string" />
  • for example:
    <meta name="DC.date" content="2001-07-18" />
  • the element names of the 15 DCMES
    elements always have a lower-case first letter

32
Simple DC values
  • value strings go in the XHTML <meta> element
    content attribute
  • the string in the content attribute is defined
    to be CDATA, i.e. a sequence of characters from
    the document character set which may include
    character entities
  • long value strings may be wrapped across
    multiple lines as necessary
  • will need to escape some characters: &, <, >, etc.

33
Language of the value
  • where the language of the value string is
    indicated, it should be encoded using the
    xml:lang attribute of the XHTML <meta> element.
    For example:
    <meta name="DC.subject" xml:lang="en"
      content="seafood" />
    <meta name="DC.subject" xml:lang="fr"
      content="fruits de mer" />

34
Repeated elements
  • multiple property values should be encoded by
    repeating the XHTML <meta> element for that
    property, for example:
    <meta name="DC.title" content="First title" />
    <meta name="DC.title" content="Second title" />
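The patterns for simple DC elements, escaping, language and repetition can be combined in a small generator. This is a minimal sketch: the function name `dc_meta` is hypothetical, and escaping is delegated to the standard library's `html.escape`.

```python
from html import escape


def dc_meta(element, value, lang=None, prefix="DC"):
    """Return one XHTML <meta> element for a DC property and value string."""
    attrs = [("name", f"{prefix}.{element}")]
    if lang:
        attrs.append(("xml:lang", lang))
    attrs.append(("content", value))
    # escape() replaces &, <, > and (with quote=True) quote characters, so
    # the value string is safe inside a double-quoted attribute.
    rendered = " ".join(f'{k}="{escape(v, quote=True)}"' for k, v in attrs)
    return f"<meta {rendered} />"


# Repeated properties are just repeated elements.
for title in ["First title", "Second title"]:
    print(dc_meta("title", title))
print(dc_meta("subject", "fruits de mer", lang="fr"))
print(dc_meta("title", "Fish & chips"))
```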

35
Other DC elements
  • use the name and content attributes of the
    XHTML <meta> element to encode the DC element
    (e.g. audience) and its value. Use the following
    pattern:
    <meta name="DCTERMS.element" content="Value" />
  • for example:
    <meta name="DCTERMS.audience"
      content="software developers" />
  • element names may be mixed-case but should
    always have a lower-case first letter

36
Element refinements
  • use the name and content attributes of the
    XHTML <meta> element to encode the element
    refinement and its value. Use the following
    pattern:
    <meta name="DCTERMS.elementRefinement"
      content="Value" />
  • for example:
    <meta name="DCTERMS.modified"
      content="2001-07-18" />

37
Element refinements (2)
  • element refinements should use the names
    specified in DCMI Metadata Terms
    http://dublincore.org/documents/dcmi-terms/
  • as a general rule, element refinement names may
    be mixed-case but should always have a lower-case
    first letter

38
Encoding schemes
  • encoding schemes are encoded using the scheme
    attribute of the XHTML <meta> element, using the
    following pattern:
    <meta name="DC.element" scheme="DCTERMS.Scheme"
      content="Value" />
  • for example:
    <meta name="DC.date" scheme="DCTERMS.W3CDTF"
      content="2001-07-18" />

39
Encoding schemes (2)
  • encoding schemes should use the names specified
    in DCMI Metadata Terms
    http://dublincore.org/documents/dcmi-terms/
  • as a general rule, encoding scheme names may be
    mixed-case but should always start with an
    upper-case letter. Encoding scheme names are
    often all upper-case

40
Handling namespaces
  • the DC. and DCTERMS. prefixes are used to
    indicate the namespace from which the property is
    taken
  • put the namespace URI in an XHTML <link> element:
    <link rel="schema.DC"
      href="http://purl.org/dc/elements/1.1/" />
    <link rel="schema.DCTERMS"
      href="http://purl.org/dc/terms/" />
  • while any string is allowable as the prefix,
    current practice is to use DC. and DCTERMS.

41
Value URIs
  • where the value of a property is the URI of
    another resource (e.g. DC.relation) an
    alternative form of encoding using the XHTML
    <link> element is preferred. Use the following
    pattern:
    <link rel="propertyName" href="valueURI" />
  • for example:
    <link rel="DC.relation"
      href="http://www.example.org/" />
    <link rel="DCTERMS.references"
      href="http://www.example.org/176459.pdf" />

42
HTML profile
  • to give recipient software applications an
    indication of the XHTML profile that was used to
    encode the DCMI metadata, the profile attribute
    of the XHTML <head> element should be used:
    <head profile="http://dublincore.org/documents/dcq-html/">

43
The case of names
  • note that some of the old DCMI documents
    recommend(ed) using an uppercase first letter for
    the names of DCMES elements and element
    refinements, e.g. using "Title" rather than
    "title"
  • this form of DCMES element naming is acceptable
    but is no longer considered the preferred form

44
The case of names (2)
  • in general, any software applications that
    consume DC records embedded into XHTML Web pages
    should ignore the case of names and should treat
    both "." (full-stop) and ":" (colon) as valid
    characters after the prefix
  • i.e. all the following forms should be treated as
    being equivalent:
    <meta name="DC.date" content="2001-07-18" />
    <meta name="DC.Date" content="2001-07-18" />
    <meta name="dc.date" content="2001-07-18" />
    <meta name="dc:date" content="2001-07-18" />
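A consuming application could apply this leniency by normalising names before comparing them. The sketch below uses the standard library's `html.parser`; the class name `DCMetaParser` and the choice of lower-case, dot-separated names as the normal form are assumptions made for illustration.

```python
from html.parser import HTMLParser


class DCMetaParser(HTMLParser):
    """Collect DC statements from <meta> elements, ignoring name case and
    treating "." and ":" after the prefix as equivalent."""

    def __init__(self):
        super().__init__()
        self.statements = []

    def handle_starttag(self, tag, attrs):
        # Also called for self-closing <meta ... /> elements.
        if tag != "meta":
            return
        a = dict(attrs)
        name = (a.get("name") or "").lower().replace(":", ".")
        if name.startswith(("dc.", "dcterms.")) and a.get("content") is not None:
            self.statements.append((name, a["content"]))


doc = """
<meta name="DC.date" content="2001-07-18" />
<meta name="DC.Date" content="2001-07-18" />
<meta name="dc.date" content="2001-07-18" />
<meta name="dc:date" content="2001-07-18" />
<meta name="keywords" content="not DC" />
"""
parser = DCMetaParser()
parser.feed(doc)
print(parser.statements)
```

All four DC variants come out under the same normalised name, while the unprefixed keywords element is skipped.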

45
Older versions of HTML
  • all the examples in this tutorial conform to
    XHTML 1.0
  • most of the recommendations in this tutorial can
    be applied to older versions of HTML (e.g. HTML
    4.01) but
  • use lang rather than xml:lang
  • older versions of HTML do not require the
    trailing / before the closing > in the HTML
    <meta> element
  • very old versions of HTML have no <meta> scheme
    attribute

46
Mixing DC and non-DC
  • DC metadata can be mixed with non-DC metadata in
    XHTML <meta> elements
  • the following example embeds DC, AGLS and
    unspecified metadata properties in the same XHTML
    Web page:
    <link rel="schema.DC"
      href="http://purl.org/dc/elements/1.1/" />
    <link rel="schema.AGLS"
      href="http://www.naa.gov.au/recordkeeping/gov_online/agls/1.2" />
    <meta name="DC.title"
      content="Services to Government" />
    <meta name="keywords"
      content="archives, information management, public administration" />
    <meta name="AGLS.Function" scheme="AGIFT"
      content="recordkeeping standards" />

47
A couple of examples
  • Simple DC: example 1
  • Qualified DC: example 2
  • ScreenCam of using DC-dot
    http://www.ukoln.ac.uk/metadata/dcdot/

48
Encoding DC in XML
49
Introduction to XML
  • this section is based on: Guidelines for
    implementing Dublin Core in XML
    http://dublincore.org/documents/dc-xml-guidelines/
  • nine recommendations for encoding DC in XML
  • Note: these recommendations do not create
    RDF/XML. This is not intended to imply that
    plain XML is better than RDF/XML. RDF is covered
    in the next section!

50
Recommendation 1
  • implementers should base their XML applications
    on XML Schemas rather than XML DTDs
  • approaches based on XML Schemas are more flexible
    and are more easily re-used within other XML
    applications
  • in some cases it may be sensible to provide
    both an XML Schema and a DTD for the application.
    Where XML Schemas are not used, a DTD should be
    provided instead

51
Recommendation 1 (2)
  • the DCMI maintains a list of XML schemas that are
    in use in projects or products using DCMI
    metadata: DCMI Metadata expressed in XML Schema
    Language
    http://dublincore.org/schemas/xmls/

52
Recommendation 2
  • implementers should use URI references (see
    later) to uniquely identify DC elements, element
    refinements and encoding schemes
  • DC namespaces are defined in the DCMI Namespace
    Recommendation...

53
Container elements
  • note that it is anticipated that records will be
    encoded within one or more container XML
    element(s) of some kind
  • this tutorial makes no recommendations for the
    name of any container element, nor for the
    namespace that the element should be taken from
  • candidate container element names include <dc>,
    <dublinCore>, <resource>, <record> and <metadata>

54
Recommendation 3
  • implementers should encode properties as XML
    elements and values as the content of those
    elements
  • the name of the XML element should be an XML
    qualified name (QName) of the property:
    <dc:title>Dublin Core in XML</dc:title>
  • do not use constructs like:
    <dc:title value="Dublin Core in XML" />

55
Recommendation 4
  • the property names for the 15 DC elements should
    be all lower-case:
    <dc:title>Dublin Core in XML</dc:title>
  • do not use:
    <dc:Title>Dublin Core in XML</dc:Title>

56
Recommendation 5
  • multiple property values should be encoded by
    repeating the XML element for that property:
    <dc:title>First title</dc:title>
    <dc:title>Second title</dc:title>
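Recommendations 3 to 5 can be followed with the standard library's ElementTree. This is a sketch under stated assumptions: the container element name `metadata` is just one of the candidate names listed earlier, and the function name `simple_dc_record` is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)  # serialise with the usual dc: prefix


def simple_dc_record(pairs, container="metadata"):
    """Serialise (element, value) pairs as simple DC in plain XML."""
    root = ET.Element(container)
    for element, value in pairs:
        # Recommendation 3: the property is the element name and the value
        # is the element content. Recommendation 4: the names of the 15
        # DC elements are all lower-case.
        child = ET.SubElement(root, f"{{{DC_NS}}}{element.lower()}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Recommendation 5: repeated values are repeated elements.
xml_text = simple_dc_record([
    ("title", "First title"),
    ("title", "Second title"),
    ("creator", "Andy Powell"),
])
print(xml_text)
```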

57
Simple DC example
  • example 3

58
Recommendation 6
  • element refinements should be treated in the same
    way as other properties
  • the name of the XML element should be an XML
    qualified name (QName):
    <dcterms:available>2002-06</dcterms:available>
  • do not use any of the following:
    <dc:date refinement="available">2002-06</dc:date>
    <dc:date type="available">2002-06</dc:date>
    <dc:date>
      <dcterms:available>2002-06</dcterms:available>
    </dc:date>

59
Recommendation 6 (2)
  • element refinements are properties in their own
    right and are therefore best encoded in a similar
    way to other DC elements
  • in particular, it should be noted that element
    refinements may have further refinements of their
    own (e.g. format is refined by extent which
    might be further refined by duration)
  • nesting does not mean refinement
  • nesting might be used for other purposes

60
Recommendation 7
  • encoding schemes should be implemented using the
    'xsi:type' attribute of the XML element for the
    property
  • the name of the encoding scheme should be given
    as the attribute value, and should be in the form
    of an XML qualified name (QName):
    <dc:identifier xsi:type="dcterms:URI">
      http://www.ukoln.ac.uk/
    </dc:identifier>
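Because the encoding scheme is a QName inside an attribute value, a consumer has to resolve its prefix against the namespace declarations itself. The sketch below does this with ElementTree's `iterparse`; the container element and the function name `encoding_schemes` are assumptions, and namespace scoping is simplified to a single flat prefix table.

```python
import io
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def encoding_schemes(xml_bytes):
    """Return (element tag, encoding scheme URI, value) for typed elements."""
    prefixes = {}
    results = []
    events = ET.iterparse(io.BytesIO(xml_bytes), events=("start-ns", "end"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload      # a namespace declaration
            prefixes[prefix] = uri
            continue
        qname = payload.get(XSI_TYPE)  # payload is the finished element
        if qname and ":" in qname:
            prefix, local = qname.split(":", 1)
            if prefix in prefixes:
                value = (payload.text or "").strip()
                results.append((payload.tag, prefixes[prefix] + local, value))
    return results


sample = b"""<metadata
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <dc:identifier xsi:type="dcterms:URI">http://www.ukoln.ac.uk/</dc:identifier>
</metadata>"""
print(encoding_schemes(sample))
```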

61
Recommendation 7 (2)
  • it should be noted that there may be existing DC
    XML applications that use other conventions to
    support encoding schemes, notably the use of a
    scheme attribute of the XML element for the
    property
  • therefore, it may be sensible for software
    applications that consume DC XML to be fairly
    liberal in what they accept

62
Recommendation 8
  • elements, element refinements and encoding
    schemes should use the names specified in DCMI
    Metadata Terms
    http://dublincore.org/documents/dcmi-terms/
  • note, the 15 DCMES element names all start with
    a lowercase letter

63
Recommendation 8 (2)
  • element and element refinement names may be
    mixed-case but should always have a lower-case
    first letter
  • encoding scheme names may be mixed-case but
    should always start with an upper-case letter:
    <dcterms:isPartOf xsi:type="dcterms:URI">
      http://www.bbc.co.uk/
    </dcterms:isPartOf>
    <dcterms:temporal xsi:type="dcterms:Period">
      name=The Great Depression; start=1929; end=1939;
    </dcterms:temporal>

64
Recommendation 9
  • where the language of the value is indicated, it
    should be encoded using the xml:lang attribute:
    <dc:subject xml:lang="en">seafood</dc:subject>
    <dc:subject xml:lang="fr">fruits de mer</dc:subject>

65
Some examples
  • Qualified DC: example 4
  • DC and IMS: example 5
  • DC, IMS and ODRL: example 6

HEALTH WARNING: Examples 5 and 6 may seriously
damage your interoperability!
66
Encoding DC in RDF
67
What is RDF?
  • Resource Description Framework
  • W3C recommendation for metadata
  • model and syntax(es)
  • XML is most common syntax in use on the Web
  • underpins the semantic Web
  • W3C - Resource Description Framework (RDF)
    http://www.w3.org/RDF/

68
Why use RDF?
  • RDF provides shared metadata model
  • shared meaning
  • metadata can be shared between applications that
    have little or no knowledge about each other
  • e.g. an RDF-based bibliographic application can
    consume RDF-based geospatial metadata and have
    'some' knowledge of what it means
  • with (X)HTML and XML encodings, software
    applications must have understanding hard-coded
    into them

69
DC in RDF
  • DC abstract models map easily onto the RDF model
    (because RDF was the basis for them!)
  • DC in RDF/XML syntax is an encoding of the RDF
    model in XML
  • simple DC is similar to the non-RDF XML we've
    seen already
  • but with the addition of <rdf:RDF> and
    <rdf:Description> container elements
  • example 7

70
RDF basics: the model
  • model based on triples
  • a resource has a property which has a value
  • often represented as an arc-node diagram (or
    graph)
  • resources and properties are identified using URI
    references

[Diagram: resource --property--> value]
71
A more concrete example
  • The graph below approximately translates into
    English as
  • the resource identified by the URI
    http://example.org/ has a dc:creator that is
    represented by the string "Andy Powell"

[Diagram: http://example.org/ --http://purl.org/dc/elements/1.1/creator--> "Andy Powell"]
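The same statement can be held in code as a single triple. The sketch below is plain Python with no RDF library; the `Triple` type and the `to_ntriples` helper are hypothetical names, and the output uses the line-based N-Triples notation only as a convenient way to print the graph.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str                     # URI reference of the resource
    predicate: str                   # URI reference of the property
    object: str                      # value: a literal string or a URI
    object_is_literal: bool = True


DC = "http://purl.org/dc/elements/1.1/"
graph = [Triple("http://example.org/", DC + "creator", "Andy Powell")]


def to_ntriples(t):
    """Render one triple as a line of N-Triples."""
    if t.object_is_literal:
        escaped = t.object.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.object}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."


for t in graph:
    print(to_ntriples(t))
```

Setting `object_is_literal=False` covers the case where the value is itself a resource identified by a URI.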
72
Values as resources
  • values can be resources too
  • means that we can then attach properties to the
    value as well as to the original resource
  • build up quite complex graphs

[Diagram: http://example.org/ --dc:creator--> value node, with my:name "Andy Powell" and my:phoneNumber "01225 383933"]
73
Typed and blank nodes
  • nodes can be blank (to represent resources that
    have not been assigned a URI)
  • can also indicate the class of a resource using
    the rdf:type property

[Diagram: http://example.org/ --dc:creator--> blank node with rdf:type my:Person, my:name "Andy Powell" and my:phoneNumber "01225 383933"]
74
Qualified DC in RDF
  • now ready to look at some more complex examples
  • for full details about how to encode DC in RDF
    see:
    Expressing Simple Dublin Core in RDF/XML
    http://dublincore.org/documents/dcmes-xml/
    Expressing Qualified Dublin Core in RDF/XML
    http://dublincore.org/documents/dcq-rdf-xml/

75
Case study 1: dc:creator
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www..."
         xmlns:rdfs="http://www.w3.org/..."
         xmlns:dc="http://purl.org/dc/..."
         xmlns:my="http://purl.org...">
  <rdf:Description>
    <dc:creator>
      <rdf:Description>
        <rdf:value>Andy Powell</rdf:value>
        <my:email>a.powell_at_ukoln.ac.uk</my:email>
      </rdf:Description>
    </dc:creator>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:creator
76
Case study 1: dc:creator
[Diagram: RDF graph with arcs dc:creator, rdfs:label, my:name, my:email and my:affiliation]
and the RDF model it represents.
77
Case study 1: dc:creator
[Diagram: the same graph, with the value's properties marked as relatedMetadata]
But we don't want to embed all this information
into every instance metadata record, do we?
78
Case study 1: dc:creator
[Diagram: the graph split into the description and the separately stored information about the creator]
Need to separate part of the information out and
store it in a single place, in this case in a
directory service
79
Case study 1: dc:creator
[Diagram: the split graph, with the value node labelled valueURI in both parts]
To do this we need to assign a URI (the
valueURI) to the anonymous value node
80
Case study 1: dc:creator
[Diagram: the split graph, with the separately stored part labelled relatedMetadataURI]
The document containing this information is
itself an RDF resource (the relatedMetadata)
and has a URI
81
Case study 1: dc:creator
[Diagram: the split graph, with an rdfs:seeAlso arc pointing to relatedMetadataURI]
Use rdfs:seeAlso to form linkage between
description and relatedMetadata
82
Case study 2: dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www..."
         xmlns:rdfs="http://www.w3.org/..."
         xmlns:dc="http://purl.org/dc/..."
         xmlns:dcterms="http://purl.org...">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
        <rdfs:label>Formate Dehydrogenase</rdfs:label>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
Example RDF description using dc:subject (taken
from the Qualified DC in RDF recommendation)
Case study 2: dc:subject
[Diagram: RDF graph with arcs dc:subject, rdf:value "D08.586.682.075.400", rdfs:label "Formate Dehydrogenase" and rdf:type dcterms:MESH]
and the RDF model it represents.
84
Case study 2: dc:subject
[Diagram: the same graph, with the value's properties marked as relatedMetadata]
But we don't want to embed all this information
into every instance metadata record, do we?
85
Case study 2: dc:subject
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www..."
         xmlns:rdfs="http://www.w3.org/..."
         xmlns:dc="http://purl.org/dc/..."
         xmlns:dcterms="http://purl.org...">
  <dcterms:MESH>
    <rdf:value>D08.586.682.075.400</rdf:value>
    <rdfs:label>Formate Dehydrogenase</rdfs:label>
  </dcterms:MESH>
</rdf:RDF>

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www..."
         xmlns:rdfs="http://www.w3.org/..."
         xmlns:dc="http://purl.org/dc/..."
         xmlns:dcterms="http://purl.org...">
  <rdf:Description>
    <dc:subject>
      <dcterms:MESH>
        <rdf:value>D08.586.682.075.400</rdf:value>
      </dcterms:MESH>
    </dc:subject>
  </rdf:Description>
</rdf:RDF>
[Diagram: the graph split into the description and the separately stored information about the subject term, both typed dcterms:MESH]
Need to separate part of the information out and
store it in a single place, in this case with
the terminology owner
86
Case study 2: dc:subject
[Diagram: the split graph, with the value node labelled valueURI in both parts]
To do this we need to assign a URI (the
valueURI) to the anonymous value node
87
Case study 2: dc:subject
[Diagram: the split graph, with the separately stored part labelled relatedMetadataURI]
The document containing this information is
itself an RDF resource (the relatedMetadata)
and has a URI
88
Case study 2: dc:subject
[Diagram: the split graph, with an rdfs:seeAlso arc pointing to relatedMetadataURI]
Use rdfs:seeAlso to form linkage between
description and relatedMetadata
89
Abstract DC model
lt?xml version"1.0"?gt ltrdfRDF xmlnsrdfhttp//ww
w. xmlnsrdfshttp//www.w3.org/ xmlnsdchttp/
/purl.org/dc/ xmlnsdcterms"http//purl.org"gt
ltdctermsMESHgt ltrdfvaluegt
D08.586.682.075.400 lt/rdfvaluegt
ltrdfslabelgt Formate Dehydrogenase
lt/rdfslabelgt lt/dctermsMESHgt lt/rdfRDFgt
lt?xml version"1.0"?gt ltrdfRDF xmlnsrdfhttp//ww
w. xmlnsrdfshttp//www.w3.org/ xmlnsdchttp/
/purl.org/dc/ xmlnsdcterms"http//purl.org"gt
ltrdfDescriptiongt ltdcsubjectgt
ltdctermsMESHgt ltrdfvaluegt
D08.586.682.075.400 lt/rdfvaluegt
lt/dctermsMESHgt lt/dcsubjectgt
lt/rdfDescriptiongt lt/rdfRDFgt
[Diagram: the RDF graph from slide 88 (relatedMetadataURI, valueURI, dc:subject, rdfs:label, rdfs:seeAlso, rdf:type, dcterms:MESH), annotated with the abstract model terms resource, property, value URI, value string (language), encoding scheme and related description]
In terms of the abstract DC model we now have a
resource, property, value URI, value string (and
value string language), encoding scheme and related
description
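One way to picture these abstract model parts is as fields of a single record. The sketch below is illustrative only: the class and field names are invented here, the example.org URI is hypothetical, and treating the rdfs:label literal as the value string is an assumption about how the diagram maps onto the RDF.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Statement:
    """One property/value pair about a resource (names are illustrative)."""
    property_uri: str
    resource_uri: Optional[str] = None
    value_uri: Optional[str] = None
    value_string: Optional[str] = None
    value_string_language: Optional[str] = None
    encoding_scheme_uri: Optional[str] = None
    related_description_uri: Optional[str] = None


subject = Statement(
    property_uri="http://purl.org/dc/elements/1.1/subject",
    resource_uri=None,  # the rdf:Description on the slide carries no URI
    value_string="Formate Dehydrogenase",  # assumption: taken from rdfs:label
    value_string_language="en",  # assumption: language is not shown on the slide
    encoding_scheme_uri="http://purl.org/dc/terms/MESH",
    # Hypothetical location of the relatedMetadata document.
    related_description_uri="http://example.org/mesh/D08.586.682.075.400.rdf",
)

print(subject.encoding_scheme_uri)
```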
90
Practical examples: OAI and RSS
91
OAI-PMH
  • OAI Protocol for Metadata Harvesting
  • simple protocol for sharing metadata records
    between applications
  • currently at version 2.0
  • based on HTTP, XML, XML Schema and XML namespaces
  • allows a harvester to ask a remote repository for
    some or all of its metadata records

92
OAI-PMH (2)
  • simple DC is default (mandatory) record format
  • supports any record format provided it can be
    encoded using XML (e.g. DC, IMS, MARC, ODRL, ...)
  • Open Archives Initiative:
    http://www.openarchives.org/
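To make the default record format concrete, here is a small sketch that reads the simple DC payload of an OAI-PMH record. The sample record is invented for illustration; only the oai_dc and DC namespace URIs are taken from the OAI-PMH 2.0 and DCMI specifications, and the helper name simple_dc_fields is hypothetical.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"

# Hypothetical simple DC payload, shaped like the oai_dc format.
SAMPLE = """<oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
           xmlns:dc="http://purl.org/dc/elements/1.1/">
  <dc:title>A hypothetical map title</dc:title>
  <dc:creator>Example, Author</dc:creator>
  <dc:type>image</dc:type>
</oai_dc:dc>"""


def simple_dc_fields(xml_text):
    """Collect DC element values into {element name: [values]}."""
    fields = {}
    for child in ET.fromstring(xml_text):
        # ElementTree tags look like "{namespace}name".
        namespace, _, name = child.tag[1:].partition("}")
        if namespace == DC_NS:
            fields.setdefault(name, []).append((child.text or "").strip())
    return fields


print(simple_dc_fields(SAMPLE))
```

Values are kept in lists because every simple DC element is repeatable.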

93
OAI-PMH example
  • record from the American Memory repository at the
    Library of Congress:
    http://memory.loc.gov/cgi-bin/oai2_0
  • example 8
  • ScreenCam of using the repository explorer
  • GetRecord for record identifier
    oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570
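The GetRecord request behind this example is an ordinary HTTP GET. The sketch below only builds the URL and does not contact the repository, which may no longer answer at this address. It assumes the identifier is oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570 (colons restored) and that the oai_dc metadata prefix is wanted; get_record_url is a name made up for this example.

```python
from urllib.parse import urlencode

BASE_URL = "http://memory.loc.gov/cgi-bin/oai2_0"
IDENTIFIER = "oai:lcoa1.loc.gov:loc.gmd/g3701p.rr003570"


def get_record_url(base_url, identifier, metadata_prefix="oai_dc"):
    """Build an OAI-PMH 2.0 GetRecord request URL."""
    query = urlencode({
        "verb": "GetRecord",
        "identifier": identifier,
        "metadataPrefix": metadata_prefix,
    })
    return f"{base_url}?{query}"


print(get_record_url(BASE_URL, IDENTIFIER))
```

The identifier is percent-encoded because ":" and "/" are reserved characters in a query string.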

94
RSS
  • RDF Site Summary or Rich Site Summary (or even
    Really Simple Syndication)
  • at least 3 different versions (0.91, 1.0 and 2.0)
  • all based on XML but not compatible
  • simple format for sharing news feeds on the Web
  • RSS channels: a list of items
  • channels updated by updating XML file
  • RSS clients gather XML on regular basis

95
RSS 1.0 and DC example
  • RSS 1.0 based on RDF
  • most flexible and extensible of the RSS family
    - not necessarily the most widely deployed
  • can include DC in both channel and item
    descriptions
  • example 9
  • full documentation at RDF Site Summary 1.0
    Modules: Qualified Dublin Core
    http://web.resource.org/rss/1.0/modules/dcterms/
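A minimal sketch of DC inside RSS 1.0 follows. The feed is invented for illustration (all example.org URLs and values are hypothetical); the RSS 1.0, RDF and DC namespace URIs are the published ones, and item_metadata is a made-up helper name.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rss": "http://purl.org/rss/1.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical RSS 1.0 feed with DC in both the channel and the item.
FEED = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://purl.org/rss/1.0/"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel rdf:about="http://example.org/news.rss">
    <title>Example channel</title>
    <link>http://example.org/</link>
    <description>Hypothetical news feed</description>
    <dc:publisher>Example Organisation</dc:publisher>
    <items>
      <rdf:Seq>
        <rdf:li rdf:resource="http://example.org/news/1"/>
      </rdf:Seq>
    </items>
  </channel>
  <item rdf:about="http://example.org/news/1">
    <title>First item</title>
    <link>http://example.org/news/1</link>
    <dc:creator>A. N. Author</dc:creator>
    <dc:date>2004-09-12</dc:date>
  </item>
</rdf:RDF>"""


def item_metadata(feed_xml):
    """Return the RSS title and DC creator/date of each item."""
    root = ET.fromstring(feed_xml)
    rows = []
    for item in root.findall("rss:item", NS):
        rows.append({
            "about": item.get(f"{{{NS['rdf']}}}about"),
            "title": item.findtext("rss:title", default="", namespaces=NS),
            "creator": item.findtext("dc:creator", default="", namespaces=NS),
            "date": item.findtext("dc:date", default="", namespaces=NS),
        })
    return rows


print(item_metadata(FEED))
```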

96
Assigning identifiers to metadata terms
97
What's the problem?
  • the terms used in DCMI metadata records must be
    assigned a URI reference before they can be used
  • qualified DC application profiles generally use
    local additions to DCMI terms
  • therefore these additional terms must be assigned
    a URI reference
  • a URI reference is a URI with an optional
    fragment identifier
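The "URI plus optional fragment identifier" structure can be seen with Python's standard library. This is a small sketch; split_uri_reference is a name invented here, and the rdfs:label URI is used only as a familiar example of a term URI reference that has a fragment.

```python
from urllib.parse import urldefrag


def split_uri_reference(uri_ref):
    """Split a URI reference into (URI, fragment identifier or None)."""
    uri, fragment = urldefrag(uri_ref)
    return uri, (fragment or None)


# A term URI reference with a fragment identifier.
print(split_uri_reference("http://www.w3.org/2000/01/rdf-schema#label"))
# A term URI reference without one.
print(split_uri_reference("http://purl.org/dc/elements/1.1/title"))
```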

98
DCMI terms URI references
  • all DCMI terms have already been assigned URI
    references
  • for example
    http://purl.org/dc/elements/1.1/title
    http://purl.org/dc/terms/dateCopyrighted

99
Namespace-name issues
  • encoding syntaxes split the term URI reference
    into two parts
  • namespace
  • name
  • the namespace is shortened to a namespace prefix
  • for example
  • DC.title (XHTML)
  • dc:title (XML, RDF/XML)
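The split into namespace and name, and the two prefixed forms, can be sketched as follows. The split rule (cut after the last "/" or "#") and the prefix table are assumptions made for this example, not something a syntax mandates; in real documents the prefix is whatever the document itself declares.

```python
# Assumed prefix table; real documents declare their own prefixes.
PREFIXES = {
    "http://purl.org/dc/elements/1.1/": "dc",
    "http://purl.org/dc/terms/": "dcterms",
}


def split_term(uri_ref):
    """Split a term URI reference after its last '/' or '#'."""
    cut = max(uri_ref.rfind("/"), uri_ref.rfind("#")) + 1
    return uri_ref[:cut], uri_ref[cut:]


def qname(uri_ref):
    """Prefixed name as used in XML and RDF/XML, e.g. dc:title."""
    namespace, name = split_term(uri_ref)
    return f"{PREFIXES[namespace]}:{name}"


def xhtml_name(uri_ref):
    """Dotted name as used in XHTML meta elements, e.g. DC.title."""
    namespace, name = split_term(uri_ref)
    return f"{PREFIXES[namespace].upper()}.{name}"


term = "http://purl.org/dc/elements/1.1/title"
print(split_term(term), qname(term), xhtml_name(term))
```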

100
Guidelines
  • for groups of related terms, URI references are
    typically assigned such that they can share a
    namespace prefix
  • all term URI references should resolve to a human
    and/or machine readable description of the term
  • term URI references should use a registered URI
    scheme
  • term URI references should be assigned with the
    intention of them being as persistent as the
    Internet

101
A note on namespaces
  • DCMI namespace: a DCMI namespace is a collection
    of DCMI terms (a collection of names)
  • DCMI term: a DCMI term is a DCMI element, a DCMI
    qualifier or a term from a DCMI-maintained
    controlled vocabulary
  • each DCMI namespace is identified by a URI; each
    name in the namespace is also a URI
  • a mechanism for making DCMI terms unique

102
How do I assign URIs?
  • no clear recommended best practice in this area
    yet!
  • four strategies for assigning URIs are presented
    here
  • there are other strategies!

103
Using project/service URIs
  • simple to do
  • but danger of lack of persistence
  • examples
    http://myservice.org/terms/price
    <myservice:price> (XML, RDF/XML)
    MYSERVICE.price (XHTML)
    http://myproject.org/metadata/vocabs/colorRed
    <myproject:Red> (RDF/XML)

104
Using PURLs
  • PURLs are persistent URLs (under the purl.org
    domain)
  • used by DCMI, RSS and others to provide
    persistent term URI references
  • examples
    http://purl.org/dc/elements/1.1/title
    <dc:title> (XML, RDF/XML)
    DC.title (XHTML)
    http://purl.org/rss/1.0/link
    <rss:link> (XML, RDF/XML)

105
Using xmlns.com
  • domain registered explicitly for use for XML
    namespaces
  • but persistence policy a little unclear
  • used for FOAF terms
  • example
    http://xmlns.com/foaf/0.1/firstName
    <foaf:firstName> (RDF/XML)

106
Using info URIs
  • info URIs specifically designed for identifying
    vocabulary terms
  • but not a registered scheme yet and there is
    currently some discussion (i.e. argument!) on
    various lists about whether they are a good idea
  • example
    info:ddc/22/eng//004.678
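An info URI is made up of the info scheme, a namespace and an identifier within that namespace. The sketch below assumes the example is info:ddc/22/eng//004.678 (colon restored) and uses a made-up helper name, parse_info_uri.

```python
from urllib.parse import urlsplit


def parse_info_uri(uri):
    """Split an info URI into (namespace, identifier)."""
    parts = urlsplit(uri)
    if parts.scheme != "info":
        raise ValueError(f"not an info URI: {uri!r}")
    # The namespace is everything up to the first "/".
    namespace, _, identifier = parts.path.partition("/")
    return namespace, identifier


print(parse_info_uri("info:ddc/22/eng//004.678"))
```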

107
What have we learned?
  • an abstract model for DC
  • encoding DC in XHTML
  • encoding DC in XML
  • encoding DC in RDF/XML
  • two practical examples
  • OAI Protocol for Metadata Harvesting
  • RSS
  • how to assign identifiers to new metadata terms

108
Questions?