Title: Dublin Core for Museums Day 1
1 Dublin Core for Museums Day 1
CIMI John Perkins jperkins_at_cimi.org
2Overview for Thursday March 25
- Introduction to Metadata
- Introducing the Dublin Core
- CIMI DC Guidelines - Dublin Core for Museums
- Break
- DC for museums continued...
- Lunch
- Practicalities of Implementing DC
- Break
- Introduction to MICI
3What's the Problem?
- Need to serve a Web audience
- Demand for content
- Uncertain quality
- Expectations for rapid easy access
- Need to be visible on the Web
- Two million web sites
- Half a billion addressable pages
- Many communities with the same problem
4What's the Problem?
- Manage and organise interconnected data
- Different types
- Different repositories
- Packages
- Interoperate with other communities
- Interoperate with other applications
- Need a way to
- Express meanings in rich and complex data
- Express the structure of our data
- Encode the transfer of data
5What's the Solution?
- Communities address their own needs
- Do so in a way that works across communities
- Standards based
- Collaborative
6What is a Community?
Based on a slide by Stu Weibel
7Communities working together
Based on a slide by Stu Weibel
8Communities working together
Metadata
Based on a slide by Stu Weibel
9What is Metadata?
- Meaningless jargon
- or a fashionable term for what we've always done
- or a means of turning data into information
- and data about data
- and the name of a film director (Luc Besson)
- and the title of a book (The Lord of the Flies).
10What is Metadata?
- Metadata exists for almost anything
- People
- Places
- Objects
- Concepts
- Databases
- Web pages
11What is Metadata?
- Metadata fulfils three main functions
- description of resource content
- What is it?
- description of resource form
- How is it constructed?
- description of issues behind resource use
- Can I afford it?
12What is Metadata?
- Many structures have evolved at different levels,
and to meet different requirements...
MICI
13For human communication we need...
Semantic Interoperability
- Standardisation of content
- "Let's talk English"
- cat milk sat drank mat
Structural Interoperability
- Standardisation of form
- "Here's how to make a sentence"
- Cat sat on mat. Drank milk.
Syntactic Interoperability
- Standardisation of expression
- "These are the rules of grammar"
- The cat sat on the mat. It drank some milk.
14Challenges
Opportunities
- Many flavours of metadata
- which one do I use?
- Managing change
- new varieties, and evolution of existing forms
- Tension between functionality and simplicity,
extensibility and interoperability
15Introducing the Dublin Core
- An attempt to improve resource discovery on the Web, now adopted more broadly
- Building an interdisciplinary consensus about a core element set for resource discovery
- simple and intuitive
- cross-disciplinary
- international
- flexible.
16Introducing the Dublin Core
- 15 elements of descriptive metadata
- All elements optional
- All elements repeatable
- The whole is extensible
- offering a starting point for semantically richer descriptions
- Interdisciplinary
- libraries, museums, government, education...
- International
- available in 20 languages, with more on the way.
17Introducing the Dublin Core
- Title
- Creator
- Subject
- Description
- Publisher
- Contributor
- Date
- Type
- Format
- Identifier
- Source
- Language
- Relation
- Coverage
- Rights
http://purl.org/dc/
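The "all elements optional, all elements repeatable" rule can be sketched in a few lines of Python. This is only an illustration: the `DCRecord` class and the sample values are hypothetical and do not come from any Dublin Core toolkit.

```python
from collections import defaultdict

# The 15 element names listed on the slide.
DC_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


class DCRecord:
    """Hypothetical container: every element is optional and repeatable."""

    def __init__(self):
        # Each element maps to a list, so zero, one or many values are valid.
        self._values = defaultdict(list)

    def add(self, element, value):
        if element not in DC_ELEMENTS:
            raise ValueError(f"not a Dublin Core element: {element}")
        self._values[element].append(value)
        return self  # returning self allows chained calls

    def get(self, element):
        # An element that was never used simply yields an empty list.
        return list(self._values.get(element, []))


# Sample values are made up for illustration.
record = (
    DCRecord()
    .add("Title", "Statue of a seated figure")
    .add("Subject", "Statue")
    .add("Subject", "Granite")
)
print(record.get("Subject"))    # repeated element: two values
print(record.get("Publisher"))  # optional element: no values
```

Storing every element as a list keeps "missing" and "repeated" as ordinary cases rather than special ones.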
18Extending DC (semantic refinement)
Improve descriptive precision by adding substructure (subelements and schemes)
- Element qualifier
- Value qualifier
Greater precision, lesser interoperability
Should "dumb down" gracefully
[Diagram labels: Affiliation, Contact Info]
Based on a slide by Stu Weibel
19Extending DC (a modular approach)
- Modular extensibility...
- additional elements to support local needs
- complementary packages of metadata
- but only if we get the building blocks right
Based on a slide by Stu Weibel
20Extending DC?
- DC offers a semantic framework
- through use of further substructure, meaning can often be clarified
John Inc.? John xyz? xyz John?
<Creator> John
- John Inc.
- John xyz
- xyz John.
<Creator> <fore name> John
21Extending DC?
- DC offers a semantic framework
- Use of domain-specific schemes greatly increases precision
Washington State? Washington DC? Washington monument?
<Coverage> Washington
- Washington State
- Washington DC
- Washington monument
<Coverage> <TGN> Washington
North and Central America, United States, Washington
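"Dumbing down gracefully" can be illustrated with a small Python sketch: a qualified statement is reduced to its plain element and value, so an application that knows only the 15 elements can still use it. The dictionary keys (`element`, `qualifier`, `scheme`, `value`) are assumptions made for this example, not part of any Dublin Core specification.

```python
def dumb_down(qualified):
    """Reduce qualified statements to simple (element, value) pairs.

    `qualified` is a list of dicts with hypothetical keys:
    element, value, and optionally qualifier and scheme.
    """
    simple = []
    for statement in qualified:
        # Drop the element qualifier and the scheme; keep element and value.
        simple.append((statement["element"], statement["value"]))
    return simple


qualified = [
    {"element": "Creator", "qualifier": "forename", "value": "John"},
    {"element": "Coverage", "scheme": "TGN", "value": "Washington"},
]
print(dumb_down(qualified))
```

The simple pairs are less precise (which Washington?) but remain usable, which is the trade-off between precision and interoperability.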
22Dublin Core in the physical world
- Dublin Core originally designed with electronic resources in mind
- Physical resources are fundamentally different
- Issues of surrogacy become more important
- Genre, Type, and Format models vary greatly
- Difficult to remember what is being described, and which characteristics of the resource and its surrogates are correct.
23Introducing Physical Objects
- Aspects of the real world are key to much of what museums do
- Physical objects have dimensions
- 23 x 46 cm
- 12 x 52 x 18 in
- 18.6 cm³
- 823 pages
- Physical objects have a form
- oil on canvas
- Tadcaster limestone
- stainless steel.
24Introducing Physical Objects
- Physical objects change over time
- constructed between AD524 and 873
- repaired in AD1270
- incorporated into ornamental arch in AD1320
- Physical objects move
- cast in Beijing
- used in Shanghai
- taken to Hong Kong
- on display in Macau.
25Introducing Physical Objects
- Physical objects are associated with people
- written by William Shakespeare
- acquired by Lord Elgin
- decreed by the Emperor Hadrian
- associated with Prince Charles Edward Stuart
- Physical objects are contextualised
- fired at the Battle of Trafalgar
- carried on Apollo 11 from the moon
- printed on the first printing press
- salvaged from the Titanic.
26Introducing Collections
- Museum objects, whether original or surrogate, are normally part of a collection
- Collections may be real...
- the Sutton Hoo hoard
- the Terracotta Warriors
- ...an aspect of the process by which objects enter the museum...
- the Burrell Collection
- Solomon Guggenheim's art collection
- or simply practical
- coins at the British Museum
- the Tate Gallery's collection of works by Da Vinci.
27Introducing Surrogacy
- Many of the resources we describe are, in reality, surrogates for something else
- a photograph of King Tutankhamen's death mask
- a photograph of a statue of George Washington
- a film of President Kennedy's assassination
- a sound recording of Neil Armstrong's "One small step for man" speech on the moon
- a copy of the Mona Lisa
- a model of the Great Wall of China
- a reproduction of the Terracotta warriors.
28Issues of Surrogacy
- Many of the resources we describe are, in reality, surrogates for something else
- we need to be clear whether we are describing the resource or its surrogate
- the sculptor of a statue is often not the person who made its photographic surrogate
- the model of the Forbidden City is unlikely to have been created at the same date as the Forbidden City itself
- the format of a computer image of the Mona Lisa (image/jpeg?) is not the same as the format of the original painting (oil on canvas?).
29Other Museum Issues
- Museums need to describe real objects and surrogates in a similar manner
- guidelines/standards therefore need to encompass both, despite their differences
- Resource descriptions will often be drawn from existing collection management systems in the first instance, rather than created afresh
- guidelines therefore need to respect existing practices within established systems
- There is often no right answer
- so practices need to allow for approximate dates, multiple possible creators, etc.
30Introducing the 1:1 Principle
- The broader Dublin Core community is tackling some of the problems relevant to museums
- Their work on the 1:1 Principle is especially useful in resolving museum issues over original versus surrogate and item versus collection
- each Dublin Core record should describe only one resource, whether surrogate or original. Associated resources should be linked together by means of the Relation element in Dublin Core.
31Introducing the 1:1 Principle
- In a record describing a photo of the Mona Lisa on a web page, for example
- Leonardo da Vinci is not the creator of the image
- The image was not created during the Renaissance
- but you might include these as Subject terms, and you could usefully provide a link to the record describing the real painting via Dublin Core's Relation element
- Equally, in describing the painting itself
- http://www.louvre.fr//monalisa.jpg is not the Identifier of the painting
- but you might link to this image via Relation, just to show people what the painting looks like.
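The Mona Lisa example can be written out as two separate records, one per resource, linked through Relation. The sketch below uses plain Python dictionaries; the `example:` identifiers and the titles are invented for illustration and are not real museum identifiers.

```python
# Two separate records, one per resource (1:1), linked through Relation.
# All identifiers below are made up for illustration.
painting = {
    "Identifier": ["example:painting/mona-lisa"],
    "Title": ["Mona Lisa"],
    "Creator": ["Leonardo da Vinci"],
    "Format": ["oil on canvas"],
    "Relation": ["example:image/mona-lisa-jpeg"],
}
image = {
    "Identifier": ["example:image/mona-lisa-jpeg"],
    "Title": ["Photograph of the Mona Lisa"],
    # Leonardo and the Renaissance describe what the image is about,
    # so they appear as Subject terms rather than Creator or Date.
    "Subject": ["Leonardo da Vinci", "Renaissance"],
    "Format": ["image/jpeg"],
    "Relation": ["example:painting/mona-lisa"],
}

records = {r["Identifier"][0]: r for r in (painting, image)}


def related(record):
    """Return the known records that this record points at via Relation."""
    return [records[i] for i in record.get("Relation", []) if i in records]


print(related(image)[0]["Title"])
```

Neither record borrows properties from the other; a user who finds the image can follow Relation to reach the painting, and vice versa.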
32The primacy of Type
- In describing museum objects, it is often most useful to first decide what you are describing and why, rather than beginning with "who made it?" and "what is it called?", as is often the case with books
- if you know you're describing a surrogate of the Mona Lisa, then you know Leonardo da Vinci is not the Creator; whoever made the surrogate is
- if you know you're describing a collection of 20th century paintings, then you know that Picasso, Hockney et al. are not the Creators; the collector is.
33The primacy of Type
- if you know you're describing the Sutton Hoo helmet, then the fact that it was added to a particular museum collection in 1939 perhaps doesn't matter; that information is better placed in the collection record
- if you know you're describing a natural specimen, then perhaps it has no Creator; there may be a creator associated with its identification or collection, though.
34Dublin Core for Museums Assumptions
- In applying Dublin Core to museums, we are making certain basic assumptions, many of which were tested by CIMI
- DC is appropriate for use in describing both physical and digital resources
- DC is easy to learn and simple to use
- Information can be meaningfully and efficiently extracted from existing museum systems in order to populate DC records
- the creation of a DC record to describe a museum object is cost-effective, and aids the discovery of resources more than simply allowing access to the underlying Collection Management system might.
35Practicalities of Implementing Dublin Core
Paul Miller, UK Office for Library and Information Networking, p.miller_at_ukoln.ac.uk
Thomas Hofmann, Australian Museums On-Line, thomash_at_amol.org.au
36Overview
- Creation and Maintenance
- Harvesting and Distribution
- Retrieval
- Implementation Models
- Case Study
37Dublin Core - Refresher
- 15 simple elements
- Focus on Resource Discovery, not Resource Description
- One Dublin Core record per resource
- Interoperable across communities
- Can be easily populated from existing databases
- Can be formatted in XML/RDF or HTML
38When should I use Dublin Core?
- You have a rich standard, need a simpler one
- You want to disclose your data to other communities using commonly understood semantics
- You want to provide unified access to databases with different underlying schemas
- You need core description semantics and don't feel compelled to invent them anew
39Considerations
41Encoding Dublin Core
- HTML
- Unqualified
- Easy
- Qualified
- Overloaded Content (HTML 3.2)
- Additional Attribute (HTML 4)
- RDF
- Based on XML
- Sophisticated
- More complex
42Encoding Dublin Core - Unqualified
<HEAD>
<META NAME="DC.TITLE" CONTENT="My Web Page">
<META NAME="DC.Subject" CONTENT="Computers,Metadata">
</HEAD>
43Encoding Dublin Core - Qualified (HTML 3.2)
<HEAD>
<META NAME="DC.Subject" CONTENT="(SCHEME=AAT)(LANG=EN) Statue, Granite">
</HEAD>
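A harvester reading the HTML 3.2 style has to unpack the qualifiers from the overloaded CONTENT value. A minimal Python sketch follows, assuming the qualifiers are written as leading "(KEY=value)" groups in front of the value; `parse_overloaded` is a hypothetical helper, not part of any published tool.

```python
import re

# Matches one leading "(KEY=value)" group, with optional whitespace before it.
_QUALIFIER = re.compile(r"\s*\(([A-Za-z]+)=([^)]*)\)")


def parse_overloaded(content):
    """Split an overloaded CONTENT string into (qualifiers, value)."""
    qualifiers = {}
    pos = 0
    while True:
        match = _QUALIFIER.match(content, pos)
        if match is None:
            break
        qualifiers[match.group(1).upper()] = match.group(2)
        pos = match.end()
    # Whatever follows the last qualifier group is the value itself.
    return qualifiers, content[pos:].strip()


print(parse_overloaded("(SCHEME=AAT)(LANG=EN) Statue, Granite"))
```

The HTML 4 style avoids this parsing step altogether, because the scheme and language travel in their own attributes.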
44Encoding Dublin Core - Qualified (HTML 4)
<HEAD>
<META NAME="DC.Subject" SCHEME="AAT" LANG="EN" CONTENT="Statue, Granite">
</HEAD>
45Encoding Dublin Core - Sub-Elements
<HEAD>
<META NAME="DC.Date.Created" CONTENT="(SCHEME=ISO8601) 1999-03-01">
<META NAME="DC.Date.Modified" SCHEME="ISO8601" CONTENT="19990325">
</HEAD>
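Writing such tags by hand is error-prone, so in practice they are usually generated. The sketch below renders HTML 4 style META tags from a list of (name, content, scheme) triples; the function name and the triple layout are assumptions for illustration only.

```python
from html import escape


def meta_tags(record):
    """Render (name, content, scheme) triples as HTML 4 style META tags."""
    lines = []
    for name, content, scheme in record:
        attrs = f'NAME="{escape(name)}"'
        if scheme:
            # The scheme goes in its own attribute, as in the HTML 4 style.
            attrs += f' SCHEME="{escape(scheme)}"'
        attrs += f' CONTENT="{escape(content)}"'
        lines.append(f"<META {attrs}>")
    return "\n".join(lines)


record = [
    ("DC.Title", "My Web Page", None),
    ("DC.Date.Created", "1999-03-01", "ISO8601"),
]
print(meta_tags(record))
```

Escaping the attribute values guards against quotation marks or angle brackets in titles breaking the page.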
46Encoding Dublin Core - RDF
... <?xml:namespace href="http://iso.ch/8601/" as="ISO"?>
<RDF:RDF>
<RDF:Description>
<DC:Date>
<RDF:Description>
<ISO:date>19990325</ISO:date>
</RDF:Description>
</DC:Date>
</RDF:Description>
</RDF:RDF>
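For comparison, here is a hedged Python sketch that builds a simple RDF/XML description with the standard library. It does not reproduce the draft syntax on the slide: it uses the namespace URIs of the later W3C RDF/XML syntax and of the Dublin Core element set version 1.1, and the resource URL is a placeholder.

```python
import xml.etree.ElementTree as ET

# Namespace URIs from the later RDF/XML and DC 1.1 specifications,
# not the draft namespace declaration shown on the slide.
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def dc_to_rdf(about, pairs):
    """Serialise (element, value) pairs as one rdf:Description."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": about}
    )
    for element, value in pairs:
        child = ET.SubElement(description, f"{{{DC_NS}}}{element}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# The URL is a placeholder, not a real resource.
print(dc_to_rdf("http://example.org/page", [("date", "1999-03-25")]))
```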
47Example Tool DC Dot
- http://www.ukoln.ac.uk/metadata/dcdot/
- Semi-automated generation of Dublin Core
- Cut and paste into document
- Conversions to HTML, SOIF, XML, WHOIS, USMARC, GILS
48Example Tool DC Dot
Screenshot of http://www.ukoln.ac.uk/metadata/dc-dot/
49Example Tool DC Dot
Screenshots of DC Dot output
50Example Tool Reggie
- http://metadata.net
- Generic creation tool for any metadata schema published to metadata.net
- Currently supports Dublin Core in 5 languages
- Syntax: HTML META tags (V3.2 and 4.0), RDF
51Example Tool Reggie
Screenshot of Reggie
52Example Tool Site Generator
- http://www.dstc.edu.au/RDU/MetaWeb/
- Tool which parses local web site and automatically creates Dublin Core metadata
- Syntax: HTML
- Java-based tool which requires JDK 1.1
53Further Information - Creation and Maint.
- Metadata Creation Tools
- General metadata page at UKOLN: http://www.ukoln.ac.uk/metadata/software-tools/
- MetaWeb: http://www.dstc.edu.au/RDU/MetaWeb/
- TagGen SE: http://www.hisoftware.com/fact_sheetcc.htm
- User Guides
- Official User Guide for Simple Dublin Core: http://purl.org/dc/core/documents/working_drafts/wd-guide-current.htm
- CIMI Guide to Best Practice Dublin Core
55Harvesting / Distribution
- Tools
- Z39.50 Gateway
- Metadata Harvester
- Full-text Search Engine
- Resources
- Indexing, harvesting tools
- http://www.searchenginewatch.com/
- http://www.searchtools.com/
- http://www.ukoln.ac.uk/metadata/software-tools/
- http://www.dstc.edu.au/RDU/MetaWeb/
- Z39.50
- http://www.ilrt.bris.ac.uk/discovery/z3950/resources/
- http://www.ukoln.ac.uk/dlis/z3950/resources/
57Retrieval
- Tools
- HTML - search forms
- HTML - predefined queries
- Z39.50 clients/ Java applets
- Standalone applications
- Interface design
- Assist users
- help them to understand what they are looking for
- give them an idea what terminologies you are using
- use commonly understood design language
58Bringing it all together: Implementation Models
59Implementation Models
- Harvesting DC into a repository (database)
- Distributed Database Search
- Full-text indexing with metadata extraction
60Implementation Models
- Harvesting DC into a repository (database)
[Diagram labels: HTML, XML, Other types (dynamic document creation from database), Harvester, Repository, Query, retrieve resource]
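The harvesting model can be sketched with the Python standard library: a parser pulls the DC.* META tags out of each HTML page and the results are stored in a repository keyed by URL. The class name, the in-memory dictionary standing in for the repository, and the example URL are all assumptions for illustration.

```python
from html.parser import HTMLParser


class DCMetaHarvester(HTMLParser):
    """Collect DC.* META tags from one HTML page."""

    def __init__(self):
        super().__init__()
        self.record = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)  # attribute names arrive lower-cased
        name = attributes.get("name") or ""
        if name.lower().startswith("dc."):
            element = name[3:].lower()
            content = attributes.get("content") or ""
            self.record.setdefault(element, []).append(content)


def harvest(html_text):
    parser = DCMetaHarvester()
    parser.feed(html_text)
    parser.close()
    return parser.record


page = """<HEAD>
<META NAME="DC.TITLE" CONTENT="My Web Page">
<META NAME="DC.Subject" CONTENT="Computers,Metadata">
</HEAD>"""

# A dictionary stands in for the repository database; the URL is a placeholder.
repository = {"http://example.org/": harvest(page)}
print(repository)
```

A query then runs against the repository rather than against the pages themselves, and the stored URL lets the user retrieve the resource.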
61Implementation Models
- Distributed Database Search
[Diagram labels: Query, retrieve resource]
62Implementation Models
- Full-text indexing with metadata extraction
[Diagram labels: HTML, XML, Other types (dynamic document creation from database), Indexer, Index DB, Query, retrieve resource]
63Questions before implementation
- Do I really need Dublin Core?
- What is my budget?
- What type of resources do I want to describe?
- Which encoding format for which resource?
- Do I have community support?
- Can I provide creation tools?
64Challenges of implementing Dublin Core
- Intellectual
- Education of information creators
- Community consensus
- Resistance against sharing information
- Technical
- Efficient tools
- Infrastructure
- Economic
- Automatic generation vs. manual creation
- Cost of training
- Cost of tools
66Dublin Core for the masses
- Why Dublin Core hasn't hit the consumer market yet
- No killer application
- Lack of standardisation
- No support in public search engines
- No support in mass market applications
- Non-transparent applications
- Inefficient handling in HTML
67Further Information
- Projects: Official Dublin Core web site http://purl.oclc.org/dc/projects/index.htm
- Mailing lists: Dublin Core Implementors workgroup mailing list http://www.mailbase.ac.uk/lists/dc-implementors/
69Case Study AMOL (1)
- Gateway to Australian Museums and Galleries
- Initial idea: one central access point for all Australian collections
- Creation of AMOL standard record for object data due to lack of common standards
- 8 basic fields with focus on resource discovery and easy deployment from within existing databases
- Fields: Object Title, Object Name, Creator, Description, Item ID, KeySearchTerms, Date/DateRange, Associated Places
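Because the AMOL record is small, populating Dublin Core from it is largely a field-to-element mapping. The crosswalk below is hypothetical: the slides do not give AMOL's actual mapping, and the choice of Type for Object Name and Coverage for Associated Places in particular is only an assumption, as is the sample record.

```python
# Hypothetical crosswalk from the eight AMOL fields to Dublin Core elements.
# This mapping is an illustration, not AMOL's published mapping.
AMOL_TO_DC = {
    "Object Title": "Title",
    "Object Name": "Type",
    "Creator": "Creator",
    "Description": "Description",
    "Item ID": "Identifier",
    "KeySearchTerms": "Subject",
    "Date/DateRange": "Date",
    "Associated Places": "Coverage",
}


def amol_to_dc(amol_record):
    """Convert one AMOL record (field -> value) to DC (element -> values)."""
    dc = {}
    for field, value in amol_record.items():
        element = AMOL_TO_DC.get(field)
        if element is None or not value:
            continue  # skip unknown or empty fields
        dc.setdefault(element, []).append(value)
    return dc


# Sample record invented for illustration.
sample = {
    "Object Title": "Gold sovereign",
    "Object Name": "coin",
    "Item ID": "N1234",
    "Description": "",
}
print(amol_to_dc(sample))
```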
70Case Study AMOL (2)
AMOL search/ system architecture - current system
71Case Study AMOL (3)
Lessons Learned
- Data and technology related
- Lack of consistent use of controlled vocabularies, quality of data recorded
- Performance of indexing software, lack of metadata support in public search engines
- High administration efforts
- Intellectual
- Users have problems with "empty text box" approach
- Limited information in record to see context with larger picture
- General
- Large institutions: bureaucratic machinery, complex collection systems designed without interoperability in mind
- Small institutions: concerned about security issues, fear of larger institutions
72Case Study AMOL (4)
New perspectives
- New resource types: information about institutions, images, video, audio, general HTML pages
- goes beyond capabilities of standard AMOL record
- Need to provide easier access for users
- New cross-community projects require interoperable metadata standards for cross-domain searching
- Strong move in Australia towards Dublin Core based metadata schemas, driven by government
- Strong move towards interpretation of objects through stories
- Search architecture and extended AMOL metadata standard
73Case Study AMOL (5)
NEW AMOL search/ system architecture
74Case Study AMOL (6)
- Future Directions
- Implementation of RDF for dynamically served databases and text style resources
- Consensus of community: Metadata Forum
- Further education of users: Metadata Workshops
- Creation of multi-type metadata schema based on Dublin Core
- Creation of mapping tools for easier database implementation
75Case Study AMOL (7)
- Recommendations
- Prepare good user guides
- Run workshops and educate museum professionals
- Get consensus from community
- Plan with interoperability in mind
- Evaluate tools and plan for future additions
- Biggest Problem still remaining
- what is the benefit to the individual institution, other than being interoperable for networked resources?
79http://www.cimi.org/
80For Machine Communication we need..
Semantic Interoperability
- Standardisation of content
- "Let's talk Resource Description"
- Creator, Publisher...
Structural Interoperability
- Standardisation of form
- "Let's use MICI"
- Field 1: Element Name
Syntactic Interoperability
- Standardisation of expression
- "Here's how to say it in HTML"
- <Meta name="Element Name" ...>