1
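The Contribution of the Learned Societies to Building
Integrated Scientific Information Systems,
from the Perspective of Mathematics
(Der Beitrag der Fachgesellschaften zum Aufbau integrierter wissenschaftlicher Informationssysteme aus Sicht der Mathematik)
  • Joachim Lügger
  • Konrad-Zuse-Zentrum für Informationstechnik Berlin
  • META-LIB Workshop, 22-23 June 1998
  • SUB Göttingen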

2
Some Organisational Characteristics of
the Distributed Information System for Mathematics
  • Involve scientific libraries, providers of
    specialised information, publishers, special
    interest groups
  • Proceed in a user-oriented way: find out what users want
  • Plan an infrastructure for mathematics and
    extend it (Informationsbeauftragte, i.e. information
    officers, in all departments/institutes)
  • Communicate widely - look for technical
    co-operation
  • Exchange experience with related disciplines
    (IuK)
  • ...
  • Development of a news service (short notes) and
    a related information store ordered by
    collections, classifications,...

3
Elements of the Plan of the DMV (1993-95) for a
Distributed Information System for Mathematics
  • Offer of electronic preprints and reports by
    authors
  • Development of a model for electronic journals
  • Availability of research software - sources and
    test data
  • Distribute descriptions of research projects
    early
  • Build an electronic mathematical museum
    (multimedia)
  • ...
  • Usage of electronic communication (e-mail, ftp,
    gopher)
  • Organisation of a distributed information
    infrastructure
  • Development of some mechanism for search and
    retrieval in distributed and heterogeneous data
    collections.

4
User Oriented Services
  • Local - Collections of Documents
  • offer of all (!) relevant mathematical
    information
  • browsing and querying of hierarchically ordered
    collections
  • utilisation of stringent classifications by
    authors (MSC, GAMS,...)
  • National - Collections of Collections
  • integration/centralisation of collection-oriented
    services
  • news service (distribution of "first pages")
  • central store of news first-pages, which can be
    browsed and searched (switching different views)

5
Spectrum of Mathematical Information
  • Digital publications, preprints, reports
  • Electronic teaching materials (Exercises, ...,
    Applications)
  • Electronic announcements of (local) events and
    talks
  • Mathematical software, data collections
  • Electronic information on projects and research
    groups
  • Contact addresses (E-mail, Phone, Fax, ...,
    Homepages)
  • Digitised collections of historical materials
    (Manuscripts)
  • "Mathematical Museum" (Multimedia, ...,
    Visualisations)
  • Collections of links to other relevant resources

6
Origins of Metadata
  • NSF/NASA/ARPA Digital Library projects
  • Maps, images, geospatial data
  • Journals, books, general scientific information
  • Environmental databases, agricultural data
  • Videos, computer vision materials
  • Governmental archives and data
  • OCLC/NCSA (USA) and UKOLN (UK)
  • Document delivery and supercomputing
  • Library and information sciences
  • Internet, WWW and search engines
  • National Libraries, Museums, Cultural Heritage
  • Preservation of documents
  • Pictures, images (original art and
    digitisations)
  • Natural artefacts and artificial objects

7
Metadata: Data about Data
  • Improving resource discovery in networks
  • Description of resources
  • Automatic discovery, indexing and retrieval
  • Interoperability of digital libraries
  • Combination of digital resources
  • Integration of heterogeneous databases
  • Interoperability of information systems
  • Wide accessibility of catalogue information
  • Opportunities for interdisciplinary collaboration
  • Geospatial activities and environmental
    initiatives
  • Human genome project and medicine
  • Electronic publishing and education
  • Visual Arts and Information Sciences

Metadata is information that makes data useful.
The focus is not on technologies or libraries but
rather on usability and utilisation.
8
Examples of Metadata Formats
  • Libraries
  • USMARC: US Machine-Readable Cataloging
  • MAB: Maschinelles Austauschformat für Bibliotheken
    (machine exchange format for libraries)
  • Humanities
  • TEI: Text Encoding Initiative
  • EAD: Encoded Archival Description
  • Geospatial resources
  • FGDC: Content Standard for Digital Geospatial Metadata
  • Museums
  • CIMI: Computer Interchange of Museum Information
  • Government
  • GILS: Government Information Locator Service
  • Internet resources
  • IAFA: Internet Anonymous FTP Archives
  • SOIF: Summary Object Interchange Format (Harvest)

9
Do Metadata Schemes Have Something in Common?
(Diagram: Dublin Core among the communities Libraries, Internet/WWW, Museums, Astronomy, Humanities, Geospatial, Environment, Government.)
  • Research is more and more inter- and even
    transdisciplinary.
  • Information exchange between science and society
    is essential.
  • Cataloguing communities are creating their own
    metadata methodologies according to their habits,
    needs and uses.

10
Basic Dublin Core Design Principles
"The discussion was further restricted to the
metadata elements for the discovery of what we
called document-like objects, or DLOs, by the
workshop participants." [Weibel-95, on DC-1]
DC-1, March 1995, Dublin, Ohio: OCLC/NCSA Metadata
Workshop
DC-5, October 1997, Helsinki: OCLC/National Library of
Finland
  • Intrinsicality
  • Extensibility
  • Syntax Independence
  • Optionality
  • Repeatability
  • Modifiability
  • Simplicity
  • Semantic Interoperability
  • International Consensus
  • Flexibility

"Back in 1995 we focused on providing authors with
the ability to supply metadata ... . This is
happening, but not as much as we expected: most
metadata is being created by cataloguers, or by
information professionals we wouldn't quite call
cataloguers, or by other non-authorial agents."
[Caplan-97]
11
The Dublin Core Metadata Element Set
http://purl.org/metadata/dublin_core_elements
  • Content: 1. Title *), 3. Subject and Keywords,
    4. Description, 11. Source, 12. Language,
    13. Relation, 14. Coverage
  • Intellectual Property: 2. Author or Creator,
    5. Publisher, 6. Other Contributor,
    15. Rights Management
  • Instantiation: 7. Date, 8. Resource Type,
    9. Format, 10. Resource Identifier

*) Element numbers as given in the definition.
The Dublin Core is built around the library
metaphor, i.e. the catalogue card.
12
Collections of the Math-Net Project
  • Preprints and Published Articles
  • Teaching Materials
  • Talks
  • Mathematical Software, Data Collections
  • Projects and Research Groups
  • Talks, Lectures
  • Personal Homepages

13
DC-Element Title
  • The name given to the resource, usually
    by the Creator or Publisher.
  • DC.Title
  • SUBELEMENTS
  • DC.Title.Alternative (used for any titles
    other than the main title including subtitle,
    etc.)

The qualifier LANG may be important for the Title
element.
14
1. Title
  • The name given to the resource, usually
    by the Creator or Publisher.
  • DC.Title, by collection:
  • Preprints: complete title of the article
  • Teaching: complete title of the work
  • Talks: title of the talk
  • Software: name of the software/source
  • Projects: name of the project/group
  • Personal: n.a.

No subtitles, no special schemes or qualifiers
are used.
15
2. Author or Creator
  • The person or organisation primarily responsible
    for creating the intellectual content of the
    resource.
  • DC.Creator: n.a.
  • DC.Creator.PersonalName, by collection:
  • Preprints: last name, first name, ... (no title)
  • Teaching: last name, first name, ... (?)
  • Talks: last name, first name, ... (?)
  • Software: last name, first name, ...
  • Projects: name of the head of the group
  • Personal: last name, first name, ... (no title)
  • Contact subelements of DC.Creator, by collection:
  • Preprints: .Email (e-mail address)
  • Teaching: .PersonalName.Address (e-mail address)
  • Talks: .Address (e-mail, fax, phone, office address)
  • Software: .Email (e-mail address)
  • Projects: .PersonalName.Address (e-mail address)
  • Personal: .PersonalName.Address (e-mail, fax, phone,
    office address)

16
DC Problem: a Name and its Notation (I)
  • you need a scheme to write a name and to interpret it
    automatically
  • Grötschel, Prof. Dr. M.
  • Prof. Dr. Martin Grotschel
  • Martin Gr\"otschel
  • Gr&ouml;tschel, M., Prof. Dr.
  • you must write it correctly if you want alphabetic
    lists
  • you need a coding convention (incl. accents,
    umlauts, etc.)
  • There is no provision for a universal coding
    scheme; e.g.,
  • LCNAF (the Library of Congress Name Authority File)
    is community specific.
  • you need subelements for proper discrimination in
    searches
  • DC.Creator ...
  • DC.Creator.PersonalName ...
  • DC.Creator.CorporateName ...
  • DC.Creator.PersonalName.Address ...
  • DC.Creator.CorporateName.Address ...
Who is the creator in case of a digitised
manuscript from Gauß? Is it the person who
digitised it or Gauß?
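
The notation problem on this slide can be made concrete with a small sketch. It is only an illustration, not part of the Math-Net tooling: the helper names `decode` and `sort_key`, the list of titles to strip, and the decision to fold "ö" to "o" (rather than "oe") are all assumptions. It handles just the four notations listed on the slide.

```python
import html
import re
import unicodedata

# Academic titles to strip; an illustrative list, not a standard.
TITLES = {"prof.", "dr."}

def decode(raw: str) -> str:
    """Turn HTML entities and the TeX umlaut accent into Unicode."""
    text = html.unescape(raw)                            # Gr&ouml;tschel -> Grötschel
    text = re.sub(r'\\"([A-Za-z])', "\\1\u0308", text)   # \"o -> o + combining diaeresis
    return unicodedata.normalize("NFC", text)

def sort_key(raw: str) -> str:
    """Return 'lastname, first initial', folded to lower-case ASCII."""
    text = decode(raw)
    parts = [p.strip() for p in text.split(",")]
    if len(parts) > 1:
        # "Last, First ..." notation: the family name comes first.
        last = parts[0]
        given_tokens = " ".join(parts[1:]).split()
    else:
        # "Title First Last" notation: the family name comes last.
        tokens = [t for t in text.split() if t.lower() not in TITLES]
        last, given_tokens = tokens[-1], tokens[:-1]
    given = [t for t in given_tokens if t.lower() not in TITLES]
    initial = given[0][0] if given else ""
    folded = unicodedata.normalize("NFKD", f"{last}, {initial}")
    return "".join(c for c in folded if not unicodedata.combining(c)).lower()

VARIANTS = [
    "Grötschel, Prof. Dr. M.",
    "Prof. Dr. Martin Grotschel",
    r'Martin Gr\"otschel',
    "Gr&ouml;tschel, M., Prof. Dr.",
]
KEYS = {sort_key(v) for v in VARIANTS}
```

All four variants collapse to one key here only because the rules were written for exactly these cases; a general solution needs the agreed coding convention and subelements the slide asks for.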
17
DC Problem: a Name and its Notation (II)
If all of these problems are solved, then there
remains the ...
  • Tschebytscheff Problem [Hazewinkel,
    Osnabrück, Oct. 1997]

Chebychef Chebycheff Chebychev Chebyhev
Chebyschev Chebysev Chebyshef Chebyskev
Tchebychef Tchebycheff Tchebychev Tchebyschef
Tchebyscheff Tchebyschev Tchebyshef
Tchebysheff Tchebyshev Tchebytcheff Tschebishev
Tschebychef Tschebyscheff Tschebychev
Tschebyschef Tschebyscheff Tschebyschev
Tschebysheff Tschebyshev Tschebyshew
There are more than 600 correct ways of
writing Tschebytscheff.
18
3. Subject (Preprints, Teaching, Talks, Software, Projects, Personal)
  • DC.Subject (no SCHEME): uncontrolled keywords,
    description
  • DC.Subject (SCHEME Math-Net): Math-Net subject classification
  • DC.Subject.MscPrimary (SCHEME msc91): primary
    MSC classification
  • DC.Subject.MscSecondary (SCHEME msc91): secondary
    MSC classification
  • DC.Subject.Msc (SCHEME msc91): union of primary and secondary MSC
  • DC.Subject.Topic: "Mathematics" (if MSC-classified)
  • DC.Subject.Pacs (SCHEME pacs): PACS classification
  • DC.Subject.Topic: "Physics" (if PACS-classified)
  • DC.Subject.Cr (SCHEME cr): CR classification
  • DC.Subject.Topic: "Computer Science" (if CR-classified)
  • DC.Subject.Zdm (SCHEME zdm): ZDM classification
  • DC.Subject.Topic: "Mathematics Education" (if ZDM-classified)
  • DC.Subject.Gams (SCHEME gams): GAMS classification
  • DC.Subject.Topic: "Software", kind of software
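
As a rough sketch of the two derived fields on this slide (the `.Msc` union and the `.Topic` values), the following helper builds the DC.Subject fields from classification codes. The function name `subject_fields`, the dictionary layout and the sample codes are assumptions made for illustration; GAMS is left out because the slide does not say how its "kind of software" topic is chosen.

```python
# Topic implied by each classification scheme, as listed on the slide.
TOPIC_BY_SCHEME = {
    "msc91": "Mathematics",
    "pacs": "Physics",
    "cr": "Computer Science",
    "zdm": "Mathematics Education",
}

def subject_fields(msc_primary=(), msc_secondary=(), pacs=(), cr=(), zdm=()):
    """Build DC.Subject.* fields from classification codes (hypothetical helper)."""
    # .Msc is the union of primary and secondary codes, duplicates removed.
    msc = list(dict.fromkeys([*msc_primary, *msc_secondary]))
    codes = {
        "DC.Subject.MscPrimary": list(msc_primary),
        "DC.Subject.MscSecondary": list(msc_secondary),
        "DC.Subject.Msc": msc,
        "DC.Subject.Pacs": list(pacs),
        "DC.Subject.Cr": list(cr),
        "DC.Subject.Zdm": list(zdm),
    }
    # Only fields that actually carry codes are emitted.
    fields = {name: values for name, values in codes.items() if values}
    # A .Topic value is added for every scheme that was actually used.
    used = {"msc91": msc, "pacs": pacs, "cr": cr, "zdm": zdm}
    topics = [TOPIC_BY_SCHEME[s] for s, values in used.items() if values]
    if topics:
        fields["DC.Subject.Topic"] = topics
    return fields

# Sample codes chosen for illustration only.
record = subject_fields(
    msc_primary=["90C10"],
    msc_secondary=["05C85", "90C10"],
    cr=["G.2.2"],
)
```

Deriving `.Topic` from the schemes in use means a search by broad discipline works even when the searcher does not know any classification code.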

19
Some Bibliographic Schemes
Rebecca Guenther, Library of Congress
  • Author or Creator
  • LCNAF: Library of Congress Name Authority File
  • Subject and Keywords
  • LCSH: Library of Congress Subject Headings
  • MeSH: Medical Subject Headings
  • AAT: Art and Architecture Thesaurus
  • LCNAF: Library of Congress Name Authority File
  • DDC: Dewey Decimal Classification
  • LCC: Library of Congress Classification
  • NLM: National Library of Medicine Classification
  • UDC: Universal Decimal Classification
  • in Germany
  • PND: PersonenNamenDatei (authority file for personal names)
  • GKD: Gemeinsame KörperschaftsDatei (authority file for corporate bodies)
  • SWD: SchlagWortnormDatei (authority file for subject headings)

20
4. Description (Preprints, Teaching, Talks, Software, Projects, Personal)
A textual description of the content of the
resource, including abstracts in the case of
document-like objects or content descriptions in
the case of visual resources.
  • DC.Description (no SCHEME): a short textual description
  • DC.Description (SCHEME url): URL of a short textual description
  • DC.Description.Abstract (no SCHEME): abstract of the
    resource
  • DC.Description.Abstract (SCHEME url): URLabstract (to
    abstract within body)
  • DC.Description.Notes (no SCHEME): additional (technical)
    information
  • DC.Description.Notes (SCHEME url): URLabstract (to note
    within body)

An abstract, a description or a note within the
body must be surrounded by special comment
texts.
21
Date, Type, and Relation
  • DC.Date (YYYYMMDD): date of last modification
  • DC.Date.Created (YYYYMMDD): date of the creation of
    the first version
  • DC.Type: preprint for preprints
  • DC.Type: article for published articles
  • DC.Type: software for published sources
  • DC.Type: Text.Homepage for personal homepages
  • DC.Type: Text.Homepage.Organisation
    for research projects/groups
  • DC.Relation (SCHEME=url): URL of related document
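
A minimal sketch of how these three elements might be filled in for one resource. The helper names, the dictionary representation and the collection keys are assumptions; only the YYYYMMDD notation and the DC.Type values come from the slide.

```python
from datetime import date

# DC.Type values per kind of resource, as far as the slide gives them.
TYPE_BY_COLLECTION = {
    "preprint": "preprint",
    "article": "article",
    "software": "software",
    "personal": "Text.Homepage",
    "project": "Text.Homepage.Organisation",
}

def dc_date(d: date) -> str:
    """Format a date in the YYYYMMDD notation used for DC.Date."""
    return d.strftime("%Y%m%d")

def date_type_fields(collection, created, modified, related_url=None):
    """Return the Date, Type and (optional) Relation fields of one resource."""
    fields = {
        "DC.Date": dc_date(modified),          # date of last modification
        "DC.Date.Created": dc_date(created),   # date of the first version
        "DC.Type": TYPE_BY_COLLECTION[collection],
    }
    if related_url is not None:
        fields["DC.Relation"] = related_url    # URL of a related document
    return fields

example = date_type_fields("preprint", date(1998, 1, 5), date(1998, 6, 22))
```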

22
DC Problems: DATE and RELATION
  • The notation of the DATE field has been fixed
    to ISO 8601.
  • "But what are we providing access to - a digital
    representation of the painting, or the Webpage it
    is upon, or both?" [Larsgaard-Dec-97]
  • 1:1 Principle
  • Each resource should have a discrete metadata
    description, and each metadata description should
    include elements relating to a single resource. It is
    desirable to be able to link these descriptions
    in a coherent and consistent manner (by usage of
    the RELATION field).
  • Subelements of the DATE-field (as of Feb-98)
  • Date.Created
  • Date.Issued
  • Date.Accepted
  • Date.Available
  • Date.Gathered
  • Date.Valid
  • Subelements of the RELATION-field (under
    development)
  • Relation.Type

As agreed upon at DC-5, Helsinki, Finland,
Oct-97: The Helsinki Metadata Workshop, OCLC/National
Library of Finland
23
DC-Element Relation - under development
  • An identifier of a second resource and its
    relationship to the present resource. This
    element permits links (via a SCHEME qualifier:
    free text by default, URL, URN, ISBN, ...) between
    related resources and resource descriptions to be
    indicated.
  • Inclusion Relation (e.g. collection, part of)
  • DC.Relation.IsPartOf
  • DC.Relation.HasPart
  • Version Relation (edition, draft)
  • DC.Relation.IsVersionOf
  • DC.Relation.HasVersion
  • Mechanical Relation (copy, format change, mirror
    copy)
  • DC.Relation.IsFormatOf
  • DC.Relation.HasFormat
  • Reference Relation (citation)
  • DC.Relation.References
  • DC.Relation.IsReferencedBy
  • Creative Relation (translation, annotation)
  • DC.Relation.IsBasedOn
  • DC.Relation.IsBasisFor
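
The relation types above come in pairs. Assuming each pair is meant to be read as a relation and its reciprocal (the slide lists them side by side but does not state this), a link recorded on one resource determines the link on the other. The names `INVERSE` and `reciprocal` and the sample URLs are illustrative.

```python
# Relation pairs from the slide, read here as mutual inverses (an assumption).
INVERSE = {
    "IsPartOf": "HasPart",
    "IsVersionOf": "HasVersion",
    "IsFormatOf": "HasFormat",
    "References": "IsReferencedBy",
    "IsBasedOn": "IsBasisFor",
}
# Add the reverse direction so lookups work from either side of a pair.
INVERSE.update({v: k for k, v in list(INVERSE.items())})

def reciprocal(source, relation, target):
    """Given 'source relation target', return the link seen from the target."""
    return (target, INVERSE[relation], source)

link = ("http://example.org/preprint.ps", "IsFormatOf", "http://example.org/preprint.tex")
back = reciprocal(*link)
```

Keeping both directions consistent is one way to link discrete descriptions "in a coherent and consistent manner", as the 1:1 principle asks.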

24
DC-Element Coverage - under development
  • The spatial and/or temporal characteristics of
    the intellectual content of the resource.
    Coverage may be modified by spatial or temporal
    qualifiers.
  • Subelements - as determined by the Coverage WG
  • DC.Coverage.PeriodName
  • DC.Coverage.PlaceName
  • DC.Coverage.t
  • DC.Coverage.x
  • DC.Coverage.y
  • DC.Coverage.z
  • DC.Coverage.Polygon
  • DC.Coverage.Line
  • DC.Coverage.3d

Spatial coverage refers to a physical region
(e.g. celestial sector); use coordinates (e.g.,
longitude and latitude) or place names that are
from a controlled list or are fully spelled out.
Temporal coverage refers to what the resource is
about rather than when it was created or made
available.
25
What is the Dublin Core?
(Diagram labels: Libraries, Internet/WWW, Museums, Astronomy, Humanities, Geospatial, Environment, Government; DC; Minimalists; Structuralists.)
Tom Baker's Theory of Pidgin Metadata
[Weibel-Oct-97]: Pidginization results from the
need for communication among groups who do not
share a common language. Creolization is the
process of complexification of a pidgin language
in order to make it more adequate to the
complexity of natural language expressivity.
26
What the Dublin Core is Not
  • It was never an objective of the DC working group
  • to design a brute-force simplification of
    cataloguing; interoperability is a main goal.
  • to reduce costs for resource descriptions;
    these costs basically depend on the users'
    needs.
  • to replace existing practice in cataloguing; the
    DC community, however, gets much useful critique
    and support from the cataloguing community.
  • to prescribe syntax, formats or implementation;
    the usage of the WWW, however, is encouraged.

The Dublin Core is a simple (and uniform)
conceptual scheme for the description of
resources.
27
Experience of a DC application to Images (III)
  • Full cataloguing is a complex, time-consuming
    process. Library administrators, when they feel
    like being horrified, figure out how much time
    (and therefore money) it takes per title - around
    67 per item, at least at Davidsons Library, ...
  • There are many more possible methods of access
    where full cataloguing is used; the question is,
    how necessary are they? And the answer is, it
    depends. What are users looking for?
  • The general experience in university libraries is
    that a brief record is sufficient, and
    indeed, this brief record is what normally
    displays in a library online catalogue.
  • Only the place of publication does not appear in
    the Dublin Core element set.

Larsgaard-Dec-97
28
Dublin Core and Classification (I)
"If we could target our searches onto words which
are used as significant terms, we could achieve
an enormous improvement in precision. Metadata
can be used to achieve this by identifying just
the major concepts of the information resource."
[Cathro-97]
  • DC metadata attributes can be used like
    classification codes to restrict the search space
    top down (thus shrinking the context)
  • Classification codes can be used within several
    DC attributes (e.g. Subject and Keywords,
    Description, Coverage) to guide navigation
    (browsing, context-switching/shrinking/widening)
  • Certain DC attribute values can be used like
    classification codes
  • The name of a scientist may guide the search for
    items which are specific to the scientist's area
    of interest.
  • Some keywords are classification terms by their
    very nature, e.g. terms of specialist terminology
    in biology or medicine.
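
The top-down restriction of the search space can be sketched as prefix matching on hierarchical classification codes. This is an illustration under stated assumptions: records are plain dictionaries, `restrict` is a hypothetical helper, and a longer MSC prefix is taken to mean a narrower subject area.

```python
def restrict(records, msc_prefix):
    """Keep records having at least one MSC code that starts with msc_prefix."""
    return [
        r for r in records
        if any(code.startswith(msc_prefix) for code in r.get("DC.Subject.Msc", []))
    ]

# Invented sample records; titles and code assignments are for illustration only.
RECORDS = [
    {"DC.Title": "Paper A", "DC.Subject.Msc": ["90C10", "05C85"]},
    {"DC.Title": "Paper B", "DC.Subject.Msc": ["90C30"]},
    {"DC.Title": "Paper C", "DC.Subject.Msc": ["11A41"]},
    {"DC.Title": "Homepage D"},  # unclassified resources drop out of the context
]

broad = restrict(RECORDS, "90")       # whole section 90
narrow = restrict(broad, "90C1")      # shrink the context further
```

Widening the context again is just a shorter prefix applied to the full collection; a full-text search would then run only within the restricted set.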

29
Metadata-oriented Browsing and Searching
(Diagram labels: Math; Entry of search terms; Papers; Software; People; MSC; GAMS classified.)
Hypertext systems like HyperWave with integrated
search engines allow switching between browsing,
searching and hierarchical navigation modes. Thus
they enable a number of powerful context
switching methods.
30
Support of Specialised Open Communities
Search Engine
User communities, such as mathematicians,
could use metadata to form (or isolate) their own
virtual collections of resources.
  • Metadata are also useful for specifying offline
    services, such as
  • alerting, announcing
  • profile-oriented searching
  • within the context of large heterogeneous
    collections of classified resources.

31
Dublin Core as Inter-Metadata
"Now it appears an even more common application of
DC is as lingua franca, a least common
denominator for indexing across heterogeneous
databases. ... The simplest way to index them
all with some degree of semantic consistency may
be to translate them all to DC." [Caplan-97]
  • Integration of heterogeneous collections of
    resources
  • Inter- and transdisciplinary research projects
    are increasingly common in modern science
  • The research process of today results in rapidly
    growing products which are separated from each
    other by their content and form
  • If the Dublin Core is widely accepted:
  • A market of search engines may also evolve on the
    grounds of the future WWW protocol suite and the
    DC as a universal data structure
  • Users of such engines may have access to an ever
    growing number of heterogeneous and also well
    structured digital resources
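
The "translate them all to DC" idea can be sketched as a crosswalk: each source keeps its own field names, and a per-source mapping renames them to Dublin Core elements before indexing. The two source schemas (`preprint_db`, `software_db`) and their field names are invented for this example and do not describe any real database.

```python
# Hypothetical crosswalks from two local schemas to Dublin Core elements.
CROSSWALKS = {
    "preprint_db": {"titel": "DC.Title", "autor": "DC.Creator", "stichwort": "DC.Subject"},
    "software_db": {"name": "DC.Title", "maintainer": "DC.Creator", "keywords": "DC.Subject"},
}

def to_dublin_core(source, record):
    """Rename the fields of one record; fields without a DC mapping are dropped."""
    mapping = CROSSWALKS[source]
    return {mapping[field]: value for field, value in record.items() if field in mapping}

def build_index(items):
    """Index (source, record) pairs by lower-cased DC.Subject."""
    index = {}
    for source, record in items:
        dc = to_dublin_core(source, record)
        subject = dc.get("DC.Subject")
        if subject:
            index.setdefault(subject.lower(), []).append(dc.get("DC.Title", ""))
    return index

ITEMS = [
    ("preprint_db", {"titel": "Cutting planes", "autor": "N.N.", "stichwort": "Optimization", "seiten": 12}),
    ("software_db", {"name": "A solver", "maintainer": "N.N.", "keywords": "optimization"}),
]
INDEX = build_index(ITEMS)
```

The crosswalk is lossy by design (the page count above is dropped), which is the price of a least common denominator.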

32
Beyond Traditional Classification/Navigation
  • Interactive Maps
  • Virtual Tourist
  • CityNet
  • Geospatial coordinates
  • EarthView
  • Living Earth
  • Icons, Images
  • Blue-Skies
  • CineBase (Video server)
  • Chronologies, Historic Maps
  • History of Mathematics (MacTutor)
  • Theory, interactive navigation
  • Famous Curves Index (MacTutor)
  • Hypermedia navigation
  • ChemWeb (XML based)
  • ICM98 (HyperWave)