DCC Workshop on Persistent Identifiers: A Layered Model Decision Making Concerning Persistent Identifiers

1
DCC Workshop on Persistent Identifiers
A Layered Model Decision Making Concerning
Persistent Identifiers
  • Stuart Weibel
  • Senior Research Scientist
  • July 1, 2005

2
Open Univ
  • How do you establish that you need a persistent
    identifier (as opposed to a non-persistent one)?
  • E.g. OU is creating a new course, with a very
    complex collection of courseware, and an equally
    complex workflow which manages all the parts
  • At what stage do we assign persistent identifiers
    to all the bits?
  • SW: Is this a granularity issue?
  • In part, but there are ownership issues

3
CETIS
  • Need to be careful about advice, and who we give
    it to
  • Where's the line between what you do and don't
    need to care about?
  • At what point does persistence become important?
  • PB: "Publish" is the crucial word here

4
LSE
  • One or many systems?
  • Digitising a photo archive: use a separate system
    for that?
  • Who assigns identifiers at the individual school
    level?
  • Managing identifier incrementation
  • What about managing versions in a repository?
  • Separate identifiers for distinct manifestations
    of the same article?

5
Nat Lib Scot
  • Anyone with experience of Nat Lib of Aus scheme?

6
Strathclyde Univ CDLR
  • Confession: I assign identifiers
  • And I've changed them
  • And changed their URLs
  • But I've left redirects
  • Is this ok?
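
The "changed them, but left redirects" practice can be pictured as a table mapping each retired identifier to its successor. The following is a minimal sketch under that assumption; the `resolve` function, the `max_hops` limit and the example.org URLs are hypothetical and are not taken from the Strathclyde system.

```python
def resolve(identifier, redirects, max_hops=10):
    """Follow recorded redirects from an old identifier to its current form."""
    current = identifier
    seen = {current}
    for _ in range(max_hops + 1):
        if current not in redirects:
            return current  # nothing newer is recorded: this is the live form
        current = redirects[current]
        if current in seen:
            raise ValueError(f"redirect loop at {current!r}")
        seen.add(current)
    raise ValueError(f"more than {max_hops} redirects from {identifier!r}")


# Hypothetical history: the identifier was changed twice, leaving a redirect each time.
redirects = {
    "http://example.org/old/42": "http://example.org/items/42",
    "http://example.org/items/42": "http://example.org/id/42",
}
print(resolve("http://example.org/old/42", redirects))
```

Old references keep working only for as long as the redirect table itself is maintained, so the persistence commitment moves from the identifier to the table.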

7
Edin Univ Library
  • How to relate identifiers for a physical object to
    digital manifestations of that object
  • Preexisting vs. internal institutional
    identifiers
  • Public vs. private
  • How to implement identifiers within a METS document
  • Enders: Not a METS problem, an XML problem:
    a SICI is not an XML ID

8
Edin Univ Lib
  • If http URIs are a given, should they be opaque
    (e.g. ISBNs) or semantic (e.g.
    http://www.w3.org/REC-xml/)?
  • Do you know your own barcode?
  • Semantic
  • Human readable (within a designated community,
    broad (native speakers of English) to narrow
    (well-trained biomed librarians))
  • vs
  • Rule-based, so decodable given some additional
    info
  • Opaque
  • E.g. ARK, random string, with (by design) no
    structure
  • E.g. LibCon(?) concatenation of year and
    accession number
  • What am I trading off wrt one choice or
    another?
  • Are http URIs intrinsically semantic?
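
The trade-off between opaque and rule-based names can be made concrete with two minting functions. This is only an illustrative sketch: the alphabet, the string length and the `year-accession` layout are assumptions chosen for the example, not the rules of ARK or of any library's real scheme.

```python
import secrets

# Assumed alphabet: digits plus consonants, so random strings cannot spell words.
ALPHABET = "0123456789bcdfghjkmnpqrstvwxz"


def mint_opaque(length=10):
    """Random string with (by design) no structure: nothing to decode, nothing to go stale."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def mint_rule_based(year, accession):
    """Concatenate year and accession number; decodable by anyone who knows the rule."""
    return f"{year:04d}-{accession:06d}"


def decode_rule_based(identifier):
    """Recover (year, accession) from a rule-based identifier."""
    year, accession = identifier.split("-")
    return int(year), int(accession)


print(mint_opaque())
print(mint_rule_based(2005, 42))
print(decode_rule_based("2005-000042"))
```

The rule-based form tells a trained reader something about the object, but it becomes misleading if the encoded facts ever change. The opaque form carries no such risk and no such help.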

9
Enders
  • If we've assigned identifiers, in a non-online
    repository, how can I get a persistent
    identifier?
  • E.g. DSPACE
  • AD: DSPACE uses Handle
  • CETIS: Fedora uses Handle (?)
  • Strathclyde: Eprint.org does not offer any

10
CETIS
  • Authoring and content tools should help out here
  • SW: Where in the workflow should identifiers be
    assigned?

11
EDINA
  • To what extent should we have an international
    non-territorial view of assigning identifiers?
  • Or should we stay within the bounds of the nation
    state?

12
Glasgow Univ Lib
  • If from my repository I want to refer to some
    other repository's identifier, in a persistent
    way
  • Can I do so responsibly, independently of how
    they've done this?
  • JK: If they don't publish a name they take
    responsibility for, don't go there!
  • Image archive example: you can email a filename
    and get a temporary URI (ftp) back; how can I
    write down into a formal workflow a reference to
    "BigBear file name so-and-so"?
  • So it's a permanent identifier for a static
    digital resource, but I can't refer to it
    formally
  • RR: Instance of the general case of offline
    resources
  • SW: They need to be pressured to recognise the
    value of e.g. prepending http
  • But they won't do that today, so I'm going to do
    it, but that feels like breaking the rules
  • ??: What about names for resources in GRID
    computing?
  • ??: One option is Handles, recently released
    module for Globus
  • ??: Surely allocating persistent http URIs
    would be overkill with 5M large images
  • SW: Publishing the naming structure is distinct
    from delivering all of them instantly
  • tutti: design-on-the-fly

13
SunCat
  • Working on a catalogue of serials etc., heard
    about FRBR, etc., but it is inadequate for my needs;
    do I use the same identification scheme up and down
    the hierarchy?
  • AP: Important to distinguish between
    application-level questions and in-principle
    general questions
  • PB: Application-level questions are not ipso facto
    out of scope
  • And what's an expression?
  • SW: Out of scope

14
UKOLN
  • Candidate functional requirements for persistent
    identifiers
  • Compare for equality
  • Retrieve from it
  • Do we get more specific, e.g. ARK?
  • Ding an sich (JK: or suitable surrogate)
  • HST: Demurs
  • Description
  • Cover the case where the identifier no longer
    identifies anything?
  • Change history
  • PB: There's a wealth of relevant experience
  • Persistence commitment
  • SW: Demur: not always desirable, sometimes not
    desirable
  • Encode and distribute
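
Several of the candidate requirements above (compare for equality, retrieve, description, change history, persistence commitment) can be read as fields and operations on a single record. The sketch below is one hypothetical way to arrange them; the class name, field names and the plain-string change log are assumptions made for illustration, not a model proposed at the workshop.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IdentifierRecord:
    identifier: str
    description: str
    target: Optional[str] = None        # None: no longer identifies anything retrievable
    commitment: str = "unspecified"     # stated persistence commitment, if any
    history: List[str] = field(default_factory=list)

    def same_as(self, other: "IdentifierRecord") -> bool:
        """Compare for equality on the identifier string alone."""
        return self.identifier == other.identifier

    def retrieve(self) -> Optional[str]:
        """Return the current target (the thing itself or a surrogate), if any."""
        return self.target

    def rebind(self, new_target: Optional[str], note: str) -> None:
        """Change the binding and record the change in the history."""
        self.history.append(f"{self.target} -> {new_target}: {note}")
        self.target = new_target


record = IdentifierRecord("id:1", "Workshop notes", target="http://example.org/notes")
record.rebind(None, "withdrawn")
print(record.retrieve(), record.history)
```

Keeping the description and history after the target is gone is one way to cover the case where the identifier no longer identifies anything.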

15
NLScotland
  • Where would we go in ISO to do this
  • PB: There is an ISO technical committee
  • RR: Wrt NISO, I can feed back on where we go:
    standards, recommendations, best-practice notes

16
Open Univ
  • Can we clarify what's been done wrt Premis data
    dictionary?
  • Just published a new dictionary
  • JK: Wrt identifiers, says nothing, judged too
    controversial/out of scope

17
DCC
  • Is there a relationship between the identifier
    and the findability of a resource?
  • Given pervasive use of search engines
  • SW, PB: What are the roles in discovery and
    location of service?

18
British Library
  • If we do use http URIs for identifiers, does the
    HTTP protocol provide any particular actions
    which make sense, e.g. PUT or POST?
  • Should resolution be a determiner of existence?
  • SW: Redirection comes in here
  • In theory, any 10/13 digit number can be an ISBN
  • What does "404 Not Found" mean?
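
On the ISBN point, one nuance: the last character of an ISBN is a check digit, so well-formedness can be tested without any resolution step. ISBN-13 uses alternating weights 1 and 3 modulo 10, and ISBN-10 uses weights 10 down to 1 modulo 11 with "X" standing for 10. The function name below is a hypothetical choice, and passing the check says nothing about whether the number was ever assigned to a book.

```python
DIGITS = "0123456789"


def is_wellformed_isbn(text):
    """Check length and check digit only; this does not test existence."""
    chars = [c for c in text.upper() if c not in "- "]
    if len(chars) == 13 and all(c in DIGITS for c in chars):
        # ISBN-13: weights alternate 1, 3, 1, 3, ... and the sum must be 0 mod 10.
        total = sum(int(c) * (1 if i % 2 == 0 else 3) for i, c in enumerate(chars))
        return total % 10 == 0
    if (
        len(chars) == 10
        and all(c in DIGITS for c in chars[:9])
        and (chars[9] in DIGITS or chars[9] == "X")
    ):
        # ISBN-10: weights run 10 down to 1 and the sum must be 0 mod 11.
        values = [int(c) for c in chars[:9]]
        values.append(10 if chars[9] == "X" else int(chars[9]))
        total = sum(v * w for v, w in zip(values, range(10, 0, -1)))
        return total % 11 == 0
    return False


print(is_wellformed_isbn("978-0-306-40615-7"))
print(is_wellformed_isbn("978-0-306-40615-8"))
```

A well-formed number that resolves to nothing leaves the question from the slide open: the failure may mean "never assigned", "assigned but not online", or "resolver unavailable".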

19
Strathclyde
  • Impact of search engines: people aren't using
    identifiers any more
  • Should we question the assumption that identifiers
    are the entry point into access to resources?

20
DCC/Hunter
  • What's the impact of the fact that the community
    is split into two parts?
  • Those with detailed knowledge of the identity
    system
  • Those who just have requirements which identity
    contributes to solving
  • Bookshop example
  • PB: Ironmonger example

21
Discussion candidates
  • Semantic vs. opaque
  • What should libraries do?
  • Is "retrievable" a necessary functional
    requirement?
  • What are we identifying?
  • What is the connection between the identifier and
    the thing itself?

22
Seamus/DCC
  • Can we do automatic assignment of identifiers to
    extracts from (multiple) databases?
  • RR: About attaching metadata as you go
  • As an author, a private person, how do I get an
    identifier for my work?
  • Overwriting is OK, always?
  • Historical printing example feeds into thinking
    about multiple versions in the digital world
  • PB: Relevant term is "provenance", and the role
    of identifiers

23
UKOLN
  • Are the functional requirements on identifiers
    across domains or classes of resources?

24
EDINA
  • Is the distinction between persistence for
    record and persistence for reuse relevant to
    the design of identifiers?
  • Archival vs. access?
  • SW: Difference in behaviour based on difference
    in object identified
  • Seamus: BNFL vs. BBC example

25
In the Information World
  • We care about identifying resources
  • Physical
  • Virtual
  • Conceptual
  • Knowing you have what you think you have
  • Comparing identity (referring to the same thing)
  • Reference linking
  • Managing intellectual (or physical) property

26
What do we want from Identifiers
  • Authority
  • Reliability
  • Appropriate Functionality (resolution and other
    services)
  • Persistence throughout the life cycle of the
    information object
  • What are the business models to support
    identifiers?
  • Not just a matter of money, but costs are part of
    the equation

27
The Identifier Layer Cake
  • Identifiers come in many sizes, flavours, and
    colours: what questions do we ask?

Social
Business
Policy
Technology
Functionality
The Web: http, TCP/IP, future infrastructure?

28
Functional Layer: Operational characteristics of
Identifiers
  • Is it globally unique? (easy)
  • What is the means for matching persistence with
    the need?
  • Can a given identifier be reassigned?
  • Is it resolvable? To what?
  • How does it behave? What applications
    recognize it and act on it appropriately?
  • Is the name portion of the identifier opaque,
    or can it carry semantics?
  • Do humans need to read and transcribe them?
  • Do identifiers need to be matched to the
    characteristics of the assets they identify?

29
Information Assets have life cycles with
different characteristics
  • Journal Articles
  • Created
  • Reviewed
  • Pre-published, Published
  • Versioned
  • Sold, Resold
  • Archived
  • Cited
  • Distributed in a variety of channels (appropriate
    copy problem)
  • Concepts and Terms
  • Created and deprecated
  • Versions
  • Definitions
  • Abstract (concepts) and instantiated (terms)
  • Translations (for terms)
  • Position in a hierarchy (ontology)
  • Relations, linkages

30
Technical Layer
  • What dependencies are assumed?
  • http, tcp/ip, (bar code, RFID) scanners
  • What is the nature of the systems (both software
    and social) that support assignment, maintenance,
    resolution of identifiers?
  • Are servers centralized? federated? peer to peer?
  • How is uniqueness assured?

31
Policy Layer
  • Who has the right to assign or distribute
    Identifiers?
  • Who has the right to resolve them or offer
    services against them?
  • What are appropriate assets for which identifiers
    can be assigned, and at what granularity?
  • Can identifiers be recycled?
  • Can ID-Asset bindings be changed?
  • Is there supporting metadata, and if so, is it
    public, private, or indeterminate?
  • Is there a governance model?

32
Business model layer
  • Who pays the cost?
  • How, and how much?
  • Who decides (see governance model)?
  • The problem with identifier business models
  • Those who accrue the value are often not the same
    as those who bear the costs
  • You probably can't collect revenue for resolution

33
Social Layer
  • The only guarantee of the usefulness and
    persistence of identifier systems is the
    commitment of the organizations which assign,
    manage, and resolve identifiers
  • Who do you trust?
  • Governments?
  • Cultural heritage institutions?
  • Commercial entities?
  • Non-profit consortia?
  • We trust different agencies for different
    purposes at different times

34
Terminology Identifiers
  • Global, persistent identifiers that reflect the
    functional characteristics of webulated
    controlled vocabularies can help us remove
    boundaries between and among communities and
    disciplines.
  • Problem: Identify these functional requirements
    and tailor identifier systems to meet them.

35
Identifiers for Concepts
  • How do you use terminology in the Web World?
  • The Semantic Web is about semantics: exchanging
    tokens of meaning between machines
  • Identifiers are a fundamental part of this.

36
Concepts can be expressed in language independent
ways (even if imperfectly)
  • Vietnamese War, 1961-1975
    DDC/22/eng//959.7043
  • (English language version of DDC 22)
  • American War, 1961-1975 DDC/22/vie//959.7043
  • (Vietnamese language version of DDC 22)
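
The two strings above differ only in their language segment, which suggests that a machine can decide "same concept" by ignoring it. The sketch below assumes the layout is scheme/edition/language//notation; that reading, and the names `ConceptId`, `parse_concept_id` and `same_concept`, are inferred from the two examples rather than taken from a published syntax.

```python
from typing import NamedTuple


class ConceptId(NamedTuple):
    scheme: str     # e.g. "DDC"
    edition: str    # e.g. "22"
    language: str   # e.g. "eng" or "vie"
    notation: str   # e.g. "959.7043"


def parse_concept_id(text):
    """Split 'scheme/edition/language//notation' into its four parts."""
    head, sep, notation = text.partition("//")
    parts = head.split("/")
    if not sep or len(parts) != 3 or not notation:
        raise ValueError(f"unrecognised concept identifier: {text!r}")
    return ConceptId(parts[0], parts[1], parts[2], notation)


def same_concept(a, b):
    """True when two identifiers name the same concept, whatever the caption language."""
    x, y = parse_concept_id(a), parse_concept_id(b)
    return (x.scheme, x.edition, x.notation) == (y.scheme, y.edition, y.notation)


print(same_concept("DDC/22/eng//959.7043", "DDC/22/vie//959.7043"))
```

Under this reading the captions "Vietnamese War" and "American War" are language-specific labels attached to one shared notation.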

37
Boundary-Free Community Terminologies
  • Controlled Vocabularies have been with us for a
    long time
  • Hypothesis: there are specific functional
    requirements that terminologies should embody in
    order to be useful in the realization of the
    Semantic Web