PIDs in Data Infrastructures - PowerPoint PPT Presentation


Slides: 13
Provided by: prz8
Learn more at: https://www.ietf.org

Transcript and Presenter's Notes


1
PIDs in Data Infrastructures
  • Peter Wittenburg
  • CLARIN Research Infrastructure
  • EUDAT Data Infrastructure

2
Automatic Workflows
  • most data is created automatically as part of
    workflows
  • manual operations are exceptions
  • at data creation time it is not obvious what
    their future life will be
  • later association with metadata and PIDs
    troublesome and costly
  • thus immediate generation of metadata and PIDs as
    part of automated workflows
  • data resources need to be referable and often
    citable (published)
  • need a reliable and highly performing machinery
    (registration, resolution) based on stable
    standards

typically Handles via EPIC
typically DOIs via DataCite
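
The idea of generating metadata and a PID inside the workflow itself can be sketched as follows. This is a minimal illustration, not the EPIC or DataCite API: the in-memory `REGISTRY`, the functions `register_on_creation` and `resolve`, the prefix `12345` and the example URL are all hypothetical stand-ins for a real registration and resolution service.

```python
import hashlib
import uuid
from datetime import datetime, timezone

# Hypothetical in-memory stand-in for a PID registration service.
REGISTRY = {}


def register_on_creation(data: bytes, location: str, prefix: str = "12345") -> str:
    """Mint a PID and a minimal metadata record as soon as the data exists."""
    pid = f"{prefix}/{uuid.uuid4()}"  # Handle-style "prefix/suffix" string
    REGISTRY[pid] = {
        "url": location,
        "checksum": hashlib.sha256(data).hexdigest(),
        "created": datetime.now(timezone.utc).isoformat(),
    }
    return pid


def resolve(pid: str) -> dict:
    """Look up the record stored for a PID."""
    return REGISTRY[pid]


# A workflow step would call register_on_creation right after writing its output.
pid = register_on_creation(b"sensor output", "https://repo.example.org/objects/1")
record = resolve(pid)
```

Registering at creation time means no later, manual pass is needed to attach identifiers and metadata to objects whose context may already be lost.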
3
PID usage in our domain
  • assume that we have a recording of an extinct
    language and some annotations that tell us what
    someone said about medicine etc.
  • researchers create relations that need to be
    preserved

[Diagram: a Recording Session Metadata Record links a Video Recording, a Sound Recording and Annotations, which come from Repository A, Repository B and Repository C.]
How long-lasting, stable and persistent are these relations? We are using Handles from the EPIC service.
4
PID usage in our domain
Biological and cultural processes have evolved
together, in a symbiotic spiral; they are now
indissolubly linked, with human survival unlikely
without such culturally produced aids as
clothing, cooked food, and tools. The twelve
original essays collected in this volume take an
evolutionary perspective on human culture,
examining the emergence of culture in evolution
and the underlying role of brain and cognition.
The essay authors, all internationally prominent
researchers in their fields, draw on the
cognitive sciences -- including linguistics,
developmental psychology, and cognition -- to
develop conceptual and methodological tools for
understanding the interaction of culture and
genome. They go beyond the "how" -- the questions
of behavioral mechanisms -- to address the "why"
-- the evolutionary origin of our psychological
functioning. What was the "X-factor," the magic
ingredient of culture -- the element that took
humans out of the general run of mammals and
other highly social organisms? Several essays
identify specific behavioral and functional
factors that could account for human culture,
including the capacity for "mind reading" that
underlies social and cultural learning and the
nature of morality and inhibitions, while others
emphasize multiple partially independent factors
-- planning, technology, learning, and language.
The X-factor, these essays suggest, is a set of
cognitive adaptations for culture.
ePublication Repository 1
eResource Repository 2
How long, etc.? Handles from EPIC
5
Data Object World
  • let's isolate external properties of our data
    objects and collections and ignore the content
    (structure, semantics, packaging, etc.) for a
    moment

goes back to a paper by Kahn and Wilensky, 2006
6
2 DO flavours in our domain
DO
  • metadata (access via metadata)
  • bit sequence (instance)
  • immediate access?
  • PID (access via PID)

MDO
  • metadata (access via metadata)
  • bit sequence (instance)
  • search/browse access
  • PID (access via PID)

  • this is how we organize data
  • other variants are possible
7
collections in our domain (similar to MPEG21
containers, items, sub-items)
  • grouping of related data
  • large variety of reasons
      - versions of a DO
      - presentations of a DO
      - same interview/experiment
      - many others
  • a DO can be part of many collections

[Diagram labels]
ISOcat Registry (ISO 12620, compl. ISO 11179)
  - category 1 - assoc info
  - category 2 - assoc info
metadata (collection)
  - category 1, category 2, ..., category N
  - PID1, PID2, ..., PID K
metadata
  - category 1, category 2, ..., category N
  - PID
PID Registry
  - PID collection - assoc info
  - PID1 - assoc info
  - PID2 - assoc info
bit sequence
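
The collection structure on this slide (collection metadata carrying categories plus a list of member PIDs, with a DO allowed in many collections) can be sketched as a small data structure. The class `CollectionRecord`, the helper `collections_containing` and all PID strings are hypothetical names chosen for illustration only.

```python
from dataclasses import dataclass, field


@dataclass
class CollectionRecord:
    """Collection metadata: its own PID, some categories, and member PIDs."""
    pid: str
    categories: dict
    members: list = field(default_factory=list)


def collections_containing(member_pid, collections):
    """Return the PIDs of all collections that list the given member PID."""
    return [c.pid for c in collections if member_pid in c.members]


# The same DO appears in two collections that were built for different reasons.
versions = CollectionRecord(
    "prefx/coll-versions", {"reason": "versions of a DO"},
    ["prefx/do-1", "prefx/do-1-v2"],
)
session = CollectionRecord(
    "prefx/coll-session", {"reason": "same interview"},
    ["prefx/do-1", "prefx/do-7"],
)
found = collections_containing("prefx/do-1", [versions, session])
```

Because members are referenced by PID rather than by copy, adding a DO to another collection does not duplicate the bit sequence.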
8
EUDAT - common services
  • two major tracks
  • understanding data organization practices in
    communities
  • provide first common services after 12 months

9
PID Use V1 in EUDAT Federation
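[Diagram: the same data object DO1 is held in repository X (domain X), repository Y (domain Y) and repository Z (domain Z).]
A single PID, PIDx, is registered under prefix prefx. Its record holds:
URL, URLy, URLz, CKSM, Rights, ....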
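
A V1-style record, with one PID whose record carries the locations of all copies, might look like the sketch below. The dictionary layout, the example URLs, the placeholder values and the helper `pick_location` are assumptions for illustration; the slide only names the fields.

```python
# One PID for DO1; the record lists every copy's location (hypothetical URLs).
V1_RECORD = {
    "pid": "prefx/PIDx",
    "urls": [
        "https://repo-x.example.org/DO1",
        "https://repo-y.example.org/DO1",
        "https://repo-z.example.org/DO1",
    ],
    "cksm": "<checksum>",          # placeholder value
    "rights": "<rights statement>",  # placeholder value
}


def pick_location(record, is_available):
    """Return the first listed URL that the caller reports as available."""
    for url in record["urls"]:
        if is_available(url):
            return url
    return None


# Simulate repository X being down: the next copy is chosen instead.
down = {"https://repo-x.example.org/DO1"}
chosen = pick_location(V1_RECORD, lambda url: url not in down)
```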
10
PID Use V2 in EUDAT Federation
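[Diagram: DO1 is held in repository X (domain X), repository Y (domain Y) and repository Z (domain Z); each copy has its own PID under its own prefix.]
  • PIDx (prefx): URL, RoR, HDL, CKSM, Rights, ....
  • PIDy (prefy): URL, RoR, HDL, CKSM, Rights, ....
  • PIDz (prefz): URL, RoR, CKSM, Rights, ....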
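
In V2 each copy has its own PID and record. The sketch below assumes that HDL holds the Handles of further copies and that RoR points to the repository of record; the slide does not spell out either field, so both readings, the example URLs and the helper `reachable_copies` are hypothetical.

```python
# One record per copy of DO1 (hypothetical URLs; field meanings are assumed).
RECORDS = {
    "prefx/PIDx": {"url": "https://repo-x.example.org/DO1",
                   "ror": "prefx/PIDx", "hdl": ["prefy/PIDy"]},
    "prefy/PIDy": {"url": "https://repo-y.example.org/DO1",
                   "ror": "prefx/PIDx", "hdl": ["prefz/PIDz"]},
    "prefz/PIDz": {"url": "https://repo-z.example.org/DO1",
                   "ror": "prefx/PIDx", "hdl": []},
}


def reachable_copies(start, records):
    """Follow HDL links from one PID and collect every copy reached."""
    seen, stack = [], [start]
    while stack:
        pid = stack.pop()
        if pid in seen:
            continue  # guard against cyclic links
        seen.append(pid)
        stack.extend(records[pid]["hdl"])
    return seen


copies = reachable_copies("prefx/PIDx", RECORDS)
```

Compared with V1, each domain administers only records under its own prefix, at the cost of having to follow links to find the other copies.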
11
EUDAT relying on EPIC Handles
  • EPIC (European PID Consortium: CSC, SARA, GWDG
    and more)
  • large data centers with national/organizational
    (MPS) support
  • applying redundancy schemes (persistence,
    availability)
  • reliability, robustness, performance
    (registration, resolution)
  • all the same API (agreement on information
    associated)
  • thus PID syntax is not crucial, but storing/finding
    information is
  • feasible business model for science
  • security of administration DB for system
  • persistent and balanced governance for HS
  • need a worldwide registry of agreed information
    types to feed our stupid machines

12
Information types in discussion
  • multiple links to resources
  • checksum
  • link to metadata
  • citation metadata
  • RoR statement
  • mutability flag
  • persistency statement
  • pointers to presentation versions
  • provenance statement
  • collection statement
  • pointer to rights
  • (support for parts/fragments)
  • (actionable PIDs)

- need agreements
- need standard APIs for EUDAT
this is crucial
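
To illustrate what agreed information types would give to machines, here is a minimal sketch of a typed PID record covering a few of the types listed above, with a checksum comparison. The class `PIDRecord`, its field names, the choice of SHA-256 and the example URL are assumptions; no agreed schema is implied.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional


@dataclass
class PIDRecord:
    """A few of the discussed information types as typed fields."""
    urls: list                          # multiple links to resources
    checksum: str                       # hex digest of the bit sequence
    metadata_url: Optional[str] = None  # link to metadata
    ror: Optional[str] = None           # RoR statement
    mutable: bool = False               # mutability flag
    rights_url: Optional[str] = None    # pointer to rights


def matches_checksum(record: PIDRecord, data: bytes) -> bool:
    """Check whether retrieved bytes match the checksum stored in the record."""
    return hashlib.sha256(data).hexdigest() == record.checksum


payload = b"example bit sequence"
record = PIDRecord(
    urls=["https://repo.example.org/DO1"],
    checksum=hashlib.sha256(payload).hexdigest(),
)
```

With a shared set of field names and types, a client could run the same integrity check against any participating repository without repository-specific code.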