Title: We Must All Be Curators Now from Ingest to Service Delivery, in Data Library
1 We Must All Be Curators Now: from Ingest to Service Delivery, in Data Library & National Data Centre
Roles & Responsibilities
- Peter Burnhill
- Director, EDINA
- JISC National Data Centre, University of Edinburgh, Scotland, UK
- 10 October 2006
2 Three different voices / roles
- Director, EDINA National Data Centre
- serving researchers, lecturers and students across the UK
- so something about what EDINA is & what EDINA does
- EDINA is funded by the JISC
- so something about the JISC & the JISC IE
- A time-served data person & fellow professional, from the University of Edinburgh
- building on the past, planning for the future
- A substitute for another guy
- trying to make sense of what is going on
- working towards shared understanding
- proposing a framework of verbs & nouns
3 Joint Information Systems Committee (JISC)
- of all the UK funding councils for higher and further education
- Mission
- world-class leadership in the innovative use of ICT for support of education & research
- Information & Communication Technology
- Income: mix of top-slice recurrent funding & capital grants
4 Funding Councils, the JISC and EDINA
Higher Ed funding councils
Research Councils as Partners
UK National Data Centres
NDCs are now HEFCE-related bodies
5 organisational infrastructure for JISC Services
- UKERNA runs Joint Academic Network (JANET)
- EDINA & MIMAS national data centres
- Arts and Humanities Data Service (AHDS)
- Economic and Social Data Service (ESDS)
- UKOLN; Centre for Educational Technology Interoperability Standards (CETIS); Digital Curation Centre (DCC); British Universities Film & Video Council (BUFVC); Technical Advisory Service on Images (TASI); Open Source Advisory Service; National Centre for Text Mining; Plagiarism Advisory Service
- JISC Legal/Monitoring/TechDis; Regional Support Centres; UK Access Management / Athens
- most located in universities across UK
6 What is EDINA?
- A National Data Centre, designated by the JISC in 1995/96
- based on Edinburgh University Data Library, est. 1983/84
- Mission: to enhance productivity of research, learning & teaching in UK higher and further education
- part of JISC Information Environment
- Keywords have been Accessibility/Outreach/Inter-working/Inter-operability
- range of development projects and 24/7 services
- Geo-spatial, about which more later ...
- Scholarly communication & Multimedia
- films, images & spoken word
- Infrastructure for Digital Library
- certificates, rights & middleware
- SDSS -> UK Access Management Federation
- And the name, what's that stand for?
- Edinburgh Data & Information Access
- Edina is the poetic name for Edinburgh
7 Delivering online services, 24/7
- http://edina.ac.uk/
9 Biog as data person these past 25 years
- Moved to the University of Edinburgh in 1979
- formerly science staff at Social Science Research Council (ESRC), 1974/77
- then medical statistician at Queen Charlotte's Maternity Hospital, 1978/79
- first as statistician & researcher (& senior lecturer)
- with Scottish Education Data Archive, from 1979
- making survey data at Govt-funded research centre (CES)
- from design, data creation and documentation, onto analysis
- as survey methodologist in Edinburgh Survey Methodology Group
- then recruited to do R&D for service delivery
- setting up & managing Edinburgh University Data Library, 1984 -
- Co-director, ESRC Regional Research Laboratory, Scotland 1986/90
- early days of Geographical Information Systems (GIS)
- member of Data Task Force, Inter-Agency Global Env. Change
- European Secretary (1993/95) & President (1996/2001) of IASSIST
- international assoc. for (social science) data librarians and archivists
- Now EDINA IS Directorate at Univ. of Edinburgh
- Was Set-up Director for Digital Curation Centre, 2003/4 to 2004/5
10 maybe I've been a data curator all along
- Scottish Education Data Archive, late 1970s - mid 80s
- Database of surveys of school leavers & cohorts of young people (16-19)
- derived data, trend datasets over time, changing classifiers (eg Social Class)
- integrating data from different sources, eg census small area statistics
- made available online but under privileged, not open, access
- Edinburgh University Data Library, mid-80s on
- Wider variety of datasets, obtained from others, often via others
- A local library of datasets
- Easing access to data held elsewhere (eg UKDA)
- made available online across ERCC wide area network and beyond
- building databases, sometimes with special software
- ESRC Regional Research Laboratory, Scotland 1986/90
- early days of Geographical Information Systems (GIS)
- Integrating large-scale data, much geographic or geo-spatial
- EDINA national data centre, mid-1990s on
- National online access to wider range of reference and source data
- obtained under licence
- required value-added curation
- Digimap as but one example
11 one example of data curation
11000152100913Playing Field 0901103
120001016400000 2100000010001004040097130
0 15000155 0321 0901103
0000000 2100000010001055810075820 0 15000156
0321 0901103 0000000 210000001000105713007669
0 0 15000157 0321 0901103
0000000 2100000010001060110075460 0 15000158
0321 0901103 0000000 210000001000106326007465
0 0 15000159 0321 8010619
0000000 2100000010001063370071760 0 15000160
0321 0901103 0000000 210000001000106673007670
0 0 15000161 0321 0901103
0000000 2100000010001058910068550 0 15000162
0321 0901103 0000000 210000001000106449006904
0 0 15000164 0321 0901103
0000000 2100000010001055710052730 0 15000173
0321 0901103 0000000 210000001000105873005039
0 0 15000174 0321 0901103
0000000 2100000010001059520050430 0 15000175
0321 0901103 0000000 210000001000105643004921
0 0 15000176 0321 0901103 0000000
OS digital data
12 maybe I've been a data curator all along
- Scottish Education Data Archive, late 1970s - mid 80s
- Database of surveys of school leavers & cohorts of young people (16-19)
- derived data, trend datasets over time, changing classifiers (eg Social Class)
- integrating data from different sources, eg census small area statistics
- made available online but under privileged, not open, access
- Edinburgh University Data Library, mid-80s on
- Wider variety of datasets, obtained from others, often via others
- A local library of datasets
- Easing access to data held elsewhere (eg UKDA)
- ESRC Regional Research Laboratory, Scotland 1986/90
- early days of Geographical Information Systems (GIS)
- Integrating large-scale data, much geographic or geo-spatial
- EDINA national data centre, mid-1990s on
- National online access to wider range of reference and source data
- obtained under licence
- required value-added curation
- Digimap as but one example
- national repositories of digital content: Jorum, GRADE, TheDepot
- Digital Curation Centre, 2004 - 2005
13 Authorising Institutions for free-at-point-of-use
HE & FE funding councils
Data Provider e.g. Ordnance Survey
Licensing Agent (JISC Collections)
Value-added Service Provider
Key role for Authentication (is-member of Institution) and Authorisation (is-licensed Institution)
end user (staff/student) access
Institution (Licence)
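
The two checks named on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the registers `MEMBERS` and `LICENSED`, the institution and service names, and all function names are hypothetical and do not describe how EDINA, Athens or any federation actually implements access control.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    institution: str  # the institution the user claims to belong to


# Hypothetical registers, invented for illustration only.
MEMBERS = {
    "University A": {"alice"},
    "College B": {"bob"},
}
LICENSED = {
    "mapping-service": {"University A"},
}


def is_member(user: User) -> bool:
    """Authentication: does the institution vouch for this user?"""
    return user.name in MEMBERS.get(user.institution, set())


def is_licensed(institution: str, service: str) -> bool:
    """Authorisation: has the institution taken the licence for this service?"""
    return institution in LICENSED.get(service, set())


def may_access(user: User, service: str) -> bool:
    """Both checks must pass before the end user reaches the service."""
    return is_member(user) and is_licensed(user.institution, service)
```

A member of a licensed institution gets through; a member of an unlicensed institution, or a non-member claiming a licensed institution, does not. Keeping the two questions separate is what lets the institution answer the first and the licensing agent answer the second.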
14 EDINA as national data centre
- http://edina.ac.uk
- 50% direct funding from JISC for delivering services
- Good reputation for helpdesk, user interfaces, FAQs etc
- 24/7, 99% uptime
- 50% is extra, awarded for Development activity
- Developing services & developing JISC IE, working with Researchers
- Acknowledged project competence for R&D
- Strategic role as Geographic Data Centre
- For JISC (Digimap etc), for ESRC (UKBORDERS)
- Building Spatial Data Infrastructure with NERC and internationally (OGC)
15 Existing Geo-data Services
16 Where are we with GIS?
- University of Edinburgh & its Data Library have long-run interest & experience
- Geography Department (Coppock/Hotson Waugh/GIMMS) PLU
- first MSc GIS course, and much else
- ESRC Regional Research Laboratory for Scotland, 1987-
- Launch of UKBORDERS in 1994
- EDINA has continued and extended that for geo-spatial data
- JISC eLib project: access to Ordnance Survey mapping, 1996-
- Launch of Digimap service, 2000 -
- Extension of UKBORDERS, 2001 -
- Shared Services provision
- Go-Geo! (geo-data portal)
- geoXwalk
- GRADE: Geospatial Repository for Academic Deposit and Extraction
- Not all (only a fraction) of geo-referenced data at EDINA
- Strategic importance of interoperability
- GI web services
- Interested in furthering the use of GI data across disciplines
- Geo-parsing, mark-up, geo-finding & geoXwalk (vocabularies)
17 Something's special about the spatial
EDINA role as Geographic Data Centre?
Slide borrowed from Liz Lyon, curated ..
18 2. Getting back to Problem Statement
- roles & responsibilities
- Some Thoughts, and Questions
- What resources, and how should we share?
- What are scholarly resources?
- What is special about scholarship?
- What is different about digital?
- Who should do what?
- A division of labour that leverages
- responsibility and expertise for curation
- Means of service delivery
- Find our place in old and new geography
- words, numbers, pictures, sounds
- all to be digital & accessed from afar
19 Scholarship: Services and Stewardship
- Services, in support of scholarship,
- Libraries have traditionally focussed on the formal part of scholarly communication
- Relevance & searching strategies
- new challenges: how to cope with digital everything?
- Stewardship
- Was Special Collections, now Collections, inc. the digital
- Ensuring provenance & continuing access
- Digital curation, preservation & archiving
- Sharing with future scholarship
- Sharing with wider world
- Research
- What do researchers do, and what do they want/need?
- eScience, Data, and scholar workstation and the VRE
- Learning and Teaching
- What do students need?
- What do teachers/lecturers need?
- e-learning and the VLE (virtual learning environment)
20 Infrastructure to support four demand-side verbs
- discover: information object of interest
- e.g. article referenced in database, A&I, eToC, etc
- locate: organisation offering service
- e.g. library (union catalogue/OPAC)
- or document delivery service
- request: use of service
- via payment of money or privilege of membership
- access: object of interest
- via personal visit, document delivery, online access
- based on MODELS workshops (UKOLN/JISC eLib)
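
The four verbs on this slide chain naturally into one user journey. The Python sketch below is a toy illustration only: the catalogue entries, the service registry, the "membership"/"payment" terms and every function signature are assumptions made up for the example, not part of the MODELS work.

```python
# Hypothetical catalogue and service registry, invented for illustration.
CATALOGUE = [
    {"id": "art-1", "title": "Curating survey data", "kind": "article"},
    {"id": "map-7", "title": "Topographic map tiles", "kind": "dataset"},
]
SERVICES = {
    "art-1": [{"provider": "Library A", "terms": "membership"}],
    "map-7": [{"provider": "Data Centre B", "terms": "payment"}],
}


def discover(keyword):
    """Discover: find information objects of interest."""
    return [o for o in CATALOGUE if keyword.lower() in o["title"].lower()]


def locate(object_id):
    """Locate: find organisations offering a service on that object."""
    return SERVICES.get(object_id, [])


def request(service, is_member=False, has_paid=False):
    """Request: ask to use the service, by membership or by payment."""
    if service["terms"] == "membership":
        return is_member
    if service["terms"] == "payment":
        return has_paid
    return False


def access(obj, service, granted):
    """Access: obtain the object once the request has been granted."""
    if not granted:
        raise PermissionError("request was not granted")
    return f"{obj['title']} via {service['provider']}"


# Usage: one pass through discover -> locate -> request -> access.
found = discover("survey")[0]
service = locate(found["id"])[0]
granted = request(service, is_member=True)
result = access(found, service, granted)
```

Each verb takes only what the previous one returned, which is the point of treating them as separable services: a different organisation can sit behind each step.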
21 Simplified workflow
Discover
Fit for purpose?
Locate
Access
Curate
Use
Publish
Issue
22 Dataset publishing
- Re-examine concept of Dataset Publishing (Callahan, Johnson, and Shelley 1996)
- analogous to publishing papers
- rewards for publishing datasets (e.g. promotion, RAE)
- procedures (e.g. standards to use, peer review) & resources to manage procedures
- Should minimise time and effort required
- need tools to assist in creation, maintenance and dissemination of dataset descriptions
- Means of putting into a public/community
- Deposit and Share are too cosy
- to publicate, to issue
- Terms of access and use
- Open?
- Privilege of membership
- Payment of money
23 Repositories of digital content
- So what is a digital repository?
- I like (user) verbs, not (supply-side) nouns
- A repository is a noun that meets a set of (user) verbs/tasks, by supporting delivery of services for a given/designated client community
- Put: ingest service
- Keep-safe: storage service
- Get: access service
- Motivation
- for the record? preservation: prospect of access
- for re-use? curation: current access
- Can we say, "Behind every great service, there is a wonderful managed repository"?
- No, not if access service does not have corresponding ingest service.
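
Read as an interface, the three user verbs on this slide (put, keep-safe, get) fit in a very small class. The sketch below is a hypothetical in-memory illustration: the class name, the method names and the use of a SHA-256 checksum as both identifier and fixity check are assumptions made for the example, not a description of any EDINA or JISC repository.

```python
import hashlib


class Repository:
    """Toy repository offering the three user verbs as services."""

    def __init__(self):
        self._store = {}

    def put(self, content: bytes) -> str:
        """Ingest service: accept content and return an identifier for it."""
        identifier = hashlib.sha256(content).hexdigest()
        self._store[identifier] = content
        return identifier

    def keep_safe(self, identifier: str) -> bool:
        """Storage service: confirm the stored bytes still match their checksum."""
        content = self._store[identifier]
        return hashlib.sha256(content).hexdigest() == identifier

    def get(self, identifier: str) -> bytes:
        """Access service: return content that was previously ingested."""
        return self._store[identifier]


# Usage: only content that has been put can later be got.
repo = Repository()
ident = repo.put(b"survey data")
fetched = repo.get(ident)
```

In this sketch `get` can only ever return something that went through `put`, which mirrors the slide's closing remark: an access service counts as a repository only when a corresponding ingest service sits behind it.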
24 Repositories: OAIS Reference Model
- In a classic Repository, the DIP is the same as the SIP
- In a data centre, and many data libraries, it rarely is.
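
In OAIS terms the SIP is the Submission Information Package that comes in and the DIP is the Dissemination Information Package that goes out. The contrast drawn on this slide can be sketched as two functions. Everything here is an assumption for illustration: the dictionary layout, the comma-separated payload and the field names `name`, `easting` and `northing` are invented, and the real value-added processing at a data centre is far richer.

```python
def classic_repository(sip: dict) -> dict:
    """Classic repository: what is disseminated is what was submitted."""
    return dict(sip)


def data_centre(sip: dict) -> dict:
    """Data centre: the DIP is derived from the SIP by value-added curation.

    Here the raw payload lines are parsed into named, typed fields.
    """
    records = []
    for line in sip["payload"].splitlines():
        name, easting, northing = line.split(",")
        records.append(
            {"name": name, "easting": int(easting), "northing": int(northing)}
        )
    return {"title": sip["title"], "records": records}


# Usage with a made-up submission.
sip = {"title": "Sample features", "payload": "Playing Field,321000,673000"}
dip_classic = classic_repository(sip)
dip_centre = data_centre(sip)
```

The first DIP equals the SIP; the second carries the same information in a different, more usable shape, which is why ingest and service delivery become distinct curation tasks in a data centre.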
25 Support for Research & research-led learning
- Data, software and facilities
- Data as evidence
- Data curation and digital preservation: continuing access
- Data Archives and Data Libraries
- Social surveys, and much more
- IASSIST
- International Association for data professionals (1972 -)
- Members in Philippines and Vietnam
- Census Programme
- Small area statistics: MIMAS
- UKBORDERS (boundaries for thematic mapping): EDINA
- EDINA Digimap Collection
- Topographic mapping data, from national mapping agency
- Marine Geological mapping data
- then there is the challenge of scientific visualisation, and observational images and documentary films!
26 Scholarly Communication
- Access to commercial services & resources
- Consortium licensing
- local hosting of licensed data at National Data Centres (NDCs)
- Focus on community-generated resources
- Union catalogues (& links to ILL/docdel) - SUNCAT
- digital library developments
- Open Access repositories
- Put it in The Depot (www.depot.ac.uk)
- Need for Access Control as Middleware development
- Shibboleth framework, developed as part of Internet2
- UK Access Management Federation for Education & Research
- Managed by UKERNA, based on work by EDINA SDSS
- replacing vendor's UserID & password with community scheme
27 Scholarly Communication
Author
writes to be recognised by peer community, for institutional Research Assessment Exercise (RAE) purposes, perhaps to be read
(content of) article is the information object of desire
Key User (Reader) Verbs:
Discover article of interest
Locate service on those articles
Request permission to use service
Access to service/article
Reader
28 Scholarly Communication (simple model; focus on article-length work published in journals)
Author (article)
Publisher article serial issue
Licence
Libraries and Publishers provide framework: the traditional middleware/infrastructure
... with Licence(s) for electronic (online) and print (on-shelf)
Library (serial)
Reader (article)
P.Burnhill, EDINA/JISC, 2005
29 Scholarly Communication & Open Access (Access to article-length work)
Formal economy
Digital Preservation
Open Access
Author (article)
repositories
peer review
Publisher article serial issue
Licensed Online Access
learned society
ILL/docdel
Licence
peer exchange
Institutional arrangement
E-prints
free2web access
Library (serial)
Reader (article)
Informal: invisible college and the gift economy
30 Research Data
Creator
Generates (curates) data for own purpose, or as part of team; wants/has to put it somewhere for use by others (perhaps to be recognised by a peer community)
Key User (Researcher) Verbs:
Discover data of interest
Locate service on that data, with documentation on provenance etc
Request permission to use service
Access to service/data
Evidential value of data in analysis as object of desire
Researcher
31 Data (simple model)
Creator (dataset)
Data Centre (database)
Licence
... with what kind of Licence(s) for access?
??
(Data) Library
who provides framework? the middleware/infrastructure
Researcher (data)
P.Burnhill, EDINA/JISC, 2006
32 Doing Data
Formal economy
Digital Preservation
Open Access
Creator (dataset)
repositories
peer review
Data Centre
Authorised Online Access
Institutional arrangement
learned society
Licence
peer exchange
datasets
free2web access
Institution
Researcher
Informal: invisible college and the gift economy
33 All Curators Now
- Thank you
- p.burnhill_at_ed.ac.uk
- http://edina.ac.uk
- http://jisc.ac.uk
34 JISC Information Environment Architecture
(Idealised) Technical Infrastructure for Services
Andy Powell, 2005
35 Something's special about the spatial
EDINA has role as Geographic Data Centre
Slide borrowed from Liz Lyon, curated ..
36 Support for Research & research-led learning
- Data, software and facilities
- Data as evidence
- Data curation and digital preservation: continuing access
- Digital Curation Centre established (Edinburgh-led)
- Data Archives and Data Libraries
- Social surveys, and much more
- IASSIST
- International Association for data professionals (1972 -)
- Members in Philippines and Vietnam
- Census Programme
- Small area statistics: MIMAS
- UKBORDERS (boundaries for thematic mapping): EDINA
- EDINA Digimap Collection
- Topographic mapping data, from national mapping agency
- Marine Geological mapping data
- I could say very much more about Digimap!!
- And then there are images and documentary films!
38 Focus on community-generated resources
- traditional ground for libraries
- Union catalogues (& links to ILL/docdel): SUNCAT
- Ask me about SUNCAT
- digital library developments
- Resource Discovery Network
- Inter-operability: not just http, but m2m interfaces
- Digitisation
- Newspapers, NewsFilm, Manuscripts
- DIWAN: digitising Islamic Materials in UK university collections
- New challenge: Open Access repositories
- International development: UK active
- Institutional Repositories
- put it in The Depot (www.depot.ac.uk), not yet launched
- need Access Management Federation for Education & Research
- Shibboleth framework, developed as part of Internet2