ARROW institutional repositories and discovery services: presentation to the CORDRA workshop, Melbou - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
ARROW institutional repositories and discovery
services: presentation to the CORDRA workshop,
Melbourne, 4 Feb 2005
  • Geoff Payne,
  • ARROW Project Manager

2
Presentation structure
  • ARROW objectives and strategies
  • ARROW software
  • ARROW and CORDRA

3
ARROW acronyms
  • Australian Research Repositories Online to the
    World (ARROW) is a Federated Repositories of
    Digital Objects (FRODO) Project funded by the
    Australian Commonwealth Government Dept of
    Education, Science and Training (DEST)
  • DEST funding of A$3.66M over three years
    (2004-2006)
  • ARROW Consortium Partners
      • Monash University (Lead Institution)
      • University of New South Wales
      • Swinburne University of Technology
      • National Library of Australia

4
What is an Institutional Repository?
  • A managed collection of digital objects
  • institutional in scope
  • with consistent data and metadata structures for
    similar objects
  • enabling resource discovery by the Communities
    of Practice for whom the objects are of interest
  • allowing read, input and export of objects to
    facilitate resource sharing
  • respecting access constraints
  • sustainable over time
  • facilitating application of preservation
    strategies

5
ARROW Branded Services Profile
Internet
  • National Library of Australia: ARROW Web Site
    (project information)
  • National Library of Australia: ARROW Resource
    Discovery Service (using TeraText to index
    metadata harvested by OAI-PMH)
  • ARROW Open Access Journal Publishing System
    (using OJS from Public Knowledge Project)
  • Internet Search Engines (capture text exposed by
    ARROW Repositories)
  • Swinburne, UNSW, Monash: ARROW Repository
    (digital object storage using Fedora / VITAL)
  • Members only area (meeting minutes etc.)

6
Repositories Technical Issues: Metadata Exchange
  • Dublin Core insufficiently granular for many
    purposes
  • Learning Object Metadata not good for
    bibliographic metadata
  • Need to preserve metadata relevant to categories
    of objects as decided by the community of
    practice that produced the object
  • Open Archives Initiative Protocol for Metadata
    Harvesting (OAI-PMH) can gather Dublin Core
    metadata to establish resource discovery services
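
The harvesting step in the last bullet can be sketched roughly as follows. This is a minimal illustration, not ARROW's harvester: the base URL, the sample response and the function names are hypothetical, and only the OAI-PMH verb, the oai_dc metadata prefix and the XML namespaces come from the OAI-PMH 2.0 specification. The sketch parses an embedded sample instead of calling a live repository.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Namespaces defined by OAI-PMH 2.0 and its Dublin Core (oai_dc) format.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc"):
    """Build a ListRecords request URL (base_url is a hypothetical endpoint)."""
    query = urlencode({"verb": "ListRecords", "metadataPrefix": metadata_prefix})
    return base_url + "?" + query


def extract_records(xml_text):
    """Return (identifier, title, creators) for each record in a response."""
    root = ET.fromstring(xml_text)
    records = []
    for rec in root.iterfind(".//oai:record", NS):
        identifier = rec.findtext(
            "oai:header/oai:identifier", default="", namespaces=NS
        )
        title = rec.findtext(".//dc:title", default="", namespaces=NS)
        creators = [e.text for e in rec.iterfind(".//dc:creator", NS)]
        records.append((identifier, title, creators))
    return records


# Hypothetical response, trimmed to the elements the sketch reads.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:repository.example.org:1</identifier>
      </header>
      <metadata>
        <oai_dc:dc
            xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
            xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example thesis</dc:title>
          <dc:creator>Author, A.</dc:creator>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()

if __name__ == "__main__":
    print(list_records_url("http://repository.example.org/oai"))
    for record in extract_records(SAMPLE_RESPONSE):
        print(record)
```

A discovery service would index the extracted tuples. Richer community-specific metadata stays in the repository, since only the Dublin Core view is harvested here.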

7
ARROW Metadata Strategy
  • Supports metadata schemata to suit individual
    data models
  • No requirement to shoehorn all metadata into one
    schema
  • Each stored object can retain metadata developed
    for it by the community of practice which
    generated the object
  • Maintains flexibility to store many types of
    digital objects in the repository
  • No need to anticipate every object type now

8
OCLC Metadata Interoperability Core
From Godby, Smith and Childress. 2003. Two
paths to interoperable metadata, p. 3, at
http://www.oclc.org/research/publications/archive/2003/godby-dc2003.pdf
9
ARROW Branded Services Profile
Internet
  • National Library of Australia: ARROW Web Site
    (project information)
  • National Library of Australia: ARROW Resource
    Discovery Service (using TeraText to index
    metadata harvested by OAI-PMH)
  • ARROW Open Access Journal Publishing System
    (using OJS from Public Knowledge Project)
  • Internet Search Engines (capture text exposed by
    ARROW Repositories)
  • Swinburne, UNSW, Monash: ARROW Repository
    (digital object storage using Fedora / VITAL)
  • Members only area (meeting minutes etc.)

10
(No Transcript)
11
ARROW Persistent Identifiers
  • Repositories need to offer a preferred form of
    citation for their content
  • Which does not break as URLs do when files are
    moved or web sites are restructured
  • Handles from CNRI seem to be becoming widely
    adopted
      • DOI (Digital Object Identifier) is a Handle
      • UK Stationery Office adopting Handles
      • DSpace uses Handles

12
ARROW Repository Persistent Identifiers
  • ARROW Handles format adopted
      • http://arrow.monash.edu.au/hdl/1959.1/nnnn
      • 1959: ARROW handles naming authority
      • 1959.n: one sub number for each ARROW repository
      • nnnn: running number
  • ARROW will assign a handle to each datastream in
    a digital object to ensure that
      • individual parts of the digital object can be
        cited and re-used independently
      • internal data models in the repository can be
        reworked and the datastream can still be reliably
        retrieved
  • http://www.handle.net/index.html
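
The handle format on this slide can be illustrated with a short sketch. The resolver prefix and the 1959 naming authority come from the slide. The class and function names are hypothetical, and treating the running number as a plain decimal integer is an assumption, since the slide shows only "nnnn".

```python
from typing import NamedTuple

RESOLVER_PREFIX = "http://arrow.monash.edu.au/hdl/"  # as shown on the slide
NAMING_AUTHORITY = "1959"  # ARROW handles naming authority


class ArrowHandle(NamedTuple):
    repository: int  # the sub number n in 1959.n
    number: int      # running number (assumed to be a decimal integer)


def format_handle(handle):
    """Render a handle as 1959.n/nnnn."""
    return f"{NAMING_AUTHORITY}.{handle.repository}/{handle.number}"


def handle_url(handle):
    """Render the citable URL form of a handle."""
    return RESOLVER_PREFIX + format_handle(handle)


def parse_handle(text):
    """Parse either the bare handle or its URL form."""
    if text.startswith(RESOLVER_PREFIX):
        text = text[len(RESOLVER_PREFIX):]
    prefix, slash, suffix = text.partition("/")
    authority, _dot, sub = prefix.partition(".")
    if (
        not slash
        or authority != NAMING_AUTHORITY
        or not sub.isdecimal()
        or not suffix.isdecimal()
    ):
        raise ValueError(f"not an ARROW handle: {text!r}")
    return ArrowHandle(repository=int(sub), number=int(suffix))


if __name__ == "__main__":
    handle = parse_handle("http://arrow.monash.edu.au/hdl/1959.1/42")
    print(handle, handle_url(handle))
```

Because the citation carries only the naming authority, the repository sub number and the running number, the stored location of a datastream can change without changing the cited form.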

13
ARROW - Summary of design criteria
  • A generalised institutional repository solution
  • Initial focus on managing and exposing
    traditional bibliographic research outputs
  • Expand to managing non-bibliographic research
    outputs
  • Design decisions are being taken with the
    intention of not precluding management of other
    digital objects such as learning objects and
    large research data sets

14
ARROW technology software selected
  • Flexible Extensible Digital Object Repository
    Architecture (Fedora)
      • Software implementation of architecture by
        Cornell University and University of Virginia
  • VTLS Inc. (www.vtls.com) as development partners
      • ARROW / VTLS partnership to take the Fedora
        engine and construct a working repository to
        meet ARROW's functional requirements
      • ARROW licensing VITAL repository product
      • VTLS doing ARROW-specified development
  • Ongoing sustainability through vendor support

15
ARROW architecture component software
VITAL Access Portal, OAI-PMH, SRU/SRW, Web
Exposure
VITAL, OJS Fedora
Fedora
16
Resulting VITAL Application Stack
  • VITAL (closed source): Management Client
    (Windows), Access Portal (Web)
  • ARROW-funded open source Web services
  • Fedora open source repository
17
ARROW stages
  • Demonstration (2004)
      • Developing architecture; selecting, testing and
        developing software
  • Deployment (late 2004 to end 2005)
      • Populating the ARROW Partners' repositories
  • Distribution (mid 2005 to end 2006)
      • Enabling others to participate
      • Under review for earlier participation by others

18
ARROW partnerships
  • Established
      • VTLS
      • Fedora
      • Google, to test indexing of research materials
      • Thomson ISI Web Citation Index
  • Being negotiated
      • OCLC, to test their metadata interoperability
        core
      • Open Journal System, to enhance the OJS software
      • Research Master, to test integration between RM4
        and ARROW

19
ARROW FRODO Partnerships
  • MAMS: Meta Access Management System
      • Access control through eXtensible Access Control
        Markup Language (XACML) metadata
      • Needs development of a FRODO profile of XACML for
        access control interoperability
  • APSR: Australian Partnership for Sustainable
    Repositories
      • Interoperability through consistent metadata for
        similar data objects
      • Needs FRODO metadata schemata for object
        exchange, export and ingest into new repository
        environments as part of sustainability and
        preservation initiatives
  • ADT: Australian Digital Theses
      • Interoperability through harvestable Dublin Core
        metadata
      • Supporting e-theses online which are pointed to
        from ADT
  • Role for an overarching FRODO Web services
    strategy?

20
Repositories and Middleware
  • List of possible open source repository software
      • http://www.soros.org/openaccess/software/
  • Regardless of software selected, need to deal
    with same issues
      • authorisation/authentication
      • object processing on ingest
      • object workflow on ingest
      • metadata consistency
      • providing search exposure
      • identifying OA status of deposits
      • collaboration with other repositories and
        repository initiatives
  • These may or may not be handled in the software
    selected

21
Search exposure
  • Requirements
      • standards-based way for information gateways, as
        well as other repositories, to query repository
        contents directly
      • way to harvest from repositories to support
        single search gateway (preferable to federated
        search)
  • ARROW will be supporting OAI-PMH and also
    commissioning development of SRU/SRW web service
    on top of Fedora
  • Google has agreed to work with ARROW to expose
    repository content via Google Scholar
  • Middleware opportunity
      • Extending OAI-PMH to harvest content as well as
        metadata
      • modOAI project already looking at this
      • Agreement to use SRU/SRW for searching across
        different repositories
      • Moderately coupled with repository
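
Searching across repositories with SRU amounts to sending the same CQL query to each repository's endpoint. The sketch below only builds the request URLs. The parameter names are those of the SRU 1.1 searchRetrieve operation, while the base URLs, the function names and the "dc" record schema short name are assumptions, since the schema names a server accepts depend on its configuration.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def sru_search_url(base_url, cql_query, maximum_records=10, record_schema="dc"):
    """Build an SRU 1.1 searchRetrieve URL for one repository.

    base_url is a hypothetical endpoint. record_schema="dc" assumes the
    server offers Dublin Core records under that short name.
    """
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(maximum_records),
        "recordSchema": record_schema,
    }
    return base_url + "?" + urlencode(params)


def search_all(base_urls, cql_query):
    """Build one request URL per repository for the same CQL query."""
    return [sru_search_url(base, cql_query) for base in base_urls]


if __name__ == "__main__":
    endpoints = [
        "http://repo-a.example.org/sru",  # hypothetical
        "http://repo-b.example.org/sru",  # hypothetical
    ]
    for url in search_all(endpoints, 'dc.title = "open access"'):
        print(url)
        print(parse_qs(urlsplit(url).query)["query"])
```

Unlike harvesting, this approach queries each repository at search time, which is why the slide describes it as moderately coupled with the repository.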

22
Collaboration with other repositories
  • Requirements
      • need for interoperability
      • avoiding wheel re-invention
      • learning from each other's progress so far
  • ARROW working with Fedora Development Consortium,
    National Science Digital Library and APSR
  • Middleware opportunities
      • Registry of standard content models
      • In the absence of existing practice, influencing
        the emergence of de facto standards
      • Agreement on standard framework for re-usable web
        services

23
ARROW Content Committee
  • Unfortunately it is not as simple as "build it and
    they will come"
  • Publisher and Library/Learning Solutions (PALS),
    Pathfinder research on web-based repositories,
    Final Report, January 2004
      • "We find that IRs are currently rather small,
        with an average (median) of 290 records per
        institution (smaller but comparable to the median
        size of other OAI data providers)." (Page 33)

24
Incentives are needed for academics to submit
their materials to repositories
  • Substantial advocacy is required to achieve
    participation
  • Mandatory deposit of e-Theses
  • Credits towards promotion
  • Funding linkages
  • Demonstrable additional exposure such as in Web
    Citation indexes and search engines

25
ARROW and CORDRA
  • ARROW
      • Relies on exposing metadata for harvesting by
        discovery services
      • Exposes content through search engines
      • Not exclusively SCORM-compliant learning objects
      • Exposes basic elements for use in building
        learning objects etc.
  • Need to minimise human effort in the registration
    process
  • CORDRA as one of many pathways to ARROW content

26
Questions
  • What is in it for authors?
  • Protocols for exposing content in scope for
    CORDRA registries via automatic harvesting or
    push mechanisms?
  • Boundaries for CORDRA instances
      • National?
      • Community of interest?
          • Medical images for diagnostics
          • Medieval manuscripts
  • Content eligibility
      • ARROW fundamental building block objects
      • ARROW learning objects
  • What is the difference between harvesting and
    CORDRA registration?
  • Would CORDRA relate to ARROW's federated resource
    discovery service, or to individual repositories?
  • Business model: reciprocal access or real ?