Enabling Interaction and Quality in a Distributed Data DRIS

1
Enabling Interaction and Quality in a Distributed
Data DRIS
CRIS 2006, Bergen, Norway, May 11, 2006
  • D. Scott Brandt, Associate Dean for Research
  • Michael Witt, Senior Research Systems
    Administrator
  • Purdue University Libraries

2
Background: Purdue University
  • Nine Colleges: Agriculture, Consumer & Family
    Sciences, Education, Engineering, Liberal Arts,
    Management, Pharmacy/Nursing/Health Sciences,
    Technology, Veterinary Medicine
  • 73 Departments, several cross-disciplinary, e.g.
    Agricultural & Biological Engineering

3
Purdue University Libraries
  • 2004 initiative for Librarians (faculty) to
    collaborate with other faculty across
    campus: apply library science knowledge and
    expertise to various research data problems
  • collect, organize, describe, curate, archive,
    disseminate data/information

4
Strategic directions
  • University: interdisciplinary and collaborative
    endeavors grounded in the strengths of academic
    disciplines
  • Libraries: Libraries faculty are integrated into
    campus research agenda

5
Areas of research collaboration
  • Discovery Learning Center
  • Earth & Atmospheric Science
  • English
  • IT at Purdue
  • Mechanical Engineering Technology
  • Regenstrief Center
  • Agronomy
  • Biology
  • Cancer Center
  • Center for the Environment
  • Chemical Engineering
  • Chemistry
  • Cyber Center

6
Current areas of participation
  • E. coli K-12 Model Organism Resource, NIH proposal
    (B. Wanner, Biology, PI; D. Scott Brandt,
    Libraries, Co-PI): create archival process for
    curated database, assist in applying ontologies
    for data representation and annotation
  • An Expert System Multimedia Tutorial for Locating
    Technical Information, Purdue University TLT
    Digital Content grant (Megan Sapp, PI; Amy Van
    Epps and Michael Fosmire, Co-PIs; with Bruce
    Harding, Mechanical Engineering Technology):
    develop tutorial for MET102 course in using and
    applying standards
  • URL-based Search Interface to the Distributed
    Institutional Repository, Purdue University
    Graduate School (Michael Witt, Libraries, PI;
    Darcy Bullock, Civil Engineering, Co-PI): develop
    toolkit to deploy customized searching of
    dissertations by school, advisor, etc.
  • AquaEcon Web Library: An Electronic Resource on
    Economics-Related Literature on Aquaculture, NOAA
    (K. Quagrainie, Agricultural Economics, PI; Hal
    Kirkwood, Libraries, Co-PI): build and
    populate database

7
Progression towards CRIS
  • Institutional repository (IR)
  • Distributed institutional repository (DIR)
  • Interactions related to DIR leading to CRIS-like
    applications
  • Leverage DIR for DRIS/CRIS

8
Distributed Institutional Repository
  • Diagram labels: e-prints, archival collections,
    Metadata Repository, grid resources,
    Applications, data archive, native databases,
    OAI Service Provider, OAI Data Providers

9
A systems-based approach to Libraries supporting
research: linear
  • Research process: inputs, experimentation,
    outputs
  • Data repositories: a repository of
    well-described data resulting from research
    processes is preserved and shared for
    repurposing
  • CRIS: a current research information system
    links people engaged in research with funding
    and other resources such as interdisciplinary
    collaborators
  • Document repositories: journal article
    pre-prints, post-prints, conference and working
    papers, dissertations and other e-prints
    represent research outputs in a document
    repository

10
A systems-based approach to Libraries supporting
research: cyclical
  • Diagram labels: CRIS, data repository, e-print
    repository

11
An example application: SRU
  • Linking to electronic theses and dissertations
    (ETD)
  • URL-based search interface to DIR running as a
    web service
  • $16,000 Strategic Development Initiative award
    for fellowship and server
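
A URL-based search of this kind can be sketched as an SRU searchRetrieve request. The sketch below is illustrative only: the endpoint URL, the advisor name, and the choice of Dublin Core index names are assumptions, since the slides do not say which indexes the actual service exposes.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the slides do not give the real service URL.
SRU_BASE_URL = "http://example.edu/sru/etd"


def build_sru_url(base_url, cql_query, start_record=1, max_records=10):
    """Return an SRU 1.1 searchRetrieve URL for the given CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "startRecord": start_record,
        "maximumRecords": max_records,
    }
    # urlencode percent-encodes the CQL so it is safe inside a URL.
    return base_url + "?" + urlencode(params)


# Assumed CQL index (dc.contributor) and an invented advisor name.
advisor_query = 'dc.contributor = "Smith"'
advisor_url = build_sru_url(SRU_BASE_URL, advisor_query, max_records=25)
print(advisor_url)
```

Because the whole search is carried in the URL, a department or advisor page could link straight to its own list of dissertations without hosting any search code.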

12
Getting to the datasets: SRB
  • The Storage Resource Broker
  • Developed by the San Diego Supercomputer Center
  • Uniform access to heterogeneous, distributed
    storage
  • Metadata catalog (MCAT) and preservation
    functionality
  • TeraGrid, collaboration with Information
    Technology at Purdue and Rosen Center for
    Advanced Computing

13
An example systems interaction
  • OAISRB provides an OAI-PMH interface to the SRB
    to expose metadata from resources on a data grid
    to OAI service providers
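
From the service provider's side, harvesting through an OAI-PMH interface amounts to issuing a request with a `verb` argument and reading back XML. The following is a minimal sketch, not the OAISRB implementation: the host name, record identifier, and sample response are invented for illustration, and only the OAI-PMH verb, `metadataPrefix` value, and XML namespaces come from the protocol itself.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Hypothetical base URL for an OAI-PMH data provider.
BASE_URL = "http://example.edu:8080/OAISRB/OAIHandler"

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def oai_request_url(base_url, verb, **arguments):
    """Build an OAI-PMH request URL such as ...?verb=ListRecords&metadataPrefix=oai_dc."""
    return base_url + "?" + urlencode({"verb": verb, **arguments})


# Invented response, trimmed to the elements a harvester would read.
SAMPLE_RESPONSE = """
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:example:dataset-1</identifier>
        <datestamp>2006-01-01T00:00:00Z</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example dataset</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>
""".strip()


def titles_by_identifier(xml_text):
    """Map each record's OAI identifier to its Dublin Core title."""
    root = ET.fromstring(xml_text)
    titles = {}
    for record in root.iterfind(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        titles[identifier] = record.findtext(".//dc:title", namespaces=NS)
    return titles


list_url = oai_request_url(BASE_URL, "ListRecords", metadataPrefix="oai_dc")
harvested = titles_by_identifier(SAMPLE_RESPONSE)
print(list_url)
print(harvested)
```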

Data grid
14
Sample OAISRB config

# OAI Handler Base URL Format
OAIHandler.baseURL=http://128.210.126.231:8080/OAISRB/OAIHandler

# SRB Connection Parameters
SRB.HOST=orion.sdsc.edu
SRB.PORT=7620
SRB.USERNAME=mwitt
SRB.PASSWORD=nyah
SRB.HOMEDIRECTORY=/dspace/home/mwitt.purdue
SRB.MDASDOMAINNAME=purdue
SRB.DEFAULTSTORAGERESOURCE=dspace-fs1
SRB.MCATZONE=dspace

# SRB Collection Count and SRB Collection Names
SRB.root=/TGzone/home/lars.itap
SRB.maxcollections=1
SRB.collection1=LARSDATA

# Custom Parameters for SRB GRID
SRBRecordFactory.repositoryIdentifier=mwitt.purdue
Display.MaxListSize=50

# Custom Identify response values
Identify.repositoryName=SRB Data Grid
Identify.adminEmail=mailto:mwitt_at_purdue.edu
Identify.earliestDatestamp=2000-01-01T00:00:00Z
Identify.deletedRecord=no

# Crosswalk (in this example, FGDC-to-unqualified Dublin Core)
DC.Identifier=title
DC.Description=purpose
DC.Title=title
DC.Format=File Format
DC.Creator=address
DC.Subject=metprof

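
The crosswalk entries in the sample config can be read as a lookup table from unqualified Dublin Core elements to source (FGDC) field names. The sketch below shows one way such a table might be applied. It assumes the entries are `key=value` properties lines, and the source record is invented for illustration; it is not taken from OAISRB itself.

```python
# Crosswalk entries from the sample config, assuming key=value syntax.
CONFIG_TEXT = """\
DC.Identifier=title
DC.Description=purpose
DC.Title=title
DC.Format=File Format
DC.Creator=address
DC.Subject=metprof
"""


def parse_crosswalk(text, prefix="DC."):
    """Return {dublin_core_element: source_field} from properties-style text."""
    crosswalk = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, _, value = line.partition("=")
        if key.startswith(prefix):
            crosswalk[key[len(prefix):]] = value.strip()
    return crosswalk


def apply_crosswalk(crosswalk, source_record):
    """Copy each mapped source field into its Dublin Core element, if present."""
    return {
        element: source_record[field]
        for element, field in crosswalk.items()
        if field in source_record
    }


# Invented source record; "origin" has no crosswalk entry and is dropped.
source_record = {
    "title": "Sample water quality readings",
    "purpose": "Illustration only",
    "File Format": "CSV",
    "origin": "Example Lab",
}

dc_record = apply_crosswalk(parse_crosswalk(CONFIG_TEXT), source_record)
print(dc_record)
```

In this sketch the same source field feeds both Identifier and Title, and any source field without a crosswalk entry does not appear in the Dublin Core record at all.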
15
Metadata research
  • Metadata librarian worked for four months
    analyzing metadata needs and processes for
    several data sets
  • Results included DC descriptions, enhanced with
    thesaurus headings, and a basic crosswalk
  • Also: creating metadata descriptions from scratch
    is too labor-intensive

16
Metadata: Water Quality
  • A flat file with only system metadata
  • Began with Dublin Core
  • Enhanced subjects with thesaurus from NAL (US
    National Agricultural Library)
  • Looked at DIF (Directory Interchange Format)
  • Looked at crosswalk with FGDC (Federal
    Geographic Data Committee) format

19
Next steps: Metadata
  • Articulate metadata workflow to embed metadata
    into the process
  • Review automating all data
  • Determine how/where to validate and automate
    descriptive metadata

20
Conclusions and Questions
  • Use existing, native metadata whenever possible
  • Automate and periodically assess processes to
    ensure quality
  • Diminishing returns: we settled on discovery and
    collection-level metadata
  • Crosswalks are useful but can truncate or distort
    the original meaning
  • The importance of interactions, among people and
    systems
  • How do we implement CRIS/CWIS/DRIS in our
    environment?
  • What is the role of the Libraries in such?

21
Takk (thank you)
  • Michael Witt
  • mwitt_at_purdue.edu
  • D. Scott Brandt
  • techman_at_purdue.edu