Title: The Open Archives Initiative (OAI) and Electronic Theses and Dissertations (ETDs)
1The Open Archives Initiative (OAI) and Electronic Theses and Dissertations (ETDs)
- ASIDIC 2000
- Orlando, FL - March 27, 2000
- Edward A. Fox (fox_at_vt.edu)
- http://fox.cs.vt.edu
- Virginia Tech, Blacksburg, VA, USA
2Acknowledgements (Selected)
- Sponsors: ACM, Adobe, ARL, Belgian Science Found., CLIR, DARPA, IBM, LANL, Microsoft, NSF, OCLC, SPARC, US Dept. of Ed. (FIPSE), ...
- VT Faculty/Staff: Tony Atkins, Thomas Dunbar, John Eaton, Gwen Ewing, Peter Haggerty, Gary Hooper, Gail McMillan, Len Peters, James Powell
- VT Students: Emilio Arce, Fernando Das Neves, Brian DeVane, Robert France, Marcos Goncalves, Scott Guyer, Robert Hall, Neill Kipp, Paul Mather, Tim McGonigle, Todd Miller, Constantinos Phanouriou, William Schweiker, Ohm Sornil, Hussein Suleman, Patrick Van Metre, Laura Weiss
3Virginia Tech Background
- Largest university in Virginia, land-grant, football, town population 35K plus 25K students
- Blacksburg Electronic Village, since 1992, with > 80% of community on Internet
- Net.Work.Virginia, largest ATM network, with over 750 sites, for education, research, govt
- LMDS, Local Multipoint Distribution Service, gigabit wireless networking - 1/3 of Virginia
- Math Emporium, 500 workstations
- Faculty Development Initiative, round 2
4Digital Libraries Shorten the Chain from
Author
Editor
Reviewer
Publisher
A&I
Consolidator
Library
Reader
5DLs Shorten the Chain to
Roles
Digital Library
Author
Teacher
User
Reader
Editor
Learner
Reviewer
Librarian
6The Networked Digital Library of Theses and
Dissertations
www.NDLTD.org
Training Authors
Expanding Access
Preserving Knowledge
Improving Graduate Education
Enhancing Scholarly Communication
Empowering Students & Universities
Leader of the Worldwide ETD (Electronic Thesis
and Dissertation) Initiative
7Open Archives initiative (OAi) www.openarchives.org
openarchives_at_openarchives.org
8OAi Philosophy
- Self-archiving submission mechanism
- Long-term storage system archive
- Open interface harvesting mechanism
- Data provider service provider
- Start with gray literature
- e-prints/pre-prints, reports, dissertations, ...
9Tiered Model of Interoperability
Mediator services
Metadata harvesting
Document models
10Repository of Digital Objects
Repository Access Protocol
handle
Metadata
terms and conditions
Digital object
11Open Archives initiative History
- xxx at LANL, Los Alamos National Laboratory (Ginsparg), for high-energy physics - 1991
- CSTR, WATERS, NCSTRL (Lagoze) - 1994
- xxx, NCSTRL, CoRR collaboration - 1998
- UPS (Universal Preprint Service) 1999 mtg
- Herbert Van de Sompel (U. Ghent, SFX)
- Dublin Core (DC), XML
- Dienst protocol and software (Lagoze)
- Renamed late 1999 as OAi
12Open Archives (protoproto)
- ArXiv Los Alamos National Lab
- CogPrints U. Southampton
- NACA NASA (reports)
- NCSTRL Cornell U.
- NDLTD Virginia Tech
- RePEc U. Surrey
- Total of around 200K records
13Original Open Archives Members
- Caroline Arms, Library of Congress
- Leslie Carr, University of Southampton
- Mark Doyle, American Physical Society
- Dale Flecker, Harvard University
- Edward A. Fox, Virginia Tech
- Michael Friedman, HighWire Press, Stanford U.
- Paul M. Gherman, Vanderbilt U. SPARC
- Paul Ginsparg, Los Alamos National Lab. xxx
- Stevan Harnad, University of Southampton
- Thomas Krichel, University of Surrey RePEc
- Carl Lagoze, Cornell University
14Original Open Archives Members (contd)
- Rick Luce, Los Alamos National Laboratory
- Clifford Lynch, Coalition for Networked Info.
- Kurt Maly, Old Dominion University
- Michael Nelson, NASA Langley Research Center
- John Ober, California Digital Library
- Bob Parks, Washington University EconWPA
- Herbert Van de Sompel, University of Ghent
- Eric F. Van de Velde, Caltech
- Don Waters, The Andrew W. Mellon Foundation
- Ken Weiss, California Digital Library
15Open Archives Future
- EconWPA (Washington U.)
- e-biomed -> PubMed Central (NIH)
- PubScience (DOE)
- Clinical Medicine Netprints (and other HighWire Press holdings)
- University ePub (California Digital Library)
- All public e-prints (MIT)
- Scholars Forum (Caltech)
- Intl: CERN, Germany, India, Mexico, ...
- Goal: millions of books/articles/reports / yr
16Approaches to Open Archives
Build By Institution
Build By Discipline
17Approaches to Open Archives
Build By Institution
Build By Discipline
Access by: Author, Category, Interdisciplinary, Year, Language, Query
18Open Archives initiative (OAi) www.openarchives.org
- Santa Fe meeting, Oct. 21-22, 1999, protoproto
- Next mtg: June 3, San Antonio, between HT00 and DL00
- LANL, CNI, DLF, Mellon, ...
- Convention (see Feb. D-Lib Magazine article)
- Archives -> Open Archives
- Support unique archive identifiers
- Implement Open Archives Metadata Set (DC-based, using XML)
- Implement Dienst harvesting interface
- Register the archive
- Build tools, layer other services: linking, searching, ...
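A minimal sketch of the first two convention items (unique identifiers, DC-based XML metadata). The `record` and `identifier` element names, the `archive:local-id` identifier scheme, and the sample values are assumptions for illustration only; they are not the actual Open Archives Metadata Set. Only the Dublin Core namespace URI is standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
ET.register_namespace("dc", DC_NS)

def make_record(archive_id, local_id, fields):
    """Build one XML metadata record carrying a globally unique identifier."""
    record = ET.Element("record")  # hypothetical wrapper element
    # Hypothetical identifier scheme: registered archive name plus local id.
    ET.SubElement(record, "identifier").text = f"{archive_id}:{local_id}"
    for name, value in fields.items():
        # Each metadata field becomes a Dublin Core element, e.g. dc:title.
        ET.SubElement(record, f"{{{DC_NS}}}{name}").text = value
    return record

rec = make_record(
    "vt-etd", "etd-0001",
    {"title": "An Example Thesis", "creator": "A. Student"},
)
xml_text = ET.tostring(rec, encoding="unicode")
print(xml_text)
```

Prefixing the local id with a registered archive name is one simple way to keep identifiers unique across archives once they are harvested together.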
20Mechanisms
- Sharing
- Join federation, run software
- Make metadata and archive available
- Aggregating
- By discipline
- By institution
- By genre
- Automating
- Workflow
- Harvesting and providing services
- Federated searching
- Dynamic linking (e.g., with SFX)
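The aggregating step above can be sketched as grouping harvested metadata by one facet. The record layout and sample values below are hypothetical; a real service would work from whatever metadata the archives expose.

```python
from collections import defaultdict

def aggregate(records, facet):
    """Group record ids by one facet, e.g. discipline, institution, or genre."""
    groups = defaultdict(list)
    for record in records:
        # Records lacking the facet are kept under "unknown" rather than dropped.
        groups[record.get(facet, "unknown")].append(record["id"])
    return dict(groups)

# Hypothetical harvested metadata records.
records = [
    {"id": "r1", "discipline": "physics", "institution": "LANL", "genre": "e-print"},
    {"id": "r2", "discipline": "computer science", "institution": "Virginia Tech", "genre": "dissertation"},
    {"id": "r3", "discipline": "computer science", "institution": "Cornell U.", "genre": "report"},
]

by_discipline = aggregate(records, "discipline")
by_genre = aggregate(records, "genre")
print(by_discipline)
```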
21Report on Open Archives work in progress at Virginia Tech
With students:
Hussein Suleman (hussein_at_vt.edu)
Dave Watkins (dwatkins_at_cs.vt.edu)
Robert France (france_at_vt.edu)
Marcos Andre Goncalves (mgoncalv_at_cs.vt.edu)
22VT View of the Open Archives initiative (OAi)
- Enable sharing of publication metadata and full-text by digital libraries
- Standardize low-level mechanisms to share contents of libraries
- Build higher-level user-centric and administrative services in meta-libraries
- Install organizational mechanisms to support the technical processes
23Virginia Tech Projects
- MARC XML-DTD
- Computer Science Teaching Centre (CSTC)
- W3C Web Characterization Repository
- OAi Repository Explorer
- Networked Digital Library of Theses and
Dissertations (NDLTD)
24MARC XML-DTD
- XML Transport format for US-MARC records
- Standardized metadata exchange format for
traditional library services joining OAi
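A rough sketch of the idea of an XML transport format for MARC records. The `record`, `field`, and `subfield` element names are assumptions chosen for illustration and are not taken from the Virginia Tech DTD; the tags 100 (main entry, personal name) and 245 (title statement) are standard MARC fields.

```python
import xml.etree.ElementTree as ET

def marc_to_xml(fields):
    """Serialize (tag, subfield code, value) triples as a small XML record."""
    record = ET.Element("record")
    for tag, code, value in fields:
        field = ET.SubElement(record, "field", tag=tag)
        ET.SubElement(field, "subfield", code=code).text = value
    return record

def xml_to_marc(record):
    """Recover the (tag, subfield code, value) triples from the XML record."""
    return [
        (field.get("tag"), sub.get("code"), sub.text)
        for field in record.findall("field")
        for sub in field.findall("subfield")
    ]

# Hypothetical sample data.
fields = [("100", "a", "Student, A."), ("245", "a", "An Example Thesis")]
xml_text = ET.tostring(marc_to_xml(fields), encoding="unicode")
print(xml_text)
```

The round trip (`xml_to_marc` over the parsed output) is what makes such a format usable for exchange: nothing in the record is lost in transport.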
25CS Teaching Center (CSTC)
- Collection of reviewed online resources used to aid in teaching of Computer Science
- Supports author submission and peer-review process for new ACM Journal of Educational Resources In Computing (JERIC)
- Connected with NSDL (NSF 00-44)
- http://www.cstc.org
26W3C Web Characterization Repository
- Online database of metadata related to publications, tools and data sets dealing with Web characterization
- Project of the Web Characterization Activity working group of the World-Wide-Web Consortium (www.w3c.org/WCA)
- http://purl.org/net/repository
27OAi Repository Explorer
- Serves as a compliancy test
- Allows browsing of open archives using only OAi protocol
- Sends requests on behalf of user, parses and checks responses and displays browsable interface
- Will detect most discrepancies in protocol
- http://purl.org/net/explorer
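The "parses and checks responses" step can be sketched as below. This is not the Repository Explorer's code: the response layout (`records` / `record`) and the list of required elements are hypothetical, and no network request is made; the function only inspects a response already received as text.

```python
import xml.etree.ElementTree as ET

REQUIRED = ("identifier", "title")  # hypothetical required elements

def check_response(xml_text):
    """Return a list of discrepancies found in one archive response."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return [f"not well-formed XML: {err}"]
    problems = []
    records = root.findall("record")
    if not records:
        problems.append("no <record> elements in response")
    for i, record in enumerate(records, start=1):
        for name in REQUIRED:
            element = record.find(name)
            # A present but empty element counts as missing.
            if element is None or not (element.text or "").strip():
                problems.append(f"record {i}: missing <{name}>")
    return problems

good = "<records><record><identifier>a:1</identifier><title>T</title></record></records>"
bad = "<records><record><identifier>a:2</identifier></record></records>"
print(check_response(good))
print(check_response(bad))
```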
28NDLTD
- Work has begun on interoperability between Virginia Tech and partners in Germany
- Wrappers have been created to harvest data from remote sites which use other protocols
- Harvested data to be stored in a central OAi-compliant database (work in progress)
29NDLTD
Grad Program
Ed Tech
IT
Library
30A Digital Library Case Study
- Domain: graduate education, research
- Genre: ETDs (electronic theses and dissertations)
- Submission: http://etd.vt.edu
- Collection: http://www.theses.org
- Project: Networked Digital Library of Theses and Dissertations (NDLTD), http://www.ndltd.org
- with 225 people at 3rd Intl Symposium, March 2000
31What are we doing?
- Aiding universities to enhance graduate education, publishing and IPR efforts
- Helping improve the availability and content of theses and dissertations
- Educating ALL future scholars so they can publish electronically and effectively use digital libraries (i.e., are Information Literate and can be more expressive)
32Key Ideas
Networked infrastructure
Scalability
University collaboration
Workflow, automation
Education is the rationale
8th graders vs. grads
Maximal Access
Authors must submit
Standards
PDF, SGML, MM, MARC, DC, URNs, Federated search
33Student Defends and Finalizes ETD
Multimedia
Start ETD early!
34Student Gets Committee Signatures and Submits ETD
Approval form
35Graduate School Approves ETD, Student is
Graduated
Quality control
36Library Catalogs ETD, Access is Opened to the New
Research
WWW
NDLTD
Digital library access control
37User Search Support (multilingual, XML)
Note: All groups shown are connected with NDLTD.
38www.theses.org
- James Powell student project, D-Lib Magazine description in Sept. 1998
- XML description of each site
- type of search engine / service
- language
- coverage (for resource discovery)
- Adding Z39.50 gateway capability and integrating with MARIAN, along with Harvest and Open Archives protocols
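One way to picture the per-site XML descriptions is as input for routing a federated search. The element names, site names, and values below are invented for illustration; the actual www.theses.org description format is not shown in these slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical site descriptions: engine type, language, coverage.
SITES_XML = """
<sites>
  <site name="Site A">
    <engine>Z39.50</engine><language>en</language><coverage>engineering</coverage>
  </site>
  <site name="Site B">
    <engine>web form</engine><language>de</language><coverage>all disciplines</coverage>
  </site>
</sites>
"""

def sites_for_language(xml_text, language):
    """Return the names of sites whose description lists the given language."""
    root = ET.fromstring(xml_text)
    return [
        site.get("name")
        for site in root.findall("site")
        if site.findtext("language") == language
    ]

print(sites_for_language(SITES_XML, "de"))
```

A search front end could use the same descriptions to pick the right query syntax per site from the `engine` value.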
39Access Possibilities
www.openarchives.org
Web search engines
www.theses.org
library catalog clients
3rd Party Services (e.g., UMI)
Virginia Tech
National Library of Portugal
CBUC (Spain)
OhioLINK
MIT
National Projects: AU, GE, ...
40PetaPlex
- Digital Library Machine (super object store)
- Parallel computer / storage utility
- Knowledge Systems Incorporated is supplying VT-PetaPlex-1 with
- high speed backbone connection
- 2.5 terabytes through 100 nodes
- Net connection, 25GB disk, 233 MHz Pentium, Linux
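A quick consistency check of the storage figures, assuming the last bullet describes a single node (25 GB disk each) and decimal units (1 TB = 1000 GB):

```python
NODES = 100             # from the slide
DISK_PER_NODE_GB = 25   # assumption: the 25GB disk is per node

total_gb = NODES * DISK_PER_NODE_GB
total_tb = total_gb / 1000  # decimal terabytes

print(total_gb, "GB =", total_tb, "TB")
```

Under that assumption the per-node disks add up to the stated 2.5 terabytes.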
41How does this relate to UMI?
- 1987 UMI workshop to explore ETDs
- Support letter for US Dept. of Ed. proposal
- Steering committee membership
- ProQuest Direct pilot of scanning works started 1/1/97, free 2 yr access to front part
- Collaborating on
- accepting electronic author submissions
- standards (e.g., representation)
42ETD Initiative (and UMI)
Education
Access
Students Learn about DL, EPub
TDs become more expressive
Global TDs become more accessible, archived
Universities
N. Amer. (T)Ds are accessible, archived
UMI
44US University Members (41)
- Air University (Alabama)
- Baylor University
- Brigham Young University
- Caltech
- Clemson University
- College of William & Mary
- Concordia University (Illinois)
- East Carolina University
- East Tenn. State U. - require fall 2000
- Florida Institute of Tech.
- Florida International University
- George Washington University
- Marshall University (W. Va.)
- Miami U. of Ohio
- MIT
- Michigan Tech
- Naval Postgraduate School (CA)
- North Carolina State U.
- Penn. State University
- U. of Florida
- U. of Georgia
- University of Hawaii, Manoa
- U. of Iowa
- U. of Kentucky
- U. of Maine
- U. of North Texas - required since 8/99
- U. of Oklahoma
- U. of South Florida
- U. of Tennessee, Knoxville
- U. of Tennessee, Memphis
- U. of Texas at Austin
- U. of Virginia
- U. Wisconsin - Madison
- Vanderbilt U.
- Virginia Commonwealth U.
- Virginia Tech - required since 1/97
- West Virginia U. - required fall 1998
- Western Michigan U.
45Institutional Members
- Coalition for Networked Information (CNI)
- Committee on Institutional Cooperation (CIC)
- Diplomica.com
- Dissertation.com
- Dissertationen Online (Germany)
- Ibero-American Science and Technology Education Consortium (ISTEC, www.istec.org)
- National Library of Portugal (for all universities)
- Organization of American States (SEDI/OAS)
- UNESCO (www.unesco.org/webworld/etd)
46Australian Project Members
- U. New South Wales (lead institution)
- U. of Melbourne
- U. of Queensland
- U. of Sydney
- Australian National University
- Curtin U. of Technology
- Griffith U.
47German Project Members
- Humboldt University (lead institution)
- 3 other universities
- 5 learned societies
- Mathematics, Physics, Chemistry, Sociology, Education
- 1 computing center
- 2 major libraries
48CBUC (www.cbuc.es, Spain)
- Consorci de Biblioteques Universitàries de Catalunya, as group, with 9 members
- Universitat de Barcelona
- Universitat Autònoma de Barcelona
- Universitat Politècnica de Catalunya
- Universitat Pompeu Fabra
- Universitat de Girona
- Universitat de Lleida
- Universitat Rovira i Virgili
- Universitat Oberta de Catalunya
- Biblioteca de Catalunya
49Other International Members
- Chinese University of Hong Kong
- Chungnam National U. (S. Korea - CS)
- City University, London (UK)
- Darmstadt U. of Tech. (Germany)
- Free University of Berlin (GE - Vet. Med.)
- Gyeongsang National U. (Korea)
- Indian Institute of Tech., Bombay (India)
- Nanyang Technological U. (Singapore, pt)
- National U. of Singapore (Singapore, pt)
50Other International Members (contd)
- Polytechnic University of Valencia (Spain)
- Rhodes U. (South Africa)
- St. Petersburg St. Tech.U (Russia)
- Univ. de las Américas Puebla (Mexico)
- Univ. of Alicante (Spain)
- Univ. of Pisa (Italy)
- U. Laval, U. of Guelph, U. Waterloo, Wilfrid Laurier U. (Canada), ...
51What are the long term goals?
- 400K US students / year getting grad degrees are exposed / involved
- 200K/yr rich hypermedia ETDs that may turn into electronic portfolios (images, video, audio, ...)
- Dramatic increase in knowledge sharing: literature reviews, bibliographies, ...
- Services providing lifelong access for students: browse, search, prior searches, citation links
- Hundreds/thousands of downloads / year / work
52For professional societies
- Like writing across the curriculum, e.g., Chemical Markup Language, MathML, ...
- Besides writing: computing/communications, information literacy, personal digital library management, tool use, research methods, collaboration, archiving/preservation
- Data sets, communities of users of them
- Classification systems / browsing / searching
53Extending Services - 1 of 2
- Working with publishers
- Motivate students: awards, ...
- Publicize support of NDLTD
- ACM, ACS, IEEE-CS, Elsevier,
- Allow students to increase level of access
- Arranging preservation
- Mirroring worldwide
- Involving long-term trusted parties
54Extending Services - 2 of 2
- Adding services currently prototyped
- annotation and SDI (routing) capabilities
- Dublin Core metadata, crosswalk to MARC
- support for XML, ML, preservation
- harvesting, federated search
- Adding other services planned
- building/using citation DB (CiteSeer, SFX, ...)
- implementing plagiarism check (like SCAM)
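The Dublin Core to MARC crosswalk listed among the prototyped services can be sketched as a lookup table. The mapping below is a partial one, modeled on the Library of Congress unqualified crosswalk as an assumption; it is not NDLTD's own table, and the tag choices should be verified before any real use. The record layout and sample values are hypothetical.

```python
# Partial Dublin Core element -> (MARC tag, subfield code) mapping.
# Assumed for illustration; check against the authoritative crosswalk.
DC_TO_MARC = {
    "title": ("245", "a"),
    "creator": ("720", "a"),
    "subject": ("653", "a"),
    "description": ("520", "a"),
    "publisher": ("260", "b"),
    "date": ("260", "c"),
}

def crosswalk(dc_record):
    """Translate Dublin Core elements into (tag, subfield, value) triples.

    Returns the MARC triples sorted by tag, plus the names of any
    elements that had no mapping, so nothing is dropped silently.
    """
    marc_fields, unmapped = [], []
    for element, values in dc_record.items():
        if element not in DC_TO_MARC:
            unmapped.append(element)
            continue
        tag, code = DC_TO_MARC[element]
        for value in values:
            marc_fields.append((tag, code, value))
    return sorted(marc_fields), unmapped

# Hypothetical ETD metadata; each element may repeat, so values are lists.
dc = {
    "title": ["An Example Thesis"],
    "creator": ["Student, A."],
    "date": ["2000"],
    "rights": ["unrestricted"],
}
marc_fields, unmapped = crosswalk(dc)
print(marc_fields, unmapped)
```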
55Remember!
- Digital Libraries (technology base)
- OAi (help establish enormous international cooperative of data and service providers)
- NDLTD - improve graduate education
- www.ndltd.org/join
- (www.ndltd.org/talks for this)