Nomadic Digital Library Research at Cornell - PowerPoint PPT Presentation





Slides: 47
Provided by: carll8


Transcript and Presenter's Notes

Title: Nomadic Digital Library Research at Cornell

The Impact of the Internet on Research Universities
Examples from Distance Education and Digital Libraries
William Y. Arms
Department of Computer Science, Cornell University
Universities and Cost
In 1978, a Cornell education cost one Chevrolet
per year. In 2001, a Cornell education costs one
BMW per year. Every year, costs have gone up
faster than average income.
The costs of research universities are dominated
by personnel. Major reductions in unit costs
require different use of personnel.
Technology in Education and Distance Education
By creative use of technology: Can we teach more
students, to a high level, with less faculty per
Technology in Education
Technology            Example                        Date
History
  Time sharing        Dartmouth Basic                1964
  Television          Open University                1972
  Personal computers  Apple University Consortium    1984
  Campus networks     Carnegie Mellon Andrew         1986
Current
  Internet            Digital libraries              1991
  Web                 Distance learning              2000
Course Web Sites
For profit, non-degree executive and professional
Technology in Education and Distance Education
Question 1: Quality. Is it good education?
In a recent survey by JSTOR of faculty in social
sciences and humanities, only 17 thought that
distance education was as good as conventional
campus-based education. (Preliminary data; please
do not quote.)
What is the evidence?
The British Open University
Distance education: Students at home, with limited access to tutors, summer schools.
Technology used as appropriate: Printed materials, home experimental kits, videos, computing, etc.
Academic standards: Full degree programs, external control of quality.
Longevity: First students in 1972.
The Open University
Currently 215,000 students.
Over 2 million students since 1972.
Ranked in the top 10 of all UK universities for teaching quality.
Ranked after Cambridge, York, Oxford, Imperial College, London School of Economics, Warwick, University College London, Durham and Sheffield.
Higher Education Funding Council,
Technology in Education and Distance Education
Question 2: Capital Intensive Education. What are
the organizational options?
Capital Intensive Education
Conventional course: Major cost is faculty time. Costs are repeated every year.
Technology in education and distance education: Course materials are a major expense. Marginal cost of delivering course is low.
Consequences: Economies of scale. Universities need access to capital. Course materials are an asset.
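The economies-of-scale argument on this slide can be made concrete with a small calculation. The sketch below is only an illustration: the dollar figures are invented, and treating the conventional course as a pure per-student cost (faculty time growing with enrolment) is a simplifying assumption, not something stated in the talk.

```python
def cost_per_student(fixed_cost, marginal_cost, students):
    """Average cost per student: one-off cost spread over enrolment, plus marginal cost."""
    return fixed_cost / students + marginal_cost


# Hypothetical figures, chosen only to show the shape of the two curves.
CONVENTIONAL = {"fixed_cost": 0.0, "marginal_cost": 2000.0}               # faculty time, repeated
CAPITAL_INTENSIVE = {"fixed_cost": 1_000_000.0, "marginal_cost": 200.0}   # materials built once


def break_even_students(labour_model, capital_model):
    """Enrolment at which both models cost the same per student."""
    extra_fixed = capital_model["fixed_cost"] - labour_model["fixed_cost"]
    saving_per_student = labour_model["marginal_cost"] - capital_model["marginal_cost"]
    return extra_fixed / saving_per_student


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        print(
            n,
            cost_per_student(students=n, **CONVENTIONAL),
            cost_per_student(students=n, **CAPITAL_INTENSIVE),
        )
    print("break-even enrolment:", break_even_students(CONVENTIONAL, CAPITAL_INTENSIVE))
```

Under these assumed numbers the capital-intensive course is far more expensive for a small class and far cheaper for a large one, which is why access to capital and large enrolments go together.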
Columbia University, Cambridge University Press, London School of Economics, New York Public Library, University of Chicago, University of Michigan, British Library, American Film Institute, RAND, Woods Hole, Victoria and Albert Museum, Science Museum, Natural History Museum
Technology in Education and Distance Education
Question 3: Ownership and Intellectual
Property. If course materials are assets, who owns
Recommendations of a Cornell Committee
1. The university policies on intellectual property should be independent of the media in which ideas are expressed.
2. Creators of works should have control over the intellectual output resulting from their research, teaching, and writing.
3. When there are multiple creators of an individual work, the control should be shared among the creators.
4. When the university contributes substantial resources to the development of specific materials, it has a right to share in the control and returns.
MIT to make nearly all course materials available free on the World Wide Web
Unprecedented step challenges 'privatization of knowledge'
CAMBRIDGE, Mass. -- MIT President Charles M. Vest has announced that the Massachusetts Institute of Technology will make the materials for nearly all its courses freely available on the Internet over the next ten years. He made the announcement about the new program, known as MIT OpenCourseWare (MITOCW), at a press conference at MIT on Wednesday, April 4th.
MIT Press Release, April 4, 2001
Digital Libraries
By creative use of technology: Can we build
libraries that are of high quality at much lower
Research Libraries are Expensive
library materials
buildings and facilities
The Open Access Web
Before the web: Few people had access to scientific, medical, legal information.
With the web: Much high quality information is available with open access. Free services organize this information and provide access to it.
"Please can I use the web? I don't do libraries."
Anonymous Cornell student, circa
The Potential of Digital Libraries
open access
computers and networks
Digital Libraries
Question 1: Economic Models for Open Access. Who
pays for open access?
A False Assumption
Incorrect thinking: The only incentive for creating information is to make money -- royalties to authors and profits for publishers.
Correct thinking: Many creators do not require revenue.
  Marketing and promotion
  Government information
  Academic research
They want their materials to be used.
Old                             New
Books in Print (subscription)   Amazon.com (advertising)
Medline (pay-by-use)            Grateful Med (external)
Journal (subscription)          ePrint archives (external)
Westlaw (pay-by-use)            Legal Information Institute (external)
Inspec (subscription)           Google (advertising)
Before You Ask ...
The open access information is sometimes a poor substitute.
Much good information is not available with open access.
But every year the proportion of important information that is available with open access
Open Letter
We support the establishment of an
online public library that would provide the full
contents of the published record of research and
scholarly discourse in medicine and the life
sciences in a freely accessible, fully
searchable, interlinked form. Establishment of
this public library would vastly increase the
accessibility and utility of the scientific
literature, enhance scientific productivity, and
catalyze integration of the disparate communities
of knowledge and ideas in biomedical sciences.
Hypotheses for Scholarly Information
The dominant force is author pressure, which
emphasizes open access rather than closed access.
Digital Libraries
Question 2: Quality. What are the alternatives to
peer review?
Observations about Peer Review
At its best, it is superb. At its worst, it validates junk.
Some topics can be reviewed from a paper, e.g., mathematics.
Some topics cannot be reviewed from a paper, e.g., computer systems.
"Whatever you do, write a paper. Some journal will publish it."
Advice to young faculty member, University of Sussex, 1969.
Quality without Peer Review
How can readers recognize good quality
materials? How can publishers maintain high
standards and let readers know? How can a
scientist build a reputation outside the
traditional peer-reviewed journals?
A sample of one: William Y. Arms
Digital Libraries
Question 3: Brute Force Computing. How far can
computers be used for the skilled tasks of
professional librarianship?
Brute Force Computing
Few people really understand Moore's Law
-- Computing power doubles every 18 months
-- Increases 100 times in 10 years
-- Increases 10,000 times in 20 years
Simple algorithms plus immense computing power may outperform human
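The round figures on this slide follow directly from compounding a doubling every 18 months. A minimal check, assuming only the 18-month doubling period quoted above (the function name is made up for this sketch):

```python
def growth_factor(years, doubling_months=18):
    """Factor by which computing power grows after `years` of steady doubling."""
    doublings = years * 12 / doubling_months
    return 2 ** doublings


if __name__ == "__main__":
    for years in (10, 20):
        print(years, "years ->", round(growth_factor(years)), "times")
```

Ten years is 6.67 doublings and twenty years is 13.33 doublings, so the results land slightly above the slide's rounded 100 and 10,000.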
Brute Force Computing
Example: Creators of the world champion chess program (Deep Thought, later Deep Blue)
-- moderate chess players
-- simple tree-search algorithm
-- very, very fast computer hardware
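To show how little game knowledge a plain tree search needs, here is a sketch of exhaustive game-tree search (negamax) on a toy take-away game: players alternately remove 1 to 3 stones and whoever takes the last stone wins. This is not the chess program's algorithm, which the slide does not describe in detail; the game, the function names, and the memoisation are assumptions made for illustration.

```python
from functools import lru_cache

MOVES = (1, 2, 3)  # a player may remove 1, 2 or 3 stones per turn


@lru_cache(maxsize=None)
def negamax(stones):
    """Return +1 if the player to move can force a win, otherwise -1."""
    if stones == 0:
        # The opponent just took the last stone, so the player to move has lost.
        return -1
    # A position is worth the best of the negated values of its successors.
    return max(-negamax(stones - m) for m in MOVES if m <= stones)


def best_move(stones):
    """Pick the move leading to the worst position for the opponent."""
    legal = [m for m in MOVES if m <= stones]
    return max(legal, key=lambda m: -negamax(stones - m))


if __name__ == "__main__":
    for n in range(1, 13):
        print(n, "win" if negamax(n) == 1 else "loss", best_move(n))
```

The search knows only the rules and the winning condition, yet it rediscovers the known strategy for this game (leave the opponent a multiple of four). Speed, not insight, does the work.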
Example Catalogs and Indexes
Catalog, index and abstracting records are very expensive when created by skilled professionals
-- only available for certain categories of material (e.g., monographs, scientific journals)
-- contain limited fields of information (e.g., no contents page)
-- restricted to static information
Equivalent Services
Information discovery: I used to be a heavy user of Inspec. Now I use Google instead.
Why are web search services the most widely used information discovery tools in universities?
Thinking out of the Box
For information discovery, particularly with untrained users, automated indexing of full text is at least as effective as manually produced indexes and catalogs.
Demonstrated repeatedly in experiments going back to the original Cranfield experiments.
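Automated indexing of full text can be sketched in a few lines: every word of every document becomes an index entry, with no human cataloguer involved. The example below is a minimal inverted index with Boolean AND search; the sample documents and function names are invented for illustration, and real search services add ranking, stemming, and much more.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each term to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing every term of the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Made-up sample collection.
DOCS = {
    "d1": "Automated digital libraries use brute force computing",
    "d2": "Peer review in scholarly journals",
    "d3": "Open access to digital scholarly information",
}

if __name__ == "__main__":
    idx = build_index(DOCS)
    print(sorted(search(idx, "digital")))
    print(sorted(search(idx, "digital libraries")))
```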
Digital Libraries
Question 4: Automated Digital Libraries. What is
the state of the art in automated digital
Automated Digital Libraries: Examples
Automatic indexing        Lycos, Infoseek, Altavista, Google, ...
Query matching            Vector methods (Salton)
Ranking importance        Google (Page and Brin)
Archiving                 Internet Archive (Kahle)
Collection development    ResearchIndex (Lawrence)
Metadata extraction       Informedia
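As an illustration of the "vector methods" entry, the sketch below ranks documents against a query by cosine similarity between term-weight vectors. The TF-IDF weighting used here is one common variant chosen for this example, not necessarily the exact scheme the slide has in mind, and the sample documents and function names are made up.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return per-document term-weight vectors and the idf table used to build them."""
    counts = {doc_id: Counter(tokenize(text)) for doc_id, text in docs.items()}
    doc_freq = Counter(term for c in counts.values() for term in c)
    n_docs = len(docs)
    idf = {term: math.log(n_docs / df) + 1.0 for term, df in doc_freq.items()}
    vectors = {
        doc_id: {term: tf * idf[term] for term, tf in c.items()}
        for doc_id, c in counts.items()
    }
    return vectors, idf


def cosine(a, b):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(docs, query):
    """Return document ids ordered from most to least similar to the query."""
    vectors, idf = tfidf_vectors(docs)
    q_counts = Counter(tokenize(query))
    q_vec = {t: tf * idf[t] for t, tf in q_counts.items() if t in idf}
    scores = {doc_id: cosine(q_vec, vec) for doc_id, vec in vectors.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Made-up sample collection.
DOCS = {
    "d1": "Automated digital libraries use brute force computing",
    "d2": "Peer review in scholarly journals",
    "d3": "Open access to digital scholarly information",
}

if __name__ == "__main__":
    print(rank(DOCS, "peer review"))
    print(rank(DOCS, "digital libraries"))
```

Unlike Boolean matching, this gives a graded ordering, so a document that matches only part of the query can still appear lower in the list.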
Digital Libraries
Question 5: A National Science Library (NSDL). Can
we build a very low cost national science library
using the methods of automated digital libraries?
One of Six Core Integration Demonstration
Projects for the NSDL
How Big might the NSDL be?
The NSDL aims to be comprehensive -- all branches of science, all levels of education, very broadly defined.
Five year targets
  1,000,000 different users
  10,000,000 digital objects
  100,000 independent sites
Requires low-cost, scalable technology: automated collection building and maintenance
Levels of Interoperability: Metadata Harvesting
Agreements on simple protocol and metadata standard(s)
Example: Metadata harvesting protocol of the Open Archives Initiative (MHP)
Moderate-quality services
Low cost of entry to participating sites
Moderately large numbers of loosely collaborating sites
Promising but still an emerging approach
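A harvesting exchange of this kind is simple enough to sketch. The example below follows the conventions of the later OAI-PMH 2.0 specification (a `ListRecords` request returning Dublin Core records), which may differ in detail from the protocol version current when this talk was given. The endpoint URL and the sample response are invented, and no network call is made.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

BASE_URL = "https://repository.example.org/oai"  # hypothetical endpoint

NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build a ListRecords request, optionally limited to records changed since a date."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date
    return base_url + "?" + urlencode(params)


def parse_records(xml_text):
    """Extract (identifier, title) pairs from a ListRecords response."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.findall(".//oai:record", NS):
        identifier = record.findtext("oai:header/oai:identifier", namespaces=NS)
        title = record.findtext(".//dc:title", namespaces=NS)
        records.append((identifier, title))
    return records


# Made-up response, trimmed to the elements this sketch reads.
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:1</identifier></header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>Example title</dc:title>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""

if __name__ == "__main__":
    print(list_records_url(BASE_URL, from_date="2001-01-01"))
    print(parse_records(SAMPLE_RESPONSE))
```

The low cost of entry comes from how little a participating site must do: answer a handful of HTTP requests with simple metadata records, leaving search and other services to the harvester.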
Levels of Interoperability: Gathering
Robots gather collections automatically with no participation from individual sites
Examples: Web search services (e.g., Google), CiteSeer (a.k.a. ResearchIndex)
Restricted but useful services
Zero cost of entry to gathered sites
Very large numbers of independent sites
Only suitable for open access collections
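Gathering by robot amounts to following links outward from a starting page. The sketch below is a breadth-first crawl over a tiny in-memory "web" so that it runs without network access. The pages, URLs, and the `fetch` callback are all invented for illustration, and a real robot would also need to honour robots.txt and limit its request rate.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def gather(start_url, fetch, limit=100):
    """Breadth-first crawl; `fetch` returns a page's HTML, or None if unavailable."""
    seen = {start_url}
    queue = deque([start_url])
    collected = []
    while queue and len(collected) < limit:
        url = queue.popleft()
        html = fetch(url)
        if html is None:
            continue  # page missing or not openly accessible
        collected.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            target = urljoin(url, href)  # resolve relative links
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return collected


# Made-up pages standing in for independent sites.
FAKE_WEB = {
    "http://a.example/": '<a href="/paper1.html">p1</a> <a href="http://b.example/">b</a>',
    "http://a.example/paper1.html": '<a href="/">home</a>',
    "http://b.example/": '<a href="missing.html">x</a>',
}

if __name__ == "__main__":
    print(gather("http://a.example/", FAKE_WEB.get))
```

The gathered sites do nothing at all, which is the "zero cost of entry" on the slide, and pages the robot cannot fetch are simply skipped, which is why the approach only suits open access collections.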
Technology Demonstrations
1. One Library, Many Portals
2. Coherent Services across Heterogeneous Collections
3. Easy Integration of Participating Collections
4. Variable Levels for Integrating Collections
5. Tools to Create New Collections
Some Light Reading
William Y. Arms, "Automated digital libraries."
D-Lib Magazine, July/August 2000.
William Y. Arms, "Economic models for
open-access publishing." iMP, March 2000.