ALEXANDRIA DIGITAL LIBRARY PROJECT

1
ALEXANDRIA DIGITAL LIBRARY PROJECT
  • Larry Carver, James Frew, Greg Janée
  • Mike Goodchild, Linda Hill, Terry Smith
  • www.alexandria.ucsb.edu

2
Outline
  • Alexandria Digital Library Project (ADLP)
  • History
  • Goals, activities, partners
  • Distributed DL supporting georeferenced access
  • Research and development issues
  • Operational collections and services
  • Knowledge organization systems (KOS)
  • Gazetteers and related KOS
  • ADEPT learning environment
  • Concept-based learning spaces
  • Collections and services

3
ADLP History
  • Pre-1994: UCSB geo-information and map library
  • 1994-98: DLI-1, georeferenced collections/access
  • 1998-99: Operational ADL (UCSB Library/CDL)
  • 1999-2004: DLI-2, distributed DL
    • Extension of architecture and access services
    • Knowledge organization services
    • Integration of learning services
    • Geo/GIS-based interfaces
    • Basic CS research
  • 2004-2008: Large-scale DLs and beyond
    • NSDL Core Infrastructure and services
    • Cyber Infrastructure

4
ADLP Goals
  • Current goals: Distributed DLs and applications
  • Operational distributed digital library
  • services for construction/use of georeferenced
    collections
  • DL federation and interoperation
  • scalability over many heterogeneous collections
  • Development/integration of KOS services
  • Integration of concept-based learning spaces
  • services for creating/using learning environments
  • Development of geo-based interfaces
  • Evaluation of services
  • Basic computational science research
  • Emerging goals: Large-scale DLs and beyond
  • Extending NSDL Core Infrastructure and services
  • Cyber Infrastructure

5
ADLP Major Collaborative Activities
  • 1994-98
    • 4 DLI-1 partners: CMU, Illinois, Stanford, UCB
    • SDSC, U.Arizona, US Navy, NIMA, LoC, MSFT, ESRI, ...
  • 1999-2004
    • UCSB Library, CDL
    • DLI-2 partners: UCLA, GT, SDSC/NPACI, Stanford,
      UCB
    • DLESE
    • NSDL CI partners: Cornell, Columbia, U.Mass
    • NSDL Services partners: IIT Chicago, UCSD
    • JISC partners: Penn State, Southampton, Leeds

6
ADLP Activities
7
Outline
  • Alexandria Digital Library Project (ADLP)
  • History
  • Goals, activities, partners
  • Distributed DL supporting georeferenced access
  • Research and development issues
  • Operational collections and services
  • Knowledge organization systems (KOS)
  • Gazetteers and related KOS
  • ADEPT learning environment
  • Concept-based learning spaces
  • Collections and services

8
Goals
  • Digital library architecture for
    geospatial/georeferenced information
  • heterogeneous
  • rich services
  • scalable
  • many providers
  • collections, large and small
  • DL infrastructure, not artifact
  • standard components and interfaces
  • distributed participants

9
Issue: discovery
  • Naïve approach
  • "I want a map of Boulder"
  • → Downtown street map of Boulder, Colorado
  • But... remote-sensing imagery is nameless
  • AVHRR NOAA-13 2002-06-03 1433 UTC
  • But... direct placename search is unreliable
  • "I want a map of the Flatirons in the Rocky
    Mountains just behind Boulder, Colorado"
  • → USGS topographic map Eldorado Springs
  • generally many names for any given place

10
ADL approach
  • Coordinate-based representation and discovery
  • lat/lon coordinates
  • rich geometry
  • polygons, polylines
  • spatial operators
  • overlaps, contains
  • Gazetteer
  • content standard defines representation
  • service maps placenames to coordinates

[Figure: client, gazetteer, and library; labels: placenames, coordinates]
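The coordinate-based approach on this slide can be sketched in a few lines of Python. This is only an illustration, not ADL code: the `Box` class, the `GAZETTEER` and `LIBRARY` contents, and the footprint coordinates are all hypothetical, and footprints are simplified to bounding boxes that ignore date-line wrap.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Box:
    """Lat/lon bounding box in degrees; date-line wrap is ignored."""
    west: float
    south: float
    east: float
    north: float

    def overlaps(self, other: "Box") -> bool:
        return (self.west <= other.east and other.west <= self.east
                and self.south <= other.north and other.south <= self.north)

    def contains(self, other: "Box") -> bool:
        return (self.west <= other.west and other.east <= self.east
                and self.south <= other.south and other.north <= self.north)


# Hypothetical gazetteer entries with rough, illustrative footprints.
GAZETTEER = {
    "boulder": Box(-105.30, 39.96, -105.18, 40.09),
    "flatirons": Box(-105.31, 39.97, -105.28, 40.00),
}

# Hypothetical library items: (title, footprint).
LIBRARY: List[Tuple[str, Box]] = [
    ("USGS topographic map Eldorado Springs",
     Box(-105.375, 39.875, -105.25, 40.0)),
    ("Downtown street map of Boulder, Colorado",
     Box(-105.29, 40.01, -105.27, 40.02)),
]


def lookup(placename: str) -> Optional[Box]:
    """Gazetteer step: placename -> coordinates (None if unknown)."""
    return GAZETTEER.get(placename.strip().lower())


def search(placename: str) -> List[str]:
    """Library step: titles whose footprint overlaps the named place."""
    region = lookup(placename)
    if region is None:
        return []
    return [title for title, footprint in LIBRARY
            if footprint.overlaps(region)]
```

With these made-up footprints, `search("Flatirons")` finds the Eldorado Springs map even though the word "Flatirons" never appears in its title, which is the point of going through coordinates rather than names.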
11
Issue: multiple data types
  • Geospatial discovery is not amenable to text
    treatment
  • constitutes new data type
  • Adding notion of different data types has many
    implications
  • input validation
  • internal structures, external representations
  • query language and processing
  • ranking
  • user interface components

12
ADL approach
  • Discovery bucket framework
  • extensible data type system for metadata
  • XML representations, search operations
  • native metadata is explicitly mapped to buckets
  • software supports bucket views over arbitrary
    RDBMSs
  • 9 Dublin Core-like standard buckets
  • User interface components
  • background maps, item footprint
    identification/creation
  • Spatial ranking
  • by spatial similarity to query region
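The slides do not say which similarity measure ADL uses for spatial ranking. As one plausible sketch, the example below ranks items by the ratio of intersection area to union area of bounding boxes (a Jaccard-style score). The function names and the `(west, south, east, north)` tuple layout are assumptions made for this illustration.

```python
def area(box):
    """Area of a (west, south, east, north) box; empty boxes give 0."""
    west, south, east, north = box
    return max(0.0, east - west) * max(0.0, north - south)


def intersection(a, b):
    """Overlap of two boxes; may be an 'inverted' box if they are disjoint."""
    return (max(a[0], b[0]), max(a[1], b[1]),
            min(a[2], b[2]), min(a[3], b[3]))


def similarity(query, footprint):
    """Intersection-over-union score in [0, 1]."""
    inter = area(intersection(query, footprint))
    union = area(query) + area(footprint) - inter
    return inter / union if union > 0 else 0.0


def rank(query, items):
    """Sort (title, footprint) pairs, most similar footprint first."""
    return sorted(items, key=lambda item: similarity(query, item[1]),
                  reverse=True)
```

A score of this kind rewards footprints that are about the same size and position as the query region, so a continent-wide map does not outrank a city map for a city-sized query.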

13
Bucket mapping
Bucket: Originator
  • FGDC Citation/Originator: U.S. Geological Survey
  • USGS DOQ Producer: Photo Science, Inc.
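A minimal sketch of the mapping on this slide, assuming a bucket is simply a named list of native metadata fields. The structure `BUCKET_MAPPINGS` and the functions `bucket_view` and `matches` are hypothetical names for illustration and do not reflect the actual ADL software or its XML representations.

```python
from typing import Dict, List, Optional, Tuple

# Which native metadata fields feed which bucket (from the slide's example).
BUCKET_MAPPINGS: Dict[str, List[str]] = {
    "Originator": ["FGDC Citation/Originator", "USGS DOQ Producer"],
}


def bucket_view(record: Dict[str, str],
                mappings: Dict[str, List[str]] = BUCKET_MAPPINGS
                ) -> Dict[str, List[Tuple[str, str]]]:
    """Project a native record onto buckets, keeping the source field name."""
    return {bucket: [(name, record[name]) for name in fields if name in record]
            for bucket, fields in mappings.items()}


def matches(view: Dict[str, List[Tuple[str, str]]], bucket: str, term: str,
            field: Optional[str] = None) -> bool:
    """Bucket-level search if field is None, otherwise field-level search."""
    term = term.lower()
    return any(term in value.lower()
               for name, value in view.get(bucket, [])
               if field is None or name == field)


record = {
    "FGDC Citation/Originator": "U.S. Geological Survey",
    "USGS DOQ Producer": "Photo Science, Inc.",
}
view = bucket_view(record)
```

Keeping the source field name next to each value lets one query hit the whole bucket or be narrowed to a single native field.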
14
Collection statistics
[Figure panels: Spatial, Temporal]

Object Type            Count
cartographic works     324,876
maps                   324,876
images                 2,014,799
photographs            484,083
aerial photographs     484,083
15
ADL approach
  • Discovery bucket framework
  • extensible data type system for metadata
  • XML representations, search operations
  • native metadata is explicitly mapped to buckets
  • software supports bucket views over arbitrary
    RDBMSs
  • 9 Dublin Core-like standard buckets
  • User interface components
  • background maps, item footprint
    identification/creation
  • Spatial ranking
  • by spatial similarity to query region

16
ADL in context
ADL
affordances
DLs
17
Issue: scalability
  • Size
  • easy to accumulate lots of data
  • satellites image continuously
  • geospatial discovery scales... not so well
  • indexing unwieldy at 10^6 items
  • efficiently joining spatial, other constraint
    types is difficult
  • Burden management
  • collection building is labor-intensive
  • providers have differing content, services, IP
    concerns, policies, lifetimes
  • providers already exist
  • MS Terraserver 3 TB, 750 million items

18
ADL approach
  • Distributed library of peer nodes
  • library nodes host collections
  • other nodes host gazetteers, thesauri, other KOS
  • other components, e.g., map servers
  • Federated item-level search
  • over buckets
  • over individual metadata fields mapped to buckets
  • Centralized collection-level search/ranking
  • over collection statistics derived from bucket
    mappings
  • space, time, type, format
  • any library node can act as collection registry
  • Collection aggregation
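The two search levels on this slide can be sketched as follows: a registry first consults collection-level statistics, then forwards the item-level query only to collections that could hold matches. The classes and function names here are hypothetical, the statistics are reduced to spatial extent and item type, and network calls between nodes are replaced by plain method calls.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple

Box = Tuple[float, float, float, float]  # west, south, east, north


def overlaps(a: Box, b: Box) -> bool:
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


@dataclass
class Item:
    title: str
    footprint: Box
    item_type: str


@dataclass
class Collection:
    """A collection hosted on some library node (hypothetical model)."""
    name: str
    items: List[Item]

    def statistics(self) -> Tuple[Box, Set[str]]:
        """Collection-level summary: overall extent and item types present."""
        extent = (min(i.footprint[0] for i in self.items),
                  min(i.footprint[1] for i in self.items),
                  max(i.footprint[2] for i in self.items),
                  max(i.footprint[3] for i in self.items))
        return extent, {i.item_type for i in self.items}

    def search(self, region: Box, item_type: str) -> List[str]:
        """Item-level search, run by the node hosting the collection."""
        return [i.title for i in self.items
                if i.item_type == item_type and overlaps(i.footprint, region)]


def federated_search(registry: List[Collection], region: Box,
                     item_type: str) -> Dict[str, List[str]]:
    """Check collection statistics first, then query only plausible nodes."""
    results: Dict[str, List[str]] = {}
    for collection in registry:
        if not collection.items:
            continue  # nothing to summarize or search
        extent, types = collection.statistics()
        if item_type in types and overlaps(extent, region):
            results[collection.name] = collection.search(region, item_type)
    return results
```

The collection-level check is cheap and centralized, so most nodes are never contacted for a given query.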

19
Issue: context & use of library items
  • Context is critical in geospatial DLs
  • formulating queries
  • evaluating result sets and individual results
  • Use of geospatial data
  • need access descriptions
  • item content as a single URL is insufficient
  • multiple formats
  • multiple access methods
  • multiple components
  • need integration with common data environments
  • ARC/INFO, etc.

20
Geospatial context
  • Does this answer your question?

21
ADL approach
  • All library functionality is accessible via...
  • web service APIs
  • Java RMI
  • Content access model
  • characterizes methods of access
  • multiple access points
  • download, service, web interface, offline
  • hierarchies of alternatives, decompositions
  • Context
  • background maps
  • library-supplied lightweight GIS functionality
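One way to picture "hierarchies of alternatives, decompositions" is as a small tree whose leaves are access points. The sketch below is an assumption about how such a model might be represented, not the ADL content access model itself; the class names and the example.org locations are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set, Union


@dataclass
class AccessPoint:
    method: str    # "download", "service", "web interface", or "offline"
    location: str  # hypothetical locator


@dataclass
class Alternatives:
    """Any one option gives access to the same content."""
    options: List["Node"]


@dataclass
class Decomposition:
    """All parts are needed to obtain the whole item."""
    parts: List["Node"]


Node = Union[AccessPoint, Alternatives, Decomposition]


def count_ways(node: Node) -> int:
    """Number of distinct ways to obtain everything under this node."""
    if isinstance(node, AccessPoint):
        return 1
    if isinstance(node, Alternatives):
        return sum(count_ways(option) for option in node.options)
    total = 1
    for part in node.parts:
        total *= count_ways(part)
    return total


def methods(node: Node) -> Set[str]:
    """All access methods that appear anywhere under this node."""
    if isinstance(node, AccessPoint):
        return {node.method}
    children = node.options if isinstance(node, Alternatives) else node.parts
    found: Set[str] = set()
    for child in children:
        found |= methods(child)
    return found


# An image offered two ways, plus a separate metadata component.
item = Decomposition([
    Alternatives([
        AccessPoint("download", "https://example.org/scene.tif"),
        AccessPoint("service", "https://example.org/wms"),
    ]),
    AccessPoint("download", "https://example.org/scene.xml"),
])
```

A description like this tells a client what a single URL cannot: that the item has two components and that one of them can be fetched in more than one way.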

22
Incorporation into NSDL/CI
  • Geospatial/georeferenced data is an instance of
    science data
  • complex, well-defined structure
  • rich metadata
  • large size
  • poorly served by traditional information
    retrieval methods
  • Science data belongs in NSDL
  • For NSDL: comparable infrastructure enabling...
  • distributed, content-specific search services
  • association of DL items and content-specific
    helper tools

23
Operational status
  • ADL co-developed with UCSB Library
  • production-quality software
  • foundation of operational library since 2000
  • complete system in 2003
  • UCSB Library Map Imagery Laboratory (MIL)
  • self-supporting, 5 full-time employees
  • 2.6 million items, 6.5 TB, growing 1.5 TB/year
  • 4.5 million item gazetteer
  • Remote sites
  • ESSW, CNR, DLESE, SIO, NTNU, AUT

24
Outline
  • Alexandria Digital Library Project (ADLP)
  • History
  • Goals, activities, partners
  • Distributed DL supporting georeferenced access
  • Research and development issues
  • Operational collections and services
  • Knowledge organization systems (KOS)
  • Gazetteers and related KOS
  • ADEPT learning environment
  • Concept-based learning spaces
  • Collections and services

25
KOS activities & contributions
  • KOS as primary components of DL architecture
  • Heretofore not acknowledged as a major component
  • ADL/ADEPT thesaurus and gazetteer service
    protocols
  • Gazetteer components of DLs
  • Growth of a research and development community,
    adopting/adapting/sharing our ADL Gazetteer
    components
  • Gazetteer research issues
  • NSDL Textual Geospatial Integration Project
  • KOS integration into learning environments
  • Terry Smith will address this in detail

26
Digital Library Components
CATALOG OF METADATA
27
KOS Generalization
Concept
Type
Definition
Label
Relationships
Meaning
Navigation
Translation
Sense-making
28
Digital Gazetteer Essentials
Name
  • None of these elements are unique identifiers of
    a particular place

29
Building gazetteer research community
  • 1994-1996: ADL built the first multi-million-entry
    international gazetteer and integrated it into
    the ADL system
  • 1996-1999: ADL created...
  • Gazetteer Content Standard
  • Feature Type Thesaurus (210 preferred terms,
    1,046 non-preferred)
  • rebuilt the ADL Gazetteer (over 4 million
    entries)
  • provided web interfaces for searching the ADL
    Gazetteer

30
Building a research community
  • 1999-present
  • Set of 5.9 million geographic names available for
    download; useful for placename recognition in
    text
  • Gazetteer Service Protocol and protocol server
    code
  • An external identifier for ADL Gazetteer
    records
  • New gazetteer client that is based on the
    gazetteer protocol

31
Our network of gazetteer interactions
32
Advancing and extending gazetteers
33
Advancing and extending gazetteers
34
Advancing and extending gazetteers
Obtaining extents from image analysis
  • Recognizing patterns
  • Identifying features from gazetteers
  • Deriving the extent of the features from feature
    analysis
  • Adding bounding box footprints to gazetteer
    entries
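The last step in the list above, turning a derived extent into a bounding-box footprint, is simple enough to sketch. The image-analysis steps are not shown; the outline coordinates, the entry layout, and the function name `bounding_box` are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]  # (longitude, latitude) in degrees


def bounding_box(outline: List[Point]) -> Tuple[float, float, float, float]:
    """Smallest (west, south, east, north) box enclosing the outline."""
    if not outline:
        raise ValueError("outline has no vertices")
    lons = [lon for lon, _ in outline]
    lats = [lat for _, lat in outline]
    return (min(lons), min(lats), max(lons), max(lats))


# Hypothetical outline, as if traced from an image of the airport.
outline = [(-119.855, 34.422), (-119.832, 34.420),
           (-119.830, 34.434), (-119.850, 34.436)]

# Hypothetical gazetteer entry that previously had no extent.
entry: Dict[str, object] = {"name": "Santa Barbara Municipal Airport"}
entry["footprint"] = bounding_box(outline)
```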

Santa Barbara Municipal Airport
35
Advancing and extending gazetteers
The duplicate detection problem. Given variant
names and variant footprints, how do we determine
that two pieces of information are about the same
place?
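The slide poses duplicate detection as an open problem and gives no algorithm. As a naive baseline for illustration only, the sketch below calls two entries the same place when their best-matching names are similar and their footprints overlap substantially. The thresholds, the entry layout, and the function names are all assumptions; name similarity uses Python's standard `difflib.SequenceMatcher`.

```python
from difflib import SequenceMatcher


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def footprint_overlap(a, b):
    """Intersection-over-union of two (west, south, east, north) boxes."""
    width = min(a[2], b[2]) - max(a[0], b[0])
    height = min(a[3], b[3]) - max(a[1], b[1])
    inter = max(0.0, width) * max(0.0, height)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


def same_place(e1, e2, name_threshold=0.6, overlap_threshold=0.3):
    """Heuristic: some pair of variant names is close AND footprints overlap."""
    best_name = max(name_similarity(n1, n2)
                    for n1 in e1["names"] for n2 in e2["names"])
    overlap = footprint_overlap(e1["footprint"], e2["footprint"])
    return best_name >= name_threshold and overlap >= overlap_threshold


# Hypothetical entries from two source gazetteers.
a = {"names": ["Santa Barbara Municipal Airport", "SBA"],
     "footprint": (-119.86, 34.42, -119.83, 34.44)}
b = {"names": ["Santa Barbara Airport"],
     "footprint": (-119.855, 34.42, -119.83, 34.435)}
```

Requiring both signals avoids merging identically named places in different regions, but a rule this simple would still miss renamed places and point-only entries.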
36
Advancing and extending gazetteers
37
Gazetteer ITR Proposal
  • Advancing and Extending Georeferencing
    Interoperability and Services (AEGIS)
  • Medium ITR proposal for 2003
  • Michael Goodchild, UCSB, PI
  • Lewis Lancaster, Berkeley/ECAI, co-PI
  • Formalization and extension
  • Performance and scalability
  • Cross-cultural issues
  • Cognitive and behavioral issues
  • Extents: representation of a feature's geometry
  • Integration of locator services

38
NSDL Textual Geospatial Integration
2001 - 2003
  • Goals
  • Extend NSDL infrastructure by enabling
    • geographic queries across heterogeneous text
      and non-text resources
    • spatial georeferencing of arbitrary texts
      without explicit geographic cataloging
  • Participants
  • University of California, Santa Barbara
  • James Frew, PI
  • Terence Smith
  • Michael Bueno
  • Linda Hill
  • Information Retrieval Lab, Illinois Institute of
    Technology
  • Ophir Frieder
  • David Grossman
  • Eric Jensen
  • Steve Beitzel

The American Geological Institute (AGI) has
permitted us to use a set of their GeoRef records
for system training.
39
Example text -> Estimated footprint
Structure and petrography of the schist of
Skookum Gulch, Callahan-Yreka area, eastern
Klamath Mountains, Northern California
<key>blueschist California Callahan California
foliation Klamath Mountains melange
metamorphic rocks Ordovician Paleozoic
petrology schists Silurian Siskiyou County
California Skookum Gulch United States
Yreka California</key>
<ab>The schist of Skookum
Gulch (SSG) is an informal name applied to a
fault-bounded melange composed mainly of
schistose metamorphic rocks and less abundant
sedimentary and igneous rocks located in the
eastern Klamath Mountains of Northern California.
The SSG features outcrops of lawsonite + sodic
amphibole blueschist and epidote + sodic amphibole
rocks transitional to the greenschist facies.
Isotopic dating indicates that the schist was
metamorphosed during the Ordovician. The SSG is
the oldest known Paleozoic blueschist-bearing
melange in California and one of the oldest
preserved blueschist terranes in North America.
Tonalitic rocks associated with the schist have
Early Cambrian ages and are among the oldest
rocks yet dated within the Klamath Mountains.
Field relations indicate that the schist of
Skookum Gulch is a complex tectonic melange
composed of metavolcanic, ...</ab>
<coord>N410000N420000W1220000W1230000</coord>
  • Derived footprint (small)
  • Blue: derived footprint (large)
  • Red: GeoRef footprint
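The GeoRef footprint in the record above is carried in the coordinate string. A sketch of decoding it follows, assuming each value is a hemisphere letter followed by degrees, minutes, and seconds (DDMMSS for latitude, DDDMMSS for longitude) and that the string holds two latitudes and then two longitudes. That layout is inferred from this one example, not taken from a GeoRef specification, and the function names are hypothetical.

```python
import re

# Two latitude tokens followed by two longitude tokens (assumed layout).
_COORD = re.compile(r"([NS]\d{6})([NS]\d{6})([EW]\d{7})([EW]\d{7})")


def dms_to_degrees(token):
    """Convert e.g. 'N410000' or 'W1223000' to signed decimal degrees."""
    hemisphere, digits = token[0], token[1:]
    degrees = int(digits[:-4])
    minutes = int(digits[-4:-2])
    seconds = int(digits[-2:])
    value = degrees + minutes / 60 + seconds / 3600
    return -value if hemisphere in "SW" else value


def parse_footprint(coord):
    """Return (west, south, east, north) in decimal degrees."""
    match = _COORD.fullmatch(coord.strip())
    if match is None:
        raise ValueError("unrecognized coordinate string: %r" % coord)
    lat1, lat2, lon1, lon2 = (dms_to_degrees(t) for t in match.groups())
    return (min(lon1, lon2), min(lat1, lat2),
            max(lon1, lon2), max(lat1, lat2))


footprint = parse_footprint("N410000N420000W1220000W1230000")
```

Under these assumptions the example decodes to a one-degree box from 41°N to 42°N and 123°W to 122°W, which is the kind of reference footprint the derived footprints are compared against.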

40
KOS activities & contributions
  • KOS as primary components of DL architecture
  • Heretofore not acknowledged as a major component
  • ADL/ADEPT thesaurus and gazetteer service
    protocols
  • Gazetteer components of DLs
  • Growth of a research and development community,
    adopting/adapting/sharing our ADL Gazetteer
    components
  • Research issues
  • NSDL Textual Geospatial Integration Project
  • KOS integration into learning environments
  • Terry Smith will address this in detail

41
Outline
  • Alexandria Digital Library Project (ADLP)
  • History
  • Goals, activities, partners
  • Distributed DL supporting georeferenced access
  • Research and development issues
  • Operational collections and services
  • Knowledge organization systems (KOS)
  • Gazetteers and related KOS
  • ADEPT learning environment
  • Concept-based learning spaces
  • Collections and services

42
Applications & services based on DLs
  • Integrate applications with DL infrastructure
  • Web portals lack library organization
  • packages not integrated with DLs
  • Important applications include
  • Services/collections supporting learning
    environments
  • Services/collections supporting research
  • Apply domain-specific KOS principles for
    organizing collections/services for given
    application
  • Geospatial applications use georeference
  • Science learning environments use concept spaces

43
Science learning spaces: Concept KOS
  • Concepts of science as basic knowledge granules
  • Sets of concepts form bases for scientific
    representation
  • DL and KOS technology can support organization of
    science learning materials in terms of concepts
  • Collections of models of science concepts
    (knowledge base)
  • Collections of learning objects (LO) cataloged
    with concepts
  • Collections of instructional materials organized
    by concepts
  • Organize learning materials as trajectory
    through concept space
  • Lecture, lab, self-paced materials
  • Services for creating/editing/displaying such
    materials

44
Learning environment components/services
45
Application to learning environments
  • Application
  • Introductory physical geography (F2002, S2003)
  • Collections created
  • Knowledge base (KB) of strongly structured
    concepts
  • Structured lectures and labs
  • Learning objects cataloged by ADN metadata
    (with concepts)
  • Services created
  • For concepts
  • Web-based concept input tool
  • Graphic and text-based display tools
  • For instructional materials
  • Web-based lecture composer
  • Conceptualization graphing tool
  • For learning objects
  • Metadata input tool

46
Learning environment display (lecture mode)
  • The lecture is presented on three projection
    screens, showing the
  • Concept window (left)
  • Lecture window (center)
  • Object window (right)

47
Model of science concepts
  • Representing a concept involves more than terms
  • Objective, information-rich, scientific
    representations
  • e.g., for concepts of heat diffusion, DNA,
    drainage basin, ...
  • Associated semantics
  • e.g., relating to measurement, recognition, ...
  • Many interrelationships
  • e.g., hierarchical, causative, property, ...
  • Models of science concepts
  • Already exist for chemistry (ASA), materials
    (NIST), ...
  • Generalize such models for this application
  • Structure items in concept KB using model

48
Model of science concepts
  • ID
  • TYPE and FACET
  • CONTEXT (KNOWLEDGE DOMAIN)
  • TERM(S) (P/NP)
  • DESCRIPTION(S)
  • HISTORICAL ORIGIN(S)
  • EXAMPLE(S)
  • HIERARCHICAL RELATIONS
  • DEFINING OPERATIONS
  • SCIENTIFIC REPRESENTATION(S)
  • Scientific classifications
  • Data/Graphical/Mathematical/Computational reps
  • PROPERTIES
  • CAUSAL RELATIONS
  • CO-RELATIONS
  • APPLICATION(S)
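The element list above maps naturally onto a record type. The sketch below is one possible rendering, not the ADEPT knowledge-base schema: the field names are paraphrases of the slide's headings, "P/NP" is read here as preferred/non-preferred terms, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    concept_id: str
    concept_type: str
    facet: str
    context: str                      # knowledge domain
    preferred_term: str
    non_preferred_terms: List[str] = field(default_factory=list)
    descriptions: List[str] = field(default_factory=list)
    historical_origins: List[str] = field(default_factory=list)
    examples: List[str] = field(default_factory=list)
    hierarchical_relations: List[str] = field(default_factory=list)
    defining_operations: List[str] = field(default_factory=list)
    # Keyed by kind: "classification", "data", "graphical",
    # "mathematical", "computational".
    scientific_representations: Dict[str, List[str]] = field(
        default_factory=dict)
    properties: List[str] = field(default_factory=list)
    causal_relations: List[str] = field(default_factory=list)
    co_relations: List[str] = field(default_factory=list)
    applications: List[str] = field(default_factory=list)


# Hypothetical knowledge-base item; only a few elements are filled in.
basin = Concept(
    concept_id="c-0001",
    concept_type="feature",
    facet="hydrology",
    context="physical geography",
    preferred_term="drainage basin",
)
basin.hierarchical_relations.append("part of: fluvial landscape")
```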

49
Item in concept knowledge base
50
Concept input tool
51
Collections of learning materials
  • Lecture/lab composer
  • Creates learning materials with
  • Tailorable structure
  • Underlying organization as forest of trees of
    concepts
  • Small reusable granules for
  • Easy creation/edit/access/re-use
  • Can link in
  • Concepts from concept KB
  • Items from learning object collections
  • Items from lecture collection
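The "forest of trees of concepts" organization can be sketched as a small tree type whose depth-first traversal gives the order in which a lecture visits its concepts. The class `Granule`, the function `trajectory`, and the linked item names are hypothetical; the concept names are borrowed from elsewhere in these slides.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Granule:
    """A small reusable unit of learning material tied to one concept."""
    concept: str
    linked_items: List[str] = field(default_factory=list)
    children: List["Granule"] = field(default_factory=list)


def trajectory(forest: List[Granule]) -> List[str]:
    """Concepts in presentation order (depth-first, parents first)."""
    order: List[str] = []
    for granule in forest:
        order.append(granule.concept)
        order.extend(trajectory(granule.children))
    return order


# Hypothetical lecture fragment.
lecture = [
    Granule("Fluvial Landscape", children=[
        Granule("River", linked_items=["river photo (learning object)"]),
        Granule("Watershed", linked_items=["watershed map (learning object)"]),
    ]),
]
```

Because each granule is self-contained, a subtree such as the "River" node could be reused in another lecture or lab without copying the rest of the tree.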

52
Current instructional material window
  • The left-hand frame displays the structure of the
    lecture
  • The right-hand frame displays the content of the
    lecture
  • ADL icons (globe image) attached to a concept
    link to a display of concept properties in the
    concept window

Other icons attached to a concept link to a
display of concept examples in the illustration
window
53
View of learning material by concepts
54
Lecture/lab composer tool
55
Learning object collections
  • Cataloged with tool for metadata creation
  • ADN metadata content standard with concept fields
  • Use of ADL/ADEPT middleware search services
  • E.g., in creation of lecture/lab presentation
    materials
  • Display of collection items in collection window
  • Photos, images, maps, text, videos,
  • Support in display window for ADL browser
  • Allows dynamic search of collection holdings

56
The illustrations window
57
Evaluation of concept-based approach
  • Evaluation of efficacy for student learning
  • Do students attain deeper levels of
    understanding?
  • Comparison approach to evaluation
  • Evaluation of value to instructors/TAs
  • UCLA evaluation team
  • Evaluation issues
  • Instrumenting students' use of course materials
  • Time to assess pedagogic value of approach

58
Example of lessons learned
  • Importance of conceptualizations of concept
  • e.g., characterize concept of Fluvial Landscape
    with concepts of River, Watershed
  • Embed conceptualizations in lecture/labs (not in
    KB)
  • Idea of learning materials as trees in concept
    space
  • Construct labs using analogous lab composer
  • Tailored for lab presentations/work
  • Supports logic of using concepts as framework
  • Can import material from lecture/other
    collections

59
Summary
  • DL infrastructure as basis for Learning
    Environments
  • Collections
  • Concept KBs, Lectures, DL objects
  • Services
  • Creation/Search/Display
  • Evaluation of efficacy of approach
  • Community-based development of KBs, Learning
    Materials, Collections

60
ADLP Activities