Distributed Archives Interoperability

1
Distributed Archives Interoperability
  • Cynthia Y. Cheung
  • NASA Goddard Space Flight Center
  • IAU 2000
  • Commission 5
  • Manchester, UK
  • August 12, 2000

2
Current Status
  • Global Astrophysics Data Resources Loosely
    Connected By the Internet
  • Observational data archives or repositories
  • Derived data products (astronomical catalogs,
    browse images, video)
  • Data analysis packages
  • Visualization/presentation packages
  • Special services (bibliography,
    discipline-specific knowledge bases, directories)
  • Distributed Storage, Processing, and Management
  • Multispectral surveys (Data volume terabytes)
  • Islands of Information?
  • Requires both Vertical and Horizontal Integration

3
Path to the Future
  • Current (connections via hyperlinks): one to one
  • Near Future (connections to multiple DBs all at
    once, via middleware): one to many
  • Long Term (multiple inter-connectivity, federated
    databases): many to many
  • Distributed Autonomous data centers
  • Intelligent Agents
  • User-defined Profiles and Preferences
  • Access via Multiple Interfaces

[Diagram labels, three panels: User / Resources; User / Middleware / Resources; User / Resources]

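
The "one to many" middleware step on this slide can be sketched as a small fan-out function. This is only an illustration under stated assumptions: the archive adapters (radio_archive, xray_archive), their return format, and the name query_all are hypothetical and do not correspond to any system named in the talk.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List

# Hypothetical archive adapter: takes a query string, returns a list of rows.
Archive = Callable[[str], List[dict]]

def query_all(archives: Dict[str, Archive], query: str) -> Dict[str, List[dict]]:
    """Send one query to every registered archive and collect results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(archives))) as pool:
        futures = {name: pool.submit(fn, query) for name, fn in archives.items()}
        return {name: fut.result() for name, fut in futures.items()}

# Stand-in archives with made-up content, for illustration only.
def radio_archive(q: str) -> List[dict]:
    return [{"source": "radio", "object": q}]

def xray_archive(q: str) -> List[dict]:
    return [{"source": "xray", "object": q}]

results = query_all({"radio": radio_archive, "xray": xray_archive}, "M31")
```

The user issues a single request and the middleware handles the per-archive calls, which is the difference from following hyperlinks to one archive at a time.
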
4
Components of Interoperability
  • Integrated search and discovery
  • URL Registry (e.g., Yellow Pages, GLU,
    AstroBrowse)
  • Query processor (e.g., AMASE, ISAIA)
  • Browsing/visualization to support selection (ADC
    Data Viewer, AEQ)
  • Batch queries (Feed output stream of one data
    service to another)
  • Tools to support integration of results
  • Data and software exchange
  • FTP of data and software updates (pull)
  • Download of Browser Plug-in (pull)
  • Automated Updates (HST DB replication) (push)
  • Hybrid Techniques (with data cache or aircache)
    (push and pull)
  • Packaging of software with data (XDF)
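
The "batch queries" item above (feeding the output stream of one data service into another) can be sketched with chained generators. The two services, their field names, and the sample rows are made up for illustration.

```python
from typing import Iterable, Iterator

def catalog_service(region: str) -> Iterator[dict]:
    """Hypothetical first service: yields catalog entries for a region."""
    # Made-up rows, for illustration only.
    yield {"name": "obj-1", "region": region, "mag": 12.5}
    yield {"name": "obj-2", "region": region, "mag": 18.1}

def bright_filter(rows: Iterable[dict], max_mag: float) -> Iterator[dict]:
    """Hypothetical second service: keeps entries at or brighter than max_mag."""
    for row in rows:
        if row["mag"] <= max_mag:
            yield row

# The output stream of the first service is the input of the second.
bright = list(bright_filter(catalog_service("field-A"), max_mag=15.0))
```

Because both stages are streams, rows pass from one service to the next without the full intermediate result being held by the user.
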

5
Technical Issues and Challenges
  • Example: Positional correlation of objects in a
    region of the sky across multiple wavelengths
    (Radio, IR, Optical, UV, X-rays, Gamma Rays)
  • Data volume and network bandwidth
  • Cache of pre-computed results (e.g., astronomical
    catalogs)
  • Data filtering at data site, ship results only
  • Deployment of user code (platform independent
    S/W)
  • Data visualization for exploration and selection
  • Registration, Sensitivity, Positional Accuracy
  • Coordinate transformation on a large scale
  • Calibration and normalization
  • Query Optimization across Multiple Sites
  • Query execution plan for efficient
    cross-correlation
  • Indexing for fast access
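
The positional-correlation example on this slide can be sketched as a brute-force cross-match between two catalogs. This is a minimal illustration, assuming both catalogs are already in the same coordinate system and contain (name, RA, Dec) in degrees; the catalog contents and the 2 arcsecond radius are made up. It compares every pair, so it does not address the indexing and query-optimization issues listed above.

```python
import math

def angular_separation_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))

def cross_match(cat_a, cat_b, radius_arcsec):
    """Pair each cat_a object with its nearest cat_b object within the radius."""
    matches = []
    for name_a, ra_a, dec_a in cat_a:
        best = None
        for name_b, ra_b, dec_b in cat_b:
            sep = angular_separation_deg(ra_a, dec_a, ra_b, dec_b) * 3600.0
            if sep <= radius_arcsec and (best is None or sep < best[1]):
                best = (name_b, sep)
        if best is not None:
            matches.append((name_a, best[0], best[1]))
    return matches

# Made-up positions, for illustration only.
radio = [("R1", 10.0, 20.0), ("R2", 50.0, -5.0)]
optical = [("O1", 10.0002, 20.0001), ("O2", 120.0, 30.0)]
matches = cross_match(radio, optical, radius_arcsec=2.0)
```

The nested loop costs one separation calculation per pair of objects, which is why the slide lists indexing and a cross-site query execution plan as open challenges at survey scale.
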

6
Semantic Interoperability
  • Content-based Searches
  • Science goal driven queries instead of SQL
  • Data Understanding (Domain Context)
  • Human Interface > S/W Mapping >
    Object-oriented Mapping
  • Data Annotation for Correct Interpretation
  • Measured parameters, units, quality, range of
    validity
  • Algorithm and calibration used, pedigree
  • Theoretical models applied
  • Data Organization
  • File directory structure
  • Database schema
  • Need Information in both Machine-understandable
    and Human-understandable form
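
The annotation items above (parameter, units, quality, range of validity, calibration pedigree) can be sketched as one record that serves both audiences: fields a program can test, and a description a person can read. The class name Measurement, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass(frozen=True)
class Measurement:
    parameter: str
    value: float
    unit: str
    quality: str                      # quality flag assigned by the archive
    valid_range: Tuple[float, float]  # range of validity for the value
    calibration: str                  # algorithm or calibration used (pedigree)

    def is_valid(self) -> bool:
        """Machine-understandable check against the range of validity."""
        lo, hi = self.valid_range
        return lo <= self.value <= hi

    def describe(self) -> str:
        """Human-understandable summary of the same annotation."""
        lo, hi = self.valid_range
        return (f"{self.parameter} = {self.value} {self.unit} "
                f"(quality {self.quality}, valid {lo} to {hi} {self.unit}, "
                f"calibration: {self.calibration})")

# Made-up values, for illustration only.
m = Measurement("flux_density", 1.2, "Jy", "A", (0.0, 100.0), "hypothetical-pipeline-v1")
summary = m.describe()
```
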

7
Metadata Standards
  • Syntax
  • Directory Structure
  • Size, Format, Location, URL
  • Semantics
  • Usage Convention (e.g., FITS)
  • Extensible Standards to Encompass Different
    Disciplines (DTD, XML)
  • Astronomical Nomenclature and Designation
  • Conceptual Data Model
  • Metadata Language or Representation
  • FITS, ASCII, IEEE Binary
  • Astronomical XML
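
As a sketch of how XML can carry metadata for both programs and people, the fragment below parses a small description of a dataset with Python's standard xml.etree.ElementTree module. The element and attribute names (dataset, field, unit, description) are invented for this example and are not taken from any astronomical XML standard.

```python
import xml.etree.ElementTree as ET

# Hypothetical metadata document; the tag names are assumptions.
DOC = """<dataset name="example-survey">
  <field name="ra" unit="deg" description="Right ascension"/>
  <field name="flux" unit="Jy" description="Flux density"/>
</dataset>"""

root = ET.fromstring(DOC)

# Machine-understandable: units keyed by field name.
units = {f.get("name"): f.get("unit") for f in root.findall("field")}

# Human-understandable: one readable line per field.
lines = [f"{f.get('name')}: {f.get('description')} [{f.get('unit')}]"
         for f in root.findall("field")]
```
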

8
Aspects of Metadata Usage
  • Ref: Bretherton & Singley 1994, Proc. of 7th
    SSDBM, p. 166
  • Search, browse, retrieval (Human)
  • Data extraction and interpretation
  • Navigate among services
  • Ingest, quality assurance, (re-)processing
  • Science product generation pipeline
  • Content analysis
  • Storage, archive (Data Management)
  • Information relevant for effective system design
    and operation
  • Application to application transfer (Machine)
  • Enable context interchange (distributed queries
    and transformations)
  • Need transfer language with mappings from
    conceptual level to different logical
    representation

9
Other Supporting Tools
  • Interface Standards for Software Tools
  • Tools for Schema Mapping
  • Document logical structure of database (key
    elements and relationship)
  • Mapping of local definitions into common
    terminology
  • Track changes and updates at other sites
  • Tools for Data Integration and Fusion
  • Dynamic Interface with user preferences
  • Intelligent Software Agents to mediate
    interaction
  • Goal: Global query to many distributed,
    autonomous, evolving data resources
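
The "mapping of local definitions into common terminology" item can be sketched as a per-site lookup table. The site names, local column names, and common terms below are all hypothetical.

```python
# Hypothetical per-site mappings from local column names to common terms.
COMMON_TERMS = {
    "site_a": {"ra_j2000": "ra_deg", "de_j2000": "dec_deg", "v_mag": "mag_v"},
    "site_b": {"alpha": "ra_deg", "delta": "dec_deg", "v_magnitude": "mag_v"},
}

def to_common(site: str, record: dict) -> dict:
    """Rename a site's local fields to the common terminology."""
    mapping = COMMON_TERMS[site]
    unknown = sorted(set(record) - set(mapping))
    if unknown:
        # Unmapped fields are reported rather than silently dropped.
        raise KeyError(f"no common term for {unknown} at {site}")
    return {mapping[key]: value for key, value in record.items()}

# Made-up records from two sites describing the same object.
a = to_common("site_a", {"ra_j2000": 10.0, "de_j2000": 20.0, "v_mag": 12.5})
b = to_common("site_b", {"alpha": 10.0, "delta": 20.0, "v_magnitude": 12.5})
```

Once both records use the same terms they can be compared or merged directly. Raising on an unmapped field is one way to notice when a site changes its schema, which relates to the "track changes and updates at other sites" item.
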