HDX Data Model: FITS, NDF and XML Implementation


1
HDX Data Model: FITS, NDF and XML Implementation
  • David Giaretta, Mark Taylor, Peter Draper, Norman
    Gray, Brian McIlwrath
  • Starlink Project, UK

2
Outline
  • Progress report on a structured approach to data
  • Bringing together the existing Starlink data
    structure (HDS) with FITS and XML
  • Part of work to make Starlink software support
    the Virtual Observatory

3
Possible Requirements
  • Flexible and extensible
  • Capable of storing hierarchical data
  • Supplement FITS and VOTable
  • Strong astronomical data model
  • Convenient for local access
  • Usable on a wide range of platforms
  • Standard enough to be used widely
  • Amenable to remote access
  • Able to be indexed/interrogated in sophisticated
    ways

4
Example

5
VO data handling
  • VO applications might be expected to deal with:
  • Remote data
  • The need to add metadata, e.g. additional
    calibrations, refined astrometric data
  • Large numbers of interrelated files organised
    into various collections
  • Any single file may be part of several
    collections
  • New links may need to be created by users between
    distributed files
  • Distributed/parallel processing
  • Automated processing may be needed, e.g. when
    dealing with large numbers of files

6
Complex Relationships: possible problems
  • Applications could create increasingly complex
    aggregations of data, to express increasingly
    complex interrelationships
  • How are the relationships between the various
    components defined?
  • How can different applications exchange
    information effectively?
  • How can automated processing be facilitated?

7
Data Model proposal

8
Structured Object
  • Need a Structured Object in addition to Data
    Objects, to:
      • separate the Structure from the Data
      • avoid overcomplicating e.g. FITS
      • allow use of distributed data
  • Easily extensible to allow additional metadata,
    e.g. relationships
  • Need something moderately standardised, because
    applications from many sources must have some
    rules about what to do with things they do not
    understand
  • Need to be able to validate data, even non-XML
    data
  • Also need some standard components, e.g.
      • NDX (see later)
      • VOTable

9
Structured Object: HDX
  • Flexible data model based on many years'
    experience with HDS using local data
  • Aim to support distributed processing and data
    holding
  • Uses URIs to point to data
  • Low (or even zero) overhead, e.g. bare FITS files
    are OK
  • Can be serialised as XML
  • Platform independent
  • Format agnostic
  • Natural data holding formats are XML and FITS

10
HDX
  • HDX is a particular, simple, Structured Object
  • An HDX is a W3C DOM which has a top-level element
    <hdx>, and which is valid.
  • It is valid if each of the document element's
    children is
      • either unknown to the HDX system or,
      • if known, validated by its declared validator
        (part of its registration)
  • The software implements the DOM API
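
The validity rule on this slide can be sketched in a few lines of Java on top of the standard W3C DOM API. This is only an illustration under stated assumptions, not the Starlink implementation: the class HdxValiditySketch, its register method, and the element names ndx and image with a uri attribute are all hypothetical.

```java
import java.io.StringReader;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch of the validity rule; not the Starlink HDX API.
class HdxValiditySketch {
    // Known child element names mapped to their declared validators.
    private final Map<String, Predicate<Element>> validators = new HashMap<>();

    void register(String elementName, Predicate<Element> validator) {
        validators.put(elementName, validator);
    }

    // Valid if the document element is <hdx> and every child element is
    // either unknown to the registry or accepted by its validator.
    boolean isValidHdx(Document doc) {
        Element root = doc.getDocumentElement();
        if (root == null || !"hdx".equals(root.getTagName())) {
            return false;
        }
        for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
            if (n.getNodeType() != Node.ELEMENT_NODE) {
                continue; // skip whitespace text and comments
            }
            Element child = (Element) n;
            Predicate<Element> validator = validators.get(child.getTagName());
            if (validator != null && !validator.test(child)) {
                return false; // known type that fails its own validator
            }
        }
        return true;
    }

    // Parse an XML string into a DOM; checked exceptions are wrapped for brevity.
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Not well-formed XML", e);
        }
    }

    // Assumed rule for illustration: an <ndx> must contain an <image> element.
    static HdxValiditySketch withNdxRule() {
        HdxValiditySketch hdx = new HdxValiditySketch();
        hdx.register("ndx", e -> e.getElementsByTagName("image").getLength() > 0);
        return hdx;
    }
}
```

With this sketch, a document such as <hdx><mystery/></hdx> counts as valid because the unknown child is simply ignored, while an <ndx> without an <image> is rejected by its registered validator.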

11
Data access layer
  • The abstract data model has been implemented in a
    Java data-access library.
  • The support for the HDX system is distinct from
    the support for the various HDX types which are
    defined.
  • It is easy to extend the system to support new
    types
  • It is easy to extend the system to support new
    data storage resources, such as new file formats
    or a database serving an archive.
  • These extensions can be implemented in very
    efficient ways.
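
One common way to get this kind of extensibility is a registry of pluggable handlers, one per storage resource. The sketch below shows that pattern only; ResourceHandler, HandlerRegistry, FitsFileHandler, ArchiveHandler and the archive: URI scheme are hypothetical names chosen for illustration, and the slides do not say the Java library is structured this way.

```java
import java.net.URI;
import java.util.ArrayList;
import java.util.List;

// Hypothetical plug-in interface: one implementation per storage resource.
interface ResourceHandler {
    boolean accepts(URI location);
    String describe(URI location);
}

// Holds the registered handlers and picks one for a given URI.
class HandlerRegistry {
    private final List<ResourceHandler> handlers = new ArrayList<>();

    void register(ResourceHandler handler) {
        handlers.add(handler);
    }

    // Returns the first registered handler that accepts the URI.
    ResourceHandler find(URI location) {
        for (ResourceHandler handler : handlers) {
            if (handler.accepts(location)) {
                return handler;
            }
        }
        throw new IllegalArgumentException("No handler for " + location);
    }
}

// Handles any URI whose path ends in ".fits".
class FitsFileHandler implements ResourceHandler {
    public boolean accepts(URI location) {
        String path = location.getPath(); // null for opaque URIs
        return path != null && path.endsWith(".fits");
    }

    public String describe(URI location) {
        return "FITS file at " + location;
    }
}

// Handles an assumed "archive:" scheme standing in for a database archive.
class ArchiveHandler implements ResourceHandler {
    public boolean accepts(URI location) {
        return "archive".equals(location.getScheme());
    }

    public String describe(URI location) {
        return "archive record " + location.getSchemeSpecificPart();
    }
}
```

Supporting a new file format or archive then means writing one more ResourceHandler and registering it; the code that asks the registry for a handler does not change.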

12
HDX UML from software

13
NDX
  • NDX represents an N-dimensional chunk of
    astronomical data and contains:
  • Image pixel array (mandatory component)
  • Variance pixel array
      • Generalisable as an error estimate, systematic
        and random
  • Quality pixel array
  • Bad bit mask
  • WCS information
  • History information
  • Title
  • Units
  • User-defined extensions (stored as XML)
      • e.g. statistical values, thumbnails etc.,
        probably calculated lazily

14
NDX operations
  • A simple operation, e.g. ndx1.add(ndx2), takes care
    of variance, quality, WCS etc. (where these
    components are present)
  • Access to individual arrays (NDArrays) is
    possible for more complex algorithms
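
To make concrete what "takes care of variance, quality" could mean for an add operation, here is a heavily simplified sketch. SimpleNdx is a hypothetical class using flat arrays rather than NDArrays, and it omits WCS. It assumes independent errors (so variances add), combines quality flags with a bitwise OR, and drops the variance when either operand lacks one. These are illustrative choices, not a description of the actual NDX behaviour.

```java
// Hypothetical, heavily simplified stand-in for an NDX; not the Starlink class.
class SimpleNdx {
    final double[] image;    // mandatory component
    final double[] variance; // optional, may be null
    final byte[] quality;    // optional, may be null

    SimpleNdx(double[] image, double[] variance, byte[] quality) {
        if (image == null) {
            throw new IllegalArgumentException("image array is mandatory");
        }
        this.image = image;
        this.variance = variance;
        this.quality = quality;
    }

    // Adds two NDX-like objects pixel by pixel, handling optional components.
    SimpleNdx add(SimpleNdx other) {
        int n = image.length;
        if (other.image.length != n) {
            throw new IllegalArgumentException("shape mismatch");
        }
        double[] sum = new double[n];
        for (int i = 0; i < n; i++) {
            sum[i] = image[i] + other.image[i];
        }

        // Assumption: errors are independent, so variances add. The result
        // carries a variance only when both inputs have one.
        double[] var = null;
        if (variance != null && other.variance != null) {
            var = new double[n];
            for (int i = 0; i < n; i++) {
                var[i] = variance[i] + other.variance[i];
            }
        }

        // Assumption: a pixel is flagged if it is flagged in either input.
        byte[] q = null;
        if (quality != null || other.quality != null) {
            q = new byte[n];
            for (int i = 0; i < n; i++) {
                int a = quality != null ? quality[i] : 0;
                int b = other.quality != null ? other.quality[i] : 0;
                q[i] = (byte) (a | b);
            }
        }
        return new SimpleNdx(sum, var, q);
    }
}
```

The caller writes a single a.add(b) and never touches the component arrays directly, which is the convenience the slide describes.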

15
More details: NDX/NDArray
  • The philosophy and design goals behind
    NDArray/NDX are:
  • Very large arrays can be processed.
  • Arrays of unlimited size can be processed in
    limited memory. 
  • Bad value processing is comprehensive and
    transparent. 
  • Array access is direct and transparent between
    different formats. 
  • Resource naming is location transparent. 
  • Deferred processing. 
  • Extensibility. 
  • Type independence where possible.
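
Two of these goals, processing arbitrarily large arrays in limited memory and transparent bad-value handling, can be illustrated with chunked access. The sketch below is an assumption-laden example rather than the NDArray API: PixelSource, meanIgnoringBad and generated are hypothetical names, and the bad value is modelled as a sentinel compared with ==, so NaN would not work as the sentinel here.

```java
import java.util.function.LongToDoubleFunction;

// Hypothetical sketch of chunked, bad-value-aware processing.
class ChunkedMeanSketch {
    // Assumed pixel source: copies up to buffer.length values starting at
    // offset into buffer and returns how many values were copied.
    interface PixelSource {
        long size();
        int read(long offset, double[] buffer);
    }

    // Mean of all good pixels, holding only chunkSize values in memory.
    // Returns badValue itself when no good pixel exists.
    static double meanIgnoringBad(PixelSource src, double badValue, int chunkSize) {
        double[] buffer = new double[chunkSize];
        double sum = 0.0;
        long good = 0;
        long offset = 0;
        while (offset < src.size()) {
            int n = src.read(offset, buffer);
            if (n <= 0) {
                break; // defensive: source returned no data
            }
            for (int i = 0; i < n; i++) {
                if (buffer[i] != badValue) {
                    sum += buffer[i];
                    good++;
                }
            }
            offset += n;
        }
        return good == 0 ? badValue : sum / good;
    }

    // A source whose pixels are computed on demand, standing in for a large
    // file or remote resource that is never fully loaded.
    static PixelSource generated(long size, LongToDoubleFunction f) {
        return new PixelSource() {
            public long size() {
                return size;
            }

            public int read(long offset, double[] buffer) {
                int n = (int) Math.min(buffer.length, size - offset);
                for (int i = 0; i < n; i++) {
                    buffer[i] = f.applyAsDouble(offset + i);
                }
                return n;
            }
        };
    }
}
```

Memory use depends on chunkSize only, not on the array length, and the algorithm sees the same read interface regardless of where the pixels are stored.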

16
NDX UML

17
Treeview: native view

18
View as HDX/NDX

19
Summary
  • Work in progress
  • There is a Java implementation; see the Starlink
    demo
  • Will be changed to support developing VO
    standards
  • Extensible
  • Supplements FITS and VOTable; does not aim to
    replace them