Title: Mike Folk


1
HDF Update
  • Mike Folk
  • National Center for Supercomputing Applications
  • HDF and HDF-EOS Workshop VI
  • December 4-5, 2002

2
Topics
  • Who is supporting HDF
  • HDF software in 2002
  • Other activities of interest

3
Who is supporting HDF?
  • NASA/ESDIS
  • Earth science applications, instrument data
  • DOE/ASCI (Accelerated Strategic Computing Initiative)
  • Simulations on massively parallel machines
  • NCSA/NSF/State of Illinois
  • HPC and Grid data intensive apps, Visualization,
    user support
  • Atmospheric and ocean modeling environments
  • DOE Scientific Data Analysis Computation
    Program
  • High performance I/O R&D
  • National Archives and Records Administration
  • Small grant to consider HDF5 as an archive format

4
HDF software in 2002
  • Library releases
  • Java Products
  • Tools
  • Compression
  • Investigations of Web technologies

5
HDF4 library
  • No releases in 2002.
  • Release 1.6 planned for May, 2003
  • Bug fixes
  • New compilers
  • Intel
  • Portland Group
  • New OS
  • Mac OS X
  • AIX 5.1 64-bit

6
HDF5 software milestones in 2002
[Timeline chart: HDF5 milestones by quarter, Q1 02 to Q4 02, for the base library, high level library, Java products, and other tools]
7
HDF5 library in 2002
  • Compilers, configuration, etc.
  • h5cc script to simplify compilation of HDF5
    programs
  • F90 shared library and C++ supported on Windows
  • Intel C, F90 and C++ on Linux, IA32/64 and
    Windows
  • Support for zlib 1.1.4
  • Performance
  • Added library performance tests
  • Performance improvements
  • Hyperslabs, data conversions, chunking
  • Fewer and larger I/O requests when accessing a
    file
  • Parallel I/O performance improvements

8
Parallel HDF5
  • Parallel I/O performance benchmark suite
  • Compares raw I/O, MPI-I/O, and HDF5 I/O
  • Distributed with HDF5
  • http://hdf/RFC/PIO_Perf/PHDF5_performance.html
  • Parallel HDF5 tutorial
  • http://hdf.ncsa.uiuc.edu/HDF5/doc/Tutor/
  • Flexible parallel HDF5 programming model
  • More flexible model for parallel HDF5
  • Performance studies and tuning activities

9
Next major release -- HDF5 1.6
  • Release date: Spring 2003
  • New format and library features include
  • Compression enhancements, including szip
  • Generic Properties
  • Checksum
  • Dimension scale support (tentative)
  • Performance improvements include
  • Chunking compression
  • Parallel I/O performance benchmark suite

10
Next major release -- HDF5 1.6
  • Flexible parallel HDF5
  • Special platforms
  • Large Compaq cluster (Pittsburgh SC)
  • Crays
  • Windows XP
  • Mac
  • Several new compilers (e.g. Intel, Portland
    Group)
  • Documentation
  • New User's Guide: good draft, first version

11
High level APIs
  • Make HDF5 easier to use
  • More operations per call than the normal HDF5 API
  • Encourage standard ways to store objects
  • Enforce standard representation of objects in HDF5

12
High level APIs
  • Lite: done
  • Same as HDF5, but simpler
  • Image: done
  • Interprets dataset as image/palette
  • 2-D raster data like HDF4 raster images
  • Table: partly done
  • Interprets dataset as "tables": collections of
    records
  • Insert, delete records or fields
  • Future: sort and search
  • Dimension scale: in the works
  • Unstructured grids: in the works
  • http://hdf.ncsa.uiuc.edu/HDF5/hdf5_hl/doc/

13
HDF5 tools activities
14
HDF Java Products 2002
  • Goal: replace older tools with single
    viewer/editor
  • HDF Java Products
  • Java HDF Interface (JHI) to access the HDF4
    library.
  • Java HDF5 Interface (JHI5) to access the HDF5
    library.
  • New hdf-object package understands HDF4 and
    HDF5.
  • HDFView tool for browsing/editing HDF4 and HDF5
  • See demo, brochure, CD, web page
  • http://hdf.ncsa.uiuc.edu/hdf-java-html/

15
HDFView releases in 2002
  • Version 1.0: Browser for both HDF4 and HDF5
  • Version 1.1: Editor for both HDF4 and HDF5
  • Version 1.2: All features of old Java tools, plus
    some new features
  • HDFView can do as much as JHV and H5View and also
    includes many new editing features
  • http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/
16
H4toH5 Conversion Toolkit
  • Goal: support transition from HDF4 to HDF5
  • Version 1.0 released in July 2002
  • Includes
  • h4toh5 converter
  • h5toh4 converter
  • library of functions for converting HDF4 objects
    into HDF5 objects
  • Download from
  • http://hdf.ncsa.uiuc.edu/h4toh5/libh4toh5.html
  • Mapping specification and FAQ
  • http://hdf.ncsa.uiuc.edu/HDF5/doc/ADGuide/H4toH5Mapping.pdf

17
Other tools work
  • H5import - convert flat files to HDF5 datasets
  • ASCII text file with numeric data (float or
    integer)
  • Binary file with native floating point data
  • Binary file with native integer data
  • hdf4import: souped-up version of the old fptohdf
  • Available in hdf4r1.6
  • HDF5-to-GIF and GIF-to-HDF5 converters
  • H5dump improvements
  • Subsetting
  • Support variable length datatypes including
    strings

18
Other tools work
  • H5diff
  • compare the structure and contents of two HDF5
    files, and report differences
  • Command line utility like Unix diff and older
    hdiff
  • Report missing objects, inconsistent size,
    datatype, etc.
  • Compare values of numeric datasets
  • First beta available January 2003
  • RFC: http://hdf.ncsa.uiuc.edu/RFC/H5diff/h5diff.html
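
The kind of comparison listed above (missing objects, inconsistent size or datatype, differing numeric values) can be sketched as follows. This is an illustration only, not h5diff itself: each "file" is modeled as a plain Python dictionary mapping an object name to a (datatype name, values) pair, and the diff_files function, its report wording, and the sample object names are all hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

# Hypothetical stand-in for a file: object name -> (datatype name, values).
FileModel = Dict[str, Tuple[str, Sequence[float]]]


def diff_files(a: FileModel, b: FileModel, tolerance: float = 0.0) -> List[str]:
    """Return human-readable differences between two file models."""
    report: List[str] = []
    for name in sorted(set(a) | set(b)):
        # Structure check: the object must exist in both files.
        if name not in a:
            report.append(f"{name}: missing from first file")
            continue
        if name not in b:
            report.append(f"{name}: missing from second file")
            continue
        (type_a, data_a), (type_b, data_b) = a[name], b[name]
        # Datatype and size must agree before values are compared.
        if type_a != type_b:
            report.append(f"{name}: datatype {type_a} vs {type_b}")
            continue
        if len(data_a) != len(data_b):
            report.append(f"{name}: size {len(data_a)} vs {len(data_b)}")
            continue
        # Content check: count values that differ by more than the tolerance.
        changed = sum(1 for x, y in zip(data_a, data_b) if abs(x - y) > tolerance)
        if changed:
            report.append(f"{name}: {changed} differing value(s)")
    return report


first = {"/temp": ("float32", [1.0, 2.0, 3.0]), "/mask": ("int8", [0, 1])}
second = {"/temp": ("float32", [1.0, 2.5, 3.0]), "/lat": ("float32", [10.0])}
for line in diff_files(first, second):
    print(line)
```

The structure checks run before the value comparison, so a missing object or a datatype mismatch is reported once instead of producing a flood of value differences.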

19
Compression
  • Szip - fast compression method for EOS data
  • Expect to include in next releases of HDF4 and
    HDF5
  • Shuffling: reorder bytes before compressing
  • Can improve compression ratio
  • Performance study: BZIP2 vs. gzip compression
  • Study whether or not to support bzip2
    compression
  • Result: BZIP2 not significantly better than gzip
  • So not currently supported in the release
  • But BZIP2 can be used with HDF5
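
To illustrate why reordering bytes before compressing can help, here is a minimal sketch using only Python's standard library. It assumes one common shuffling scheme: the first byte of every element is grouped together, then the second byte, and so on. The shuffle and unshuffle functions and the sample data are illustrative assumptions, not the library's implementation.

```python
import struct
import zlib


def shuffle(data: bytes, itemsize: int) -> bytes:
    """Group byte 0 of every element, then byte 1, and so on."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    return b"".join(data[k::itemsize] for k in range(itemsize))


def unshuffle(data: bytes, itemsize: int) -> bytes:
    """Inverse of shuffle(): interleave the byte planes back into elements."""
    if len(data) % itemsize:
        raise ValueError("data length must be a multiple of itemsize")
    count = len(data) // itemsize
    out = bytearray(len(data))
    for k in range(itemsize):
        out[k::itemsize] = data[k * count:(k + 1) * count]
    return bytes(out)


# Illustrative data: 4096 slowly increasing 32-bit little-endian integers.
raw = struct.pack("<4096i", *range(4096))

plain_size = len(zlib.compress(raw))
shuffled_size = len(zlib.compress(shuffle(raw, 4)))
print("compressed without shuffle:", plain_size, "bytes")
print("compressed with shuffle:   ", shuffled_size, "bytes")
```

For slowly varying numeric data like this, the high-order bytes of neighboring elements are nearly identical, so grouping them produces long repetitive runs that a general-purpose compressor handles well. The benefit depends on the data; noisy values may see little or no gain.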

20
Investigations of Web technologies
21
HDF5 XML
  • Great interest in XML, interoperation of XML and
    binary formats
  • Results
  • HDF5 DTD
  • h5dump XML
  • H5View reads XML and writes HDF5
  • Studies, design notes, other info
  • http://hdf.ncsa.uiuc.edu/HDF5/XML/
  • Possible future activity
  • XML schema
  • Update tools
  • HDF4 schema, tools
  • Format translation via XSLT

22
XML, Java Server Pages, etc.
  • How to use HDF5 data in Web environment
  • Experiments with XML, Java Server Pages (JSP),
    etc.
  • JSP server
  • Access HDF5 files on Web server using Web
    browser, or Java applet, or Java application
  • Several variations demonstrated
  • Is not a product!
  • http://hdf.ncsa.uiuc.edu/HDF5/XML/

23
CORBA Experiments
  • HDF5 with CORBA on distributed systems
  • Prototype CORBA server to wrap HDF5 library and
    datasets (C)
  • Remote access via C, Java, Web
  • Might be valuable as replacement for Java Native
    Interface
  • Successful demonstration, but many open issues
  • Is not a product!
  • http://hdf.ncsa.uiuc.edu/HDF5/XML/JSPExperiments/index.html

24
Other Activities of Interest
25
NPOESS
  • National Polar-orbiting Operational Environmental
    Satellite System
  • Combine satellite systems of civil and defense
    programs
  • HDF5 to be used to distribute data to users
  • First implementation in 2006
  • Support the NPOESS Preparatory Program
  • Later full implementation by 2013
  • Converged system provides global coverage
  • http://www.ipo.noaa.gov

26
Neutron Research Community
  • Worldwide research community
  • England, France, Germany, Japan, Italy,
    Switzerland, Russia
  • US centers at Argonne, NIST, Los Alamos
  • Neutron and X-ray scattering experiments and
    simulations
  • Common software and formats to gather, share,
    archive, post-process data
  • NeXus data format
  • Enforces standardization of metadata and data
    structures
  • Based on HDF4 for many years
  • Now switching to HDF5
  • http://www.neutron.anl.gov/nexus/

27
National Archives and Records Administration
  • Pilot project for HDF5
  • Explore scientific data format requirements for
    long term archiving of electronic records
  • Identify record types for which HDF5 is suited

28
Atmospheric and Ocean Models
  • Modeling Environment for Atmospheric Discovery
    (MEAD)
  • HDF5 for high performance I/O for atmospheric and
    ocean modeling
  • Weather Research and Forecasting (WRF) model
  • Regional Ocean Modeling System (ROMS)
  • Coupling of WRF and ROMS
  • UAH ESML data mining also involved

29
HDF5 Mesh API prototype
  • Support for structured and unstructured mesh
    data
  • For applications such as computational fluid
    dynamics, finite element analysis, and
    visualization.
  • A higher-level API
  • Format
  • HDF5 groups and datasets to organize the data
  • Collaboration involving NCSA, CEI and others
  • Documentation still pretty sketchy, but see
  • ftp://ftp.ensight.com/pub/HDF_RW/hdf_rw.tgz
  • Discussion list in the works

30
HDF5 Wins 2002 R&D Magazine Award
The 100 products and processes that are the most
technologically significant and can change
people's lives for the better
http://www.ncsa.uiuc.edu/News/Access/Releases/020722.HDF5.html
31
Thank you!
Information Sources
  • HDF website
  • http://hdf.ncsa.uiuc.edu/
  • HDF5 Information Center
  • http://hdf.ncsa.uiuc.edu/HDF5/
  • HDF Helpdesk
  • hdfhelp@ncsa.uiuc.edu
  • HDF users mailing list
  • hdfnews@ncsa.uiuc.edu

32
Backup slides
33
HDF5 funding sources
34
HDF5 User Community
  • Worldwide use in government, academia, industry
  • How many users?
  • 450 organizations or individuals have filled in
    user form in the past year
  • There are many times this many anonymous users
  • And some organizations have thousands of users
    (e.g. the Earth Observing System)
  • Public applications
  • More than 25 publicly available applications
  • Four vendors so far
  • LabVIEW
  • IDL
  • EarthScan Network
  • HDF Explorer
  • Others in the works (e.g. Matlab)

35
Technical fields that use HDF5
  • Aerospace
  • Agricultural research
  • Air traffic control
  • Aircraft emissions database
  • Applied mathematics
  • Astrophysics
  • Astrophysics / supernovae
  • Atmospheric chemistry
  • Atmospheric physics
  • Bioengineering
  • CEM Simulation
  • Climatology / hydrology
  • Computational fluid dynamics
  • Computational physics
  • Computational physics / education
  • Computational physics and computational
    astrophysics
  • Computer modeling
  • Computer science
  • Data processing
  • Photonic band gap studies
  • Photonic crystals
  • Photonics
  • Post-fire erosion analysis
  • Protein crystallography, molecular modeling
  • Protostellar accretion discs
  • Remote sensing
  • SAR processing
  • Satellite / weather radar remote sensing
  • Satellite oceanography
  • Semiconductor process simulation
  • Software engineering, distributed systems
  • Space geodesy
  • Space physics
  • Surface water flow and sediment transport
  • Theoretical chemistry
  • Visualization
  • Volcanology
  • Water resources management
  • Environmental science
  • Fast searching, sorting and retrieval
  • Film making special effects
  • Fluid mechanics
  • GIS
  • Geodetic Science
  • Geology
  • Gravitational physics
  • Hydrology
  • Information technology
  • Magnetic mass spectrometer development
  • Marine biology / ecology
  • Materials science
  • Meteorological data products
  • Meteorology
  • Microscopy
  • Molecular biology
  • Nano device simulation
  • Neutron scattering

36
Users of HDF5: 66 countries
37
Next major release -- HDF5 1.6
38
Next major release -- HDF5 1.6
  • Performance improvements
  • Chunking
  • Compression (several)
  • Parallel I/O
  • Metadata I/O
  • Compact dataset storage
  • Other parallel
  • Parallel I/O performance benchmark suite
  • Flexible parallel HDF5
  • Portland Group C, Fortran 90 and C++ compilers
  • Quite a bit of Fortran work

39
Next major release -- HDF5 1.6
  • Testing (several)
  • Special platforms
  • PSC cluster
  • Cray
  • Windows XP
  • Mac
  • Several new compilers (e.g. Intel, Portland
    Group)
  • Documentation
  • New User's Guide: good draft, first version

40
HDF5 High Level APIs: HDF5 Image
  • For datasets to be interpreted as images/palettes
  • 2-D raster data like HDF4 raster images
  • Image operations
  • Create, write, read, query
  • Based on HDF5 Image and Palette Specification
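
As a rough picture of what interpreting a dataset as an image with a palette means, the sketch below treats a 2-D array of 8-bit values as indices into a color table. It is a plain-Python illustration, not the HDF5 Image API; the apply_palette function, the grayscale palette, and the tiny sample raster are assumptions made for the example.

```python
from typing import List, Sequence, Tuple

RGB = Tuple[int, int, int]


def apply_palette(indices: Sequence[Sequence[int]],
                  palette: Sequence[RGB]) -> List[List[RGB]]:
    """Map each pixel index of a 2-D raster to its palette color."""
    return [[palette[i] for i in row] for row in indices]


# Hypothetical 256-entry grayscale palette and a 2x3 indexed image.
gray_palette = [(v, v, v) for v in range(256)]
image = [[0, 128, 255],
         [64, 64, 0]]

rgb = apply_palette(image, gray_palette)
print(rgb[0][1])  # color of the pixel in row 0, column 1
```

Storing indices plus a palette keeps the raster itself small (one byte per pixel) and lets the same data be redisplayed with a different color table without rewriting the image.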

41
HDF5 High Level APIs: HDF5 Table
  • For datasets to be interpreted as tables
  • A collection of records
  • All records have the same structure
  • Like Vdatas in HDF4, but more operations
  • Table operations
  • Create, write, read, query
  • Insert, delete records or fields
  • Future: sort and search
  • Includes the following new Table functions

42
HDF5 High Level APIs: HDF5 Table
  • For datasets to be interpreted as tables
  • A collection of records
  • All records have the same structure
  • Like Vdatas in HDF4, but more operations
  • Table operations
  • Create, write, read, query
  • Insert, delete records or fields
  • Later: sort and search
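
The table idea above (a collection of records that all share one structure, with insert and delete operations) can be sketched with fixed-size packed records. This does not use or mirror the HDF5 Table API; the RecordTable class, its method names, and the "station"/"pressure" fields are hypothetical and chosen only for illustration.

```python
import struct
from typing import Tuple


class RecordTable:
    """Fixed-structure records packed back to back in one byte buffer."""

    def __init__(self, field_names: Tuple[str, ...], layout: str) -> None:
        self.field_names = field_names
        self.layout = struct.Struct(layout)  # same structure for every record
        self.buffer = bytearray()

    def __len__(self) -> int:
        return len(self.buffer) // self.layout.size

    def insert(self, index: int, record: tuple) -> None:
        """Insert one record before position index (append if index == len)."""
        offset = index * self.layout.size
        self.buffer[offset:offset] = self.layout.pack(*record)

    def delete(self, index: int, count: int = 1) -> None:
        """Delete count records starting at position index."""
        start = index * self.layout.size
        del self.buffer[start:start + count * self.layout.size]

    def read(self, index: int) -> tuple:
        """Unpack and return the record at position index."""
        if not 0 <= index < len(self):
            raise IndexError("record index out of range")
        return self.layout.unpack_from(self.buffer, index * self.layout.size)


# Each record: a 32-bit integer station id and a 64-bit float pressure.
table = RecordTable(("station", "pressure"), "<id")
table.insert(0, (1, 1013.25))
table.insert(1, (3, 998.5))
table.insert(1, (2, 1005.0))   # lands between the two existing records
table.delete(0)                # drop the first record
print(len(table), table.read(0), table.read(1))
```

Because every record has the same size, the byte offset of record n is simply n times the record size, which is what makes insert and delete by position straightforward.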

43
HDF5 High Level API Future
  • Dimension scales
  • Similar to HDF4
  • In progress
  • More table operations
  • sort and search
  • Unstructured grids
  • E.g. triangle mesh

44
Szip Compression Software
  • Implements CCSDS lossless compression algorithm
  • Fast compression method for EOS data
  • Expect to include in next releases of HDF4 and
    HDF5
  • HDF4: compress SDS and image
  • HDF5: compress datasets
  • Intellectual property issues
  • Owned by U of Idaho (formerly U of New Mexico)
  • Open source
  • No commercial use of encoder without license
  • Decoder free for everyone

45
Performance study BZIP2 compression
  • Goal: decide whether or not to support bzip2
    compression
  • Compared bzip2 and gzip
  • Observations
  • Bzip2 always better than gzip in compression
    ratio
  • But the difference was just a few percentage
    points
  • And bzip2 always takes more processing time,
    especially for decoding
  • Result
  • Not currently supported in the release
  • But BZIP2 can be used with HDF5 (checked with
    HDF5-1.4.4)
  • http://hdf.ncsa.uiuc.edu/HDF5/papers/bzip2/
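
A comparison of this kind can be approximated with a short script. This is a minimal sketch, not the study's actual method: it uses Python's standard zlib and bz2 modules (zlib implements the same deflate algorithm that gzip uses), and the synthetic sine-wave payload and the measure helper are assumptions made for illustration. Results vary with the data and the machine, so no particular outcome is implied.

```python
import bz2
import math
import struct
import time
import zlib
from typing import Callable, Dict, Union


def measure(name: str,
            compress: Callable[[bytes], bytes],
            decompress: Callable[[bytes], bytes],
            payload: bytes) -> Dict[str, Union[str, float]]:
    """Time one encode/decode round trip and report the compression ratio."""
    t0 = time.perf_counter()
    packed = compress(payload)
    t1 = time.perf_counter()
    restored = decompress(packed)
    t2 = time.perf_counter()
    if restored != payload:
        raise RuntimeError(f"{name}: round trip did not restore the payload")
    return {
        "codec": name,
        "ratio": len(payload) / len(packed),
        "encode_s": t1 - t0,
        "decode_s": t2 - t1,
    }


# Synthetic payload: 20000 single-precision samples of a sine wave.
samples = [math.sin(i / 50.0) for i in range(20000)]
payload = struct.pack("<20000f", *samples)

results = [
    measure("gzip (zlib deflate)", zlib.compress, zlib.decompress, payload),
    measure("bzip2", bz2.compress, bz2.decompress, payload),
]
for r in results:
    print(f"{r['codec']:<20} ratio={r['ratio']:.2f} "
          f"encode={r['encode_s']:.4f}s decode={r['decode_s']:.4f}s")
```

Reporting encode and decode times separately matters here because the observations above single out decoding time as a cost of bzip2.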

46
New HDFView features
  • Display palette in graph as separate RGB lines.
  • Open file as read-only option
  • Create new array from old array
  • Import data from text file
  • Save to HDF4, HDF5 or binary
  • Create new image from subset of existing image
  • Modify string-type dataset content
  • Convert jpeg to HDF image
  • Convert HDF to jpeg image
  • More user options and well organized GUI
  • Select vdata or compound datatype by field
  • Select subset from preview image using mouse
  • Support unlimited dimension when creating new
    HDF4 dataset.
  • Enable application of simple math calculations to
    data
  • Support multiple palettes/image
  • Create new image with default attributes
  • Modify image palette or select predefined palette

47
CORBA, XML etc. permutations
[Diagram: client/remote and server/local components connected over the Net, plus distributed configurations. Legend: product; demonstrated in research; should work, but not demonstrated]
48
National Polar-orbiting Operational Environmental
Satellite System (NPOESS)
U.S. civil and defense programs to combine
weather data collection, expanding to global
coverage and long-term continuity of observations
at less cost!
[Diagram: POES, DMSP, METOP, and NPOESS orbits plotted by local equatorial crossing time (0530, 0730, 0830, 0930, 1330)]
  • Today
  • 4-Orbit System
  • 2 US Military
  • 2 US Civilian

Distribute in HDF5