1
HDF Update
  • Mike Folk
  • National Center for Supercomputing Applications
  • HDF and HDF-EOS Workshop VIII
  • October 27, 2004

2
Topics
  • HDF Team and Supporters
  • HDF software update
  • Other Activities of Interest

3
The HDF Team
Xuan Bai, Frank Baker, Peter Cao, Vailin Choi, Mike Folk,
Barbara Jones, Quincey Koziol, James Laird, Raymond Lu,
John Mainzer, Robert McGrath, Pedro Nunes, Elena Pourmal,
Binh-minh Ribler, Eric Shapiro, Rishi Sinha, Kent Yang
And all those wonderful folks out there who
contribute ideas, requests, bug reports, code,
and support.
4
Organization
HDF Project
Basic library development
Support, doc, QA, maintenance
Tools and Java
Parallel I/O, Grid, big machines
  • Staff breakdown
  • User support, documentation
  • QA, maintenance, testing
  • Software development
  • System administration
  • Management
  • See Thursday tutorial on HDF Software Process

5
Who is supporting HDF?
  • Organizations and communities with institutional
    and financial commitment to HDF
  • NCSA, NASA, State of IL, DOE, Boeing
  • Agencies supporting R&D
  • NCSA, NASA, NARA, DOE, NSF, ONR
  • Collaborators who make in-kind contributions
  • Cactus, PyTables, NeXUS, CGNS, many others

6
HDF Software Update
7
HDF software milestones in FY 2004
  • HDF 4.2r0
  • HDF5 1.6.2
  • HDF5 Java 2.0
  • HDF5 High Level
  • Flexible parallel HDF5 (Alpha)
  • HDF5 1.6.3
[Timeline figure: milestones plotted by month, Dec 2003 through Oct 2004]
8
HDF4.2 Release 0 Dec. 2003
  • Bug fixes
  • New features
  • Support for new platforms and compilers

9
HDF4.2 Release 0: Bug fixes
  • Support for reading NetCDF 3.5 files with
    multiple unlimited dimensions
  • Multiple bug fixes and improvements to HDF4
    dumper utility hdp
  • Improvements to HDF-to-GIF converter hdf2gif

10
HDF4.2r0: New Features
  • Tools (per DAAC and Instrument Team requests)
  • hdfimport: converts float/integer data to
    SDS/raster
  • Replaces fp2hdf
  • hdiff: compares two HDF4 files
  • Revision of earlier hdfdiff tool
  • hrepack: makes a copy of an HDF4 file
  • optionally rewrite objects with compression,
    chunking, etc.
  • h4cc, h4fc, h4redeploy
  • Helper scripts to facilitate compilation and
    installation

11
HDF4.2r0: New Features
  • Szip compression
  • Fast compression method
  • Available on all platforms except Crays
  • NCSA distributes Szip source and binaries
  • HDF Library binaries come with SZIP enabled
  • SZIP Documentation available from
    http://hdf.ncsa.uiuc.edu/SZIP

12
HDF4.2r0: New Configuration
  • Addressing key needs
  • Porting to new platforms
  • New versions of JPEG and ZLIB libraries
  • Optional SZIP compression
  • Many features were hard coded, but could be done
    at configuration time

13
HDF4.2r0: New Compilers and Platforms
  • New compilers
  • Intel C and Fortran
  • Portland Group Compilers (C only for now)
  • New OS
  • Mac OSX
  • RedHat 8/9
  • AIX 5.1 64-bit
  • OSF1
  • Linux 64 (SuSE and RH8) (JPL machines)
  • Altix (Aura Team)

14
HDF5 1.6.2 Feb. 2004
  • New functions
  • better user control over open/close objects
  • Bug fixes
  • Parallel improvements
  • h5pcc, h5pfc helper scripts for parallel compiles
  • Configure improvements
  • Improved parallel performance
  • Speed improvements of data conversion routines
  • Some SZIP improvements

15
HDF5 1.6.2
  • Support for new compilers and platforms
  • IBM Fortran on MacOS X
  • Support for gcc 3.3.4
  • Linux 64 (SuSE and RH) at JPL
  • Altix (Aura team) including parallel C and
    Fortran Libraries
  • Investigated SX-6 (NEC) port

16
HDF5 1.6.3 Oct. 2004
  • Windows
  • Improvements to the build, test, and installation
  • New API routines
  • H5Fget_filesize. Returns size of opened file.
  • New H5Fget_name. Returns name of file by object
    ID
  • Some F90 and C routines added

17
HDF5 1.6.3
  • Utilities
  • h5repack utility (new)
  • Regenerates an HDF5 file from another HDF5 file
  • Optionally applies filters, chunking to new file
  • h5dump utility improvements
  • Print new info, such as dataset filters, storage
    layout, fill value info

18
Szip in HDF5 1.6.3
  • HDF5 can now include SZIP compression with or
    without Szip's encoder
  • Required to create SZIP compressed files
  • Not required to read SZIP compressed files
  • Info on Szip and Szip licensing
  • http://hdf.ncsa.uiuc.edu/doc_resource/SZIP/

19
HDF5 1.6.3: New platforms and compilers
  • PGI Fortran for Linux64 (x86-64)
  • Absoft F95 for Linux 2.4 (32-bit)
  • IBM XL Fortran and Absoft F95 for Mac OS X

20
HDF Java Products 2.0 March 2004
  • Tested with HDF5-1.6.2
  • Platforms
  • Windows (98/NT/2000/XP)
  • Solaris
  • Linux
  • AIX
  • IRIX 6.5
  • Mac OSX
  • OSF1
  • http://hdf.ncsa.uiuc.edu/hdf-java-html/

21
Modular HDFView
Modular HDFView: an improved HDFView in which I/O
and GUI components are replaceable modules.
Application (HDFView)
  • Replaceable modules
  • File I/O (file/data format)
  • Tree view (show file structure)
  • Table view (spreadsheet-like)
  • Text view (view/edit text dataset)
  • Image view (view/process image)
  • Palette view (view/change palette)
  • Metadata (attribute) view
  • http://hdf.ncsa.uiuc.edu/hdf-java-html/hdfview/

Interfaces: I/O, TreeView, TableView, etc.
Default Implementation
User Implementation
22
HDFView Web Browser Plug-in
  • Goal: click-and-view HDF files remotely and
    locally from popular web browsers.
  • See poster.

23
Parallel HDF5 in 2004
  • A few performance improvements
  • MPICH/MPE instrumentation feature added
  • Gives users performance analysis tools for their
    MPI programs
  • Flexible parallel HDF5 programming model
  • More flexible model for parallel HDF5
  • Other options currently under investigation

24
Parallel HDF5 developments
  • New parallel platforms supported
  • Solaris 2.8 (32 and 64 bits)
  • OSF 5.1
  • Cray T3E, SV1, T90
  • HPUX 11.0
  • FreeBSD

25
Flexible Parallel HDF5 (FPHDF5)
  • Problem
  • Parallel computation requires a consistent view
    of file metadata across all processes
  • Parallel HDF5 does this by requiring all
    operations that modify metadata to be executed
    collectively 
  • This is clumsy at best.
  • E.g. suppose each of 1,000 processes needs to
    create its own dataset. 
  • Then there must be 1,000 collective creations --
    each requiring the participation of all processes.
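
As a rough illustration of the cost described on this slide, the Python sketch below counts how many per-process participations the collective model implies. It is only a counting model, assuming every process joins every collective creation. The function name is hypothetical and nothing here calls HDF5.

```python
def collective_creation_cost(num_processes: int, datasets_per_process: int = 1):
    """Count the work implied when every dataset creation is collective.

    Returns (collective_calls, total_participations). Hypothetical helper
    for illustration only; it models the slide's arithmetic, not HDF5 itself.
    """
    if num_processes < 1 or datasets_per_process < 0:
        raise ValueError("need at least one process and a non-negative dataset count")
    # One collective call per dataset that any process wants to create.
    collective_calls = num_processes * datasets_per_process
    # Every collective call requires all processes to take part.
    total_participations = collective_calls * num_processes
    return collective_calls, total_participations


calls, participations = collective_creation_cost(1000)
print(calls, participations)  # 1000 1000000
```

Under this model, 1,000 processes that each create one dataset produce 1,000 collective calls and one million individual participations, so the cost grows with the square of the process count.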

26
Flexible Parallel HDF5 (FPHDF5)
  • Approach
  • Allow individual processes to modify the file
    metadata without explicit, application level
    synchronization between processes.
  • Use a "Set Aside Process" (SAP) to set up a shared
    metadata cache, allowing individual processes to
    read or write lock individual pieces of metadata
    as required.
  • Easier to program, simpler to understand.
  • http://hdf.ncsa.uiuc.edu/Parallel_HDF/PHDF5/FPH5/
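
The idea of a single process that owns the metadata cache and hands out read and write locks can be sketched in Python as follows. This is a toy, single-threaded model and not the FPHDF5 implementation: the class, its method names, and the lock rules (many readers or one writer per metadata item) are assumptions made for illustration.

```python
class SetAsideProcess:
    """Toy stand-in for the one process that owns the shared metadata cache."""

    def __init__(self):
        self.metadata = {}        # metadata item name -> stored value
        self.readers = {}         # item name -> set of ranks holding a read lock
        self.writers = {}         # item name -> rank holding the write lock
        self.requests_served = 0  # every lock/unlock request passes through here

    def lock(self, rank, key, mode):
        """Try to lock one metadata item; return True if the lock is granted."""
        if mode not in ("read", "write"):
            raise ValueError("mode must be 'read' or 'write'")
        self.requests_served += 1
        writer = self.writers.get(key)
        if writer is not None and writer != rank:
            return False  # another rank is writing this item
        if mode == "read":
            self.readers.setdefault(key, set()).add(rank)
            return True
        if self.readers.get(key, set()) - {rank}:
            return False  # other ranks are still reading this item
        self.writers[key] = rank
        return True

    def unlock(self, rank, key):
        """Release whatever lock this rank holds on the item."""
        self.requests_served += 1
        self.readers.get(key, set()).discard(rank)
        if self.writers.get(key) == rank:
            del self.writers[key]

    def create_dataset(self, rank, name):
        """One rank creates its own dataset entry without involving other ranks."""
        if not self.lock(rank, name, "write"):
            return False
        self.metadata[name] = {"created_by": rank}
        self.unlock(rank, name)
        return True


sap = SetAsideProcess()
created = [sap.create_dataset(rank, f"dataset_{rank}") for rank in range(4)]
print(created, sap.requests_served)  # [True, True, True, True] 8
```

Each rank modifies metadata independently, with no collective call, but every request is counted against the same object.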

27
Flexible Parallel HDF5 (FPHDF5)
  • New problem
  • The cache is managed by a single process (SAP),
    and the metadata is accessed frequently.
  • The SAP becomes a bottleneck, affecting
    performance and scalability.
  • Currently investigating other solutions.

28
Other Activities of Interest
29
DOE/ASCI
ASCI provides the integrating simulation and
modeling capabilities and technologies needed
for future design assessment and certification
of nuclear weapons and their components
  • Massively parallel computing and I/O
  • Complex data models and big data
  • HDF5 a standard format for ASCI apps

Advanced Simulation and Computing Program
30
Boeing: HDF5 for real-time flight test data
  • Needed for flight test data systems
  • Must handle raw, real-time data
  • Implemented API to read/write data
  • Based on HDF5 table API
  • Challenge: variable-length data
  • Possible Boeing-wide standard
  • Potential applications to many domains
  • See poster

31
NCASSR: Indexing and viewing tables
  • Opportunities arising from Boeing work
  • Make test-data features widely available
  • Common data model and API for tabular data in
    HDF5
  • Indexing for post-processing
  • Viewing capabilities
  • Tasks
  • Identify apps to study and gather requirements
  • Develop data model and API for tabular data
  • Include general purpose indexing structures and
    API
  • Implement prototype API and viewer

National Center for Advanced Secure Systems
Research
32
National Archives and Records Administration
(NARA)
  • Investigate HDF5 as format for records archiving
  • Focus on geospatial data
  • Images (e.g. elevation models, aerial
    photography)
  • Features (e.g. boundaries, roads, rivers)
  • Results so far
  • HDF5 data model handles all data types
  • Feature (vector) data present access and size
    challenges
  • Work is leading to good performance lessons
  • See poster about study of vector data

33
SciDAC/PMODEL: Arithmetic Data Transform
  • Apply algebraic operations to dataset during
    read/write.
  • Initial goal
  • transform individual elements (e.g., x * 1.8 +
    32).
  • During reads, the transform is applied to the
    result in memory. During writes, the data stored
    in the file is changed.
  • Implemented in HDF5 v1.7, to be released in v1.8
  • Future
  • Transformations on attributes or multiple
    datasets (e.g. (A + B) / 2.0)
  • http://hdf.ncsa.uiuc.edu/PMODELS/datatransform/
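
The read and write behavior of an element-wise transform can be sketched in plain Python. This is a minimal sketch with an in-memory list standing in for a dataset. The class and function names are hypothetical and this is not the HDF5 API. The conversion x * 1.8 + 32 (Celsius to Fahrenheit) is used as the sample transform.

```python
class TransformingStore:
    """Hypothetical in-memory stand-in for a dataset (not the HDF5 API)."""

    def __init__(self, values):
        self._stored = list(values)  # plays the role of the data in the file

    def read(self, transform=None):
        """Return a copy of the data; the transform affects only the copy."""
        if transform is None:
            return list(self._stored)
        return [transform(x) for x in self._stored]

    def write(self, values, transform=None):
        """Store the data; the transform changes what is actually stored."""
        if transform is None:
            self._stored = list(values)
        else:
            self._stored = [transform(x) for x in values]


def celsius_to_fahrenheit(x):
    return x * 1.8 + 32


store = TransformingStore([0.0, 100.0])

# Transform on read: the caller sees Fahrenheit, the stored data stays Celsius.
fahrenheit = store.read(transform=celsius_to_fahrenheit)
unchanged = store.read()

# Transform on write: the stored data itself is now Fahrenheit.
store.write([0.0, 100.0], transform=celsius_to_fahrenheit)
stored_after_write = store.read()
```

Passing the transform as a function keeps the sketch short. A library would more likely accept an expression string, which is a design choice this example does not try to reproduce.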

34
Weather Research Forecast (WRF) Model
  • WRF: NCAR community standard model
  • HDF5 I/O module for NCAR's WRF
  • HDF5-WRF parallel I/O studies
  • Improved performance for computations with large
    I/O
  • Sequential HDF5-WRF studies
  • Compression can save disk space
  • See the poster
  • And see http://hdf.ncsa.uiuc.edu/apps/WRF-ROMS

35
netCDF-HDF Project
  • Enhanced NetCDF-4 Interface to HDF5
  • Combine features of netCDF and HDF5
  • Take advantage of their separate strengths
  • Collaboration between NCSA and Unidata
  • See poster: "Merging the netCDF and HDF5
    libraries to achieve gains in performance and
    interoperability"


36
OPeNDAP, netCDF, and HDF5
  • OPeNDAP
  • A system for transmitting data across the
    Internet
  • Supports selection of data using constraint
    expressions
  • Can translate data from one format to another
  • NetCDF and HDF5
  • Formats of major interest to the OPeNDAP
    community
  • All three are in heavy use in the earth sciences
  • So the question is

37
Are the planets finally aligned?
To harmonize OPeNDAP, netCDF, and HDF5?
38
OPeNDAP/netCDF/HDF5 Harmonization
  • Opportunity
  • Unidata is creating netCDF-4
  • Existing OPeNDAP work with netCDF and HDF5
  • OPeNDAP project working on a new spec (4.0) 
  • John Caron working on new java-netCDF library
    (2.2)
  • Creates a "common data model" which is
    more-or-less a union of the 3 models.
  • But there are important differences
  • Different ecological niche
  • Some very different object types
  • So a union of all the models is unlikely 

39
OPeNDAP/netCDF/HDF5 Harmonization
  • Goal: map between the three models, and possibly
    tweak the models so that they harmonize better.
  • Tackle certain important differences
  • OPeNDAP Sequences
  • Hard to represent in the netCDF API
  • But seems like they might work in HDF5.
  • HDF5 attributes
  • Hard to represent in the DAP.
  • Also perhaps devise a formal mapping between the
    three models

40
Thank you
Acknowledgements: This report is based upon work
supported in part by a Cooperative Agreement with
NASA under NASA grant NAG 5-2040 and NAG
NCCS-599. Any opinions, findings, and
conclusions or recommendations expressed in this
material are those of the author(s) and do not
necessarily reflect the views of the National
Aeronautics and Space Administration. Other
support provided by NCSA and other sponsors and
agencies (http://hdf.ncsa.uiuc.edu/acknowledge.html).
Made on location in Champaign, Illinois. To
the best of our knowledge, no animals were abused
in the making of these slides.
41
Questions/comments?
42
Information Sources
  • HDF website
  • http://hdf.ncsa.uiuc.edu/
  • HDF5 Information Center
  • http://hdf.ncsa.uiuc.edu/HDF5/
  • HDF Helpdesk
  • hdfhelp@ncsa.uiuc.edu
  • HDF users mailing list
  • hdfnews@ncsa.uiuc.edu