Title: HDF5 Backward and Forward Compatibility issues or What do we promise to our users?
1HDF5 Backward and Forward Compatibility issues or
What do we promise to our users?
2- Forward compatibility is the ability of a system to accept input intended for later versions of itself.
- In technology, a product is said to be backward compatible when it is able to take the place of an older product, by interoperating with other products that were designed for the older product.
- Wikipedia
3Outline
- Introduction
- HDF5 library versioning
- HDF5 file format versioning
- Forward compatibility
- Backward compatibility
- What to expect with 1.8.0?
4Introduction
- Why new versions of the library and file format?
- Bug fixes
- Performance improvements
- New features
- All of the above may require
- File format changes
- API changes
- New APIs
- Public structure changes
5Introduction
- To upgrade or not to upgrade?
- Did they finally fix a bug I reported 5 years ago?
- Will I be able to read my old files with the new library?
- Do I need to re-link my application?
- Will IDL that we bought 3 years ago work with files created by the new library?
- My colleagues and I use different versions of HDF5 libraries. Can we modify and access each other's files?
6Introduction
- This talk is about what HDF5 users should expect when moving from one version of the HDF5 libraries to another.
- Information
- Backward and forward compatibility issues
- http://hdfgroup.org/HDF5/faq/bkfwd-compat.html
- API changes from release to release
- http://hdfgroup.org/HDF5/doc_1.8pre/doc/ADGuide/Changes.html
- File Format changes
- http://hdfgroup.org/HDF5/doc/H5.format.html
7HDF5 Library Versioning
- HDF5 version number has the form X.Y.Z(-suffix)
- X is called the major version number
- Y is called the minor version number (always an even number for a public release)
- Z is called the release number
- suffix is present in snapshots and release candidates (e.g. snap8, pre1)
- Examples
- Releases
- HDF5 1.6.5 and upcoming HDF5 1.8.0
- source tar file names: hdf5-1.6.5.tar and hdf5-1.8.0.tar
- Snapshots (source under development)
- HDF5 1.6.6-snap8 and HDF5 1.7.58
- source tar file names: hdf5-1.6.6-snap8.tar and hdf5-1.7.58.tar
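The numbering scheme on this slide can be sketched in a few lines of C. This is only an illustration using the standard library: the `hdf5_version` struct and both function names are hypothetical helpers and are not part of the HDF5 API. The "public release" test encodes the two rules stated above (even Y, no suffix) and nothing more.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper type for illustration; not part of the HDF5 API. */
typedef struct {
    int  major;       /* X */
    int  minor;       /* Y */
    int  release;     /* Z */
    char suffix[16];  /* e.g. "snap8" or "pre1"; empty for a plain X.Y.Z */
} hdf5_version;

/* Parse "X.Y.Z" or "X.Y.Z-suffix". Returns 1 on success, 0 on failure. */
int parse_hdf5_version(const char *s, hdf5_version *v)
{
    int consumed = 0;

    v->suffix[0] = '\0';
    if (sscanf(s, "%d.%d.%d%n", &v->major, &v->minor, &v->release, &consumed) != 3)
        return 0;
    if (s[consumed] == '\0')
        return 1;                      /* plain X.Y.Z */
    if (s[consumed] != '-' || s[consumed + 1] == '\0')
        return 0;                      /* trailing junk or empty suffix */
    if (strlen(s + consumed + 1) >= sizeof v->suffix)
        return 0;                      /* suffix too long for this sketch */
    strcpy(v->suffix, s + consumed + 1);
    return 1;
}

/* Per the slide: a public release has an even Y and carries no suffix. */
int is_public_release(const hdf5_version *v)
{
    return v->minor % 2 == 0 && v->suffix[0] == '\0';
}
```

With these helpers, "1.6.5" parses as a public release, while "1.6.6-snap8" (suffix present) and "1.7.58" (odd Y) do not.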
8HDF5 Library Versioning
- HDF5 release number Z in a public release X.Y.Z
- Incremented each time a new set of bug fixes and/or performance enhancements is made available to the public
- Upgrading/downgrading between different versions with the same X.Y and different Zs may cause bugs to disappear/appear
9HDF5 Library Versioning
- HDF5 release number Z in a public release X.Y.Z
- No file format change
- No changes to the existing APIs
- No change to public data structures
- New APIs may be added by popular demand, by demand of the funding agencies (NASA, ASC), or as a result of a bug fix.
- Existing applications should be able to re-compile with the newest version
10HDF5 Library Versioning
- HDF5 release number Z in a public release X.Y.Z
- Some exceptions for severe bugs
- Examples
- File format change
- File format changed between 1.6.0 and 1.6.X to support control of B-trees for indexing chunked datasets (ASC)
- 1.6.0 library couldn't read 1.6.X files when the feature was used; 1.6.X could read 1.6.0 files
- API change
- APIs changed between 1.6.3 and 1.6.4 to replace signed with unsigned types to improve library performance and code portability
- Behavior change
- Application will fail at run-time if compiled with versions greater than 1.6.5 (a rare file corruption issue was discovered when better error checking was added to the library)
- HDF Group provided a tool to fix corrupted files
11HDF5 Library Versioning
- HDF5 minor version number Y in a public release X.Y.Z
- Incremented each time a new set of features is introduced
- File format may change
- New APIs are added
- Old APIs may be removed or deprecated (will be removed in the next Y release)
- Public data structures may change (handled the same way as deprecated APIs)
12HDF5 Library Versioning
- HDF5 minor version number Y in a public release X.Y.Z
- Upgrading/downgrading between different versions with the same X may cause problems
- X.Y1 may not be able to read X.Y2 files
- Application written with X.Y2 features will not link with X.Y1 library
- Application written for X.Y1 may not link with X.Y2 library due to changed or removed APIs or due to a change in public data structures
13HDF5 Library Versioning
- HDF5 major version number X in a public release X.Y.Z
- Indicates major file format changes
- It is probably HDF6; let's talk about it in 10 years :-)
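The three levels of change described on the last few slides can be summarized as a small lookup. The sketch below is plain C with hypothetical names (`version_change`, `classify_change`, `describe_change`); the descriptions only restate the slides, including the caveat that Z releases have had rare exceptions.

```c
/* Hypothetical classification of a move between two X.Y.Z versions. */
typedef enum {
    SAME_VERSION,
    RELEASE_CHANGE,  /* same X.Y, different Z */
    MINOR_CHANGE,    /* same X, different Y   */
    MAJOR_CHANGE     /* different X           */
} version_change;

/* Compare the most significant component first: X, then Y, then Z. */
version_change classify_change(int x1, int y1, int z1,
                               int x2, int y2, int z2)
{
    if (x1 != x2) return MAJOR_CHANGE;
    if (y1 != y2) return MINOR_CHANGE;
    if (z1 != z2) return RELEASE_CHANGE;
    return SAME_VERSION;
}

/* What the slides say a user should expect for each kind of change. */
const char *describe_change(version_change c)
{
    switch (c) {
    case SAME_VERSION:
        return "same version";
    case RELEASE_CHANGE:
        return "bug fixes and performance work; file format, existing APIs "
               "and public structures normally unchanged (rare exceptions)";
    case MINOR_CHANGE:
        return "new features; file format may change, APIs may be added, "
               "deprecated or removed, public structures may change";
    case MAJOR_CHANGE:
        return "major file format changes";
    }
    return "unknown";
}
```

For example, moving from 1.6.4 to 1.6.5 classifies as `RELEASE_CHANGE`, and moving from 1.6.5 to 1.8.0 as `MINOR_CHANGE`.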
14HDF5 File Format Versioning
- There is no HDF5 file format version number
- Micro-versioning: each object and structure within an HDF5 file is versioned
- Updated File Format Specification is available with every public release
- There is no way to find what version of the library created or modified a particular file
- Why did we choose such an approach?
15HDF5 File Format Versioning
- Maximum file format compatibility principle
- By default, HDF5 files are written with the earliest version of the file format that can describe the information, rather than always using the latest version possible.
- Assures the best forward compatibility with older versions (objects in new files can be read with old libraries if that object is known to the old libraries)
16HDF5 File Format Versioning
- Maximum file format compatibility principle
- Example: Datatype header message
- Versions 0, 1 and 2
- Version 0 is used by the latest library for datatype messages in all situations where no array datatypes are used
- Version 1 (introduced in 1.4.0) is used by 1.6.5 and earlier versions of the library to encode compound datatypes with explicit array fields.
- Version 2 is used by 1.8.0 and later if requested by setting a special flag (latest file format); it helps to reduce overhead in describing complex datatypes
- By default 1.8.0 writes compound data compatible with 1.4.0 through 1.6.X libraries
- If the feature is requested, compound data created by 1.8.0 will not be readable by earlier versions
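The datatype-message example amounts to a small decision rule: pick the lowest message version that can still describe the data, unless the application explicitly asks for the latest format. The function below is a hypothetical sketch of that rule as stated on the slide; it is not taken from the HDF5 source, and its name and parameters are assumptions.

```c
#include <stdbool.h>

/*
 * Hypothetical sketch of the "maximum compatibility" rule for the
 * datatype header message, following the slide:
 *   - version 2 only when the latest file format was requested,
 *   - version 1 when the datatype has explicit array fields,
 *   - version 0 otherwise.
 */
int datatype_msg_version(bool has_array_fields, bool latest_format_requested)
{
    if (latest_format_requested)
        return 2;   /* opt-in: less overhead, not readable by older libraries */
    if (has_array_fields)
        return 1;   /* earliest version able to encode array fields */
    return 0;       /* earliest version overall */
}
```

The design point is that the newest encoding is never chosen implicitly: a writer that does not opt in keeps producing messages that older libraries already understand.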
17HDF5 Forward Compatibility
- Forward compatibility or what do we promise (file format)
- Forward compatibility is most difficult to achieve and maintain
- Achieved by using micro-versioning and the maximum compatibility principle
- Old versions of the library will read all objects in a file created by a newer library if the objects are known to the old library
- Example: 1.6.5 library will read a group in a file created by the 1.8.0 version unless new 1.8.0 features are used (e.g. external links or compact groups)
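One way to picture the file-format promise is as a per-object check rather than a per-file check. The sketch below assumes a reader that records the highest version it understands for each kind of message. All names and version numbers are hypothetical and chosen only for illustration; they do not correspond to real HDF5 message types or versions.

```c
#include <stddef.h>

/* Hypothetical message kinds; not the real HDF5 message types. */
typedef enum { MSG_DATATYPE, MSG_GROUP, MSG_LINK, MSG_KIND_COUNT } msg_kind;

/* A versioned object found in a file. */
typedef struct {
    msg_kind kind;
    int      version;   /* assumed to be >= 0 */
} object_msg;

/* Highest version of each kind a reader understands; -1 means unknown kind. */
typedef struct {
    int max_version[MSG_KIND_COUNT];
} reader_caps;

/* An object is readable if the reader knows its kind at that version. */
int can_read(const reader_caps *r, const object_msg *m)
{
    return m->version <= r->max_version[m->kind];
}

/* Count how many objects in a file this reader can access. */
size_t count_readable(const reader_caps *r, const object_msg *msgs, size_t n)
{
    size_t i, readable = 0;

    for (i = 0; i < n; i++)
        if (can_read(r, &msgs[i]))
            readable++;
    return readable;
}
```

Under this model an older reader still gets at every object written in a form it knows, and only the objects that use newer features are out of reach.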
18HDF5 Forward Compatibility
- Forward compatibility or what do we promise (APIs)
- Application written to work with an older version will compile, link and run as expected with a newer version
- APIs are not deleted or changed (if possible)
- APIs do not change behavior (if possible)
- May require configuration flag --enable-hdf5v1_Y to enable old APIs, data structures and behavior
- Drawbacks
- Have to keep old APIs until another major release or indefinitely
- Cannot make new features enabled by default
- H5Gcreate will create old style groups in 1.8.0
- H5Gcreate2 will create new groups (supports creation order, compact storage, improved heap structure, low and controlled overhead, etc.)
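The H5Gcreate / H5Gcreate2 pair is an instance of a general pattern: keep the old entry point with its old behavior and add a numbered variant for the new behavior. The sketch below shows that pattern in plain C with entirely hypothetical names (`demo_group`, `demo_group_create1`, `demo_group_create2`, `DEMO_USE_V2_API`). It is not HDF5 code and makes no claim about how the library implements this internally.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical group record for illustration only. */
typedef struct {
    char name[32];
    int  new_style;             /* 0 = old-style group, 1 = new-style group */
    int  track_creation_order;  /* only meaningful for new-style groups */
} demo_group;

static void copy_name(char *dst, size_t cap, const char *src)
{
    strncpy(dst, src, cap - 1);
    dst[cap - 1] = '\0';        /* strncpy does not always terminate */
}

/* Original API: signature and behavior are frozen for existing callers. */
demo_group demo_group_create1(const char *name)
{
    demo_group g;
    copy_name(g.name, sizeof g.name, name);
    g.new_style = 0;
    g.track_creation_order = 0;
    return g;
}

/* Revised API: new parameter, new on-disk style. */
demo_group demo_group_create2(const char *name, int track_creation_order)
{
    demo_group g;
    copy_name(g.name, sizeof g.name, name);
    g.new_style = 1;
    g.track_creation_order = track_creation_order;
    return g;
}

/* The unversioned name keeps the old behavior unless the build opts in. */
#ifdef DEMO_USE_V2_API
#define demo_group_create(name) demo_group_create2((name), 0)
#else
#define demo_group_create(name) demo_group_create1(name)
#endif
```

Existing source that calls `demo_group_create("g")` compiles unchanged and still gets an old-style group, which illustrates both the promise and the drawback listed above: new features cannot become the default without an explicit opt-in.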
19HDF5 Backward Compatibility
- Backward compatibility or what do we promise
- File Format
- Newer version of the library will always read files created with an older version
- Aside: HDF4 can read HDF4 files created in 1988
- Library APIs
- Application that doesn't use new features will compile and link with the older library
20What to expect with 1.8.0?
- 1.8.0 introduces file format changes, new APIs and old API changes
- File format changes
- Revised internal file structures to support new features such as creation order, UTF-8 encoding, external links, etc., and to reduce file overhead
- New APIs added
- Group API revisions to allow two types of indices on links within a group
- New compression methods (scaleoffset, n-bit)
- Link APIs including UTF-8 encoding for links and external links
- Etc.
- Old API changes
- There are several APIs (rarely used) with changed signatures or behavior
- http://hdfgroup.org/HDF5/doc_1.8pre/doc/ADGuide/Changes.html
21What to expect with 1.8.0?
- Application written for 1.6.5
- Will compile, link and run with the 1.8.0 library as expected, producing files compatible with the 1.6.5 release
- Will take advantage of the new metadata cache, performance enhancements and bug fixes
- New 1.8.0 features are not available unless the application is modified
- Smaller file overhead
- Shared object messages
- More space efficient object header storage
- Compact groups
- Efficient heap storage for groups with many links
- New group and links features
- Creation order on links
- UTF-8 encoding
- Groups compact storage
- External links
22What to expect with 1.8.0?
- Applications written for 1.8.0
- Will always read older files
- May modify a 1.6.5 file in a way that the 1.6.5 library will not be able to access some old objects in it
- Example
- Groups converted to use new format
- Compact storage for compound datatype
- May produce files NOT compatible with 1.6.5 !!!
- Example
- Root group is created using 1.8.0 features
- Takes full advantage of the latest and greatest HDF5!
23Example
- How can an application create a 1.6.5-incompatible file?
- Latest format is used for storing compound datatypes
- fapl = H5Pcreate(H5P_FILE_ACCESS);
- H5Pset_latest_format(fapl, TRUE);
- file = H5Fcreate(filename, H5F_ACC_TRUNC, H5P_DEFAULT, fapl);
- tid = H5Tcreate(H5T_COMPOUND, sizeof(struct s1));
- H5Tinsert(...);
- dset = H5Dcreate(file, "New compound", tid, ...);
- H5Dwrite(dset, ...);
24Acknowledgements
Thank you! Questions?
25Acknowledgement
This report is based upon work supported in part
by a Cooperative Agreement with NASA under NASA
NNG05GC60A. Any opinions, findings, and
conclusions or recommendations expressed in this
material are those of the author(s) and do not
necessarily reflect the views of the National
Aeronautics and Space Administration.