HDF5 Tools Update - PowerPoint PPT Presentation

About This Presentation
Title: HDF5 Tools Update
Description: Title: Selling a Product or Service. Author: Peter X. Cao. Last modified by: Peter Cao. Created Date: 5/18/2006 2:39:14 PM. Document presentation format: PowerPoint PPT presentation.

Learn more at: http://www.hdfeos.org

Transcript and Presenter's Notes

1
HDF5 Tools Update
  • Peter Cao
  • The HDF Group
  • xcao_at_hdfgroup.org
  • November 28, 2006

This report is based upon work supported in part
by a Cooperative Agreement with NASA under NASA
NNG05GC60A. Any opinions, findings, and
conclusions or recommendations expressed in this
material are those of the author(s) and do not
necessarily reflect the views of the National
Aeronautics and Space Administration.
2
Outline
  • Overview of current tools
  • New tools with HDF5 1.8 release
  • A future tool: h5ub

3
HDF5 Command Line Tools
  • Readers
    • h5dump, h5diff, h5ls
    • New tools: h5check, h5stat
  • Writers
    • h5repack, h5repart, h5import, h5jam/h5unjam
    • New tool: h5copy
  • Converters
    • h4toh5, h5toh4, gif2h5, h52gif

4
Help Information
  • Located in bin/ of the binary release
  • Use the -h option for help
  • Online help at http://www.hdfgroup.org/hdf5tools.html
  • For further help, bug reports, and feature requests, write to hdfhelp_at_hdfgroup.org

5
h5dump
  • Dumps the content of an HDF5 file to stdout and, optionally, to one of the following types of files
    • ASCII text file
    • XML file
    • Binary file (new feature)

6
h5dump -H SDS.h5
HDF5 "SDS.h5" {
GROUP "/" {
   GROUP "Floats" {
      DATASET "FloatArray" {
         DATATYPE  H5T_IEEE_F32LE
         DATASPACE  SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
      }
   }
   DATASET "IntArray" {
      DATATYPE  H5T_STD_I32LE
      DATASPACE  SIMPLE { ( 5, 6 ) / ( 5, 6 ) }
   }
}
}

7
h5dump -d /Floats/FloatArray SDS.h5
HDF5 "SDS.h5" {
DATASET "/Floats/FloatArray" {
   DATATYPE  H5T_IEEE_F32LE
   DATASPACE  SIMPLE { ( 4, 3 ) / ( 4, 3 ) }
   DATA {
   (0,0): 0.01, 0.02, 0.03,
   (1,0): 0.1, 0.2, 0.3,
   (2,0): 1, 2, 3,
   (3,0): 10, 20, 30
   }
}
}

8
h5dump -x SDS.h5
9
h5dump Binary Output
  • -b F, --binary=F
  • The form of the binary output (F)
    • MEMORY -- for memory type
    • FILE -- for the disk file type
    • LE -- for pre-defined little endian type
    • BE -- for pre-defined big endian type

10
h5dump -d /IntArray -o out_le.bin -b LE SDS.h5
Dumps a 32-bit integer dataset, IntArray, from
SDS.h5 to a little endian binary file out_le.bin
  • od --width=24 -t x4 out_le.bin
    0000000 00000000 00000001 00000002 00000003 00000004 00000005
    0000030 0000000a 0000000b 0000000c 0000000d 0000000e 0000000f
    0000060 00000014 00000015 00000016 00000017 00000018 00000019
    0000110 0000001e 0000001f 00000020 00000021 00000022 00000023
    0000140 00000028 00000029 0000002a 0000002b 0000002c 0000002d
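
The binary dump can also be read back programmatically. The following is a minimal Python sketch, assuming the file holds nothing but the 30 values of the 5 x 6 dataset as little-endian 32-bit integers. It does not open out_le.bin; the bytes are rebuilt from the values in the od listing, and the helper name decode_le_int32 is hypothetical.

```python
import struct

# Stand-in for the contents of out_le.bin: the 5 x 6 IntArray values
# from the od listing, packed as little-endian 32-bit integers.
values = [10 * row + col for row in range(5) for col in range(6)]
raw = struct.pack("<30i", *values)

def decode_le_int32(buf, rows, cols):
    """Decode a flat little-endian int32 buffer into a list of rows."""
    flat = struct.unpack("<%di" % (rows * cols), buf)
    return [list(flat[r * cols:(r + 1) * cols]) for r in range(rows)]

table = decode_le_int32(raw, 5, 6)
for row in table:
    print(row)
```

With a real dump, raw would instead come from open("out_le.bin", "rb").read().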

11
h5diff
  • Using h5diff, you can
    • compare two objects in the same file
    • compare two objects in two different files
    • compare all objects in two files

12
h5diff SDS.h5 SDS2.h5
  • Dataset: </IntArray> and </IntArray>
  • 5 differences found

13
h5diff SDS.h5 SDS2.h5 -r /IntArray
  • Dataset: </IntArray> and </IntArray>
    position      IntArray   IntArray   difference
    ------------------------------------------------------------
    [ 0 0 ]       0          10         10
    [ 1 0 ]       10         100        90
    [ 2 0 ]       20         200        180
    [ 3 0 ]       30         300        270
    [ 4 0 ]       40         400        360
  • 5 differences found
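
The report above amounts to an element-by-element comparison. Here is a minimal Python sketch of that idea, not h5diff's actual implementation. The two nested lists are hypothetical stand-ins for /IntArray in SDS.h5 and SDS2.h5, built so that only column 0 differs, and the difference is assumed to be an absolute value.

```python
def report_differences(a, b):
    """Return (position, value_a, value_b, difference) for each differing element."""
    found = []
    for i, (row_a, row_b) in enumerate(zip(a, b)):
        for j, (x, y) in enumerate(zip(row_a, row_b)):
            if x != y:
                found.append(((i, j), x, y, abs(x - y)))
    return found

# Hypothetical stand-ins for the two 5 x 6 datasets being compared.
first = [[10 * i + j for j in range(6)] for i in range(5)]
second = [row[:] for row in first]
for i, value in enumerate([10, 100, 200, 300, 400]):
    second[i][0] = value

diffs = report_differences(first, second)
for position, x, y, d in diffs:
    print(position, x, y, d)
print(len(diffs), "differences found")
```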

14
h5repack
  • Copies an HDF5 file to a new file, with or without compression and chunking
  • Removes unused space
  • Applies compression filters
  • Applies a storage layout

15
h5repack new filters
  • -f FILTER
    • GZIP, to apply GZIP compression
    • SZIP, to apply SZIP compression
    • SHUF, to apply the HDF5 shuffle filter
    • FLET, to apply the HDF5 checksum filter
    • NBIT, to apply NBIT compression
    • SOFF, to apply the HDF5 Scale/Offset filter
    • NONE, to remove all filters
  • For example
    • h5repack -i SDS2.h5 -o SDS2_compressed.h5 -f /IntArray:GZIP=9

16
h5repack data layout
  • -l LAYOUT
    • CHUNK, to apply chunking layout
    • COMPA, to apply compact layout
    • CONTI, to apply contiguous layout
  • For example
    • h5repack -i SDS.h5 -o SDS_chunk.h5 -l /Floats/FloatArray,/IntArray:CHUNK=2x3
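
To see what a 2 x 3 chunk shape means for the two datasets in the example, the chunk grid can be worked out by rounding up along each dimension. This is a small Python sketch of that arithmetic only. The dataset shapes (4 x 3 and 5 x 6) come from the h5dump output shown earlier, and the helper name chunk_grid is hypothetical.

```python
import math

def chunk_grid(dims, chunk):
    """Number of chunks along each dimension; edge chunks may be only partly filled."""
    return [math.ceil(d / c) for d, c in zip(dims, chunk)]

for name, dims in (("/Floats/FloatArray", (4, 3)), ("/IntArray", (5, 6))):
    grid = chunk_grid(dims, (2, 3))
    print(name, grid, math.prod(grid))
```

The 5 x 6 dataset does not divide evenly by 2 along its first dimension, so its last row of chunks is only half used.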

17
new h5repack using H5Ocopy()
18
h5repart
  • Repartitions a file or family of files
  • For example
    • h5repart -m 200m int16kx16k.h5 part200m%d.h5
  • Result: the 977 MB file int16kx16k.h5 is split into five family members
    • part200m0.h5 (200 MB)
    • part200m1.h5 (200 MB)
    • part200m2.h5 (200 MB)
    • part200m3.h5 (200 MB)
    • part200m4.h5 (177 MB)
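
The member sizes follow from simple division of the 977 MB source file by the 200 MB member size. A minimal Python sketch of that arithmetic, with a hypothetical helper name and members assumed to be numbered from 0:

```python
def member_sizes(total_mb, member_mb):
    """Sizes of the family members produced by splitting total_mb into member_mb pieces."""
    full, rest = divmod(total_mb, member_mb)
    return [member_mb] * full + ([rest] if rest else [])

sizes = member_sizes(977, 200)
for index, size in enumerate(sizes):
    print("member", index, size, "MB")
```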
19
h5import
  • Imports binary/ASCII data into an HDF5 file
  • h5import infile -c config_file [infile -c config_file2 ...] -outfile outfile
  • For example
    • h5import float5x4x2.txt -c First_set.conf -o First_set.h5

Resulting file First_set.h5 (h5dump output):

GROUP "/" {
   GROUP "work" {
      DATASET "First-set" {
         DATATYPE  H5T_IEEE_F64LE
         DATASPACE  SIMPLE { ( 5, 2, 4 ) / ( 8, 8, H5S_UNLIMITED ) }
         DATA {
         (0,0,0): 1.01, 1.02, 1.03, 1.04,
         (0,1,0): 1.11, 1.12, 1.13, 1.14,
         (1,0,0): 1.21, 1.22, 1.23, 1.24,
         (1,1,0): 1.31, 1.32, 1.33, 1.34,
         (2,0,0): 1.41, 1.42, 1.43, 1.44,
         (2,1,0): 1.51, 1.52, 1.53, 1.54,
         (3,0,0): 2.01, 2.02, 2.03, 2.04,
         (3,1,0): 2.11, 2.12, 2.13, 2.14,
         (4,0,0): 2.21, 2.22, 2.23, 2.24,
         (4,1,0): 2.31, 2.32, 2.33, 2.34
         }
      }
   }
}

Configuration file First_set.conf:

PATH work/First-set
INPUT-CLASS TEXTFP
RANK 3
DIMENSION-SIZES 5 2 4
OUTPUT-CLASS FP
OUTPUT-SIZE 64
OUTPUT-ARCHITECTURE IEEE
OUTPUT-BYTE-ORDER LE
CHUNKED-DIMENSION-SIZES 2 2 2
MAXIMUM-DIMENSIONS 8 8 -1
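
The index labels in the dump follow from the DIMENSION-SIZES 5 2 4 setting. As a sketch, assuming the 40 text values are consumed in row-major (C) order, the position of each value can be derived from its flat index. The helper name unravel is hypothetical.

```python
def unravel(flat_index, dims):
    """Convert a flat row-major index into a tuple of per-dimension indices."""
    indices = []
    for size in reversed(dims):
        flat_index, remainder = divmod(flat_index, size)
        indices.append(remainder)
    return tuple(reversed(indices))

dims = (5, 2, 4)
# Every fourth value starts a new row of the dump.
row_starts = [unravel(flat, dims) for flat in range(0, 40, 4)]
for position in row_starts:
    print(position)
```

The printed positions match the row labels of the dump, from (0, 0, 0) to (4, 1, 0).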
20
h5jam/h5unjam
  • h5jam -- adds text to the User Block
    • h5jam -u test_ub.txt -i test_ub.h5
  • h5unjam -- removes text from the User Block
    • h5unjam -i test_ub.h5 -u out_ub.txt -o out_ub.h5
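
As background not stated on the slide: HDF5 requires the user block size to be 0 or a power of two of at least 512 bytes, so text placed in the user block occupies a padded region. The following Python sketch computes the smallest valid size for a given amount of text. The helper name is hypothetical, and the sketch makes no claim about how h5jam itself picks the size.

```python
def padded_user_block_size(nbytes):
    """Smallest valid HDF5 user block size (0, or a power of two >= 512) holding nbytes."""
    if nbytes == 0:
        return 0
    size = 512
    while size < nbytes:
        size *= 2
    return size

for nbytes in (0, 100, 512, 513, 5000):
    print(nbytes, "->", padded_user_block_size(nbytes))
```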

21
h5ls
  • Lists selected information about file objects in the specified format
  • For example
    • h5ls -r SDS2.h5

/Floats                  Group
/Floats/DoubleArray      Dataset {10, 5}
/Floats/FloatArray       Dataset {4, 3}
/Floats/subs             Group
/IntArray                Dataset {5, 6}
22
gif2h5 / h52gif
  • gif2h5 converts a GIF file into HDF5
    • gif2h5 apollo17_earth.gif apollo17_earth.h5
  • h52gif converts an HDF5 file into GIF
    • h52gif apollo17_earth.h5 apollo17_earth2.gif -i /apollo17_earth.gif/Image0 -p "/apollo17_earth.gif/Global Palette"

23
h5toh4 / h4toh5
  • h5toh4 -- Converts an HDF5 file to an HDF4 file
  • h4toh5 -- Converts an HDF4 file to an HDF5 file

24
New tools
  • h5copy
  • h5check
  • h5stat

25
h5copy
  • Copies an object from one location to another location within a file or across files
  • http://hdfgroup.com/RFC/h5copy/h5copy.htm

[Figure: two file hierarchies; node labels: /, Floats, IntArray, FloatArray]
26
h5copy
  • usage: h5copy [OPTIONS] [OBJECTS...]
    • -i, --input: input file name
    • -o, --output: output file name
    • -s, --source: source object name
    • -d, --destination: destination object name
    • -f, --flag
      • shallow: copy only immediate members of groups
      • soft: expand soft links into new objects
      • ext: expand external links into new objects
      • ref: copy objects that are pointed to by references
      • noattr: copy objects without copying attributes

27
h5copy
  • For example
    • h5copy -i SDS.h5 -o SDS_cp.h5 -s /Floats/FloatArray -d /FloatArray

[Figure: /Floats/FloatArray in SDS.h5 copied to /FloatArray in SDS_cp.h5; SDS.h5 also contains IntArray]
28
h5copy -f shallow
[Figure: group hierarchies before and after copying with -f shallow; node labels: /, floats, integers, 64-bit, f32, f1, f2, i1, i2]
29
h5copy -f soft
[Figure: hierarchies before and after copying with -f soft; node labels: /, f1, dset_SL /f1]
30
h5copy -f ref
[Figure: hierarchies before and after copying with -f ref; node labels: /, d1, d2, dset_ref; values shown next to dset_ref: 679, 1287; 1895, 763; 0, 0]
31
h5copy todo
  • Fix references embedded in compound datatype
  • Follow external links
  • Test functionalities
  • Test performance

32
h5stat
  • Prints various statistics about an HDF5 file
  • Helps
    • to troubleshoot size overhead in HDF5 files
    • to choose specific object properties and storage strategies

33
h5check
  • A validation tool that verifies if an HDF5 file
    is encoded according to the HDF5 File Format
    Specification

34
Why is it needed?
  • Verifies that a file complies with the File Format, to ensure data model integrity and long-term compatibility between evolving versions of the HDF5 library
  • Serves as the verification tool required by the application for the HDF5 File Format to become an ANSI standard
  • Serves as a watchdog, checking that the HDF5 library implementation complies with the File Format

35
What does it do?
  • Given a file, it scans the encoded content and checks it against the defined File Format
  • If it finds any non-compliance, it prints the error and the reason for the non-compliance
    • After finding a non-compliance, it tries to continue scanning the file if possible
    • Eventually, it exits with a non-zero status
  • If it does not find any non-compliance, it prints an approval statement at the end and exits with zero
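
Because the result is reported through the exit status, the tool can be driven from a script. The following is a minimal Python sketch assuming only the behavior described above (zero for a compliant file, non-zero otherwise). The helper name is_compliant is hypothetical, and the two stand-in commands merely imitate the two exit statuses so that the sketch runs without h5check installed.

```python
import subprocess
import sys

def is_compliant(command):
    """Run a validator command and report compliance based on its exit status."""
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return result.returncode == 0

# With h5check installed, the call would look like:
#     is_compliant(["h5check", "example1.h5"])
# The stand-in commands below only imitate the two exit statuses.
ok = [sys.executable, "-c", "import sys; sys.exit(0)"]
bad = [sys.executable, "-c", "import sys; sys.exit(1)"]
print(is_compliant(ok), is_compliant(bad))
```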

36
How is it implemented? (1/2)
  • The tool is coded from scratch and does not use the formal HDF5 library API calls
  • It does not link with the HDF5 library at all
  • It may borrow code, including algorithms or data structures, from the HDF5 library source code, but only after close verification that they comply with the File Format

37
How is it implemented? (2/2)
  • It links with the external libraries that the HDF5 library uses, e.g.,
    • zlib
    • szlib

38
How to use it?
  • h5check -vn <filename>
  • -vn: verbosity mode
    • n=0 Terse: only prints whether the file is compliant or not
    • n=1 Default: prints its progress and all errors found
    • n=2 Verbose: prints everything it knows, usually for debugging

39
Example: a compliant file
  • h5check example1.h5
  • VALIDATING example1.h5
  • FOUND super block signature
  • VALIDATING the super block at 0...
  • VALIDATING the object header at 928...
  • VALIDATING the btree at 384...
  • FOUND btree signature.
  • VALIDATING the local heap at 96...
  • FOUND local heap signature.
  • Result: File is in compliance.

40
Example: a non-compliant file
  • h5check invalid2.h5
  • FOUND super block signature
  • VALIDATING the super block at 0...
  • VALIDATING the object header at 928...
  • VALIDATING the btree at 384...
  • FOUND btree signature.
  • VALIDATING the SNOD at 1248...
  • FOUND SNOD signature.
  • VALIDATING the object header at 976...
  • check_sym(at 1248): Errors from check_obj_header()
  • decode_validate_messages(): Failure in type->decode().
  • H5O_sdspace_decode(): Bad version number in simple dataspace message.
  • VALIDATING the local heap at 96...
  • FOUND local heap signature.
  • Main(): Errors from check_obj_header().
  • decode_validate_messages(): Failure in type->decode().
  • H5O_attr_decode(): Can't decode attribute dataspace.
  • H5O_sdspace_decode(): Bad version number in simple dataspace message.

41
Implementation Status
  • All basic File Format components are implemented
  • In progress: coding recognition of HDF5 files created by non-default Virtual File Drivers, such as the Multi-File format
  • Alpha release planned for December 2006

42
h5ub
  • Combine nub, h5jam and h5unjam
  • nub -- NPOESS user block tool for HDF5 files
    (Richard)
  • Plan for h5ub development

[Timeline figure: phases Design, Implementation, Testing, Release; dates shown: 2/23/07, 3/5/07, 3/22/07, 3/25/07, 4/2/07]