Using HDF5 tools for performance tuning and troubleshooting

Slides: 13
Provided by: peter1061
Learn more at: http://hdfeos.org

1
Using HDF5 tools for performance tuning and
troubleshooting
2
Introduction
  • HDF5 tools may be very useful for performance
    tuning and troubleshooting
    • Discover objects and their properties in HDF5
      files
      • h5dump -p
    • Get file size overhead information
      • h5stat
    • Get locations of the objects in a file
      • h5ls
    • Discover differences
      • h5diff, h5ls
    • Location of raw data
      • h5ls -vra

3
h5stat
  • Prints various statistics about an HDF5 file
  • Helps
    • To troubleshoot size overhead in HDF5 files
    • To choose specific object properties and
      storage strategies
  • To use
    • h5stat --help
    • h5stat file.h5
  • Spec can be found at
    http://www.hdfgroup.org/RFC/h5stat/
  • Let us know if you need some special type of
    statistics

4
h5stat
  • Reports two types of statistics
    • High-level information about objects (examples)
      • Number of different objects (groups, datasets,
        datatypes) in a file
      • Number of unique datatypes
      • Size of raw data in a file
    • Information about objects' structural metadata
      • Sizes of structural metadata (total/free)
        • Object headers, local and global heaps
        • Sizes of B-trees
      • Object header fragmentation

5
h5stat
  • Examples of high-level information
    • File information
      • # of unique groups: 10008
      • # of unique datasets: 30
      • # of unique named datatypes: 0
      • Max. # of links to object: 1
      • Max. depth of hierarchy: 4
      • Max. # of objects in group: 19
    • Group bins:
      • # of groups of size 0: 10000
      • # of groups of size 1 - 9: 7
      • # of groups of size 10 - 99: 1
    • Max. dimension size of 1-D datasets: 1643

6
h5stat
  • Conclusions
    • There are a lot of empty groups in the file: a
      good candidate for the compact group feature
    • Some datasets use user-defined filters and may
      not be readable by the HDF5 library
    • SZIP compression is needed to read some datasets

Oh, my application uses buffers of size 1024 to
read data. No wonder it crashes on reading! Do I
have all the filters needed to read the data?
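
The reasoning on this slide can be checked with a few lines of arithmetic. The sketch below only reuses the numbers from the h5stat output shown earlier (10000 empty groups out of 10008, largest 1-D dataset dimension 1643) and the 1024-element buffer mentioned above. The variable names are made up for illustration, and nothing here calls the HDF5 library.

```python
# Numbers copied from the h5stat output on the previous slide.
total_groups = 10008        # "# of unique groups"
empty_groups = 10000        # "# of groups of size 0"
max_1d_dimension = 1643     # "Max. dimension size of 1-D datasets"

# Buffer length used by the application in the example (from the slide).
buffer_length = 1024

# Share of groups that hold no objects at all.
empty_percent = 100.0 * empty_groups / total_groups

# A fixed buffer is too small if any 1-D dataset is longer than it.
buffer_too_small = max_1d_dimension > buffer_length

print(f"{empty_percent:.1f}% of the groups are empty")
print(f"buffer too small for the largest 1-D dataset: {buffer_too_small}")
```

Almost every group is empty, which is why compact groups look attractive, and the largest 1-D dataset is longer than the buffer, which is consistent with the crash.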
7
h5stat
  • Examples of structural metadata information
    • Object header size (total/unused)
      • Groups: 1808/72
      • Datasets: 15792/832
    • Dataset storage information
      • Total raw data size: 6140688
    • Dataset datatype #3
      • Count (total/named) = (2/0)
      • Size (desc./elmt) = (10/65535)
    • Dataset datatype #4
      • Count (total/named) = (1/0)
      • Size (desc./elmt) = (10/32000)

8
h5stat
  • Conclusions
    • File size: 6228197
    • 1.5% overhead (not bad at all!)
    • Some elements have sizes of 65535 and 32000
      bytes

Oh! Is it really what I want? Should I use another
datatype and take advantage of compression?
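
As a rough check of the overhead figure, the file size can be compared with the raw data size reported by h5stat. This is a minimal sketch that assumes "overhead" means every byte in the file that is not raw data. The slides do not spell out the exact definition, so the result only needs to land near the quoted 1.5%.

```python
# Figures quoted on the h5stat slides.
file_size = 6228197       # total file size in bytes
raw_data_size = 6140688   # "Total raw data size" reported by h5stat

# Assumption: everything that is not raw data counts as overhead.
overhead_bytes = file_size - raw_data_size
overhead_percent = 100.0 * overhead_bytes / file_size

print(f"overhead: {overhead_bytes} bytes "
      f"({overhead_percent:.1f}% of the file)")
```

Under this definition the overhead is 87509 bytes, a little under the 1.5% shown on the slide.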
9
Case study: Using HDF5 tools to debug a problem
  • My application creates files on Windows with
    VS2005 and VS2003. I can read the VS2003 file but
    not the VS2005 one. h5dump reads both files OK
    and there are no differences. What am I doing
    wrong?
  • h5diff good.h5 bad.h5
    • Datatype: </Definitions/timespec> and
      </Definitions/timespec>
    • 1 differences found
  • h5ls -vr good.h5
    • /Definitions/timespec Type
    • Location: 0:1:0:900
  • h5debug good.h5 900
    • Message Information:
    • Type class: compound
    • Size: 8 bytes
  • h5debug bad.h5 900
    • Message Information:
    • Type class: compound
    • Size: 16 bytes

10
Case study: Using HDF5 tools to debug a problem
  • Conclusions
    • Compound datatype timespec requires a different
      number of bytes on VS2005 (16 bytes = 2 x 8 bytes)
      and on VS2003 (8 bytes = 2 x 4 bytes)

Oh! How do I read my data back? I assumed that my
struct would need only 8 bytes for each element,
but it needs 16 bytes on VS2005. I need the
H5Tget_native_type function to find the type of
my data in memory.
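
The size difference is easy to reproduce outside HDF5. The sketch below uses Python's ctypes module to lay out two structs with two members each, one with 4-byte members and one with 8-byte members. The member names tv_sec and tv_nsec are assumptions made for illustration; the slides only say that the compound type has two members. The code does not call H5Tget_native_type, which belongs to the HDF5 C API.

```python
import ctypes


class Timespec2x4(ctypes.Structure):
    """Hypothetical layout with two 4-byte members (the 8-byte case)."""
    _fields_ = [("tv_sec", ctypes.c_int32),
                ("tv_nsec", ctypes.c_int32)]


class Timespec2x8(ctypes.Structure):
    """Hypothetical layout with two 8-byte members (the 16-byte case)."""
    _fields_ = [("tv_sec", ctypes.c_int64),
                ("tv_nsec", ctypes.c_int64)]


small_size = ctypes.sizeof(Timespec2x4)
large_size = ctypes.sizeof(Timespec2x8)

print(f"2 x 4-byte members: {small_size} bytes per element")
print(f"2 x 8-byte members: {large_size} bytes per element")

# A reader that allocates small_size bytes per element cannot hold
# records written with the larger layout.
print(f"buffer too small by {large_size - small_size} bytes per element")
```

A reader that hard-codes the 8-byte layout will misread the 16-byte records, which is why the in-memory type should be derived from the type stored in the file instead of being assumed.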
11
Where is my data?
  • h5ls -var be_data.h5
    • Opened "be_data.h5" with sec2 driver.
    • /Array Dataset {5/5, 6/6}
    • Location: 0:1:0:792
    • Links: 1
    • Modified: 2006-04-07 15:08:39 CDT
    • Storage: 240 logical bytes, 240 allocated
      bytes, 100.00% utilization
    • Type: IEEE 64-bit big-endian float
    • Address: 2048
  • 30 8-byte elements can be read from address 2048
    by a non-HDF5 application
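
A non-HDF5 reader for this dataset could look like the sketch below. It assumes the layout reported by h5ls: 30 contiguous, uncompressed, big-endian IEEE 64-bit floats starting at byte 2048, stored row by row as a 5 x 6 array. The function names are made up for illustration, and the approach does not apply to chunked or compressed datasets.

```python
import struct


def read_raw_doubles(path, address=2048, count=30):
    """Read `count` big-endian 64-bit floats starting at byte `address`."""
    with open(path, "rb") as f:
        f.seek(address)
        payload = f.read(count * 8)
    if len(payload) != count * 8:
        raise ValueError("file ends before the requested data")
    # ">" selects big-endian byte order, "d" is an 8-byte IEEE float.
    return list(struct.unpack(f">{count}d", payload))


def to_rows(values, n_rows=5, n_cols=6):
    """Split a flat, row-major list of values into rows."""
    if len(values) != n_rows * n_cols:
        raise ValueError("value count does not match the requested shape")
    return [values[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]
```

With the file from the slide, to_rows(read_raw_doubles("be_data.h5")) would return the 5 x 6 array as a list of rows.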

12
Questions? Comments?
Thank you!