Using HDF5 for Geospatial Vector Data

Slides: 2
Provided by: mike193

Transcript and Presenter's Notes

Using HDF5 for Geospatial Vector Data
Question: How suitable is a general-purpose format like HDF5 for storing and accessing geospatial feature data?

Different HDF5 arrangements of vertices
  • Ragged array: a 1-D dataset of variable-length data types. Each element is a variable-length array containing all vertices for a shape.
  • Index array: offsets to data values in a single linear array. Similar to Shapefiles.
  • 2-D array: one shape per row, with multiple arrays when shape sizes vary.
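
As a rough illustration of the three arrangements listed above, the following sketch builds each layout in plain Python from the same set of shapes. It does not use HDF5 itself. The sample coordinates, the function names, the `small_max` threshold, and the NaN padding value are all assumptions made for illustration and do not come from the presentation.

```python
import math

# Hypothetical sample: five shapes with 2, 1, 3, 1 and 2 vertices.
shapes = [
    [(0.0, 0.0), (1.0, 0.0)],
    [(2.0, 2.0)],
    [(3.0, 1.0), (4.0, 1.0), (4.0, 2.0)],
    [(5.0, 5.0)],
    [(6.0, 0.0), (7.0, 1.0)],
]

def to_ragged(shapes):
    """Ragged array: one variable-length element per shape."""
    return [list(s) for s in shapes]

def to_indexed(shapes):
    """Index array: one flat vertex list plus the start offset of each shape."""
    flat, offsets = [], []
    for s in shapes:
        offsets.append(len(flat))
        flat.extend(s)
    return flat, offsets

def get_shape(flat, offsets, i):
    """Return the vertices of shape i by slicing the flat array."""
    end = offsets[i + 1] if i + 1 < len(offsets) else len(flat)
    return flat[offsets[i]:end]

def to_padded_2d(shapes, small_max=2):
    """2-D arrays: one shape per row, padded to the widest row in its group.
    Shapes are split into a 'small' and a 'large' array to limit padding."""
    pad = (math.nan, math.nan)
    groups = {"small": [], "large": []}
    for s in shapes:
        groups["small" if len(s) <= small_max else "large"].append(s)
    result = {}
    for name, members in groups.items():
        width = max((len(s) for s in members), default=0)
        result[name] = [list(s) + [pad] * (width - len(s)) for s in members]
    return result

flat, offsets = to_indexed(shapes)
print(offsets)                      # start offset of each shape
print(get_shape(flat, offsets, 2))  # vertices of the third shape
```

The index-array form needs only one extra integer per shape, whereas the 2-D form trades padding for fixed-size rows.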

[Figure: ragged-array layout. Five shapes (1-5), each with its metadata and its own variable-length list of x, y vertices.]

Feature data example
HDF5 1-D dataset containing all vertices, in order

[Figure: index-array layout. A single linear array holds all x, y vertices in order. The index holds the metadata for shapes 1-5 together with their starting offsets 0, 2, 3, 6, and 7.]

  • Test case: ESRI Shapefiles
  • Store geometry and attribute information for spatial features as shapes with vector coordinates.
  • Support point, line, and area features.
  • A widely used file format for geospatial feature data.
HDF5 2-D datasets, each row containing all vertices for a shape
[Figure: 2-D datasets for large shapes and for small shapes.]
Data compression recovers unused space.
[Figure: distribution showing vertices per shape.]

HDF5 example (1 file)
Shapefile format (3 files)
.shp: Main file. Each record describes a shape with a list of its vertices.
.shx: Index file. Each record contains the offset of the corresponding main file record.

Results: Comparing Shapefile and HDF5 (file size and access time)
  • Overhead for variable-length structures (ragged array) is high. The HDF5 file is always bigger than the Shapefile.
  • The HDF5 linear array with index is comparable to the Shapefile.
  • With compression:
  • The HDF5 linear array with index saves up to 40% vs. the Shapefile.
  • HDF5 2-D arrays are comparable to the Shapefile when compression is used. Without compression, the HDF5 files are much larger.
  • I/O overhead from variable-length and compound types significantly slows access in HDF5 (HDF5 reads are 5-20 times slower than Shapefile reads).
  • This can be improved considerably by turning off internal free lists.
  • When compound and variable-length types are not used, HDF5 access time is comparable to Shapefile access.
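
The size results for the 2-D arrangement can be made concrete with a small sketch: fixed-width rows waste space on padding, and a general-purpose compressor wins most of it back. The example uses Python's `zlib` (deflate) as a stand-in for an HDF5 compression filter. The synthetic shapes, the row width of 50 vertices, and the `pack_rows` helper are assumptions for illustration, not the data or code used in the presentation.

```python
import struct
import zlib

def pack_rows(shapes, width, pad=0.0):
    """Pack shapes as fixed-width rows of little-endian doubles,
    padding rows that have fewer than `width` vertices."""
    out = bytearray()
    for s in shapes:
        coords = [c for xy in s for c in xy]          # flatten (x, y) pairs
        coords += [pad] * (2 * width - len(coords))   # pad to full row width
        out += struct.pack("<%dd" % (2 * width), *coords)
    return bytes(out)

# Hypothetical data: 200 shapes with 1 to 5 vertices each,
# stored in rows sized for 50 vertices.
shapes = [[(float(i), float(j)) for j in range(1 + i % 5)] for i in range(200)]

raw = pack_rows(shapes, width=50)
packed = zlib.compress(raw, 6)

print("uncompressed bytes:", len(raw))
print("compressed bytes:  ", len(packed))
```

Most of each row here is padding, so the compressed size is a small fraction of the raw size. Real shape data with less padding would compress less dramatically.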

.dbf: A dBASE table holding feature attributes for each record.
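
To show how the .shx index file maps to main-file records, here is a minimal sketch of an index reader. The layout it assumes (a 100-byte header starting with file code 9994, followed by 8-byte records holding a big-endian offset and content length, both counted in 16-bit words) follows the published ESRI Shapefile specification rather than anything stated in the slides. The function name and the synthetic two-record index are hypothetical.

```python
import struct

SHX_HEADER_SIZE = 100  # the index file starts with a 100-byte header
SHX_RECORD_SIZE = 8    # two big-endian 32-bit integers per record

def read_shx_records(data):
    """Return (byte_offset, byte_length) for each record in .shx bytes."""
    file_code = struct.unpack(">i", data[:4])[0]
    if file_code != 9994:
        raise ValueError("not a Shapefile index (bad file code)")
    records = []
    last_start = len(data) - SHX_RECORD_SIZE
    for pos in range(SHX_HEADER_SIZE, last_start + 1, SHX_RECORD_SIZE):
        offset_words, length_words = struct.unpack(
            ">ii", data[pos:pos + SHX_RECORD_SIZE])
        # Offsets and lengths are stored in 16-bit words; convert to bytes.
        records.append((offset_words * 2, length_words * 2))
    return records

# Synthetic index with two records, built in memory for demonstration only.
header = struct.pack(">i", 9994) + bytes(96)
body = struct.pack(">ii", 50, 10) + struct.pack(">ii", 64, 24)
print(read_shx_records(header + body))  # [(100, 20), (128, 48)]
```

This is the same idea as the HDF5 index-array arrangement: a small table of offsets into one linear block of record data.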
Shapefiles tested
ESRI: Environmental Systems Research Institute, Inc.