HDF5
Life cycle of data

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD

1
Outline
• “Life cycle” of HDF5 data
• I/O operations for datasets with different storage
layouts
• Compact dataset
• Contiguous dataset
• Datatype conversion
• Partial I/O for contiguous dataset

• Chunked dataset
• I/O for chunked dataset

• Variable length datasets and I/O
02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
2
Life cycle of HDF5 data
• Life cycle: what does happen to data when it is transferred from
application buffer to HDF5 file?

Application

Data buffer

Object API

H5Dwrite

Library internals

Magic box

Virtual file I/O

Unbuffered I/O

File or other “storage”
02/18/14

Data in a file

HDF and HDF-EOS Workshop X, Landover, MD
3
“Life cycle” of HDF5 data: inside the magic box
• Operations on data inside the magic box
• Datatype conversion
• Scattering - gathering
• Data transformation (filters, compression)
• Copying to/from internal buffers
• Concepts involved
• HDF5 metadata, metadata cache
• Chunking, chunk cache
• Data structures used
• B-trees (groups, dataset chunks)
• Hash tables
• Local and Global heaps (variable length data: link names,
strings, etc.)
02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
4
“Life cycle” of HDF5 data: inside the magic box
• Understanding of what is happening to data inside the
magic box will help to write efficient applications
• HDF5 library has mechanisms to control behavior inside
the magic box
• Goals of this and the next talk are to
• Introduce the basic concepts and internal data structures
and explain how they affect performance and storage sizes
• Give some “recipes” for how to improve performance

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
5
Operations on data inside the magic box
• Datatype conversion
• Examples:
• float  integer
• LE  BE
• 64-bit integer to 16-bit integer (overflow may occur!)

• Scattering - gathering
• Data is scattered/gathered from/to user’s buffers into internal
buffers for datatype conversion and partial I/O
• Data transformation (filters, compression)
• Checksum on raw data and metadata (in 1.8.0)
• Algebraic transform
• GZIP and SZIP compressions
• User-defined filters
• Copying to/from internal buffers
02/18/14
HDF and HDF-EOS Workshop X, Landover, MD
6
“Life cycle” of HDF5 data: inside the magic box
• HDF5 metadata
• Information about HDF5 objects used by the library
• Examples: object headers, B-tree nodes for group, B-Tree
nodes for chunks, heaps, super-block, etc.
• Usually small compared to raw data sizes (KB vs. MB-GB)

• Metadata cache
• Space allocated to handle pieces of the HDF5 metadata
• Allocated by the HDF5 library in application’s memory
space
• Cache behavior affects overall performance
• Will cover in the next talk
02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
7
“Life cycle” of HDF5 data: inside the magic box
• Chunking mechanism
• Chunking – storage layout where a dataset is partitioned in
fixed-size multi-dimensional tiles or chunks
• Used for extendible datasets and datasets with filters
applied (checksum, compression)
• HDF5 library treats each chunk as atomic object
• Greatly affects performance and file sizes

• Chunk cache
• Created for each chunked dataset
• Default size 1MB
02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
8
Writing a dataset

Metadata

Data

Dataspace

Rank

Dimensions

3

Dim_1 = 4
Dim_2 = 5
Dim_3 = 7

Datatype
IEEE 32-bit float

Storage info

Attributes
Time = 32.4

Chunked

Pressure = 987

Compressed

Temp = 56

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
9
I/O operations for HDF5 datasets with different storage
layouts

• Storage layouts
• Compact
• Contiguous
• Chunked

• I/O performance depends on
•
•
•
•

Dataset storage properties
Chunking strategy
Metadata cache performance
Etc.

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
10
Writing a compact dataset

Dataset header
………….
Datatype
Dataspace
………….
Attribute 1
Attribute 2
Data

Metadata cache

Application memory
Raw data is stored within the dataset header

File

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
11
Writing a contiguous dataset with no datatype conversion

Dataset header
………….
Datatype
Dataspace
………….
Attribute 1
Attribute 2
…………

Metadata cache

User buffer (matrix 5x4x7)

File

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
12
Writing a contiguous dataset with conversion

Dataset header

Metadata cache

………….
Datatype
Dataspace
………….
Attribute 1
Attribute 2
…………

Dataset raw data

Conversion buffer 1MB
Application memory

File
Dataset header

02/18/14

Dataset raw data

HDF and HDF-EOS Workshop X, Landover, MD
13
Sub-setting of contiguous dataset
Series of adjacent rows
Application data in memory
N
M
One I/O operation

M rows

File

02/18/14

Data is contiguous in a file

HDF and HDF-EOS Workshop X, Landover, MD
14
Sub-setting of contiguous dataset
Adjacent, partial rows
Application data in memory
N
Several small I/O operation

M

N elements
File

02/18/14

…

Data is scattered in a file in M contiguous blocks

HDF and HDF-EOS Workshop X, Landover, MD
15
Sub-setting of contiguous dataset
Extreme case: writing a column
Application data in memory
N
Several small I/O operation

M

1 element

…

Data is scattered in a file in M different locations

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
16
Sub-setting of contiguous dataset
Data sieve buffer
Application data in memory
N

Data is gathered in a sieve buffer in memory 64K
memcopy

M

1 element
File

02/18/14

…

Data is scattered in a file in M contiguous blocks

HDF and HDF-EOS Workshop X, Landover, MD
17
Performance tuning for contiguous dataset
• Datatype conversion
• Avoid for better performance
• Use H5Pset_buffer function to customize
conversion buffer size

• Partial I/O
• Write/read in big contiguous blocks (at least the size of a
block on FS)
• Use H5Pset_sieve_buf_size to improve performance for
complex subsetting

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
18
Possible tuning work
• Datatype conversion
• Use of multiple threads for datatype conversion

• Partial I/O
• OS vector I/O

• Asynchronous I/O

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
19
Writing chunked dataset
Dimension sizes X x Y x Z

Dataset is partitioned into fixed-size multi-dimensional chunks of sizes X/4 x Y/2 x Z

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
20
Extending chunked dataset in any dimension

•Data can be added in any dimensions
•Compression is applied to each chunk
•Datatype conversion is applied to each chunk

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
21
Writing chunked dataset
Chunked dataset
A
C

Chunk cache
C

B

Filter pipeline

File

B

A

…………..

C

• Each chunk is written as a contiguous blob
• Chunks may be scattered all over the file
• Compression is performed when chunk is evicted from the chunk cache
• Other filters when data goes through filter pipeline (e.g. encryption)

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
22
Writing chunked dataset

Dataset_1 header

Metadata cache

…………
………

Dataset_N header Chunking B-tree nodes
…………

Chunk cache
Default size is 1MB

• Size of chunk cache is set for file
• Each chunked dataset has its own chunk cache
• Chunk may be too big to fit into cache
• Memory may grow if application keeps opening datasets

Application memory

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
23
Partial I/O for chunked dataset

1

2

3

4

02/18/14

• Build list of chunks and loop through the list
• For each chunk:
• Bring chunk into memory
• Map selection in memory to selection in file
• Gather elements into conversion buffer and
perform conversion
• Scatter elements back to the chunk
• Apply filters (compression) when chunk is
flushed from chunk cache
For each element 3 memcopy performed

HDF and HDF-EOS Workshop X, Landover, MD
24
Partial I/O for chunked dataset

Application buffer
3
Chunk
memcopy
Elements participated in I/O are gathered into corresponding chunk
Application memory

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
25
Partial I/O for chunked dataset

Chunk cache
Gather data
Conversion buffer
3

Scatter data

Application memory

On eviction from cache chunk is compressed
and is written to the file
File

02/18/14

Chunk

HDF and HDF-EOS Workshop X, Landover, MD
26
Variable length datasets and I/O
• Examples of variable-length data
• String
A[0] “the first string we want to write”
…………………………………
A[N-1] “the N-th string we want to write”

• Each element is a record of variable-length
A[0] (1,1,0,0,0,5,6,7,8,9) length of the first record is 10
A[1] (0,0,110,2005)
………………………..
A[N] (1,2,3,4,5,6,7,8,9,10,11,12,….,M) length of the N+1
record is M
02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
27
Variable length datasets and I/O
• Variable length description in HDF5 application
typedef struct {
size_t length;
void
*p;
}hvl_t;

• Base type can be any HDF5 type
H5Tvlen_create(base_type)

• ~ 20 bytes overhead for each element
• Raw data cannot be compressed

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
28
Variable length datasets and I/O
Raw data

Global heap

Global heap
Application buffer
Elements in application buffer point
to global heaps where actual data is
stored

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
29
Writing chunked VL datasets
Metadata cache

B-tree nodes

Chunk cache

Dataset header
…………

Application memory

Global heap

………
Raw data

Chunk cache
Conversion buffer
Filter pipeline

VL chunked dataset with selected region

File

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
30
VL chunked dataset in a file

Chunking B-tree

File
Dataset header
Raw data

02/18/14

Dataset chunks

HDF and HDF-EOS Workshop X, Landover, MD
31
Variable length datasets and I/O
• Hints
• Avoid closing/opening a file while writing VL datasets
• global heap information is lost
• global heaps may have unused space

• Avoid writing VL datasets interchangeably
• data from different datasets will is written to the same heap

• If maximum length of the record is known, use fixedlength records and compression

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
32
Thank you!

Questions ?

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
33
Acknowledgement

This report is based upon work supported in part by a Cooperative Agreement with NASA under
NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in
this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics
and Space Administration.

02/18/14

HDF and HDF-EOS Workshop X, Landover, MD
34

HDF5 Life cycle of data

  • 1.
    HDF5 Life cycle ofdata 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 1
  • 2.
    Outline • “Life cycle”of HDF5 data • I/O operations for datasets with different storage layouts • Compact dataset • Contiguous dataset • Datatype conversion • Partial I/O for contiguous dataset • Chunked dataset • I/O for chunked dataset • Variable length datasets and I/O 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 2
  • 3.
    Life cycle ofHDF5 data • Life cycle: what does happen to data when it is transferred from application buffer to HDF5 file? Application Data buffer Object API H5Dwrite Library internals Magic box Virtual file I/O Unbuffered I/O File or other “storage” 02/18/14 Data in a file HDF and HDF-EOS Workshop X, Landover, MD 3
  • 4.
    “Life cycle” ofHDF5 data: inside the magic box • Operations on data inside the magic box • Datatype conversion • Scattering - gathering • Data transformation (filters, compression) • Copying to/from internal buffers • Concepts involved • HDF5 metadata, metadata cache • Chunking, chunk cache • Data structures used • B-trees (groups, dataset chunks) • Hash tables • Local and Global heaps (variable length data: link names, strings, etc.) 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 4
  • 5.
    “Life cycle” ofHDF5 data: inside the magic box • Understanding of what is happening to data inside the magic box will help to write efficient applications • HDF5 library has mechanisms to control behavior inside the magic box • Goals of this and the next talk are to • Introduce the basic concepts and internal data structures and explain how they affect performance and storage sizes • Give some “recipes” for how to improve performance 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 5
  • 6.
    Operations on datainside the magic box • Datatype conversion • Examples: • float  integer • LE  BE • 64-bit integer to 16-bit integer (overflow may occur!) • Scattering - gathering • Data is scattered/gathered from/to user’s buffers into internal buffers for datatype conversion and partial I/O • Data transformation (filters, compression) • Checksum on raw data and metadata (in 1.8.0) • Algebraic transform • GZIP and SZIP compressions • User-defined filters • Copying to/from internal buffers 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 6
  • 7.
    “Life cycle” ofHDF5 data: inside the magic box • HDF5 metadata • Information about HDF5 objects used by the library • Examples: object headers, B-tree nodes for group, B-Tree nodes for chunks, heaps, super-block, etc. • Usually small compared to raw data sizes (KB vs. MB-GB) • Metadata cache • Space allocated to handle pieces of the HDF5 metadata • Allocated by the HDF5 library in application’s memory space • Cache behavior affects overall performance • Will cover in the next talk 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 7
  • 8.
    “Life cycle” ofHDF5 data: inside the magic box • Chunking mechanism • Chunking – storage layout where a dataset is partitioned in fixed-size multi-dimensional tiles or chunks • Used for extendible datasets and datasets with filters applied (checksum, compression) • HDF5 library treats each chunk as atomic object • Greatly affects performance and file sizes • Chunk cache • Created for each chunked dataset • Default size 1MB 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 8
  • 9.
    Writing a dataset Metadata Data Dataspace Rank Dimensions 3 Dim_1= 4 Dim_2 = 5 Dim_3 = 7 Datatype IEEE 32-bit float Storage info Attributes Time = 32.4 Chunked Pressure = 987 Compressed Temp = 56 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 9
  • 10.
    I/O operations forHDF5 datasets with different storage layouts • Storage layouts • Compact • Contiguous • Chunked • I/O performance depends on • • • • Dataset storage properties Chunking strategy Metadata cache performance Etc. 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 10
  • 11.
    Writing a compactdataset Dataset header …………. Datatype Dataspace …………. Attribute 1 Attribute 2 Data Metadata cache Application memory Raw data is stored within the dataset header File 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 11
  • 12.
    Writing a contiguousdataset with no datatype conversion Dataset header …………. Datatype Dataspace …………. Attribute 1 Attribute 2 ………… Metadata cache User buffer (matrix 5x4x7) File 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 12
  • 13.
    Writing a contiguousdataset with conversion Dataset header Metadata cache …………. Datatype Dataspace …………. Attribute 1 Attribute 2 ………… Dataset raw data Conversion buffer 1MB Application memory File Dataset header 02/18/14 Dataset raw data HDF and HDF-EOS Workshop X, Landover, MD 13
  • 14.
    Sub-setting of contiguousdataset Series of adjacent rows Application data in memory N M One I/O operation M rows File 02/18/14 Data is contiguous in a file HDF and HDF-EOS Workshop X, Landover, MD 14
  • 15.
    Sub-setting of contiguousdataset Adjacent, partial rows Application data in memory N Several small I/O operation M N elements File 02/18/14 … Data is scattered in a file in M contiguous blocks HDF and HDF-EOS Workshop X, Landover, MD 15
  • 16.
    Sub-setting of contiguousdataset Extreme case: writing a column Application data in memory N Several small I/O operation M 1 element … Data is scattered in a file in M different locations 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 16
  • 17.
    Sub-setting of contiguousdataset Data sieve buffer Application data in memory N Data is gathered in a sieve buffer in memory 64K memcopy M 1 element File 02/18/14 … Data is scattered in a file in M contiguous blocks HDF and HDF-EOS Workshop X, Landover, MD 17
  • 18.
    Performance tuning forcontiguous dataset • Datatype conversion • Avoid for better performance • Use H5Pset_buffer function to customize conversion buffer size • Partial I/O • Write/read in big contiguous blocks (at least the size of a block on FS) • Use H5Pset_sieve_buf_size to improve performance for complex subsetting 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 18
  • 19.
    Possible tuning work •Datatype conversion • Use of multiple threads for datatype conversion • Partial I/O • OS vector I/O • Asynchronous I/O 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 19
  • 20.
    Writing chunked dataset Dimensionsizes X x Y x Z Dataset is partitioned into fixed-size multi-dimensional chunks of sizes X/4 x Y/2 x Z 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 20
  • 21.
    Extending chunked datasetin any dimension •Data can be added in any dimensions •Compression is applied to each chunk •Datatype conversion is applied to each chunk 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 21
  • 22.
    Writing chunked dataset Chunkeddataset A C Chunk cache C B Filter pipeline File B A ………….. C • Each chunk is written as a contiguous blob • Chunks may be scattered all over the file • Compression is performed when chunk is evicted from the chunk cache • Other filters when data goes through filter pipeline (e.g. encryption) 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 22
  • 23.
    Writing chunked dataset Dataset_1header Metadata cache ………… ……… Dataset_N header Chunking B-tree nodes ………… Chunk cache Default size is 1MB • Size of chunk cache is set for file • Each chunked dataset has its own chunk cache • Chunk may be too big to fit into cache • Memory may grow if application keeps opening datasets Application memory 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 23
  • 24.
    Partial I/O forchunked dataset 1 2 3 4 02/18/14 • Build list of chunks and loop through the list • For each chunk: • Bring chunk into memory • Map selection in memory to selection in file • Gather elements into conversion buffer and perform conversion • Scatter elements back to the chunk • Apply filters (compression) when chunk is flushed from chunk cache For each element 3 memcopy performed HDF and HDF-EOS Workshop X, Landover, MD 24
  • 25.
    Partial I/O forchunked dataset Application buffer 3 Chunk memcopy Elements participated in I/O are gathered into corresponding chunk Application memory 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 25
  • 26.
    Partial I/O forchunked dataset Chunk cache Gather data Conversion buffer 3 Scatter data Application memory On eviction from cache chunk is compressed and is written to the file File 02/18/14 Chunk HDF and HDF-EOS Workshop X, Landover, MD 26
  • 27.
    Variable length datasetsand I/O • Examples of variable-length data • String A[0] “the first string we want to write” ………………………………… A[N-1] “the N-th string we want to write” • Each element is a record of variable-length A[0] (1,1,0,0,0,5,6,7,8,9) length of the first record is 10 A[1] (0,0,110,2005) ……………………….. A[N] (1,2,3,4,5,6,7,8,9,10,11,12,….,M) length of the N+1 record is M 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 27
  • 28.
    Variable length datasetsand I/O • Variable length description in HDF5 application typedef struct { size_t length; void *p; }hvl_t; • Base type can be any HDF5 type H5Tvlen_create(base_type) • ~ 20 bytes overhead for each element • Raw data cannot be compressed 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 28
  • 29.
    Variable length datasetsand I/O Raw data Global heap Global heap Application buffer Elements in application buffer point to global heaps where actual data is stored 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 29
  • 30.
    Writing chunked VLdatasets Metadata cache B-tree nodes Chunk cache Dataset header ………… Application memory Global heap ……… Raw data Chunk cache Conversion buffer Filter pipeline VL chunked dataset with selected region File 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 30
  • 31.
    VL chunked datasetin a file Chunking B-tree File Dataset header Raw data 02/18/14 Dataset chunks HDF and HDF-EOS Workshop X, Landover, MD 31
  • 32.
    Variable length datasetsand I/O • Hints • Avoid closing/opening a file while writing VL datasets • global heap information is lost • global heaps may have unused space • Avoid writing VL datasets interchangeably • data from different datasets will is written to the same heap • If maximum length of the record is known, use fixedlength records and compression 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 32
  • 33.
    Thank you! Questions ? 02/18/14 HDFand HDF-EOS Workshop X, Landover, MD 33
  • 34.
    Acknowledgement This report isbased upon work supported in part by a Cooperative Agreement with NASA under NASA NNG05GC60A. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the National Aeronautics and Space Administration. 02/18/14 HDF and HDF-EOS Workshop X, Landover, MD 34