Substituting HDF5 tools with Python/H5py scripts
Daniel Kahn
Science Systems and Applications Inc.

HDF HDF-EOS Workshop XIV, 28 Sep. 2010

1 of 14
What are HDF5 tools?
HDF5 tools are command line programs distributed with the HDF5
library. They allow users to manipulate HDF5 files.
h5dump: Dump HDF5 data as ASCII text.
h5import: Convert non-HDF5 data to HDF5.
h5diff: Show differences between HDF5 files.
h5copy: Copy objects between HDF5 files.
h5repack: Copy an entire file while changing the storage
properties of HDF5 objects.
h5edit: (proposed) Add attributes to HDF5 objects.

HDF5 tools have a long history as the first (and for a long time only)
way to manipulate HDF5 files conveniently, that is, without writing a C or
Java program and without buying expensive commercial software such
as IDL or Matlab.

2 of 14
The tools can be characterized as having three parts:
Text Processing: Evaluate command arguments, process input
text files, match group names.
Tree Walking: Search the HDF5 file hierarchy for objects by name.
Object Level Operations: Operate on the objects (copy, diff,
repack, etc.).

The tools are simple to use and convenient as they are
distributed with the HDF5 library.
3 of 14
Disadvantages of HDF5 tools:
The command line arguments limit tool capability.
It becomes difficult to add new features with a command line syntax
that is both readable and compatible with the legacy syntax.

Development time for designing and implementing new features is
long (weeks to months).
Use cases must be evaluated, a solution proposed in an RFC, the
proposal implemented, and the new code distributed in the next
release.

4 of 14
Here's an example from the HDF documentation:
h5copy -v -i "test1.h5" -o "test1.out.h5" -s "/array" -d "/array"

But suppose we had multiple datasets named arrayNNN,
where each N is a digit 0-9. We'd like to write something like:
h5copy -v -i "test1.h5" -o "test1.out.h5" -s "/array\d{3}"

so that \d{3} would match all such objects.
Extending the tool syntax to meet this use case, and then
again for the next use case, would be a never-ending game of
catch-up.
A more flexible substitute is desirable...

5 of 14
...Python?

6 of 14
What is Python?
Python is a programming language.
It features dynamic binding of variables, like Perl or shell
scripts, IDL, Matlab, but not C or Fortran.
Unlike Perl, it supports native floating point numbers.
It has scientific array support in the style of IDL or Matlab
(numpy module). Array operations can be programmed using
normal arithmetic operators.
It has access to the HDF5 library (Andrew Collette's
h5py module).
Python is currently the only programming language in
widespread use to have all these features. They are essential to the
success of the language for easy HDF5 file manipulation.
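
As a small illustration of the array arithmetic point above, the following sketch applies ordinary operators to whole numpy arrays. The array values are arbitrary and chosen only for this example.

```python
import numpy

# Two small arrays; the values are arbitrary illustration data.
a = numpy.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
b = numpy.full((2, 3), 10.0)

# Ordinary operators act element by element, with no explicit loops.
c = a * b + 1.0

print(c.shape)  # the shape of the operands is preserved
print(c.dtype)
print(c)
```

The same expression in C would need nested loops over both dimensions.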
7 of 14
Real world Experience: Learning Python and h5py is quick.
In the summer of 2010 SSAI hired a summer intern.
Equipped with some Perl programming experience the
intern was able to come up to speed on Python, HDF5,
h5py, and numpy within one to two weeks and, over
the summer, develop a specialized file/dataset merging
tool and a dataset conversion tool.

Python and h5py are the best way to introduce HDF5
because they allow the user to concentrate on the H in
HDF5, rather than the C API syntax.

8 of 14
Python is well suited to HDF5
Python is well suited to HDF5 because numpy array objects
carry the dimensionality, extent, and element data type
information, just as HDF5 datasets do. The object oriented
nature of Python allows these objects to be manipulated at a
high level. C, by contrast, lacks a scientific array object and
the ability to define object methods.
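
A minimal sketch of this correspondence, assuming h5py and numpy are installed. The file name roundtrip.h5 and the dataset name /Example are hypothetical and used only here. An array is written and read back, and the dataset reports the same shape and element type as the array.

```python
import os
import tempfile

import h5py
import numpy

# Hypothetical file and dataset names, chosen only for this sketch.
path = os.path.join(tempfile.mkdtemp(), "roundtrip.h5")
original = numpy.arange(12, dtype="float32").reshape(3, 4)

with h5py.File(path, "w") as fid:
    fid["/Example"] = original

with h5py.File(path, "r") as fid:
    dset = fid["/Example"]
    # The dataset carries the same shape and element type as the array.
    stored_shape = dset.shape
    stored_dtype = dset.dtype
    restored = dset[...]  # read the whole dataset back as a numpy array

print(stored_shape, stored_dtype)
print(numpy.array_equal(original, restored))
```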

9 of 14
Example: Creating and Writing a Dataset to a New File
Python:

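import h5py
import numpy

# 4 x 6 array of 32-bit integers holding 1..24
TestData = numpy.arange(1, 25, dtype='int32').reshape(4, 6)
# Create the file, write the dataset, and close the file on exit
with h5py.File("WrittenByH5PY.h5", "w") as fid:
    fid['/TestDataset'] = TestData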

Compare to C version:
#include "hdf5.h"
int main() {
    hid_t file_id, dataspace_id, dataset_id; /* identifiers */
    herr_t status;
    hsize_t dims[2];
    const int FirstIndex = 4, SecondIndex = 6;
    int i, j, dset_data[4][6];
    for (i = 0; i < 4; i++) /* Initialize the dataset. */
        for (j = 0; j < 6; j++)
            dset_data[i][j] = i * 6 + j + 1;
    dims[0] = FirstIndex;
    dims[1] = SecondIndex;
    /* Create a new file, truncating any existing file of the same name. */
    file_id = H5Fcreate("WrittenByC.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
    dataspace_id = H5Screate_simple(2, dims, NULL);
    dataset_id = H5Dcreate(file_id, "/TestDataset", H5T_STD_I32LE, dataspace_id,
                           H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);
    /* Write the dataset. */
    status = H5Dwrite(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL, H5P_DEFAULT, dset_data);
    status = H5Dclose(dataset_id); /* Close the dataset. */
    status = H5Sclose(dataspace_id); /* Close the dataspace. */
    status = H5Fclose(file_id); /* Close the file. */
    return 0;
}

10 of 14
And here's the output:
h5dump WrittenByH5PY.h5
HDF5 "WrittenByH5PY.h5" {
GROUP "/" {
   DATASET "TestDataset" {
      DATATYPE H5T_STD_I32LE
      DATASPACE SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
      DATA {
      (0,0): 1, 2, 3, 4, 5, 6,
      (1,0): 7, 8, 9, 10, 11, 12,
      (2,0): 13, 14, 15, 16, 17, 18,
      (3,0): 19, 20, 21, 22, 23, 24
      }
   }
}
}
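
The same inspection can be sketched in Python as a rough stand-in for h5dump. This is an illustration, not a replacement for the tool's full output. The file is recreated in a temporary directory so that the snippet runs on its own.

```python
import os
import tempfile

import h5py
import numpy

# Recreate the example file in a temporary directory so this runs on its own.
path = os.path.join(tempfile.mkdtemp(), "WrittenByH5PY.h5")
with h5py.File(path, "w") as fid:
    fid["/TestDataset"] = numpy.arange(1, 25, dtype="int32").reshape(4, 6)

# Report name, element type, shape, and values, roughly what h5dump shows.
with h5py.File(path, "r") as fid:
    dset = fid["/TestDataset"]
    dset_name = dset.name
    dset_dtype = dset.dtype
    dset_shape = dset.shape
    values = dset[...]

print(dset_name, dset_dtype, dset_shape)
print(values)
```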

11 of 14
Python and the Three Pillars of HDF5 Tools
Python is well suited to Text Processing
Python has a wide range of string manipulation functions, an easy-to-use
regular expression module, and list and dictionary (hash
table) objects. No segmentation faults!

Python is well suited to Tree Walking. Recursive
functions and loops over lists are easy to write.
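
A minimal sketch of tree walking with a recursive function. The file layout and the names GroupA, GroupB, array001, and so on are hypothetical and built only for this example. The helper list_datasets is not part of h5py.

```python
import os
import tempfile

import h5py
import numpy

# Build a small file with nested groups; all names here are hypothetical.
path = os.path.join(tempfile.mkdtemp(), "tree.h5")
with h5py.File(path, "w") as fid:
    group_a = fid.create_group("GroupA")
    group_b = group_a.create_group("GroupB")
    fid["array001"] = numpy.zeros(3)
    group_a["array002"] = numpy.ones(3)
    group_b["other"] = numpy.arange(3)


def list_datasets(group):
    """Recursively collect the full path of every dataset below group."""
    found = []
    for name, obj in group.items():
        if isinstance(obj, h5py.Dataset):
            found.append(obj.name)
        elif isinstance(obj, h5py.Group):
            found.extend(list_datasets(obj))
    return found


with h5py.File(path, "r") as fid:
    dataset_paths = sorted(list_datasets(fid))

print(dataset_paths)
```

h5py also offers visit and visititems methods on groups, which perform the recursion internally; the explicit function is shown here only to make the walk visible.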

Object Level Operations...Not so much.
Object Level Operations (e.g. copy, diff) are challenging to write
efficiently and should be provided as part of the API by the HDF
Group, for example H5Ocopy. API functions are available to the
Python programmer via h5py.
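
Putting the three parts together, here is a hedged sketch of the earlier use case: copying every dataset whose name is array followed by three digits. The source file is built on the spot, and the dataset contents are made up for illustration. Matching uses Python's re module, and the copy itself is delegated to the library through h5py's Group.copy method. Only top-level names are examined in this sketch.

```python
import os
import re
import tempfile

import h5py
import numpy

workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "test1.h5")
dst_path = os.path.join(workdir, "test1.out.h5")

# Build a source file: three datasets that should match and one that should not.
with h5py.File(src_path, "w") as src:
    for n in (0, 7, 123):
        src["array%03d" % n] = numpy.full(4, n, dtype="int32")
    src["arrayX"] = numpy.zeros(4, dtype="int32")

pattern = re.compile(r"array\d{3}")  # "array" followed by exactly three digits

with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
    for name in src:
        if pattern.fullmatch(name):
            # Group.copy hands the object level work to the HDF5 library.
            src.copy(name, dst, name=name)

with h5py.File(dst_path, "r") as dst:
    copied = sorted(dst.keys())

print(copied)
```

Changing the selection rule means editing one regular expression rather than extending a command line syntax.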

12 of 14
Why use Python to substitute HDF5 tools?
Python is available now.
Some HDF5 tools are still under development as new use
cases are presented. For example, users have requested a
tool to add attributes to HDF5 files. Such a capability
already exists with h5py:
python -c "import h5py ; fid = h5py.File('FileForAttributeAddition.h5','r+') ;
fid['/TestDataset'].attrs['CmdLine1'] = 'NewValue' ; fid.close()"

It's a little ugly, but it is available today.
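
The same operation reads more comfortably as a short script. This is a sketch under assumptions: the helper add_attribute is hypothetical, not part of h5py, and a stand-in for FileForAttributeAddition.h5 is created in a temporary directory so that the example runs on its own.

```python
import os
import tempfile

import h5py
import numpy


def add_attribute(filename, object_path, attr_name, attr_value):
    """Attach one attribute to an existing object in an existing HDF5 file."""
    # Mode "r+" opens the file for reading and writing; the file must exist.
    with h5py.File(filename, "r+") as fid:
        fid[object_path].attrs[attr_name] = attr_value


# Create a stand-in for FileForAttributeAddition.h5 in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "FileForAttributeAddition.h5")
with h5py.File(path, "w") as fid:
    fid["/TestDataset"] = numpy.arange(1, 25, dtype="int32").reshape(4, 6)

add_attribute(path, "/TestDataset", "CmdLine1", "NewValue")

with h5py.File(path, "r") as fid:
    stored = fid["/TestDataset"].attrs["CmdLine1"]

print(stored)
```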
Python is a full programming language. It can accomplish
tasks which HDF5 tools cannot.
Further Resources:
http://groups.google.com/group/h5py
http://h5py.alfven.org/

13 of 14
Recommendations:
Users should consider Python and H5py to accomplish their HDF5
file manipulation projects.
The HDF Group should concentrate on providing efficient
API functions for object level tasks: object copy, dataset
difference, etc.

The HDF Group should avoid complex enhancements to tools
where Python/h5py could be used instead.
An easily searched contributed application repository on the HDF
Group website with user ratings would be very helpful.

14 of 14

Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
UiPathCommunity
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Aggregage
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
Kari Kakkonen
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
Thijs Feryn
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
Aftab Hussain
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Albert Hoitingh
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
sonjaschweigert1
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
Kari Kakkonen
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
Dorra BARTAGUIZ
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Nexer Digital
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
Guy Korland
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Paige Cruz
 

Recently uploaded (20)

The Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and SalesThe Art of the Pitch: WordPress Relationships and Sales
The Art of the Pitch: WordPress Relationships and Sales
 
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptxSecstrike : Reverse Engineering & Pwnable tools for CTF.pptx
Secstrike : Reverse Engineering & Pwnable tools for CTF.pptx
 
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex ProofszkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
zkStudyClub - Reef: Fast Succinct Non-Interactive Zero-Knowledge Regex Proofs
 
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdfFIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
FIDO Alliance Osaka Seminar: Passkeys and the Road Ahead.pdf
 
Essentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FMEEssentials of Automations: The Art of Triggers and Actions in FME
Essentials of Automations: The Art of Triggers and Actions in FME
 
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...Transcript: Selling digital books in 2024: Insights from industry leaders - T...
Transcript: Selling digital books in 2024: Insights from industry leaders - T...
 
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
Le nuove frontiere dell'AI nell'RPA con UiPath Autopilot™
 
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdfFIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
FIDO Alliance Osaka Seminar: FIDO Security Aspects.pdf
 
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
 
Generative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to ProductionGenerative AI Deep Dive: Advancing from Proof of Concept to Production
Generative AI Deep Dive: Advancing from Proof of Concept to Production
 
Climate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing DaysClimate Impact of Software Testing at Nordic Testing Days
Climate Impact of Software Testing at Nordic Testing Days
 
Accelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish CachingAccelerate your Kubernetes clusters with Varnish Caching
Accelerate your Kubernetes clusters with Varnish Caching
 
Removing Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software FuzzingRemoving Uninteresting Bytes in Software Fuzzing
Removing Uninteresting Bytes in Software Fuzzing
 
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024
 
A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...A tale of scale & speed: How the US Navy is enabling software delivery from l...
A tale of scale & speed: How the US Navy is enabling software delivery from l...
 
DevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA ConnectDevOps and Testing slides at DASA Connect
DevOps and Testing slides at DASA Connect
 
Elevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object CalisthenicsElevating Tactical DDD Patterns Through Object Calisthenics
Elevating Tactical DDD Patterns Through Object Calisthenics
 
Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?Elizabeth Buie - Older adults: Are we really designing for our future selves?
Elizabeth Buie - Older adults: Are we really designing for our future selves?
 
GraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge GraphGraphRAG is All You need? LLM & Knowledge Graph
GraphRAG is All You need? LLM & Knowledge Graph
 
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdfObservability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf
 

Substituting HDF5 tools with Python/H5py scripts

  • 1. Substituting HDF5 tools with Python/H5py scripts Daniel Kahn Science Systems and Applications Inc. HDF HDF-EOS Workshop XIV, 28 Sep. 2010 1 of 14
  • 2. What are HDF5 tools? HDF5 tools are command line programs distributed with the HDF5 library. They allow users to manipulate HDF5 files.
      h5dump: dump HDF5 data as ASCII text.
      h5import: convert non-HDF5 data to HDF5.
      h5diff: show differences between HDF5 files.
      h5copy: copy objects between HDF5 files.
      h5repack: copy an entire file while changing storage properties of HDF5 objects.
      h5edit: (proposed) add attributes to HDF5 objects.
    HDF5 tools have a long history as the first (and for a long time only) way to manipulate HDF5 files conveniently, i.e. without writing a C or Java program and without buying expensive commercial software such as IDL or Matlab.
  • 3. The tools can be characterized as having three parts:
      Text Processing: evaluate command arguments, process input text files, match group names.
      Tree Walking: search the HDF5 file hierarchy for objects by name.
      Object Level Operations: operate on the objects (copy, diff, repack, etc.).
    The tools are simple to use and convenient, as they are distributed with the HDF5 library.
  • 4. Disadvantages of HDF5 tools: The command line arguments limit tool capability. Adding new features with a command line syntax that is both readable and does not break the legacy syntax becomes difficult. Development time for designing and implementing new features is long (weeks to months): use cases must be evaluated, a solution proposed in an RFC, the proposal implemented, and the new code distributed in the next release.
  • 5. Here's an example from the HDF documentation:
      h5copy -v -i "test1.h5" -o "test1.out.h5" -s "/array" -d "/array"
    But suppose we had multiple datasets named arrayNNN, where each N is a digit 0–9. We'd like to write something like:
      h5copy -v -i "test1.h5" -o "test1.out.h5" -s "/array\d{3}"
    so that \d{3} would match all such objects. Extending the tool syntax to meet this use case, and then again for the next use case, would be a never-ending game of catch-up. A more flexible substitute is desirable...
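The pattern-matching copy wished for on this slide can be sketched in a few lines of Python. This is a minimal sketch, not an h5copy replacement: the function name `copy_matching`, the dataset names, and the temporary file locations are hypothetical, and it only looks at objects directly under the root group.

```python
import os
import re
import tempfile

import h5py
import numpy as np


def copy_matching(src_path, dst_path, pattern):
    """Copy every root-level object whose name fully matches `pattern`."""
    name_re = re.compile(pattern)
    copied = []
    with h5py.File(src_path, "r") as src, h5py.File(dst_path, "w") as dst:
        for name in src:                        # iterating a group yields member names
            if name_re.fullmatch(name):
                src.copy(name, dst, name=name)  # object copy is done by the HDF5 library
                copied.append(name)
    return sorted(copied)


# Demo on throwaway files (names chosen for illustration only)
workdir = tempfile.mkdtemp()
src_path = os.path.join(workdir, "test1.h5")
dst_path = os.path.join(workdir, "test1.out.h5")
with h5py.File(src_path, "w") as f:
    for name in ("array000", "array001", "array12", "other"):
        f[name] = np.arange(4, dtype="int32")

copied = copy_matching(src_path, dst_path, r"array\d{3}")
print(copied)
```

Only the names consisting of `array` followed by exactly three digits are selected; changing the selection rule means editing one regular expression rather than extending a command line syntax.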
  • 7. What is Python? Python is a programming language. It features dynamic binding of variables, like Perl, shell scripts, IDL, or Matlab, but unlike C or Fortran. Unlike Perl, it supports native floating point numbers. It has scientific array support in the style of IDL or Matlab (the numpy module). Array operations can be programmed using normal arithmetic operators. It has access to the HDF5 library (Andrew Collette's h5py module). Python is currently the only programming language in widespread use to have all these features. They are essential to the success of the language for easy HDF5 file manipulation.
  • 8. Real-world experience: Learning Python and h5py is quick. In the summer of 2010 SSAI hired a summer intern. Equipped with some Perl programming experience, the intern was able to come up to speed on Python, HDF5, h5py, and numpy within one to two weeks and, over the summer, develop a specialized file/dataset merging tool and a dataset conversion tool. Python and h5py are the best way to introduce HDF5 because they allow the user to concentrate on the H in HDF5 rather than on the C API syntax.
  • 9. Python is well suited to HDF5: Python is well suited to HDF5 because numpy array objects carry the dimensionality, extent, and element data type information, just as HDF5 datasets do. The object-oriented nature of Python allows these objects to be manipulated at a high level. C, by contrast, lacks a scientific array object and the ability to define object methods.
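The correspondence between array objects and datasets described on this slide can be checked directly. The following is a small sketch, assuming a throwaway file in a temporary directory (the file name `shape_demo.h5` is hypothetical); it writes an array and reads the shape and element type back from the dataset.

```python
import os
import tempfile

import h5py
import numpy as np

# A 4 x 6 array of 32-bit integers; shape and dtype travel with the object.
data = np.arange(1, 25, dtype="int32").reshape(4, 6)

path = os.path.join(tempfile.mkdtemp(), "shape_demo.h5")
with h5py.File(path, "w") as f:
    f["TestDataset"] = data              # dataset shape and type are taken from the array

with h5py.File(path, "r") as f:
    dset = f["TestDataset"]
    dset_shape = dset.shape              # extent of the dataset
    dset_dtype = dset.dtype              # element type, reported as a numpy dtype
    round_trip = dset[...]               # read the whole dataset back as a numpy array

print(dset_shape, dset_dtype)
```

No dimensions or type identifiers are passed by hand; the dataset inherits them from the array, and the array read back carries them again.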
  • 10. Example: Creating and Writing a Dataset to a New File

    Python:

        import h5py
        import numpy

        TestData = numpy.array(range(1, 25), dtype='int32').reshape(4, 6)
        with h5py.File("WrittenByH5PY.h5", "w") as f:  # file is closed on exit
            f['/TestDataset'] = TestData

    Compare to the C version:

        #include "hdf5.h"

        int main() {
            hid_t file_id, dataspace_id, dataset_id; /* identifiers */
            herr_t status;
            hsize_t dims[2];
            const int FirstIndex = 4, SecondIndex = 6;
            int i, j, dset_data[4][6];

            for (i = 0; i < 4; i++) /* Initialize the dataset. */
                for (j = 0; j < 6; j++)
                    dset_data[i][j] = i * 6 + j + 1;

            dims[0] = FirstIndex;
            dims[1] = SecondIndex;

            /* Create a new file, truncating any existing one. */
            file_id = H5Fcreate("WrittenByC.h5", H5F_ACC_TRUNC, H5P_DEFAULT, H5P_DEFAULT);
            dataspace_id = H5Screate_simple(2, dims, NULL);
            dataset_id = H5Dcreate(file_id, "/TestDataset", H5T_STD_I32LE, dataspace_id,
                                   H5P_DEFAULT, H5P_DEFAULT, H5P_DEFAULT);

            /* Write the dataset. */
            status = H5Dwrite(dataset_id, H5T_NATIVE_INT, H5S_ALL, H5S_ALL,
                              H5P_DEFAULT, dset_data);

            status = H5Dclose(dataset_id);   /* Close the dataset. */
            status = H5Sclose(dataspace_id); /* Close the dataspace. */
            status = H5Fclose(file_id);      /* Close the file. */
            return 0;
        }
  • 11. And here's the output:

        h5dump WrittenByH5PY.h5
        HDF5 "WrittenByH5PY.h5" {
        GROUP "/" {
           DATASET "TestDataset" {
              DATATYPE  H5T_STD_I32LE
              DATASPACE  SIMPLE { ( 4, 6 ) / ( 4, 6 ) }
              DATA {
              (0,0): 1, 2, 3, 4, 5, 6,
              (1,0): 7, 8, 9, 10, 11, 12,
              (2,0): 13, 14, 15, 16, 17, 18,
              (3,0): 19, 20, 21, 22, 23, 24
              }
           }
        }
        }
  • 12. Python and the Three Pillars of HDF5 Tools
    Python is well suited to Text Processing. Python has a wide range of string manipulation functions, an easy-to-use regular expression module, and list and dictionary (hash table) objects. No segmentation faults!
    Python is well suited to Tree Walking. Recursive functions and loops over lists are easy to write.
    Object Level Operations... not so much. Object Level Operations (e.g. copy, diff) are challenging to write efficiently and should be provided as part of the API by the HDF Group, for example h5o_copy. API functions are available to the Python programmer via h5py.
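The tree-walking pillar can be illustrated with a short recursive generator. This is a sketch under stated assumptions: the helper name `walk`, the file name, and the group layout are hypothetical, and links that do not resolve to a group or dataset are simply skipped.

```python
import os
import tempfile

import h5py
import numpy as np


def walk(group, prefix=""):
    """Recursively yield (path, dataset) for every dataset below `group`."""
    for name, obj in group.items():
        path = prefix + "/" + name
        if isinstance(obj, h5py.Group):
            yield from walk(obj, path)   # descend into the subgroup
        elif isinstance(obj, h5py.Dataset):
            yield path, obj


# Demo on a throwaway file with a small hierarchy
tree_path = os.path.join(tempfile.mkdtemp(), "tree_demo.h5")
with h5py.File(tree_path, "w") as f:
    f.create_dataset("a/b/data1", data=np.zeros(3))  # intermediate groups are created
    f.create_dataset("a/data2", data=np.ones(2))
    f.create_dataset("top", data=np.arange(5))

with h5py.File(tree_path, "r") as f:
    found = sorted(path for path, _ in walk(f))
print(found)
```

h5py also offers `Group.visititems`, which performs the traversal inside the library; the explicit recursion is shown here because it mirrors the "recursive functions and loops" point on the slide.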
  • 13. Why use Python to substitute HDF5 tools? Python is available now. Some HDF5 tools are still under development as new use cases are presented. For example, users have requested a tool to add attributes to HDF5 files. Such a capability already exists with h5py:
      python -c "import h5py ; fid = h5py.File('FileForAttributeAddition.h5','r+') ; fid['/TestDataset'].attrs['CmdLine1'] = 'NewValue' ; fid.close()"
    It's a little ugly, but it is available today. Python is a full programming language. It can accomplish tasks which HDF5 tools cannot.
    Further Resources:
      http://groups.google.com/group/h5py
      http://h5py.alfven.org/
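The same attribute addition reads more comfortably as a small script. The sketch below wraps the one-liner in a hypothetical helper, `add_attribute`, and runs it against a throwaway file created for the demonstration; the file, dataset, and attribute names follow the slide.

```python
import os
import tempfile

import h5py
import numpy as np


def add_attribute(file_path, object_path, key, value):
    """Attach one attribute to an existing object in an existing file."""
    with h5py.File(file_path, "r+") as fid:   # "r+" opens an existing file read/write
        fid[object_path].attrs[key] = value


# Demo: build a small file, then add the attribute to its dataset
demo_path = os.path.join(tempfile.mkdtemp(), "FileForAttributeAddition.h5")
with h5py.File(demo_path, "w") as fid:
    fid["/TestDataset"] = np.arange(1, 25, dtype="int32").reshape(4, 6)

add_attribute(demo_path, "/TestDataset", "CmdLine1", "NewValue")

with h5py.File(demo_path, "r") as fid:
    stored = fid["/TestDataset"].attrs["CmdLine1"]
print(stored)
```

Using `with` closes the file even if the assignment fails, which the one-liner's explicit `fid.close()` would not.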
  • 14. Recommendations: Users should consider Python and h5py to accomplish their HDF5 file manipulation projects. The HDF Group should concentrate on providing efficient API functions for object level tasks: object copy, dataset difference, etc. The HDF Group should avoid complex enhancements to tools where Python/h5py could be used instead. An easily searched repository of contributed applications on the HDF Group website, with user ratings, would be very helpful.