Using HDF5 and Python: The H5py module

Daniel Kahn
Science Systems and Applications, Inc.
Acknowledgement: Thanks to Ed Masuoka, NASA Contract NNG06HX18C

HDF & HDF-EOS Workshop XV 17 April 2012
Python has lists:
>>> for elem in ['FirstItem','SecondItem','ThirdItem']:
...     print elem
...
FirstItem
SecondItem
ThirdItem
>>>

We can assign the list to a variable.
>>> MyList = ['FirstItem','SecondItem','ThirdItem']
>>> for elem in MyList:
...     print elem
...
FirstItem
SecondItem
ThirdItem
>>>
Lists can contain a mix of objects:
>>> MixedList = ['MyString',5,[72, 99.44]]
>>> for elem in MixedList:
...     print elem
...
MyString
5
[72, 99.44]
The last element is a list inside a list.
Lists can be addressed by index:
>>> MixedList[0]
'MyString'
>>> MixedList[2]
[72, 99.44]
A note about Python lists:
Python lists are one dimensional.
Arithmetic operators do not perform element-wise arithmetic on them.
Don’t be tempted to use them for scientific array-based
data sets. More on the ‘right way’ later...
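To make the point concrete, here is a small sketch (written for Python 3 with NumPy; the variable names are illustrative only) contrasting the arithmetic operators on a list with the same operators on a NumPy array:

```python
import numpy as np

values = [1, 2, 3]

# On a list, * repeats the list and + concatenates; neither is arithmetic.
doubled_list = values * 2        # [1, 2, 3, 1, 2, 3]
joined_list = values + [10]      # [1, 2, 3, 10]

# On a NumPy array the same operators act element by element.
array = np.array(values)
doubled_array = array * 2        # array([2, 4, 6])
shifted_array = array + 10       # array([11, 12, 13])

print(doubled_list, joined_list)
print(doubled_array, shifted_array)
```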

Python has dictionaries.
Dictionaries are key,value pairs:
>>> Dictionary = {'FirstKey':'FirstValue',
...               'SecondKey':'SecondValue',
...               'ThirdKey':'ThirdValue'}

>>> Dictionary
{'SecondKey': 'SecondValue', 'ThirdKey': 'ThirdValue',
'FirstKey': 'FirstValue'}
>>>

Notice that Python prints the key,value pairs in a different
order than I typed them.
The key,value pairs in a dictionary are unordered in the Python 2 session
shown here (since Python 3.7, dictionaries keep insertion order).
Dictionaries are not lists; however, we can easily create a list
of the dictionary keys:
>>> list(Dictionary)
['SecondKey', 'ThirdKey', 'FirstKey']
>>>

We can use a dictionary in a loop without additional
elaboration:
>>> for Key in Dictionary:
...     print Key,"---->",Dictionary[Key]
...
SecondKey ----> SecondValue
ThirdKey ----> ThirdValue
FirstKey ----> FirstValue
>>>
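For readers on Python 3, the same loop can be sketched as follows. This is an adaptation, not part of the original slides: print is a function, and items() yields each key together with its value.

```python
dictionary = {'FirstKey': 'FirstValue',
              'SecondKey': 'SecondValue',
              'ThirdKey': 'ThirdValue'}

# Collect one formatted line per key,value pair.
lines = []
for key, value in dictionary.items():
    lines.append("{} ----> {}".format(key, value))

print("\n".join(lines))
```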
An HDF5 file is made of “dictionaries”: a dataset
name is the key, and the array is the value.
HDFView is a tool which shows us the keys (TreeView)
and the values (TableView) of an HDF5 file.
Andrew Collette’s H5py module allows us to use Python and
HDF5 together.
We can use H5py to manipulate HDF5 files as if they were
Python dictionaries:
>>> import h5py
>>> in_fid = h5py.File('DansExample1.h5','r')
>>> for DS in in_fid:
...     print DS,"------->",in_fid[DS]
...
FirstDataset -------> <HDF5 dataset "FirstDataset": shape (25,), type "<i4">
SecondDataset -------> <HDF5 dataset "SecondDataset": shape (3, 3), type "<i4">
ThirdDataset -------> <HDF5 dataset "ThirdDataset": shape (5, 5), type "<i4">

>>>

In the output above, the names on the left are the keys and the dataset
objects on the right are the values.
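A self-contained sketch of the same idea for Python 3 follows. The file name example_dict_like.h5 and the dataset contents are made up for illustration (DansExample1.h5 itself is not available here); only the dataset names and shapes mirror the session shown earlier.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical stand-in for DansExample1.h5, written to a temporary directory.
path = os.path.join(tempfile.mkdtemp(), 'example_dict_like.h5')
with h5py.File(path, 'w') as out_fid:
    # Assigning an array to a key creates a dataset with that name.
    out_fid['FirstDataset'] = np.arange(25, dtype='int32')
    out_fid['SecondDataset'] = np.zeros((3, 3), dtype='int32')
    out_fid['ThirdDataset'] = np.ones((5, 5), dtype='int32')

# Iterating over the file yields the dataset names, just like dictionary keys.
shapes = {}
with h5py.File(path, 'r') as in_fid:
    for name in in_fid:
        shapes[name] = in_fid[name].shape
        print(name, "------->", in_fid[name])
```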
So what? We need to be able to manipulate the arrays, not
just the file.
The NumPy module by Travis Oliphant allows the manipulation
of arrays in Python.
We will see examples of writing arrays later, but to get arrays
from the H5py object we use the ellipsis, [...].
>>> import h5py
>>> fid = h5py.File('DansExample1.h5','r')
>>> fid['FirstDataset']
<HDF5 dataset "FirstDataset": shape (25,), type "<i4">
>>> fid['FirstDataset'][...]
array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
       17, 18, 19, 20, 21, 22, 23, 24])
>>> type(fid['FirstDataset'][...])
<type 'numpy.ndarray'>
>>>
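The ellipsis reads the whole dataset into memory. h5py datasets also accept NumPy-style slices, which read only the requested part. A short sketch, using a throwaway file (the file name is hypothetical) holding the same 25-element dataset:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'example_slicing.h5')
with h5py.File(path, 'w') as fid:
    fid.create_dataset('FirstDataset', data=np.arange(25, dtype='int32'))

with h5py.File(path, 'r') as fid:
    dataset = fid['FirstDataset']
    everything = dataset[...]      # whole dataset as a numpy.ndarray
    first_five = dataset[0:5]      # only elements 0..4 are read
    every_fifth = dataset[::5]     # strided read

print(type(everything).__name__, first_five, every_fifth)
```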
Reasons to use Python and HDF5 instead of C or Fortran
The basic Python dictionary object has a close similarity to
the HDF5 Group. The object-oriented and dynamic nature of
Python allows the existing dictionary syntax to be repurposed
for HDF5 manipulation.
In short, working with HDF5 in Python requires much less
code than C or Fortran, which means faster development and
fewer errors.
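The dictionary analogy extends to groups. The sketch below (file, group, and dataset names are invented for illustration) uses ordinary dictionary idioms, namely membership tests, keys(), len(), and indexing, on an HDF5 group:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'example_groups.h5')
with h5py.File(path, 'w') as fid:
    group = fid.create_group('Measurements')
    group['Temperature'] = np.array([280.5, 281.0, 279.8])
    group['Pressure'] = np.array([1013.2, 1012.8, 1014.1])

with h5py.File(path, 'r') as fid:
    group = fid['Measurements']
    has_temperature = 'Temperature' in group     # membership test
    names = sorted(group.keys())                 # keys, as with a dictionary
    n_children = len(group)                      # number of members
    first_pressure = float(group['Pressure'][0]) # index by key, then by position

print(has_temperature, names, n_children, first_pressure)
```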

Comparison to C, h5_gzip:

                        # Lines of code
C                       106
Python from THG site    37

Fewer lines of code means fewer places to make mistakes.
The 37-line h5_gzip.py example is a “direct” translation of the
C version. Some more advanced techniques offer insight into
the advantages of Python/H5py programming. Text in the next
slides is color coded to help match code with the same functionality.
First, writing a file…
Original h5_gzip.py

# This example creates and writes GZIP compressed dataset.
import h5py
import numpy as np
# Create gzip.h5 file.
#
file = h5py.File('gzip.h5','w')
#
# Create /DS1 dataset; in order to use compression, dataset has to be chunked.
#
dataset = file.create_dataset('DS1',(32,64),'i',chunks=(4,8),compression='gzip',compression_opts=9)
#
# Initialize data.
#
data = np.zeros((32,64))
for i in range(32):
    for j in range(64):
        data[i][j]= i*j-j
# Write data.
print "Writing data..."
dataset[...] = data
file.close()

Pythonic h5_gzip.py

#!/usr/bin/env python
# It's a UNIX thing.....

from __future__ import print_function # Code will work with python 3 as well....

# This example creates and writes GZIP compressed dataset.
import h5py # load the HDF5 interface module
import numpy as np # Load the array processing module
# Initialize data. Note the numbers 32 and 64 only appear ONCE in the code!
LeftVector = np.arange(-1,32-1,dtype='int32')
RightVector = np.arange(64,dtype='int32')
DataArray = np.outer(LeftVector,RightVector) # create 32x64 array of i*j-j
# The _with_ construct will automatically create and close the HDF5 file
with h5py.File('gzip-pythonic.h5','w') as h5_fid:
    # Create and write /DS1 dataset; in order to use compression, dataset has to be chunked.
    h5_fid.create_dataset('DS1',data=DataArray,chunks=(4,8),compression='gzip',compression_opts=9)
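A quick check, not part of the original slides, that the vectorized construction in the Pythonic version produces the same values as the nested loops in the original (only NumPy is needed; the lower-case variable names are chosen here):

```python
import numpy as np

# Loop version, as in the original example: data[i][j] = i*j - j
loop_data = np.zeros((32, 64), dtype='int32')
for i in range(32):
    for j in range(64):
        loop_data[i, j] = i * j - j

# Vectorized version: i*j - j == (i - 1) * j, the outer product of (i - 1) and j.
left_vector = np.arange(-1, 32 - 1, dtype='int32')
right_vector = np.arange(64, dtype='int32')
outer_data = np.outer(left_vector, right_vector)

same = bool(np.array_equal(loop_data, outer_data))
print(outer_data.shape, same)
```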

Reading data….

Original h5_gzip.py

# Read data back; display compression properties and dataset max value.
#
file = h5py.File('gzip.h5','r')
dataset = file['DS1']
print "Compression method is", dataset.compression
print "Compression parameter is", dataset.compression_opts
data = dataset[...]
print "Maximum value in", dataset.name, "is:", max(data.ravel())
file.close()

Pythonic h5_gzip.py

# Read data back; display compression properties and dataset max value.
#
with h5py.File('gzip-pythonic.h5','r') as h5_fid:
    dataset = h5_fid['DS1']
    print("Compression method is", dataset.compression)
    print("Compression parameter is", dataset.compression_opts)
    # dataset[...] reads the whole array; the older dataset.value attribute was removed in h5py 3.0
    print("Maximum value in", dataset.name, "is:", dataset[...].max())
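For readers who want to run the write and read steps together, here is a self-contained Python 3 sketch. It writes to a temporary directory rather than the working directory (a choice made here for tidiness, with a hypothetical file name) and reads the array with dataset[...]:

```python
import os
import tempfile

import h5py
import numpy as np

# Same 32x64 array of i*j - j as in the write example.
data = np.outer(np.arange(-1, 31, dtype='int32'), np.arange(64, dtype='int32'))
path = os.path.join(tempfile.mkdtemp(), 'gzip-roundtrip.h5')  # hypothetical file name

with h5py.File(path, 'w') as h5_fid:
    h5_fid.create_dataset('DS1', data=data, chunks=(4, 8),
                          compression='gzip', compression_opts=9)

with h5py.File(path, 'r') as h5_fid:
    dataset = h5_fid['DS1']
    method = dataset.compression        # compression filter name
    level = dataset.compression_opts    # gzip level
    chunks = dataset.chunks             # chunk shape
    maximum = int(dataset[...].max())   # read everything, then take the max

print(method, level, chunks, maximum)
```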

And finally, just to see what the file looks like…

Real world example: Table Comparison
Background:
For the OMPS instruments we need to design binary
arrays to be uploaded to the satellite to sub-sample the
CCD and so reduce the data rate.
For ground processing use we store these arrays in
HDF5.
As part of the design process we want to be able to
compare arrays in two different files.

Here is an example of a Sample Table

Here is another example:

Here is the “difference” of the arrays. Red pixels are
unique to the first array.

The code: CompareST.py
#!/usr/bin/env python
""" Documentation """
from __future__ import print_function,division
import h5py
import numpy
import ViewFrame # display module; its source is not shown in these slides
def CompareST(ST1,ST2,IntTime):
    with h5py.File(ST1,'r') as st1_fid,h5py.File(ST2,'r') as st2_fid:
        # [...] reads each table into memory (the older .value attribute was removed in h5py 3.0)
        ST1 = st1_fid['/DATA/'+IntTime+'/SampleTable'][...]
        ST2 = st2_fid['/DATA/'+IntTime+'/SampleTable'][...]
    ST1[ST1!=0] = 1 # reduce both tables to 0/1 masks
    ST2[ST2!=0] = 1
    Diff = (ST1 - ST2)
    ST1[Diff == 1] = 2 # mark pixels unique to the first table
    ViewFrame.ViewFrame(ST1)
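The masking logic can be tried without the HDF5 files or the ViewFrame display module. The sketch below uses two small made-up arrays in place of real Sample Tables, and a signed integer type so that the subtraction cannot wrap around (the dtype of the real tables is not stated in the slides):

```python
import numpy as np

# Made-up 3x4 tables; nonzero entries stand for sampled pixels.
st1 = np.array([[0, 7, 7, 0],
                [3, 0, 0, 3],
                [0, 0, 5, 5]], dtype='int32')
st2 = np.array([[0, 7, 0, 0],
                [3, 0, 0, 0],
                [0, 9, 5, 5]], dtype='int32')

# Reduce both tables to 0/1 masks.
st1[st1 != 0] = 1
st2[st2 != 0] = 1

# diff == 1 where a pixel is set in the first table only.
diff = st1 - st2
st1[diff == 1] = 2

print(st1)
```

In the result, 2 marks pixels unique to the first table and 1 marks pixels present in both. Pixels set only in the second table (diff == -1) are left at 0.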

...and the command-line argument parsing.

if __name__ == "__main__":
    import argparse
    OptParser = argparse.ArgumentParser(description = __doc__)
    OptParser.add_argument("--ST1",help="SampleTableFile1")
    OptParser.add_argument("--ST2",help="SampleTableFile2")
    OptParser.add_argument("--IntTime",help="Integration Time",
                           default='Long')
    options = OptParser.parse_args()
    CompareST(options.ST1,options.ST2,options.IntTime)
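parse_args also accepts an explicit list of strings, which is convenient for trying out a parser without a shell. A small sketch using the same three options (the description text and the file names passed in are placeholders chosen here):

```python
import argparse

parser = argparse.ArgumentParser(description="Compare two Sample Table files.")
parser.add_argument("--ST1", help="SampleTableFile1")
parser.add_argument("--ST2", help="SampleTableFile2")
parser.add_argument("--IntTime", help="Integration Time", default='Long')

# --IntTime is omitted here, so its default applies.
options = parser.parse_args(["--ST1", "first.h5", "--ST2", "second.h5"])
print(options.ST1, options.ST2, options.IntTime)
```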

Recursive descent into HDF5 file
Print group names, number of children and dataset names.
#!/usr/bin/env python
from __future__ import print_function
import h5py
def print_num_children(obj):
    if isinstance(obj,h5py.Group): # older h5py releases also exposed this class as h5py.highlevel.Group
        print(obj.name,"Number of Children:",len(obj))
        for ObjName in obj: # ObjName will be a string
            print_num_children(obj[ObjName])
    else:
        print(obj.name,"Not a group")
with h5py.File('OMPS-NPP-NPP-LP_STB', 'r') as f: # read-only access is enough for listing
    print_num_children(f)
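h5py also provides Group.visititems, which performs the recursive walk itself and calls a function with each member's path (relative to the starting group) and object. A sketch on a small made-up file; the file name is hypothetical and the group and dataset names only loosely echo the layout of the real file:

```python
import os
import tempfile

import h5py
import numpy as np

path = os.path.join(tempfile.mkdtemp(), 'example_walk.h5')
with h5py.File(path, 'w') as f:
    # Intermediate groups (DATA, DATA/Long, DATA/Short) are created automatically.
    f.create_dataset('DATA/Long/Gain', data=np.ones(4))
    f.create_dataset('DATA/Short/Gain', data=np.ones(4))

report = []

def describe(name, obj):
    # Record (absolute path, number of children) for groups, (path, None) otherwise.
    if isinstance(obj, h5py.Group):
        report.append((obj.name, len(obj)))
    else:
        report.append((obj.name, None))

with h5py.File(path, 'r') as f:
    f.visititems(describe)   # visits every member below the root, not the root itself

for entry in sorted(report):
    print(entry)
```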

The Result….
ssai-s01033@dkahn: ~/python % ./print_num_children.py
/ Number of Children: 1
/DATA Number of Children: 10
/DATA/AutoSplitLong Not a group
/DATA/AutoSplitShort Not a group
/DATA/AuxiliaryData Number of Children: 6
/DATA/AuxiliaryData/FeatureNames Not a group
/DATA/AuxiliaryData/InputSpecification Not a group
/DATA/AuxiliaryData/LongLowEndSaturationEstimate Not a group
/DATA/AuxiliaryData/ShortLowEndSaturationEstimate Not a group
/DATA/AuxiliaryData/Timings Number of Children: 2
/DATA/AuxiliaryData/Timings/Long Not a group
/DATA/AuxiliaryData/Timings/Short Not a group
/DATA/AuxiliaryData/dummy Not a group
/DATA/Long Number of Children: 14
/DATA/Long/BadPixelTable Not a group
/DATA/Long/BinTransitionTable Not a group
/DATA/Long/FeatureNamesIndexes Not a group
/DATA/Long/Gain Not a group
/DATA/Long/InverseOMPSColumns Not a group
Summary
Python with the H5py and NumPy modules makes developing
programs to manipulate HDF5 files and perform calculations
with HDF5 arrays simpler, which increases development
speed and reduces errors.

HDF & HDF-EOS Workshop XV 17 April 2012

More Related Content

What's hot

Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
Anirudh Koul
 
Federated Learning
Federated LearningFederated Learning
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
준식 최
 
Federated Learning
Federated LearningFederated Learning
Federated Learning
DataWorks Summit
 
The google MapReduce
The google MapReduceThe google MapReduce
The google MapReduce
Romain Jacotin
 
A Privacy Framework for Hierarchical Federated Learning
A Privacy Framework for Hierarchical Federated LearningA Privacy Framework for Hierarchical Federated Learning
A Privacy Framework for Hierarchical Federated Learning
Debmalya Biswas
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural network
KIRAN R
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Lalit Jain
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
Noura Hussein
 
Artificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimizationArtificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimization
Mohammed Bennamoun
 
Representation learning on graphs
Representation learning on graphsRepresentation learning on graphs
Representation learning on graphs
Deakin University
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Simplilearn
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
Mohamed Loey
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptx
MAHMOUD729246
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
Lukas Masuch
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series Prediction
Gianluca Bontempi
 
LSTM Tutorial
LSTM TutorialLSTM Tutorial
LSTM Tutorial
Ralph Schlosser
 
Deep learning ppt
Deep learning pptDeep learning ppt
Deep learning ppt
BalneSridevi
 
How to use deep learning on biological data
How to use deep learning on biological dataHow to use deep learning on biological data
How to use deep learning on biological data
Aly Abdelkareem
 
Image Classification with Deep Learning | DevFest + GDay, George Town, Mala...
Image Classification with Deep Learning  |  DevFest + GDay, George Town, Mala...Image Classification with Deep Learning  |  DevFest + GDay, George Town, Mala...
Image Classification with Deep Learning | DevFest + GDay, George Town, Mala...Virot "Ta" Chiraphadhanakul
 

What's hot (20)

Deep learning on mobile
Deep learning on mobileDeep learning on mobile
Deep learning on mobile
 
Federated Learning
Federated LearningFederated Learning
Federated Learning
 
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
Paper Summary of Beta-VAE: Learning Basic Visual Concepts with a Constrained ...
 
Federated Learning
Federated LearningFederated Learning
Federated Learning
 
The google MapReduce
The google MapReduceThe google MapReduce
The google MapReduce
 
A Privacy Framework for Hierarchical Federated Learning
A Privacy Framework for Hierarchical Federated LearningA Privacy Framework for Hierarchical Federated Learning
A Privacy Framework for Hierarchical Federated Learning
 
Image classification using convolutional neural network
Image classification using convolutional neural networkImage classification using convolutional neural network
Image classification using convolutional neural network
 
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow) Object classification using CNN & VGG16 Model (Keras and Tensorflow)
Object classification using CNN & VGG16 Model (Keras and Tensorflow)
 
Image classification using CNN
Image classification using CNNImage classification using CNN
Image classification using CNN
 
Artificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimizationArtificial Neural Networks Lect8: Neural networks for constrained optimization
Artificial Neural Networks Lect8: Neural networks for constrained optimization
 
Representation learning on graphs
Representation learning on graphsRepresentation learning on graphs
Representation learning on graphs
 
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
Convolutional Neural Network - CNN | How CNN Works | Deep Learning Course | S...
 
Convolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep LearningConvolutional Neural Network Models - Deep Learning
Convolutional Neural Network Models - Deep Learning
 
GANs Presentation.pptx
GANs Presentation.pptxGANs Presentation.pptx
GANs Presentation.pptx
 
Deep learning - A Visual Introduction
Deep learning - A Visual IntroductionDeep learning - A Visual Introduction
Deep learning - A Visual Introduction
 
Machine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series PredictionMachine Learning Strategies for Time Series Prediction
Machine Learning Strategies for Time Series Prediction
 
LSTM Tutorial
LSTM TutorialLSTM Tutorial
LSTM Tutorial
 
Deep learning ppt
Deep learning pptDeep learning ppt
Deep learning ppt
 
How to use deep learning on biological data
How to use deep learning on biological dataHow to use deep learning on biological data
How to use deep learning on biological data
 
Image Classification with Deep Learning | DevFest + GDay, George Town, Mala...
Image Classification with Deep Learning  |  DevFest + GDay, George Town, Mala...Image Classification with Deep Learning  |  DevFest + GDay, George Town, Mala...
Image Classification with Deep Learning | DevFest + GDay, George Town, Mala...
 

Similar to Using HDF5 and Python: The H5py module

The Python Programming Language and HDF5: H5Py
The Python Programming Language and HDF5: H5PyThe Python Programming Language and HDF5: H5Py
The Python Programming Language and HDF5: H5Py
The HDF-EOS Tools and Information Center
 
Substituting HDF5 tools with Python/H5py scripts
Substituting HDF5 tools with Python/H5py scriptsSubstituting HDF5 tools with Python/H5py scripts
Substituting HDF5 tools with Python/H5py scripts
The HDF-EOS Tools and Information Center
 
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 LibraryOverview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
The HDF-EOS Tools and Information Center
 
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 LibraryOverview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
The HDF-EOS Tools and Information Center
 
Parallel HDF5 Introductory Tutorial
Parallel HDF5 Introductory TutorialParallel HDF5 Introductory Tutorial
Parallel HDF5 Introductory Tutorial
The HDF-EOS Tools and Information Center
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11
Henry Schreiner
 
HDF5 Tools
HDF5 ToolsHDF5 Tools
Overview of Parallel HDF5
Overview of Parallel HDF5Overview of Parallel HDF5
Overview of Parallel HDF5
The HDF-EOS Tools and Information Center
 
The MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 InterfaceThe MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 Interface
The HDF-EOS Tools and Information Center
 
Product Designer Hub - Taking HPD to the Web
Product Designer Hub - Taking HPD to the WebProduct Designer Hub - Taking HPD to the Web
Product Designer Hub - Taking HPD to the Web
The HDF-EOS Tools and Information Center
 
Advanced HDF5 Features
Advanced HDF5 FeaturesAdvanced HDF5 Features
Docopt, beautiful command-line options for R, user2014
Docopt, beautiful command-line options for R,  user2014Docopt, beautiful command-line options for R,  user2014
Docopt, beautiful command-line options for R, user2014
Edwin de Jonge
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
HDF5 Advanced Topics
HDF5 Advanced TopicsHDF5 Advanced Topics
Dimension Scales in HDF-EOS2 and HDF-EOS5
Dimension Scales in HDF-EOS2 and HDF-EOS5 Dimension Scales in HDF-EOS2 and HDF-EOS5
Dimension Scales in HDF-EOS2 and HDF-EOS5
The HDF-EOS Tools and Information Center
 
This project is the first projects you will be working on this quart.pdf
This project is the first projects you will be working on this quart.pdfThis project is the first projects you will be working on this quart.pdf
This project is the first projects you will be working on this quart.pdf
eyewaregallery
 
Spsl iv unit final
Spsl iv unit  finalSpsl iv unit  final
Spsl iv unit final
Sasidhar Kothuru
 
Spsl iv unit final
Spsl iv unit  finalSpsl iv unit  final
Spsl iv unit final
Sasidhar Kothuru
 

Similar to Using HDF5 and Python: The H5py module (20)

The Python Programming Language and HDF5: H5Py
The Python Programming Language and HDF5: H5PyThe Python Programming Language and HDF5: H5Py
The Python Programming Language and HDF5: H5Py
 
Substituting HDF5 tools with Python/H5py scripts
Substituting HDF5 tools with Python/H5py scriptsSubstituting HDF5 tools with Python/H5py scripts
Substituting HDF5 tools with Python/H5py scripts
 
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 LibraryOverview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
 
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 LibraryOverview of Parallel HDF5 and Performance Tuning in HDF5 Library
Overview of Parallel HDF5 and Performance Tuning in HDF5 Library
 
Parallel HDF5 Introductory Tutorial
Parallel HDF5 Introductory TutorialParallel HDF5 Introductory Tutorial
Parallel HDF5 Introductory Tutorial
 
What's new in Python 3.11
What's new in Python 3.11What's new in Python 3.11
What's new in Python 3.11
 
HDF5 Tools
HDF5 ToolsHDF5 Tools
HDF5 Tools
 
Overview of Parallel HDF5
Overview of Parallel HDF5Overview of Parallel HDF5
Overview of Parallel HDF5
 
The MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 InterfaceThe MATLAB Low-Level HDF5 Interface
The MATLAB Low-Level HDF5 Interface
 
Product Designer Hub - Taking HPD to the Web
Product Designer Hub - Taking HPD to the WebProduct Designer Hub - Taking HPD to the Web
Product Designer Hub - Taking HPD to the Web
 
Advanced HDF5 Features
Advanced HDF5 FeaturesAdvanced HDF5 Features
Advanced HDF5 Features
 
Docopt, beautiful command-line options for R, user2014
Docopt, beautiful command-line options for R,  user2014Docopt, beautiful command-line options for R,  user2014
Docopt, beautiful command-line options for R, user2014
 
Introduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming ModelsIntroduction to HDF5 Data and Programming Models
Introduction to HDF5 Data and Programming Models
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
 
Introduction to HDF5
Introduction to HDF5Introduction to HDF5
Introduction to HDF5
 
HDF5 Advanced Topics
HDF5 Advanced TopicsHDF5 Advanced Topics
HDF5 Advanced Topics
 
Dimension Scales in HDF-EOS2 and HDF-EOS5
Dimension Scales in HDF-EOS2 and HDF-EOS5 Dimension Scales in HDF-EOS2 and HDF-EOS5
Dimension Scales in HDF-EOS2 and HDF-EOS5
 
This project is the first projects you will be working on this quart.pdf
This project is the first projects you will be working on this quart.pdfThis project is the first projects you will be working on this quart.pdf
This project is the first projects you will be working on this quart.pdf
 
Spsl iv unit final
Spsl iv unit  finalSpsl iv unit  final
Spsl iv unit final
 
Spsl iv unit final
Spsl iv unit  finalSpsl iv unit  final
Spsl iv unit final
 

More from The HDF-EOS Tools and Information Center

Cloud-Optimized HDF5 Files
Cloud-Optimized HDF5 FilesCloud-Optimized HDF5 Files
Cloud-Optimized HDF5 Files
The HDF-EOS Tools and Information Center
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
The HDF-EOS Tools and Information Center
 
The State of HDF
The State of HDFThe State of HDF
Highly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance FeaturesHighly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance Features
The HDF-EOS Tools and Information Center
 
Creating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 FilesCreating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 Files
The HDF-EOS Tools and Information Center
 
HDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance DiscussionHDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance Discussion
The HDF-EOS Tools and Information Center
 
Hyrax: Serving Data from S3
Hyrax: Serving Data from S3Hyrax: Serving Data from S3
Hyrax: Serving Data from S3
The HDF-EOS Tools and Information Center
 
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLABAccessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
The HDF-EOS Tools and Information Center
 
HDF - Current status and Future Directions
HDF - Current status and Future DirectionsHDF - Current status and Future Directions
HDF - Current status and Future Directions
The HDF-EOS Tools and Information Center
 
HDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and FutureHDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and Future
The HDF-EOS Tools and Information Center
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
The HDF-EOS Tools and Information Center
 
H5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only LibraryH5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only Library
The HDF-EOS Tools and Information Center
 
MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10
The HDF-EOS Tools and Information Center
 
HDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDFHDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDF
The HDF-EOS Tools and Information Center
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server FeaturesHDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server Features
The HDF-EOS Tools and Information Center
 
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
The HDF-EOS Tools and Information Center
 
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
The HDF-EOS Tools and Information Center
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
The HDF-EOS Tools and Information Center
 
HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020HDF5 Roadmap 2019-2020

More from The HDF-EOS Tools and Information Center (20)

Cloud-Optimized HDF5 Files
Cloud-Optimized HDF5 FilesCloud-Optimized HDF5 Files
Cloud-Optimized HDF5 Files
 
Accessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDSAccessing HDF5 data in the cloud with HSDS
Accessing HDF5 data in the cloud with HSDS
 
The State of HDF
The State of HDFThe State of HDF
The State of HDF
 
Highly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance FeaturesHighly Scalable Data Service (HSDS) Performance Features
Highly Scalable Data Service (HSDS) Performance Features
 
Creating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 FilesCreating Cloud-Optimized HDF5 Files
Creating Cloud-Optimized HDF5 Files
 
HDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance DiscussionHDF5 OPeNDAP Handler Updates, and Performance Discussion
HDF5 OPeNDAP Handler Updates, and Performance Discussion
 
Hyrax: Serving Data from S3
Hyrax: Serving Data from S3Hyrax: Serving Data from S3
Hyrax: Serving Data from S3
 
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLABAccessing Cloud Data and Services Using EDL, Pydap, MATLAB
Accessing Cloud Data and Services Using EDL, Pydap, MATLAB
 
HDF - Current status and Future Directions
HDF - Current status and Future DirectionsHDF - Current status and Future Directions
HDF - Current status and Future Directions
 
HDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and FutureHDFEOS.org User Analsys, Updates, and Future
HDFEOS.org User Analsys, Updates, and Future
 
HDF - Current status and Future Directions
HDF - Current status and Future Directions HDF - Current status and Future Directions
HDF - Current status and Future Directions
 
H5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only LibraryH5Coro: The Cloud-Optimized Read-Only Library
H5Coro: The Cloud-Optimized Read-Only Library
 
MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10MATLAB Modernization on HDF5 1.10
MATLAB Modernization on HDF5 1.10
 
HDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDFHDF for the Cloud - Serverless HDF
HDF for the Cloud - Serverless HDF
 
HDF5 <-> Zarr
HDF5 <-> ZarrHDF5 <-> Zarr
HDF5 <-> Zarr
 
HDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server FeaturesHDF for the Cloud - New HDF Server Features
HDF for the Cloud - New HDF Server Features
 
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
Apache Drill and Unidata THREDDS Data Server for NASA HDF-EOS on S3
 
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
STARE-PODS: A Versatile Data Store Leveraging the HDF Virtual Object Layer fo...
 
HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?HDF5 and Ecosystem: What Is New?
HDF5 and Ecosystem: What Is New?
 
HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020HDF5 Roadmap 2019-2020
HDF5 Roadmap 2019-2020
 

Using HDF5 and Python: The H5py module

  • 1. Using HDF5 and Python: The H5py module Daniel Kahn Science Systems and Applications, Inc. Acknowledgement: Thanks to Ed Masuoka, NASA Contract NNG06HX18C HDF & HDF-EOS Workshop XV 17 April 2012
  • 2. Python has lists:
        >>> for elem in ['FirstItem','SecondItem','ThirdItem']:
        ...     print elem
        ...
        FirstItem
        SecondItem
        ThirdItem
        >>>
    We can assign the list to a variable.
        >>> MyList = ['FirstItem','SecondItem','ThirdItem']
        >>> for elem in MyList:
        ...     print elem
        ...
        FirstItem
        SecondItem
        ThirdItem
        >>>
  • 3. Lists can contain a mix of objects:
        >>> MixedList = ['MyString',5,[72, 99.44]]
        >>> for elem in MixedList:
        ...     print elem
        ...
        MyString
        5
        [72, 99.44]
    The last element, [72, 99.44], is a list inside a list.
    Lists can be addressed by index:
        >>> MixedList[0]
        'MyString'
        >>> MixedList[2]
        [72, 99.44]
  • 4. A note about Python lists: Python lists are one-dimensional, and arithmetic operations don’t work on them element by element (for example, + concatenates two lists rather than adding their elements). Don’t be tempted to use them for scientific array-based data sets. More on the ‘right way’ later...
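    To make the note about lists concrete, here is a minimal sketch contrasting list operators with array operators. It assumes NumPy is installed and uses Python 3 syntax; the variable names are illustrative only.

```python
import numpy as np

# Plain Python lists: + concatenates and * repeats.
a_list = [1, 2, 3]
b_list = [10, 20, 30]
list_sum = a_list + b_list          # concatenation, not addition
list_times_two = a_list * 2         # repetition, not scaling

# NumPy arrays: the same operators act element by element.
a_arr = np.array(a_list)
b_arr = np.array(b_list)
array_sum = a_arr + b_arr
array_times_two = a_arr * 2

# Arrays can also be multi-dimensional, unlike a flat list.
grid = np.arange(6).reshape(2, 3)

print(list_sum, list_times_two)
print(array_sum, array_times_two, grid.shape)
```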
  • 5. Python has dictionaries. Dictionaries are key,value pairs:
        >>> Dictionary = {'FirstKey':'FirstValue',
        ...               'SecondKey':'SecondValue',
        ...               'ThirdKey':'ThirdValue'}
        >>> Dictionary
        {'SecondKey': 'SecondValue', 'ThirdKey': 'ThirdValue', 'FirstKey': 'FirstValue'}
        >>>
    Notice that Python prints the key,value pairs in a different order than I typed them. In the Python version used here, the key,value pairs in a dictionary are unordered; since Python 3.7, dictionaries preserve insertion order.
  • 6. Dictionaries are not lists; however, we can easily create a list of the dictionary keys:
        >>> list(Dictionary)
        ['SecondKey', 'ThirdKey', 'FirstKey']
        >>>
    We can use a dictionary in a loop without additional elaboration:
        >>> for Key in Dictionary:
        ...     print Key,"---->",Dictionary[Key]
        ...
        SecondKey ----> SecondValue
        ThirdKey ----> ThirdValue
        FirstKey ----> FirstValue
        >>>
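    For readers on Python 3, the same dictionary loop might be written as in the sketch below. The use of items() and an f-string is a choice made for this illustration, not something taken from the slides.

```python
# Python 3 syntax; since Python 3.7 a dict remembers insertion order.
dictionary = {
    "FirstKey": "FirstValue",
    "SecondKey": "SecondValue",
    "ThirdKey": "ThirdValue",
}

keys = list(dictionary)  # a list of the dictionary keys

# items() yields (key, value) pairs, avoiding a second lookup per key.
pairs = [f"{key} ----> {value}" for key, value in dictionary.items()]
for line in pairs:
    print(line)
```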
  • 7. HDF5 is made of “dictionaries”: a dataset name is the key, and the array is the value. HDFView is a tool which shows us the keys (TreeView) and the values (TableView) of an HDF5 file.
  • 8. Andrew Collette’s H5py module allows us to use Python and HDF5 together. We can use H5py to manipulate HDF5 files as if they were Python dictionaries. The dataset names are the keys and the datasets are the values:
        >>> import h5py
        >>> in_fid = h5py.File('DansExample1.h5','r')
        >>> for DS in in_fid:
        ...     print DS,"------->",in_fid[DS]
        ...
        FirstDataset -------> <HDF5 dataset "FirstDataset": shape (25,), type "<i4">
        SecondDataset -------> <HDF5 dataset "SecondDataset": shape (3, 3), type "<i4">
        ThirdDataset -------> <HDF5 dataset "ThirdDataset": shape (5, 5), type "<i4">
        >>>
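    The dictionary-style access shown above can be tried without the author's DansExample1.h5 file. The sketch below is an assumption-laden stand-in: it builds a small in-memory file (hypothetical name sketch_example.h5, using the h5py "core" driver with backing_store=False so nothing is written to disk) with datasets of the same names and shapes, then iterates over it in Python 3 syntax.

```python
import h5py
import numpy as np

# Hypothetical file name; the "core" driver keeps the file in memory only.
with h5py.File("sketch_example.h5", "w", driver="core", backing_store=False) as fid:
    # Assigning an array to a key creates a dataset, just like a dict entry.
    fid["FirstDataset"] = np.arange(25, dtype="int32")
    fid["SecondDataset"] = np.zeros((3, 3), dtype="int32")
    fid["ThirdDataset"] = np.ones((5, 5), dtype="int32")

    names = sorted(fid.keys())                       # keys, as with a dict
    shapes = {name: fid[name].shape for name in fid}  # values are Dataset objects
    has_first = "FirstDataset" in fid                # membership test works too

    for name in fid:
        print(name, "------->", fid[name])
```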
  • 9. So what? We need to be able to manipulate the arrays, not just the file. The Numpy module by Travis Oliphant allows the manipulation of arrays in Python. We will see examples of writing arrays later, but to get arrays from the H5py object we use the ellipsis ([...]):
        >>> import h5py
        >>> fid = h5py.File('DansExample1.h5','r')
        >>> fid['FirstDataset']
        <HDF5 dataset "FirstDataset": shape (25,), type "<i4">
        >>> fid['FirstDataset'][...]
        array([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16,
               17, 18, 19, 20, 21, 22, 23, 24])
        >>> type(fid['FirstDataset'][...])
        <type 'numpy.ndarray'>
        >>>
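    As a small sketch of the step from dataset to array, the following reads a whole dataset with [...] and part of it with an ordinary slice. The in-memory file and its name (sketch_read.h5) are assumptions made so the example is self-contained; they are not part of the talk.

```python
import h5py
import numpy as np

# Hypothetical in-memory file standing in for the author's example file.
with h5py.File("sketch_read.h5", "w", driver="core", backing_store=False) as fid:
    fid["FirstDataset"] = np.arange(25, dtype="int32")
    dset = fid["FirstDataset"]       # an h5py Dataset, not yet a NumPy array

    whole = dset[...]                # read everything into a numpy.ndarray
    first_five = dset[:5]            # read only the first five elements
    is_array = isinstance(whole, np.ndarray)

# The arrays stay usable after the file is closed.
doubled = whole * 2                  # element-wise arithmetic
print(type(whole).__name__, first_five, doubled[-1])
```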
  • 10. Reasons to use Python and HDF5 instead of C or Fortran: The basic Python dictionary object has a close similarity to the HDF5 group. The object-oriented and dynamic nature of Python allows the existing dictionary syntax to be repurposed for HDF5 manipulation. In short, working with HDF5 in Python requires much less code than C or Fortran, which means faster development and fewer errors.
  • 11. Comparison to C, h5_gzip:
        Language                  # Lines of code
        C                         106
        Python (from THG site)    37
    Fewer lines of code means fewer places to make mistakes. The 37-line h5_gzip.py example is a “direct” translation of the C version. Some more advanced techniques offer insight into the advantages of Python/H5py programming. Text in the next slides is color coded to help match code with the same functionality. First, writing a file…
  • 12. Original h5_gzip.py:
        # This example creates and writes GZIP compressed dataset.
        import h5py
        import numpy as np
        # Create gzip.h5 file.
        #
        file = h5py.File('gzip.h5','w')
        #
        # Create /DS1 dataset; in order to use compression, dataset has to be chunked.
        #
        dataset = file.create_dataset('DS1',(32,64),'i',chunks=(4,8),compression='gzip',compression_opts=9)
        #
        # Initialize data.
        #
        data = np.zeros((32,64))
        for i in range(32):
            for j in range(64):
                data[i][j]= i*j-j
        # Write data.
        print "Writing data..."
        dataset[...] = data
        file.close()
    Pythonic h5_gzip.py:
        #!/usr/bin/env python
        # It's a UNIX thing.....
        from __future__ import print_function # Code will work with python 3 as well....
        # This example creates and writes GZIP compressed dataset.
        import h5py # load the HDF5 interface module
        import numpy as np # Load the array processing module
        # Initialize data. Note the numbers 32 and 64 only appear ONCE in the code!
        LeftVector = np.arange(-1,32-1,dtype='int32')
        RightVector = np.arange(64,dtype='int32')
        DataArray = np.outer(LeftVector,RightVector) # create 32x64 array of i*j-j
        # The _with_ construct will automatically create and close the HDF5 file
        with h5py.File('gzip-pythonic.h5','w') as h5_fid:
            # Create and write /DS1 dataset; in order to use compression, dataset has to be chunked.
            h5_fid.create_dataset('DS1',data=DataArray,chunks=(4,8),compression='gzip',compression_opts=9)
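    The claim that the outer product reproduces the nested loops can be checked directly: (i - 1) * j equals i*j - j. The sketch below does this with NumPy alone. It uses int32 for both arrays, whereas the original loop version fills a default floating-point array, and the variable names are illustrative.

```python
import numpy as np

n_rows, n_cols = 32, 64

# Loop version, as in the direct translation of the C example.
loop_data = np.zeros((n_rows, n_cols), dtype="int32")
for i in range(n_rows):
    for j in range(n_cols):
        loop_data[i, j] = i * j - j

# Vectorized version: row values run from -1 to 30, so (i - 1) * j == i*j - j.
left = np.arange(-1, n_rows - 1, dtype="int32")
right = np.arange(n_cols, dtype="int32")
outer_data = np.outer(left, right)

same = bool(np.array_equal(loop_data, outer_data))
print(same, outer_data.shape)
```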
  • 13. Reading data….
    Original:
        # Read data back; display compression properties and dataset max value.
        #
        file = h5py.File('gzip.h5','r')
        dataset = file['DS1']
        print "Compression method is", dataset.compression
        print "Compression parameter is", dataset.compression_opts
        data = dataset[...]
        print "Maximum value in", dataset.name, "is:", max(data.ravel())
        file.close()
    Pythonic:
        # Read data back; display compression properties and dataset max value.
        #
        with h5py.File('gzip-pythonic.h5','r') as h5_fid:
            dataset = h5_fid['DS1']
            print("Compression method is", dataset.compression)
            print("Compression parameter is", dataset.compression_opts)
            print("Maximum value in", dataset.name, "is:", dataset.value.max())
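    The .value attribute used in the Pythonic version was later deprecated and removed in h5py 3.0. A sketch of the same read using indexing instead is given below; it writes the compressed dataset to a hypothetical in-memory file (sketch_gzip.h5, "core" driver) so that it runs on its own, and it assumes the gzip filter is available in the local HDF5 build.

```python
import h5py
import numpy as np

# Same data as the slides: a 32x64 array of i*j - j.
data = np.outer(np.arange(-1, 31, dtype="int32"), np.arange(64, dtype="int32"))

# Hypothetical in-memory file; nothing is written to disk.
with h5py.File("sketch_gzip.h5", "w", driver="core", backing_store=False) as h5_fid:
    h5_fid.create_dataset("DS1", data=data, chunks=(4, 8),
                          compression="gzip", compression_opts=9)

    dataset = h5_fid["DS1"]
    method = dataset.compression           # name of the compression filter
    level = dataset.compression_opts       # filter setting (gzip level)
    max_value = int(dataset[...].max())    # read into memory, then reduce

    print("Compression method is", method)
    print("Compression parameter is", level)
    print("Maximum value in", dataset.name, "is:", max_value)
```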
  • 14. And finally, just to see what the file looks like…
  • 15. Real world example: Table Comparison. Background: For the OMPS instruments we need to design binary arrays to be uploaded to the satellite to sub-sample the CCD and so reduce the data rate. For ground processing use we store these arrays in HDF5. As part of the design process we want to be able to compare arrays in two different files.
  • 16. Here is an example of a Sample Table
  • 17. Here is another example:
  • 18. Here is the “difference” of the arrays. Red pixels are unique to the first array.
  • 19. The code: CompareST.py
        #!/usr/bin/env python
        """ Documentation """
        from __future__ import print_function,division
        import h5py
        import numpy
        import ViewFrame
        def CompareST(ST1,ST2,IntTime):
            with h5py.File(ST1,'r') as st1_fid,h5py.File(ST2,'r') as st2_fid:
                ST1 = st1_fid['/DATA/'+IntTime+'/SampleTable'].value
                ST2 = st2_fid['/DATA/'+IntTime+'/SampleTable'].value
            ST1[ST1!=0] = 1
            ST2[ST2!=0] = 1
            Diff = (ST1 - ST2)
            ST1[Diff == 1] = 2
            ViewFrame.ViewFrame(ST1)
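    The array logic in CompareST can be followed on a tiny example. The sketch below reproduces only the marking step with NumPy; the function name, the signed int8 masks, and the two 2x3 input arrays are assumptions made for illustration, and the author's ViewFrame display module is not used.

```python
import numpy as np

def mark_unique_to_first(table1, table2):
    """Return 0 = not set in the first table, 1 = set in both, 2 = set only in the first."""
    mask1 = (np.asarray(table1) != 0).astype("int8")
    mask2 = (np.asarray(table2) != 0).astype("int8")
    diff = mask1 - mask2      # 1 where only the first is set, -1 where only the second is
    result = mask1.copy()
    result[diff == 1] = 2     # flag pixels unique to the first table
    return result

# Made-up sample tables; any non-zero entry counts as "set".
st1 = np.array([[0, 5, 5], [7, 0, 0]])
st2 = np.array([[0, 5, 0], [0, 3, 0]])
marked = mark_unique_to_first(st1, st2)
print(marked)
```

    Unlike the slide code, this version leaves its inputs unmodified and returns a new array.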
  • 20. ..and the command line argument parsing.
        if __name__ == "__main__":
            import argparse
            OptParser = argparse.ArgumentParser(description = __doc__)
            OptParser.add_argument("--ST1",help="SampleTableFile1")
            OptParser.add_argument("--ST2",help="SampleTableFile2")
            OptParser.add_argument("--IntTime",help="Integration Time", default='Long')
            options = OptParser.parse_args()
            CompareST(options.ST1,options.ST2,options.IntTime)
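    To see what the parser produces without running the script from a shell, the arguments can be passed as a list. This is a sketch only: the file names a.h5 and b.h5 and the description string are hypothetical.

```python
import argparse

def build_parser():
    # Same three options as the slide; the description text is a placeholder.
    parser = argparse.ArgumentParser(description="Compare two sample tables.")
    parser.add_argument("--ST1", help="SampleTableFile1")
    parser.add_argument("--ST2", help="SampleTableFile2")
    parser.add_argument("--IntTime", help="Integration Time", default="Long")
    return parser

# Hypothetical file names, passed explicitly instead of being read from sys.argv.
options = build_parser().parse_args(["--ST1", "a.h5", "--ST2", "b.h5"])
print(options.ST1, options.ST2, options.IntTime)
```

    Because --IntTime is omitted here, the parser falls back to its default.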
  • 21. Recursive descent into HDF5 file. Print group names, number of children and dataset names.
        #!/usr/bin/env python
        from __future__ import print_function
        import h5py
        def print_num_children(obj):
            if isinstance(obj,h5py.highlevel.Group):
                print(obj.name,"Number of Children:",len(obj))
                for ObjName in obj: # ObjName will be a string
                    print_num_children(obj[ObjName])
            else:
                print(obj.name,"Not a group")
        with h5py.File('OMPS-NPP-NPP-LP_STB', 'r+') as f:
            print_num_children(f)
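    In recent h5py releases the group class is available directly as h5py.Group. The sketch below applies the same recursive descent to a small in-memory file; the file name sketch_tree.h5 and the three datasets are made up for illustration, and the function collects its findings in a list rather than printing as it goes.

```python
import h5py
import numpy as np

def describe(obj, lines):
    """Append (name, child count) for groups and (name, None) for datasets."""
    if isinstance(obj, h5py.Group):       # an h5py.File is also a Group
        lines.append((obj.name, len(obj)))
        for name in obj:                  # name is a string
            describe(obj[name], lines)
    else:
        lines.append((obj.name, None))
    return lines

# Hypothetical in-memory file; intermediate groups are created from the paths.
with h5py.File("sketch_tree.h5", "w", driver="core", backing_store=False) as f:
    f.create_dataset("DATA/Long/Gain", data=np.ones(4))
    f.create_dataset("DATA/Long/BadPixelTable", data=np.zeros(4))
    f.create_dataset("DATA/AutoSplitLong", data=np.arange(3))
    report = describe(f, [])

for name, n_children in report:
    if n_children is None:
        print(name, "Not a group")
    else:
        print(name, "Number of Children:", n_children)
```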
  • 22. The Result….
        ssai-s01033@dkahn: ~/python % ./print_num_children.py
        / Number of Children: 1
        /DATA Number of Children: 10
        /DATA/AutoSplitLong Not a group
        /DATA/AutoSplitShort Not a group
        /DATA/AuxiliaryData Number of Children: 6
        /DATA/AuxiliaryData/FeatureNames Not a group
        /DATA/AuxiliaryData/InputSpecification Not a group
        /DATA/AuxiliaryData/LongLowEndSaturationEstimate Not a group
        /DATA/AuxiliaryData/ShortLowEndSaturationEstimate Not a group
        /DATA/AuxiliaryData/Timings Number of Children: 2
        /DATA/AuxiliaryData/Timings/Long Not a group
        /DATA/AuxiliaryData/Timings/Short Not a group
        /DATA/AuxiliaryData/dummy Not a group
        /DATA/Long Number of Children: 14
        /DATA/Long/BadPixelTable Not a group
        /DATA/Long/BinTransitionTable Not a group
        /DATA/Long/FeatureNamesIndexes Not a group
        /DATA/Long/Gain Not a group
        /DATA/Long/InverseOMPSColumns Not a group
  • 23. Summary: Python with the H5py and Numpy modules makes developing programs to manipulate HDF5 files and perform calculations with HDF5 arrays simpler, which increases development speed and reduces errors.