@AU_EarthObs
SPD and KEA: HDF5 based file formats for Earth Observation
Pete Bunting (1), John Armston (2), Sam Gillingham (3), Neil Flood (4)
1. Aberystwyth University, UK (pfb@aber.ac.uk)
2. University of Maryland, USA (armston@umd.edu)
3. Landcare Research, NZ (gillingham.sam@gmail.com)
4. Science Division, Queensland Government, Australia (neil.flood@dsiti.qld.gov.au)
Contents
• Sorted Pulse Data (SPD) Format
– For storing laser scanning data
• KEA Image File Format
– Implementation of the GDAL raster data model.
SPD: Little History…
• The first version of ‘SPDLib’ was written in 2008.
– ‘Sorted Point Data’ simply stored a 2D grid-based index alongside the points file.
• In 2009 I was using an ENVI image file to store the header information (as a 2-band image). Having multiple files per dataset wasn’t ideal, and LAS was missing fields (e.g., height) that I wanted for processing.
– A colleague suggested looking at HDF5.
• In 2011 John Armston visited Aberystwyth with a set of full-waveform acquisitions for use in his PhD.
– ‘Sorted Pulse Data’ was born.
Why a Pulse?
[Video: transmitted and received pulse, created by John Armston using the SPDLib Python binding.]
SPD File Format
SPD Pulse: Pulse ID, GPSTime, Origin [X, Y, Z, H], Index [X, Y], Azimuth, Zenith, TransmitAmplitude, TransmitWidth, SourceID, Wavelength, NumberOfReturns, Returns, NumberOfTransmittedBins, TransmittedBins, NumberOfReceivedBins, ReceivedBins
SPD Point: Point ID, GPSTime, Location [X, Y, Z, H], Classification, Amplitude, Width, Range, Red, Green, Blue, WaveformOffset
Each SPD Pulse is followed by the SPD Points returned from it (three per pulse in the diagram).
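The pulse and point records above can be sketched in memory as two small Python classes. This is only an illustration of the relationship (one pulse owning zero or more points); the class names, attribute names, defaults and the subset of fields shown are assumptions, not the SPDLib API or the on-disk layout.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SPDPoint:
    """One return from a pulse (hypothetical subset of the SPD Point fields)."""
    point_id: int
    gps_time: float
    location: Tuple[float, float, float, float]  # X, Y, Z, H
    classification: int = 0
    amplitude: float = 0.0
    width: float = 0.0
    point_range: float = 0.0
    waveform_offset: int = 0


@dataclass
class SPDPulse:
    """One emitted pulse with its returns and optional waveform bins."""
    pulse_id: int
    gps_time: float
    origin: Tuple[float, float, float, float]  # X, Y, Z, H
    index: Tuple[float, float]                 # X, Y used for spatial indexing
    azimuth: float
    zenith: float
    returns: List[SPDPoint] = field(default_factory=list)
    transmitted_bins: List[int] = field(default_factory=list)
    received_bins: List[int] = field(default_factory=list)

    @property
    def number_of_returns(self) -> int:
        # Derived from the list so the count cannot drift from the data.
        return len(self.returns)


# Illustrative values only.
pulse = SPDPulse(1, 0.0, (10.0, 20.0, 500.0, 500.0), (10.0, 20.0), 0.0, 0.0)
pulse.returns.append(SPDPoint(1, 0.0, (10.0, 20.0, 35.0, 12.0)))
pulse.returns.append(SPDPoint(2, 0.0, (10.0, 20.0, 23.0, 0.0)))
```

A pulse with no returns is still a valid record here, which is one reason to store pulses rather than only points.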
Sorted…
Indexing makes processing faster
– A) Cartesian (X, Y)
– B) Spherical (Azimuth, Zenith)
– C) Polar (Radius, Azimuth)
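As a rough sketch of why sorting helps, the Cartesian case can be mimicked by assigning each pulse to a grid cell, reordering the pulses so that each cell is contiguous, and recording an offset and a count per cell. The function names, the lower-left grid origin and the square bin size are assumptions made for this example; this is not SPDLib's implementation.

```python
import math
from collections import defaultdict


def build_grid_index(pulse_xy, x_min, y_min, bin_size):
    """Map (row, col) grid cells to the indices of the pulses that fall in them."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(pulse_xy):
        col = math.floor((x - x_min) / bin_size)
        row = math.floor((y - y_min) / bin_size)
        cells[(row, col)].append(i)
    return cells


def sort_by_cell(pulse_xy, cells):
    """Reorder pulses so each cell is contiguous; return (offset, count) per cell."""
    ordered = []
    lookup = {}
    for key in sorted(cells):
        lookup[key] = (len(ordered), len(cells[key]))
        ordered.extend(pulse_xy[i] for i in cells[key])
    return ordered, lookup


pulse_xy = [(0.5, 0.5), (2.5, 0.2), (0.7, 0.1), (2.9, 2.9)]
cells = build_grid_index(pulse_xy, x_min=0.0, y_min=0.0, bin_size=1.0)
ordered, lookup = sort_by_cell(pulse_xy, cells)

# Reading one cell is now a single contiguous slice.
offset, count = lookup[(0, 0)]
cell_pulses = ordered[offset:offset + count]
```

The same idea applies to the spherical and polar cases by binning on (azimuth, zenith) or (radius, azimuth) instead of (X, Y).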
SPD & HDF5
Why HDF5?
• Another file format…
– Not just another block of binary you cannot do anything with unless you have a format definition.
• Fields can be logically named and data types defined and read from the file.
– Self describing.
HDF5 file layout:
– Header: Header Field 1 … Header Field n
– Index: Bin Offset, Number of Pulses
– Quicklook Image
– Data: Pulses, Points, Received, Transmitted
Compression
• zlib compression is used by default
– Provided by the HDF5 library
– Compression block size can be varied using SPD header parameters
• File sizes are on average slightly smaller than an uncompressed LAS file but larger than LAZ.
– More complex data structures
– Two pieces of information: pulse and point(s)
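To illustrate what a compression block size controls, the sketch below compresses a list of values in independent zlib blocks using only the Python standard library. HDF5 applies its deflate filter per chunk in a comparable way, but the helper names and the synthetic data here are assumptions for illustration and do not use the HDF5 or SPDLib APIs.

```python
import struct
import zlib


def compress_in_blocks(values, block_size):
    """Pack floats as little-endian doubles and zlib-compress each block separately."""
    blocks = []
    for start in range(0, len(values), block_size):
        chunk = values[start:start + block_size]
        raw = struct.pack(f"<{len(chunk)}d", *chunk)
        blocks.append(zlib.compress(raw))
    return blocks


def read_block(blocks, block_index):
    """Decompress a single block without touching the others."""
    raw = zlib.decompress(blocks[block_index])
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


# Synthetic, highly repetitive data; real laser scanning data will compress differently.
values = [float(i % 50) for i in range(10_000)]

for block_size in (100, 1_000, 10_000):
    blocks = compress_in_blocks(values, block_size)
    total_bytes = sum(len(b) for b in blocks)
    print(block_size, len(blocks), total_bytes)

# Only block 3 is decompressed to read values 3000..3999.
subset = read_block(compress_in_blocks(values, 1_000), 3)
```

The trade-off is that larger blocks usually compress better, while smaller blocks mean less data has to be decompressed to read a small region.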
KEA: Little History…
• Created in 2012 and funded by Landcare Research, NZ.
• The problem:
“How to have large attribute tables of data alongside raster data?”
• The Erdas Imagine format (HFA, *.img) supports attribute tables, but compression is only supported for 32-bit file sizes (i.e., < 2 GB).
– Attribute tables are also uncompressed.
• BigTIFF supports large raster imagery but not attribute tables.
• The initial implementation used an HDF5 file for the attribute table with a separate image file (e.g., TIFF).
– This was untidy, and having to keep track of multiple files is not desirable.
• “Why not just put the image in the HDF5 file with a GDAL driver?”
– The result: the KEA HDF5 schema.
Raster Storage: KEA file format
• HDF5 based image file format
• GDAL driver
– Therefore the format can be used in any GDAL-compatible software (e.g., ArcMap)
• Support for large raster attribute tables
• zlib based compression
– Small file sizes
– 10 m SPOT mosaic of New Zealand: ~5 GB per island (each approx. 65,000 × 84,000 pixels)
Bunting and Gillingham 2013
KEA File Structure
Kea Image
– Header: File Type, Generator, Number of bands, Resolution, Rotation, Size, TL Coord, Version, WKT
– Meta Data: Name: Value pairs
– GCPs: GCPs, WKT
– Band 1, Band 2 … Band n, each containing:
  – Image
  – Layer Type, Data Type, Description, Band Usage
  – Band Mask
  – Meta Data: Name: Value pairs
  – Overviews: Overview 1, Overview 2 … Overview n
  – ATT (attribute table):
    – Header: Size, Chunk Size, Boolean Fields, Integer Fields, Double Fields, String Fields
    – Data: Boolean Data, Integer Data, Double Data, String Data
    – Neighbours
• This structure is essentially the GDAL raster data model.
• GDAL is the de facto standard for EO raster data I/O.
• Used in open source and commercial software (e.g., ESRI).
• We added a few additions for our own needs.
• The attribute table has the concept of ‘neighbours’ to allow traversal of a set of clumps (e.g., object-oriented image classification).
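The ‘neighbours’ idea can be sketched as a list of adjacent clump IDs per attribute table row, which makes region-style traversal a plain graph walk. The dictionary layout, function name and class labels below are hypothetical and only illustrate the traversal; they are not the KEA library API.

```python
from collections import deque


def connected_clumps(neighbours, start, same_group):
    """Breadth-first walk from `start` through neighbouring clumps in the same group."""
    seen = {start}
    queue = deque([start])
    while queue:
        clump = queue.popleft()
        for other in neighbours[clump]:
            if other not in seen and same_group(clump, other):
                seen.add(other)
                queue.append(other)
    return sorted(seen)


# Clump ID -> IDs of adjacent clumps (a 'ragged' structure: rows differ in length).
neighbours = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 5], 5: [4]}
# A hypothetical attribute column holding a class label per clump.
classes = {1: "forest", 2: "forest", 3: "water", 4: "forest", 5: "water"}

region = connected_clumps(neighbours, 1, lambda a, b: classes[a] == classes[b])
print(region)
```

Storing adjacency once in the file avoids recomputing it from the image each time an object-based rule needs to look at a clump's surroundings.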
KEA Size and Speed
Is HDF5 a good base?
• Yes. We’ve found it excellent.
– Coding is quick and relatively easy.
– No worrying about endianness etc.
• Originally SPD was developed on a PowerPC Mac.
– If used correctly, compression is good, with little overhead from the HDF5 structures.
– Possible to make complex and flexible data structures.
• However, it is the data structures in the file rather than the ‘file format’ that are the important thing.
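As a small illustration of the endianness problem that a raw binary format leaves to the reader, the sketch below writes an integer in big-endian byte order (as a PowerPC machine would natively) and reads it back with the wrong and the right byte order. HDF5 records the byte order as part of each datatype, so the library can convert on read; the example uses only Python's struct module and is not HDF5 code.

```python
import struct

value = 1_000_000

# Bytes as a big-endian machine would write a 32-bit integer.
big_endian_bytes = struct.pack(">i", value)

# A reader that assumes little-endian gets a different number.
misread = struct.unpack("<i", big_endian_bytes)[0]

# A reader that knows the stored byte order recovers the value.
correct = struct.unpack(">i", big_endian_bytes)[0]

print(misread == value, correct == value)
```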
However,
• Compound data types can reduce flexibility
– Not possible to dynamically add new fields (C struct)
• Use tables instead (as implemented in KEA attribute tables)
– i.e., a single data type per table
• No boolean data type (C data types)
– Store as int8; wasted space?
• No compression on ‘ragged’ data structures
• An HDF5 file can become fragmented
– Many changes (i.e., data added) happening within the file.
• Cannot remove data from the file
– Deleting does not reduce file size.
• Split data into suitable compression blocks and use / process data in those blocks.
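The difference between a compound record and a table of a single data type can be sketched with the Python standard library. The record layout, class name and field names below are hypothetical; the point is only that a fixed struct needs every row rewritten to gain a field, whereas a per-type table can gain a column without touching existing data.

```python
import struct
from array import array

# Compound layout: one fixed record per row (int32 id, float64 area, uint8 flag).
RECORD = struct.Struct("<idB")
rows = [RECORD.pack(1, 10.5, 1), RECORD.pack(2, 3.25, 0)]
# Adding a field here means defining a new layout and rewriting every row.


class ColumnTable:
    """A table in which every column shares one data type (e.g. all integers)."""

    def __init__(self, typecode, n_rows):
        self.typecode = typecode
        self.n_rows = n_rows
        self.names = []
        self.columns = []

    def add_field(self, name, fill=0):
        # New fields are appended as whole columns; existing columns are untouched.
        self.names.append(name)
        self.columns.append(array(self.typecode, [fill] * self.n_rows))

    def set(self, name, row, value):
        self.columns[self.names.index(name)][row] = value

    def get(self, name, row):
        return self.columns[self.names.index(name)][row]


int_table = ColumnTable("q", n_rows=3)   # 64-bit integer fields
int_table.add_field("class")
int_table.set("class", 1, 7)
int_table.add_field("n_pixels")          # added later, no rewrite of "class"
int_table.set("n_pixels", 1, 420)
```

A second `ColumnTable("d", n_rows=3)` would hold the floating-point fields, which mirrors the one-table-per-type arrangement described for the KEA attribute tables.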
SPD v4
• Updated version of SPD (v3 has been the version widely used)
• Learning lessons from SPD and KEA
– Removed compound data types; uses tables of a single data type instead.
– Made as much optional as possible.
– Multiple waveforms per pulse.
• Implemented in pyLiDAR
– http://pylidar.org/en/latest/spdv4format.html
• Pulses are very useful
– But sometimes points are all you need
• Multiple methods of spatially indexing the data are useful
– A 2D grid is useful for many but not all applications.
Questions

WordPress Websites for Engineers: Elevate Your Brand
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 

SPD and KEA: HDF5 based file formats for Earth Observation

  • 1. @AU_EarthObs. SPD and KEA: HDF5-based file formats for Earth Observation. Pete Bunting (1), John Armston (2), Sam Gillingham (3), Neil Flood (4). Affiliations: (1) Aberystwyth University, UK (pfb@aber.ac.uk); (2) University of Maryland, USA (armston@umd.edu); (3) Landcare Research, NZ (gillingham.sam@gmail.com); (4) Science Division, Queensland Government, Australia (neil.flood@dsiti.qld.gov.au).
  • 2. Contents. Sorted Pulse Data (SPD) format: for storing laser scanning data. KEA image file format: an implementation of the GDAL raster data model.
  • 3. SPD: A Little History… The first version of ‘SPDLib’ was written in 2008: ‘Sorted Point Data’ simply stored a 2D grid-based index alongside the points file. In 2009 I was using an ENVI image file to store the header information (as a 2-band image). Having multiple files per dataset wasn’t ideal, and LAS was missing fields (e.g., height) that I wanted for processing; a colleague suggested looking at HDF5. In 2011 John Armston visited Aberystwyth with a set of full-waveform acquisitions for use in his PhD, and ‘Sorted Pulse Data’ was born.
  • 4. Why a Pulse? [Video showing the transmitted and received signals; created by John Armston using the SPDLib Python binding.]
  • 5. SPD File Format. [Diagram: a sequence of SPD Pulse records, each shown with three associated SPD Point records; the same pulse/points grouping is repeated across the slide.] SPD Pulse fields: Pulse ID, GPSTime, Origin [X, Y, Z, H], Index [X, Y], Azimuth, Zenith, TransmitAmplitude, TransmitWidth, SourceID, Wavelength, NumberOfReturns, Returns, NumberOfTransmittedBins, TransmittedBins, NumberOfRecievedBins, RecievedBins. SPD Point fields: Point ID, GPSTime, Location [X, Y, Z, H], Classification, Amplitude, Width, Range, Red, Green, Blue, WaveformOffset.
  • 6. Sorted… Indexing makes processing faster. Three index types are illustrated: A) Cartesian (X, Y); B) Spherical (Azimuth, Zenith); C) Polar (Radius, Azimuth).
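To make the "sorted" idea concrete, here is a minimal sketch of a Cartesian grid index. It assumes the index keeps, for each bin, an offset into the sorted pulse list and a pulse count. The function names, the row-major bin numbering and the sample coordinates are illustrative assumptions, not the SPDLib API.

```python
def sort_and_index(pulses, x_min, y_min, bin_size, n_cols, n_rows):
    """Sort (x, y) pulses by grid bin and build per-bin offset/count lists."""
    def bin_of(pulse):
        col = int((pulse[0] - x_min) // bin_size)
        row = int((pulse[1] - y_min) // bin_size)
        return row * n_cols + col  # row-major bin number (an assumption)

    ordered = sorted(pulses, key=bin_of)

    # Count how many pulses fall in each bin.
    counts = [0] * (n_cols * n_rows)
    for pulse in ordered:
        counts[bin_of(pulse)] += 1

    # The offset of a bin is the number of pulses in all earlier bins.
    offsets = []
    total = 0
    for count in counts:
        offsets.append(total)
        total += count
    return ordered, offsets, counts


def pulses_in_bin(ordered, offsets, counts, col, row, n_cols):
    """Fetch one bin's pulses with a single slice, without scanning the rest."""
    b = row * n_cols + col
    return ordered[offsets[b]:offsets[b] + counts[b]]


# Hypothetical pulse index coordinates on a 4 x 2 grid of 1 m bins.
pulses = [(3.2, 0.7), (0.5, 0.5), (1.5, 0.2), (0.9, 0.1)]
ordered, offsets, counts = sort_and_index(pulses, 0.0, 0.0, 1.0, 4, 2)
first_bin = pulses_in_bin(ordered, offsets, counts, 0, 0, 4)
```

Because the pulses are stored in bin order, a spatial query becomes a contiguous read rather than a search through the whole file.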
  • 8. Why HDF5? Another file format… but not just another block of binary that you cannot do anything with unless you have a format definition. Fields can be logically named, and data types are defined in and read from the file, so it is self-describing. [Diagram: an HDF5 file containing Header (Header Field 1 … Header Field n), Index (Bin Offset, Number of Pulses), Quicklook Image, and Data (Pulses, Points, Received, Transmitted).]
  • 9. Compression. zlib compression is used by default; it is provided by the HDF5 library, and the compression block size can be varied using SPD header parameters. File sizes are on average slightly smaller than an uncompressed LAS file but larger than LAZ, because the data structures are more complex and there are two pieces of information: the pulse and its point(s).
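As a rough illustration of block-wise compression, the sketch below uses Python's standard `zlib` module directly rather than the HDF5 library. The block size, the helper names and the sample data are assumptions made for the example.

```python
import zlib


def compress_blocks(data: bytes, block_size: int):
    """Compress each fixed-size block independently."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_block(blocks, i):
    """Decompress only the requested block."""
    return zlib.decompress(blocks[i])


# 16 KiB of repetitive sample data split into four 4 KiB blocks.
data = bytes(range(256)) * 64
blocks = compress_blocks(data, 4096)
third = read_block(blocks, 2)
```

Smaller blocks make random access cheaper but usually compress less well, which is the trade-off behind making the block size a tunable header parameter.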
  • 10. KEA: A Little History… Created in 2012 and funded by Landcare Research, NZ. The problem: “How to have large attribute tables of data alongside raster data?” The Erdas Imagine format (HFA, *.img) supports attribute tables, but compression is only supported for 32-bit file sizes (i.e., < 2 GB), and attribute tables are uncompressed. BigTIFF supports large raster imagery but not attribute tables. The initial implementation used an HDF5 file for the attribute table with a separate image file (e.g., TIFF); this was untidy, and having to keep track of multiple files is not desirable. “Why not just put the image in the HDF5 file with a GDAL driver?” The result: the KEA HDF5 schema.
  • 11. Raster Storage: KEA file format. An HDF5-based image file format with a GDAL driver, so the format can be used in any GDAL-compatible software (e.g., ArcMap). Support for large raster attribute tables. zlib-based compression gives small file sizes: a 10 m SPOT mosaic of New Zealand is ~5 GB per island (each approx. 65000, 84000 pixels). (Bunting and Gillingham 2013)
  • 12. KEA File Structure. [Diagram labels: Kea Image: Header (File Type, Number of bands, Generator, Resolution, Rotation, Size, TL Coord, Version, WKT); Meta Data (Name: Value); GCPs (GCPs, WKT); Band 1 … Band n. Each band: ATT, Image, Layer Type, Data Type, Description, Overviews (Overview 1 … Overview n), Meta Data (Name: Value), Band Mask, Band Usage. ATT: Data (Boolean Data, Integer Data, Double Data, String Data); Header (Size, Chunk Size, Boolean Fields, Integer Fields, Double Fields, String Fields); Neighbours.] This structure is essentially the GDAL raster data model. GDAL is the de facto standard for EO raster data I/O and is used in open source and commercial software (e.g., ESRI). We added a few additions for our own needs. The attribute table has the concept of ‘neighbours’ to allow traversal of a set of clumps (e.g., for object-oriented image classification).
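The ‘neighbours’ idea can be sketched as a traversal over an adjacency list of clump IDs. This is a minimal illustration: the dictionary layout, the class labels and the function name are hypothetical, and it does not read a KEA file or use the KEA library.

```python
from collections import deque


def connected_clumps(neighbours, start, keep):
    """Breadth-first walk from `start`, following only neighbours that `keep` accepts."""
    seen = {start}
    queue = deque([start])
    while queue:
        clump = queue.popleft()
        for other in neighbours[clump]:
            if other not in seen and keep(other):
                seen.add(other)
                queue.append(other)
    return seen


# Hypothetical clump adjacency and per-clump class attribute.
neighbours = {1: [2, 3], 2: [1, 4], 3: [1], 4: [2, 5], 5: [4]}
class_of = {1: "forest", 2: "forest", 3: "water", 4: "forest", 5: "water"}

# All forest clumps reachable from clump 1 through other forest clumps.
region = connected_clumps(neighbours, 1, lambda c: class_of[c] == "forest")
```

Storing the adjacency next to the attribute columns means a region-growing step like this needs no access to the pixel data.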
  • 13. KEA Size and Speed
  • 14. Is HDF5 a good base? Yes, we’ve found it excellent. Coding is quick and relatively easy. No worrying about endianness etc. (SPD was originally developed on a PowerPC Mac.) If used correctly, compression is good, with little overhead from the HDF5 structures. It is possible to make complex and flexible data structures. However, it is the data structures in the file, rather than the ‘file format’, that are the important thing.
  • 15. However… Compound data types can reduce flexibility: it is not possible to dynamically add new fields (as with a C struct). Use tables instead (as implemented in KEA attribute tables), i.e., a single data type per table. There is no boolean data type (C data types): store as int8, wasted space? No compression on ‘ragged’ data structures. An HDF5 file can become fragmented when many changes (i.e., data being added) happen within the file. Data cannot be removed from the file: deleting does not reduce the file size. Split data into suitable compression blocks and use/process the data in those blocks.
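The "single data type per table" alternative to compound types can be sketched as below. It assumes a header that lists field names per type and a data block that holds one column per field. The class and method names are hypothetical, and plain Python lists stand in for HDF5 datasets.

```python
class TypedTable:
    """Columns grouped by data type; the header maps field names to columns."""

    def __init__(self, n_rows):
        self.n_rows = n_rows
        self.fields = {"bool": [], "int": [], "float": [], "str": []}
        self.data = {"bool": [], "int": [], "float": [], "str": []}

    def add_field(self, name, kind, fill):
        """Append a new column; existing columns are left untouched."""
        self.fields[kind].append(name)
        self.data[kind].append([fill] * self.n_rows)

    def column(self, name):
        """Look a column up by name through the per-type header lists."""
        for kind, names in self.fields.items():
            if name in names:
                return self.data[kind][names.index(name)]
        raise KeyError(name)


table = TypedTable(n_rows=3)
table.add_field("Classification", "int", 0)
table.add_field("Height", "float", 0.0)
table.column("Height")[1] = 2.5

# A field can be added later without rewriting the existing ones.
table.add_field("Red", "int", 0)
```

With a fixed compound record, adding `Red` would instead require rewriting every existing row in a new layout.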
  • 16. SPD v4. An updated version of SPD (v3 has been the widely used version), learning lessons from SPD and KEA: compound data types are removed in favour of tables of a single data type; as much as possible is made optional; multiple waveforms per pulse are supported. Implemented in pyLiDAR: http://pylidar.org/en/latest/spdv4format.html. Pulses are very useful, but sometimes points are all you need. Multiple methods of spatially indexing the data are useful: a 2D grid is useful for many but not all applications.