Hi, my name is Joe Lee and I’m a software engineer at The HDF Group.
This work is a joint effort with many people, including John Allendorf and Brian Tisdale at the NASA Atmospheric Science Data Center (ASDC).
Brian is the main driver of this project, so he should have presented this.
Unfortunately, he could not come to the ESIP meeting, so I’ll present GEE.
The name GEE is perfect.
Unless you come from the geospatial user community, you may not have heard of GDAL.
So let me explain what GDAL is.
GDAL is the Geospatial Data Abstraction Library. The key is the “A”, for “Abstraction”.
Although the ‘L’ stands for library, GDAL also comes with a set of command-line tools.
The best-known tool is gdal_translate, which converts one data format into another through that abstraction.
GDAL is free and open source, and it is written in C++. It also comes with Python bindings for Python users.
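As a minimal sketch of that kind of conversion, the Python snippet below only assembles a gdal_translate command line; it does not call GDAL itself. The file names are placeholders, and GTiff is the GDAL short name for the GeoTIFF driver. For an HDF file that holds several datasets, the source would usually be a GDAL subdataset name rather than the bare file name.

```python
import shlex


def gdal_translate_command(src, dst, out_format="GTiff"):
    """Build the argument list for a gdal_translate format conversion.

    -of selects the output driver by its short name (GTiff = GeoTIFF).
    """
    return ["gdal_translate", "-of", out_format, src, dst]


# Placeholder file names, for illustration only.
cmd = gdal_translate_command("input.hdf", "output.tif")
print(shlex.join(cmd))

# To actually run it (requires GDAL installed and on PATH):
#   import subprocess
#   subprocess.run(cmd, check=True)
```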
GDAL includes more than 226 drivers for diverse data formats, and the list of supported drivers keeps growing. I was surprised to learn that GDAL even provides a driver for Elasticsearch.
GDAL is as indispensable as water for the geospatial community because well-known tools and services like ArcGIS and MapServer are powered by GDAL.
If GDAL is such a powerful library, why does it need enhancement?
If you have used GDAL or GDAL-based tools with arbitrary NASA Earthdata products, you may have encountered some problems.
For example, if you search GIS Stack Exchange for the keywords GDAL and HDF, you will get more than 100 questions.
The screenshots illustrate the typical symptoms when users open ASDC MOPITT products.
On the left, you can see that the image is upside down.
On the right, the image is rotated 90 degrees. It is gridded data, so the width should be larger than the height.
In fact, these are relatively successful cases because the data is a grid.
With swath or trajectory data, GIS tools simply give up, saying there is no dataset to visualize.
Users may wonder what is wrong: the tool or the data?
In fact, the problem lies in the abstraction layer.
NASA ESDIS and ASDC were aware of this issue, and three years ago they started a new venture to help the GIS community.
Here’s a brief history of our venture.
During the first phase in 2014, GMU, Esri and The HDF Group formed a team to develop an ArcGIS plugin that works for a few ASDC products.
During the second phase in 2015, we generalized the patches made for ArcGIS. Instead of creating a solution that works only for ArcGIS, we tackled GDAL directly so that other free and open source tools can benefit from it. We modified the HDF drivers in GDAL and used a custom XML file to trigger different behaviors for some ASDC data products.
During the third phase in 2016, we made an effort to make our patches official and to become a part of the GDAL development team.
I’ll focus on our latest effort for the rest of the talk.
Here are highlights from last year’s work.
We enhanced …
We contacted the official GDAL team.
We tested our new functionality across other clients.
We built content for tutorials and training packages.
Let’s go over the specifics of each highlight.
First, we enhanced the generic functions. The Phase II effort left a few bugs, and we fixed them.
Second, we added support for 4D and 5D datasets in HDF4 products. GDAL’s abstraction is spatially 2D, so any extra dimension is treated as additional bands.
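One way to picture this is that the extra dimensions are flattened into a single band index. The sketch below assumes row-major flattening (the last extra dimension varies fastest), which is an illustration and not necessarily the order the enhanced drivers use. The function name band_number and the 4D shape are hypothetical.

```python
def band_number(extra_shape, extra_index):
    """Map indices along the non-spatial dimensions to a 1-based band number.

    Assumes row-major flattening: the last extra dimension varies fastest.
    """
    if len(extra_shape) != len(extra_index):
        raise ValueError("shape and index must have the same length")
    flat = 0
    for size, idx in zip(extra_shape, extra_index):
        if not 0 <= idx < size:
            raise IndexError("index out of range")
        flat = flat * size + idx
    return flat + 1  # GDAL band numbers start at 1


# Hypothetical 4D dataset: (time=2, level=3, lat, lon) -> 6 bands.
print(band_number((2, 3), (0, 0)))  # 1
print(band_number((2, 3), (1, 2)))  # 6
```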
Finally, HDF doesn’t prescribe a dimension order. That is, the first dimension can be latitude and the second dimension can be longitude, or they can be reversed if the data producer wants. However, the existing GDAL HDF drivers expected one fixed dimension order.
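To make the dimension-order issue concrete, here is a small pure-Python sketch that rearranges a 2D array into rows of latitude (north first) by columns of longitude. It only illustrates the idea and is not the driver code; the function name, its flags, and the tiny sample array are hypothetical.

```python
def to_north_up(grid, lat_first=True, lat_ascending=False):
    """Return grid as rows of latitude (north first) by columns of longitude.

    grid          -- 2D nested list of values
    lat_first     -- True if grid is indexed [lat][lon], False if [lon][lat]
    lat_ascending -- True if latitude runs south to north in the input
    """
    if not lat_first:
        # Swap the axes so that the width becomes the longitude axis.
        grid = [list(col) for col in zip(*grid)]
    if lat_ascending:
        # Reverse the rows so north is on top; fixes an upside-down image.
        grid = grid[::-1]
    return [list(row) for row in grid]


# Hypothetical 3 (lon) x 2 (lat) array stored longitude-first, south to north.
stored = [[1, 4], [2, 5], [3, 6]]
print(to_north_up(stored, lat_first=False, lat_ascending=True))
# [[4, 5, 6], [1, 2, 3]]
```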
Since we wanted to share our code with the main GDAL developer, we contacted the maintainer and got some responses.
He mentioned a few requirements for becoming a committer.
His primary concern was the open source license.
He asked us to write good code that follows their coding standards.
He gave us a great piece of advice: use VRT, GDAL’s virtual raster format. We are back on the right track with VRT. VRT is getting more powerful, as it allows users to insert a Python code snippet in the PixelFunctionCode element.
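As a rough sketch of what a VRT with an inline Python pixel function looks like, the snippet below generates the XML with the Python standard library. The source file name, the raster size, and the Kelvin-to-Celsius function are all made up for illustration and are not taken from GEE. GDAL may also require the GDAL_VRT_ENABLE_PYTHON configuration option to be set before it will run inline Python.

```python
import xml.etree.ElementTree as ET

# Hypothetical pixel function: convert Kelvin to degrees Celsius.
PIXEL_CODE = '''
def kelvin_to_celsius(in_ar, out_ar, xoff, yoff, xsize, ysize,
                      raster_xsize, raster_ysize, buf_radius, gt, **kwargs):
    out_ar[:] = in_ar[0] - 273.15
'''


def build_vrt(source, width, height):
    """Return VRT XML text with one derived band driven by PIXEL_CODE."""
    root = ET.Element("VRTDataset",
                      rasterXSize=str(width), rasterYSize=str(height))
    band = ET.SubElement(root, "VRTRasterBand", dataType="Float32",
                         band="1", subClass="VRTDerivedRasterBand")
    ET.SubElement(band, "PixelFunctionType").text = "kelvin_to_celsius"
    ET.SubElement(band, "PixelFunctionLanguage").text = "Python"
    ET.SubElement(band, "PixelFunctionCode").text = PIXEL_CODE
    src = ET.SubElement(band, "SimpleSource")
    ET.SubElement(src, "SourceFilename", relativeToVRT="1").text = source
    ET.SubElement(src, "SourceBand").text = "1"
    return ET.tostring(root, encoding="unicode")


# Placeholder source file and size, for illustration only.
vrt_text = build_vrt("surface_temperature.tif", 360, 180)
print(vrt_text)
```

Saving the printed text to a .vrt file next to the source raster would give GDAL-based tools a dataset whose band values are computed on the fly.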
The last suggestion was to patch only against the latest source.
Both ArcGIS Explorer Desktop and QGIS were able to handle VRT. Out of curiosity, we tested whether the Hyrax GDAL handler can support VRT; the same HDF data can be served by either the HDF handler or the GDAL handler. We also tested with rgdal because R doesn’t provide an easy way to access HDF4 products. During this process, we learned that rgdal could not handle OpenOption properly, and the rgdal maintainer fixed the problem immediately. The GDAL handler should change the CF unit of latitude to degrees_south by processing the UpperLeft and LowerRight corner metadata in VRT properly. Finally, we tested GDAL Python because GDAL uses Python for autotesting.
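For readers unfamiliar with corner metadata, the sketch below shows how an upper-left and a lower-right corner can be turned into the six-coefficient affine geotransform that GDAL uses for georeferencing a north-up grid. The function name and the global one-degree grid are hypothetical, and the actual metadata handling in GEE or Hyrax is not shown here.

```python
def geotransform_from_corners(upper_left, lower_right, width, height):
    """Build a GDAL-style affine geotransform for a north-up grid.

    upper_left, lower_right -- (x, y) corner coordinates, e.g. (lon, lat)
    width, height           -- raster size in pixels
    Returns (origin_x, pixel_width, 0, origin_y, 0, pixel_height), where
    pixel_height is negative when y decreases from top to bottom.
    """
    ulx, uly = upper_left
    lrx, lry = lower_right
    return (ulx, (lrx - ulx) / width, 0.0, uly, 0.0, (lry - uly) / height)


# Hypothetical global grid at one-degree resolution.
print(geotransform_from_corners((-180.0, 90.0), (180.0, -90.0), 360, 180))
# (-180.0, 1.0, 0.0, 90.0, 0.0, -1.0)
```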
GSFC showed great interest in our work because they had trouble converting SMAP data products into GeoTIFF.
We asked Esri about the status of VRT support in ArcGIS, and they confirmed it.
VRT is a very powerful mechanism for correcting simple problems, and it can be used to modify the behavior of drivers.
If you want to try GEE, the user guide has many examples.
We also document how OpenOption in VRT works. If you dare to fix a mismatch between your HDF data model and the GDAL raster model through OpenOption, I recommend reading it.
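As a rough illustration of the general shape, a VRT source can carry an OpenOptions element whose OOI children are passed to the driver when the source file is opened. The sketch below builds such a source with the Python standard library. EXAMPLE_OPTION and product.hdf are made-up placeholders, not real GEE or GDAL names; the actual option keys are listed in the GEE documentation.

```python
import xml.etree.ElementTree as ET


def source_with_open_options(filename, options):
    """Return a VRT SimpleSource element that passes open options to the driver."""
    src = ET.Element("SimpleSource")
    ET.SubElement(src, "SourceFilename").text = filename
    open_opts = ET.SubElement(src, "OpenOptions")
    for key, value in options.items():
        # Each option is an <OOI key="...">value</OOI> child.
        ET.SubElement(open_opts, "OOI", key=key).text = str(value)
    ET.SubElement(src, "SourceBand").text = "1"
    return src


# EXAMPLE_OPTION is a made-up key, not a real GEE or GDAL option.
element = source_with_open_options("product.hdf", {"EXAMPLE_OPTION": "YES"})
print(ET.tostring(element, encoding="unicode"))
```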
Serving HDF data via the Hyrax GDAL handler needs some tricks. They can be found in the document.
If you want an R or Python interface powered by GEE, we provide installation guides as well.
We also provide some automated tests with real NASA data products. If you want to run the autotest, please read the document.
Although we finished Phase III, I still think this is just the beginning because NASA ESDIS already has so many data products and new ones are coming. Some people may argue that netCDF is the answer, but I think the GDAL netCDF driver may have some limitations too, because I could find 150+ results for it on GIS Stack Exchange. The GEE project is a gateway for NASA HDF Earth data to reach the GIS user community. We hope that more users and data producers test their data products with GEE and contribute common functions to be added. Then we can keep adding more OpenOption functions and distribute them through the official GDAL channel. We believe that GEE is the most cost-effective way forward for tool developers, data producers, and end users alike.
GDAL Enhancement for ESDIS Project
JOE LEE @ THE HDF GROUP
JOHN ALLENDORF & BRIAN TISDALE @ ASDC
What is GDAL?
• Geospatial Data Abstraction Library
• Free and Open Source
• 226+ drivers for various formats: HDF / GeoTIFF / netCDF
• Powered by GDAL: ArcGIS QGIS GeoServer MapServer
MANY NASA EOS DATA PRODUCTS ARE CHALLENGING FOR GIS TOOLS TO CONSUME.
Image Displayed Inverted
MOP03TM.005 (HDF4): Retrieved Surface Temperature Night
Missing Geo-Reference & 90 Degree Rotated
MOP03TM.006 (HDF5): A Priori Surface Temperature Night
GDAL Enhancement History
• Phase I (PoC): ArcGIS Plugin + Product-specific HDF Drivers
• Phase II (Generalization): HDF Drivers + XML (custom)
• Phase III (Collaboration): HDF Drivers + VRT
GEE Phase III: Highlights
• Enhanced the generic functions completed in Phase II
• Coordination with the GDAL development team
• Tested functionality across other clients
• Conducted outreach to other DAACs & Esri
• Built content for tutorials and training packages
Enhanced the generic functions
• Bug fixes
• 3D+ data support
• Flexible dimension order
Coordination with the GDAL team
• Open source license
• Coding standards
• VRT OpenOptions
• Patch on the latest source from GitHub
Tested functionality across other clients
• ArcGIS Explorer Desktop / QGIS
• OPeNDAP Hyrax GDAL Handler*
• R (rgdal)
• GDAL Python (including autotesting)
*has some issues in geo-ref.
Conducted outreach to other DAACs & Esri
• GSFC: Tested GEE with SMAP products
• Esri: Confirmed VRT support
Built content for training and tutorials
● GEE Public User Guide
● How OpenOptions tag in VRT is processed
● How to serve HDF with OPeNDAP GDAL Handler
● How to install GEE Python
● How to install GEE R
● How to run auto testing
• Validation: Cross check with http://hdfeos.org/zoo examples.
• Maintenance: Resolve license issue and commit to the official GDAL
• Updates: Identify common needs from DAACs and end-users and
add them as OpenOption functions.
• Improvements: Bug fix for TRMM products and HDF5 chunked data