Reading HDF family of formats via NetCDF-Java / CDM
 


HDF4 and HDF-EOS format reading has recently been added to the NetCDF-Java 4.0 library, while HDF5 / NetCDF-4 format reading has been improved. This talk will summarize the status of reading the HDF family of formats through the NetCDF-Java library, with particular attention to the mapping between these formats and the Common Data Model.

Presentation Transcript

  • Reading HDF family of formats via NetCDF-Java / CDM, by John Caron, UCAR/Unidata
  • NetCDF-Java library
    • 100% Java
    • Open Source (LGPL, MIT)
    • Independent implementation
    • Used as a component in other software (partial)
      – Integrated Data Viewer, THREDDS Data Server (Unidata)
      – Panoply (NASA)
      – ncBrowse (EPIC/NOAA)
      – Java NEXRAD Viewer (NCDC/NOAA)
      – MyWorld GIS (Northwestern)
      – EDC for ArcGIS, ERDDAP (SFSC/NOAA)
      – Live Access Server (PMEL/NOAA)
      – ncWMS (Reading)
      – Matlab plug-in (USGS)
  • NetCDF-Java / CDM architecture [diagram; labels: Application, Scientific Feature Types, Datatype Adapter, NetcdfDataset, CoordSystem Builder, NetcdfFile, THREDDS, Catalog.xml, NcML, OPeNDAP, I/O service provider, NetCDF-3, NetCDF-4, HDF5, NIDS, GRIB, GINI, Nexrad, DMSP, …]
  • Format Readers (IOSP)
    • General: NetCDF, HDF5, HDF4, OPeNDAP
    • Gridded: GRIB-1, GRIB-2, GEMPAK
    • Radar: NEXRAD 2&3, DORADE, CINRAD, Universal Format
    • Point: BUFR, ASCII
    • Satellite: DMSP, GINI, McIDAS AREA
    • Misc: GTOPO, Lightning, etc.
    • Others in development (partial):
      – AVHRR, GPCP, GACP, SRB, SSMI, HIRS (NCDC)
  • Lines of Code (est)
                 LOC   semicolons   ratio LOC   ratio semi
      netcdf3   1977          846         1           1
      hdf4      3151         1405         1.6         1.7
      hdf-eos   3737         1695         1.9         2.0
      hdf5      5735         2672         2.9         3.2
      common   28121         9267
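The two ratio columns appear to be each reader's count divided by the netcdf3 count, rounded to one decimal. A minimal sketch of that arithmetic follows; the class and method names are illustrative and do not come from the talk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: recomputes the ratio columns from the slide's raw counts.
final class LocRatios {
    // Ratio of a reader's count to the netcdf3 baseline, rounded to one decimal.
    static double ratio(int count, int baseline) {
        return Math.round(10.0 * count / baseline) / 10.0;
    }

    public static void main(String[] args) {
        // Each entry holds {lines of code, semicolons} as listed on the slide.
        Map<String, int[]> counts = new LinkedHashMap<>();
        counts.put("netcdf3", new int[] {1977, 846});
        counts.put("hdf4", new int[] {3151, 1405});
        counts.put("hdf-eos", new int[] {3737, 1695});
        counts.put("hdf5", new int[] {5735, 2672});

        int[] base = counts.get("netcdf3");
        for (Map.Entry<String, int[]> e : counts.entrySet()) {
            int[] c = e.getValue();
            System.out.printf("%-8s LOC x%.1f  semicolons x%.1f%n",
                e.getKey(), ratio(c[0], base[0]), ratio(c[1], base[1]));
        }
    }
}
```

The "common" row is left out because the slide gives no ratio for it.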
  • Why all the trouble?
    • ~20-40% of C/C++ time spent on portability issues
    • Platform Independence
      – Linux, Solaris, Windows (Sun)
      – Mac OS X (Apple)
      – AIX, Linux, Windows, z/OS (IBM)
      – HP-UX (Hewlett-Packard)
    • Programmer productivity
      – Object-Oriented
      – Garbage Collected (no memory leaks)
      – Rich libraries
      – Open source
    • Faster than C for some applications
  • Independent implementation
    • Written entirely from reading HDF4, HDF5 file specifications
    • Helped debug (HDF5), validate file specs
    • File format spec is what will be needed in 100 years to read legacy data
      – OTOH, semantics not always obvious
    • Don’t confuse reference implementation with the file/protocol specification
  • HDF family of formats
    • HDF5/NetCDF-4
    • HDF4
    • HDF-EOS
    • Note: read-only, no parallel I/O, etc.
  • HDF5/NetCDF4
    • Goal is to read all HDF5
      – Can read all HDF5 files that we have examples of, including references, soft links
      – Complete coverage difficult to guarantee: combinatoric explosion
    • Some esoteric features we are skipping
      – File drivers, external files, szip compression
    • Working on a comprehensive test harness
      – JNI interface to Netcdf4/HDF5 library
      – read every byte and compare
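The "read every byte and compare" step can be pictured roughly as below. This is a hypothetical sketch, not the actual harness: it assumes both readers have already delivered the same variable as raw byte arrays, and the class and parameter names are invented for illustration.

```java
import java.util.Arrays;

// Hypothetical harness step: compare the raw bytes that two independent readers
// return for the same variable. How the bytes are obtained (pure Java reader
// versus the C library through JNI) is outside the scope of this sketch.
final class ReaderCompare {
    // Returns -1 if the buffers are identical, otherwise the offset of the first
    // difference (or the shorter length if one buffer is a prefix of the other).
    static int firstDifference(byte[] fromJavaReader, byte[] fromCLibrary) {
        return Arrays.mismatch(fromJavaReader, fromCLibrary);
    }

    public static void main(String[] args) {
        byte[] a = {1, 2, 3, 4};
        byte[] b = {1, 2, 9, 4};
        int offset = firstDifference(a, b);
        System.out.println(offset < 0 ? "identical" : "first difference at byte " + offset);
    }
}
```

Reporting the first differing offset, rather than a plain yes or no, makes it easier to trace a mismatch back to a particular chunk or filter.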
  • HDF4 / HDF-EOS
    • Complete, works against all examples
    • Tested against 400 sample files (27 GB), thanks to Ruth Duerr (NSIDC)
    • Spot checked against HDFView
    • Need systematic test to compare reading against the HDF4 C Library
  • Geolocation Primer
  • Swath
      Float lat(245, 33477);
      Float lon(245, 33477);
      Float time(33477);
      Float data(245, 33477);
    Just know that it's swath data
    • 245 points cross track
    • 33477 along the track
    • Each scan has a time coordinate
  • Swath
      Float lat(33477, 245);
      Float lon(33477, 245);
      Float time(33477);
      Float data(245, 33477);
  • Swath
      Float lat(999, 999);
      Float lon(999, 999);
      Float time(999);
      Float data(999, 999);
  • Swath
      Float v1(999, 999);
      Float v2(999, 999);
      Float v3(999);
      Float v4(999, 999);
  • If you write data
    • Don’t rely on variable name conventions
    • Don’t rely on index ordering
    • Don’t rely on matching index sizes
    • Minimize “you just have to know that…”
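To see why matching index sizes is a fragile convention, consider a reader that has only array shapes to work with. The sketch below is illustrative (the class and method are not part of NetCDF-Java): it lists every axis of a data array whose length equals the length of a 1D coordinate.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: a reader that has nothing but array shapes to go on.
final class ShapeGuess {
    // Returns every axis of dataShape whose length equals coordLength.
    static List<Integer> candidateAxes(int[] dataShape, int coordLength) {
        List<Integer> axes = new ArrayList<>();
        for (int i = 0; i < dataShape.length; i++) {
            if (dataShape[i] == coordLength) {
                axes.add(i);
            }
        }
        return axes;
    }

    public static void main(String[] args) {
        // data(245, 33477) with time(33477): exactly one candidate axis.
        System.out.println(candidateAxes(new int[] {245, 33477}, 33477));
        // data(999, 999) with time(999): two candidates, so the guess is ambiguous.
        System.out.println(candidateAxes(new int[] {999, 999}, 999));
    }
}
```

With the 245 x 33477 swath the size match happens to work; with the 999 x 999 swath it cannot tell the along-track axis from the cross-track one.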
  • Dimensions
      Dimensions:
        d1=999; d2=999;
      Variables:
        float v1(d1=999, d2=999);
        float v2(d1=999, d2=999);
        float v3(d2=999);
        float v4(d2=999, d1=999);
  • Good
      Variables:
        float v1(d1=999, d2=999);
          v1:standard_name = “Latitude”;
        float v2(d1=999, d2=999);
          v2:standard_name = “Longitude”;
        float v3(d2=999);
          v3:standard_name = “Time”;
        float v4(d2=999, d1=999);
      Data_type = “Swath”;
      Conventions = “My unique name”;
  • If you write data
    • Unique signature
    • Specify dimensions
    • Identify georeferencing coordinates
    • Identify data type
    • Units are not optional
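A reader can use named dimensions and a standard_name attribute in place of "you just have to know". The sketch below uses a small hand-rolled model (SwathVar is hypothetical and is not the NetCDF-Java API) built from the v1 to v4 example: it finds the time coordinate by attribute and locates its axis in the data variable by dimension name, even though both dimensions have length 999.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative model, not the NetCDF-Java API: a variable with named
// dimensions (in index order) and string attributes.
final class SwathVar {
    final String name;
    final List<String> dims;
    final Map<String, String> attributes = new LinkedHashMap<>();

    SwathVar(String name, String... dims) {
        this.name = name;
        this.dims = Arrays.asList(dims);
    }

    SwathVar attr(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    // Find a variable by its standard_name attribute rather than by its name.
    static SwathVar findByStandardName(List<SwathVar> vars, String standardName) {
        for (SwathVar v : vars) {
            if (standardName.equals(v.attributes.get("standard_name"))) {
                return v;
            }
        }
        return null;
    }

    // Index of the axis in data that uses the named dimension, or -1 if absent.
    static int axisOf(SwathVar data, String dimName) {
        return data.dims.indexOf(dimName);
    }

    public static void main(String[] args) {
        List<SwathVar> vars = Arrays.asList(
            new SwathVar("v1", "d1", "d2").attr("standard_name", "Latitude"),
            new SwathVar("v2", "d1", "d2").attr("standard_name", "Longitude"),
            new SwathVar("v3", "d2").attr("standard_name", "Time"),
            new SwathVar("v4", "d2", "d1"));

        SwathVar time = findByStandardName(vars, "Time");
        SwathVar data = vars.get(3);
        int axis = axisOf(data, time.dims.get(0));
        System.out.println(time.name + " runs along axis " + axis + " of " + data.name);
    }
}
```

Neither the variable names nor the sizes carry any meaning here; the dimension names and the attribute do all the work.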
  • HDF-EOS, HDF-EOS2
    • Read “structural metadata” field to obtain more semantics
    • Parse text in “ODL”
      – Data type: Swath, Grid, Point
      – Dimensions
      – Geolocation coordinate variable types: Latitude, Longitude, Time
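As a rough picture of what parsing the ODL text involves, the sketch below pulls dimension names and sizes out of a hand-written sample. It is a simplification under stated assumptions: the sample text is invented, the key names DimensionName and Size reflect the usual layout of HDF-EOS structural metadata as far as I know, and this is not the parser NetCDF-Java uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal sketch: extracts DimensionName/Size pairs from ODL-style text.
// It ignores nesting, comments and multi-line values, which a full ODL
// parser would have to handle.
final class OdlDimensions {
    static Map<String, Integer> parse(String odl) {
        Map<String, Integer> dims = new LinkedHashMap<>();
        String pendingName = null;
        for (String raw : odl.split("\n")) {
            String line = raw.trim();
            int eq = line.indexOf('=');
            if (eq < 0) {
                continue; // lines such as END carry no key/value pair
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim().replace("\"", "");
            if (key.equals("DimensionName")) {
                pendingName = value;
            } else if (key.equals("Size") && pendingName != null) {
                dims.put(pendingName, Integer.parseInt(value));
                pendingName = null;
            }
        }
        return dims;
    }

    public static void main(String[] args) {
        // Hand-written sample in the style of structural metadata (assumed, not from a real file).
        String sample =
            "GROUP=SwathStructure\n"
            + "  GROUP=SWATH_1\n"
            + "    SwathName=\"ExampleSwath\"\n"
            + "    GROUP=Dimension\n"
            + "      OBJECT=Dimension_1\n"
            + "        DimensionName=\"CrossTrack\"\n"
            + "        Size=245\n"
            + "      END_OBJECT=Dimension_1\n"
            + "      OBJECT=Dimension_2\n"
            + "        DimensionName=\"AlongTrack\"\n"
            + "        Size=33477\n"
            + "      END_OBJECT=Dimension_2\n"
            + "    END_GROUP=Dimension\n"
            + "  END_GROUP=SWATH_1\n"
            + "END_GROUP=SwathStructure\n"
            + "END\n";
        System.out.println(parse(sample));
    }
}
```

The point of the exercise is that the dimension names live in a text blob that must be parsed separately, rather than in the file's own dimension constructs.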
  • HDF-EOS, HDF-EOS2
    • Good
      – Unique signature, identify coordinates and data type
    • Not so good
      – ODL
      – Not using hdf4/5 constructs
    • Bad
      – No data units
      – No time coordinate units!
  • Better EOS
      Variables:
        float v1(999, 999);
          v1:standard_name = “Latitude”;
          v1:dims = “d1 d2”;
        float v2(999, 999);
          v2:standard_name = “Longitude”;
          v2:dims = “d1 d2”;
        float v3(999);
          v3:standard_name = “Time”;
          v3:dims = “d2”;
        float v4(999, 999);
          v4:dims = “d2 d1”;
  • NPP (i1.4.0.3_NPP_QUAL)
    • Good
      – XML better than ODL
    • Not so good
      – Not using hdf4/5 constructs
    • Bad
      – No data units
      – No time coordinate units!
    • Fatal Error: please reboot
      – Metadata not in the same file
  • Summary
    • Netcdf-Java reads entire HDFx family
    • Good for Java-philes
    • Needs more testing
      – Send example files, $
    • Dimensions are not optional
    • Keep structural and georeferencing metadata in the same file as the data
      – Can also have specialized external files
  • Contact: caron@ucar.edu, or Google “netcdf java”
  • NetCDF-4 and Common Data Model (Data Access Layer)
  • Dimension primer
      Float lat(180);
      Float lon(360);
      Float alt(20);
      Float time(1200);
      Float data(1200, 20, 180, 360);
  • Unique Name!
      Float lfip(lfip=180);
      Float lflop(lflop=180);
      Float zorg(zorg=20);
      Float skdf(skdf=1200);
      Float dglot(skdf=1200, zorg=20, lfip=180, lflop=180);
  •
      Float lfip(180);
      Float lflop(180);
      Float zorg(20);
      Float freebish(1200);
      Float dglot(1200, 20, 180, 180);
  •
      Float lat(180);
      Float lon(180);
      Float alt(20);
      Float time(1200);
      Float data(1200, 20, 180, 180);