HDF Lessons from NPOESS & Future Opportunities
Alan M. Goldberg <agoldber@mitre.org>
HDF Workshop IX, December 2005

  1. HDF Lessons from NPOESS & Future Opportunities
     Alan M. Goldberg <agoldber@mitre.org>
     HDF Workshop IX, December 2005
     NOTICE: This technical data was produced for the U.S. Government under Contract No. 50-SPNA-9-00010, and is subject to the Rights in Technical Data - General clause at FAR 52.227-14 (JUN 1987). Approved for public release, distribution unlimited.
     © 2005 The MITRE Corporation. All rights reserved
  2. Outline
     • NPOESS Process & Successes
     • NPOESS Path Forward
     • Convergence with other developments
     • Fundamentals
     • Future
  3. Requirements for data products
     • Deal with complexity
       – Large data granules
         • Order of Gb
       – Intrinsic data complexity
         • Advanced sensors produce new challenges
       – Multi-platform, multi-sensor, long duration data production
       – Many data processing levels and product types
     • Satisfy operational, archival, and field terminal users
       – Multiple users with heritage traditions
  4. NPOESS products delivered at multiple levels
     [Figure: data flow diagram from the space segment to delivered products. Recoverable labels: SPACE SEGMENT; SENSORS; ENVIRONMENTAL SOURCE COMPONENTS; OTHER SUBSYSTEMS; Cal. Source; Flux Manipulation; Detection; Filtration; A/D Conversion; Aux. Sensor Data; Packetization; Compression; CCSDS (mux, code, frame) & Encrypt; Data Store; Comm Xmitter; Comm Receiver; Comm Processing; C3S; IDPS; Delivered Raw; RDR Production; RDR Level; TDR Level; SDR Production; SDR Level; EDR Production; EDR Level.]
  5. Sensor product types
     • Swath-oriented multispectral imagery
       – VIIRS – cross-track whiskbroom
       – CMIS – conical scan
       – Imagery EDRs – resampled on uniform grid
     • Slit spectra
       – OMPS SDRs – cross-track spectra, limb spectra
     • Image-array Fourier spectra
       – CrIS SDR
     • Point lists
       – Active fires
     • Directional spectra
       – SESS energetic particle sensor SDR
     • 3-d swath-oriented grid
       – Vertical profile EDRs
     • 2-d map grid
       – Seasonal land products
     • Abstract byte structures
       – RDRs
     • Abstract bit structures
       – Encapsulated ancillary data
     • Bit planes
       – Quality flags
     • Associated arrays (w/ stride?)
       – Geolocation
  6. NPOESS product design development
     Requirements
       - Multi-platform, multi-sensor, long duration data production
       - Many data processing levels and product types
       - Satisfy operational, archival, and field terminal users
     Constraints
       - Processing architecture and optimization
       - Heritage designs
       - Contractor style and practices
       - Budget and schedule
     Intentions
       - Use simple, robust standards
       - Use best practices and experience from previous operational and EOS missions
       - Provide robust metadata
       - Maximize commonality among products
       - Forward-looking, not backward-looking standardization
     Resources
       - HDF5
       - FGDC
       - C&F conventions
       - Expectation of tools by others
     Design Process
       - Experience
       - Trades & Analyses
     Result
  7. Essentials of the resulting design
     • The design is the manner in which we combine, limit, and extend the available resources into an NPOESS implementation
     • Granule is the fundamental unit of tracking, processing, and access
     • Structural hierarchy from Files to Granules to Arrays
     • Files & granules contain both data & metadata
       – Collection metadata (“quasi-static”) retained separate from granule (“dynamic”) metadata
     • Profiles in XML which describe the granule contents
     • Sample data sets have been delivered by Raytheon
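
The File, Granule, and Array hierarchy on this slide can be pictured with a small in-memory model. This is only an illustrative sketch in plain Python, not the NPOESS or HDF5 API: the class names, method names, array name, and metadata values are all hypothetical. It keeps the quasi-static collection metadata apart from the dynamic granule metadata and lets arrays be reached either by granule or by file.

```python
from dataclasses import dataclass, field


@dataclass
class Granule:
    granule_id: str
    metadata: dict                               # "dynamic" metadata, differs per granule
    arrays: dict = field(default_factory=dict)   # array name -> nested list of values


@dataclass
class ProductFile:
    collection_metadata: dict                    # "quasi-static" metadata, kept separate
    granules: list = field(default_factory=list)

    def by_granule(self, granule_id):
        """Address arrays by granule: every array belonging to one granule."""
        for granule in self.granules:
            if granule.granule_id == granule_id:
                return granule.arrays
        raise KeyError(granule_id)

    def by_file(self, array_name):
        """Address arrays by file: one named array across all granules, in order."""
        return [g.arrays[array_name] for g in self.granules if array_name in g.arrays]

    def effective_metadata(self, granule_id):
        """Collection metadata overlaid with the granule's own metadata."""
        for granule in self.granules:
            if granule.granule_id == granule_id:
                return {**self.collection_metadata, **granule.metadata}
        raise KeyError(granule_id)


# Made-up sample values, for illustration only.
product = ProductFile(collection_metadata={"sensor": "VIIRS", "product_type": "SDR"})
product.granules.append(
    Granule("g001", {"sequence": 1}, {"radiance": [[1.0, 2.0], [3.0, 4.0]]})
)
product.granules.append(
    Granule("g002", {"sequence": 2}, {"radiance": [[5.0, 6.0], [7.0, 8.0]]})
)
```

In an actual HDF5 file the same roles would presumably be played by groups, datasets, and attributes; the model above only shows the relationships.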
  8. Resulting design
     Advantages
       – Flexible; Extensible; Allows compression
       – Accessed by API, not format
       – Arrays can be addressed either by granule or by file
       – Potentially self-documenting
       – Handles abstract data types and large files
       – BLOBs (e.g., raw data, external files) can be wrapped
     Disadvantages
       – Inconsistent with heritage operational formats (GRIB, BUFR)
       – Limited tools
     [Figure: a File containing File Metadata and three Granules; each Granule contains Granule Metadata and Arrays.]
  9. NPOESS future evolution
     • Moving toward the ideal
     • NPOESS has time to evolve during NPP
       – Gain experience with data sets, metadata, operational users, direct users, CLASS ingest
     • Use flexibility of the HDF
       – Attributes can be added to fill out metadata needs
     • Use additional HDF data features (e.g., bit planes)
     • Use more complete self-documentation
     • Harmonization with community data description conventions
     • Develop more user tools
     • Possible benefit from netCDF-HDF convergence
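
As a small illustration of the bit-plane idea mentioned above for quality flags, the sketch below packs several boolean flags into one integer per pixel and then extracts a single flag as its own plane. It is plain Python rather than an HDF feature, and the flag names and bit positions are hypothetical.

```python
# Hypothetical flag names and bit positions, for illustration only.
FLAG_BITS = {"cloud": 0, "land": 1, "saturated": 2}


def pack_flags(flags):
    """Pack a dict of boolean flags into a single integer."""
    value = 0
    for name, is_set in flags.items():
        if is_set:
            value |= 1 << FLAG_BITS[name]
    return value


def bit_plane(packed, name):
    """Extract one flag from a 2-D array of packed values as a 0/1 plane."""
    bit = FLAG_BITS[name]
    return [[(value >> bit) & 1 for value in row] for row in packed]


packed = [
    [pack_flags({"cloud": True}), pack_flags({"land": True, "saturated": True})],
    [0, 7],
]
cloud_plane = bit_plane(packed, "cloud")
land_plane = bit_plane(packed, "land")
```

Storing flags this way keeps one small integer array instead of one array per flag, at the cost of needing the bit assignments documented in the metadata.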
  10. Lessons & Way Forward
  11. Observations from development to date
      • Avoid the temptation to use heritage approaches without reconsideration, but …
      • Novel concepts need to be tested
      • Data concepts, profiles, templates, or best practices should be defined before coding begins
      • Use broad, basic standards to the greatest possible extent
        – FGDC has flexible definitions, if carefully thought through
      • Define terms in context; clarity and precision as appropriate
      • Attempts to predefine data organizations in the past (e.g., HDF-EOS ‘swath’ or HDF4 ‘palette’) have offered limited flexibility. Keep to simple standards which can be built upon and described well. Lesson: be humble
      • It is a great service to future programs if we capture lessons and evolve the standards
      • How do we get true estimates of the life-cycle savings for good design?
  12. Thoughts on future features for Earth remote sensing products
      • Need to more fully integrate product components with HDF features
      • Formalize the organization of metadata items which establish the data structure
        – Need mechanism to associate arrays by their independent variables
      • Formalize the organization of metadata items which establish the data meaning
        – XML is a potential mechanism; can it be well integrated?
        – Work needed to understand the advantages and disadvantages
        – Climate and Forecast (CF) sets a benchmark
      • Need a mechanism to encapsulate files in native format
        – Case in which HDF is only used to provide consistent access
      • Need more investment in testing before committing to a design
  13. Primary and Associated Arrays
      [Figure labels]
      • Index Attribute
      • n-Dimensional Dependent Variable (Entity) Array
        – Primary Array, e.g., Flux, Brightness, Counts, NDVI
        – Associated Array(s), e.g., QC, Error bars; dimension ≤ n
  14. 1-Dimensional Attribute Variables
      [Figure labels]
      • Index Attribute
      • Associated Independent Variable(s)
        – Primary, e.g., UTC time or angle
        – Additional, e.g., IET time, angle, or pressure height
  15. Multi-Dimensional Attribute Variables
      • 2-Dimensional Independent Variable Array(s), e.g., lat/lon, XYZ, sun alt/az, sat alt/az, or land mask
      • Key concept: Index Attributes organize the primary dependent variables, or entities. The same Index Attributes may be used to organize associated independent variables. Associated independent variables may be used singly (almost always), in pairs (frequently), or in larger combinations.
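
One way to read the key concept in slides 13 to 15 is that each associated array is organized by a subset of the primary array's Index Attributes, so its shape can be checked against theirs. The sketch below is a minimal plain-Python illustration under that reading. The class `IndexedProduct`, the index names `scan` and `pixel`, and the sample values are all hypothetical, and arrays are assumed to be non-empty rectangular nested lists.

```python
def shape_of(nested):
    """Shape of a non-empty rectangular nested list, e.g. [[1, 2, 3], [4, 5, 6]] -> (2, 3)."""
    shape = []
    while isinstance(nested, list):
        shape.append(len(nested))
        nested = nested[0]
    return tuple(shape)


class IndexedProduct:
    """A primary array plus associated arrays organized by the same index attributes."""

    def __init__(self, index_names, primary):
        shape = shape_of(primary)
        if len(index_names) != len(shape):
            raise ValueError("need one index attribute per primary dimension")
        self.sizes = dict(zip(index_names, shape))   # index attribute -> length
        self.primary = primary
        self.associated = {}

    def attach(self, name, values, indices):
        """Attach an array organized by a subset of the primary's index attributes."""
        expected = tuple(self.sizes[i] for i in indices)
        if shape_of(values) != expected:
            raise ValueError(f"{name}: shape {shape_of(values)} does not match {expected}")
        self.associated[name] = (tuple(indices), values)


# Made-up values: 2 scans by 3 pixels of brightness temperature.
brightness = [[280.1, 281.4, 279.9], [282.0, 283.3, 281.7]]
product = IndexedProduct(["scan", "pixel"], brightness)
product.attach("qc", [[0, 0, 1], [0, 2, 0]], ["scan", "pixel"])         # associated array
product.attach("utc_time", [0.0, 1.5], ["scan"])                        # 1-D independent variable
product.attach("latitude", [[10.0, 10.1, 10.2], [10.5, 10.6, 10.7]],
               ["scan", "pixel"])                                       # 2-D independent variable
```

Attaching an array whose shape disagrees with its declared index attributes raises `ValueError`, which is the kind of check a formal association mechanism would make possible.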
  16. Issues going forward - style
      • Issues with assuring access understanding
        – How will applications know which metadata is present?
        – Need to define a core set with a default approach
      • Issues with users
        – How to make providers and users comfortable with this or any standard
        – How to communicate the value of: best practices; careful & flexible design; consistency; beauty of simplicity
        – Ease of use as well as ease of creation
      • Issues with policy
        – Helping to meet the letter and intent of the Information Quality Act
      • Capturing data product design best practices
        – Flexibility vs. consistency vs. ease-of-use for a purpose
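
The "core set with a default approach" could look something like the following sketch, in which a reader always returns every core item and reports which ones had to be defaulted. The item names and default values are hypothetical and are not drawn from any NPOESS or FGDC specification.

```python
# Hypothetical core metadata items and their defaults.
CORE_DEFAULTS = {
    "title": "unknown",
    "platform": "unknown",
    "processing_level": "unknown",
    "units": "1",
}


def read_core_metadata(attributes):
    """Return (core, missing): every core item, plus the names that were defaulted."""
    core = {}
    missing = []
    for name, default in CORE_DEFAULTS.items():
        if name in attributes:
            core[name] = attributes[name]
        else:
            core[name] = default
            missing.append(name)
    return core, missing


core, missing = read_core_metadata({"title": "Sea surface temperature", "units": "K"})
```

With this approach an application never has to guess whether a core item is present, and the `missing` list makes defaulted values visible rather than silent.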
  17. Issues going forward - features
      • Issues with tools
        – Tools are needed to create, validate, and exploit the data sets.
          • Understand structure and semantics
      • Issues with collections
        – How to implement file and collection metadata, with appropriate pointers forward and backward
        – How to implement quasi-static collection metadata
      • Issues with HDF
        – Processing efficiency (I/O) of compression, of compaction
        – Repeated (fixed, not predetermined) metadata items with the same <tag> not handled
        – Archival format
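
The repeated-tag issue arises because attribute names attached to a single HDF5 object must be unique, whereas XML allows the same tag to repeat. One possible workaround, shown here only as a sketch and not as anything the NPOESS design adopted, is to suffix repeated tags with an index before storing them as attributes. The XML fragment and tag names are made up.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def flatten_repeated(xml_text):
    """Map the child elements of an XML fragment to uniquely named items.

    Tags that occur once keep their name; repeated tags get an index
    suffix (tag_0, tag_1, ...) so that every resulting name is unique.
    """
    root = ET.fromstring(xml_text)
    grouped = defaultdict(list)
    for child in root:
        grouped[child.tag].append((child.text or "").strip())

    flat = {}
    for tag, values in grouped.items():
        if len(values) == 1:
            flat[tag] = values[0]
        else:
            for i, value in enumerate(values):
                flat[f"{tag}_{i}"] = value
    return flat


# Hypothetical granule metadata with one repeated tag.
sample = (
    "<granule>"
    "<platform>NPP</platform>"
    "<ancillary_file>a.bin</ancillary_file>"
    "<ancillary_file>b.bin</ancillary_file>"
    "</granule>"
)
attributes = flatten_repeated(sample)
```

An alternative with the same effect is to store all values of a repeated tag in one array-valued attribute; which is preferable depends on how readers are expected to look the items up.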
  18. Possible routes: Should there be an HDF-GEO?
      • Specify a profile for the use of HDF in Earth science applications:
        – Generalized point (list), swath (sensor coordinates), grid (georeferenced), abstract (raw), and encapsulated (native) profiles.
        – Generalized approach to associating georeferencing information with observed information.
        – Generalized approach to incorporating associated variables with the mission data
        – Generalized approach to ‘stride’
        – Preferred core metadata to assure human and machine readability
        – Identification metadata in UserBlock
        – Map appropriate metadata items from HDF native features (e.g., array rank and axis sizes)
        – Preferred approach to data object associations: arrays-of-structs or structs-of-arrays?
      • Design guidelines or strict standardization?
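
Two of the open questions above can be made concrete with a short plain-Python sketch. The first pair of functions converts between an array-of-structs and a struct-of-arrays layout. The last function shows one possible reading of ‘stride’, assumed here to mean an associated array (such as geolocation) stored only at every k-th index and linearly interpolated back to full resolution. The field names and values are hypothetical.

```python
def to_struct_of_arrays(records):
    """Array-of-structs (list of dicts) -> struct-of-arrays (dict of lists)."""
    if not records:
        return {}
    return {name: [record[name] for record in records] for name in records[0]}


def to_array_of_structs(columns):
    """Struct-of-arrays (dict of equal-length lists) -> array-of-structs."""
    names = list(columns)
    return [dict(zip(names, row)) for row in zip(*columns.values())]


def expand_strided(samples, stride, length):
    """Linearly interpolate values stored every `stride` indices to `length` points.

    Assumes samples[k] belongs to index k * stride and that the last sample
    lies at or beyond index length - 1.
    """
    full = []
    for i in range(length):
        k, offset = divmod(i, stride)
        if offset == 0:
            full.append(samples[k])
        else:
            fraction = offset / stride
            full.append(samples[k] + fraction * (samples[k + 1] - samples[k]))
    return full


# Made-up point records, e.g. a short list of detections.
fires = [
    {"lat": 10.0, "lon": 20.0, "confidence": 80},
    {"lat": 11.0, "lon": 21.0, "confidence": 95},
]
columns = to_struct_of_arrays(fires)
latitudes = expand_strided([0.0, 4.0, 8.0], stride=4, length=9)
```

The array-of-structs layout keeps each observation together, which suits point lists, while the struct-of-arrays layout keeps each variable contiguous, which suits reading one variable across a whole swath.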
  19. Questions? Discussion?
