Metadata Requirements for EOSDIS Data Providers


  • In short, without metadata, a user of the data is in the dark.
  • Not all metadata is used in searching. Some metadata is merely informative and will not be used in database queries. This metadata can be viewed to assist data consumers in deciding whether to order data or not.
  • Metadata is needed to identify a data product once it is archived in the system.
    Without metadata, users could never find a file unless they knew the precise ID of the file (like a filename in some systems, or in ECS a UR).
    By supplying a rich set of metadata attributes for the data, providers make it possible for users to find the data more easily and through a greater variety of search methods.
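A minimal Python sketch of the point made in the note above: with only an ID, a file can be fetched but not discovered, whereas attributes allow searching. The inventory records, the UR strings, and the `start`/`end` keys are hypothetical illustrations, not ECS structures.

```python
from datetime import datetime

# Hypothetical inventory: each record pairs a UR (the granule ID) with metadata.
inventory = [
    {"UR": "UR:example:0001", "ShortName": "EXAMPLE_A",
     "start": datetime(1998, 1, 1), "end": datetime(1998, 1, 2)},
    {"UR": "UR:example:0002", "ShortName": "EXAMPLE_A",
     "start": datetime(1998, 1, 2), "end": datetime(1998, 1, 3)},
    {"UR": "UR:example:0003", "ShortName": "EXAMPLE_B",
     "start": datetime(1998, 1, 1), "end": datetime(1998, 1, 2)},
]


def find_granules(records, short_name=None, at_time=None):
    """Return the URs of records matching every criterion that was supplied."""
    matches = []
    for record in records:
        if short_name is not None and record["ShortName"] != short_name:
            continue
        if at_time is not None and not (record["start"] <= at_time <= record["end"]):
            continue
        matches.append(record["UR"])
    return matches
```

Each additional attribute stored with a record adds another route by which the record can be found, for example `find_granules(inventory, short_name="EXAMPLE_A", at_time=datetime(1998, 1, 1, 12))`.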
  • All textual metadata (that is, everything except items that HDF provides for specifically, such as scales and units) should be contained in HDF text attributes.
    ECS compliant metadata must be written to HDF text attributes with specific names, and may span multiple attributes, numbered sequentially, to accommodate all metadata.
    This metadata must also be written in ODL, or Object Description Language.
    These tasks are best handled by using the SDP Toolkit.
    Collection level metadata is delivered separately from the granules and will be discussed later.
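The note above describes metadata written as ODL text and spread across sequentially numbered text attributes. The sketch below illustrates that idea in plain Python, without any HDF library. It rests on assumptions: the GROUP/OBJECT layout is a simplified approximation of ECS-style ODL, the `coremetadata.N` naming and the chunk size are illustrative, and both function names are hypothetical. In practice the SDP Toolkit handles these tasks.

```python
def to_odl(group_name, attributes):
    """Render a flat dict as a simplified ODL group (approximation, not the full grammar)."""
    group = group_name.upper()
    lines = [f"GROUP = {group}"]
    for name, value in attributes.items():
        key = name.upper()
        # Strings are quoted; other values are written as-is.
        rendered = f'"{value}"' if isinstance(value, str) else str(value)
        lines += [
            f"  OBJECT = {key}",
            "    NUM_VAL = 1",
            f"    VALUE = {rendered}",
            f"  END_OBJECT = {key}",
        ]
    lines += [f"END_GROUP = {group}", "END"]
    return "\n".join(lines) + "\n"


def split_into_attributes(base_name, text, max_chars):
    """Split text into chunks keyed base_name.0, base_name.1, ... (assumed naming)."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks = [text[i:i + max_chars] for i in range(0, len(text), max_chars)] or [""]
    return {f"{base_name}.{n}": chunk for n, chunk in enumerate(chunks)}


def join_attributes(base_name, attributes):
    """Reassemble the original text by reading the numbered attributes in order."""
    return "".join(attributes[f"{base_name}.{n}"] for n in range(len(attributes)))
```

A reader concatenates the numbered attributes in order to recover the full ODL text, which is why the numbering must be sequential and gap-free.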
  • ECS requires only two attributes to insert and acquire granules: ShortName and VersionID. Upon granule generation, ProductionDateTime is generated by the system, and it can also be used to identify granules belonging to a collection.
  • Temporal coverage can also be designated by a range or by periodic attributes.
    Spatial coverage can also be designated by a single point, a point and circle, or a polygon.
  • ECS needs to be made aware of a data set prior to the arrival of the first “granule” of data, so that the archives that will hold the data and the database tables that will hold the metadata can be set up.
    This is done by defining an Earth Science Data Type (ESDT). An ESDT “descriptor” file contains all the metadata values that describe the entire “collection” of data granules.
    The ESDT descriptor also identifies the metadata that will pertain to the individual granules and whose values will be supplied as each granule is “inserted” into the system.
    The Distributed Active Archive Centers (DAACs) are responsible for generating ESDT descriptor files, DLLs and any custom code necessary to ingest granules into the system.
    (is it appropriate to say this?)
  • Metadata Requirements for EOSDIS Data Providers

    1. Metadata Requirements for EOSDIS Data Providers
       Siri Jodha Singh Khalsa, HDF-EOS Workshop II
    2. Topics
       • Why metadata is important
       • Types of metadata in HDF-EOS files
       • Required metadata
       • How metadata is encoded and delivered
    3. What is Metadata?
       • Metadata is information that identifies and characterizes an information product.
       • Sometimes called “data about data”
    4. Users Need Metadata
       • Metadata is needed to answer questions such as:
         - What time and location does this data apply to?
         - What type of instrument and processing produced the data?
         - What other inputs were used to generate the data?
         - What QA has been performed on this data?
         - Who do I contact if I have questions about this data?
    5. Metadata is Essential
       • Large data archive systems cannot function without metadata.
       • Metadata is used to keep track of such things as:
         - where the data is
         - what type of operations are possible on the data
         - whether there are any access restrictions on the data
         - how individual data files are logically grouped into “collections.”
    6. Key Concepts
       • A granule is the smallest aggregation of data that is independently described and inventoried by the ECS. A granule consists of 1 or more physical files.
       • A collection is a logical grouping of granules.
       • The ECS Data Model allows for:
         - “Core” attributes
         - “Product-Specific” Attributes (PSAs)
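The key concepts on this slide can be sketched as a small data structure. This is an illustration under stated assumptions, not the ECS data model itself: the class names, field names, and the rule that a granule must match its collection's ShortName and VersionID are hypothetical simplifications.

```python
from dataclasses import dataclass, field


@dataclass
class Granule:
    short_name: str
    version_id: int
    files: list                                # one or more physical files
    core: dict = field(default_factory=dict)   # "core" attributes
    psas: dict = field(default_factory=dict)   # product-specific attributes

    def __post_init__(self):
        if not self.files:
            raise ValueError("a granule consists of one or more physical files")


@dataclass
class Collection:
    short_name: str
    version_id: int
    granules: list = field(default_factory=list)

    def add(self, granule):
        """Group a granule into this collection if its identifiers match."""
        if (granule.short_name, granule.version_id) != (self.short_name, self.version_id):
            raise ValueError("granule does not belong to this collection")
        self.granules.append(granule)
```

Keeping core attributes and PSAs in separate mappings mirrors the slide's distinction between attributes common to all products and those defined per product.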
    7. Types of Metadata
       • Metadata in HDF files
         - stored as global text attributes
       • Types of Metadata used in HDF-EOS files:
         - Structural Metadata
         - Core Metadata (inventory, can include PSAs)
         - Archive Metadata (non-searchable, product-specific)
       • Collection level metadata
         - core and product-specific
    8. Required Metadata
       • Origins of metadata requirements:
         - what is required to archive and retrieve files
         - what is required to provide search and other services on data
         - what is federally mandated (FGDC)
       • There are 287 attributes in the ECS data model
         - only a subset are used for any given product
         - 101 are applicable at the granule level
    9. Metadata Coverage
       • Science Data that are delivered for archiving in ECS must meet what is called the Intermediate level of metadata coverage. This involves as few as:
         - 31 collection level attributes
         - 4 granule level attributes
       • Compliance at this level is not enforced by the system.
    10. Collection-Level Metadata for Intermediate Coverage
         - ShortName, LongName, CollectionDescription, VersionID, ArchiveCenter, RevisionDate, VersionDescription, CollectionState
         - MaintenanceandUpdateFrequency, ECSDisciplineKeyword, ECSTopicKeyword, ECSTermKeyword, ECSVariableKeyword, ContactOrganizationName, Role
         - SpatialCoverageType, PointLatitude, PointLongitude, TimeType, DateType, TemporalRangeType, PrecisionofSeconds, EndsatPresentFlag
         - CalendarDate, TimeofDay, GuideName, DataCenter, DocumentVersion, DocumentUpdated, Title, DocumentCreated
    11. Granule-Level Metadata for Intermediate Coverage
       • There are only four granule-level metadata attributes required:
         - ShortName
         - VersionID
         - SizeMBECSDataGranule
         - ProductionDateTime
       • ShortName and VersionID are identical to the collection-level attributes with these names.
       • For granules coming into ECS, SizeMBECSDataGranule and ProductionDateTime are supplied by the system upon insertion.
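As a hedged illustration of the four required granule-level attributes, the sketch below reports which ones are missing from a metadata dictionary. The attribute names come from the slide. The function name, the `inserted` flag, and the dictionary representation are assumptions made for this example, and the slides note that compliance is not actually enforced by the system.

```python
REQUIRED_GRANULE_ATTRIBUTES = (
    "ShortName",
    "VersionID",
    "SizeMBECSDataGranule",
    "ProductionDateTime",
)

# Per the slide, these two are supplied by the system upon insertion.
SYSTEM_SUPPLIED = frozenset({"SizeMBECSDataGranule", "ProductionDateTime"})


def missing_granule_attributes(metadata, inserted=False):
    """List required attributes that are absent or empty.

    Before insertion (inserted=False) only provider-supplied attributes are
    checked; after insertion all four are expected to be present.
    """
    missing = []
    for name in REQUIRED_GRANULE_ATTRIBUTES:
        if not inserted and name in SYSTEM_SUPPLIED:
            continue
        if metadata.get(name) in (None, ""):
            missing.append(name)
    return missing
```

A provider-side check would call `missing_granule_attributes(metadata)` before delivery, while an archive-side audit would pass `inserted=True`.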
    12. How is Metadata Supplied?
       • Collection-level metadata is carried in an Earth Science Data Type (ESDT) Descriptor file.
       • Granule-level metadata is defined in the descriptor file and populated using a Metadata Configuration File (MCF).
       • Granule-level metadata is delivered in the HDF-EOS granule *or* in a populated MCF accompanying a non-HDF granule.
       • The DAAC where a collection will reside is responsible for descriptors and ingest routines.
    13. Metadata Work Flow for External Data Providers
       [Workflow diagram; layout not recoverable from the transcript. Legible labels: Data Provider Responsibility; DAAC; Tasks: Population, Analysis, Validation, Build MCF, DLL coding, Test & Valid.; Tools: MDWorks, PSA_Reg, ODL Parser, SDP Toolkit, Science Software; Checks: type and format check, ODL syntax check, constraints checks; Inputs: Data Model, Specs, Data/Docs, collection core attributes + granule values, core attributes, PSA definitions, granule core values, PSA values, structural metadata; Products: Descriptor, Validated Desc., MCF, Data Base Load File, HDF-EOS file; Ingest Subsystem; ESDT Insert; DAAC Data Archive.]
    14. Metadata Resources on the Web
       • ECS Metadata Homepage
       • Metadata Works (ESDT Descriptor Tool) http://et3ws1.HITC.COM/metadata_works/
       • EOSDIS Information Architecture
       • Federal Geographic Data Committee
    15. Q&A w/ Experts Panel
       • Q: “If you are a new data provider, how do you get your data into an HDF-EOS granule, given the bewildering array of utilities and tools available? What is the simplest solution for this?”
       • A: The recommended solution is to obtain the HCR package, which includes the HDF-EOS and HDF libraries. For populating the required metadata in the granule, obtain the Metadata/Time Toolkit_MDT. The steps would be:
         1. Write an HCR and use the tools to turn this into a skeletal HDF-EOS granule. (This step is optional.)
         2. Use the HDF-EOS library to create a granule. (If starting with a skeletal HDF-EOS file generated from an HCR, then plain HDF calls can be used to insert data into the granule.)
         3. Use Toolkit_MDT calls to insert metadata into the granule. This requires generation of an MCF in ODL. Metadata_Works is available for doing this. As an alternative, a simple HDF call can be used to attach minimum metadata (in ODL) to an HDF file.
       • Note: if the data are going to reside in a DAAC, or in an archive that must be interoperable with ECS, you will need to generate collection-level metadata. Metadata_Works is the recommended tool for this.