Kowal RDAP11 Data Archives in Federal Agencies


Dan Kowal, NOAA/NGDC; Data Archives in Federal Agencies; RDAP11 Summit

The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information

  • NOAA Data Centers: National Geophysical Data Center, National Oceanographic Data Center, National Climatic Data Center. CLASS (Comprehensive Large Array-Data Stewardship System): IT infrastructure governed by the Data Centers and used for large-volume data sets.
  • From the publication Creating Public Value by Mark Moore, professor at Harvard's Kennedy School of Government.
  • The timeline shows the development of the “What to Archive” procedures. In a nutshell, the idea was born at a DAARWG meeting, which handed off development of the guidelines to the DMC. The review process involved several line offices within and outside NOAA. The acting NOAA Administrator signed off on the amendment to NAO 212-15 to include the Appraisal/Approval Guidelines as part of the data management process (see next slide).
  • Two documents were created: one for data managers; the other for data providers and general public (in brochure form). NARA has adopted the process as a best practice.
  • This is the high-level view of the “What to Archive” procedure. It's not always a serial process. For instance, the preliminary records appraisal includes filling out the Request-to-Archive questionnaire, followed by assembling an appraisal team. Receipt of a request should be directed to the Data Administrator after initial contact is made. The Formal Records Appraisal includes more steps, and a decision to accept or remove records initiates a process to post that action for public comment; this applies only in the Formal and/or Removal scenarios.
  • At a more detailed level, the formal process looks more like this. But as you click through, the more “no-brainer” approach is displayed: a path taken when the archive request is uncomplicated, fits the mission, and the resources exist to steward the data. The Formal process is just the opposite, and the proposed data set often has the complexity and dimensions (think large volumes) that make it a candidate for CLASS. In this situation, contact with CLASS is initiated early in the “detail gathering” phase of the Formal Appraisal to get rough estimates of the infrastructure (storage and access) costs to support it.
  • A new system in prototype stage for tracking archive projects through their lifecycle.
  • NGDC archive projects being tracked in the ATRAC System.
  • Timeline status view of each project.
  • For clarification, here are three items of note within the guidelines.
  • Original Data are data in their most basic useful form. These are data from individual times and locations that have not been summarized or processed to higher levels of analysis. While these data are often derived from other direct measurements (e.g., spectral signatures from a chemical analyzer, electronic signals from current meters), they represent properties of the environment. These data can be disseminated both in real time and retrospectively. Examples of original data include oceanographic and meteorological observations from buoys, geophysical observation data from surface-based sensors, living marine resource inventories, bathymetric data from hydrographic surveys, biological and chemical properties of sediments, or weather observations and observation data from satellites.
  • Synthesized Products are those that have been developed through analysis of original data. This includes analysis through statistical methods; model interpolations, extrapolations, and simulations; and combinations of multiple sets of original data. While some scientific evaluation and judgment is needed, the methods of analysis are well documented and relatively routine. Examples of synthesized products include summaries of fisheries landings statistics, weather statistics, model outputs, data overlays displayed through Geographical Information System techniques, and satellite-derived maps.
  • Hydrometeorological, Hazardous Chemical Spill, and Space Weather Warnings, Forecasts, and Advisories are time-critical interpretations of original data and synthesized products, prepared under tight time constraints and covering relatively short, discrete time periods. As such, these warnings, forecasts, and advisories represent the best possible information in given circumstances. They are subject to scientific interpretation, evaluation, and judgment. Some products in this category, such as weather forecasts, are routinely prepared. Other products, such as tornado warnings, hazardous chemical spill trajectories, and solar flare alerts, are of an urgent nature and are prepared for unique circumstances.
  • Experimental Products are products that are experimental in nature (in the sense that their quality has not yet been fully determined), or that are based in part on experimental capabilities or algorithms. Experimental products fall into two classes. They are either 1) disseminated for experimental use, evaluation, or feedback, or 2) used in cases where, in the view of qualified scientists who are operating in an urgent situation in which the timely flow of vital information is crucial to human health, safety, or the environment, the danger to human health, safety, or the environment will be lessened if every tool available is used. Examples of experimental products include imagery or data from non-NOAA sources, algorithms currently being tested and evaluated, experimental climate forecasts, and satellite imagery processed with developmental algorithms for urgent needs (e.g., wildfire detection).
  • Kowal RDAP11 Data Archives in Federal Agencies

    1. 1. NOAA National Data Centers; Archive Appraisal/Approval Process; What issues require more investigation? Dan Kowal, Data Administrator, NGDC
    2. 2. My Day Job [Diagram; labels grouped by type.] Entities: Producer; Management; Consumer. Functions: 1. Common Services; 2. Ingest; 3. Archival Storage; 4. Data Management; 5. Preservation Planning; 6. Access; Administration. Flow labels: Submission Agreement negotiation; Submission Scheduling; Ingest Reporting; Data formatting & documentation stds; Audit Rpt.; [Updated SIP]; SIP or AIP [for audit]; Storage Mgmt. policies; Operational Statistics; Info Requests; Info Responses; Dissemination request; DIP; AIP/SIP templates; AIP/SIP review; Customization advice; Migration packages; Recommendations; Approved stds.; Migration goals; Inventory reports; Performance reports; Consumer comment; Policies; Report request; System updates; Review updates; Report; Status of updates; Reports; Budget, Policies.
    3. 3. NOAA National Data Centers [Diagram labels: NGDC; NCDC; NODC; CLASS; NESDIS]
    4. 4. Strategic Triangle: Legitimacy; Value; Capacities
    5. 5. What to Archive Background [Timeline figure.] Dates shown: 2008-01; 2008-04; 2008-12; 2009-01; 2008-10. Events shown: DAARWG Mtg; DMC 1st Draft, Review Begins; NAO 212-15 Includes Procedure; Documents Published; Final Documents Distributed. DAARWG: Data Archiving and Access Requirements Working Group; reports to the Science Advisory Board (SAB). DMC: Data Management Committee; reports to the NOAA Observing Systems Council (NOSC), a principal advisory body to the NOAA Administrator. NAO 212-15: NOAA Administrative Order on Environmental and Geospatial Data.
    6. 6. NAO 212-15, SECTION 3. POLICY, .04: Decisions within NOAA to retain and preserve environmental and geospatial data and information shall be based upon the procedure and decision process found in “NOAA Procedure for Scientific Records Appraisal and Archive Approval,”… Note: This provision in the policy no longer exists. However, it is referred to in a more general sense as “Approved Procedural Directives” in Section 4, Implementation. http://www.corporateservices.noaa.gov/~ames/NAOs/Chap_212/naos_212_15.html
    7. 7. “What to Archive” Documents: The Complete Guide; Informational Brochure; Featured Best Practice at NARA. Available at: http://nosc.ngdc.noaa.gov/docs/products/NOAA_Procedure_document_final_12-16-1.pdf
    8. 8. NOAA “What to Archive” Procedure Flow Diagram NOAA Archive Process from 30,000 ft.
    9. 9. NGDC Simple Case from Sea Level [Flow diagram; labels grouped by type.] Note: Staff member confirms request is appropriate to NGDC and conducts Preliminary Appraisal using R_to_A Questionnaire. Legend: Data Provider; Data Center; Cross-Data Center Activity; Within-Data Center Activity. Steps: Data Center Receives Request to Archive; Confirm appropriate Data Center and designate Appraisal Team; Conduct Preliminary Appraisal; Conduct Formal Appraisal; Complete All Sections as Appropriate; Assemble Recommendation Package; Submit Final Recommendation Package to Data Center for Decision; Data Center Makes Decision and Notifies Provider; Draft Preliminary Agreement with Provider and Requirements Documents (if needed and with CLASS if appropriate); Finalize Submission Agreement with Provider and Design Documents (with CLASS if appropriate); Ingest and archive data. Decisions: Is a Formal Appraisal Required or CLASS support wanted? (Yes / No); Does Recommendation request use of CLASS? (Yes / No). Decision outcomes: Yes; No, suggest alternatives; No or Conditional Yes; Revise Recommendation Package as Necessary. CLASS path: Data Center Makes Request to COPB for use of CLASS Resources; COPB Tasks COWG to Evaluate Request; COWG Evaluates and Makes Recommendation back to COPB; COPB Makes Decision and Informs Data Center. Documents: Recommendation Package; Preliminary Agreement; Draft Requirements; Requirements Documents; Design Documents; Submission Agreement; Metadata and Documentation; Data to be archived. Other labels: 28 Questions; 10 Dimensions; Preliminary Information; Detailed Information.
    10. 10. What specific aspects of appraisal and selection require more investigation? Identification of Records: • How is the Procedure being communicated to groups in NOAA? • Requests coming in where no funding was identified at the outset for archive. • Past practices of data rescue activities subverting any appraisal assessment. • Data identified by Users, but not definitely by Providers. • Impromptu Data Campaigns. • When domain expertise does not reside at the center. • Data Center time and resources could be reduced if providers understood what is required for a project prior to engaging a Data Center. • Consistent tracking of receipt of request in a centralized fashion.
    11. 11. What specific aspects of appraisal and selection require more investigation? Appraisal of Records: • Assembly of an appraisal team (coordination, timeliness, incentives). • Lack of resources for assigning a Project Manager to each appraisal. • Time-intensive to conduct. • Tool to track the process (It’s coming!). • Enforcement/Accountability. • Improving cost estimation/Coordinating w/ CLASS.
    12. 15. “What to Archive”: Notes of Interest. Exemptions: Scientific records that existed within a NOAA archive prior to June 30, 2008 are exempt from this procedure, unless those records require evaluation for potential disposal according to existing NOAA records disposition schedules or do not have a defined disposition schedule. Timeline: Within 30 days of the NOAA Facility’s receipt of the request, the Information Provider will receive acknowledgement of the request and the expected duration of the process that will return a decision to the requester. Public Comment: Any decision that results in a) existing scientific records being removed from a NOAA archive or b) newly acquired scientific records being added to a NOAA archive that have also gone through a formal records appraisal process will be advertised for public comment and appeal by the NOAA Office Director using their Line Office’s procedure for implementing the “NOAA Policy on Partnership in the Provision of Environmental Information.”
    13. 16. What Records are We Talking About? The four categories covered by this procedure are as follows: 1) Original Data; 2) Synthesized Products; 3) Hydro-meteorological, Hazardous Chemical Spill, and Space Weather Warnings, Forecasts, and Advisories; 4) Experimental Products
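
The three "Notes of Interest" (exemption cutoff, 30-day acknowledgement, public comment trigger) and the four record categories read like simple rules, so a tracking tool could plausibly encode them along these lines. This is only a minimal Python sketch, not the ATRAC system or any NOAA implementation: every class, field, and function name here is hypothetical, and the only values taken from the slides are the June 30, 2008 cutoff, the 30-day window, and the category names.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from enum import Enum
from typing import Optional


class RecordCategory(Enum):
    """The four record categories named on the slide."""
    ORIGINAL_DATA = 1
    SYNTHESIZED_PRODUCT = 2
    WARNING_FORECAST_ADVISORY = 3
    EXPERIMENTAL_PRODUCT = 4


# Cutoff quoted in the "Exemptions" note.
EXEMPTION_CUTOFF = date(2008, 6, 30)
# Acknowledgement window quoted in the "Timeline" note.
ACK_WINDOW = timedelta(days=30)


@dataclass
class ArchiveRecord:
    # Hypothetical fields, chosen only for this illustration.
    category: RecordCategory
    archived_on: Optional[date]  # None if not yet in a NOAA archive
    needs_disposal_review: bool = False
    has_disposition_schedule: bool = True


def is_exempt(rec: ArchiveRecord) -> bool:
    """Exempt if archived before the cutoff, unless the record needs a
    disposal evaluation or has no defined disposition schedule."""
    if rec.archived_on is None or rec.archived_on >= EXEMPTION_CUTOFF:
        return False
    return not rec.needs_disposal_review and rec.has_disposition_schedule


def acknowledgement_due(received: date) -> date:
    """Latest date by which the provider should be sent an acknowledgement."""
    return received + ACK_WINDOW


def requires_public_comment(removed: bool, added: bool,
                            formal_appraisal: bool) -> bool:
    """Public comment applies to removals, and to additions that also
    went through a formal records appraisal."""
    return removed or (added and formal_appraisal)
```

For example, under these assumptions a request received on 2011-03-01 would need acknowledgement by `acknowledgement_due(date(2011, 3, 1))`, and a simple-case addition with no formal appraisal would not trigger public comment.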