00 12-06 the national virtual observatory


  1. 1. The National Virtual Observatory: Leveraging the Astronomical Resources of the Nation for Astrophysics in the 21st Century A National Mandate: The National Virtual Observatory (NVO) needs to provide a unified and negotiated architecture to the nation’s unparalleled, long-term investment in astronomical observations. In addition, the NVO should support a negotiated architecture for the support of laboratory and theoretical modeling results to facilitate the analysis of this same astronomical data. The development of the NVO is complicated by the heritage of existing programs and the legacy of existing systems. In the interest of future scientists and their research programs, NASA needs to lead a tactical and strategic effort to federate university, industry, and government offices under a seamless multi-node program dedicated to astrophysical research. The NVO should be this program. In return, NASA’s drive toward federation will enable highly focused new mission concepts, enhance the return of an investment in research to the public and the nation’s educators, and develop a national market place for analysis, visualization, and archival tools. The NVO will provide a higher return on the nation’s astrophysical research investment than is currently possible. After many decades of working in space, the NASA vision of space science missions divides into three phases: research, flight mission development, and mission operations and analysis. This cycle has been iteratively followed through a number of NASA observatory programs at wavelengths spanning the ultra-violet, gamma-ray, and x-ray and through the infrared and sub-millimeter. Without NASA flight opportunities and dedicated satellite platforms, observational astrophysical research at these wavelengths would not have been possible. Following the launch of SIRTF in the coming years, NASA will have completed its road map for launching the nation’s great observatory program. 
NASA programs have served as inspiration to the astronomical community and the nation and, as a result, have reshaped our view of the universe and have greatly influenced the practice of fundamental astrophysical research. NASA must now lead an effort reaching beyond federal organizations, beyond international boundaries, and into University and Government labs – including (but not limited to) programs and facilities created by NASA – to federate the nation’s astronomical archives, laboratory research, and theoretical modeling efforts. The federation of the astrophysical research community, organized by wavelength regimes, observational analysis, and theoretical modeling, will, in addition to the benefits above, provide a leveraged and growing legacy of NASA’s multi-decade investment in the space sciences. Federated systems are a natural by-product of evolutionary development. Riegler [1] has modeled the lifecycle of space science missions as a federated series of programs: - Research: Gives rise to mission concepts and tests key concepts – theoretical studies, new instrumentation, exploratory ground-based and sub-orbital research.
  2. 2. - Flight Development: Mission studies, development and planning, check out and testing. - Operations and Data Analysis: Planned observations, data interpretation, confirm or revise theoretical modeling. As illustrated in the figure below, each phase flows naturally into the one following. Operations and data analysis typically result in new scientific hypotheses, which require new missions and new technology, after which the cycle repeats. [Figure 1 diagram labels: Observer's Plan; Observer's Proposal & Experiment; Archive; Analyzed Data; Data; Data Collection Plan; Mission Plan; Quick Look; Publication; Simulation] Figure 1. The iterative lifecycle of NASA science missions: clear communications are essential among all lifecycle phases. What is evident from Figure 1 is the need for clear communications between all phases of the science mission lifecycle. Above all else, the process of federation establishes clear channels of communication. Many other systems are federated. For example, NASA space flight and research centers, the US military (Army, Navy, Air Force, and Marines), the US economy, the human body, and the internet are all federated systems. All these examples have defined means of communication. As communication improves, so does performance. In many cases, systematic problems can be linked to problems in communication. Defects in human DNA (mistaken communications) can produce the wrong amino acids and have serious life-threatening consequences. Clear and standardized communications are key to the success of all federated systems. In addition to communications, federated systems must have clear guidelines for functional performance and expected deliverables. Each
  3. 3. component must satisfy a specific need of the overall federated system. In the US economy, buyers and sellers compete for one another’s business based upon negotiated offerings. When viewed from afar, the market place is complex. But in an ideal economy, the solutions are often optimal. ‘Market place’ standards can specify the federated principles in any number of management, engineering, and scientific disciplines. NASA must work to federate the existing astrophysical ‘market place’ to provide an efficient means by which research and scientific data are clearly communicated among ‘market place’ producers and consumers. These standards of the ‘market place’ should be defined under NASA’s NVO program. With a large number of astrophysics-specific missions in operation (9), development (9), and study (10), along with a growing list of previously archived science data (available from STScI, IRSA, HEASARC, NSSDC, etc.), NASA is aware of the need to federate science data visualization and archival activities. A broad series of reports addresses many of the issues in a federated data and analysis research center: the 1982 CODMAC report, the 1996 final report of the Task Group on Science Data Management [2], the 1996 – 2000 Senior Review of Astrophysics Mission Operations and Data Analysis Programs [3], the 1997 – 2000 meeting reports of the Astrophysical Working Group [4], and most recently the 2001 – 2010 decadal review by the Astronomy and Astrophysics Survey Committee [5]. Taken together, these documents highlight the needs and requirements of the astrophysics community in developing a federated data program. Given the opportunity at hand, NASA can capitalize upon the current NVO call to develop a similar federated structure for laboratory and theoretical research programs. Many of these efforts are already federated to some degree.
NASA is required to join the existing components together and create a functional ‘market place’ of astrophysics research which will benefit the scientific community, the nation’s educators, and the broader tax-paying public. Specifically, the science community is asking for a unified approach for science data centers and national archives to exchange data from a large number of astrophysical missions. Many researchers find the large number of visualization tools required for archival research and idea generation, proposal preparation and mission planning, and data reduction and analysis to be specific to either missions or science centers. Astronomical software developers find that supporting the documentation needs of the community, the programming needs of the facility, and the algorithmic needs of the instrument teams requires a duplication of effort in all three areas. Since the earliest reports on data center management were issued – even as early as 1995 – the connectivity and software modeling tools of the information science community have permanently changed our world. Although most astronomical data center developers (> 80%) use object oriented languages (C/C++/Java/Visual Basic), almost half (45%) do not use object oriented modeling methods (e.g. OMT, Booch, or UML) [6]. Of this group, only a small fraction (2%) have a formal degree in computer science, although almost half (48%) have doctorates. The need for a federated program to guide center developers in software management and engineering practices should also be addressed within NASA’s NVO program.
  4. 4. The issues of the NVO are not unique to NASA, nor are they unique to science. NASA must work with university, government, and industrial partners to support the development of a federated network of astrophysics and information science centers to address the needs of the science research community, educators, and the public. A National Mandate and Plan for the Virtual Observatory: The NASA vision of astrophysical research must drive the development of the NVO. Creation of a successful federated system – moving forward from existing subsystems – requires a vision, a mission, and a clear set of values. The vision is a statement of the future – where the community needs to go. The mission serves as a charter for the organization. The values are the principles upon which the organization is based. A set of specific, measurable, realistic, and time-oriented goals is needed to measure progress in achieving these three. USRA suggests the following. The NASA NVO Vision: The astronomical community needs to share a common set of values and approaches for software management and development. These values are reflected in the choice of tools and processes by which the community develops applications whose strength and longevity arise from a growing interactive network of scientists, engineers, and managers. The NASA NVO Mission: The NVO is chartered to provide the community with the recommended best practices for the developments at hand. This charter spans not only ranges of technology readiness (perhaps focusing on only the highest TR levels) but also the needs of the observational and theoretical astrophysical community. The NASA NVO values: 1. That open and competitive community-wide participation between University, Government, and Industrial research centers is essential to the scientific, engineering, and management health of the NVO. 2. 
That "best practices", "lessons learned", "peer review", and "appropriate management infrastructures and insight" are the guiding principles of the NVO. 3. That evolving developments in the field of software engineering must sit side-by-side with established mechanical, optical, electrical, and cryogenic disciplines within the country's astronomical community. The NASA NVO Goals: 1. Immediately fund a study team to generate US community buy-in by 2002 on the broadened approach of the NVO. Formulate and issue an NVO lead institution proposal call by 2002. 2. Have an NVO lead institution selected and funded by 2004.
  5. 5. 3. Have the lead institution established as a central facility (like the astrobiology center) and issue a proposal call for NVO development by 2004. 4. Have funded six 'legacy type' proposals - with institutional leads and partners to address the known needs of the community (e.g. astronomical tool visualization, archiving practices, instrument control, observational modeling and data reduction) by 2006. 5. Have each proposal meet a series of deliverables and milestones in community workshops, continued education, product deliverables, measures of community satisfaction, etc. each year between 2007 and 2010. 6. Re-assess the direction of the NVO and the current leadership in 2008. Assess the successes of proposal deliverables in 2010. 7. Extend, re-bid, or re-assess the success of the lead NVO institution by 2012. 8. Begin phase II of the NVO activity in 2014. A USRA led effort within the NASA NVO program: The Universities Space Research Association (USRA), with its present and recent programs to operate major NASA facilities such as the Stratospheric Observatory for Infrared Astronomy (SOFIA), the Research Institute for Advanced Computer Science (RIACS), the Center for Excellence in Space Data and Information Sciences (CESDIS), and the Lunar and Planetary Institute (LPI), understands the issues of the NVO in detail. Furthermore, USRA foresees the need to actively involve the national university community in the NVO initiative at an early stage. Clearly, the dynamic interaction of research and education that characterizes the university academic environment will be an essential component of a successful NVO program. USRA, with its 85 member institutions, has the expertise necessary to pull the university community together in support of the NVO program. As an example of this approach, USRA offers the successful collaboration engineered to produce the design of the Data Cycle System (DCS) for SOFIA. 
We describe how the USRA approach to teaming should benefit the development and foundation of the NVO. The SOFIA Data Cycle System Development: The SOFIA DCS is designed to meet the general needs of an observatory and is instantiated for the specific needs of SOFIA. The DCS covers the complete science cycle of the observatory. The cycle starts with research into archives of previous observations and continues through proposal submission, observation, and analysis planning. It includes instrument scripting, data reduction, and analysis. It concludes with interpretation and publication followed by archival storage of the results. The DCS serves as a micro-economic prototype of the NVO concept. The DCS architecture is engineered for distributed development, with the ability to share load among networked systems and users. By developing a system that enables fluid access to existing models and
  6. 6. data and that supports distributed development, modeling, and analysis, the DCS will facilitate the planning and decision making process of the observatory science programs. The work done to date on the development of the SOFIA DCS architecture and operating prototype shows that the core system has the capability to support the broader science cycle planning and decision analysis tools needed to support observatory operations. [Figure 2 diagram labels: Awareness; Develop Proposal; Review Proposal; Refine Proposal; Flight Planning; Scheduling; Analysis Planning; Observation Planning; Make Observations; Analyze Data; Scientific Interpretation; Publish] Figure 2 – The SOFIA Data Cycle System is based upon technologies that are current industry standards, platform neutral, and vendor neutral. Using a foundation of CORBA and XML, the architecture is flexible to accommodate future growth and change. An Overview of the USRA SOFIA Data Cycle System: The scope of the DCS is to address the complete astronomical science data cycle. The system has been in design and development since early 1999 and has passed through four successful reviews. USRA leads a team in which the core architecture development is done by the Center for Imaging Science at the Rochester Institute of Technology. The team includes experts from the following organizations: GSFC, ARC, Sterling Software, UCLA, U. Chicago, Cornell and IPAC. The core DCS architecture design has been completed, and prototype software has been written and was demonstrated to an external review team in November 2000. The following items comprise the goals of the DCS as presented to the SOFIA Science Council in December 1998: • Complete end-to-end data cycle support maximizes scientific return • Capitalize on lessons learned • Ease of use • Relentless observing • Interfaces with a uniform look and feel
  7. 7. The goals of the DCS were summarized as: More photons + more astronomers + more archival research = more scientific return The gain in SOFIA scientific performance is accomplished by combining: • Excellent community access • Uniform interface for general investigator operations • Uniform approach to facility instrument operations • Uniform interface to data reduction tools • Uniform production of data products for archival research In developing the DCS requirements and the DCS core prototype, USRA finds a clear correspondence to the needs of the NVO. The fact that the design has reached a level of maturity that enables the construction of a full prototype for testing by the summer of 2001 provides a synergistic opportunity with NVO. An illustration of the DCS concept is shown in Figure 2. The goal is to construct a system that enables fluent collaboration between the scientists and the observatory to conduct the complete cycle of observation planning, data collection, data reduction, analysis and archiving. The DCS will enable the scientist to model the expected results of an observation before the observation is made. It will be possible to construct simulated data sets based on instrument models, construct pipeline models based on existing modules written by instrument team and science center experts, access supporting data from archives, and simulate the observation and data reduction cycle. The plan that is produced will include the flight plan, instrument scripts for data acquisition, and the scripts for the data pipeline. The DCS will support final reduction of the data, archiving of all raw and final products, and archiving of publications. Key Software Technologies and Architectures for the DCS: Requirements for the SOFIA observatory include providing twenty years of service to the astronomical community. 
As change is the only constant in today’s computing industry, it is critically important that the DCS not be tied to any single platform or vendor standard. Therefore, the underlying technologies used in the DCS are those proven to be industry standard, platform neutral, vendor neutral, and flexible enough to accommodate future growth and change. Therefore, the DCS utilizes foundation technologies such as CORBA and XML. Using CORBA and XML as its foundations, then, the Data Cycle System embodies the “continuous improvement” paradigm. As new image reduction algorithms are developed, as new telescope instruments are created, as new ways of planning and executing observations are formulated, the DCS can easily incorporate these improvements. Additionally, these improvements are additions to, not replacements of, the software components that make up the system. It is important to maintain the data and algorithms used in previous practice as a reference when evaluating potential new methodologies. Continuous improvement and mechanisms that enable it have been a part of the DCS since its design.
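The "additions, not replacements" idea above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the names `AlgorithmRegistry`, `subtract_mean`, and `subtract_median` are hypothetical and do not come from the DCS, which the paper describes as built on CORBA and XML rather than on any particular language. The sketch shows one way a system might keep every earlier version of an algorithm available as a reference while a newer version becomes the default.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A reduction algorithm maps a list of pixel values to a new list.
Algorithm = Callable[[List[float]], List[float]]


@dataclass
class AlgorithmRegistry:
    """Hypothetical registry: versions are appended and never removed."""
    _versions: Dict[str, List[Tuple[str, Algorithm]]] = field(default_factory=dict)

    def register(self, name: str, version: str, func: Algorithm) -> None:
        entries = self._versions.setdefault(name, [])
        if any(v == version for v, _ in entries):
            # Refuse to overwrite: an improvement must be a new version.
            raise ValueError(f"{name} {version} is already registered")
        entries.append((version, func))

    def latest(self, name: str) -> Algorithm:
        return self._versions[name][-1][1]

    def get(self, name: str, version: str) -> Algorithm:
        for v, func in self._versions[name]:
            if v == version:
                return func
        raise KeyError(f"{name} {version}")

    def versions(self, name: str) -> List[str]:
        return [v for v, _ in self._versions[name]]


def subtract_mean(pixels: List[float]) -> List[float]:
    mean = sum(pixels) / len(pixels)
    return [p - mean for p in pixels]


def subtract_median(pixels: List[float]) -> List[float]:
    ordered = sorted(pixels)
    mid = len(ordered) // 2
    if len(ordered) % 2:
        median = ordered[mid]
    else:
        median = (ordered[mid - 1] + ordered[mid]) / 2
    return [p - median for p in pixels]


registry = AlgorithmRegistry()
registry.register("background", "1.0", subtract_mean)
registry.register("background", "2.0", subtract_median)  # added alongside 1.0

frame = [1.0, 2.0, 3.0, 10.0]
old_result = registry.get("background", "1.0")(frame)  # earlier practice, kept for reference
new_result = registry.latest("background")(frame)      # current default
```

Because version 1.0 stays registered, a developer evaluating version 2.0 can run both on the same frame and compare the results directly.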
  8. 8. The DCS is a federation of software modules, implemented using CORBA and communicating via XML. The DCS federation is grouped into five functional domains. These domains are Data Acquisition, Data Reduction, Data Storage, User Interaction, and the Task Library. We show in Figure 3 the relationship of each domain. We describe the detail of all five domains in Appendix A of this paper. The elements within each functional domain, or subsystem, implement a well-defined set of interfaces to the rest of the system. In this way, the implementation details of a given component are not important to other software components in the system; indeed, those details are not known. This object-oriented encapsulation of functionality is instrumental in smoothly connecting software components, potentially developed by different developers at different times, into a cohesive whole called the Data Cycle System. The DCS is a flexible, capable, and powerful system that enables the operation of an astronomical observatory. It is adaptable to future improvements in instruments, computers, software, and the science needs of astronomers. DCS Management, Engineering, and Science leadership: Within the USRA SOFIA program, the DCS development has designated management, engineering, and science leads. Each lead is partnered with a deputy to provide a solid management foundation during development. The current positions are Bob Hovde (USRA) – Management Lead, Peter Sharer (Sterling Software) – Management Deputy, John Graybeal (Sterling Software) – Engineering Lead, Bob Krzaczek (RIT) – Engineering Deputy, Sean Casey (USRA) – Science Lead, and Joel Kastner (RIT) – Science Deputy. For the NVO white paper development, USRA involved participation from Eric Becklin (USRA), Sean Casey (USRA), Ian Gatley (RIT), Joel Kastner (RIT), Jacques L’Heureux (USRA), Bob Krzaczek (RIT), Barry Leiner (USRA), Mark Morris (UCLA), Harvey Rhody (RIT), and Peter Sharer (Sterling Software). 
USRA’s involvement in SOFIA, RIT’s involvement in the South Pole astronomy program, and UCLA’s participation in Keck, combined with the joint experiences of the individuals involved, provide the foundation for USRA to move forward as a strong lead within NASA’s NVO program. The NVO Development: The NVO will touch on all parts of the process of scientific inquiry and will be the central system through which all data and processing resources are accessed. The NVO will necessarily be distributed, to connect geographically distributed users and geographically distributed data sources. Indeed, a major challenge is the distributed and heterogeneous character of the basic environment. Kepner and McMahon [7] make a case that the NVO must address all of the following elements: (1) Acquisition and storage of raw data; (2) Data reduction; (3) Acquisition and storage of detected sources; and (4) Multi-sensor/multi-temporal data mining. They note that “the NVO’s core data mining and archive federation activities are heavily dependent on the underlying data pipeline software necessary to translate the raw data into scientifically relevant source detections.”
  9. 9. Furthermore, an open architecture is required for any data system serving the NVO; the community must be able to contribute to the growing body of data calibration and analysis knowledge, as expressed in algorithms and/or code, that is available to all NVO users. A definition of established roles within NVO is required. This should necessarily include the assignments of: A) Project Scientist – an astrophysicist with extensive experience in the development of data and archive systems who would lead the top-level science requirements definition for the NVO program based upon input from an established science definition team and the community. B) Project Engineer – an information technology expert who would oversee the design, development, and implementation of an overall architecture that satisfies the specified science requirements of the program. C) Project Manager – an individual experienced in leadership within the science and engineering community, able to interact closely with NASA Headquarters management, and able to guide science and engineering definition and development efforts to meet the goals of the NVO program. The USRA track record of experience and accomplishments: USRA is directly involved in center activities at the NASA Goddard, Marshall, and Johnson Space Flight Centers as well as at NASA’s lead center for information technology, the Ames Research Center. USRA has a successful track record of working closely with NASA centers and with NASA Headquarters staff. USRA also has working relationships with external NASA science centers – the Space Telescope Science Institute at Johns Hopkins University and the SIRTF Science Center at the California Institute of Technology. USRA is tied to NSF, other government programs, and other non-profit organizations (e.g. AURA) through its direct University membership consortium. Working teaming relationships include USRA partnering with for-profit companies such as Raytheon (SOFIA) and Lockheed (CSOC). 
In addition, USRA has a demonstrated record of spinning off appropriate research developments as start-up businesses within the country’s new economy (e.g. Scyld Computing Corporation - the original Beowulf team providing Second Generation Beowulf Clustering [8]). USRA is available to provide strong leadership in organizing the community to speak to NASA’s request for an NVO development roadmap. USRA has done so before, and is able to repeat past successes in support of NASA’s requests. References [1] Guenter Riegler, “The Lifecycle of Space Science Missions,” Research Division, Office of Space Science, NASA Headquarters, September 13, 2000. http://www.aas.org/policy/NASALifeCycle.htm
  10. 10. [2] Jeffrey Linsky, “Final Report of the Task Group on Science Data Management to the Office of Space Science,” October 23, 1996. http://adc.gsfc.nasa.gov/~gass/linsky/report_available.html [3] NASA Senior Review Reports: 2000 - http://spacescience.nasa.gov/codesr/results/SenRev00.PDF 1998 - http://spacescience.nasa.gov/codesr/results/SenRev98.PDF 1996 - http://spacescience.nasa.gov/codesr/results/SenRev96.PDF [4] The "Astrophysics Working Group" (AWG) meeting reports: http://www.astronomy.ohio-state.edu/~AWG/Meetings/ [5] The Decadal Report: Astronomy and Astrophysics in the New Millennium, Astronomy and Astrophysics Survey Committee, National Research Council - http://books.nap.edu/catalog/9839.html [6] A. F. Granados, “A ‘Scientific’ Approach to Software Project Management: Part II: Results of a Survey of Scientific Computing,” in Astronomical Data Analysis Software and Systems IX, ASP Conference Series, 2000, Vol 216, N. Manset, C. Veillet, and D. Crabtree, eds. [7] Jeremy Kepner and Janice McMahon, “High Speed Interconnects and Parallel Software Libraries: Enabling Technologies for the NVO,” in The Evolution of Galaxies on Cosmological Timescales, ASP Conference Series, 1999, J. E. Beckman and T. J. Mahoney, eds. [8] Scyld Computing Corporation http://scyld.com/ Glossary: CORBA, the Common Object Request Broker Architecture, is an industry-standard set of specifications that define how to distribute an object-oriented software architecture across platforms and networks, and how components of that architecture inter-operate remotely with each other. It enables communication between software objects that may be located on multiple dissimilar machines and implemented in different languages. CORBA is not a software product. Rather, it is a standard maintained by the Object Management Group, a consortium of hundreds of organizations. 
Software applications based on CORBA can, by definition, communicate with each other despite being implemented on a variety of vendor platforms. XML, the Extensible Markup Language, is a remarkably flexible yet simple method for representing and structuring information. Its primary strength and advantage over other formats for information exchange is its ability to be extended to accommodate future data structures. Rather than implementing a fixed set of structures for information, developers using XML can extend a given structure repertoire by taking advantage of in-document definitions: an XML document can define its structure along with the data it represents. This gives software developers a very rich and powerful mechanism for both defining information exchange now, as well as accommodating future enhancements and extensions. The use of XML directly addresses problems associated with heterogeneous data structures. Appendix A: Below is a description of the domains which comprise the DCS core prototype. The federation of modules provides a flexible architecture
  11. 11. to support the continuous improvement in SOFIA instruments, computers, software, and the science driven needs of astronomers and the observatory. The Data Acquisition subsystem is responsible for rendering, or translating, a planned observation into the idiosyncratic requirements of a particular instrument. Rather than re-implementing observation software for each instrument supported by the DCS, the system instead maintains observation plans in a flexible neutral internal format. When interacting with an instrument, the DCS applies a collection of software components to gradually refine and translate that observation into the format required by the instrument. This approach minimizes redundant software development and maximizes re-use of code between similar instrument science teams. The Data Reduction subsystem implements algorithm pipelines, converting raw instrument data into useful science products. This reduction is automatic; when raw data is introduced to the DCS, an appropriate pipeline is immediately scheduled and executed at the earliest opportunity. The DCS supports pipelines running on machines not only at the SOFIA Science Center, but distributed across networks and even reaching to a researcher’s desktop. Someone implementing a new pipeline algorithm, or evaluating an improvement to an existing algorithm, can easily use the DCS as a “test harness” for their software. All the data and algorithms available to a general investigator are available to a developer for testing and evaluation of improvements to the DCS. The Data Storage subsystem provides a repository of information to the distributed components of the DCS. More than just “data” is maintained in DCS Storage; experiments, observation plans, flight logs, reduction pipelines, and proposals are all maintained within DCS Storage, along with the expected archives of raw and reduced science data. 
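The Data Acquisition approach described above, in which a neutral observation plan is gradually refined by a collection of components into an instrument-specific form, can be sketched in a few lines of Python. This is a hedged illustration only: the plan fields, the `POINT`/`EXPOSE` script commands, the target coordinates, and all function names are assumptions made for the example and do not describe the actual DCS formats or any SOFIA instrument.

```python
from typing import Callable, Dict, List

Plan = Dict[str, object]
Step = Callable[[Plan], Plan]


def resolve_target(plan: Plan) -> Plan:
    """Generic step: attach coordinates from a (hypothetical) catalog."""
    catalog = {"M42": (83.82, -5.39)}  # illustrative values, in degrees
    ra, dec = catalog[plan["target"]]
    return {**plan, "ra_deg": ra, "dec_deg": dec}


def split_exposures(max_frame_s: int) -> Step:
    """Instrument-specific step: break the total exposure into frames
    no longer than the instrument's (assumed) maximum frame time."""
    def step(plan: Plan) -> Plan:
        full, rest = divmod(plan["exposure_s"], max_frame_s)
        frames = [max_frame_s] * full + ([rest] if rest else [])
        return {**plan, "frames_s": frames}
    return step


def render_script(plan: Plan) -> Plan:
    """Final step: emit commands in a made-up instrument script syntax."""
    lines = [f"POINT {plan['ra_deg']:.2f} {plan['dec_deg']:.2f}"]
    lines += [f"EXPOSE {t}" for t in plan["frames_s"]]
    return {**plan, "script": "\n".join(lines)}


def translate(plan: Plan, steps: List[Step]) -> Plan:
    """Apply each refinement step in order; the input plan is not modified."""
    for step in steps:
        plan = step(plan)
    return plan


# The neutral plan carries no instrument detail.
neutral_plan = {"target": "M42", "exposure_s": 25}

# Supporting another instrument means swapping or adding steps,
# not rewriting the observation software.
camera_steps = [resolve_target, split_exposures(max_frame_s=10), render_script]
result = translate(neutral_plan, camera_steps)
print(result["script"])
```

In this sketch the generic steps (`resolve_target`, `render_script`) are shared, and only `split_exposures` is configured per instrument, which mirrors the re-use between similar instruments that the text describes.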
All the information in this subsystem is maintained in a neutral format that is easily processed by software components within the DCS, and is easily exchanged with external DCS customers and archives. The User Interaction subsystem is a common interface to the functionality provided by the DCS. In fact, the rest of the DCS needs to provide no user interfaces; instead, all user interactions are conducted through a common, yet customizable, interaction layer. In this way, changes made to a user’s DCS experience (based on, perhaps, spoken languages or skill levels) do not require changes in the rest of the system software. The software that drives an instrument, for example, is unaffected by the user’s preference of German over English, or the user’s skill level in using the system. The Task Library works closely with the User Interaction subsystem in supporting the imaging scientist using the DCS. The Task Library provides the perceived intelligence of the DCS, coordinating the actions of the various system resources (acquisition, reduction, storage) in ways that satisfy user’s requests. A task may be simple (“locate any data matching a specified target”) or complex (“conduct a specified observation, reduce its data, store all intermediate and final results, and email the observation’s authors the results of the experiment when available”). Most importantly, the Task Library is where past, present, and future “best practices” are embedded within
  12. 12. the DCS and made available to all of its users in a consistent, uniform manner. Appendix B: We have included this section (Section 3 - PRINCIPLES OF SUCCESSFUL MANAGEMENT OF SCIENTIFIC DATA by Linsky et al.) because of its timeless statement of science software requirements. “Through extensive case studies, the Committee on Data Management and Computation (CODMAC) of the Space Science Board, National Research Council, derived a suite of principles or operating guidelines for maximizing the scientific utilization for space data (NRC, 1982). The case studies involved large data centers such as the National Space Science Data Center (NSSDC), data activities associated with missions, and relatively small groups of individuals who were working on generating new products from data long after mission operations had ceased. The CODMAC principles and associated recommendations have been used over the past 15 years to help guide data system development within NASA, including establishing programs at NASA Headquarters, implementation of well-defined data systems within missions that include generation and validation of data products, and the recognition that discipline oriented systems are best suited for maximizing the scientific utilization of space data. “CODMAC also provided detailed recommendations for distributed, discipline-oriented data systems, focusing on expected data sets for each space science discipline, probable growth in networking and computational capabilities, and an examination of how each discipline-oriented community will likely work with data (NRC, 1986)… The principles are listed here, and later in the Report we restate these principles and our recommendations for successful space science data management that flow from these principles and the new case studies that we discuss later in this Report. 
The Task Group members unanimously concur that the principles remain entirely valid today, and they were used to help guide deliberations and recommendations.” 1. ``SCIENTIFIC INVOLVEMENT. There should be active involvement of scientists from inception to completion of space missions, projects, and programs in order to assure production of, and access to, high-quality data sets. Scientists should be involved in planning, acquisition, processing, and archiving of data. Such involvement will maximize the science return on both science-oriented and applications-oriented missions and improve the quality of applications data for applications users.'' 2. ``SCIENTIFIC OVERSIGHT. Oversight of scientific data-management activities should be implemented through a peer-review process that involves the user community.'' 3. ``DATA AVAILABILITY. Data should be made available to the scientific user in a manner suitable to scientific research needs and have the following characteristics:'' (a) ``The data formats should strike a proper balance between flexibility and economics of nonchanging record structure. They should
  13. 13. be designed for ease of use by the scientist. The ability to compare diverse data sets in compatible form may be vital to a successful research effort.'' (b) ``Appropriate ancillary data should be supplied, as needed, with the primary data.'' (c) ``Data should be processed and distributed to users in a timely fashion as required by the user community. This responsibility applies to Principal Investigators and to NASA and other agencies involved in data collection. Emphasis must be given to ensuring that data are validated.'' (d) ``Proper documentation should accompany all data sets that have been validated and are ready for distribution or archival storage.'' 4. ``FACILITIES. A proper balance between cost and scientific productivity govern the data-processing and storage capabilities provided to the scientist.'' 5. ``SOFTWARE. Special emphasis should be devoted to the acquisition or production of structured, transportable, and adequately-documented software.'' 6. ``SCIENTIFIC DATA STORAGE. Scientific data should be suitably annotated and stored in a permanent and retrievable form. Data should be purged only when deemed no longer needed by responsible scientific overseers.'' 7. ``DATA SYSTEM FUNDING. Adequate financial resources should be set aside early in each project to complete database management and computational activities; these resources should be clearly protected from loss due to overruns in costs in other parts of a given project.''
