Scientific Information Management at the U.S. Geological Survey


In: Geoinformatics 2006—Abstracts, Shailaja R. Brady, A. Krishna Sinha, and Linda C. Gundersen (eds.), USGS Scientific Investigations Report 2006-5201.


  1. 1. Scientific Information Management at the U.S. Geological Survey: Issues, Challenges, and a Collaborative Approach to Identifying and Applying Solutions David L. Govoni and Thomas M. Gunther USGS Geospatial Information Office Geoinformatics 2006 May 12, 2006 U.S. Department of the Interior U.S. Geological Survey
  2. 2. Geospatial Information Office (GIO) Science Information and Education Office Responsibilities: - Publishing policy and coordination - Libraries and Information Centers - Web infrastructure and content policy - Product Warehouse and distribution - Education and outreach - Knowledge management services - Scientific information management
  3. 3. Geospatial Information Office (GIO) Science Information and Education Office Accomplished in partnership with USGS science and administrative programs through a combination of: - Governance - Consultation - Facilitation - Collaborative development Goal is to enable and support an “Integrated Information Environment” for the USGS
  4. 4. Integrated Information Environment (IIE)
  5. 5. Problems, problems … everywhere Common issues identified from discussions with scientists and others across USGS disciplines: - Search and discovery (especially by place and topic) - Database access and integration - Interoperability of tools and processes - Advanced visualization, modeling, other tools - Archive and preservation Compliance with mandates: - Security, science quality, publishing, records management, accessibility, …
  6. 6. The solution? Good news … bad news Lots of talent, innovation, and motivation, but: Widely scattered geographically and organizationally Many local efforts unknown to others in USGS Duplicative or overlapping in purpose, capabilities Built on multiple platforms in multiple languages Some good, some not so good Some potentially scalable, some not “Costly” to organization as a whole
  7. 7. So how do we … Increase awareness? Identify “best of breed”? Accelerate diffusion? Provide support? Institutionalize? One approach: Communities of Practice (CoPs)
  8. 8. What is a “Community of Practice”? Communities of Practice are groups of people who share a concern or a passion for something they do and learn how to do it better through the process of collective learning as they interact regularly. CoPs are: - Problem driven - Self-organizing, voluntary, and motivated - Not constrained by position in formal organizations - Not formally chartered or accountable through management chains as for teams Modified after Etienne Wenger (
  9. 9. USGS Scientific Information Management (SIM) Workshop Three-day Scientific Information Management Workshop, March 2006 150+ people representing all USGS regions and both science and administrative programs Other DOI bureaus, other public and private-sector organizations also participated Explicit focus on intersection of SIM and CoPs
  10. 10. SIM Workshop Three parts: - Overviews of problems and approaches to SIM both inside and outside of the USGS - Introduction to “Community of Practice” concept as a framework for collective learning and collaborative problem solving - Breakouts designed to simultaneously: Identify key issues and needs Explore and encourage the formation of CoPs to develop solutions
  11. 11. Potential communities Data/information management - Field data for small research projects - Large time series data sets - Scientific data from monitoring programs Classification and discovery - Metadata - Knowledge organization systems Delivery - Digital libraries - Portals and frameworks
  12. 12. Potential communities Interoperability and integration - Database networks Preservation and long-term access - Archiving of scientific data and information - Preservation of physical collections Knowledge management - Knowledge capture - Emerging workforce
  13. 13. Outcomes At least 9 of 12 potential communities agreed to continue on as “formal” CoPs Other potential communities proposed, e.g., - Open access - Open source software - Search - Program management Management commitment to support creation of bureau-wide infrastructure to enable current and future CoPs
  14. 14. USGS Communities Network Common gateway to all known USGS CoPs Framework of shared collaborative services and tools available to support interested communities: - Discussion forums - Document management - Digital library and bibliography management - News and Events calendar - Wikis and annotation - RSS feeds - … Initially USGS-only but eventually available to external collaborators and partners
  15. 15. Workshop evaluation Reviews positive: - Met or exceeded expectations: 89% - Change practices as result: 33% - Participate in communities: 72% - Learned new tools or approaches: 50% - Make valuable new contacts: 90% Suggests broad interest and appeal of communities approach (based on ~50% survey response)
  16. 16. What was learned Those “in the trenches” know best: - Cannot implement top-down SIM solutions - Solutions can come from (and be managed from) anywhere One size won’t always fit all, but … - Many issues are common to all USGS disciplines - Local approaches may be broadly applicable, scalable, and cost-effective for the USGS as a whole
  17. 17. Perspectives on SIM … a digression SIM needs to be considered from two distinct, but intimately related perspectives: - “Information life-cycle” or Producer perspective Course of data and information from initial acquisition to final disposition - Consumer perspective How data and information is used to accomplish tasks
  18. 18. Producer perspective refers to four groups of activities: - Fieldwork (in situ, in vitro, in silico), which includes direct & remote observation, monitoring & recording - Analysis, synthesis & interpretation, which includes laboratory experiments, modeling, visualization - Preparation & distribution (via any medium), which includes publications, data, talks, seminars, physical models, libraries - Preservation & archiving, which includes records management, data rescue, sample preservation
  19. 19. Consumer perspective
  20. 20. “Metainformation” is critical to both Broadly defined here to encompass both “classic metadata” and “contextual information” (rules, assumptions, ontologies, schema, documentation, etc.) that impart deeper understanding or facilitate use Metainformation: - Critical to our ability to conduct integrated studies - Critical to maintaining long-term access - Should be, but very often is not, formally captured and preserved all along the information life-cycle
  21. 21. Perspectives on SIM End of digression
  22. 22. What was learned … SIM is not easy Despite advances in technology, many tasks: - Remain time-consuming - Require significant involvement by scientists (sometimes at the expense of their science) - Lack incentives to “do the right thing” Volume outpacing resources Legacy data may already be beyond saving
  23. 23. SIM is not an option Good stewardship of data, information, physical artifacts, and associated metainformation is an obligation of the research community: - As a matter of self interest (e.g., as precondition for being viewed as a “trusted source”) - Data and information is of little value if it cannot be found or delivered in a timely or usable condition - Reproducibility of results – a hallmark of the scientific method – may be impaired or impossible without it
  24. 24. Meeting the challenges … There is hope! Communities of practice, if encouraged and supported, offer several benefits: - Strength in numbers: Multiple perspectives and insights are brought to bear on problems Yield better solutions, faster - Organizational adaptability: Can coalesce rapidly around issues driven by changing technologies, research needs, or other challenges without time-consuming organizational realignments
  25. 25. There is hope! - Cost-effectiveness: Fewer development “stovepipes” Less likely to “reinvent the wheel” Useful knowledge, tools, and techniques are rapidly distributed throughout the organization Standardization, interoperability more likely - Collective learning: Participation increases knowledge and skills of all participants Overall organizational competence is enhanced Knowledge is more likely to be preserved for the next generation
  26. 26. Thank you. … Questions? Dave Govoni ( Tom Gunther (
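
Slide 20 says that metainformation should be formally captured all along the information life-cycle, and often is not. The sketch below is one possible way to picture that point in Python; it is not a USGS system or schema. The stage names come from the producer-perspective slide, while the `Metainformation` and `DataSet` classes, their fields, and the sample values are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    """Producer-perspective stages as named on slide 18."""
    FIELDWORK = "fieldwork"
    ANALYSIS = "analysis, synthesis & interpretation"
    DISTRIBUTION = "preparation & distribution"
    PRESERVATION = "preservation & archiving"


@dataclass
class Metainformation:
    """Hypothetical record: classic metadata plus contextual information."""
    metadata: dict[str, str] = field(default_factory=dict)  # e.g. place, topic
    context: dict[str, str] = field(default_factory=dict)   # e.g. rules, assumptions

    def is_empty(self) -> bool:
        return not self.metadata and not self.context


@dataclass
class DataSet:
    """Hypothetical data set that tracks metainformation per stage."""
    name: str
    captured: dict[Stage, Metainformation] = field(default_factory=dict)

    def record(self, stage: Stage, info: Metainformation) -> None:
        """Attach metainformation gathered at one life-cycle stage."""
        self.captured[stage] = info

    def missing_stages(self) -> list[Stage]:
        """Stages with no metainformation yet, in life-cycle order."""
        return [s for s in Stage
                if s not in self.captured or self.captured[s].is_empty()]


# Made-up example values; only the fieldwork stage has been documented.
streamflow = DataSet("example time series")
streamflow.record(Stage.FIELDWORK, Metainformation(
    metadata={"place": "example site", "topic": "streamflow"},
    context={"assumptions": "sensor calibrated before deployment"},
))
gaps = streamflow.missing_stages()
print([s.name for s in gaps])  # ['ANALYSIS', 'DISTRIBUTION', 'PRESERVATION']
```

A check like `missing_stages()` makes the gap visible early, while the people who hold the contextual knowledge are still available to write it down.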