RDAP13 Lorrie Johnson: Facilitating Access to Scientific Data



Lorrie Johnson, U.S. Department of Energy/Office of Scientific and Technical Information: “Facilitating Access to Scientific Data: The DataCite, Science.gov, and WorldWideScience.org Initiatives”

Panel: Linked data and metadata (co-sponsored by the ASIS&T Digital Libraries SIG)
Research Data Access & Preservation Summit 2013
Baltimore, MD, April 4, 2013 #rdap13



  1. Lorrie Apple Johnson, Senior Librarian, Information Analysis & Services, Office of Scientific and Technical Information (OSTI). Research Data Access & Preservation Summit 2013, Baltimore, MD, April 4, 2013
  2. OSTI is a program within the DOE Office of Science with the corporate responsibility for ensuring appropriate access to DOE R&D results.
     • DOE invests over $10 billion/year in basic sciences, clean energy technology, and nuclear research.
     • The immediate output from this investment is information… knowledge… R&D results.
     • OSTI’s mission is to accelerate scientific progress by accelerating access to this information.
     Energy Policy Act of 2005: “The Secretary, through the Office of Scientific and Technical Information, shall maintain within the Department publicly available collections of scientific and technical information resulting from research, development, demonstration, and commercial applications activities supported by the Department.”
  3. Department of Energy Scientific and Technical Information Program
     DOE R&D results are:
     • Collected from DOE offices, labs, and facilities, as well as university grantees;
     • Preserved for re-use; and
     • Made accessible via multiple web outlets.
     OSTI works to ensure that:
     • Research results from DOE programs are shared globally, plus
     • DOE-supported researchers have access to scientific discoveries from around the world
  4. • Scientific research is conducted at many agencies across the federal government.
     • Scientists and researchers produce a lot of information, in many different formats:
       • Textual: reports, journal articles, conference proceedings, patents
       • Multimedia: videos, images
       • Data
  5. Hard to FIND. Hard to NAVIGATE. Hard to CITE.
  6. Data should be cited in just the same way that other sources of information, such as articles and books, are cited.
     Data citation can help by:
     • enabling easy reuse and verification of data
     • allowing the impact of data to be tracked
     • creating a scholarly structure that recognizes and rewards data producers
  7. What is DataCite?
     • A global consortium composed of local institutions focused on improving the scholarly infrastructure around datasets and other non-textual information.
     • A service for assigning Digital Object Identifiers (DOIs) and metadata to datasets.
     • DataCite (www.datacite.org) helps researchers find, access, and reuse data.
  8. • Easier identification of, and access to, datasets across the international community of researchers via DataCite’s resolving tools
     • Linkage between DOE’s R&D documents and the underlying datasets generated by the research
     • Standard format for including data in the accepted bibliographic citation framework
     • Aid researchers in locating exact datasets used in previous work, thus allowing verification of results or new uses for the data
  9. DOE Data ID Service
     • DOE/OSTI is the only U.S. federal member of DataCite.
     • Interagency agreement in place with NIH project; in discussions with seven agencies representing 12 projects.
     • OSTI partnered with Oak Ridge National Laboratory to pioneer the procedure.
     • First DOI for a DOE dataset was minted and registered with DataCite on 8/10/2011.
     • DOE Atmospheric Radiation Measurement (ARM) has now registered over 400 datasets.
  10. [Workflow diagram] Metadata + DOI = Data Citation
      • Creator/Author, Primary Investigator, or Submitter submits metadata to DOE-OSTI through the AN 241.6 Web Service API: Dataset Type; Dataset Title; Dataset Creator/Author or Principal Investigator; Dataset Product Number; DOE Contract/Award Number; Originating Research Organization; Publication/Issue Date; Sponsoring Organization; URL where the Dataset is posted for access; Contact information
      • DOI assigned by DOE-OSTI
      • DOE-OSTI submits nightly feed of new DOIs to DataCite
      • DataCite validates DOI registration with DOE-OSTI and registers the DOI
      • DOE-OSTI updates metadata record with DOI, creating a full Data Citation
      • Creator/Author, Primary Investigator, or Submitter notified of Data Citation availability
      • Data Citation submitted to search engines for indexing
  11. • Dataset Type
      • Dataset Title
      • Dataset Creator/Author or Principal Investigator
      • Dataset Product Number
      • DOE Contract/Award Number
      • Originating Research Organization
      • Publication/Issue Date
      • Sponsoring Organization
      • URL where the Dataset is posted for access
      • Contact information
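The metadata fields above, combined with an assigned DOI, are what make up a data citation. As a rough illustration only, the sketch below models such a record in Python and formats a citation string from it. The class name, field names, citation layout, and the placeholder DOI `10.5555/example-dataset` are all assumptions made for this example; they are not OSTI's or DataCite's actual schema or API. The only external fact relied on is that a DOI can be turned into a link through the public `https://doi.org/` resolver.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetRecord:
    """Metadata fields named on the slide; the field names are illustrative."""
    dataset_type: str
    title: str
    creators: List[str]
    product_number: str
    contract_number: str
    originating_org: str
    publication_year: int
    sponsoring_org: str
    url: str
    contact: str
    doi: Optional[str] = None  # filled in once a DOI has been assigned


def doi_url(doi: str) -> str:
    """Return a resolvable link for a DOI via the public doi.org resolver."""
    return f"https://doi.org/{doi}"


def format_citation(rec: DatasetRecord) -> str:
    """Build 'Creators (Year): Title. Organization. DOI link' (assumed layout)."""
    if rec.doi is None:
        raise ValueError("record has no DOI yet; cannot build a full citation")
    creators = "; ".join(rec.creators)
    return (f"{creators} ({rec.publication_year}): {rec.title}. "
            f"{rec.originating_org}. {doi_url(rec.doi)}")


# Hypothetical record; every value here is made up for illustration.
example = DatasetRecord(
    dataset_type="Numeric data",
    title="Example Aerosol Measurements",
    creators=["Doe, J.", "Roe, R."],
    product_number="EX-0001",
    contract_number="EX-CONTRACT-0000",
    originating_org="Example National Laboratory",
    publication_year=2012,
    sponsoring_org="Example Sponsoring Office",
    url="https://example.org/datasets/ex-0001",
    contact="data-contact@example.org",
    doi="10.5555/example-dataset",
)
print(format_citation(example))
```

Keeping the DOI optional mirrors the workflow on the previous slide, where the metadata record exists first and only becomes a full data citation once the DOI has been added.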
  12. Federated Searching
      Since science is not bound by agency, organization, or geography…
      • We integrate or aggregate multiple government R&D-related databases into single-search portals.
      • Innovative technology drills down to selected databases and websites in parallel, then presents ranked search results.
  13. • Drills into the deep web, where scientific databases reside
      • Finds dynamically generated content living inside those databases; high-quality, managed, subject-specific content
      • Returns current, real-time results
      • Presents no burden for database owner
      • Allows for fielded searching
      Plus:
      • Inexpensive to implement
      • No need-to-know for user
      • No searching door-to-door
      • Automatic interoperability achieved
  14. • Parallel Searching
      • Visualization
      • Clustering
      • Relevancy Ranking
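The federated-search idea described in the slides above (send one query to several databases in parallel, then present a single ranked list) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not how Science.gov or WorldWideScience.org is implemented: the in-memory `SOURCES`, the function names, and the term-count scoring are all hypothetical stand-ins for remote databases and a real relevancy-ranking algorithm.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "databases"; a real portal would query remote sources.
SOURCES = {
    "reports": ["solar cell efficiency report", "wind turbine blade fatigue"],
    "datasets": ["solar irradiance dataset", "atmospheric aerosol dataset"],
    "patents": ["solar cell coating patent"],
}


def search_source(name, docs, query):
    """Search one source; score = number of query terms found in the title."""
    terms = query.lower().split()
    hits = []
    for title in docs:
        words = title.lower().split()
        score = sum(1 for term in terms if term in words)
        if score > 0:
            hits.append((score, name, title))
    return hits


def federated_search(query, sources=SOURCES):
    """Query every source in parallel, then merge hits into one ranked list."""
    with ThreadPoolExecutor(max_workers=len(sources)) as pool:
        futures = [pool.submit(search_source, name, docs, query)
                   for name, docs in sources.items()]
        merged = [hit for future in futures for hit in future.result()]
    # Highest score first; ties broken by source name, then title.
    return sorted(merged, key=lambda hit: (-hit[0], hit[1], hit[2]))


for score, source, title in federated_search("solar cell"):
    print(score, source, title)
```

Because each source is searched where it lives and only the hits are merged, the sources need no shared index, which is in the spirit of the "no burden for database owner" point above.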
  15. Science.gov Integrates Federal Agency R&D Results
      OSTI developed and operates Science.gov… a single search box portal to STI from 13 federal science agencies. Represents 97% of the federal research and development budget.
      • 200 million pages of science information
      • Over 55 databases
      • 2,100 select websites
      Expanding to formats beyond text to multimedia and data.
  16. WorldWideScience.org: Enabling Access to Global R&D Results
      U.S. research results (Science.gov) plus research results from 70+ countries are searchable via a single-query global science portal.
      • Multilingual translation capability for 10 languages.
      • More than 400 million pages of scientific and technical information, including:
        • Text
        • Multimedia
        • Data
  17. Thank you! Lorrie Apple Johnson, U.S. Department of Energy, Office of Scientific and Technical Information, JohnsonL@osti.gov