Research data: what can libraries do?

Transcript

  • 1. Research data: what can libraries do? Zaven Akopov Deutsches Elektronen-Synchrotron DESY Zaven Akopov | “Research Data: What can libraries do?” | Helmholtz Open Access Workshop June 11, 2013 | Page 1
  • 2. Contents
      • Data preservation as an intrinsic part of Open Data
      • HEP data: challenges and specifics
      • Requirements for documentation and long-term storage
      • Types of documents
      • High-level (secondary) data
      • Assignment of metadata and long-term storage in INSPIRE
  • 3. Data preservation as an intrinsic part of Open Data
      • Data management cycle in HEP: data taking – initial storage – data processing – storage of processed data (high-level data) – physics analysis (software) – publication of papers (interpretation)
      • Initially not planned for open access, but experiments have a limited lifetime …
      • Providing infrastructure for data preservation may pave the way to sustainable (open) access to data
  • 4. HEP research data: motivation
      • Accelerators, detectors
      • Unique experimental data, usually not reproducible in other labs
      • Large resources and investments go into building detectors and providing manpower for data analysis
      • By the end of experimental data taking, a substantial amount of data is still not analyzed
      • The data can also be processed and analyzed later with new methods, models, and approaches developed over time
      • Bottom line: HEP data should be preserved and made accessible
  • 5. Efforts and models
      • DPHEP working group active since 2009: www.dphep.org
      • 4 “levels” of HEP research data and its preservation have been identified
  • 6. Efforts and models
      • Most collaborations and labs plan for “Level 4 preservation” (raw data)
      • The requirements of “Level 1” and, partially, of “Level 2” can and should be fulfilled using a high-end bibliographic system, with metadata assignment, etc.
      • This is where the library can help close the gap in the complete data preservation cycle:
          • Identifiable data
          • Focus on data access: preservation is an integral part of data management
  • 7. Specific cases
      • Providing the necessary know-how for re-use requires not only the (primary) data itself and the analysis software, but also the associated documentation related to data taking and analysis (technical guides, internal notes, …)
      • … which provides the basis for the corresponding publications, but also substantially more additional information, e.g.:
          • Details of the data analysis methods (software, simulation, …)
          • Detectors and their components (pedestals, operating parameters, …)
      • Need to preserve secondary data (tables, ROOT scripts, code, plots): simplified data from Level 2
  • 8. Long-term storage
      • Example: HERA (DESY)
          • By the end of running, storage on collaboration/IT structures (servers run by the staff or IT)
          • Websites, AFS space, proprietary structures, etc.
          • Lack of a real bibliographic system (metadata, complex search engine, …)
          • No consistent strategy and no sustainability: these structures would not be preserved by DESY/IT -> provide infrastructure
  • 9. Why INSPIRE?
      • Ingestion of technical documents and analysis notes, with the possibility to interlink them with the actual publications (based on…, superseded by…)
      • The documents are preserved and do not depend on the life expectancy of the specific experiment and its IT infrastructure
      • Many INSPIRE features such as full-text search, data object citation, etc.
      • The secondary data (high-level data) add value to the existing publications
  • 10. INSPIRE features
      • Google-like speed for up to 2M records
      • Combined search of metadata, references, full text
      • Scalability
      • Flexible metadata (multimedia, secondary data)
      • Personalisation (claim your paper)
      • One-stop shop for HEP information
      • Full-text repository
      • Integration of research data
  • 20. Long-term access
      • Access control for the documents (notes)
      • Working together with the labs to develop effective access control strategies (short/long-term)
          • Simple user accounts are live
          • Curator accounts are live
      • Further options
          • Flexible user accounts (based on author lists, external SSO authentication), e.g. arXiv account
  • 21. Summary
      • Collaboration of DESY Library (INSPIRE) with DPHEP
          • HERMES, ZEUS, H1 experiments
          • Internationally: DØ, CDF (Fermilab), BaBar (Stanford)
      • First stage completed for all HERA experiments and DØ (Fermilab): all internal documentation is stored in INSPIRE
      • Second stage completed: collaboration curator accounts with modification rights are also live and a success
      • Test phase: ZEUS and BaBar preliminary notes harvested; the rest should follow
      • High-level data in INSPIRE: HEPData, plots, etc.; INSPIRE provides long-term preservation
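
Slide 9 mentions interlinking technical documents and analysis notes with the actual publications through relations such as "based on" and "superseded by". The following is a minimal sketch of how such typed links between records could be modelled. It is not INSPIRE's actual data model or API: the `Record` class, the `related` helper, the record identifiers, and the titles are all hypothetical and chosen only for illustration; only the two relation names come from the slides.

```python
from dataclasses import dataclass, field

# Relation names taken from slide 9; everything else here is hypothetical.
BASED_ON = "based on"
SUPERSEDED_BY = "superseded by"


@dataclass
class Record:
    """A bibliographic record: a publication, an internal note, or a data set."""
    record_id: str
    title: str
    kind: str  # e.g. "publication", "internal note", "data"
    links: list = field(default_factory=list)  # (relation, target record_id) pairs

    def link(self, relation: str, target: "Record") -> None:
        """Add a typed link from this record to `target`."""
        self.links.append((relation, target.record_id))


def related(records, source_id, relation):
    """Return the ids of all records that `source_id` points to via `relation`."""
    by_id = {r.record_id: r for r in records}
    return [target for rel, target in by_id[source_id].links if rel == relation]


# Hypothetical example: a revised note supersedes an earlier one,
# and the published paper is based on the revised note.
note_v1 = Record("note-1", "Analysis note, first version", "internal note")
note_v2 = Record("note-2", "Analysis note, revised", "internal note")
paper = Record("paper-1", "Published result", "publication")

note_v1.link(SUPERSEDED_BY, note_v2)
paper.link(BASED_ON, note_v2)

records = [note_v1, note_v2, paper]
print(related(records, "paper-1", BASED_ON))
print(related(records, "note-1", SUPERSEDED_BY))
```

Storing links as (relation, target id) pairs rather than object references keeps each record independent of the others, which fits the point made on the slide that preserved documents should not depend on the lifetime of a specific experiment's infrastructure.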