Where is the opportunity for libraries in the collaborative data infrastructure?


Presentation by Susan Reilly at Bibsys2013 on the opportunities for libraries and their role in the collaborative data infrastructure. Looks at data sharing, authentication, preservation and advocacy.

  1. 1. Where is the opportunity for libraries in the collaborative data infrastructure? Susan Reilly, Project
  2. 2. Contents: About LIBER; Some context; What is the collaborative data infrastructure?; Introducing the researcher to the CDI; Introducing the CDI to the researcher; Now and next?
  3. 3. LIBER: reinventing the library of the future. Largest network of European research libraries: 450 in over 40 countries. Mission: to provide an information infrastructure to enable research in LIBER institutions to be world class.
  4. 4. Key performance areas: Scholarly communication and research infrastructures; Reshaping the research library; Advocacy
  5. 5. LIBER Projects: Reshaping the research library; Scholarly Communication & Research Infrastructure; Advocacy
  6. 6. So why am I here? Collaborative data infrastructure. Reshaping the research library; Scholarly Communication & Research Infrastructure; Advocacy
  7. 7. What is the collaborative data infrastructure (scientific data infrastructure)? …it’s about data
  8. 8. Not just the 20+ petabytes that the LHC at CERN produces every year
  9. 9. Libraries in the data deluge: increasing amount of digitised and born-digital content in libraries; increasing emphasis on open access publications and data (mandates, institutional repositories); demand for data management support
  10. 10. What is the collaborative data infrastructure? “a broad, conceptual framework for how different companies, institutes, universities, governments and individuals would interact with the system – what types of data, privileges, authentication or performance metrics should be planned. This framework would ensure the trustworthiness of data, provide for its curation, and permit an easy interchange among the generators and users of data”
  11. 11. Now and Next: Authentication & authorisation; New skills
  12. 12. Introducing the researcher to the CDI: Current situation; ODE & linking data to publications; Demand for data management support; Advocacy
  13. 13. Opportunities for Data Exchange (ODE): identify, collate, interpret and deliver evidence of emerging best practices in sharing, re-using, preserving and citing data, the drivers for these changes and barriers impeding progress, in forms suited to each audience: policy makers, funders, infrastructure operators, data centres, data providers and users, libraries and publishers
  14. 14. Steps to creating the conditions for data sharing. Understand data sharing today: collection of “success stories”, “near misses” and “honourable failures” in data sharing, re-use and preservation. Data & scholarly communications: integrating data and publications; best practice in data citation; new roles. Identify drivers and barriers: interviews with stakeholders to seek consensus. Foto "Bell", Noordewierweg 116, Amersfoort.
  15. 15. Hypotheses “Without the infrastructure that helps scientists manage their data in a convenient and efficient way, no culture of data sharing will evolve.” Stefan Winkler-Nees (German Research Foundation, DFG)
  16. 16. Hypotheses by Category: 4. Attitudes; 6. Policies; 8. Infrastructure; 10. DMPs, Citability; 11. Dependency on discipline
  17. 17. The Data Publication Pyramid: (1) Data contained and explained within the article; (2) Further data explanations in any kind of supplementary files to articles; (3) Data referenced from the article and held in data centers and repositories; (4) Data publications, describing available datasets; (5) Data in drawers and on disks at the institute
  18. 18. The Pyramid’s likely short term reality: (1) Top of the pyramid is stable but small; (2) Risk that supplements to articles turn into Data Dumping; (3) Too many disciplines lack a community-endorsed data archive; (4) Estimates are that at least 75% of research data is never made openly available
  19. 19. The Ideal Pyramid: (1) More integration of text and data, viewers and seamless links to interactive datasets; (2) Only if data cannot be integrated in article, and only relevant extra explanations; (3) Seamless links (bi-directional) between publications and data, interactive viewers within the articles; (4) More Data Journals that describe datasets, data mgt plans and data methods
  20. 20. Issues for researchers: Researchers need somewhere to put data and make it safe for reuse; researchers need to control its sharing and access; researchers need the ability to integrate data and publication; researchers need to get credit for data as a first class research object; researchers need someone to pay for the costs of data availability and re-use
  21. 21. Library support for the researcher. Libraries and data centres must support… Availability: data as first class research object (publishing, persistent identification/citation of datasets); Findability: data description, metadata, standards, documentation and retrieval; Interpretability: proper documentation of data; Re-usability: long-term data archiving including data curation and preservation
  22. 22. Implications for libraries (level of integration: implication for library)
      Data contained within the article: prepare for adequate preservation strategies
      Data published in supplementary files to articles: presentation and preservation mechanisms; persistent link
      Datasets referenced from the articles: citability of dataset; persistent link; perpetual access to dataset
      Data published independently from written publications (“data publication”): support publication process; curation of datasets; metadata and documentation
      Data in drawers and on disks at the institute: engage in data management planning
  23. 23. Demand for data management support
  24. 24. Advocacy “Many researchers do not appear to see the value and benefits of data citation. There is a gap, which could be filled by libraries, in advocacy for data sharing, the use of subject specific repositories, and best practice in data citation. These, if filled, would increase the number of researchers sharing and reusing data.”
  25. 25. Introducing the CDI to the researcher: Scoping the researcher’s requirements; Collaboration & policy development
  26. 26. The AAA Study: a research passport. “evaluate the feasibility of delivering an integrated Authentication and Authorisation Infrastructure, AAI, to help the emergence of a robust platform for access to and preservation of scientific information within a Scientific Data Infrastructure (SDI)”
  27. 27. Now and Next: Authentication & authorisation; New skills
  28. 28. Methodology
  29. 29. The Google Generation
  30. 30. Collaboration “Networked science is on the rise, the researcher is no longer working alone in his office, he is working virtually with other researchers from around the world. For them it is important that they can use the same software and share and reuse the same content related objects, in a trusted environment.” Heinke Neuroth, Head of Innovation, Goettingen State & University Library
  31. 31. Use Cases: 1. Creating Data; 2. Processing Data; 3. Sharing Data; 4. Preserving Data; 5. Multi-disciplinary Data Services; 6. Analysing Data; 7. Accessing Data; 8. Accessing Experiments and Data
  32. 32. Requirements… Tracking of provenance, authenticity, integrity of the material; integration of researcher ID with institutional credentials; researchers’ self-registration; securely linking researcher and data identifiers for tracking provenance; delegation of identity management to home institute; attribute provisioning for users participating in specific research projects managed by the specific research groups (VOs); attribute aggregation; unification and homogenisation of identity federations’ attributes and agreed levels of assurance in order to facilitate authorisation; accreditation of trusted Identity Providers (IdPs), based on international standards, depending on the required level of assurance; entitlement management to minimise the occurrence of events where license monies are being paid twice without necessity (e.g., for access to scientific journals).
  33. 33. Technical infrastructure
  34. 34. Legal Recommendations: need to protect the user
  35. 35. Collaboration & policy development. Policies for data sharing: Values & Ecosystems; Infrastructure & Technology; Legal & Ethical; Institutional Support
  36. 36. Now & next: What should our priorities be? LIBER ten recommendations:
  37. 37. 1. Identify & develop skills
  38. 38. 2. Collaborate. Alliance for Permanent Access to the Record of Science in Europe Network (APARSEN): look across the excellent work in digital preservation carried out in Europe and try to bring it together under a common vision. Trust! Sustainability! Usability! Access!
  39. 39. Engage
  40. 40. Thank you! Any questions?