Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policies/Simon Hodson


Presentation during the 14th Association of African Universities (AAU) Conference and African Open Science Platform (AOSP)/Research Data Alliance (RDA) Workshop in Accra, Ghana, 7-8 June 2017.

Published in: Education

  1. Open FAIR Data and Open Science: Developing Partnerships, Strategies, Policies. Simon Hodson, Executive Director, CODATA. African Open Science Platform Project and Research Data Alliance Workshop, Association of African Universities Conference, Palm Royal Beach Hotel, Accra, Ghana, 8 June 2017.
  2. Why Open Science, Open Data and FAIR Data?
  3. Why Open Science / FAIR Data?
     • Good scientific practice depends on communicating the evidence.
     • Open research data are essential for reproducibility and self-correction.
     • Academic publishing has not kept up with the age of digital data.
     • Danger of a replication / evidence / credibility gap.
     • Boulton: to fail to communicate the data that supports scientific assertions is malpractice.
     • Open data practices have transformed certain areas of research.
       • Genomics and related biomedical sciences; crystallography; astronomy; areas of earth systems science; various disciplines using remote sensing data…
     • FAIR data helps use of data at scale, by machines, harnessing technological potential.
     • Research data often have considerable potential for reuse, reinterpretation, use in different studies.
     • Open data foster innovation and accelerate scientific discovery through reuse of data within and outside the academic system.
     • Research data produced by publicly funded research are a public asset.
  4. Data Revolution: A World that Counts!
     • Creating a world that counts: Mobilising the Data Revolution for Sustainable Development.
     • To meet the new sustainability goals ‘there is an urgent need to mobilise the data revolution for all people and the whole planet in order to monitor progress, hold governments accountable and foster sustainable development.’
     • Without immediate action, gaps between developed and developing countries, between information-rich and information-poor people, and between the private and public sectors will widen, and risks of harm and abuses of human rights will grow.
     • Data quality and integrity
     • Data disaggregation (no-one should be invisible)
     • Data timeliness
     • Data transparency and openness
     • Data usability and curation
     • Data protection and privacy
     • Data governance and independence
     • Data resources and capacity
     • Data rights
  5. Data Revolution: how can we improve … with open data?
     • GODAN-ODI Report: improving agriculture, food and nutrition with open data.
     • ‘Although the amount of data openly available is constantly increasing, there are still challenges related to data management, licensing, interoperability and exploitation. There is a need to evolve policies, practices and ethics around closed, shared, and open data.’
     • Enabling more efficient and effective decision making > lowers cost of accessing information and underpins tools that farmers themselves can use.
     • Fostering innovation to benefit everyone > an opportunity that must not be missed for creating new businesses and jobs in ‘new data-powered innovation ecosystems’.
     • Driving organisational and sector change through transparency > open data is essential to understanding complex systems, interventions, targets, change.
     • Availability is not enough > essential that the data be interoperable and machine-readable.
     • Problem-oriented and solution-based data strategies.
     • Develop infrastructure and human capacity.
  6. The Value of Open Data Sharing
     • Report by CODATA for GEO, the Group on Earth Observations.
     • Provides a concise, accessible, high-level synthesis of key arguments and evidence of the benefits and value of open data sharing.
     • Particular, but not exclusive, reference to Earth Observation data.
     • Benefits in the areas of:
       • Economic Benefits
       • Social Welfare Benefits
       • Research and Innovation Opportunities
       • Education
       • Governance
     • Available at
     • GEO DSWG is building on this work with further examples: would be valuable to work with this community.
  7. Africa, Data and Open Science
     • The 21st century is the century of data.
     • Data skills and infrastructure will be essential for economic advancement and for sustainable development.
     • We need to create a ‘world that counts’ that gathers data and uses data to understand itself.
     • Open data is essential to increase impact of research and translation for practitioners.
     • African governments, research and education systems and universities have an interest in developing data skills and infrastructure.
     • African universities have an essential role to play as educators of a data-savvy generation and as the stewards of the data created by African research.
     • The data from many research projects conducted in Africa is not looked after in African institutions.
     • African institutions need to present their research outputs, including data, as a shop window and a record of their activities, achievements, impact.
  8. Open Science
     • What is Open Science:
       • Open access to research literature.
       • Data that is as Open as possible, as closed as necessary.
       • FAIR Data (Findable, Accessible, Interoperable, Reusable).
       • A shop window and repository of all research outputs.
       • A culture and methodology of open discussion and enquiry (including methodology, lab notebooks, pre-prints).
     • Research data is evidence: it is fundamental to the validity and reproducibility of science.
     • Those research disciplines that have leapt forward in the past 15-20 years are those that have shared and analysed data at scale: genomics, astronomy, disciplines using remote sensing data etc.
     • African research institutions have an opportunity to build their reputation around research specialisation, and this requires data specialisation and FAIR data collections.
  9. The Case for Open Data in a Big Data World
     • Science International Accord on Open Data in a Big Data World:
       • Supported by four major international science organisations.
       • Presents a powerful case that the profound transformations mean that data should be:
         • Open by default
         • Intelligently open
       • Lays out a framework of principles, responsibilities and enabling practices for how the vision of Open Data in a Big Data World can be achieved.
     • Campaign for endorsements: over 150 organisations so far.
     • Please consider endorsing the Accord:
  10. National and Institutional Data Strategies
  11. Opportunities for Research Institutions and for the National Research Base: Open and FAIR Research Data Presents Major Opportunities for Research Systems
      • Research-intensive RPOs (research performing organisations) will be data-intensive RPOs.
      • Supporting researchers’ use of data is a key strategic mission and enabler: a world-class research environment includes support for data stewardship.
      • An RPO’s reputation is increasingly built on all research outputs and wider societal and economic impact: data is core to this.
      • Development of significant data collections of research-intensive universities. Leading departments / research groups will be characterised by excellence in data, by Open FAIR data collections.
      • The contribution to research of both the individual researcher and the institution will increasingly be measured on the basis of data outputs as well as research articles.
      • Policies less and less ambiguous: data stewardship, RDM is necessary for grant funding success.
      • Avoid reputational damage through data loss.
  12. Challenges for Research Systems
      • Policy development: unpicking Open and FAIR data.
      • Supporting data through the lifecycle.
      • Culture and incentives: what’s in it for us?
      • Skills gaps: training and support.
      • Technical systems and infrastructure.
      • Developing culture of conscious data stewardship: what to keep and what to discard.
      • Supporting the long-term stewardship of research data: finding niche in data ecosystem, clarifying division of responsibility between institutional, national and international repositories.
      • Sustainability and finance.
      • Sustainability and finance.
      • Sustainability and finance.
      • …
  13. Framework for National and Institutional Data Strategies
      • National / Institutional Open and FAIR Data Strategy.
      • Open data policies and guidance at national and institutional level.
      • Clarify the boundaries of open (particularly privacy, IPR).
      • Mechanisms (infrastructure and policy) to ensure concurrent publication of data as research output.
      • Data ‘publication’ and citations of data included in assessment of research contribution.
      • Promotion of data skills (researchers and data stewards).
      • Development of institutional infrastructure for research collaboration and data stewardship/RDM.
      • Collaborative infrastructures for certain research disciplines, nationally, regionally to pool expertise and lower costs.
  14. Components of RDM support services (diagram)
      • RDM Policy and Roadmap
      • Business Plan and Sustainability
      • Guidance, Training and Support
      • Data Management Planning
      • Managing Active Data
      • Processes for selection and retention
      • Deposit / Handover
      • Data Repositories / Catalogues
      • Research Data Registry / Infrastructure
      • Institutional Research Data Management Policies: esources/policy-and- legal/institutional-data- policies/uk- institutional-data- policies
  15. Group Work 1: Materials
      • Consider the principles laid out in the Science International Accord on Open Data in a Big Data World: (short version) or
      • Consider Section D of the Accord on Enabling Practices: (long version) or
  16. Group Work 1: Activity
      1. Endorse the Accord. Please consider endorsing the Accord and take an action to discuss it in your institution:
      2. Responsibilities. Do you agree with the description of responsibilities in the Accord? Who are the key national stakeholders? Who are the key stakeholders in the research institution? What are their roles and responsibilities? What would their roles and responsibilities be ideally? What needs to be done to achieve this? (30 minutes)
      3. Enabling Practices. What are the most important enabling practices? What things should a national or institutional data strategy address? Critique the framework for National and Institutional Data Strategies. (30 minutes)
      4. Reporting Back (20 minutes)
  17. Open and FAIR Data Policy at National and Institutional Level
  18. Resources: Current Best Practice for Research Data Management Policies
      • Expert report commissioned by CODATA member:
      • Provides comprehensive summary of best practice in funder data policies.
      • Identifies key elements to be addressed:
        1. Summary of policy drivers
        2. Intelligent openness
        3. Limits of openness
        4. Definition of research data
        5. Define data in scope
        6. Criteria for selection
        7. Summary of responsibilities
        8. Infrastructure and costs
        9. DMP requirements
        10. Enabling discovery and reuse
        11. Recognition and reward
        12. Reporting requirements, compliance monitoring
  19. Resources: Current Best Practice for Research Data Management Policies
      • See also RECODE Report, Annex on Policy Development:
      • LEARN Project Toolkit:
      • FOSTER Knowledge Base on Open Science:
  20. Group Work 2: Materials
      • Consider the framework for data policies in ‘Current Best Practice for Research Data Management Policies’ or
      • Consider the elements of data policies in the LEARN project ‘Developing a Research Data Policy: Core Elements of the Content of a Research Data Management Policy’:
  21. Group Work 2: Activity
      1. Develop the Outline of an Institutional (University or RPO) Research Data Management Policy (60 minutes)
         • What elements need to be included?
         • What could the institution say about these issues?
         • What would the process be for developing and adopting a data policy?
         • What are the key dependencies?
         • How would you go about it?
      2. Reporting back (20 minutes)
  22. Simon Hodson, Executive Director, CODATA
      • Email:
      • Twitter: @simonhodson99
      • Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59
      • CODATA (ICSU Committee on Data for Science and Technology), 5 rue Auguste Vacquerie, 75016 Paris
      • Thank you for your attention!
  23. The Open Data Iceberg (diagram): A National Infrastructure
      • Layers: Technology; Processes & Organisation; People.
      • Challenges: The Technical Challenge; The Ecosystem Challenge; The Funding Challenge; The Support Challenge; The Skills Challenge; The Incentives Challenge; The Mindset Challenge.
      • Geoffrey Boulton (CODATA) - developed from an idea by Deetjen, U., E. T. Meyer and R. Schroeder.
  24. Where should research data go?
      • Homogeneous data collections essential for research (Earth observation data; genetic data; social science survey data…): national and international data archives.
      • Significant data outputs of publicly funded research (significant data outputs from funded projects; raw and analysed experimental data…): national or institutional data archives; data papers.
      • Data underpinning research publications (raw and analysed data for reproducibility (evidence); data behind the graph…): dedicated data archives (e.g. Dryad).
  25. Boundaries of Open
      • For data created with public funds or where there is a strong demonstrable public interest, Open should be the default.
      • As Open as Possible, as Closed as Necessary.
      • Proportionate exceptions for:
        • Legitimate commercial interests (sectoral variation)
        • Privacy (‘safe data’ vs Open data – the anonymisation problem)
        • Public interest (e.g. endangered species, archaeological sites)
        • Safety, security and dual use (impacts contentious)
      • All these boundaries are fuzzy and need to be understood better!
      • There is a need to evolve policies, practices and ethics around closed, shared, and open data.
  26. Emerging Policy Consensus? FAIR Data
      • FAIR Data (see original guiding principles at
        • Findable: have sufficiently rich metadata and a unique and persistent identifier.
        • Accessible: retrievable by humans and machines through a standard protocol; open and free; authentication and authorization where necessary.
        • Interoperable: metadata use a ‘formal, accessible, shared, and broadly applicable language for knowledge representation’.
        • Reusable: metadata provide rich and accurate information; clear usage license; detailed provenance.
      • FAIR Data now at the heart of H2020 policy, European Open Science Cloud etc.
      • Under the revised version of the 2017 work programme, the Open Research Data pilot has been extended to cover all the thematic areas of Horizon 2020.
      • Current EC Guidance at oa-data-mgt_en.pdf and infographic_072016.pdf
      • European Commission Expert Group (chaired by Simon Hodson, CODATA; Sarah Jones, DCC, Rapporteur) producing implementation guidelines for FAIR Data for EC Funded Programmes: draft report end 2017, final report March 2018:
  27. FAIR Guiding Principles (1)
      • To be Findable:
        • F1. (meta)data are assigned a globally unique and persistent identifier
        • F2. data are described with rich metadata (defined by R1 below)
        • F3. metadata clearly and explicitly include the identifier of the data it describes
        • F4. (meta)data are registered or indexed in a searchable resource
      • To be Accessible:
        • A1. (meta)data are retrievable by their identifier using a standardized communications protocol
          • A1.1 the protocol is open, free, and universally implementable
          • A1.2 the protocol allows for an authentication and authorization procedure, where necessary
        • A2. metadata are accessible, even when the data are no longer available
      (Mons, B., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data,
  28. FAIR Guiding Principles (2)
      • To be Interoperable:
        • I1. (meta)data use a formal, accessible, shared, and broadly applicable language for knowledge representation.
        • I2. (meta)data use vocabularies that follow FAIR principles
        • I3. (meta)data include qualified references to other (meta)data
      • To be Reusable:
        • R1. (meta)data are richly described with a plurality of accurate and relevant attributes
          • R1.1. (meta)data are released with a clear and accessible data usage license
          • R1.2. (meta)data are associated with detailed provenance
          • R1.3. (meta)data meet domain-relevant community standards
      (Mons, B., et al., The FAIR Guiding Principles for scientific data management and stewardship, Scientific Data,
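
As a rough illustration of how a repository might screen records against some of the principles above, the following Python sketch checks a metadata record for a handful of them (F1, F2, F4, A1, R1.1, R1.2). It is only a sketch under stated assumptions: the `MetadataRecord` fields, the identifier prefixes, the ten-word threshold for a "rich" description and the example DOI and URLs are all hypothetical and do not come from the FAIR principles or from any particular repository.

```python
from dataclasses import dataclass, field


@dataclass
class MetadataRecord:
    """Hypothetical, minimal metadata record for one dataset."""
    identifier: str = ""      # e.g. a DOI or Handle
    description: str = ""     # free-text description of the data
    access_url: str = ""      # where the (meta)data can be retrieved
    license: str = ""         # usage license label
    provenance: str = ""      # how the data were produced
    indexed_in: list = field(default_factory=list)  # catalogues listing the record


# Assumption: these prefixes stand in for "globally unique and persistent".
PERSISTENT_PREFIXES = ("https://doi.org/", "doi:", "hdl:")
# Assumption: HTTP(S) stands in for an open, standardized protocol.
OPEN_PROTOCOLS = ("https://", "http://")


def fair_gaps(record: MetadataRecord) -> list:
    """Return the labels of the simplified checks that the record fails."""
    checks = [
        ("F1 persistent identifier", record.identifier.startswith(PERSISTENT_PREFIXES)),
        # Assumption: ten or more words is a crude proxy for "rich" metadata.
        ("F2 rich description", len(record.description.split()) >= 10),
        ("F4 indexed in a searchable resource", bool(record.indexed_in)),
        ("A1 standard protocol", record.access_url.startswith(OPEN_PROTOCOLS)),
        ("R1.1 usage license", bool(record.license.strip())),
        ("R1.2 provenance", bool(record.provenance.strip())),
    ]
    return [label for label, passed in checks if not passed]


good = MetadataRecord(
    identifier="https://doi.org/10.1234/example",  # made-up DOI
    description="Daily rainfall totals from ten hypothetical weather stations, 2010 to 2015.",
    access_url="https://repository.example.org/datasets/42",  # made-up URL
    license="CC-BY-4.0",
    provenance="Collected by field sensors; cleaned with a scripted pipeline.",
    indexed_in=["example-catalogue"],
)
poor = MetadataRecord(description="rainfall data")

print(fair_gaps(good))  # no gaps reported for the complete record
print(fair_gaps(poor))  # every simplified check fails for the sparse record
```

A check like this only flags missing fields. Whether an identifier is really persistent, or whether the metadata meet domain-relevant community standards (R1.3), still needs human or community judgement.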