
Open Science Globally: Some Developments/Dr Simon Hodson


Presented during the SA-EU Dialogue on Open Science on 15 & 16 May 2018.

Published in: Data & Analytics

  1. 1. Open Science Globally: Some Developments Simon Hodson, Executive Director, CODATA SA-EU Open Science Dialogue Workshop Centurion Lake Hotel 15 May 2018
  2. 2. Why Open Science / FAIR Data? • Good scientific practice depends on communicating the evidence. • Open research data are essential for reproducibility, self-correction. • Academic publishing has not kept up with age of digital data. • Danger of a replication / evidence / credibility gap. • Boulton: to fail to communicate the data that supports scientific assertions is malpractice • Open data practices have transformed certain areas of research. • Genomics and related biomedical sciences; crystallography; astronomy; areas of earth systems science; various disciplines using remote sensing data… • FAIR data helps use of data at scale, by machines, harnessing technological potential. • Research data often have considerable potential for reuse, reinterpretation, use in different studies. • Open data foster innovation and accelerate scientific discovery through reuse of data within and outside the academic system. • Research data produced by publicly funded research are a public asset.
  3. 3. Policy Push for Open Research Data • Bits of Power: Issues in Global Access to Scientific Data (1997) • The three Bs (Budapest, Berlin and Bethesda) and Open Access, 2002-3 • OECD Principles and Guidelines on Access to Research Data, 2004, 2007 • UK Funder Data Policies, from 2001, but accelerates from 2009 • NSF Data Management Plan Requirements, 2010 • Royal Society Report ‘Science as an Open Enterprise’, 2012 • OSTP Memo ‘Increasing Access to the Results of Federally Funded Scientific Research’, Feb 2013 • G8 Science Ministers Statement, June 2013 • G8 Open Data Charter and Technical Appendix, June 2013 • EC H2020 Open Data Policy Pilot, 2014; Adoption of FAIR Data Principles, 2017. • Science International Accord on Open Data in a Big Data World, Dec 2015:
  4. 4. Developments: Donor Policies • Bill and Melinda Gates Foundation, Open Access and Open Data Policy • ‘Data Underlying Published Research Results Will Be Accessible and Open Immediately. The foundation will require that data underlying the published research results be immediately accessible and open. This too is subject to the transition period and a 12-month embargo may be applied.’ • MSF Data Sharing Policy: • ‘MSF recognizes the ethical imperative it has to share its data openly, transparently and in a timely manner for the greater public health good.’ • Appropriate restrictions for consent, privacy, etc. • European Commission Data Policy: ‘as open as possible, as closed as necessary’, FAIR Data • Wellcome Trust: strong support for Open Data sharing, with appropriate restrictions.
  5. 5. Developments: Journal Policies • Dryad Joint Data Archiving Policy, Feb 2010: • This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity. • PLOS Data Availability Policy, revised Feb 2014: • PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exceptions. • Springer Nature initiative to standardise policies: • RDA Interest Group developing standardised journal data policies.
  6. 6.
  7. 7. Open Science, FAIR Data: Commons, Clouds, Platforms… • Commons: ‘collectively owned and managed by a community of users’ • Clouds: European Open Science Cloud (not just European, not entirely Open, not just for science and not exclusively cloud technology)… • Platform Approaches: • brokerage for discovery and access, reinforced by the development of common standards and principles or policies (e.g. GEOSS, Research Data Australia); • brokerage of services: approaches for discovery and access, augmented by the provision of services for particular research disciplines, including the promotion of skills, training, competences, standards, tools for analysis etc (e.g. Elixir, CESSDA and other ESFRIs, CGIAR on a global scale); • platform environment: utilizing the capacity of Cloud Computing for efficiency, access management, analysis across vast numbers of datasets, marketisation of services in a platform economy in which standards and common rules minimize vendor lock-in (e.g. NIH Data Commons, European Open Science Cloud).
  8. 8. EOSC Declaration • Data culture, Open access by default, Skills, Data stewardship, Rewards and incentives, FAIR principles, Standards, FAIR Data governance, Implementation and transition to FAIR, Research data repositories, Accreditation / certification, Data Management Plans, Technical implementation, Citation system, Common catalogues, Semantic layer, FAIR tools and services, Data expert organisations, Legal aspects, EOSC architecture, Implementation, Legacy, User needs, Service provision, Service deployment, Thematic areas, Research infrastructures, EU-added value and coordination, HPC and the EOSC, Innovation, Governance model, Governance framework, Executive board, Coordination structure, Long-term sustainability, Funding, Global aspects. • [FAIR principles] Implementation of the FAIR principles must be pragmatic and technology-neutral, encompassing all four dimensions: findability, accessibility, interoperability and reusability. FAIR principles are neither standards nor practices. The disciplinary sectors must develop their specific notions of FAIR data in a coordinated fashion and determine the desired level of FAIR-ness. FAIR principles should apply not only to research data but also to data-related algorithms, tools, workflows, protocols, services and other kinds of digital research objects.
  9. 9. Emerging Policy Consensus? FAIR Data • FAIR Data (see original guiding principles at • Findable: have sufficiently rich metadata and a unique and persistent identifier. • Accessible: retrievable by humans and machines through a standard protocol; open and free by default; authentication and authorization where necessary. • Interoperable: metadata use a ‘formal, accessible, shared, and broadly applicable language for knowledge representation’. • Reusable: metadata provide rich and accurate information; clear usage license; detailed provenance.
  10. 10. European Commission Expert Group on FAIR Data Core Deliverables 1. To develop recommendations on what needs to be done to turn each component of the FAIR data principles into reality 2. To propose indicators to measure progress on each of the FAIR components 3. Actively support the creation of the FAIR Data Action Plan, by proposing a list of concrete actions as part of its Final Report 4. Draft for consultation, released 11 June 2018, final report October 2018. 5. Support Commission in presentation of FAIR Data Action Plan in Autumn 2018. Report Structure 1. Concepts: Why FAIR? 2. Creating a culture of FAIR data 3. Making FAIR data a reality: technical perspective 4. Skills and capacities for FAIR data 5. Measuring Change 6. Facilitating Change: a FAIR Data Action Plan
  11. 11. International ‘ecosystem’ of open science and FAIR data components • Open Science infrastructure is not just the network, storage and compute. • Ecosystem of components which are created and governed internationally. • Reporting Research Outputs: information systems for research output reporting (CRIS), metadata standards e.g. CERIF, managed by euroCRIS. • Persistent and Unique Identifiers: DOIs for articles (CrossRef); DOIs for data sets (DataCite); author IDs (ORCID). • Data and Metadata Standards: CIF in crystallography, FITS in astronomy, DDI in social science surveys, Darwin Core in biodiversity, etc. • DCC Registry of Metadata Standards; now maintained by RDA IG • Data Repositories: listed in Re3Data, registry of data repositories: • Trusted Data Repositories: Core Trust Seal, a merger of Data Seal of Approval and the World Data System criteria. • Criteria for Trustworthy Digital Archives (DIN 31644) • Audit and certification of trustworthy digital repositories (ISO 16363)
  12. 12. Components of a FAIR ecosystem
  13. 13. African Open Science Platform • Vision of a coordinating activity to help put in place and link the enabling practices, capacities and technologies for Open Science. • Pan African in ambition. • Current three-year pilot funded by Department of Science and Technology via National Research Foundation; delivered by ASSAf, directed by CODATA. • Ambitious programme of engagement with a number of African countries, key stakeholders (universities, academies, NRENs, emerging research infrastructures). • Preparing the foundations for a broader initiative in terms of outputs and network building. • Successful first strategy workshop (March 2018) followed by a stakeholder workshop (Sept 2018) to prepare the platform initiative. • Aim for this to be launched at Science Forum South Africa, Dec 2018.
  14. 14. Framework for Policies, Incentives, Training and Technical Infrastructures • Key deliverables of the pilot project will be foundations for the platform in these four key areas: 1. Frameworks and guidance to assist policy development at national and institutional level. 2. Study and recommendations to reduce barriers and provide constructive incentives for Open Science. 3. Framework for data science training (including RDM, data stewardship and science of data); curriculum framework, training materials, recommendations for training initiatives. 4. Framework and roadmap for data infrastructure development: emphasising partnerships and de-duplication between national systems, economies of scale, institutions and domain initiatives.
  15. 15. African Open Science Platform: Suggested Phase Two Activities 1. Registry of African data initiatives, collections and services 2. Coordination and provision of network, compute and storage (building on current work of NRENs, targeting needs of Open Science, achieving economies of scale). 3. A virtual space for scientists to find, deposit, manage, share and reuse data, software and metadata (i.e. support for / or provision of FAIR data components, data stewardship and Research Infrastructures). 4. An African Data Science Institute (to develop African capacities at the international cutting edge of research in data analytics, artificial intelligence, machine learning and data stewardship). 5. Major data-intensive programmes in science areas where Africa is data-asset rich (process for identifying these areas, obtaining funding, ensuring that RIs are in place). 6. Network for Education and Skills in Data and Information (training programmes in data science, data stewardship, data literacy, targeted at all stages of education). 7. Network for Open Science Access and Dialogue (building full engagement and joint action in transdisciplinary and citizen science initiatives as an essential component of Open Science).
  16. 16. INTERNATIONAL DATA WEEK IDW 2018 Gaborone, Botswana: 5-8 November 2018 Information: Deadline for abstracts, 31 May:
  17. 17. International Data Week Keynotes • Joy Phumaphi, former Minister of Health, Botswana; co-chair of WHO Group on Family and Community Health. • Rob Adam, Director of SKA South Africa, a major African science and data initiative. • Ismail Serageldin, founding Director of the new Bibliotheca Alexandrina, noted thinker on science policy issues. • Elizabeth Marincola, former CEO of PLOS; now leading the African Academy of Sciences publication initiatives (see AAS Open Research). • Tshilidzi Marwala, VC of University of Johannesburg, noted thinker in Big Data and AI.
  18. 18. CODATA-RDA School of Research Data Science • Annual foundational school at ICTP, Trieste (with the objective to build a network of partners, train-the-trainers). • Advanced workshops, ICTP, Trieste, following the foundational school. • National or regional schools, organised with local partners. 2018 • Next #DataTrieste Summer School, 6-17 August 2018. • Next #DataTrieste Advanced Workshops 20-24 August 2018. • Call for applications, deadline 21 May: • Schools in Brisbane (UQ and Australian Academy of Science); ICTP Kigali (October); ICTP São Paulo (December)
  19. 19. Simon Hodson Executive Director CODATA Email: Twitter: @simonhodson99 Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59 CODATA (ICSU Committee on Data for Science and Technology), 5 rue Auguste Vacquerie, 75016 Paris, Thank you for your attention!
  20. 20. CODATA Prospectus: Principles, Policies and Practice Capacity Building Frontiers of Data Science Data Science Journal CODATA 2017, Saint Petersburg 8-13 Oct 2017
  21. 21. SciDataCon part of International Data Week • SciDataCon aims to help this community ensure that it has a concrete scientific record of its work: peer reviewed abstracts > presentations > Special Collection in the Data Science Journal. • Themes and Scope: see • Approved Sessions: • Incredibly rich range of topics. If you do not find a topic there you can submit an abstract to the general submissions. • Abstracts can be submitted to Approved Sessions or to General Submissions. Will be peer reviewed and distributed into the programme. • Abstracts for presentations and lightning talks/posters. • Deadline is 31 May:
  22. 22. CODATA-RDA School of Research Data Science • Contemporary research – particularly when addressing the most significant, interdisciplinary research challenges – increasingly depends on a range of skills relating to data. • These skills include the principles and practice of Open Science; research data management and curation, how to prepare a data management plan and to annotate data; software and data carpentry; principles and practices of visualisation; data analysis, statistics and machine learning; use of computational infrastructures. The ensemble of these skills, relating to data in research, can usefully be called ‘Research Data Science’.
  23. 23. DataTrieste Film on Vimeo: Call for applications, deadline 21 May:
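
Illustrative sketch (not part of the original slides): the four FAIR criteria summarised on slide 9 can be read as a simple checklist over a dataset's metadata record. The Python below is a minimal sketch under stated assumptions: the `DatasetRecord` class, its field names, the `STANDARD_PROTOCOLS` set and the example values are all hypothetical, chosen only to mirror the wording of the slide. It is not an official FAIR metric or any repository's API, and real assessments are considerably richer.

```python
from dataclasses import dataclass, field

# Assumption for this sketch: a small set of protocols treated as "standard".
STANDARD_PROTOCOLS = {"https", "http", "ftp"}


@dataclass
class DatasetRecord:
    """Hypothetical metadata record; field names are illustrative only."""
    identifier: str = ""            # persistent identifier, e.g. a DOI
    metadata: dict = field(default_factory=dict)  # descriptive metadata
    access_protocol: str = ""       # how the data can be retrieved
    metadata_vocabulary: str = ""   # shared vocabulary or standard used
    license: str = ""               # usage license
    provenance: str = ""            # how the data were produced


def fair_checklist(record: DatasetRecord) -> dict:
    """Return a yes/no answer for each of the four FAIR dimensions."""
    return {
        # Findable: rich metadata plus a unique, persistent identifier.
        "findable": bool(record.identifier) and bool(record.metadata),
        # Accessible: retrievable through a standard protocol.
        "accessible": record.access_protocol.lower() in STANDARD_PROTOCOLS,
        # Interoperable: metadata expressed in a shared, formal vocabulary.
        "interoperable": bool(record.metadata_vocabulary),
        # Reusable: clear usage license and detailed provenance.
        "reusable": bool(record.license) and bool(record.provenance),
    }


# Example record with placeholder values (all hypothetical).
example = DatasetRecord(
    identifier="doi:10.0000/example",
    metadata={"title": "Example survey data", "creator": "A. Researcher"},
    access_protocol="https",
    metadata_vocabulary="DDI",
    license="CC-BY-4.0",
    provenance="Collected by questionnaire; cleaned with documented scripts.",
)
result = fair_checklist(example)
empty_result = fair_checklist(DatasetRecord())
print(result)
```

A yes/no checklist like this only flags missing elements; as the EOSC Declaration text on slide 8 notes, each discipline is expected to define its own notion and desired level of FAIR-ness.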