Open Science - Global Perspectives/Simon Hodson

Presented during the SA-EU Science Workshop

Published in: Science

  1. 1. Open Science: Global Perspectives
     Simon Hodson, Executive Director, CODATA, www.codata.org
     SA-EU Open Science Dialogue Workshop
     Birchwood Hotel & OR Tambo Conference Centre, Johannesburg, South Africa
     30 November 2017
  2. 2. Why Open Science / FAIR Data?
     • Good scientific practice depends on communicating the evidence.
     • Open research data are essential for reproducibility and self-correction.
     • Academic publishing has not kept up with the age of digital data.
     • Danger of a replication / evidence / credibility gap.
     • Boulton: to fail to communicate the data that supports scientific assertions is malpractice.
     • Open data practices have transformed certain areas of research.
       • Genomics and related biomedical sciences; crystallography; astronomy; areas of earth systems science; various disciplines using remote sensing data…
     • FAIR data helps use of data at scale, by machines, harnessing technological potential.
     • Research data often have considerable potential for reuse, reinterpretation and use in different studies.
     • Open data foster innovation and accelerate scientific discovery through reuse of data within and outside the academic system.
     • Research data produced by publicly funded research are a public asset.
  3. 3. Policy Push for Open Research Data
     • The three Bs (Budapest, Berlin and Bethesda) and Open Access, 2002-3
     • OECD Principles and Guidelines on Access to Research Data, 2004, 2007
     • UK Funder Data Policies, from 2001, accelerating from 2009
     • NSF Data Management Plan Requirements, 2010
     • Royal Society Report ‘Science as an Open Enterprise’, 2012
     • OSTP Memo ‘Increasing Access to the Results of Federally Funded Scientific Research’, Feb 2013
     • G8 Science Ministers Statement, June 2013
     • G8 Open Data Charter and Technical Appendix, June 2013
     • EC H2020 Open Data Policy Pilot, 2014; Adoption of FAIR Data Principles, 2017
     • Science International Accord on Open Data in a Big Data World, Dec 2015: http://bit.ly/opendata-bigdata
  4. 4. Developments: Journal Data Policies
     • Dryad Joint Data Archiving Policy, Feb 2010: http://datadryad.org/jdap
       • This journal requires, as a condition for publication, that data supporting the results in the paper should be archived in an appropriate public archive, such as GenBank, TreeBASE, Dryad, or the Knowledge Network for Biocomplexity.
     • PLOS Data Availability Policy, revised Feb 2014: http://www.plosone.org/static/policies.action#sharing
       • PLOS journals require authors to make all data underlying the findings described in their manuscript fully available without restriction, with rare exceptions.
     • Springer Nature initiative to standardise policies: http://www.springernature.com/gp/group/data-policy/policy-types
     • FAIRsharing: https://fairsharing.org
     • RDA Interest Group to encourage development and adoption of journal data policies.
     • AGU COPDESS activity to promote greater data availability in earth system sciences.
  5. 5. Resources: Current Best Practice for Research Data Management Policies
     • Expert report commissioned by CODATA member.
     • Provides comprehensive summary of best practice in funder data policies.
     • Identifies key elements to be addressed:
       1. Summary of policy drivers
       2. Intelligent openness
       3. Limits of openness
       4. Definition of research data
       5. Define data in scope
       6. Criteria for selection
       7. Summary of responsibilities
       8. Infrastructure and costs
       9. DMP requirements
       10. Enabling discovery and reuse
       11. Recognition and reward
       12. Reporting requirements, compliance monitoring
     • Zenodo: http://dx.doi.org/10.5281/zenodo.27872
  6. 6. CODATA Data Policy Activities
     • New Data Policy Committee, chaired by Paul Uhlir, international expert in Data Policies and member of CODATA Executive Committee.
       • Current Best Practice for Research Data Management Policies: http://dx.doi.org/10.5281/zenodo.27872
       • The Value of Open Data Sharing, report for GEO: http://dx.doi.org/10.5281/zenodo.33830
       • Legal Interoperability, Principles and Implementation Guidelines: https://doi.org/10.5281/zenodo.162241
     • FAIR Data
       • Simon Hodson is chairing the European Commission’s Expert Group on FAIR Data: http://bit.ly/FAIR_Data_Expert_Group
     • OECD Global Science Forum and CODATA Project on Business Models for Sustainable Data Repositories: http://www.codata.org/working-groups/oecd-gsf-sustainable-business-models
  7. 7. The Case for Open Data in a Big Data World
     • Science International Accord on Open Data in a Big Data World: http://www.science-international.org/
     • Supported by four major international science organisations.
     • Takes a global approach: the data revolution and Open Science are phenomena with global ramifications. Repeatedly stresses the opportunities for LMICs and the negative consequences of being left outside a system of data-intensive research.
     • Presents a powerful case that the profound transformations mean that data should be:
       • Open by default
       • Intelligently open, FAIR data
     • Lays out a framework of principles, responsibilities and enabling practices for how the vision of Open Data in a Big Data World can be achieved.
     • Campaign for endorsements: over 150 organisations so far.
     • Please consider endorsing the Accord: http://www.science-international.org/#endorse
  8. 8. The “Science International” Accord: principles of open data (www.icsu.org/science-international)
     Responsibilities
       1-2. Scientists
       3. Research institutions & universities
       4. Publishers
       5. Funding agencies
       6. Scholarly societies and academies
       7. Libraries & repositories
       8. Boundaries of openness
     Enabling practices
       9. Citation and provenance
       10. Interoperability
       11. Non-restrictive re-use
       12. Linkability
  9. 9. Framework for Regional, National and Institutional Data Strategies
     • National / Institutional Open Science and FAIR Data Strategy
       • Consultative forum, stakeholder engagement.
       • Open data policies and guidance at national and institutional level.
       • Clarify the boundaries of open (particularly privacy, IPR).
       • Clarify the data in scope, guidelines on selection.
     • Develop incentives and reward systems.
       • Mechanisms (infrastructure and policy) to ensure concurrent publication of data as research output.
       • Data ‘publication’ and citations of data included in assessment of research contribution.
     • Promotion of data skills:
       • Essential data skills for researchers.
       • Develop skills and competencies for data stewards, data scientists.
  10. 10. Framework for Regional, National and Institutional Data Strategies
     • Scope, roadmap and implement data infrastructure.
       • Key components of national and regional infrastructure (network / NREN, economies of scale for storage and compute).
       • Development of regional, national and institutional infrastructure(s) for research collaboration and data stewardship/RDM, generic research platforms/environments, trusted digital repositories.
       • Collaborative infrastructures for certain research disciplines, nationally and regionally, to pool expertise and lower costs.
       • International infrastructure / data ecosystem components: permanent identifiers, metadata standards.
  11. 11. African Open Science Platform Pilot Project Workpackages
     • Establish African Open Data Forum / Platform
     • Funded Research Data Infrastructure Initiatives
     • Funded, co-designed transdisciplinary research projects
     • Co-design African Open Data Policies
     • Develop Incentives Frameworks
     • Develop Research Data Science Training
     • African Research Data Infrastructure Roadmap
     Legend: some activities require low funding for coordination, secondment, contributions in kind and evaluation; others require higher investment for coordination, co-design, implementation and evaluation.
  12. 12. International ‘ecosystem’ of open science components
     • Open Science infrastructure is not just the network, storage and compute.
     • Ecosystem of components which are created and governed internationally.
     • Reporting Research Outputs: information systems for research output reporting (CRIS), metadata standards e.g. CERIF, managed by euroCRIS.
     • Persistent and Unique Identifiers: DOIs for articles (CrossRef); DOIs for data sets (DataCite); author IDs (ORCID).
     • Data and Metadata Standards: CIF in crystallography, FITS in astronomy, DDI in social science surveys, Darwin Core in biodiversity, etc.
       • DCC Registry of Metadata Standards http://www.dcc.ac.uk/resources/metadata-standards ; now maintained by RDA IG http://rd-alliance.github.io/metadata-directory/
     • Data Repositories: listed in Re3Data, registry of data repositories: https://www.re3data.org/
     • Trusted Data Repositories: Core Trust Seal https://www.coretrustseal.org/, a merger of Data Seal of Approval and the World Data System criteria.
       • Criteria for Trustworthy Digital Archives (DIN 31644) http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=3
       • Audit and certification of trustworthy digital repositories (ISO 16363) http://www.data-archive.ac.uk/curate/trusted-digital-repositories/standards-of-trust?index=2
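The slides cite DOIs in several forms (the older `http://dx.doi.org/` resolver prefix and the current `https://doi.org/` one). As a minimal sketch of how persistent identifiers can be handled consistently, the helper below extracts a DOI from a reference string and rewrites it in the `https://doi.org/` form. The function name `normalise_doi` and the regular expression are illustrative assumptions, not part of any CODATA or DataCite tooling, and the pattern is a common heuristic rather than a complete DOI grammar.

```python
import re

# Heuristic pattern for modern DOIs: "10.", a 4-9 digit registrant code,
# a slash, then a suffix without whitespace. This is an assumption for
# illustration, not a full specification of DOI syntax.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def normalise_doi(reference: str) -> str:
    """Return the https://doi.org/ form of the first DOI found in `reference`."""
    match = DOI_PATTERN.search(reference)
    if match is None:
        raise ValueError(f"no DOI found in {reference!r}")
    # Drop sentence punctuation that may trail a DOI in running text.
    doi = match.group(0).rstrip(".,;)")
    return "https://doi.org/" + doi


if __name__ == "__main__":
    # References taken from the slides above.
    for ref in (
        "http://dx.doi.org/10.5281/zenodo.27872",
        "https://doi.org/10.5281/zenodo.162241",
    ):
        print(normalise_doi(ref))
```

Both resolver prefixes point at the same identifier, so storing only the normalised form avoids duplicate records for one dataset or report.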
  13. 13. Global Registry of Data Repositories
     [Figure: country coverage in Re3Data.org (registry of data repositories).]
  14. 14. Data Seal of Approval
     [Figure: location of repositories that have acquired the Data Seal of Approval.]
  15. 15. CODATA 2017 Session: ‘World Tour of Open Data and Open Science’
     • Presentations from USA, Finland, Japan, China, Australia, Canada, Israel, South Africa, Kenya.
     • Presentations covered: National Policies, Key Policy Players, Most Significant Projects, Barriers.
     • Intention is for the session to be written up as a series of surveys.
  16. 16. Australia: Policies
     • Australian Research Council (ARC)
       • Encourages researchers to deposit data arising from research in publicly accessible repositories.
       • Since 2014, requires researchers to provide a DMP as part of the application process.
     • Australian Code for the Responsible Conduct of Research (2007) includes the proper management of research data.
       • “Research data should be made available for use by the other researchers unless this is prevented by ethical, privacy or confidentiality matters.”
       • References the OECD Principles and Guidelines for Access to Research Data from Public Funding (2007).
     • National Health and Medical Research Council (NHMRC)
       • Acknowledges the importance of making data publicly accessible.
       • Encourages data sharing and providing access to data and other research outputs arising from NHMRC supported research.
     • 2016 National Research Infrastructure Roadmap
       • Recommends Australian Research Data Cloud.
     Slide Credit: Jane Hunter
  17. 17. Australia: Key Initiatives
     • ANDS – Australian National Data Service
       • Research Data Australia https://researchdata.ands.org.au/ (currently 133539 records)
       • ANDS has been a major player nationally and internationally (72M AUS$ over 3.5 years from 2009-13)
     • NeCTAR – National eResearch Collaborative Tools and Resources
     • RDS – Research Data Services
     • CSIRO data activities; Data61 (digital and data innovation group).
     • Various Data Initiatives – particularly strong in ecosystem / biodiversity studies
       • TERN, Terrestrial Ecosystem Research Network http://www.tern.org.au/
       • Atlas of Living Australia https://www.ala.org.au/
       • IMOS, Integrated Marine Observing System, a National Collaborative Research Infrastructure
     Slide Credit: Jane Hunter
  18. 18. Canada: Policies
     • Canada does not have an overarching policy per se, but a patchwork of policy initiatives by important players is starting to emerge.
     • “Government and its information must be open by default. Simply put, it is time to shine more light on government to make sure it remains focused on the people it was created to serve – Canadians.” (PM Trudeau)
     • The 3 federal granting councils (NSERC, SSHRC, CIHR) have drafted a policy embracing the FAIR principles for research data and are entering a consultation process. They embraced open publications in 2015.
     • Nine provinces/territories and 55 cities have embraced open data.
     • Federal publications are available as part of the Depository Services Program.
     • Some individual universities are developing data repositories and support services to assist researchers in research data management.
     • Open journal publishing is at an embryonic stage but is strongly supported by academic libraries.
     Slide Credit: Jon Broome and Ernie Boyko
  19. 19. Canada: Policies
     • Canada’s Portage Network works with research libraries and other stakeholders to coordinate expertise, services, and technology in research data management: https://portagenetwork.ca/
     • Federated Research Data Repository: a scalable federated platform for digital research data management and the discovery of Canadian research data https://www.frdr.ca/repo/?locale=en
     • Research Data Canada: a stakeholder-driven and supported organization dedicated to improving the management of research data in Canada.
     • DataCite Canada: Canada's data registration service, provided by the National Research Council.
     • GENOME Canada
     • Canada Astronomy Data Centre: thirty years of open data research.
     • GeoGratis: an early example of open data, when Natural Resources Canada made its mapping data open for use by the public.
     • FACETS is Canada's first multidisciplinary open access science journal, which will have a data section edited by Chuck Humphrey, Canada’s preeminent data scientist.
     Slide Credit: Jon Broome and Ernie Boyko
  20. 20. Indian National Data Sharing and Accessibility Policy
     • Global experience has demonstrated convincingly that access to data leads to breakthroughs in scientific understanding as well as to economic and public good, in addition to several benefits to civil society. Given the deployment of substantial level of investment of public funds in collection of data and the untapped potentials of benefits to social society, it has become important to make available non-sensitive data for legitimate and registered use.
     • National Data Sharing and Accessibility Policy (NDSAP), March 2012: http://www.dst.gov.in/NDSAP.pdf
     • Places emphasis on a negative list of sensitive data types, rather than a positive list of data to be released: i.e. the default is open, unless the data is on the ‘stop’ list.
     • Allows for data to be open, accessible to registered users, or under restricted access.
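The negative-list logic on slide 20 (open by default, unless a data type appears on the ‘stop’ list) can be sketched in a few lines. This is only an illustration under stated assumptions: the data type names, the contents of `NEGATIVE_LIST` and the mapping of listed types to the "registered" or "restricted" levels are hypothetical and are not taken from the NDSAP text.

```python
from enum import Enum


class Access(Enum):
    """The three access levels named on the slide."""
    OPEN = "open"
    REGISTERED = "registered"
    RESTRICTED = "restricted"


# Hypothetical 'stop' list: only sensitive data types are enumerated.
# These entries are invented for illustration.
NEGATIVE_LIST = {
    "household_survey_microdata": Access.REGISTERED,
    "personal_health_records": Access.RESTRICTED,
}


def access_level(data_type: str, negative_list=NEGATIVE_LIST) -> Access:
    """Anything not on the negative list is open by default."""
    return negative_list.get(data_type, Access.OPEN)


if __name__ == "__main__":
    for name in ("rainfall_observations", "personal_health_records"):
        print(name, "->", access_level(name).value)
```

The design point is that new or unanticipated data types need no policy update to be shared: only additions to the sensitive list require a decision, which is the reverse of a positive-list regime.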
  21. 21. Indian National Data Sharing and Accessibility Policy
     • Implementation Guidelines, Feb 2014: http://data.gov.in/sites/default/files/NDSAP_Implementation_Guidelines_2.2.pdf
     • Deal mostly with public data, but research data produced by government institutes or funded by the government is in scope.
     • For data to be reused, it needs to be adequately described and linked to services that disseminate the data to other researchers and stakeholders. The current methods of storing data are as diverse as the disciplines that generate it. It is necessary to develop institutional repositories, data centers on domain and national levels that all methods of storing and sharing have to exist within the specific infrastructure to enable all users to access and use it.
  22. 22. India: Policies and Initiatives
     Policies
     • Government of India National Data Sharing and Accessibility Policy (NDSAP), March 2012
     • Followed by Implementation Guidelines, Feb 2014
     • The level of implementation is very unclear: lack of data infrastructure and cultural barriers.
     Key Initiatives
     • Lots of Open Access initiatives: e.g. in major universities; Librarians’ Digital Library (LDL) (https://drtc.isibang.ac.in); OpenMed@NIC (http://openmed.nic.in/), an Open Access self-archive for medical and allied sciences.
     • Specific initiatives: e.g. National Mission for Manuscripts (http://namami.nic.in/); lots of activities around multilingual computing.
     • Little national infrastructure: activities around specific projects of institutions and portals?
     Slide Credit: Devika Madalli
  23. 23. China: Policies
     • MOST (Ministry of Science and Technology) has taken one year to draft a regulation on open research data; it is estimated that it will be published in 2018.
     • Previously there have been some data sharing policies at programme level, for example NSTIC (National Science and Technology Infrastructure) and the CAS (Chinese Academy of Science) scientific data program.
     • NSFC is pushing open research products, mainly focusing on OA for research papers.
     Slide Credit: Jianhui LI
  24. 24. China: Key Initiatives
     • National Science and Technology Infrastructure, supported by MOST (the Ministry of Science and Technology)
       • From an early stage (from 2001), supported the creation of 13 scientific data centres/sharing platforms covering the fields of agriculture, forestry, seismicity, meteorology, marine science, earth system, population and health, biology, chemistry, materials, energy, etc.
     • CAS Practice-Scientific Data Programme
       • Long-term mission started in 1986, funded by CAS
       • Many institutes involved; long-term, large-scale collaboration; data from research, for research
       • Collecting multi-discipline research data and promoting data sharing
       • More than 350 research databases and 1350 datasets by 61 institutes
       • Over 600TB data available for open access and download
     • CAS Big Data Earth programme
       • The Strategic Priority Research Program (Category A)
       • Almost 1.8 billion YMB for 5 years
       • Biodiversity, ecology system, environment, natural resources, earth science
     Slide Credit: Jianhui LI
  25. 25. RDA and Open Science
     RDA is a unique global platform to synchronize between the national, European, disciplinary and sector stakeholders to support the transition towards Open Science, an Open World and Open Innovation.
     Our open, neutral social platform where international research data experts meet, covering 130 countries and more than 6000 individual members, facilitates the exchange of views and alignment on topics at the heart of Open Science, e.g. social hurdles on data culture, data stewardship and training challenges, data management plans and certification of data repositories, to name just a few of the priorities addressed.
     RDA is building the social and technical bridges that enable open sharing of data to achieve its vision of researchers and innovators openly sharing data across technologies, disciplines, and countries to address the grand challenges of society.
     rd-alliance.org/about-rda | www.rd-alliance.org | @resdatall | CC BY-SA 4.0
  26. 26. CODATA Prospectus: https://doi.org/10.5281/zenodo.165830
     • Principles, Policies and Practice
     • Capacity Building
     • Frontiers of Data Science
     • Data Science Journal
     • CODATA 2017, Saint Petersburg, 8-13 Oct 2017
  27. 27. INTERNATIONAL DATA WEEK IDW 2018 Gaborone, Botswana: 22–26 October 2018
  28. 28. Digital Frontiers of Global Science
     • Frontier issues for research in a global and digital age.
     • Applications, progress and challenges of data intensive research.
     • Data infrastructure and enabling practices for international and collaborative research.
     • Data, development and innovation: data as an interface between research, industry, government, society and development.
  29. 29. Botswana, Africa and the World! Stable, safe, modern and exciting country
  30. 30. Simon Hodson, Executive Director, CODATA
     www.codata.org
     http://lists.codata.org/mailman/listinfo/codata-international_lists.codata.org
     Email: simon@codata.org
     Twitter: @simonhodson99
     Tel (Office): +33 1 45 25 04 96 | Tel (Cell): +33 6 86 30 42 59
     CODATA (ICSU Committee on Data for Science and Technology), 5 rue Auguste Vacquerie, 75016 Paris
     Thank you for your attention!
     Slide Credits: Geoffrey Boulton, Jane Hunter, John Broome, Ernie Boyko, Devika Madalli, LI Jianhui
  31. 31. What Are the Barriers
     [Figure: reasons why respondents are hesitant to share their data, n=7082.]
  32. 32. What Are the Barriers in Australia to Advancing Open Data/Open Science
     • Future funding situation – long term sustainable business models for “Aust Research Data Cloud”
     • Simple, fast, clear, easy-to-use processes & services
     • Data curation is expensive & time consuming – other priorities, no incentives
     • University repositories vs national repositories (RDA) vs discipline repositories?
     • Data licensing & agreements – govt & agency data
  33. 33. What are the key barriers in Canada to advancing OR/OS?
     • The lack of a national data services strategy. There have been previous articulations of a strategy but all that exists is a patchwork.
     • The tenure and promotion procedures used by research institutions
       • Do not recognize data as a research product
       • Competitive nature of obtaining grants
     • Researcher reluctance is causing Tri-Agencies to move slowly on implementing research data policies linking grants to data deposit (even if data are not shared).
     • Incomplete understanding of the issues by senior university administration.
     • Insufficient expertise and financial support to prepare data for reuse.
     • A shortage of suitable data repositories.
  34. 34. DataTrieste Film on Vimeo https://vimeo.com/232209813
  35. 35. Göttingen-CODATA Symposium 18-20 March 2018
     • The critical role of university RDM infrastructure in transforming data to knowledge
     • http://conference.codata.org/2018-Goettingen-RDM/
     • An opportunity to share experiences, research and insights in the development and implementation of RDM services in research institutions.
     • Special collection of Data Science Journal.
     • Themes: services and solutions; strategy; measuring success; skills and support; sustainability; shared services and outsourcing / consortiums; service level, trust and FAIR; champions and engaging with researchers.
     • Announcement of call for papers in the next week or so.
     • Deadline for abstracts: 15 December.