
Open data pilot

Talk given at the FOT-NET data workshop in Brussels


  1. 1. The Horizon 2020 Open Data Pilot Sarah Jones Digital Curation Centre, Glasgow sarah.jones@glasgow.ac.uk Twitter: @sjDCC Fot-Net Data Stakeholder Meeting on Open Data and Data Re-use in Horizon 2020, 10th March 2015, ERTICO, Brussels Funded by:
  2. 2. What is the Digital Curation Centre? “a centre of expertise in digital information curation with a focus on building capacity, capability and skills for research data management across the UK's higher education research community” www.dcc.ac.uk
  3. 3. Benefits and drivers WHY SHARE DATA (OPENLY)? Image CC-BY-NC-SA by Wonderwebby www.flickr.com/photos/wonderwebby/2723279491
  4. 4. It’s part of good research practice
  5. 5. Science as an open enterprise https://royalsociety.org/policy/projects/science-public-enterprise/Report “Much of the remarkable growth of scientific understanding in recent centuries is due to open practices; open communication and deliberation sit at the heart of scientific practice.” The Royal Society report calls for ‘intelligent openness’ whereby data are accessible, intelligible, assessable and usable.
  6. 6. Faster scientific breakthroughs www.nytimes.com/2010/08/13/health/research/13alzheimer.html?pagewanted=all&_r=0 “It was unbelievable. It’s not science the way most of us have practiced in our careers. But we all realised that we would never get biomarkers unless all of us parked our egos and intellectual property noses outside the door and agreed that all of our data would be public immediately.” Dr John Trojanowski, University of Pennsylvania
  7. 7. Increased use and economic benefit UP TO 2008: Sold through the US Geological Survey for US$600 per scene. Sales of 19,000 scenes per year. Annual revenue of $11.4 million. SINCE 2009: Freely available over the internet. Google Earth now uses the images. Transmission of 2,100,000 scenes per year. Estimated to have created value for the environmental management industry of $935 million, with direct benefit of more than $100 million per year to the US economy. Has stimulated the development of applications from a large number of companies worldwide. The case of NASA Landsat satellite imagery of the Earth’s surface: http://earthobservatory.nasa.gov/IOTD/view.php?id=83394&src=ve
  8. 8. HORIZON 2020 OPEN DATA PILOT Image CC-BY-NC-SA by Tom Magllery www.flickr.com/photos/lwr/13442910354
  9. 9. Why open access and open data? “The European Commission’s vision is that information already paid for by the public purse should not be paid for again each time it is accessed or used, and that it should benefit European companies and citizens to the full.” http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-pilot-guide_en.pdf
  10. 10. H2020 open data pilot • Seven areas are participating in the pilot, which correspond to about €3 billion or 20% of the overall Horizon 2020 budget in 2014 and 2015. • Projects in other areas can opt in on a voluntary basis • Participants can opt out at proposal stage or during the lifetime of the project • Reasons for exemption to be explained in the DMP Guidelines on Data Management in Horizon 2020: http://ec.europa.eu/research/participants/data/ref/h2020/grants_manual/hi/oa_pilot/h2020-hi-oa-data-mgt_en.pdf
  11. 11. Which data does the pilot apply to? Data, including associated metadata, needed to validate the results in scientific publications Other curated and/or raw data, including associated metadata, as specified in the DMP Doesn’t apply to all data (researchers to define as appropriate) Don’t have to share data if inappropriate – exemptions apply
  12. 12. Key requirements of the open data pilot 1. Deposit in a research data repository 2. Make it possible for third parties to access, mine, exploit, reproduce and disseminate data – free of charge for any user 3. Provide information on the tools and instruments needed to validate the results (or better still provide the tools) Image CC-BY-NC-SA by adesigna www.flickr.com/photos/adesigna/4090782772
  13. 13. Data Management Plans Projects participating in the pilot will be required to develop a Data Management Plan (DMP), in which they will specify what data will be open. • What types of data will the project generate/collect? • What standards will be used? • How will this data be shared/made available? If not, why? • How will this data be curated and preserved? Note that the Commission does NOT require applicants to submit a DMP at the proposal stage. DMPs are a deliverable for those participating in the pilot.
  14. 14. Good practice, tools, infrastructure & services SUPPORT FOR IMPLEMENTATION
  15. 15. Data sharing: degrees of openness. Open: content that can be freely used, modified and shared by anyone for any purpose. Restricted: limits on who can use the data, how or for what purpose (charges for use, data sharing agreements, restrictive licences, peer-to-peer exchange, …). Closed: unable to share, or under embargo. Five star open data (http://5stardata.info): • online under an open licence • structured data • non-proprietary formats • use URIs to denote things • link data to provide context
  16. 16. How to make data open? 1. Choose your dataset(s) What can you make open? You may need to revisit this step if you encounter problems later. 2. Apply an open licence Determine what IP exists. Apply a suitable licence e.g. CC-BY or CC0 3. Make the data available Provide the data in a suitable format. Use repositories. 4. Make it discoverable Post on the web, register in catalogues… https://okfn.org
  17. 17. www.dcc.ac.uk/resources/how-guides/license-research-data Data licensing This DCC how-to guide outlines pros and cons of each approach and gives practical advice on how to implement your licence. • Do you own the rights or have permission to redistribute? • Do you need to place restrictions on who can use the data or how?
  18. 18. EUDAT licensing wizard http://ufal.github.io/lindat-license-selector Search / browse through a list of possible licences Or answer questions to determine which is most suitable
  19. 19. Metadata standards • Good metadata is key for research data access and re-use • Many disciplines have formalised community metadata standards • Use relevant standards for interoperability www.dcc.ac.uk/resources/metadata-standards
  20. 20. Data catalogues Institutional services e.g. DataFinder at the University of Oxford National services e.g. Research Data Australia and RDDS pilot in the UK Data centres and community initiatives e.g. FOT Data Catalogue, B2FIND etc
  21. 21. Joining up data catalogues
  22. 22. Data repositories http://databib.org http://service.re3data.org/search Zenodo • Joint effort by OpenAIRE-CERN • Multidisciplinary repository • Multiple data types – Publications – Long tail of research data • Citable data (DOI) • Links funding, publications, data & software www.zenodo.org • Does your publisher or funder suggest a repository? • Are there data centres or community databases for your field? • Does your university offer support for long-term preservation?
  23. 23. EUDAT services EUDAT offers a pan-European solution, providing a generic set of services to ensure a minimum level of interoperability Building common data services in close collaboration with 25+ communities www.eudat.eu
  24. 24. EUDAT B2 service suite Covering both access and deposit, from informal data sharing to long-term archiving, and addressing identification, discoverability and computability of both long-tail and big data, EUDAT’s services will address the full lifecycle of research data
  25. 25. Institutional RDM support services Diagram courtesy of Sally Rumsey, University of Oxford University of Edinburgh Research Data Management Roadmap www.ed.ac.uk/schools-departments/information-services/about/strategy-planning/rdm-roadmap Research Data Oxford http://researchdata.ox.ac.uk
  26. 26. Support on Data Management Plans • Checklist on what to include • How-to guide on developing a plan • Guidance on assessing plans (forthcoming) • Webinars and training materials • DMPonline tool • Example DMPs www.dcc.ac.uk/resources/data-management-plans
  27. 27. DMPonline • Presents requirements from funders • Guidance from funder, uni, discipline… • Example answers • Ability to share plans with collaborators • Export into a variety of formats • … https://dmponline.dcc.ac.uk
  28. 28. Thanks for listening DCC guidance, tools & case studies: www.dcc.ac.uk/resources Follow us on twitter: @digitalcuration and #ukdcc
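
Slide 13 lists four questions a Data Management Plan should answer. As a rough illustration only, the sketch below models those questions as fields and reports which are still blank. The class name, field names and example answers are hypothetical; they are not part of the Commission's template or of DMPonline.

```python
from dataclasses import dataclass, fields


@dataclass
class DataManagementPlan:
    """Hypothetical skeleton: one field per question on slide 13."""
    data_types: str = ""      # What types of data will the project generate/collect?
    standards: str = ""       # What standards will be used?
    sharing: str = ""         # How will the data be shared? If not, why?
    preservation: str = ""    # How will the data be curated and preserved?


def unanswered(plan: DataManagementPlan) -> list[str]:
    """Return the names of questions that are still blank, in slide order."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name).strip()]


# Made-up example answers, for illustration only.
draft = DataManagementPlan(
    data_types="Vehicle sensor logs (CSV)",
    sharing="Deposited in a research data repository",
)
print(unanswered(draft))
```

A check like this only flags empty answers; whether an answer is adequate still needs a human reviewer.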
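
The five star open data criteria on slide 15 can be read as a cumulative scale, where each star assumes the earlier ones are already met. The sketch below encodes that reading; the `Dataset` class and its field names are hypothetical, chosen only to mirror the five bullet points.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    """Hypothetical flags, one per criterion on slide 15, in order."""
    online_open_licence: bool
    structured: bool
    non_proprietary_format: bool
    uses_uris: bool
    linked_to_other_data: bool


def star_rating(ds: Dataset) -> int:
    """Count criteria met in order, stopping at the first one that is not met."""
    criteria = [
        ds.online_open_licence,
        ds.structured,
        ds.non_proprietary_format,
        ds.uses_uris,
        ds.linked_to_other_data,
    ]
    stars = 0
    for met in criteria:
        if not met:
            break
        stars += 1
    return stars


# An openly licensed, structured file in a non-proprietary format, without URIs or links.
print(star_rating(Dataset(True, True, True, False, False)))
```

Under this cumulative reading, a dataset that is not available online under an open licence scores zero stars however well structured it is.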