Big Data Europe SC6 WS 3: Ron Dekker, Director CESSDA European Open Science Agenda: where we are and where we are going?

Slides for the keynote talk by Ron Dekker, Director of CESSDA, at the third Big Data Europe workshop, held on 11 September 2017 in Amsterdam and co-located with the SEMANTiCS2017 conference: European Open Science Agenda: where we are and where we are going?

Published in: Data & Analytics

  1. 1. Ron Dekker Director CESSDA European Open Science Agenda: where we are and where we are going?
  2. 2. Contents Open Science EC Agenda Re-use of Data CESSDA
  3. 3. Open Science - Definition Michael Nielsen: "Open science is the idea that scientific knowledge of all kinds should be openly shared as early as is practical in the discovery process." "Scientific knowledge of all kinds" includes journal articles, data, code, online software tools, questions, ideas, and speculations; anything which can be considered knowledge. "As is practical": very often there are other factors (legal, ethical, social, etc.) that must be considered.
  4. 4. TRENDS 1. SCIENCE WILL OPEN UP • Data-driven • Reproducibility • Better connect within science and with society 2. INFORMATION SOCIETY 3. PLATFORMS
  5. 5. TRENDS 1. SCIENCE WILL OPEN UP 2. INFORMATION SOCIETY • Data is the new oil • Re-usable 3. PLATFORMS
  6. 6. TRENDS 1. SCIENCE WILL OPEN UP 2. INFORMATION SOCIETY 3. PLATFORMS • Value-creating interactions between producers & users • No ownership by provider
  7. 7. Open Science EC Agenda Re-use of Data CESSDA
  8. 8. FAIR • Findable: easy to find by both humans and computer systems; based on mandatory description of the metadata • Accessible: stored for the long term; easy access/download; well-defined license and access conditions; at the level of metadata, or at the level of the actual data content • Interoperable: ready to be combined with other datasets, by humans as well as computer systems • Reusable: ready to be used for future research
  9. 9. FAIR • Findable: easy to find by both humans and computer systems; based on mandatory description of the metadata • Accessible: stored for the long term; easy access/download; well-defined license and access conditions; at the level of metadata, or at the level of the actual data content • Interoperable: ready to be combined with other datasets, by humans as well as computer systems • Reusable: ready to be used for future research CATALOGUE, METADATA: F + A + I = R?
  10. 10. Data Management Plans A Data Management Plan provides information on: • The data the research will generate • How to ensure its curation, preservation and sustainability • What parts of that data will be open (and how)
  11. 11. DMPs: mainly sticks … Sticks • Obligations by many stakeholders • Risk of fragmentation and red tape Carrots • Tools • DMPonline https://dmponline.dcc.ac.uk • “Lab Journal” software/tools that ensure reproducibility, reserve a DOI for the data, and upload the DMPs to publishing platforms • Change: publish and curate your data (a different mindset) • LabFolder integrates with Mendeley
  12. 12. European Cloud Initiative 3 pillars (COM 2016/178 - 19 April 2016) • European Data Infrastructure (EDI): development and deployment of large-scale European HPC, data and network infrastructure • Widening access: SMEs, industry at large, government • European Open Science Cloud (EOSC): researchers have seamless access to all relevant data
  13. 13. European Open Science Cloud Connect with Open Science • EOSC is part of Europe's ambition to support the transition to Open Science and make the most of data-driven science. Efficient • Cost-effective • Privacy- and IPR-conscious • Combines existing infrastructure • Federation of existing and emerging infrastructures Added value • Scale, data-driven science, inter-disciplinary • Data - to - knowledge - to - innovation
  14. 14. Clouds already exist NIH Commons NSF Open Science Cloud Microsoft Azure Amazon Web Services
  15. 15. How to proceed? • Let 1000 flowers bloom, or top-down • By nation or discipline • Pipelines (silos) or platforms COMPUTING NETWORKS SOFTWARE CONTENT
  16. 16. It’s a cultural challenge How to … • Create a safe & secure environment • Realise authentication - of users, of producers • Deal with sensitive data • Ensure quality of the data • Stimulate sharing data • Bring trust
  17. 17. Open Science EC Agenda Re-use of Data CESSDA
  18. 18. Open Science A systemic change in the modus operandi of science and research Affecting the whole research cycle and its stakeholders Commissioner Carlos Moedas Open Science Presidency Conference Amsterdam, 4 April 2016
  19. 19. European Open Science Agenda 1. Reward systems 2. Altmetrics: measuring quality and impact 3. New models for publishing 4. FAIR open data 5. Open Science Cloud 6. Research integrity 7. Citizen Science 8. Open education and skills
  21. 21. Open Science Monitor
  22. 22. Developments Data • Data Management Plans • FAIR & Secure and Safe • European Open Science Cloud
  23. 23. Open Science EC Agenda National Policies Re-use of Data CESSDA
  24. 24. CESSDA Mission • Provide a distributed and sustainable research infrastructure that enables the research community to conduct high-quality research in the social sciences Vision • Platform to provide seamless access to FAIR social science research data in a safe & secure way
  25. 25. Stakeholders Members (Funders) • Governments, Research Funding Organisations • Universities, other Research Performing Organisations Service Providers • Data Services • IT Infrastructure (computing, network, software) • Research Libraries • Publishers Data Producers • Researchers & Research Performing Organisations Data Re-Users • Researchers, Professionals, Citizens
  26. 26. CESSDA Strategy • Technology • CESSDA Catalogue (Findable) • Pathfinder Projects on FAIR, Secure/Safe/Seamless • Trust • Safe & Secure Data Infrastructure • incl. Single Sign On, Different Access Modes • CESSDA Providers as Trusted Repositories • Training & Tools • Train the Trainers & Train the Researchers • Tools, e.g. for data management plans
  27. 27. SYNOPSIS
  28. 28. From Vision to Action
  29. 29. Large-scale data centers cost ca. $1 billion each. Scale: MB, GB, TB, PB, EB
  30. 30. Big Data Europe BDI Components Used in this Pilot • Apache Flume (data ingestion) • Apache Kafka (messaging) • Apache Spark (distributed analysis, transformation) • Apache HDFS (raw storage) • SWC PoolParty Semantic Suite (data consolidation, curation) • OpenLink Virtuoso (triple store) • Apache HTTP (linked data serving) • PoolParty Semantic Graph Search Server (visualisation and data browsing)
  31. 31. Big Data Europe BDI Components Used in this Pilot • Apache Flume (data ingestion) • Apache Kafka (messaging) • Apache Spark (distributed analysis, transformation) • Apache HDFS (raw storage) • SWC PoolParty Semantic Suite (data consolidation, curation) • OpenLink Virtuoso (triple store) • Apache HTTP (linked data serving) • PoolParty Semantic Graph Search Server (visualisation and data browsing) A LOT OF WORK - VOLUME & DYNAMICS COMPLEX MAINTENANCE BUSINESS MODELS - WHO PAYS FOR WHAT
  32. 32. Big Data or AI? What if machines take over the tedious and dirty work? • Machines, Platforms and Crowd • Dr. Watson • Homo Deus Chan Zuckerberg Initiative • Human Cell Project In the news • New AI can guess whether you're gay or straight from a photograph • Elon Musk says AI could lead to third world war • Report shows that AI is more important to IoT than big data insights
  33. 33. Thank you Ron.Dekker@CESSDA.EU WWW.CESSDA.EU Twitter @CESSDA_DATA
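
Slides 8 and 9 describe the FAIR criteria and end on the open question "F + A + I = R?". A minimal sketch of that question as a metadata checklist is given below, in Python. The `DatasetRecord` fields and the `fair_checklist` function are hypothetical names chosen for illustration; they are not a CESSDA catalogue schema or any existing tool. Meeting the three checks is treated only as a prerequisite for reuse, since the slide itself leaves the equation open.

```python
from dataclasses import dataclass


@dataclass
class DatasetRecord:
    """Hypothetical catalogue entry; field names are illustrative only."""
    identifier: str = ""         # persistent identifier, e.g. a DOI
    title: str = ""
    description: str = ""
    license: str = ""            # well-defined license
    access_conditions: str = ""  # e.g. "open", "registered users"
    data_format: str = ""        # e.g. "text/csv"
    vocabulary: str = ""         # controlled vocabulary used in the metadata


def fair_checklist(record: DatasetRecord) -> dict:
    """Return a rough yes/no answer per FAIR letter for one record."""
    # Findable: described well enough for humans and machines to locate it.
    findable = bool(record.identifier and record.title and record.description)
    # Accessible: license and access conditions are stated.
    accessible = bool(record.license and record.access_conditions)
    # Interoperable: format and vocabulary allow combining with other data.
    interoperable = bool(record.data_format and record.vocabulary)
    return {
        "findable": findable,
        "accessible": accessible,
        "interoperable": interoperable,
        # Assumption: F, A and I together are necessary for reuse,
        # not proof that the data is actually reusable.
        "reuse_prerequisites_met": findable and accessible and interoperable,
    }


if __name__ == "__main__":
    example = DatasetRecord(
        identifier="doi:10.0000/example",  # placeholder, not a real DOI
        title="Example survey",
        description="Illustrative record",
        license="CC BY 4.0",
        access_conditions="open",
        data_format="text/csv",
    )
    print(fair_checklist(example))
```

The example record deliberately leaves `vocabulary` empty, so the interoperability check fails and the reuse prerequisites are reported as not met.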
