The European Open Science Cloud Pilot
Brian Matthews
Science and Technology Facilities Council
03/07/17
Why Europe is not fully tapping
into the potential of data:
Data not always open and lack of incentives and rewards for data sharing
Lack of interoperability required for data sharing … noting deep-rooted
walls between disciplines.
Fragmentation between data infrastructures that are split by scientific and
economic domains, countries and governance models
Surging demand for High Performance Computing at a scale above single
member state resources
Data reuse employing advance analysis techniques adequate protection of
personal data considering forthcoming revision of Copyright legislation.
Proposed a European Open
Science Cloud
Make all scientific data produced by the Horizon 2020 programme open by
default.
Raise awareness and change incentive structures for academics industry and
public services to share their data.
Develop specification for interoperability and data sharing across disciplines
and infrastructures
Create a fit-for-purpose pan-European governance structure to federate
scientific data infrastructures and overcome fragmentation.
Develop cloud based services for Open science supported by the necessary
data infrastructure
Enlarge the scientific user base to researchers and innovators from all
disciplines.
High Level Expert Group for the "European
Open Science Cloud".
http://ec.europa.eu/research/openscience/pdf/hleg/hleg-eosc-first-report_(draft).pdf
Definitions
European:
research and innovation are global - EOSC cannot be built exclusively in and for
Europe
Europe, is in a strong position to lead this initiative as already distributed and
collaborative
Open:
not all data and tools can be open. E.g. confidentially and privacy.
Open is also often confused with ‘for free'. Free data and services do not exist.
Intelligently open is what we mean,
Science:
explicitly includes all disciplines including the arts and humanities,
Also societal innovation and productivity,
support broad societal participation in Open Innovation and Open Science.
Cloud:
It can be misinterpreted to indicate that the EOSC is mostly about hard ICT
infrastructure
But it is much more a commons of data, software, standards, expertise and
policy related to data-driven science and innovation.
Evolution of infrastructure
What do we need to do?
03/07/17
Technical Challenges: developing technical solutions
that meet the scientific needs
9
EOSCpilot Challenges
Scientific Challenges are really Opportunities
Technical Challenges are Barriers to overcome
Cultural Challenges are also Barriers
Scientific Challenges: deploying the EOSC to deliver
Open Science
Cultural Challenges: adopting new, more open ways
of working
Three types of challenges addressed by the EOSCpilot:
Actions to bring about an EOSC
• Bring the current Research Infrastructures together
• We do not want to replace their work
• Bring the e-Infrastructure projects together
• GEANT , PRACE
• EGI, EUDat, OpenAire
• Interoperate between their services
• Catalogue of services
• Allow people to select services to build new infrastructures
• Set up appropriate rules of engagement
• Allow access to data and compute
• Interoperate between their data
• FAIR data catalogues – accessible outside their discipline.
• Interoperable standards and metadata
• Allow new resources to be added
• Cloud providers, HPC providers, data providers, service providers
• Within the common governance and resourcing processes
• And a skills and competencies framework
• Need some set of core services and processes to hold the EOSC together
03/07/17
EOSC-Pilot Project
Setting the EOSC in the right direction
First of the EOSC projects
10M€ over 2 years
• Jan 2017 – Dec 2018
33 Partners + 15 3rd parties
• Led by STFC
• A range of e-Infrastructure providers, research institutes, research consortia, across
disciplines.
• EGI, EUDat, OpenAire, PRACE, GEANT
• ELIXIR, ICOS, ECRIN, BBMRI, DESY, CERN, XFEL, CEA
• STFC, CNR, DANS, DCC, BSC, MPG, CNRS
Try to answer some basic questions
• What is the EOSC going to provide?
• How is the EOSC going to operate ?
• How is the EOSC going to change how science is done ?
www.eoscpilot.
eu
11
EOSCpilot: High Level Aims
The EOSCpilot project will support the first phase in the
development of the EOSC.
Establish the governance framework for the EOSC and
contribute to the development of European open science
policy and best practice;
Develop a number of demonstrators functioning as high-
profile pilots that integrate services and infrastructures to
show interoperability and its benefits in a number of
scientific domains;
Engage with a broad range of stakeholders, crossing
borders and communities, to build the trust and skills
required for adoption of an open approach to scientific
research.
1. Governance
• Propose a governance framework
2. Policy
• Devise a policy environment
3. Demonstrators
• Use real demonstrators to drive the
requirements for the EOSC
4. Services
• Specify service architecture, catalogue and
pilot services
5. Interoperability
• Identify interfaces and standards to drive
interoperability
6. Skills
• Specify a skills and competencies framework
for the EOSC
7. Engagement
• involve as many stakeholders as possible.
13
Workpackages
Science Demonstrators
First 5 Demonstrators
• Environmental & Earth Sciences - ENVRI
Radiative Forcing Integration to enable
harmonised data access and integration
across multiple research communities
• High Energy Physics - WLCG: large-scale, long-
term preservation and re-use of HEP data in
the EOSC open to other researchers
• Humanities – TEXTCROWD: Collaborative
semantic enrichment of text-based datasets
by make new software available on the EOSC.
• Life Sciences - Pan-Cancer Analyses & Cloud
Computing within the EOSC to accelerate
genomic analysis on the EOSC
• Physics - The photon-neutron community to
improve the community’s computing facilities
by creating a virtual platform for all users
Second 5 Demonstrators
• HPCaaS for Fusion - Culham Science Centre, UK
• Life Science Leveraging EOSC to offload
updating and standardizing life sciences
datasets and to improve studies
reproducibility, reusability and interoperability-
CRG, Spain
• Seismology: EPOS Virtual Earthquake and
Computational Earth Science e-science
environment in Europe- University of Liverpool,
UK
• CryoEM Linking distributed data and data
analysis resources as workflows in Structural
Biology with cryo-Electron Microscopy:
Interoperability and reuse CSIC, Spain
• Astronomy Open Science Cloud access to
LOFAR data - ASTRON, NL
• 5 more demonstrators to be selected in the autumn.
www.eoscpilot.e
u
14
Some Key Deliverables of EOSC Pilot
Governance framework
propose how the EOSC might be operated
Policies it needs to run the EOSC
Architecture of the EOSC
Systems of Systems approach
Rules of engagement
Catalogue of services
Skills and Competencies framework to work with the EOSC
Interoperability
Service Interoperability
Data Interoperability
www.eoscpilot.
eu
The European Open Science Cloud for Research pilot project is
funded by the European Commission, DG Research & Innovation
under contract no. 739563
15
Service Interoperability
Propose an architecture, validated technical solutions and
best practices for enabling interoperability across multiple
federated e-infrastructures, overcoming current gaps
expressed by user communities and resource providers.
www.eoscpilot.
eu
The European Open Science Cloud for Research pilot project is
funded by the European Commission, DG Research & Innovation
under contract no. 739563
16
Reasons
for GAPs
Gap1:
Diversity and
incompatibility
of the AAIs
Gap5: Low
awareness
of the e-
infrastructur
es and
services
Gap2:
Network
services
Gap4:
Diversity of
access
policies
Gap3:
Diversity of
services and
providers
Gap6: Lack
of expertise,
training,
easy tools,
human
networks
Bridging
the GAPs
Gap1:
Global AAI
Gap5:
Common
vocabulary,
global
services
catalogue,
disseminatio
n
Gap2:
Network
services
improveme
nt
Gap4:
Multidiscipli
nary
mutualised
space
Gap3:
Services
technical
interoperabil
ity
Gap6: Foster
adoption,
expertise
sharing, user
friendly
tools, human
networks
Data Interoperability
Establish principles and develop mechanisms that
enable the EOSC to provide research and data
interoperability across the diversity of existing (and
potential future) research communities, research
infrastructures and other research organisations
Consider some examples of data interoperability in
research infrastructures
Develop guidelines to ensure data interoperability for
FAIR data
Define minimal metadata to allow cross-walks
Test in some trial scenarios
www.eoscpilot.
eu
The European Open Science Cloud for Research pilot project is
funded by the European Commission, DG Research & Innovation
under contract no. 739563
17
Thank you!
Brian.Matthews@stfc.ac.uk

OSFair2017 Workshop | The European Open Science Cloud Pilot

  • 1.
    The European OpenScience Cloud Pilot Brian Matthews Science and Technology Facilities Council
  • 2.
  • 3.
    Why Europe isnot fully tapping into the potential of data: Data not always open and lack of incentives and rewards for data sharing Lack of interoperability required for data sharing … noting deep-rooted walls between disciplines. Fragmentation between data infrastructures that are split by scientific and economic domains, countries and governance models Surging demand for High Performance Computing at a scale above single member state resources Data reuse employing advance analysis techniques adequate protection of personal data considering forthcoming revision of Copyright legislation.
  • 4.
    Proposed a EuropeanOpen Science Cloud Make all scientific data produced by the Horizon 2020 programme open by default. Raise awareness and change incentive structures for academics industry and public services to share their data. Develop specification for interoperability and data sharing across disciplines and infrastructures Create a fit-for-purpose pan-European governance structure to federate scientific data infrastructures and overcome fragmentation. Develop cloud based services for Open science supported by the necessary data infrastructure Enlarge the scientific user base to researchers and innovators from all disciplines.
  • 5.
    High Level ExpertGroup for the "European Open Science Cloud". http://ec.europa.eu/research/openscience/pdf/hleg/hleg-eosc-first-report_(draft).pdf
  • 6.
    Definitions European: research and innovationare global - EOSC cannot be built exclusively in and for Europe Europe, is in a strong position to lead this initiative as already distributed and collaborative Open: not all data and tools can be open. E.g. confidentially and privacy. Open is also often confused with ‘for free'. Free data and services do not exist. Intelligently open is what we mean, Science: explicitly includes all disciplines including the arts and humanities, Also societal innovation and productivity, support broad societal participation in Open Innovation and Open Science. Cloud: It can be misinterpreted to indicate that the EOSC is mostly about hard ICT infrastructure But it is much more a commons of data, software, standards, expertise and policy related to data-driven science and innovation.
  • 7.
  • 8.
    What do weneed to do? 03/07/17
  • 9.
    Technical Challenges: developingtechnical solutions that meet the scientific needs 9 EOSCpilot Challenges Scientific Challenges are really Opportunities Technical Challenges are Barriers to overcome Cultural Challenges are also Barriers Scientific Challenges: deploying the EOSC to deliver Open Science Cultural Challenges: adopting new, more open ways of working Three types of challenges addressed by the EOSCpilot:
  • 10.
    Actions to bringabout an EOSC • Bring the current Research Infrastructures together • We do not want to replace their work • Bring the e-Infrastructure projects together • GEANT , PRACE • EGI, EUDat, OpenAire • Interoperate between their services • Catalogue of services • Allow people to select services to build new infrastructures • Set up appropriate rules of engagement • Allow access to data and compute • Interoperate between their data • FAIR data catalogues – accessible outside their discipline. • Interoperable standards and metadata • Allow new resources to be added • Cloud providers, HPC providers, data providers, service providers • Within the common governance and resourcing processes • And a skills and competencies framework • Need some set of core services and processes to hold the EOSC together 03/07/17
  • 11.
    EOSC-Pilot Project Setting theEOSC in the right direction First of the EOSC projects 10M€ over 2 years • Jan 2017 – Dec 2018 33 Partners + 15 3rd parties • Led by STFC • A range of e-Infrastructure providers, research institutes, research consortia, across disciplines. • EGI, EUDat, OpenAire, PRACE, GEANT • ELIXIR, ICOS, ECRIN, BBMRI, DESY, CERN, XFEL, CEA • STFC, CNR, DANS, DCC, BSC, MPG, CNRS Try to answer some basic questions • What is the EOSC going to provide? • How is the EOSC going to operate ? • How is the EOSC going to change how science is done ? www.eoscpilot. eu 11
  • 12.
    EOSCpilot: High LevelAims The EOSCpilot project will support the first phase in the development of the EOSC. Establish the governance framework for the EOSC and contribute to the development of European open science policy and best practice; Develop a number of demonstrators functioning as high- profile pilots that integrate services and infrastructures to show interoperability and its benefits in a number of scientific domains; Engage with a broad range of stakeholders, crossing borders and communities, to build the trust and skills required for adoption of an open approach to scientific research.
  • 13.
    1. Governance • Proposea governance framework 2. Policy • Devise a policy environment 3. Demonstrators • Use real demonstrators to drive the requirements for the EOSC 4. Services • Specify service architecture, catalogue and pilot services 5. Interoperability • Identify interfaces and standards to drive interoperability 6. Skills • Specify a skills and competencies framework for the EOSC 7. Engagement • involve as many stakeholders as possible. 13 Workpackages
  • 14.
    Science Demonstrators First 5Demonstrators • Environmental & Earth Sciences - ENVRI Radiative Forcing Integration to enable harmonised data access and integration across multiple research communities • High Energy Physics - WLCG: large-scale, long- term preservation and re-use of HEP data in the EOSC open to other researchers • Humanities – TEXTCROWD: Collaborative semantic enrichment of text-based datasets by make new software available on the EOSC. • Life Sciences - Pan-Cancer Analyses & Cloud Computing within the EOSC to accelerate genomic analysis on the EOSC • Physics - The photon-neutron community to improve the community’s computing facilities by creating a virtual platform for all users Second 5 Demonstrators • HPCaaS for Fusion - Culham Science Centre, UK • Life Science Leveraging EOSC to offload updating and standardizing life sciences datasets and to improve studies reproducibility, reusability and interoperability- CRG, Spain • Seismology: EPOS Virtual Earthquake and Computational Earth Science e-science environment in Europe- University of Liverpool, UK • CryoEM Linking distributed data and data analysis resources as workflows in Structural Biology with cryo-Electron Microscopy: Interoperability and reuse CSIC, Spain • Astronomy Open Science Cloud access to LOFAR data - ASTRON, NL • 5 more demonstrators to be selected in the autumn. www.eoscpilot.e u 14
  • 15.
    Some Key Deliverablesof EOSC Pilot Governance framework propose how the EOSC might be operated Policies it needs to run the EOSC Architecture of the EOSC Systems of Systems approach Rules of engagement Catalogue of services Skills and Competencies framework to work with the EOSC Interoperability Service Interoperability Data Interoperability www.eoscpilot. eu The European Open Science Cloud for Research pilot project is funded by the European Commission, DG Research & Innovation under contract no. 739563 15
  • 16.
    Service Interoperability Propose anarchitecture, validated technical solutions and best practices for enabling interoperability across multiple federated e-infrastructures, overcoming current gaps expressed by user communities and resource providers. www.eoscpilot. eu The European Open Science Cloud for Research pilot project is funded by the European Commission, DG Research & Innovation under contract no. 739563 16 Reasons for GAPs Gap1: Diversity and incompatibility of the AAIs Gap5: Low awareness of the e- infrastructur es and services Gap2: Network services Gap4: Diversity of access policies Gap3: Diversity of services and providers Gap6: Lack of expertise, training, easy tools, human networks Bridging the GAPs Gap1: Global AAI Gap5: Common vocabulary, global services catalogue, disseminatio n Gap2: Network services improveme nt Gap4: Multidiscipli nary mutualised space Gap3: Services technical interoperabil ity Gap6: Foster adoption, expertise sharing, user friendly tools, human networks
  • 17.
    Data Interoperability Establish principlesand develop mechanisms that enable the EOSC to provide research and data interoperability across the diversity of existing (and potential future) research communities, research infrastructures and other research organisations Consider some examples of data interoperability in research infrastructures Develop guidelines to ensure data interoperability for FAIR data Define minimal metadata to allow cross-walks Test in some trial scenarios www.eoscpilot. eu The European Open Science Cloud for Research pilot project is funded by the European Commission, DG Research & Innovation under contract no. 739563 17
  • 18.