The Science Cloud Users:
Challenges and Needs
ESA‐ESPI Workshop on “Space Data & Cloud Computing 
Infrastructures: Policies and Regulations”
7 July 2017
Bob Jones
CERN
Bob.Jones <at> cern.ch
Helix Nebula – The Science Cloud
Helix Nebula – The Science Cloud (Grant Agreement 687614) is a Pre‐Commercial Procurement Action 
funded by the H2020 Framework Programme
Accelerating Science and Innovation
Data in High-Energy Physics
Based on DPHEP Study Group (2009). Data Preservation in High Energy Physics. http://arxiv.org/abs/0912.0255
Patricia Herterich
EPFL & SDSC visit 2017-03-24
CERN Open Data Portal
• 2015
  • 40 TB of 2010 data
• 2016
  • 320 TB of 2011 data
• Curation, release of
  • Simulated data (MC)
  • Trigger information
  • Configuration files
http://github.com/cernopendata
The Worldwide LHC Computing Grid
Tier-0 (CERN): data recording, reconstruction and distribution
Tier-1: permanent storage, re-processing, analysis
Tier-2: simulation, end-user analysis
2 million jobs/day
700 PB of storage
nearly 170 sites,
40+ countries
WLCG:
An international collaboration to distribute and analyse LHC data
Integrates computer centres worldwide that provide computing and storage
resources into a single infrastructure accessible by all LHC physicists
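For a rough sense of scale, the headline WLCG figures can be turned into averages. The Python sketch below is illustrative only: it assumes exactly 170 sites and an even spread of work and storage, whereas real sites differ widely in size. The constant and function names are made up for this example.

```python
# Back-of-envelope averages from the WLCG headline figures on the slide.
# Assumption: "nearly 170 sites" is taken as exactly 170, and load is
# treated as evenly spread, which real sites are not.

JOBS_PER_DAY = 2_000_000      # "2 million jobs/day"
STORAGE_PB = 700              # "700 PB of storage"
SITES = 170                   # "nearly 170 sites" (rounded, assumption)
SECONDS_PER_DAY = 24 * 60 * 60


def jobs_per_second(jobs_per_day: float) -> float:
    """Average job rate per second over a full day."""
    return jobs_per_day / SECONDS_PER_DAY


def average_per_site(total: float, sites: int) -> float:
    """Naive per-site average, ignoring differences between tiers."""
    return total / sites


if __name__ == "__main__":
    print(f"Jobs per second: {jobs_per_second(JOBS_PER_DAY):.1f}")
    print(f"Jobs per site per day: {average_per_site(JOBS_PER_DAY, SITES):.0f}")
    print(f"Storage per site (PB): {average_per_site(STORAGE_PB, SITES):.1f}")
```

These averages only indicate order of magnitude; they say nothing about peak load or the split between tiers.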
D. Giordano WLCG Workshop 9/10/2016
CERN cloud procurements 2015-2016
The Hybrid Cloud Model
Brings together
• research organisations,
• data providers,
• publicly funded e‐infrastructures,
• commercial cloud service providers
in a hybrid cloud with procurement and governance approaches suitable for the dynamic cloud market
06/07/2017
A common approach
https://www.eiroforum.org/science‐policy/eiroforum‐directors‐meet‐european‐commissioner‐carlos‐moedas/
EIROforum Directors meet the European Commission in Brussels. 
From left to right: ESRF DG, Francesco Sette, DG Research and 
Innovation, EC, Robert‐Jan Smits, ILL Director, Helmut Schober, 
CERN DG, Fabiola Gianotti, Chair of the European XFEL 
Management Board, Robert Feidenhans’l, Commissioner for 
Research and Innovation, Carlos Moedas, EUROfusion Programme 
Manager, Tony Donné, ESO DG and EIROforum Chair, Tim de Zeeuw, 
EMBL Director International Relations, Silke Schumacher and ESA 
DG, Jan Woerner. (Credit: Mark McCaughrean)
HNSciCloud Joint Pre‐Commercial Procurement
Procurers: CERN, CNRS, DESY, EMBL‐EBI, ESRF, 
IFAE, INFN, KIT, STFC, SURFSara
Experts: Trust‐IT & EGI.eu
The group of procurers has committed:
• Procurement funds
• Manpower for testing/evaluation
• Use‐cases with applications & data
• In‐house IT resources
Resulting services will be made available to end‐users from many research communities
Co‐funded via H2020 Grant Agreement 687614
Total procurement budget >5M€
What will be procured
A hybrid cloud platform for the European research community
HNSciCloud
PCP
Source: Cloud Computing for Govies, DLT Solutions,
David Blankenhorn, Van Ristau and Caron Beesley
Combining services at the IaaS level to support science workflows
The R&D services to be developed are to be integrated with:
• resources in data centres operated by the buyers group
• the GEANT network
Challenges
Innovative IaaS‐level cloud services integrated with procurers' in‐house 
resources and public e‐infrastructure to support a range of 
scientific workloads
Compute and Storage
support a range of virtual machine and container configurations, including HPC, 
working with datasets in the petabyte range
Network Connectivity and Federated Identity Management
provide high‐end network capacity via GEANT for the whole platform with 
common identity and access management
Service Payment Models
explore a range of purchasing options to determine those most appropriate for 
the scientific application workloads to be deployed
The Pre‐Commercial Procurement process
Top 10 challenges for RIs and EOSC to work together
1. Scalability of services - catering for the needs of small research groups from public and
private sectors, as well as very large experiments
2. Integration - services linked by a supported federated identity scheme covering more of the
research life cycle, where users access data, software, IT capacity and the expertise for performing
analyses
3. Hybrid model - should not compete with but rather profit from ease of use and rates of
innovation of commercial service providers
4. Provenance, citation and use of data & software that respects intellectual property rights
5. Software licence models that allow flow of data across different infrastructures without
buying licences for each one
6. Confidentiality of data that is still under embargo for publication or intellectual property
reasons
7. Cyber security vulnerabilities must not compromise participating organisations
8. GDPR compatibility for all services
9. Adoption - Making end users aware of the services and encouraging them to use them
10. A governance model that ensures end‐users and procurers drive the decision‐making
process
Result must be sustained via funding
models that take a long-term view
