During this talk, Giuseppe will introduce the EGI Federated Cloud Infrastructure, a federation of private and public clouds offering a scalable and flexible e-Infrastructure to the European research community. The service is implemented as a hybrid 'Infrastructure as a Service' (IaaS) cloud, composed of multiple clouds that are federated into a scalable compute and storage platform using EGI core infrastructure services. The Federated Cloud serves scientific applications, long-running services and data- and compute-intensive workloads worldwide. It also serves as a reference infrastructure for structured scientific communities who want to build their own cloud federations from partner sites, using open source federation software and standards. The talk and the following demonstration will explain how research workloads can be spread between EGI and EUDAT services, integrating storage, compute and PID solutions from these two networks of providers.
Visit: https://www.eudat.eu/eudat-summer-school
Using the EGI Fed-Cloud for Data Analysis - EUDAT Summer School (Giuseppe La Rocca, EGI)
1. www.eudat.eu
EUDAT receives funding from the European Union's Horizon 2020 programme - DG CONNECT e-Infrastructures. Contract No. 654065
Using the EGI Fed-Cloud for Data Analysis
Giuseppe La Rocca
giuseppe.larocca@egi.eu
Technical Outreach Expert
2. EUDAT Summer School, 3-7 July 2017, Crete
Agenda
• Background information about EGI
– Mission and infrastructure
– Members & partners
– EGI services
• Introduction of the EGI Federated Cloud Infrastructure
– Architecture
• Linking EUDAT services to the EGI Fed-Cloud
EGI: A sustainable e-Infrastructure for Open Science
• Major national e-Infrastructures: 22 NGIs + 1 EIRO (CERN)
• EGI is a federation of over 300 computing and data centres spread across 56 countries in Europe and worldwide
EGI Foundation: www.egi.eu/about/egi-foundation/
International Partnerships
• Africa and Arabia: Council for Scientific and Industrial Research, South Africa
• India: Centre for Development of Advanced Computing
• China: Institute of HEP, Chinese Academy of Sciences
• Latin America: Universidade Federal do Rio de Janeiro
• Ukraine: Ukrainian National Grid
• USA
• Canada
• Asia Pacific Region: Academia Sinica, Taiwan
EGI today
• 23 Cloud providers, 300+ HTC providers
• 15 types of services
• 1.7 million jobs/day
• 2.6 billion CPU hours/year
• 48,000+ users
EGI serves researchers and innovations
[Figure: user communities ordered by size of individual groups]
• ESFRIs, FET flagships: WLCG, ELI, CTA, ELIXIR, EPOS, EISCAT_3D, BBMRI, CLARIN, LOFAR, EMSO, LifeWatch, ICOS, CORBEL, ENVRIplus, …
• Multinational communities (e.g. H2020 projects): VRE projects, OpenDreamKit, WeNMR, DRIHM, VERCE, MuG, AgINFRA, CMMST, LSGC, SuperSites Exploitation, Environmental sci., neuGRID, …
• ‘Long tail of science’: PeachNote, CEBA Galaxy eLab, Semiconductor design, Main-belt comets, Quantum physics studies, Virtual imaging (LS), Bovine tuberculosis spread, Convergent evol. in genomes, Geography evolution, Seafloor seismic waves, 3D liver maps with MRI, Metabolic rate modelling, Genome alignment, Tapeworms infection on fish, …
• Industry, SMEs: Agroknow, CloudEO, CloudSME, Ecohydros, gnubila, Sinergise, SixSq, TEISS, Terradue, Ubercloud, …
Access to EGI resources: Virtual Organisations
VO memberships and access to resources with X.509 certificates
[Figure: VO 1 (sites a, b, c); VO 2 (sites x, y, z, b)]
1. Generic VOs, such as fedcloud.egi.eu: test VOs, an “incubator” for new users
2. Community/discipline-specific VOs, e.g. Chipster, Highthroughputseq, EISCAT, etc.
3. Training VO (training.egi.eu): running hands-on training in the cloud (about any software!)
Browse and search VOs at http://operations-portal.egi.eu/vo/search
Resources allocation to Virtual Organisations (VOs)
[Figure: agreement workflow. Labels: Project/Community representing the VO; Negotiator; Grid provider; Cloud provider; Storage provider; Applic. provider; Service Level Agreement; Operation Level Agreement; Service requirements (type, number, size, cost, availability, etc.); Conditions; Performance reports; Support; Training; Satisfaction review (every 6 months); Send list of publications]
The EGI services – A wide offer of services for Research and Innovation
The EGI Service Catalogue
www.egi.eu/services
High-Throughput Compute
Execute thousands of computational tasks to analyse large datasets
Main features of High-Throughput Compute:
• Access to high-quality computing resources
• Integrated monitoring and accounting tools to provide information about availability and resource consumption
• Workload and data management tools to manage all computational tasks
• Large amounts of processing capacity over long periods of time
See High-Throughput Compute for service information and request
Powered by High-Throughput Compute: HADDOCK
• A web portal offering tools for structural biologists
• Used to model the structure of proteins and other molecules
• So far, HADDOCK has processed more than 130,000 submissions from over 7,500 scientists
• World-wide: > 120,000 CPU cores from 41 sites (EGI & OSG)
[Figure labels: HADDOCK Portal; Workload manager (DIRAC); EGI Clusters (CPU and GPU)]
Cloud Compute
Run virtual machines on-demand with complete control over the computing resources
• Execute compute- and data-intensive workloads
• Host long-running services (e.g. web servers or databases)
• Create disposable testing and development environments
• Select virtual machine configurations to fit your requirements
• Manage your Cloud Compute resources in a flexible way with integrated monitoring and accounting capabilities
See Cloud Compute for service information and request
Powered by Cloud Compute
• When a human cell meets Salmonella: K. Förstner, Univ. Würzburg, used Cloud Compute to run a pipeline for the analysis of sequencing data. Nature (doi:10.1038/nature16547)
• The EXTraS project: implements four software pipelines to harvest data collected on board ESA’s space observatory XMM-Newton.
Cloud Container Compute
Run Docker containers in a lightweight virtualised environment
Main features of Cloud Container Compute:
• On-demand provisioning
• Lightweight environment for maximised performance
• Standard interface to deploy on multiple service providers
• Interoperable and transparent
• Removes friction between development and operations environments
See Cloud Container Compute for service information and request
Summary and comparison of the “Compute services”
High-Throughput
• For batch compute “jobs”
• Jobs must be grid-enabled
• To run parallel-based applications on large-scale resource providers
Cloud Compute
• For compute- or data-intensive tasks and to host online services
• For batch and interactive compute
• Full flexibility with SW
• Lower IT costs, reduce infrastructure complexity, enhance flexibility and deliver high-level services
• Easy to scale up according to the customer’s needs
Container Cloud
• For compute- or data-intensive tasks and to host online services
• Most light-weight
• Fast VM/application start-up
• Containers isolate applications from the underlying infrastructure
Online Storage
Store, share and access your files and their metadata on a global scale
Main features of Online Storage:
• Assign global identifiers to files
• Access highly-scalable storage from anywhere
• Control the data you share
• Organise your data using a flexible hierarchical structure
See Online Storage for service information and request
Archive Storage
Back up your data for the long term and future use in a secure environment
Main features of Archive Storage:
• Store large amounts of data
• Free up your online storage
• Store data for long-term retention
See Archive Storage for service information and request
The EGI Federated Cloud Infrastructure
The EGI Federated Cloud Infrastructure
• Grid of clouds!
• Unified user interfaces
• Harmonised operational behaviour
• Clouds and their interconnections are based on open standards and open technologies
Benefits, technologies
• Harmonised operation: Cloud registry, Information system, Virtual Machine marketplace, Usage accounting, Access control
• Uniform user interfaces
– VM and block storage management: OCCI on every site; OpenStack Nova on OS sites
– Object storage management (optional): CDMI on any site; OpenStack SWIFT on OS sites
• Standard-based federation, OpenStack federation
Federated Cloud Model
[Figure: federation architecture. Labels: Community Platforms; AppDB VMOps; IaaS Federated Access Tools; EGI AAI; three sites, each with a Cloud Management Framework exposing an IaaS API; EGI Federation services: Accounting, Monitoring, Configuration Database, Information Discovery, VM Marketplace]
A view on the current infrastructure
Today:
• 23 providers from 14 NGIs
– 15 OpenStack
– 7 OpenNebula
– 1 Synnefo
• VOs: 34
– Catch-all VOs: 7
– Domain-specific: NGS, …
Different modus operandi
• Compute and data intensive workloads
– Batch and interactive (e.g. Jupyter Notebooks) with scalable and customized environments
– Examples: The Genetics of Salmonella Infections, The Chipster platform
• Service Hosting
– Long-running services (e.g. web server, database, application server)
– Examples: NBIS Web Services, Peachnote analysis platform, The VERCE platform
• Datasets repository
– Store and manage large datasets (in a storage volume)
• Disposable and testing environments
– Host training environments, test applications
– Examples: Events conducted on the cloud-based EGI Training Infrastructure
How to access the EGI FedCloud?
Access to the resources:
• Obtain a personal X.509 access certificate from a recognised Certification Authority (CA), e.g. the Terena Certificate Service (online): https://www.digicert.com/sso
• Join the fedcloud.egi.eu VO, which serves as a test ground for users to try the EGI cloud and to prototype and validate applications.
[Figure: access workflow. Labels: You; CA; Obtain certificate: once; Renew certificate: annually; Virtual Organisation; VO manager; Membership service; Register; Join VO: once; User database; DB replication (once a day); Cloud sites; Use resources]
Remarks: after the 6-month membership in the fedcloud.egi.eu VO, you will need to move to a production VO or establish a new VO.
How to interact with the EGI FedCloud?
• Open Standards Realm
– Uses OCCI 1.2 interface
– Ruby and Java SDKs available
– Simple CLI tool for managing resources
• OpenStack Realm
– Native OpenStack API with VOMS AuthN/AuthZ
– Plugin for Python SDK and OpenStack CLI
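The Open Standards Realm can be tried from a terminal. The following is a minimal sketch, assuming the rOCCI command-line client (`occi`) and the VOMS clients are installed and that a VOMS proxy already exists. The endpoint URL is a hypothetical placeholder, and the helper only prints the command so it can be reviewed before being run.

```shell
#!/usr/bin/env bash
# Hypothetical OCCI endpoint: replace it with the endpoint of a real FedCloud site.
OCCI_ENDPOINT="${OCCI_ENDPOINT:-https://occi.example.org:11443/}"
# Default location of a VOMS proxy, created beforehand with for example:
#   voms-proxy-init --voms fedcloud.egi.eu --rfc
PROXY="${X509_USER_PROXY:-/tmp/x509up_u$(id -u)}"

# Print the occi command that lists resources of one kind
# (for example: compute, storage, os_tpl, resource_tpl).
occi_list_cmd() {
  local resource="$1"
  printf 'occi --endpoint %s --auth x509 --user-cred %s --voms --action list --resource %s\n' \
    "$OCCI_ENDPOINT" "$PROXY" "$resource"
}

occi_list_cmd os_tpl    # VM images offered by the site
occi_list_cmd compute   # VMs currently known to the site
```

Copy a printed line into the shell to execute it against the chosen site.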
A typical workflow
VO Manager:
Endorses available images
Includes images in the VO
The EGI Applications Database
• The EGI Applications Database (AppDB) is a central service that stores and provides information about:
– Software solutions in the form of native software products, virtual appliances and/or software appliances,
– The programmers and scientists who are involved, and
– Publications derived from the registered solutions.
[Figure: Virtual Appliances]
Two different storage solutions
The EGI Federated Cloud Infrastructure offers two different storage solutions:
• Block Storage
• Object Storage
Block Storage
Persistent block-level storage to use with VMs
• Simple usage: use as any other block device from VMs; snapshotable
• High performance: consistent and low-latency performance; SSDs (in some sites)
• Scale to your needs: from GB to TB; create and attach to VMs on demand
Object Storage
Data storage infrastructure for storing and retrieving data from anywhere at any time
• API access: simple REST APIs for managing and accessing data
• Scalable: store as much data as needed; get accounted only for the space used
• Sharing: define ACLs on each object; share your data publicly
Block Storage vs Object Storage
• Access
– Block Storage: only from within a VM, and only at the site where the VM is located
– Object Storage: from any device connected to the internet
• Sharing
– Block Storage: not possible
– Object Storage: possible (data can be kept private or public)
• Accounting
– Block Storage: for the entire volume, regardless of how much of it is actually used in the VM
– Object Storage: only for the data stored
• Integration
– Block Storage: POSIX access, easy with any application able to read/write files on a local disk
– Object Storage: requires a client to be integrated within the application
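The accounting difference can be illustrated with a small worked example. The figures below are hypothetical (a 100 GB volume holding 12 GB of data), and the two helper functions are illustrative names that only restate the rule from the comparison above.

```shell
#!/usr/bin/env bash
# Block storage: the whole volume is accounted, whatever is used inside it.
accounted_block_gb() {
  local volume_gb="$1" used_gb="$2"   # used_gb is deliberately ignored
  echo "$volume_gb"
}

# Object storage: only the data actually stored is accounted.
accounted_object_gb() {
  local stored_gb="$1"
  echo "$stored_gb"
}

# Hypothetical figures: a 100 GB volume that holds 12 GB of data.
block=$(accounted_block_gb 100 12)
object=$(accounted_object_gb 12)
echo "Block storage accounted:  ${block} GB"
echo "Object storage accounted: ${object} GB"
echo "Accounted but unused on the volume: $((block - object)) GB"
```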
Block Storage: OCCI
• OCCI (Open Cloud Computing Interface) is an OGF standard API to facilitate interoperable access to cloud resources
• Block storage in FedCloud is managed via OCCI:
– Create/Delete volumes
– Attach/Detach (link/unlink in OCCI terms) to VMs
– Once attached, use as any other disk in the VM
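Using an attached volume "as any other disk" usually means formatting it once and mounting it. The sketch below prints the typical commands rather than running them. The device name `/dev/vdb` and the mount point `/mnt/data` are assumptions: the real device should be checked with `lsblk`, and `mkfs` erases whatever is on the volume.

```shell
#!/usr/bin/env bash
# Print (dry run) the commands that format and mount a newly attached volume.
# Defaults are assumptions: /dev/vdb for the device, /mnt/data for the mount point.
volume_setup_cmds() {
  local device="${1:-/dev/vdb}" mount_point="${2:-/mnt/data}"
  cat <<EOF
lsblk
sudo mkfs.ext4 $device
sudo mkdir -p $mount_point
sudo mount $device $mount_point
df -h $mount_point
EOF
}

volume_setup_cmds
```

Skip the `mkfs.ext4` step when re-attaching a volume that already contains data.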
Object Storage: CDMI
• FedCloud object storage is managed via CDMI
(Cloud Data Management Interface)
• RESTful API for operations on storage objects
• Developed by SNIA, now ISO/IEC 17826
• Very flexible API, based on capabilities:
• Object basic capabilities (create/get/delete/list)
• Object ACLs
• Import from external sources, export as Filesystems
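The basic object capabilities map onto plain HTTP requests with CDMI-specific headers. The following is a minimal sketch that prints `curl` commands for reading an object and listing a container. The endpoint is a hypothetical placeholder, the specification version is an assumption that must match the server, and authentication options are left out because they depend on the site.

```shell
#!/usr/bin/env bash
# Hypothetical CDMI endpoint; the version is an assumption and must match the server.
CDMI_ENDPOINT="${CDMI_ENDPOINT:-https://cdmi.example.org:8888}"
CDMI_VERSION="${CDMI_VERSION:-1.0.2}"

# Print a curl command that reads a data object.
cdmi_get_cmd() {
  local path="$1"
  printf "curl -X GET -H 'Accept: application/cdmi-object' -H 'X-CDMI-Specification-Version: %s' %s/%s\n" \
    "$CDMI_VERSION" "$CDMI_ENDPOINT" "$path"
}

# Print a curl command that lists a container (container paths end with "/").
cdmi_list_cmd() {
  local container="$1"
  printf "curl -X GET -H 'Accept: application/cdmi-container' -H 'X-CDMI-Specification-Version: %s' %s/%s/\n" \
    "$CDMI_VERSION" "$CDMI_ENDPOINT" "$container"
}

cdmi_list_cmd mycontainer
cdmi_get_cmd mycontainer/text.txt
```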
State of the art: Block Storage
• Block storage is supported on all FedCloud CMFs and sites
                                   OpenStack   OpenNebula   Synnefo
OCCI basic operations              Yes         Yes          Yes
OCCI advanced (resize, snapshot)   No          No           No
Native API advanced                Yes         Partial      Yes
State of the art: Object Storage
• CDMI support
• CDMI server framework by Synnefo
• Ongoing effort to support OpenStack
• Basic client available
• Native APIs allow basic and advanced capabilities
                        OpenStack     Synnefo   OpenNebula
CDMI basic operations   In progress   Yes       N/A
Native API              Yes           Yes       N/A
How to manage datasets in the EGI Federated Cloud?
[Figure: data providers, each holding a local dataset]
VO Manager:
Endorses available images
Includes images in the VO
The EGI DataHub
A Data as a Service (DaaS) to implement the EGI Open Data Platform (ODP)
• EGI Open Data Platform (ODP)
– Support EC Open Data Cloud vision
– Integrate different data repositories available in a distributed environment
– Offer the functionalities to make data open and link them to Open Data Catalogues
• OneData
– Software stack for distributed data management platform
Open Data Platform – The big picture
[Figure: Open Data Platform overview. Actors: EGI User 1 (VO x); EGI User 2 (Onedata space); Anonymous User 1; Anonymous User 2; Space Manager; DOI Registrar (e.g. DataCite); Community Portal. Interfaces: Web GUI, POSIX, HTTP, OAI-PMH, CDMI, REST. Components: Space Manager; Open Data Manager; Metadata Registry; OAI-PMH Data Provider; Authentication and Authorization; Long Term Retention (generate AIP package). Storage: EGI Site 1, EGI Site 2, EGI Site 3, Cloud storage, EUDAT]
Open Data Platform - Interfaces
• GUI: web based; easy data management and sharing, access control; publication of data items and collections
• REST: advanced data and collection management; API for integration with community tools and portals
• CDMI: standard data management operations; advanced metadata queries; integration with future data management applications
• POSIX: enable direct mounting of spaces in the local filesystem without full data transfer
• OAI-PMH: OAI Data Provider interface; Dublin Core metadata by default; more complex metadata can be registered in ODP manually
• HTTP: direct download of open data from URLs
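As an illustration of the OAI-PMH interface, the sketch below builds two standard OAI-PMH requests: `Identify`, and `ListRecords` with the Dublin Core metadata prefix `oai_dc`. The base URL is a hypothetical placeholder, not the address of an actual ODP deployment.

```shell
#!/usr/bin/env bash
# Hypothetical base URL of an OAI-PMH data provider.
OAI_BASE_URL="${OAI_BASE_URL:-https://datahub.example.org/oai_pmh}"

# Build an Identify request, useful as a first connectivity check.
oai_identify_url() {
  printf '%s?verb=Identify\n' "$OAI_BASE_URL"
}

# Build a ListRecords request asking for Dublin Core (oai_dc) metadata.
oai_list_records_url() {
  printf '%s?verb=ListRecords&metadataPrefix=oai_dc\n' "$OAI_BASE_URL"
}

oai_identify_url
oai_list_records_url
# To send a request against a real provider:
#   curl -s "$(oai_list_records_url)"
```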
Linking EUDAT services to the EGI Federated Cloud
How to link EUDAT services to EGI FedCloud
Create your VM topology with the EGI VMOps dashboard
– Access the EGI VMOps dashboard and create your VM to interact with EUDAT
– Select the proper VO
– Select the VM image
– Select one of the available providers
The first time you access the EGI VMOps dashboard you need to set up your profile
44. EUDAT Summer School, 3-7 July 2017, Crete
Select the VM flavour
Start the VM and wait until it is in “running” status
When the VM is in Running status click on View Details
Check VM details
Download the SSH key, change its permission and access the VM
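The key handling and login can be sketched as follows. The key file name and the IP address are hypothetical placeholders to be replaced with the values shown in the VM details, and the `cloudadm` user is an assumption based on the home directory used later in these slides.

```shell
#!/usr/bin/env bash
# Hypothetical values: take the real ones from the VM details page.
KEY_FILE="${KEY_FILE:-$HOME/Downloads/vm_key.pem}"
VM_IP="${VM_IP:-203.0.113.10}"
VM_USER="${VM_USER:-cloudadm}"   # assumption: default user of the VM image

# ssh refuses private keys that other users can read, so restrict the key to its owner.
if [ -f "$KEY_FILE" ]; then
  chmod 600 "$KEY_FILE"
fi

# Print the login command; run it once the VM is in "running" status.
ssh_login_cmd() {
  printf 'ssh -i %s %s@%s\n' "$KEY_FILE" "$VM_USER" "$VM_IP"
}

ssh_login_cmd
```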
Install the EUGridPMA PGP key for apt:
]$ sudo su -
]$ wget -q -O - https://dist.eugridpma.info/distribution/igtf/current/GPG-KEY-EUGridPMA-RPM-3 | apt-key add -
Add the following line to your /etc/apt/sources.list file for apt
#### EGI Trust Anchor Distribution ####
deb http://repository.egi.eu/sw/production/cas/1/current egi-igtf core
Install the ca-policy-egi-core package
]$ sudo apt-get update
]$ sudo apt-get install -y ca-policy-egi-core
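A quick way to confirm that the trust anchors were installed is to count the CA certificates on disk. This is a minimal sketch, assuming the package places them in the conventional `/etc/grid-security/certificates` directory; the function name is illustrative.

```shell
#!/usr/bin/env bash
# Count the CA certificates (*.pem) in a trust-anchor directory.
# The default path is the conventional location and is an assumption here.
count_trust_anchors() {
  local dir="${1:-/etc/grid-security/certificates}"
  find "$dir" -maxdepth 1 -name '*.pem' 2>/dev/null | wc -l | tr -d ' '
}

echo "CA certificates found: $(count_trust_anchors)"
```

A count of 0 suggests that the repository line or the package installation needs to be checked again.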
Install clients in the VM to manipulate files
]$ sudo apt-get install software-properties-common
]$ sudo add-apt-repository ppa:maarten-kooyman-6/ppa
]$ sudo apt-get update
]$ sudo apt-get install uberftp
]$ sudo apt-get install globus-gass-copy
Copy your certificate to /tmp/ to access the EUDAT server
• You need to have been granted access to the B2STAGE/B2SAFE instances
– Send the DN of your digital certificate to the B2STAGE and B2SAFE support teams
Manipulating files on B2STAGE/B2SAFE with UberFTP
]$ uberftp eudat-b2stage.pdc.kth.se
For more details, please refer to: https://linux.die.net/man/1/uberftp
Manipulating files on B2STAGE/B2SAFE with globus-url-copy
• Create a simple text file in your $HOME and save it as text.txt
Upload files from the VM to the B2STAGE instance
]$ globus-url-copy -vb -cred <X509_USER_PROXY> \
     file:///home/cloudadm/text.txt \
     gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt
Download files from B2STAGE to the VM
]$ globus-url-copy -vb -cred <X509_USER_PROXY> \
     gsiftp://eudat-b2stage.pdc.kth.se/eudat.se/projects/eudat-summerschool/text.txt \
     file:///home/cloudadm/text2.txt
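After the upload and the download, the round trip can be verified by comparing the two local files. This is a minimal sketch using `cmp`; the function name is illustrative, and the check only runs when both files from the steps above exist.

```shell
#!/usr/bin/env bash
# Compare the uploaded file with the copy downloaded back from B2STAGE.
verify_round_trip() {
  local original="$1" downloaded="$2"
  if cmp -s "$original" "$downloaded"; then
    echo "OK: $downloaded matches $original"
  else
    echo "MISMATCH: $downloaded differs from $original"
    return 1
  fi
}

# Run the check only when both files from the transfer steps exist.
if [ -f /home/cloudadm/text.txt ] && [ -f /home/cloudadm/text2.txt ]; then
  verify_round_trip /home/cloudadm/text.txt /home/cloudadm/text2.txt
fi
```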
Delete your VM when you are done!
Documentation and wiki
https://wiki.egi.eu/wiki/Federated_Cloud_user_support
Do you need any support? Please contact us at: support@egi.eu