Federated Cloud Computing
Dr David Wallom
Associate Director
Overview
• The Oxford e-Research Centre
• Historical Cloud Federation
• The EGI Federated Cloud
Innovative digital methods transforming research
A collaborative research hub for Digital Oxford
A full department in a leading science division
Research problem
Create or adapt technology
Disseminate and reuse
HPC Engine and Storage
Next Generation Infrastructure
The Smart Grid
High Speed Communications System
• Service Restoration
• Voltage Control
• Condition Monitoring/Data Mining
• Distribution System State Estimation
SCADA & Distribution Management System
The roadmap to federated Cloud
UK NGS Cloud Activities (2010 – 2012)
• NGS Agile Deployment Environments
EPSRC funded, 2 years
• Staff:
– David Wallom (OeRC, Oxford);
– David Fergusson (NeSC, Edinburgh);
– Steve Thorn (NeSC, Edinburgh);
– Matteo Turilli (OeRC, Oxford).
• Goals:
– an EC2-compatible, open-source solution;
– small physical systems, physically distributed;
– development of a dedicated pool of images, supporting both end-user
and NGS requirements such as training;
– collection of data about feasibility, costs and stability;
– identification of use cases and gathering of further requirements.
NGS Cloud Usage
• >120 registered users: uptake has been very fast
and users stayed engaged throughout the whole
testing period;
• 26 institutions: 23 HEIs (both universities and colleges) and
3 companies;
• 30 projects;
• 10 research areas.
Life sciences
Teaching
Mathematics
Cloud R&D
Physics
Ecology
Geography
Medicine
Social Science
Engineering
Exemplar Case Studies
• Evolutionary Genomics: “analysis and Information management of
Next Generation Sequencing (NGS) of Genomic data poses many
challenges in terms of time and size. We are exploring the translation of
high quality NGS scientific analysis pipelines to make best use of Cloud
infrastructure”;
• Geospatial Science: “geospatial data is a mix of raster and vector data.
As rasterizing is CPU-hungry process, and all maps displayed on the
screen of the final user are rasters, it is more efficient to do the process on
the server side. I am investigating how this process can be dispersed
across many, if not unlimited instances in a cloud”;
• Agent-based modelling of crime: “at the moment I have a tomcat
server that hosts some web services used to run social simulation model, it
needs access to the file system to run fortran scripts, create files etc. There
are loads of problems with running our own server at uni and I think a
virtual machine that I could have control over would be much better”.
Flexible Services for the Support of Research (FleSSR)
6 Partners
• Academic and industrial;
• 3 cloud infrastructures.
Goals
Building federated cloud infrastructure,
extending the use of UK NGS central services
with cloud brokering and accounting.
Use cases
• Multi Platform Software Development;
• On demand Research data storage.
FleSSR Architecture
Oxford
Reading
Eduserv
Zeel/i Broker
STFC/NGS
Accounting Database
FleSSR Infrastructure
• Local/Global: services depend on either local or global
access. Cloud brokering is not mandatory for AWS-like
service access;
• Multiple identities: every user may have multiple identities,
both local and global;
• Only personal identities: group identities are not
implemented. The management of every single identity is
left to the legally responsible user;
• Multiple AA technologies: AA may differ depending on
local and global policies/technologies;
• Multiple accounting: every single identity is accounted for
its usage. Every individual may get multiple invoices.
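The "multiple identities" and "multiple accounting" points can be illustrated with a minimal sketch. It is not the FleSSR accounting code: the record fields, identity names and CPU-hour unit are all assumptions made for illustration. It only shows how accounting per identity leads to one person receiving several invoices.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    identity: str     # one local or global identity (hypothetical naming)
    owner: str        # the legally responsible individual
    cpu_hours: float  # assumed accounting unit


def invoices_per_identity(records):
    """Sum usage per identity; each identity yields its own invoice."""
    totals = defaultdict(float)
    owners = {}
    for rec in records:
        totals[rec.identity] += rec.cpu_hours
        owners[rec.identity] = rec.owner
    return [
        {"owner": owners[ident], "identity": ident, "cpu_hours": hours}
        for ident, hours in sorted(totals.items())
    ]


# Hypothetical usage: "alice" holds one local and one global identity.
records = [
    UsageRecord("alice@oxford-local", "alice", 10.0),
    UsageRecord("alice@global", "alice", 4.5),
    UsageRecord("alice@oxford-local", "alice", 2.5),
    UsageRecord("bob@reading-local", "bob", 7.0),
]
invoices = invoices_per_identity(records)
```

Here `alice` ends up with two invoices, one per identity, because usage is never merged across the identities of the same individual.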
FleSSR Use Case: Multi Platform Software Development
Zeel/i Broker
Instance configuration manager
FleSSR cloud
Build manager
CVS / SVN repository
Build instance 1
Build instance 2
Build instance 3
Build instance 4
Build instance 5
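A rough sketch of the fan-out in this use case, one build per instance, might look as follows. Everything here is hypothetical: the platform names, the revision label and the `run_build` stand-in are illustrative only, and no Zeel/i or repository call is made.

```python
from concurrent.futures import ThreadPoolExecutor


def run_build(platform, revision):
    """Stand-in for checking out `revision` and building on one instance."""
    # A real build manager would start an instance through the broker,
    # check out the source from CVS/SVN and run the build there.
    return {"platform": platform, "revision": revision, "status": "ok"}


def build_all(platforms, revision):
    """Run one build per platform in parallel and collect the results in order."""
    with ThreadPoolExecutor(max_workers=len(platforms)) as pool:
        return list(pool.map(lambda p: run_build(p, revision), platforms))


# Five hypothetical platform images, one per build instance.
platforms = ["linux-x86_64", "linux-i386", "windows-x86_64", "macos", "freebsd"]
results = build_all(platforms, revision="r1234")
```

`pool.map` returns results in the order of `platforms`, so each result can be matched back to its build instance.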
FleSSR Use Case: On demand Research data storage
Zeel/i Broker
Volume Manager
FleSSR cloud
VM
EBS Interface
EBS Volume
FleSSR Output
Code
• Instance configuration and build manager: Perl command-line
utility + Java client utilising the Zeel/i API;
• Personal EBS volume manager: web-based, Java client for EBS
volume handling + tailored VM image with multiple data interfaces
(SFTP, WebDAV, GlusterFS, rsync, ssh);
• Eucalyptus open-source accounting system: Perl aggregators
and parsers for standard Eucalyptus open-source log files +
MySQL accounting database + PHP accounting client.
Use cases
• SKA community testing of Use case;
• Institutional ICT team testing WebDAV, GridFTP & GlusterFS
solution as Use case 2.
www.egi.eu EGI-InSPIRE RI-261323
EGI-InSPIRE
The EGI Federated Cloud solution
2011 - date
Rationale
• High Throughput Platform
– Academic resource providers
– Few related use cases
– Single application model
• Federated Cloud Platform
– Diversity of resource providers
– Many diverse use cases & application models
[Figure: growth of providers plotted against growth of research communities, from tens of thousands to millions]
Federated Cloud solution
The Federated Cloud Solution provides access to digital
resources in a flexible environment, using common standards to
support data- and compute-intensive experiments:
• a set of independent cloud services presented coherently as a
single system using common standards;
• lets users choose freely among a broader range of service
providers;
• lets users run applications already developed by people within their
own community whom they trust, and from other communities that
have an independent badge of quality.
Target groups:
• individual researchers
• larger research communities or groups
Principles of Federation
• Standards and validation: Recommended and
common open standards for the interfaces and images
– OCCI, CDMI, OVF, GLUE2, AAI
• Resource integration: Cloud Computing to be
integrated into the existing production infrastructure.
• Heterogeneous implementation: no mandate on the
cloud technology.
• Provider agnosticism: the only condition to federate
resources is to expose the chosen interfaces and
services.
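Provider agnosticism can be pictured as a structural interface check: a site qualifies because of what it exposes, not because of the stack it runs. The sketch below is an illustration only. The `FederatedProvider` protocol and its two methods are hypothetical stand-ins for the agreed interfaces and do not reproduce OCCI, CDMI or GLUE2.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class FederatedProvider(Protocol):
    """Hypothetical stand-in for the interfaces a site agrees to expose."""

    def start_instance(self, image_id: str) -> str: ...
    def publish_info(self) -> dict: ...


class OpenStackSite:
    def start_instance(self, image_id: str) -> str:
        return f"openstack-vm-for-{image_id}"

    def publish_info(self) -> dict:
        return {"stack": "OpenStack"}


class OpenNebulaSite:
    def start_instance(self, image_id: str) -> str:
        return f"opennebula-vm-for-{image_id}"

    def publish_info(self) -> dict:
        return {"stack": "OpenNebula"}


class LegacyCluster:
    """Exposes none of the agreed interfaces, so it cannot be federated."""

    def submit_job(self, script: str) -> str:
        return "job-1"


def can_federate(site) -> bool:
    # runtime_checkable only verifies that the methods exist, not their signatures.
    return isinstance(site, FederatedProvider)


sites = [OpenStackSite(), OpenNebulaSite(), LegacyCluster()]
federated = [type(s).__name__ for s in sites if can_federate(s)]
```

The two sites share no base class and run different stacks, yet both pass the check, which is the point of "no mandate on the cloud technology".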
EGI Cloud Infrastructure
EGI Core Platform
Federated AAI
Service Registry
Monitoring
Accounting
EGI Cloud Infrastructure Platform
Instance Management
Information Discovery
Storage Management
Help and Support
Security Co-ordination
Training and Outreach
EGI Collaboration Tools
EGI Application DB
Image Repository
EGI Cloud Service Marketplace
Sustainable Business Models
User Community
Monitoring and control of utilisation
Technical Consultancy and Support
Uniform interfaces to Cloud Compute and Storage
Cloud Management Stacks (OpenStack, OpenNebula, Synnefo, …)
Secure endorsed Application and Service Deployment
GSI, GLUE2, cloud-init, CDMI, SAM, UR, OVF, OCCI
Using open standards for VM Management
rOCCI-server
Partnership
Resources
– 13 NGIs provide 22 certified resources
– 4 NGIs currently integrating resources
– 5 NGIs with interested resource providers
– Worldwide interest & integration
• Australia* (NeCTAR)
• South Africa* (SAGrid)
• South Korea* (KISTI)
• United States* (NIST, NSF A.C. Centres)
* Not shown on map
Usage since launch
- 562k VMs
- 42M CPU hours
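As a rough reading of these two figures, the average CPU time per VM can be worked out directly. This assumes that "k" means thousand and "M" means million, and that both numbers are rounded totals, so the result is indicative only.

```python
vms = 562_000            # "562k VMs", assuming k = thousand
cpu_hours = 42_000_000   # "42M CPU hours", assuming M = million

# Average CPU hours consumed per VM since launch.
avg_cpu_hours_per_vm = cpu_hours / vms
print(f"{avg_cpu_hours_per_vm:.1f} CPU hours per VM on average")
```

That is roughly 75 CPU hours per VM, or about three days of a single core.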
User Model
• The offer to our users:
• Total control over deployed applications
• Elastic resource consumption based on real needs
• Workloads processed on-demand
• Endorsed and accredited applications shared by
multiple different communities
• Single sign-on at multiple, independent providers
• Centralised access to service information across
multiple providers
Virtual Appliance Catalogue
• Registered Virtual Appliances: 30
• Supporting Sites: 21
• Supported Virtual Organizations: 9
• atlas,
• biomed,
• cms,
• demo.fedcloud.egi.eu,
• drihm.eu,
• fedcloud.egi.eu,
• highthroughputseq.egi.eu,
• lhcb,
• vo.chain-project.eu
[Operation of the AppDB Cloud MP officially started in June 2014]
Distribution of Virtual Appliances
[Chart: Virtual Appliances by research community]
Virtual Appliances distribution
[Chart: Virtual Appliances by technical function]
EGI FedCloud Launch Communities
(May 2014)
• Ecology – BioVeL: Biodiversity Virtual e-Laboratory
• Structural biology – WeNMR: a worldwide e-Infrastructure for NMR and structural biology
• Linguistics – CLARIN: ‘British National Corpus’ service (BNCWeb)
• Earth Observation – SSEP: European Space Agency’s Supersites Exploitation Platform for
volcano and earthquake monitoring (collaboration with Helix Nebula)
• Software Engineering – SCI-BUS: simulated environments for portal testing
• Software Engineering – DIRAC: deploying ready-to-use distributed computing systems
• Software Engineering – Catania Science Gateway Framework
• Musicology – Peachnote: dynamic analysis of musical scores
• Earth Observation – ENVRI: Common Operations of Environmental Research
infrastructures (collaboration with EISCAT3D)
• Geology – VERCE: Virtual Earthquake and seismology Research
• Ecology – LifeWatch: E-Science European Infrastructure for Biodiversity and Ecosystem
Research
• High Energy Physics – CERN ATLAS: ATLAS processing cluster via Helix Nebula
More info: https://wiki.egi.eu/wiki/Fedcloud-tf:Users
Current use case status
35
59 in total
EGI FedCloud Use Cases
[Chart: use cases classified by discipline]
Strengthening the underpinning platform
Continuing a Technology Evolution
• Broader support for open standards in Cloud
management frameworks
– Utilisation of rOCCI for interfaces to commercial cloud
frameworks
– Completion of high quality reference implementation
for CDMI
• New feature additions to foundational tools
depending on requests
– Accounting, monitoring, service discovery, Image
Management
Value-added services for User Communities
Federated Cloud Services
Federated IaaS Cloud
Tier 1: Reliable Infrastructure Cloud
Tier 2: General-purpose platform services
Tier 3: Platform as a Service
Tier 4: Zero ICT Infrastructures
PaaS, DBaaS, Hadoop aaS, VRE
Secure storage, Key management, Encryption, ACL management
Virtual eLaboratory
Conclusions
• Utilisation of virtual infrastructure is the only scalable method to support a large number
of disparate user communities with widely differing application models;
• Federation is a robust and scalable model for national/European cloud infrastructure for
research, as part of an ecosystem of e-infrastructures rather than as the sole e-infrastructure;
• Federation is only possible through the availability of open standards;
• Successful pilot tests of multiple cloud infrastructure prototypes allowed quicker
development of the final model for EGI;
• EGI Federated Cloud is attracting new communities from various scientific
domains
– 26 communities and 59 use cases currently supported, 5 from commercial organisations
• Paving the way for a global federated cloud marketplace led through European
Innovation
– Established best practice
– Illustrating leadership
– Open standards, open technology
– Open membership, open processes
– Open competition
• Oxford and OeRC have led this activity from its inception in 2010 to the present
An IT Services and Oxford e-Research Centre Partner facility
Thank you & Questions
