An Open-source Infrastructure for
      Cloud Computing


                               Tudor Zaharia
Overview of the presentation
  What is a Cloud?
  Public Clouds
  Open Source Cloud Infrastructure
  Eucalyptus
   What is it?
   Goals
   Features
   Design and architecture
   Experiment
  Conclusions
What is a Cloud?




SLAs

                          Web Services




                               Virtualization
Public Clouds (Now)
Large scale infrastructure available on a rental basis
   Operating System virtualization (e.g. Xen) provides CPU
   isolation
   Network isolation
   Locally specific storage abstractions

Fully customer self-service
   Service Level Agreements (SLAs) are advertized
   Requests are accepted and resources granted via web
   services
   Customers access resources remotely via the Internet

Accountability is e-commerce based
   Web-based transaction
   “Pay-as-you-go” and flat-rate subscription
   Customer service, refunds, etc.
Open Source Cloud Infrastructure
Simple
    Transparent
    Scalable => complexity often limits scalability

Extensible
    New application classes and service classes may require new features
    Clouds are new => need to extend while retaining useful features

Commodity-based
    Must leverage extensive catalog of open source software offerings
    New, unstable, and unsupported infrastructure design is a barrier to
    uptake, experimentation, and adoption

Easy
    To install => system administration time is expensive
    To maintain => system maintaining time is really expensive
Eucalyptus
Eucalyptus




Elastic Utility Computing Architecture Linking Your Programs To
                         Useful Systems
Eucalyptus - what does it brings?

• Web services based implementation of elastic/utility/cloud
computing infrastructure
  — Linux image hosting ala Amazon
• How do we know if it is a cloud?
  — Try and emulate an existing cloud: Amazon AWS
• Functions as a software overlay
  — Existing installation should not be violated (too much)
• Focus on installation and maintenance
  — “System Administrators are people too.”
Design goals
Extensibility




Non-intrusiveness
Features
Current release is version 1.5.1:
   Interface compatibility with EC2 (both Web service and Query
   interfaces)
   Support for the KVM hypervisor
   Stand-alone RPMs for non-Rocks RPM based systems
   Secure internal communication using SOAP with WS-security
   Elastic Block Store (EBS) compatible storage service
   Basic "Cloud Administrator" tools for system management and
   user accounting
   The ability to configure multiple clusters, each with private
   internal network addresses, into a single Cloud.
Installation prerequisites

Support XEN virtualization




Host Web services
Architecture
Starting point:



                  Firewall
Architecture
Hierarchical design reflecting underlying resources:
Architecture
Hierarchical design reflecting underlying resources:



                                                   Cloud controller
Cloud Controller


  user/admin entry point
  high-level VM instance scheduling decisions
  processing SLAs
  maintaining system and user metadata
Cloud Controller as a service provider




  handle user requests and authentication
  handle persistent system and user metadata (VM images and ssh
  keys)
  manage and monitor VM instances
  Web interfaces and Query interfaces
Architecture
Hierarchical design reflecting underlying resources:




                                                       Cluster controller
Cluster Controller

  groups collections of Node Controllers that logically belong
  together
  gathers state information from its nodes
  schedules incoming VM instance execution requests
  manages the configuration of public and private instance networks
  described as WSDL
Architecture
Hierarchical design reflecting underlying resources:




                                                       Node controller
Node controller

  executes on the physical resources
  instance startup
  inspection
  shutdown
  cleanup
  only one NC per physical machine is needed (regardless on the
  number of VM's on that machine)
  calls the hypervisor (XEN) to control and inspect running
  instances.
Virtual networking
Things to address:
    Performance



   Isolation



   Connectivity
Virtual network connectivity

Each instance is given:
   one "public" network interface
       handles communication to "outside" world
       bridged to real network device
   one "private" network interface
       used only for inter - VM communication

CC deals with set-up and tear down of virtual network
Virtual network isolation

  instances are assigned tags
  used as VLAN identifiers
  tag incomming traffic
  forward packets that have the same VLAN
Experiment
Small research Linux cluster:
  System description:
       7(compute nodes) + 1(head node) of Intel Xeon 3.2GHz, 3GB
       of RAM, 40GB single SCSI disk
       single CLC
       single CC
       one NC per node
  Access and restrictions
       Granted by the CLC through user signup web page
       Allocation is limited to 4 instances killed after 6 hours
       Reverse firewall for preventing EPC accesing external network
       addresses (except Linux distribution sites)
  Ambient induced load
Comparison - Instance throughput
Comparison - Instance throughput
Comparison - Instance throughput
Comparison - Network performance




             Eucalyptus vs. EC2
Conclusions

  Eucalyptus was designed from the ground up to be easy to install
  and as non-intrusive as possible (can be installed on a laptop for
  experimentation)
  Highly modular
  External interface is based on highly popular API of Amazon
  (industry standard interface)
  Unique among the open-source offerings in providing a virtual
  network overlay
Questions?




             Anyone?
Challenges

Extensibility
   Simple architecture and open internal APIs

Client-side interface
   Amazon’s EC2 interface and functionality (familiar and testable)

Networking
   Virtual private network per cloud
   Must function as an overlay => cannot supplant local networking

Security
   Must be compatible with local security policies

Packaging, installation, maintenance
   system administration staff is an important constituency for
   uptake
EC2 Compatibility
Interface is based on Amazon’s published WSDL
   2008 compliant except for
       static IP address assignment
       Security groups
   “Availability” zones correspond to individual clusters
   Uses the EC2 command-line tools downloaded from Amazon
   REST interface

S3 support/emulation: not yet, but on its way
   Images accessed by file system name instead of S3 handle for
   the moment
       Unless user wants to use the actual S3 and pay for the
       egress charges

System administration is different
   Eucalyptus defines its own Cloud Admin. tool set for user
Networking
Eucalyptus does not assume that all worker nodes will have
publicly routable IP addresses
   Each cloud allocation will have one or more public IP addresses
   All cloud images have access to a private network interface

Two types of networks internal to a cloud allocation
   Virtual private network
       Uses VDE interfaced to Xen that is set up dynamically
       Substantial performance hit within a cluster
       Allows a cloud allocation to span clusters
   High-performance private network (availability zone)
       Bypasses VDE and uses local cluster network for each
       allocation
       Runs at “native” network speed (I.e. with Xen)
       Cloud allocations cannot span clusters
Performance of the Virtual Network
Security
All Eucalyptus components use WS-security for authentication
   Encryption of inter-component communication is not enabled by
   default
       Configuration option

Ssh key generation and installation ala EC2 is implemented
   Cloud controller generates the public/private key pairs and installs
   them

User sign-up is web based
   User specifies a password and submits sign-up request
   Cert is generated but withheld until admin. approves request
   User gains access to cert. through password-protected web page
       Similar to EC2 model without the credit cards
Packaging, Installation, and Deployment
Rocks
   “One-button” install per cluster
   Requires Rocks V (the most current release) for Xen
   support
   If you know what you are doing, RPMs can be extracted
   and installed manually
   Multiple clusters requires a configuration file
        Multi-cluster configuration tools ala Rocks not readily
        available
Build-from-source
   “Many-button” install
        Instructions, scripts, rsync, and perseverance
Single-machine “cloud”

Eucalyptus - An Open-source Infrastructure for Cloud Computing

  • 1.
    An Open-source Infrastructurefor Cloud Computing Tudor Zaharia
  • 2.
    Overview of thepresentation What is a Cloud? Public Clouds Open Source Cloud Infrastructure Eucalyptus What is it? Goals Features Design and architecture Experiment Conclusions
  • 3.
    What is aCloud? SLAs Web Services Virtualization
  • 4.
    Public Clouds (Now) Largescale infrastructure available on a rental basis Operating System virtualization (e.g. Xen) provides CPU isolation Network isolation Locally specific storage abstractions Fully customer self-service Service Level Agreements (SLAs) are advertized Requests are accepted and resources granted via web services Customers access resources remotely via the Internet Accountability is e-commerce based Web-based transaction “Pay-as-you-go” and flat-rate subscription Customer service, refunds, etc.
  • 5.
    Open Source CloudInfrastructure Simple Transparent Scalable => complexity often limits scalability Extensible New application classes and service classes may require new features Clouds are new => need to extend while retaining useful features Commodity-based Must leverage extensive catalog of open source software offerings New, unstable, and unsupported infrastructure design is a barrier to uptake, experimentation, and adoption Easy To install => system administration time is expensive To maintain => system maintaining time is really expensive
  • 6.
  • 7.
    Eucalyptus Elastic Utility ComputingArchitecture Linking Your Programs To Useful Systems
  • 8.
    Eucalyptus - whatdoes it brings? • Web services based implementation of elastic/utility/cloud computing infrastructure — Linux image hosting ala Amazon • How do we know if it is a cloud? — Try and emulate an existing cloud: Amazon AWS • Functions as a software overlay — Existing installation should not be violated (too much) • Focus on installation and maintenance — “System Administrators are people too.”
  • 9.
  • 10.
    Features Current release isversion 1.5.1: Interface compatibility with EC2 (both Web service and Query interfaces) Support for the KVM hypervisor Stand-alone RPMs for non-Rocks RPM based systems Secure internal communication using SOAP with WS-security Elastic Block Store (EBS) compatible storage service Basic "Cloud Administrator" tools for system management and user accounting The ability to configure multiple clusters, each with private internal network addresses, into a single Cloud.
  • 11.
    Installation prerequisites Support XENvirtualization Host Web services
  • 12.
  • 13.
  • 14.
    Architecture Hierarchical design reflectingunderlying resources: Cloud controller
  • 15.
    Cloud Controller user/admin entry point high-level VM instance scheduling decisions processing SLAs maintaining system and user metadata
  • 16.
    Cloud Controller asa service provider handle user requests and authentication handle persistent system and user metadata (VM images and ssh keys) manage and monitor VM instances Web interfaces and Query interfaces
  • 17.
    Architecture Hierarchical design reflectingunderlying resources: Cluster controller
  • 18.
    Cluster Controller groups collections of Node Controllers that logically belong together gathers state information from its nodes schedules incoming VM instance execution requests manages the configuration of public and private instance networks described as WSDL
  • 19.
    Architecture Hierarchical design reflectingunderlying resources: Node controller
  • 20.
    Node controller executes on the physical resources instance startup inspection shutdown cleanup only one NC per physical machine is needed (regardless on the number of VM's on that machine) calls the hypervisor (XEN) to control and inspect running instances.
  • 21.
    Virtual networking Things toaddress: Performance Isolation Connectivity
  • 22.
    Virtual network connectivity Eachinstance is given: one "public" network interface handles communication to "outside" world bridged to real network device one "private" network interface used only for inter - VM communication CC deals with set-up and tear down of virtual network
  • 23.
    Virtual network isolation instances are assigned tags used as VLAN identifiers tag incomming traffic forward packets that have the same VLAN
  • 24.
    Experiment Small research Linuxcluster: System description: 7(compute nodes) + 1(head node) of Intel Xeon 3.2GHz, 3GB of RAM, 40GB single SCSI disk single CLC single CC one NC per node Access and restrictions Granted by the CLC through user signup web page Allocation is limited to 4 instances killed after 6 hours Reverse firewall for preventing EPC accesing external network addresses (except Linux distribution sites) Ambient induced load
  • 25.
  • 26.
  • 27.
  • 28.
    Comparison - Networkperformance Eucalyptus vs. EC2
  • 29.
    Conclusions Eucalyptuswas designed from the ground up to be easy to install and as non-intrusive as possible (can be installed on a laptop for experimentation) Highly modular External interface is based on highly popular API of Amazon (industry standard interface) Unique among the open-source offerings in providing a virtual network overlay
  • 30.
    Questions? Anyone?
  • 31.
    Challenges Extensibility Simple architecture and open internal APIs Client-side interface Amazon’s EC2 interface and functionality (familiar and testable) Networking Virtual private network per cloud Must function as an overlay => cannot supplant local networking Security Must be compatible with local security policies Packaging, installation, maintenance system administration staff is an important constituency for uptake
  • 32.
    EC2 Compatibility Interface isbased on Amazon’s published WSDL 2008 compliant except for static IP address assignment Security groups “Availability” zones correspond to individual clusters Uses the EC2 command-line tools downloaded from Amazon REST interface S3 support/emulation: not yet, but on its way Images accessed by file system name instead of S3 handle for the moment Unless user wants to use the actual S3 and pay for the egress charges System administration is different Eucalyptus defines its own Cloud Admin. tool set for user
  • 33.
    Networking Eucalyptus does notassume that all worker nodes will have publicly routable IP addresses Each cloud allocation will have one or more public IP addresses All cloud images have access to a private network interface Two types of networks internal to a cloud allocation Virtual private network Uses VDE interfaced to Xen that is set up dynamically Substantial performance hit within a cluster Allows a cloud allocation to span clusters High-performance private network (availability zone) Bypasses VDE and uses local cluster network for each allocation Runs at “native” network speed (I.e. with Xen) Cloud allocations cannot span clusters
  • 34.
    Performance of theVirtual Network
  • 35.
    Security All Eucalyptus componentsuse WS-security for authentication Encryption of inter-component communication is not enabled by default Configuration option Ssh key generation and installation ala EC2 is implemented Cloud controller generates the public/private key pairs and installs them User sign-up is web based User specifies a password and submits sign-up request Cert is generated but withheld until admin. approves request User gains access to cert. through password-protected web page Similar to EC2 model without the credit cards
  • 36.
    Packaging, Installation, andDeployment Rocks “One-button” install per cluster Requires Rocks V (the most current release) for Xen support If you know what you are doing, RPMs can be extracted and installed manually Multiple clusters requires a configuration file Multi-cluster configuration tools ala Rocks not readily available Build-from-source “Many-button” install Instructions, scripts, rsync, and perseverance Single-machine “cloud”