EUCALYPTUS: An Open Source Infrastructure for Elastic Computing Research Rich Wolski Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil Soman, Lamia Youseff, Dmitrii Zagorodnov Computer Science Department University of California, Santa Barbara
Exciting Weather Forecasts
Commercial Cloud Formation
What is a Cloud? SLAs Web Services Virtualization
How do they work? What can and cannot easily be hosted in a cloud? What extensions or modifications are required to support a wider variety of services and applications? Scientific computing Data assimilation Multiplayer gaming How can cloud computing be coupled with other distributed software systems and infrastructure? How should clouds and mobile devices (e.g. cell phones) interact? Open Source Cloud Simple Extensible Based on widely available and popular technologies Easy to install and maintain
The Skies are Opening Nimbus (Freeman and Keahey, University of Chicago) Client-side cloud-computing interface to Globus-enabled TeraPort cluster at U of C Based on GT4 and the Globus Virtual Workspace Service Lots of cool features Great if local resources are GT4 proficient Tutorials and documentation in “grid space” Enomalism Start-up company distributing open source  REST APIs User “dashboard” Multi-virtulaization support Lost of extended cloud services Beta version now available for download from SourceForge
E lastic  U tility  C omputing  A rchitecture  L inking  Y our  P rograms  T o  U seful  S ystems Web services based implementation of elastic/utility/cloud computing infrastructure Linux image hosting ala Amazon  How do we know if it is a cloud? Try and emulate an existing cloud: EC2 + S3 Works with command-line tools from Amazon w/o modification Enables leverage of emerging EC2 value-added service venues (e.g. Rightscale) Functions as a software overlay Existing installation should not be violated (too much) “ One-button” install using Rocks “ System Administrators are people too.”
Goals for Eucalyptus Foster research in elastic/cloud/utility computing  models of service provisioning, scheduling, SLA formulation, hypervisor portability and feature enhancement, etc. Experimentation vehicle prior to buying commercial services “ Tech Preview” using local machines with local system administration support Provide a debugging and development platform for EC2 (and other clouds) Allow the environment to be set up and tested before it is instantiated in a for-fee environment Provide a basic software development platform for the open source community E.g. the “Linux Experience” Not a designed as a replacement technology for EC2 or any other cloud service
Challenges Extensibility Simple architecture and open internal APIs Client-side interface Amazon’s EC2 interface and functionality (familiar and testable) Networking Virtual private network per cloud Must function as an overlay => cannot supplant local networking Security Must be compatible with local security policies Packaging, installation, maintenance system administration staff is an important constituency for uptake
Eucalyptus Architecture: WS-Cloud Client-side API Translator Cloud Controller Cluster Controller Node Controller Amazon EC2 Interface Database
EC2 Compatibility Interface is based on Amazon’s published WSDL 2008 compliant except for  static IP address assignment Security groups “ Availability” zones correspond to individual clusters Uses the EC2 command-line tools downloaded from Amazon REST interface  S3 support/emulation: not yet, but on its way Images accessed by file system name instead of S3 handle for the moment Unless user wants to use the actual S3 and pay for the egress charges System administration is different Eucalyptus  defines its own Cloud Admin. tool set for user accounting and cloud management
Networking Eucalyptus  does not assume that all worker nodes will have publicly routable IP addresses Each cloud allocation will have one or more public IP addresses All cloud images have access to a private network interface Two types of networks internal to a cloud allocation Virtual private network Uses VDE interfaced to Xen that is set up dynamically Substantial performance hit within a cluster Allows a cloud allocation to span clusters High-performance private network (availability zone) Bypasses VDE and uses local cluster network for each allocation Runs at “native” network speed (I.e. with Xen) Cloud allocations cannot span clusters Availability zone approach fits with Amazon’s high-level semantics
Virtual Network: Ethernet Overlay ssl vde vde vde vde vde vde vde vde
Performance of the Virtual Network
Security All  Eucalyptus  components use WS-security for authentication Encryption of inter-component communication is not enabled by default Configuration option Ssh key generation and installation ala EC2 is implemented Cloud controller generates the public/private key pairs and installs them  User sign-up is web based User specifies a password and submits sign-up request Cert is generated but withheld until admin. approves request User gains access to cert. through password-protected web page Similar to EC2 model without the credit cards
Packaging, Installation, and Deployment Rocks “ One-button” install per cluster Requires Rocks V (the most current release) for Xen support If you know what you are doing, RPMs can be extracted and installed manually Multiple clusters requires a configuration file Multi-cluster configuration tools ala Rocks not readily available Build-from-source “ Many-button” install Instructions, scripts, rsync, and perseverance Single-machine “cloud” All components run in dom0 Need to resolve port-conflicts by hand
What’s it Made Out Of? Axis2 and Axis2c version 1.4.0 Hibernate 3.2.2 HSQLDB 1.8.0 jetty 6.1.9 JiBX (March 30th sourceforge) Mule 2.0.1 Rampart version 1.3 libvirt version 0.4.2 socat-1.6.0 VDE version 2.2.0-pre2
Eucalyptus Public Cloud Free, time limited access to a  Eucalyptus  installation at UCSB Only installed images can be run (i.e. no image uploading) 4 VM limit 6 hour limit Reverse firewall Configuration 8 Pentium Xeon processors (3.2 GHz) 2.5 GB of memory per image 36 GB of disk space 1 Gb enet interconnect Local availability zone only (i.e. no VDE) Debian 4.0, Linux v2.6.18-xen-3.1 Xen 3.2 Demo
EC2 and EPC Throughput
EC2 and EPC RTT
Single Instance
Four Instances
Eight Instances
Version History Eucalyptus  version 1.0 became available for public release 5/28/08 (Rocks binary only) Version is 1.1 shipped 7/1/2008 Bug fixes Decent WS-security implementation REST interface Source code release Build-from source “guidance” scripts and instructions Version 1.2 shipped 8/1/2008 Primarily a bug-fix release Upgrade mechanism (instead of re-install) Version 1.3 shipped 8/23/2008 Amazon changed their client-side tools
Next Releases Version 1.4 (expected 11/5/2008) S3 support uses local file system Administrator definable SLAs Cross cluster layer 2 networking Elastic IPs and security groups, metadata service User-defined image management and registration Version 1.5 (expected 1/1/09) Elastic Block Store (EBS) VLAN safe layer 3 networking Credential federation support DB managed configuration support Distributed DB state management (maybe) Should be fully 2008 interface compatible in Release 1.5
Next Generation Eucalyptus Networking Multiple networking implementations Open Source + academic environment == overlay or nothing Some sites are willing to tolerate a more invasive networking approach in exchange for performance and scalability Three different approaches Exploit Xen network interface isolation and VLANS + software only approach - will make Eucalyptus more Xen dependent IP-tables and NATs + high-level software only approach - possible conflicts with existing IP-tables configuration(s) Hardware-supported VLANs and trunking + fast and scalable - requires on-line access to VLAN configuration interface
More Plans Hypervisor religiosity and secularism Current implementation uses a subset of the libvirt interface Xen, VMWare, kvm Eucalyptus + Xen + VMWare “works” but is clearly not the right answer HyperV Initial study makes it look quite doable for virtualization support Understanding the networking is next on the list Port of the Eucalyptus components to .Net UCSB Campus Cloud(s) UC Cyberinfrastructure pilot Test installation up at California Nanosystems Institute (CNSI) Leverage UCSB VMWare installation and Eucalyptus installation at SDSC Requires a very rich user accounting system
Ancillary Projects Google App Engine AppDrop will run App Engine inside EC2 Port AppDrop to Eucalyptus Port App Engine to Hbase and/or Hypertable Should provide an interesting research vehicle Rightscale Local enterprise focused on providing Ruby-on-Rails infrastructure for EC2 “ Turing Test” for Eucalyptus Can Rightscale “tell” that it isn’t talking to EC2? Requires that the REST interface be solid Testing now against the EPC
Clouds Versus Grids Clouds and Grids are distinct Cloud Full private cluster is provisioned Individual user can only get a tiny fraction of the total resource pool No support for cloud federation except through the client interface Opaque with respect to resources Grid Built so that individual users can get most, if not all of the resources in a single request Middleware approach takes federation as a first principle Resources are exposed, often as bare metal These differences mandate different architectures for each
Lessons Learned so Far Open source for cloud computing constrains design more than we thought it would More of the technical challenge centers on dealing with local configuration choices Multi-cluster service ensemble really isn’t a typical open source tool Do we really need a laptop edition? Administrators in the “real world” still build clusters by hand We thought the use of Rocks early on would make us heroes -- it hasn’t In HPC space, admin time is *really* expensive There are few, if any, cloud configuration tools available Red Hat, Debian, CentOS, Ubuntu => linux packaging and deployment Rocks => cluster packaging and deployment ??? => cloud packaging and deployment?
Thanks, More Information, and Help! National Science Foundation VGrADS Project SDSC, CNSI, IU, Rice University RightScale.com  The Eucalyptus Development Team at UCSB is Chris Grzegorczyk Dan Nurmi Graziano Obertelli Shriram Rajagopalan Sunil Soman Lamia Youseff Dmitrii Zagordnov Rich<no_spam>@cs.ucsb.edu http://eucalyptus.cs.ucsb.edu

Eucalyptus: Open Source for Cloud Computing

  • 1.
    EUCALYPTUS: An OpenSource Infrastructure for Elastic Computing Research Rich Wolski Chris Grzegorczyk, Dan Nurmi, Graziano Obertelli, Shriram Rajagopalan, Sunil Soman, Lamia Youseff, Dmitrii Zagorodnov Computer Science Department University of California, Santa Barbara
  • 2.
  • 3.
  • 4.
    What is aCloud? SLAs Web Services Virtualization
  • 5.
    How do theywork? What can and cannot easily be hosted in a cloud? What extensions or modifications are required to support a wider variety of services and applications? Scientific computing Data assimilation Multiplayer gaming How can cloud computing be coupled with other distributed software systems and infrastructure? How should clouds and mobile devices (e.g. cell phones) interact? Open Source Cloud Simple Extensible Based on widely available and popular technologies Easy to install and maintain
  • 6.
    The Skies areOpening Nimbus (Freeman and Keahey, University of Chicago) Client-side cloud-computing interface to Globus-enabled TeraPort cluster at U of C Based on GT4 and the Globus Virtual Workspace Service Lots of cool features Great if local resources are GT4 proficient Tutorials and documentation in “grid space” Enomalism Start-up company distributing open source REST APIs User “dashboard” Multi-virtulaization support Lost of extended cloud services Beta version now available for download from SourceForge
  • 7.
    E lastic U tility C omputing A rchitecture L inking Y our P rograms T o U seful S ystems Web services based implementation of elastic/utility/cloud computing infrastructure Linux image hosting ala Amazon How do we know if it is a cloud? Try and emulate an existing cloud: EC2 + S3 Works with command-line tools from Amazon w/o modification Enables leverage of emerging EC2 value-added service venues (e.g. Rightscale) Functions as a software overlay Existing installation should not be violated (too much) “ One-button” install using Rocks “ System Administrators are people too.”
  • 8.
    Goals for EucalyptusFoster research in elastic/cloud/utility computing models of service provisioning, scheduling, SLA formulation, hypervisor portability and feature enhancement, etc. Experimentation vehicle prior to buying commercial services “ Tech Preview” using local machines with local system administration support Provide a debugging and development platform for EC2 (and other clouds) Allow the environment to be set up and tested before it is instantiated in a for-fee environment Provide a basic software development platform for the open source community E.g. the “Linux Experience” Not a designed as a replacement technology for EC2 or any other cloud service
  • 9.
    Challenges Extensibility Simplearchitecture and open internal APIs Client-side interface Amazon’s EC2 interface and functionality (familiar and testable) Networking Virtual private network per cloud Must function as an overlay => cannot supplant local networking Security Must be compatible with local security policies Packaging, installation, maintenance system administration staff is an important constituency for uptake
  • 10.
    Eucalyptus Architecture: WS-CloudClient-side API Translator Cloud Controller Cluster Controller Node Controller Amazon EC2 Interface Database
  • 11.
    EC2 Compatibility Interfaceis based on Amazon’s published WSDL 2008 compliant except for static IP address assignment Security groups “ Availability” zones correspond to individual clusters Uses the EC2 command-line tools downloaded from Amazon REST interface S3 support/emulation: not yet, but on its way Images accessed by file system name instead of S3 handle for the moment Unless user wants to use the actual S3 and pay for the egress charges System administration is different Eucalyptus defines its own Cloud Admin. tool set for user accounting and cloud management
  • 12.
    Networking Eucalyptus does not assume that all worker nodes will have publicly routable IP addresses Each cloud allocation will have one or more public IP addresses All cloud images have access to a private network interface Two types of networks internal to a cloud allocation Virtual private network Uses VDE interfaced to Xen that is set up dynamically Substantial performance hit within a cluster Allows a cloud allocation to span clusters High-performance private network (availability zone) Bypasses VDE and uses local cluster network for each allocation Runs at “native” network speed (I.e. with Xen) Cloud allocations cannot span clusters Availability zone approach fits with Amazon’s high-level semantics
  • 13.
    Virtual Network: EthernetOverlay ssl vde vde vde vde vde vde vde vde
  • 14.
    Performance of theVirtual Network
  • 15.
    Security All Eucalyptus components use WS-security for authentication Encryption of inter-component communication is not enabled by default Configuration option Ssh key generation and installation ala EC2 is implemented Cloud controller generates the public/private key pairs and installs them User sign-up is web based User specifies a password and submits sign-up request Cert is generated but withheld until admin. approves request User gains access to cert. through password-protected web page Similar to EC2 model without the credit cards
  • 16.
    Packaging, Installation, andDeployment Rocks “ One-button” install per cluster Requires Rocks V (the most current release) for Xen support If you know what you are doing, RPMs can be extracted and installed manually Multiple clusters requires a configuration file Multi-cluster configuration tools ala Rocks not readily available Build-from-source “ Many-button” install Instructions, scripts, rsync, and perseverance Single-machine “cloud” All components run in dom0 Need to resolve port-conflicts by hand
  • 17.
    What’s it MadeOut Of? Axis2 and Axis2c version 1.4.0 Hibernate 3.2.2 HSQLDB 1.8.0 jetty 6.1.9 JiBX (March 30th sourceforge) Mule 2.0.1 Rampart version 1.3 libvirt version 0.4.2 socat-1.6.0 VDE version 2.2.0-pre2
  • 18.
    Eucalyptus Public CloudFree, time limited access to a Eucalyptus installation at UCSB Only installed images can be run (i.e. no image uploading) 4 VM limit 6 hour limit Reverse firewall Configuration 8 Pentium Xeon processors (3.2 GHz) 2.5 GB of memory per image 36 GB of disk space 1 Gb enet interconnect Local availability zone only (i.e. no VDE) Debian 4.0, Linux v2.6.18-xen-3.1 Xen 3.2 Demo
  • 19.
    EC2 and EPCThroughput
  • 20.
  • 21.
  • 22.
  • 23.
  • 24.
    Version History Eucalyptus version 1.0 became available for public release 5/28/08 (Rocks binary only) Version is 1.1 shipped 7/1/2008 Bug fixes Decent WS-security implementation REST interface Source code release Build-from source “guidance” scripts and instructions Version 1.2 shipped 8/1/2008 Primarily a bug-fix release Upgrade mechanism (instead of re-install) Version 1.3 shipped 8/23/2008 Amazon changed their client-side tools
  • 25.
    Next Releases Version1.4 (expected 11/5/2008) S3 support uses local file system Administrator definable SLAs Cross cluster layer 2 networking Elastic IPs and security groups, metadata service User-defined image management and registration Version 1.5 (expected 1/1/09) Elastic Block Store (EBS) VLAN safe layer 3 networking Credential federation support DB managed configuration support Distributed DB state management (maybe) Should be fully 2008 interface compatible in Release 1.5
  • 26.
    Next Generation EucalyptusNetworking Multiple networking implementations Open Source + academic environment == overlay or nothing Some sites are willing to tolerate a more invasive networking approach in exchange for performance and scalability Three different approaches Exploit Xen network interface isolation and VLANS + software only approach - will make Eucalyptus more Xen dependent IP-tables and NATs + high-level software only approach - possible conflicts with existing IP-tables configuration(s) Hardware-supported VLANs and trunking + fast and scalable - requires on-line access to VLAN configuration interface
  • 27.
    More Plans Hypervisorreligiosity and secularism Current implementation uses a subset of the libvirt interface Xen, VMWare, kvm Eucalyptus + Xen + VMWare “works” but is clearly not the right answer HyperV Initial study makes it look quite doable for virtualization support Understanding the networking is next on the list Port of the Eucalyptus components to .Net UCSB Campus Cloud(s) UC Cyberinfrastructure pilot Test installation up at California Nanosystems Institute (CNSI) Leverage UCSB VMWare installation and Eucalyptus installation at SDSC Requires a very rich user accounting system
  • 28.
    Ancillary Projects GoogleApp Engine AppDrop will run App Engine inside EC2 Port AppDrop to Eucalyptus Port App Engine to Hbase and/or Hypertable Should provide an interesting research vehicle Rightscale Local enterprise focused on providing Ruby-on-Rails infrastructure for EC2 “ Turing Test” for Eucalyptus Can Rightscale “tell” that it isn’t talking to EC2? Requires that the REST interface be solid Testing now against the EPC
  • 29.
    Clouds Versus GridsClouds and Grids are distinct Cloud Full private cluster is provisioned Individual user can only get a tiny fraction of the total resource pool No support for cloud federation except through the client interface Opaque with respect to resources Grid Built so that individual users can get most, if not all of the resources in a single request Middleware approach takes federation as a first principle Resources are exposed, often as bare metal These differences mandate different architectures for each
  • 30.
    Lessons Learned soFar Open source for cloud computing constrains design more than we thought it would More of the technical challenge centers on dealing with local configuration choices Multi-cluster service ensemble really isn’t a typical open source tool Do we really need a laptop edition? Administrators in the “real world” still build clusters by hand We thought the use of Rocks early on would make us heroes -- it hasn’t In HPC space, admin time is *really* expensive There are few, if any, cloud configuration tools available Red Hat, Debian, CentOS, Ubuntu => linux packaging and deployment Rocks => cluster packaging and deployment ??? => cloud packaging and deployment?
  • 31.
    Thanks, More Information,and Help! National Science Foundation VGrADS Project SDSC, CNSI, IU, Rice University RightScale.com The Eucalyptus Development Team at UCSB is Chris Grzegorczyk Dan Nurmi Graziano Obertelli Shriram Rajagopalan Sunil Soman Lamia Youseff Dmitrii Zagordnov Rich<no_spam>@cs.ucsb.edu http://eucalyptus.cs.ucsb.edu