Scalable and Open AppEngine
Development and Deployment
  Navraj Chohan       Chris Bunch
  Sydney Pang      Chandra Krintz
   Nagy Mostafa      Sunil Soman
            Rich Wolski
http://www.capgemini.com/technology-blog/2009/04/from_lamp_to_leap_and_beyond.php
Terminology

  Software-as-a-Service (SaaS)
      e.g., SalesForce, Gmail
Provides remote application access


  Platform-as-a-Service (PaaS)
    e.g., Google App Engine
 Provides scalable runtime stack


Infrastructure-as-a-Service (IaaS)
   e.g., Amazon Web Services
   Provides full system images
•  Open-source Platform-as-a-Service for research
   and engineering of cloud computing components,
   applications, and services

•  Automated deployment of applications to high-performance
   databases
•  Fine-grained control over application environment
•  Hosting of Google App Engine apps on your cluster
  –  Real applications
  –  Familiar API (that is extensible for lock-in avoidance)
  –  Your data and code on your resources
From Google App Engine (GAE)
          to AppScale
•  GAE Application Programming Interface
  –    Datastore (get/put)                      BigTable
  –    Memcache                                 Memcached
  –    URL Fetching
  –    Mail                                     GMail
  –    Images
  –    Authentication                           Google Accounts
•  Write Python/Java GAE app
  –  Use SDK locally to test and generate indexes
        •  APIs implemented as non-scalable, simple versions
  –  Upload to Google resources
        •  Highly scalable API implementation
Sandboxed Runtime
•    Restricted subset of library calls
•    No reading/writing from/to file system
•    Data persistence only via get/put interface
•    Computation bounded: 30 secs per request
•    Access to web services via HTTP / HTTPS
     only (ports 80 and 443)
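The outbound-access rule can be pictured as a simple URL check. This is a minimal sketch, not how Google App Engine enforces its sandbox: the function name `is_fetch_allowed` is hypothetical, and the only facts taken from the slide are the two schemes and the two ports.

```python
from urllib.parse import urlsplit

# From the slide: HTTP and HTTPS only, on ports 80 and 443.
DEFAULT_PORTS = {"http": 80, "https": 443}
ALLOWED_PORTS = {80, 443}


def is_fetch_allowed(url: str) -> bool:
    """Return True if url is HTTP(S) and targets port 80 or 443.

    Hypothetical helper for illustration only.
    """
    parts = urlsplit(url)
    if parts.scheme not in DEFAULT_PORTS:
        return False  # e.g. ftp:// or file://
    try:
        port = parts.port  # None when the URL has no explicit port
    except ValueError:
        return False  # malformed or out-of-range port
    if port is None:
        port = DEFAULT_PORTS[parts.scheme]
    return port in ALLOWED_PORTS
```

Under these assumptions `is_fetch_allowed("http://example.com:8080/")` is rejected, while the same host without an explicit port is accepted.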
Recent GAE Additions
•  Python and JVM SDKs
  –  JRuby, Clojure, etc. available through Java
•  Task Queue, Cron, XMPP APIs
•  New SLAs for paying customers
  –  $0.10 per CPU core hour
  –  $0.10 per GB bandwidth in
  –  $0.12 per GB bandwidth out
  –  $0.15 per GB data stored per month
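As a worked example of the rates above, the sketch below adds up one month of usage. The usage figures are invented for illustration, the resource names are hypothetical labels, and the calculation ignores any free quota or rounding rules the real billing system may apply.

```python
from decimal import Decimal

# Rates quoted on the slide, in US dollars.
RATES = {
    "cpu_core_hours": Decimal("0.10"),
    "gb_in": Decimal("0.10"),
    "gb_out": Decimal("0.12"),
    "gb_stored_month": Decimal("0.15"),
}


def monthly_cost(usage: dict) -> Decimal:
    """Sum rate * quantity over the metered resources in usage."""
    total = Decimal("0")
    for resource, quantity in usage.items():
        total += RATES[resource] * Decimal(str(quantity))
    return total


# Illustrative usage only: 100 core hours, 10 GB in, 20 GB out, 5 GB stored.
example = monthly_cost(
    {"cpu_core_hours": 100, "gb_in": 10, "gb_out": 20, "gb_stored_month": 5}
)
print(example)  # 10.00 + 1.00 + 2.40 + 0.75 = 14.15
```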
Protocol Buffers
•  Google App Engine’s internal data format
   –  And AppScale’s
•  Similar to C-style structs:

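syntax = "proto2";  // required/optional labels are proto2 syntax

message Person {
  required int32 id = 1;     // field number 1, must be set
  optional string name = 2;  // field number 2, may be omitted
}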
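To show what such a message looks like on the wire, here is a hand-rolled sketch of the Protocol Buffers binary encoding for the Person message. It does not use the protobuf library; `encode_varint` and `encode_person` are hypothetical helper names, and the sketch handles non-negative ids only.

```python
from typing import Optional


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint."""
    if value < 0:
        raise ValueError("this sketch handles non-negative values only")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # high bit set: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_person(person_id: int, name: Optional[str] = None) -> bytes:
    """Serialize Person{id, name} using the protobuf wire format."""
    # Field 1 (id), wire type 0 (varint): tag = (1 << 3) | 0
    out = encode_varint((1 << 3) | 0) + encode_varint(person_id)
    if name is not None:
        data = name.encode("utf-8")
        # Field 2 (name), wire type 2 (length-delimited): tag = (2 << 3) | 2
        out += encode_varint((2 << 3) | 2) + encode_varint(len(data)) + data
    return out


print(encode_person(150, "al").hex())  # 0896011202616c
```

Each field costs one tag byte plus its payload, and an unset optional field costs nothing.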
From Google App Engine (GAE)
          to AppScale
•  AppScale extends the GAE SDK
  –  Replaces the simple, non-scalable API implementation
     with pluggable, distributed, scalable components
     •  Using open-source solutions as available/possible
     •  Communication over SSL
•  Available as source and as system image
  –  Each instance can implement any component
     •  Self configuring as part of AppScale cloud deployment
  –  Deploys over
     •  Virtual machine monitors (Xen, KVM)
     •  Infrastructure (IaaS) cloud layers
IaaS Cloud Systems
•  Amazon Web Services (AWS)
   –  Elastic Compute Cloud (EC2), Persistent Storage (S3, EBS)
   –  For-fee, as negotiated in SLA (CPU, network, storage)
   –  Vast resources available
       •  Users access small (opaque) subset, can scale-out

•  Eucalyptus
   –  Open source implementation of the AWS APIs
   –  Inspiration for AppScale – familiar, widely-used API
      implementation for execution on your cluster
      •  Limited only by the hardware you have available
Differences in AppScale
          Deployment Options
•  Xen / KVM:
  –  Static deployment
     •  Can use as many nodes as are manually configured
•  Eucalyptus / EC2
  –  Dynamic deployment
     •  Can use as many nodes as the system can support (or as you
        pay for, in an EC2 deployment)
  –  As part of ongoing/future work: support for dynamic scaling
     •  Front-end (user-facing) & back-end (data management & computation)
     •  SLA renegotiation
AppScale System Layout
•  AppLoadBalancer (ALB)
•  AppServer (AS)
•  Database Master/Slave/Peer (DB M/S/P)

[Diagram labels: GAE App Developer (AppScale Admin), AppScale tools,
 App Controller, ALB, AS, DB M/P, DB S/P, GAE App Users, HTTPS]
AppController (AC)
•  SOAP Server written in Ruby
  –  Runs on all nodes
•  Middleware layer
•  Controls and sets up a node for use
  –  Sets up configuration files (data replication)
  –  Sets up firewall for security
•  Master AC “heartbeats” all other nodes
  –  Collects performance info as well
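The heartbeat idea can be sketched as a small bookkeeping class. This is not the AppController's Ruby/SOAP code: the class name, the timeout value, and the shape of the performance info are all assumptions made for illustration.

```python
import time


class HeartbeatMonitor:
    """Tracks when each node last answered a heartbeat (illustrative only)."""

    def __init__(self, nodes, timeout_s=30.0, clock=time.monotonic):
        self.timeout_s = timeout_s  # assumed value, not from the slides
        self.clock = clock
        now = clock()
        self.last_seen = {node: now for node in nodes}
        self.stats = {}  # latest performance info reported per node

    def record(self, node, perf_info=None):
        """Call when a node answers; optionally store its performance info."""
        self.last_seen[node] = self.clock()
        if perf_info is not None:
            self.stats[node] = perf_info

    def failed_nodes(self):
        """Nodes that have been silent for longer than the timeout."""
        now = self.clock()
        return sorted(
            node
            for node, seen in self.last_seen.items()
            if now - seen > self.timeout_s
        )
```

Passing the clock in as a parameter keeps the sketch testable without real waiting.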
AppLoadBalancer (ALB)
•  Ruby on Rails application
•  Handles authentication and routing of users
   to AppServers
•  Three copies are deployed via Mongrel
  –  Load balanced via nginx
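A rough picture of spreading requests over the three copies, assuming a plain round-robin policy. The slide does not say which policy nginx was configured with, and the backend addresses below are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hands out backends in a fixed rotation (illustrative only)."""

    def __init__(self, backends):
        backends = list(backends)
        if not backends:
            raise ValueError("at least one backend is required")
        self._cycle = itertools.cycle(backends)

    def next_backend(self):
        return next(self._cycle)


# Hypothetical addresses for the three Mongrel copies.
balancer = RoundRobinBalancer(
    ["127.0.0.1:8000", "127.0.0.1:8001", "127.0.0.1:8002"]
)
picks = [balancer.next_backend() for _ in range(4)]
# The fourth request wraps around to the first copy again.
```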
Database Management
•  Five databases currently available:
  –  HBase, Hypertable: Master / Slave
  –  Cassandra, Voldemort: Peer / Peer
  –  Clustered MySQL: Relational
•  Two main components
  –  Protocol Buffer Server: Data access / storage
  –  User / App Server: Authentication
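The two topologies imply different node roles at deployment time. The sketch below is one way to express that, assuming the first node listed becomes the master. `assign_roles` is a hypothetical helper, and clustered MySQL is left out because the slide describes it only as relational.

```python
# Topology per database, as listed on the slide.
TOPOLOGY = {
    "hbase": "master/slave",
    "hypertable": "master/slave",
    "cassandra": "peer/peer",
    "voldemort": "peer/peer",
}


def assign_roles(database, nodes):
    """Map each node name to a database role (illustrative only)."""
    if not nodes:
        raise ValueError("at least one node is required")
    topology = TOPOLOGY[database.lower()]  # KeyError for unlisted databases
    if topology == "master/slave":
        # Assumption: the first node is the master, the rest are slaves.
        roles = {nodes[0]: "master"}
        roles.update({node: "slave" for node in nodes[1:]})
        return roles
    return {node: "peer" for node in nodes}
```

With four nodes and HBase this yields one master and three slaves.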
AppServer (AS)
•  Modified Google App Engine SDK
•  App requests internally are Protocol Buffers
  –  Forwards requests to PB Server
•  Minimal request set:
  –  Put(id)
  –  Get(id)
  –  Query: Equivalent to get_all_in_table
  –  Delete(id)
  –  Count: Total number of items in database
  –  GetSchema
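The six-operation request set is small enough to mock in memory. This is a toy stand-in, not the PB Server: the class name and method signatures are hypothetical, and the meaning given to GetSchema (the property names seen in a table) is an assumption, since the slide does not define it.

```python
class InMemoryPBServer:
    """Toy datastore: one dict per table, keyed by entity id."""

    def __init__(self):
        self._tables = {}

    def put(self, table, entity_id, entity):
        self._tables.setdefault(table, {})[entity_id] = dict(entity)

    def get(self, table, entity_id):
        return self._tables.get(table, {}).get(entity_id)

    def query(self, table):
        # Equivalent to get_all_in_table: no filtering happens here.
        return list(self._tables.get(table, {}).values())

    def delete(self, table, entity_id):
        self._tables.get(table, {}).pop(entity_id, None)

    def count(self):
        # Total number of items across every table.
        return sum(len(rows) for rows in self._tables.values())

    def get_schema(self, table):
        # Assumed meaning: the property names seen in this table.
        names = set()
        for entity in self._tables.get(table, {}).values():
            names.update(entity)
        return sorted(names)
```

Because `query` returns the whole table, any filtering has to happen in the caller.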
AppScale Tools
•  Ruby scripts that initiate AppScale
   deployment
  –  Initializes the first AppController for use
  –  Uploads AppEngine app
•  Conceptually similar to the Amazon EC2
   tools
  –  describe-instances
  –  upload-app: Introduce additional apps
  –  terminate-instances
Fault Tolerance
•  System can survive the following failures:
  –  AppServer failure
  –  Database Slave failure
  –  Database Peer failure
  –  AppLoadBalancer failure *
  –  AppController failure *
Testing Methodology
•  Load testing done via the Grinder
•  Test specifics:
  –  Initially 3 users
  –  3 users added every 5 seconds
  –  Done until 160 seconds have passed
•  Each user navigates the page and performs
   some scripted action
•  Measured total transactions performed and
   average response time
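The ramp-up schedule translates into a simple formula for the number of simulated users at a given time. The sketch assumes a new batch is added at every 5-second mark, including the one at exactly 160 seconds, which the slide leaves ambiguous. `users_at` is a hypothetical helper.

```python
def users_at(t_seconds, initial=3, step=3, interval_s=5, duration_s=160):
    """Simulated users active t_seconds into the test."""
    if not 0 <= t_seconds <= duration_s:
        raise ValueError("time is outside the test window")
    batches_added = int(t_seconds // interval_s)
    return initial + step * batches_added


print(users_at(0))    # 3
print(users_at(5))    # 6
print(users_at(160))  # 3 + 3 * 32 = 99
```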
AppScale Evaluation Cluster
•  Three Grinder nodes, four AppScale nodes
  –  One master, three slaves
  –  Virtualized via Xen
  –  Database: HBase (3x replication) 64 MB HDFS blocks
     •  PBServer via Thrift; stores entire protocol buffers
•  Hardware
  –  Quad-core 2.66 GHz machines
  –  8 GB of RAM
  –  Connected via Gigabit Ethernet
Applications Tested
•  Tasks - a to-do list
   –  Read and write intensive (44 transactions per user)
•  Cccwiki – allows users to edit web pages
   –  Read intensive, updates only (74 transactions per
      user)
•  Guestbook – allows users to post messages
   –  Retrieves ten most recent posts only (9 transactions
      per user)
•  Shell – provides an interactive Python shell
   –  Compute intensive (14 transactions per user)
[Figure: Transactions per App]
[Figure: App Response Time]
[Figure: Comparison with Google]
Room for Improvement
•  Current bottlenecks:
  –  Queries perform filtering server-side
  –  Filtering is done outside of the DB
  –  AppEngine, PB Server are single-threaded
  –  Entry point to some DBs is single-threaded
•  Future work will address these problems
  –  Will also compare performance across DBs
  –  e.g., BigTable-like DBs vs. P2P DBs
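The filtering bottleneck is easy to see in miniature: when the database only returns whole tables, every row is transferred even if few match. The function below is a hypothetical illustration with made-up rows, not AppScale code.

```python
def query_with_external_filter(table_rows, predicate):
    """Fetch every row, then filter outside the database.

    Returns the matching rows and the number of rows transferred.
    """
    fetched = list(table_rows)  # the full table is pulled out first
    matches = [row for row in fetched if predicate(row)]
    return matches, len(fetched)


# Made-up data: 100 tasks, every tenth one marked done.
rows = [{"id": i, "done": i % 10 == 0} for i in range(100)]
matches, transferred = query_with_external_filter(rows, lambda r: r["done"])
# 10 rows match, but all 100 were transferred to find them.
```

Pushing the predicate into the database would cut the transfer to the matching rows only.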
Related Work
•  AppDrop
  –  Proof-of-concept Rails app
•  TyphoonAE
  –  Relatively new (alpha release)
  –  Runs MongoDB only
•  Microsoft Azure
  –  Uses .NET as the platform
  –  Has a similar pricing model to AppEngine
AppScale Recap
•  Distributed, multi-component system
   –  Deployed as a single system image (self
      configuring)
      •  Static deployment over Xen/KVM
      •  Dynamic deployment over Eucalyptus/EC2
•  Databases supported:
   –  HBase, Hypertable, MySQL, Cassandra,
      Voldemort
•  Fault-tolerant
AppScale Recap
•  Open cloud research platform
  –  International user community
•  Goals
  –  Easy to use and extend
  –  Automatic deployment of PaaS cloud and
     GAE apps on resources other than Google’s
  –  Support real applications and users
     •  Experimentation and testing in real environments
•  Current performance results are a baseline
Performance Improvements
•  AppEngine now multi-process, load balanced
•  PB Server now multi-threaded
•  Storing data like Google for HBase and
   Hypertable
  –  Three tables: Reference, Sort Ascending, Sort
     Descending
Future Work
•  Expand out of the web services domain
  –  Investigating opportunities in streaming
  –  Integrated MapReduce support for high-performance
     computing (HPC)
  –  Co-locate AppEngines and use shared
     memory
•  Additional databases:
  –  MongoDB, Scalaris, CouchDB
Thanks!
•  To the AppScale team!
     –  Co-lead Navraj Chohan
     –  Advisor Prof. Chandra Krintz
•    To the open-source community
•    To Google, NSF, and IBM for financial support
•    To you all for coming out today
•    Check us out on the web:
     –  http://appscale.cs.ucsb.edu

AppScale at CLOUDCOMP '09
