An Empirical Performance Study of AppEngine and AppScale

Fei Dong, Yunjia Zhou, Xuanran Zong
Duke University


1    Introduction

In recent years, we have witnessed an increasing trend in cloud computing
usage. The prominence of cloud computing comes from its elasticity and its
"pay-as-you-go" charging model. On one hand, with cloud computing, small
enterprises do not need to make any up-front investment in infrastructure
and IT staff, saving cost and reducing the risk of over-provisioning; on the
other hand, cloud providers can multiplex workloads from many customers and
improve the utilization of their data centers.

In general, there are three types of cloud computing models: Software as a
Service (SaaS), Platform as a Service (PaaS), and Infrastructure as a
Service (IaaS). Each cloud provider chooses one model to build its cloud
infrastructure. For example, Amazon EC2 [1] follows the IaaS model, i.e., it
rents raw virtual machines to its customers. On the other end of the
spectrum, Google AppEngine [4] and Microsoft Azure are PaaS offerings,
because they merely provide an interface for their customers to host web
applications in the provider's data centers. While there are a couple of
public cloud services offered by commercial companies, academia is also
working hard to offer open-source cloud services. For instance, UCSB has
announced Eucalyptus [3], which mimics Amazon EC2, and AppScale [6], which
mimics Google AppEngine. Therefore, any party who owns a cluster can become
a public cloud provider by deploying either Eucalyptus or AppScale on that
cluster.

However, there is a caveat here: although Eucalyptus and AppScale can mimic
the functionality, do they also provide the same performance? In this
project, we tried to answer one small facet of this problem. First, we
attempt to compare the performance of Google AppEngine and AppScale, in
particular from a request-latency perspective. Second, we compare the
performance of the different databases supported by AppScale [5]. By
answering these two questions, we want to get a general sense of how good or
bad AppScale is compared to Google AppEngine.

The rest of this report is organized as follows. In Section 2, we briefly
introduce how AppEngine and AppScale work. In Section 3, we describe how we
deployed the systems. In Section 4, we present our experiment framework. In
Section 5, we report the experiments and their results. The last section
gives a brief conclusion.


2    Background

2.1    Google AppEngine

App Engine allows you to deploy your web applications to Google's highly
scalable infrastructure. Although the infrastructure is designed to scale,
there are a number of ways to optimize the performance of an application,
resulting in an improved user experience and lower resource consumption.
App Engine includes the following features:

   • dynamic web serving, with full support for common web technologies
   • persistent storage with queries, sorting and transactions
   • automatic scaling and load balancing
   • APIs for authenticating users and sending email using Google Accounts
   • a fully featured local development environment that simulates Google
     App Engine on your computer
   • task queues for performing work outside the scope of a web request
   • scheduled tasks for triggering events at specified times and regular
     intervals

Applications run in one of two runtime environments: the Java environment or
the Python environment. Each environment provides standard protocols and
common technologies for web application development, and each app is
allocated resources within limits.

With App Engine, Google takes care of everything for you. The App Engine
datastore provides distribution, replication, and load-balancing services
behind the scenes, freeing you up to focus on implementing your business
logic. App Engine's datastore is powered mainly by two Google services:
Bigtable and the Google File System (GFS). Bigtable is a highly distributed
and scalable service for storing and managing structured data. Bigtable uses
a non-relational object model to store entities, allowing you to create
simple, fast, and scalable applications. The datastore also uses GFS to
store data and log files. GFS is a scalable, fault-tolerant file system
designed for large, distributed, data-intensive applications. App Engine
uses the Java Persistence API (JPA) and Java Data Objects (JDO) interfaces
for modeling and persisting entities.

2.2    AppScale

AppScale was developed by UCSB in order to mimic the PaaS cloud model. It
offers almost the same application interface as Google AppEngine: the same
application structure, the same back-end data storage API (both use JDO),
and a very similar application deployment routine. Therefore, people can
switch their applications from one platform to the other without any
modification.

AppScale can be deployed on three platforms: a KVM-enabled cluster,
Eucalyptus, and EC2. AppScale provides a set of commands written in Ruby to
ease the deployment. The user only needs to build the image from source and
deploy the image. Though this sounds trivial at first glance, we actually
confronted a lot of issues when we deployed it on our department's
Eucalyptus cluster. We elaborate on this in Section 3.

Once we have deployed AppScale, we can upload GAE applications to the
system. Each application is served by three major components provided by
AppScale: the AppServer, the data storage, and the AppLoadBalancer (ALB),
which are very similar to the components involved in a 3-tier web system.
The ALB acts as an HTTP server and load balancer: when it receives a
request, it redirects the request to an AppServer that hosts the application
and initiates the connection. The AppServer is similar to the application
server in a typical web system in that it hosts the application and performs
the servlet processing. Lastly, the data storage holds all the persistent
data. AppScale gives users great flexibility to choose an appropriate
back-end database, including HBase, Hypertable, MySQL, Cassandra, Voldemort,
MongoDB, MemcacheDB, and Scalaris. More importantly, they all share the same
API provided by JDO, so users can run an application seamlessly on different
data storage back ends. In other words, users do not need to make any
modification to their code in order to accommodate a new back-end storage.
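
To make this portability concrete, the sketch below shows a minimal
guestbook-style entity and query written against the App Engine Python
datastore API (the Java runtime offers the equivalent JDO/JPA interfaces).
This is our own illustrative example, not code from the report; the Greeting
model and its fields are placeholders.

# Minimal sketch (illustrative only): a guestbook-style entity and query
# using the App Engine Python datastore API.  AppScale exposes the same
# API, so this code runs unchanged whether the back end is Bigtable on
# AppEngine or one of the datastores listed above on AppScale.
from google.appengine.ext import db


class Greeting(db.Model):
    author = db.StringProperty()
    content = db.StringProperty(multiline=True)
    date = db.DateTimeProperty(auto_now_add=True)


def add_greeting(author, content):
    # put() persists the entity to whichever datastore backs the deployment.
    Greeting(author=author, content=content).put()


def latest_greetings(limit=10):
    # GQL queries are likewise translated by the platform's datastore layer.
    return db.GqlQuery(
        "SELECT * FROM Greeting ORDER BY date DESC LIMIT %d" % limit)
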
$appscale-run-instances --min 1 --max 1 --file sample_apps/guestbook/
--machine emi-9F410FB6 --table memcachedb --infrastructure euca --keyname
dongfei --instance_type c1.xlarge -v --force

About to start AppScale over a cloud environment with the euca tools with
instance type c1.xlarge.

----------- repeat the time ----
Reported Public IPs: [192.168.1.35]
Reported Private IPs: [192.168.1.35]

Please wait for your instance to complete the bootup process.

New secret key is 1Zu22syYs2jKhs2nTpuKhm2bY2nJuV ft
"machine"=>"emi-9F410FB6", "keyname"=>"myapp", "ips"=>"",
"replication"=>"1", "instance_type"=>"c1.xlarge",
"ec2_access_key"=>"DK4LXEFhkcYf8vNztq0FhKXEF5mpW15vAinYfw",
"infrastructure"=>"euca", "table"=>"memcachedb", "min_images"=>"1",
"ec2_secret_key"=>"cfKFBU35lbdAg8soawO9NcumXQgqqUKh9aOSg", "appengine"=>"3",
"ec2_url"=>"http://152.3.144.15:8773/services/Eucalyptus",
"keypath"=>"myapp.key", "hostname"=>"192.168.1.35", "max_images"=>"1"

Head node successfully created at 192.168.1.35.
It is now starting up memcachedb via the command line arguments given.

Generating certificate and private key Copying over credentials for cloud
Starting server at 192.168.1.35 Please wait for the controller to finish
pre-processing tasks.

This AppScale instance is linked to an e-mail address giving it administrator
privileges.
Enter your desired administrator e-mail address: [dongfei@xxx.com]

The new administrator password must be at least six characters long and can
include non-alphanumeric characters.

Enter your new password: [xxxxxx]
Enter again to verify: [xxxxxx]
Please wait for AppScale to prepare your machines for use.

[Blocked here]

dongfei@dbc1-03:~$ euca-describe-instances |grep emi-9F410FB6

INSTANCE    i-47D0099E    emi-9F410FB6    192.168.1.34    192.168.1.34
running     cps212     0     c1.xlarge     2010-12-01T04:15:22.934Z
dukecs-pod1     eki-0AC4191A     eri-59AE1A00


                       Figure 1: AppScale log during instance launching

3    Deployment

At the beginning, we attempted to build an AppScale image from source and
launch AppScale instances on our local Eucalyptus cluster. We first used the
AppScale tools to create the AppScale image emi-9F410FB6 on dbc1-03 and ran
appscale-run-instances to launch an AppScale instance. The command
successfully launches an instance on the Eucalyptus cluster, but then blocks
at 'wait for your instance to complete the bootup process'. Figure 1 shows
the AppScale log when we execute appscale-run-instances. From the log we can
observe that AppScale launched two instances running the AppScale image on
the Eucalyptus cluster, and we could even log in to those instances. Yet the
process was blocked by some known post-preparation work. This is not the
correct status as described in [2]. We changed other configuration
parameters, but it still did not work. We read most of the online AppScale
documentation and did not find any clue.

Someone on the AppScale mailing list suggested that we run
appscale-run-instances on the master node. He said: 'can you try using the
AppScale tools that are installed on the master node (and if it's not there
at /usr/local/appscale-tools, can you install it)? Try a run-instances on
the master node and I think the networking problems should be ok'. So we ran
the command there, but AppScale was still blocked. Robin Hood told us: 'It
seems there is something wrong with your database. The instance needs at
least 1GB of memory to start the database daemon'.

Given the limited amount of time left, we decided to give up on the
Eucalyptus cluster and switch to Amazon EC2. Since EC2 already has a default
AppScale image, we did not have to build our own. AppScale runs smoothly on
EC2 for all the configuration parameters we have tried (different databases,
different numbers of instances, etc.). We attach a log of a successful
launch of the AppScale image in the Appendix.


4    Experiment Framework

We have two goals in this project:

   • to compare the performance of Google AppEngine and AppScale
   • to compare the performance of the various databases provided by
     AppScale

There are many possible performance metrics, such as throughput, goodput,
and latency. We selected request latency because it is the one that cloud
users care about most.

We developed an automated experiment framework to conduct a series of
experiments. We first wrote our core test script, benchmark.py, which takes
configuration parameters such as the number of requests and the concurrency.
The script is implemented using a thread pool, the size of which depends on
the concurrency. Each thread pulls a job from the job queue, executes it,
and measures the completion time. In our experiment, a job is defined as
sending a single request to our web application. After all the requests have
been sent, the script prints out some statistics, such as the mean request
latency. Figure 2 shows a sample output produced by running benchmark.py.
Note that this script is highly generic and extensible: you can define your
own customized job behavior and use the same script to conduct the
experiment.

[Figure 2: Sample benchmark.py output]
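
The report does not include benchmark.py itself, so the following is a
minimal sketch of the design described above, under our own assumptions: a
pool of worker threads pulls jobs from a shared queue, each job issues one
HTTP request to the application and records its completion time, and the
script prints the mean request latency at the end. It is written in
Python 2, matching the App Engine runtime of the time; the URL and parameter
values are placeholders.

# Minimal sketch of the benchmark.py design described above (not the
# original script).  Worker threads pull jobs from a queue; each job sends
# one HTTP request and records its completion time.
import Queue
import threading
import time
import urllib2


def run_benchmark(url, num_requests, concurrency):
    jobs = Queue.Queue()
    for _ in range(num_requests):
        jobs.put(url)

    latencies = []
    lock = threading.Lock()

    def worker():
        while True:
            try:
                target = jobs.get_nowait()
            except Queue.Empty:
                return
            start = time.time()
            urllib2.urlopen(target).read()     # one job = one request
            elapsed = time.time() - start
            with lock:
                latencies.append(elapsed)

    threads = [threading.Thread(target=worker) for _ in range(concurrency)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    print "mean request latency: %.3f s" % (sum(latencies) / len(latencies))


if __name__ == "__main__":
    # Placeholder URL and parameters; the actual experiments target the
    # deployed guestbook application with different request counts.
    run_benchmark("http://example.com/apps/guestbook", 100, 10)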

We then leverage this test script to send concurrent requests to the server
in order to generate the required workload. Note that we cannot use a single
machine to send the requests, because that would be equivalent to using a
single commodity desktop to saturate a commercial high-performance server.
Hence, we employ our department cluster machines as workers to send the
concurrent requests, and our local machine acts as a master that coordinates
the experiment on each worker. We use all the cluster machines (i.e.,
linux1.cs.duke.edu through linux30.cs.duke.edu). Each worker runs
benchmark.py with concurrency 10, so in total we have 10 × 30 = 300
outstanding concurrent requests to load the server. In addition, when we run
multiple threads on one worker, we notice a certain amount of overhead that
depends on the concurrency. To keep this overhead constant, we always fully
utilize all the threads of one machine before we use the next machine. In
summary, our framework works as follows (a sketch of the master-side driver
is given after the list):

   1. The master copies all the experiment-related scripts and
      configuration files to each worker.

   2. The master then issues a command to each worker simultaneously to
      invoke benchmark.py.

   3. In the end, every worker returns its experiment results to the
      master, which is then in charge of processing the final data.
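
The sketch below outlines the master-side driver implied by these three
steps. It is our own assumed implementation, not the report's scripts: it
presumes password-less SSH/SCP access to the workers, and the file paths and
benchmark.py flag names are placeholders; only the
linux1-linux30.cs.duke.edu hostnames come from the report.

# Assumed sketch of the master-side driver for the three steps above.
import subprocess

WORKERS = ["linux%d.cs.duke.edu" % i for i in range(1, 31)]


def run_experiment(app_url, requests_per_worker=100, concurrency=10):
    # Step 1: copy the benchmark script and configuration to every worker.
    for host in WORKERS:
        subprocess.check_call(
            ["scp", "benchmark.py", "bench.conf", "%s:/tmp/" % host])

    # Step 2: start benchmark.py on all workers simultaneously.
    procs = []
    for host in WORKERS:
        cmd = ("python /tmp/benchmark.py --url %s --requests %d "
               "--concurrency %d" % (app_url, requests_per_worker, concurrency))
        procs.append(subprocess.Popen(["ssh", host, cmd],
                                      stdout=subprocess.PIPE))

    # Step 3: collect each worker's output for final processing.
    return [p.communicate()[0] for p in procs]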

4.1    Request Workload

We further notice that even with thirty workers sending concurrent requests,
we still cannot saturate the server. The reason is that the application we
chose is too lightweight and does not impose much computation or storage
load. Therefore, we modified the application to make the workload heavier:
in our experiment, each request performs 1,000,000 floating-point operations
and ten SQL (select) queries. We claim this captures both the computation
and the data-processing performance offered by the cloud services.

We also attempted to measure the performance of the different databases
provided by AppScale. This involves comparing read, write, and delete
operations separately. Hence, we also wrote our own write and delete
workloads. The write workload issues 1000 data-insertion SQL queries (write
operations). The delete workload cleans all the existing data in the system,
which is typically just the 1000 entries we previously inserted.
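
As an illustration, the sketch below shows what such a per-request workload
might look like in an App Engine Python request handler: roughly 1,000,000
floating-point operations followed by ten datastore queries, mirroring the
numbers above. The handler class and the queried entity are our own
placeholders; the report's modified guestbook code is not included in the
paper.

# Illustrative sketch of the heavier per-request workload (assumed code,
# not the report's): ~1,000,000 floating-point operations plus ten
# read (select-style) datastore queries per request.
from google.appengine.ext import db, webapp


class LoadedHandler(webapp.RequestHandler):
    def get(self):
        acc = 0.0
        for i in range(1000000):              # ~1M floating-point operations
            acc += i * 0.5
        for _ in range(10):                   # ten read queries
            db.GqlQuery("SELECT * FROM Greeting LIMIT 10").fetch(10)
        self.response.out.write("done %f" % acc)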

5    Evaluation

5.1    AppEngine vs AppScale

In this experiment, we want to evaluate the scalability of AppEngine and
AppScale with different numbers of nodes.

First, we deployed the modified application described in Section 4.1 on both
Google AppEngine and on AppScale running on Amazon EC2 with 1, 2, and 4
instances respectively. Then we used our framework to load the server,
sending from 10 to 200 concurrent requests and measuring the mean request
latency. Figure 3 shows the performance in each case.

[Figure 3: Performance of AppEngine and AppScale. Mean request latency (ms)
versus the number of concurrent requests, for Google AppEngine and for
AppScale on EC2 with one, two, and four m1.large instances.]

As we can see, the mean request latency of Google AppEngine is quite low
compared to AppScale. Besides, AppEngine scales quite well: as the number of
concurrent requests increased, its performance remained fairly stable.
AppScale, however, is not as scalable as AppEngine; its latency increased
dramatically when we increased the number of concurrent requests.

Considering the dominant factor that may affect scalability, we also
evaluated the scalability of AppScale on different numbers of EC2 instances.
As we can see from Figure 3, the performance with 4 nodes is much better
than with 1 node. So we can draw the tentative conclusion that, as more
instances are added to the master-slave model, AppScale can handle more
concurrent requests.

It looks like AppEngine outperformed AppScale. However, we still hesitate to
state this as a firm conclusion, since we do not know how many instances
Google AppEngine uses behind the scenes. Google may have many machines
responding to our concurrent requests, which would explain the high
scalability. In other words, if we ran enough instances on AppScale, its
performance might be equally stable.

5.2    Data Storage Performance

In the experiment, we noticed that AppScale offers seven different database
interfaces for the client to choose from. So we decided to evaluate the
performance of the different types of databases under heavy-write,
heavy-read, and delete workloads respectively.

For the write workload, we use 10 threads to concurrently append 500-byte
data entries to the back-end data storage, inserting 1000 entries in total.
For the read workload, we make the server perform 1M floating-point
operations and 10 data queries per request, and we send 100 read requests in
total. For the delete workload, we simply remove all the existing data.
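
A sketch of what the write workload could look like, again under our own
assumptions: a request handler that appends one 500-byte entry per request,
so that driving it with benchmark.py at concurrency 10 for 1000 requests
approximates the "10 threads, 1000 entries" workload described above. The
entity class is a placeholder.

# Assumed sketch of the write workload (not the report's code): each
# request to this handler appends one 500-byte entry to the back-end
# data storage.
from google.appengine.ext import db, webapp


class BenchEntry(db.Model):          # placeholder entity, not from the report
    payload = db.TextProperty()


class WriteHandler(webapp.RequestHandler):
    def get(self):
        BenchEntry(payload="x" * 500).put()
        self.response.out.write("ok")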

[Figure 4: Read operations on a single instance]

Figure 4 shows the performance of four of the seven database options on a
single instance. We only show four because the remaining three databases
must be deployed in the master-slave model and cannot work on a single node.
From the graph, we notice that there are significant performance differences
between the databases; e.g., Voldemort and MongoDB are much worse than
MemcacheDB and Cassandra. Therefore, we need to be careful when choosing
which database to use.

[Figure 5: Read/Write/Delete operations on two instances: (a) write
operation, (b) read operation, (c) delete operation]

Figure 5 shows the comparison of the performance of the different databases
on two instances for the write, read, and delete operations respectively. As
we can see, MemcacheDB did quite well, while the latency of the Voldemort
database is quite long.

However, we cannot assert that the Voldemort database is slow and not
useful. We consulted some resources to figure out why it did not perform
well. A possible explanation is that it is a distributed store rather than a
single-node relational database, so individual operations carry extra
coordination overhead.

6    Conclusion

In this project, we performed an empirical performance study of AppEngine
and AppScale. We are interested in these two platforms because, from the
programming-interface perspective, AppScale closely mimics the AppEngine
API, and we wanted to reveal how much they differ in performance. From our
experimental results, AppEngine has much better performance, both in terms
of request latency and in terms of scalability. Moreover, we also conducted
experiments to compare the different data storage services provided by
AppScale. Our results show that the different storage services render
dramatically different performance. Hence, we need to choose carefully which
one to use when we host an application on AppScale. In the future, besides
the performance comparison, we may also explore the cost of AppEngine and
AppScale and investigate whether we can trade a little performance for lower
cost.


References

[1] Amazon Elastic Compute Cloud. http://aws.amazon.com/ec2.

[2] AppScale Documentation. http://code.google.com/p/appscale/wiki/Deploying_AppScale_1_4_via_Eucalyptus#Running_a_Sample_Application.

[3] Eucalyptus. http://open.eucalyptus.com.

[4] Google AppEngine. http://appengine.google.com.

[5] Bunch, C., Chohan, N., Krintz, C., Chohan, J., Kupferman, J.,
    Lakhina, P., Li, Y., and Nomura, Y. An evaluation of distributed
    datastores using the AppScale cloud platform.

[6] Chohan, N., Bunch, C., Pang, S., Krintz, C., Mostafa, N., Soman, S.,
    and Wolski, R. AppScale design and implementation, 2009.


A    Appendix I: Log from a successful instance launch

Please refer to Figure 6.

/usr/local/appscale-tools/bin/appscale-run-instances --min 4 --max 4 --file
/home/dongfei/cps212/cps212project/guestbook --machine ami-044fa56d --table
memcachedb --infrastructure ec2 --instance_type m1.large --keyname
cps212_2460_0 --force

About to start AppScale over a cloud environment with the ec2 tools with
instance type m1.large.

Run instances message sent successfully. Waiting for the image to start up.
[Sun Dec 12 20:35:56 -0500 2010] 1799.999987 seconds left until timeout...
[Sun Dec 12 20:36:19 -0500 2010] 1777.262273 seconds left until timeout...
/* time remaining log omitted */
[Sun Dec 12 20:40:50 -0500 2010] 1506.176784 seconds left until timeout...
[Sun Dec 12 20:41:13 -0500 2010] 1483.592267 seconds left until timeout...
[Sun Dec 12 20:41:36 -0500 2010] 1460.731472 seconds left until timeout...
Please wait for your instance to complete the bootup process.
Head node successfully created at ec2-67-202-41-213.compute-1.amazonaws.com.
It is now starting up memcachedb via the command line arguments given.
Generating certificate and private key
Copying over credentials for cloud
Starting server at ec2-67-202-41-213.compute-1.amazonaws.com
Please wait for the controller to finish pre-processing tasks.

This AppScale instance is linked to an e-mail address giving it administrator
privileges.
Enter your desired administrator e-mail address:
The new administrator password must be at least six characters long and can
include non-alphanumeric characters.
Enter your new password:
Enter again to verify:
Please wait for AppScale to prepare your machines for use.
AppController just started
Spawning up 1 virtual machines
Copying over needed files and starting the AppController on the other VMs
Setting up database configuration files
Starting up Load Balancer

Your user account has been created successfully.
Uploading cps212guestbook...
We have reserved the name cps212guestbook for your application.
cps212guestbook was uploaded successfully.
Please wait for your app to start up.

Your app can be reached at the following URL:
http://ec2-67-202-41-213.compute-1.amazonaws.com/apps/cps212guestbook
The status of your AppScale instance is at the following URL:
http://ec2-67-202-41-213.compute-1.amazonaws.com/status

                          Figure 6: A successful instance launch log
