Google app engine


seminar report on google app engine(pdf)

Published in: Education


  1. 1. University of Pune A Seminar Report On ”Google App Engine” Submitted by Mr.Suraj Mehta Department of Computer Engineering KJ’S Educational Institute’s Pune - 411048 2012 - 2013
  2. 2. KJ’S Educational Institute’s Trinity College of Engineering and Research Department of Computer Engineering CERTIFICATE This is to certify that Mr. Suraj Mehta of Trinity College of Engineering and Research has submitted the Seminar Report entitled “Google App Engine”. He has satisfactorily completed and submitted the Seminar Report as prescribed by University of Pune for Third Year Computer Engineering for the Academic Year 2012 - 2013. Place: Pune Date: Seminar Guide HOD Department of Computer Engineering
  3. 3. University of Pune An Approval Sheet for Seminar Topic Sr.No. Seminar Topic Name Remark(Approved/Not Approved) 1 Google App Engine Submitted by Mr.Suraj Mehta Department of Computer Engineering KJ’S Educational Institute’s Pune - 411048 2012 - 2013
  4. 4. ACKNOWLEDGEMENT I wish to express my sincere gratitude to Prof. Pankthi, H.O.D. of the Computer Department of Trinity College of Engineering and Research, Pune, for providing me an opportunity to do my seminar work on Google App Engine. This project bears the imprint of many people. I sincerely thank my seminar guide for guidance and encouragement in carrying out this seminar work. I also wish to express my gratitude to the officials and other staff members of the seminar co-ordinator who rendered their help during the period of my seminar work. Last but not least, I wish to thank all our teachers and friends for their constructive comments, suggestions and criticism, and all those who directly or indirectly helped me in completing this seminar. Name of Student: Suraj Mehta
  5. 5. Abstract Introduction: Google App Engine was first released as a beta version in April 2008. It is a platform for developing and hosting web applications in Google-managed data centers, and it is a cloud computing technology: it lets users run their web applications on Google’s infrastructure. It is reliable because the failure of a single server affects neither the performance seen by the end user nor Google’s service, since applications are virtualized across multiple servers and data centers. The objective of the research presented in this paper was to investigate whether the Google App Engine cloud service can be used for free-of-charge execution of parameter study problems. This work-in-progress report also covers the status of a Windows-based tool, ‘Event Coding and Visualization of Data’ (ECOVRD), which allows real-time collaborative video annotation using Google App Engine (GAE) and the XMPP protocol. ECOVRD facilitates classification of live or video-recorded individual and team behavior. Users of ECOVRD can initiate a shared real-time video annotation with their social network (Google Buzz) via XMPP. The custom-built publish/subscribe architecture, wrapped around GAE’s Channel service, pushes data from the cloud to subscribed clients, resulting in a real-time collaborative experience. ECOVRD may be the first to successfully leverage the server push framework of GAE for desktop-based video annotation applications. The report further considers GAE for high-performance parallel computing. A generic master-slave framework enables fast prototyping and integration of parallel algorithms that are transparently scheduled and executed on the Google cloud infrastructure. Experiments demonstrated good scalability of a Monte Carlo simulation algorithm. Although this approach produced a substantial speedup, two main obstacles limited its performance: middleware overhead and resource quotas.
Scope: Google’s App Engine opens Google’s production infrastructure to any person in the world at no charge. Engineering a search engine is a challenging task: search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms, and they answer tens of millions of queries every day. Compared to Amazon Elastic Compute Cloud (EC2), GAE offers lower resource-provisioning overhead and is cheaper for jobs shorter than one hour. Innovativeness and Usefulness: Based on the provided Task Queue API, a simple and extensible framework implementing the master-worker model has been developed. It enables the App Engine application servers to be used as computational nodes and allows task execution to be monitored.
  6. 6. Contents
     1 INTRODUCTION
       1.1 Cloud Computing
       1.2 Cloud Computing: Service Delivery Models
       1.3 Google App Engine
     2 HOW DOES IT WORK?
       2.1 The Application Environment
       2.2 Service Provided By GAE
         2.2.1 Google Cloud Computing Services Google App Engine
       2.3 The Sandbox
       2.4 The Languages Runtime
         2.4.1 The Java Runtime Environment
         2.4.2 The Python Runtime Environment
     3 STORAGE MANAGEMENT
       3.1 The Datastore
       3.2 Google Accounts
         3.2.1 URL Fetch
         3.2.2 Mail
         3.2.3 Memcache
         3.2.4 Image Manipulation
         3.2.5 Scheduled Tasks and Task Queues
       3.3 Development Workflow
       3.4 Quotas and limits
     4 FRAMEWORK FOR PARAMETER STUDY
       4.1 Framework structure
       4.2 Preparing the program to run on App Engine infrastructure
     5 DISCUSSION
       5.1 Advantages
       5.2 Disadvantages
  7. 7. Contents (continued)
       5.3 Restriction
       5.4 Comparative Study
     6 CONCLUSION
     7 REFERENCES
  8. 8. List of Figures
     1.1 The Cloud [9]
     2.1 Services Provided By GAE
     2.2 App Engine Components
     3.1 Google DataStore Architecture
     3.2 Admin Console view
     3.3 Architecture of Google App Engine
     4.1 The computing environment based on App Engine Task Queue service and its functionality; details are explained in Section 4.1.
  9. 9. List of Tables
     3.1 Free Quotas
     3.2 Hard Limits
     5.1 Comparison Between Google App Engine and Amazon Web Services
  11. 11. Chapter 1 INTRODUCTION 1.1 Cloud Computing Cloud computing is the next natural step in the evolution of on-demand information technology services and products. To a large extent, cloud computing will be based on virtualized resources. The idea of cloud computing rests on a very fundamental principle: ‘reusability of IT capabilities’. What cloud computing adds to the traditional concepts of “grid computing”, “distributed computing”, “utility computing”, and “autonomic computing” is a broadening of horizons across organizational boundaries. According to the IEEE Computer Society, cloud computing is: “A paradigm in which information is permanently stored in servers on the Internet and cached temporarily on clients that include desktops, entertainment centers, table computers, notebooks, wall computers, handhelds, etc.” Though many cloud computing architectures and deployments are powered by grids, based on autonomic characteristics, and consumed on the basis of utility billing, the concept of a cloud is fairly distinct and complementary to the concepts of grid, SaaS, utility computing, etc. In theory, cloud computing promises availability of all required hardware, software, platform, applications, infrastructure and storage to anyone who owns just an Internet connection. People can access the information they need from any device with an Internet connection, including mobile and handheld phones, rather than being chained to the desktop. It also means lower costs, since there is no need to install software or hardware. Cloud computing is used for everything from posting and sharing photos on Orkut and instant messaging with friends to maintaining and upgrading business technology.
  12. 12. 1.2 Cloud Computing: Service Delivery Models Cloud computing services fall into three categories: applications (Software as a Service, SaaS), platform (Platform as a Service, PaaS) and hardware (Infrastructure as a Service, IaaS). These services can be accessed anytime, anywhere in the world, over the Internet.
     Figure 1.1: The Cloud [9]
     Software as a Service: The SaaS category delivers use-specific services over the Internet (such as CRM software and email). The benefit of SaaS clouds is that clients focus only on using the software and do not have to worry about the cost and effort of keeping software licenses current or of handling software updates. The decision on whether or not to deploy software updates is made by the providers themselves. Thus, if an update to a software service makes a client incompatible, the client has to adapt its software, find another software service, or even find another SaaS cloud. SaaS covers applications with a Web-based interface accessed via Web Services and Web 2.0. Examples include Google Apps and social network applications such as Facebook.
     Platform as a Service: The PaaS category represents clouds that provide access to a range of compute, database and storage functions within a specified framework provided over the Internet. The benefit of PaaS clouds is that clients are able to create their own required services and do not have to worry about provisioning and maintaining the hardware and software needed to run the services, infrastructure scaling, load balancing and so on. Examples include Microsoft’s Azure Platform.
     Infrastructure as a Service: The IaaS category allows for the provisioning of hardware resources so that cloud clients can create various configurations of computer systems, from servers to complete clusters. In comparison to PaaS and SaaS, clients are able to create and use software as well as create and use the underlying software infrastructure that makes the software possible. IaaS covers grids of virtualized servers, storage and networks. Examples include Amazon’s Elastic Compute Cloud and Simple Storage Service.
  13. 13. 1.3 Google App Engine So what is Google App Engine? According to Kevin Gibbs, App Engine Tech Lead, Google App Engine is a system that exposes various pieces of Google’s scalable infrastructure so that you can write server-side applications on top of them. Put simply, it is a platform that allows users to run and host their web applications on Google’s infrastructure. These applications are easy to build, easy to maintain and easy to scale as traffic and data storage needs grow. With Google’s App Engine there are no servers to maintain and no administrators needed: the user just uploads the application and it is ready to serve its own customers. The user can choose to have the application served under a free domain name or to let Google Apps serve it from a domain of the user’s own choice. Google also gives the user the option to limit access to the application to members of his own organization or to share it with the rest of the world. The starting package is free of charge and carries no additional obligation. All the user has to do is sign up for a free account and then develop and publish the application. The starting package includes up to 500 MB of storage and enough CPU power and bandwidth to serve around 5 million page views a month. Google App Engine supports apps written in several programming languages.
With App Engine’s Java runtime environment, you can build your app using standard Java technologies, including the JVM, Java servlets, and the Java programming language, or any other language using a JVM-based interpreter or compiler, such as JavaScript or Ruby. App Engine also features a dedicated Python runtime environment, which includes a fast Python interpreter and the Python standard library. The Java and Python runtime environments are built to ensure that your application runs quickly, securely, and without interference from other apps on the system. With App Engine, you only pay for what you use. There are no set-up costs and no recurring fees. The resources your application uses, such as storage and bandwidth, are measured by the gigabyte and billed at competitive rates. You control the maximum amounts of resources your app can consume, so it always stays within your budget. App Engine costs nothing to get started. All applications can use up to 500 MB of storage and enough CPU and bandwidth to support an efficient app serving around 5 million page views a month, absolutely free. When you enable billing for your application, your free limits are raised, and you only pay for resources you use above the free levels.
  16. 16. Chapter 2 HOW DOES IT WORK? 2.1 The Application Environment With this service from Google it is easy to create reliable applications that run under heavy load and use large amounts of data. The environment includes several key features [1]:
     • dynamic web serving, with full support for common web technologies.
     • persistent storage with queries, sorting and transactions.
     • automatic scaling and load balancing.
     • APIs for authenticating users and sending email using Google Accounts.
     • a fully featured local development environment that simulates Google App Engine on the user’s computer.
     • task queues for performing work outside of the scope of a web request.
     • scheduled tasks for triggering events at specified times and regular intervals.
     Google App Engine applications were originally implemented in the Python programming language: the standard runtime environment comes with full Python language support along with most of the Python standard library. Python was at first the only language supported by Google App Engine; a Java runtime environment is now available as well (see Section 2.4).
  17. 17. 2.2 Service Provided By GAE Figure 2.1: Services Provided By GAE 2.2.1 Google Cloud Computing Services Google App Engine Google is a key player in the Platform as a Service (PaaS) space. App Engine is a platform to create, store and run applications on Google’s servers using development languages such as Java and Python. App Engine includes tools for managing the datastore, monitoring the site and its resource consumption, and debugging and logging. A user can serve the app from his own domain name using Google Apps, or under a free name on a Google-provided domain. A user can share the application with the world, or limit access to members of an organization. App Engine costs nothing to get started. All applications can use up to 1 GB of storage and enough CPU and bandwidth to support an efficient app serving around 5 million page views a month, absolutely free. Applications requiring more storage or bandwidth can purchase additional resources, which are divided into five buckets: CPU time, bandwidth in, bandwidth out, storage, and outbound email. Google App Engine enables users to build a basic web application very quickly; configuring and setting up an application is quick and easy. The Google App Engine architecture offers a new approach: instead of dealing with web servers and load balancers, developers deploy applications on the Google App Engine cloud, which provides instance access and scalability, as shown in Figure 2.2. The Google App Engine Software Development Kit (SDK) supports the Java and Python programming languages. Each language has its own web server application that emulates all Google App Engine services on a local computer. The web server also simulates the secure sandbox environment. The Google App Engine SDK has APIs and libraries, including the tools to upload applications. The architecture defines the structure of applications that run on Google App Engine. Figure 2.2: App Engine Components
  19. 19. 2.3 The Sandbox All user applications run in a secure environment that has limited access to the underlying operating system. These limitations allow App Engine to distribute web requests for an application across multiple servers, and to start and stop servers to meet traffic demand. The sandbox isolates the application in its own protected and reliable environment, which is independent of the operating system, hardware and physical location of the web server. Some of the restrictions of the sandbox environment are:
     • An application can only access other computers on the Internet through the provided URL fetch and email services and APIs. Other computers can only connect to the application by making HTTP (or HTTPS) requests on the standard ports.
     • An application cannot write to the file system. It can read files, but only files uploaded with the application code. The application must use the App Engine datastore for all data that persists between requests.
     • Application code only runs in response to a web request, and must return response data within 30 seconds. A request handler cannot spawn a sub-process or execute code after the response has been sent.
     2.4 The Languages Runtime Your application can run in one of two runtime environments: the Java environment and the Python environment. Each environment provides standard protocols and common technologies for web application development. 2.4.1 The Java Runtime Environment You can develop your application for the Java runtime environment using common Java web development tools and API standards. Your app interacts with the environment using the Java Servlet standard, and can use common web application technologies such as JavaServer Pages (JSPs). The Java runtime environment uses Java 6. The App Engine Java SDK supports developing apps using either Java 5 or 6. The environment includes the Java SE Runtime Environment (JRE) 6 platform and libraries. The restrictions of the sandbox environment are implemented in the JVM. An app can use any JVM bytecode or library feature, as long as it does not exceed the sandbox restrictions. For instance, bytecode that attempts to open a socket or write to a file will throw a runtime exception. Your app accesses most App Engine services using Java standard APIs. For the App Engine datastore, the Java SDK includes implementations of the Java Data Objects (JDO) and Java Persistence API (JPA) interfaces. Your app can use the JavaMail API to send email messages with the App Engine Mail service. The java.net HTTP APIs access the App Engine URL fetch service. App Engine also includes low-level APIs for its services, which can be used to implement additional adapters or called directly from the application. Typically, Java developers use the Java programming language and APIs to implement web applications for the JVM. With the use of JVM-compatible compilers or interpreters, you can also use other languages to develop web applications, such as JavaScript, Ruby, or Scala. 2.4.2 The Python Runtime Environment With App Engine’s Python runtime environment, you can implement your app using the Python programming language, and run it on an optimized Python interpreter. App Engine includes rich APIs and tools for Python web application development, including a feature-rich data modeling API, an easy-to-use web application framework, and tools for managing and accessing your app’s data.
You can also take advantage of a wide variety of mature libraries and frameworks for Python web application development, such as Django. The Python runtime environment uses Python version 2.5.2. Additional support for Python 3 is being considered for a future release. The Python environment includes the Python standard library. Of course, not all of the library’s features can run in the sandbox environment. For instance, a call to a method that attempts to open a socket or write to a file will raise an exception. For convenience, several modules in the standard library whose core features are not supported by the runtime environment have been disabled, and code that imports them will raise an error. Application code written for the Python environment must be written exclusively in Python. Extensions written in the C language are not supported. The Python environment provides rich Python APIs for the datastore, Google Accounts, URL fetch, and email services. App Engine also provides a simple Python web application framework called webapp to make it easy to start building applications. You can upload other third-party libraries with your application, as long as they are implemented in pure Python and do not require any unsupported standard library modules.
  23. 23. Chapter 3 STORAGE MANAGEMENT 3.1 The Datastore App Engine provides a powerful distributed data storage service. The App Engine datastore offers a query engine and transactional storage, accessible through a simple API and running on Google’s scalable infrastructure. The Python interface includes a data modeling API and an SQL-like query language called GQL. With these features, developing data-dependent applications should be no more difficult than developing them on a normal web hosting service. Just as the distributed web servers grow with the amount of traffic, the distributed datastore grows with the amount of data. The App Engine datastore is unlike a traditional relational database: data objects, known as entities, each have a set of properties, and queries retrieve entities filtered by property values. The Python API for the datastore includes a data modeling interface that can define a structure for datastore entities. A data model can specify that a property value must lie within a given range, or supply a default value if none is given. An application can give its data as much or as little structure as it needs. The datastore guarantees data integrity: an application can execute multiple datastore operations in a single transaction, in which either all operations succeed or all of them fail. Concurrency is controlled optimistically: an update to a data record is retried if another process is trying to update the same record at the same time.
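The optimistic concurrency behaviour described for the datastore can be illustrated in plain Java. This is a minimal sketch and not the App Engine datastore API: the `VersionedStore` class, its method names and the retry limit are hypothetical, and an in-memory map stands in for the datastore.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongUnaryOperator;

// Hypothetical in-memory stand-in for a datastore; NOT the App Engine API.
class VersionedStore {
    // Immutable snapshot of one entity: a version counter plus its value.
    static final class Versioned {
        final long version;
        final long value;
        Versioned(long version, long value) {
            this.version = version;
            this.value = value;
        }
    }

    private final Map<String, Versioned> entities = new ConcurrentHashMap<>();

    void put(String key, long value) {
        entities.put(key, new Versioned(0, value));
    }

    long get(String key) {
        return entities.get(key).value;
    }

    // Optimistic update: read a snapshot, compute the new value without locking,
    // then commit only if nobody replaced the snapshot in the meantime.
    boolean update(String key, LongUnaryOperator change, int maxRetries) {
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            Versioned seen = entities.get(key);
            if (seen == null) {
                return false; // no such entity
            }
            Versioned next = new Versioned(seen.version + 1, change.applyAsLong(seen.value));
            // replace(key, old, new) succeeds only if the stored object is still 'seen'.
            if (entities.replace(key, seen, next)) {
                return true;
            }
        }
        return false; // gave up after too many conflicting writers
    }

    public static void main(String[] args) throws InterruptedException {
        VersionedStore store = new VersionedStore();
        store.put("hits", 0);
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            writers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    // Keep retrying until this increment is committed.
                    while (!store.update("hits", v -> v + 1, 10)) { }
                }
            });
            writers[t].start();
        }
        for (Thread w : writers) {
            w.join();
        }
        System.out.println(store.get("hits")); // prints 4000
    }
}
```

No writer ever blocks another; a conflicting write simply causes the losing update to be recomputed from a fresh snapshot, which is the trade-off optimistic schemes make when conflicts are expected to be rare.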
  24. 24. Figure 3.1: Google DataStore Architecture 3.2 Google Accounts App Engine supports integrating an app with Google Accounts for user authenti- cation. Your application can allow a user to sign in with a Google account, and access the email address and displayable name associated with the account. Using Google Accounts lets the user start using your application faster, because the user may not need to create a new account. It also saves you the effort of implementing a user account system just for your application. If your application is running under Google Apps, it can use the same features with members of your organization and Google Apps accounts. The Users API can also tell the application whether the current user is a registered administrator for the application. This makes it easy to implement admin-only areas of your site. 3.2.1 URL Fetch Applications can access resources on the Internet, such as web services or other data, using App Engine’s URL fetch service. The URL fetch service retrieves web resources using the same high-speed Google infrastructure that retrieves web pages for many other Google products. 3.2.2 Mail Applications can send email messages using App Engine’s mail service. The mail service uses Google infrastructure to send email messages. 15
  25. 25. 3.2.3 Memcache The Memcache service provides your application with a high-performance in-memory key-value cache that is accessible by multiple instances of your application. Memcache is useful for data that does not need the persistence and transactional features of the datastore, such as temporary data or data copied from the datastore to the cache for high-speed access. 3.2.4 Image Manipulation High-performance scalable web applications often use a distributed in-memory data cache in front of, or in place of, robust persistent storage for some tasks. For this reason Google App Engine includes a memory cache service. The Memcache service provides applications with a high-performance in-memory key-value cache that is accessible by multiple instances of the application. It is mostly used for data that does not need the persistence and transactional features of the datastore, such as temporary data or data copied from the datastore to the cache for high-speed access. 3.2.5 Scheduled Tasks and Task Queues An application can perform tasks outside of responding to web requests. Your application can perform these tasks on a schedule that you configure, such as on a daily or hourly basis. Or, the application can perform tasks added to a queue by the application itself, such as a background task created while handling a request. Scheduled tasks are also known as “cron jobs” and are handled by the Cron service. Task queues are currently released as an experimental feature. At this time, only the Python runtime environment can use task queues. A task queue interface for Java applications will be released in the near future. 3.3 Development Workflow The App Engine software development kit (SDK) includes a web server application that emulates all of the App Engine services on the local computer. The SDK includes all of the APIs and libraries available in App Engine.
The web server also simulates the secure sandbox environment, including checks for imports of disabled modules and for attempts to access disallowed system resources. The Python SDK is implemented in pure Python and runs on any operating system that supports Python 2.5, including Windows, Mac OS and Linux.
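The use of Memcache described in Section 3.2.3, keeping a copy of datastore data in a cache for high-speed access, is commonly called the cache-aside pattern. The following is a minimal sketch of that pattern in plain Java. It does not use the App Engine Memcache API: the `CacheAside` class and its methods are hypothetical, a `ConcurrentHashMap` stands in for the cache, and a function stands in for the slower datastore read.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical cache-aside helper; a ConcurrentHashMap stands in for Memcache.
class CacheAside {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Function<String, String> slowStore; // stand-in for a datastore read
    private final AtomicInteger storeReads = new AtomicInteger();

    CacheAside(Function<String, String> slowStore) {
        this.slowStore = slowStore;
    }

    // Look in the cache first; on a miss, read the store and remember the value.
    String get(String key) {
        String cached = cache.get(key);
        if (cached != null) {
            return cached;
        }
        storeReads.incrementAndGet();
        String fresh = slowStore.apply(key);
        if (fresh != null) {
            cache.put(key, fresh);
        }
        return fresh;
    }

    // Cached data is disposable: evicting it only costs one extra store read later.
    void evict(String key) {
        cache.remove(key);
    }

    int storeReads() {
        return storeReads.get();
    }

    public static void main(String[] args) {
        CacheAside profiles = new CacheAside(key -> "profile-of-" + key);
        profiles.get("alice"); // miss: goes to the store
        profiles.get("alice"); // hit: served from the cache
        System.out.println(profiles.storeReads()); // prints 1
    }
}
```

Because the store remains the source of truth, the cache may lose entries at any time without affecting correctness, which is why only data that does not need persistence or transactions belongs there.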
  26. 26. Figure 3.2: Admin Console view The Python software package is available from the Python web page. The Google App Engine SDK can be obtained from the Google App Engine homepage as a ZIP file, or as an installer for Windows and Mac OS X. The SDK includes a tool for uploading applications to the App Engine infrastructure. Each application consists of two types of files: static files and configuration files. Once these files are ready, they can be uploaded with the tool, which prompts the user for his Google account e-mail address and password. A very useful feature is that Google App Engine supports multiple versions of an application. If a customer develops a new major release of his product, he can upload it as a new version while the old version is still in use, and run final tests on the newer release before switching it on. Applications running on App Engine are managed through an administration console (Figure 3.2). This is a web-based interface which allows the customer to create new applications, configure domain names, change which version of the application is in use, examine access and error logs, and browse an application’s datastore.
  27. 27. Figure 3.3: Architecture of Google App Engine 3.4 Quotas and limits Not only is creating an App Engine application easy, it’s free! You can create an account and publish an application that people can use right away at no charge, and with no obligation. An application on a free account can use up to 500 MB of storage and up to 5 million page views a month. When you are ready for more, you can enable billing, set a maximum daily budget, and allocate your budget for each resource according to your needs. Free quotas: Application creators who enable billing pay only for CPU, bandwidth, storage, and e-mails used in excess of the free quotas. Limits marked with * are increased for application authors who enable billing, even if their application never uses enough resources to incur charges. Free quotas were reduced on May 25, 2009 and were reduced again on June 22, 2009. App Engine defines usage quotas for free applications. Extensions to these quotas can be requested, and application authors can pay for additional resources.

     Table 3.1: Free Quotas
     Quota                          Limit
     Emails per day                 2,000
     Bandwidth in per day           1 GB
     Bandwidth out per day          1 GB
     CPU time per day               6.5 hours
     HTTP requests per day          1,300,000
     Datastore API calls per day    10,000,000
     Data stored                    1 GB
     URL Fetch API calls per day    657,084

     Table 3.2: Hard Limits
     Quota                          Limit
     Apps per developer             10
     Time per request               30 s
     Blobstore size                 2 GB
     HTTP response size             10 MB
     Datastore item size            1 MB
     Application code size          150 MB
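As an illustration of how the free quotas relate to billing, the sketch below computes how far a day's usage exceeds a free quota, which is the only part a billing-enabled application would pay for. The count-based quota values are copied from Table 3.1; the `FreeQuota` class and its methods are hypothetical and are not part of any App Engine API, and no prices are modelled because the report gives none.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical helper; quota values come from Table 3.1, nothing here is an App Engine API.
class FreeQuota {
    enum Resource {
        EMAILS(2_000L),
        HTTP_REQUESTS(1_300_000L),
        DATASTORE_API_CALLS(10_000_000L),
        URL_FETCH_API_CALLS(657_084L);

        final long freePerDay;

        Resource(long freePerDay) {
            this.freePerDay = freePerDay;
        }
    }

    // Usage above the free daily quota; zero while the app stays within it.
    static long excess(Resource resource, long usedToday) {
        return Math.max(0L, usedToday - resource.freePerDay);
    }

    // True only if every listed resource is within its free daily quota.
    static boolean withinFreeQuota(Map<Resource, Long> usedToday) {
        for (Map.Entry<Resource, Long> entry : usedToday.entrySet()) {
            if (excess(entry.getKey(), entry.getValue()) > 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        Map<Resource, Long> usage = new EnumMap<>(Resource.class);
        usage.put(Resource.EMAILS, 2_500L);
        usage.put(Resource.HTTP_REQUESTS, 900_000L);
        System.out.println(excess(Resource.EMAILS, usage.get(Resource.EMAILS))); // prints 500
        System.out.println(withinFreeQuota(usage)); // prints false
    }
}
```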
  30. 30. Chapter 4 FRAMEWORK FOR PARAMETER STUDY 4.1 Framework structure The framework implements the master-worker model and its structure is presented in Fig. 1. Master and worker tasks are implemented as Java classes extending HttpServlet. Their execution is triggered upon HTTP request. Parameters are passed in the HTTP GET message. Information about tasks is stored in the dis- tributed, object-oriented Bigtable database (App Engine DB), as the queue interface does not provide methods for task monitoring. Data concerning a single task con- sist of: input data, creation time, execution start time, execution finish time, task execution status and task execution result. Additionally, for master task a list of its worker tasks IDs is stored. Every data entity stored in Bigtable database is identified by an unique ID. The job execution scenario consists of the following steps which are presented in Fig. 1. The client uses a web application as the main interface to submit a job (1). The corresponding data is added to the database (2) and an unique job ID is returned to the user (3). This ID can be used to monitor the execution status of the job and to get results. Simultaneously, a master task is created by the web interface and executed (4). When the master task is executed, it splits the job data into given number of chunks and spawns corresponding number of worker tasks, which are added to the Task Queue (5). The execution of worker tasks is controlled by the queue engine (7) and the worker task results are stored in the database (8). Finally, when all partial results are ready, master task aggregates them and returns the job result to the user (11). Aggregation process involves fetching partial job results computed by worker tasks (9) from the Bigtable data storage system. Then, accordingly to problem nature, these results are merged to produce final job result (10). The task management, such as new task submission, monitoring and getting 21
  31. 31. the results, is controlled by the web interface.

  Figure 4.1: The computing environment based on App Engine Task Queue service and its functionality; details are explained in Section 4.1.

  4.2 Preparing the program to run on App Engine infrastructure

  Adding a new type of task takes just a couple of simple steps. First, classes implementing the IMasterTask and IWorkerTask interfaces have to be programmed. There are two required methods in the master:
  • execute(), responsible for dividing a job into worker subtasks and enqueuing them,
  • aggregate(), executed when the user requests the results. It may perform additional computations such as summing or averaging the results returned by the workers.
  The worker has to provide only one method:
  • execute(), which performs the computations and returns the results.
  Finally, the application is deployed on the server using the standard tools of App Engine. The pseudo-code of a master and a worker running a simplified Monte Carlo integration is shown below. The helper methods are used to create worker tasks, add them to the queue, obtain a list of workers, retrieve their results and check the termination condition.

  Simplified code of Monte Carlo master task:

      import java.util.ArrayList;
      import java.util.List;

      // Creates workersNumber worker tasks, enqueues them and
      // returns the IDs of the created tasks.
      public List<Long> execute(String taskData, int workersNumber)
      {
          List<Long> workersList = new ArrayList<Long>();

          for (int i = 0; i < workersNumber; i++)
          {
              Long workerTaskId = helper.createWorkerTask();
              workersList.add(workerTaskId);

              // dataForThisWorker is the part of taskData assigned to worker i
              // (the splitting step is omitted in this simplified listing).
              helper.enqueueWorker(workerTaskId,
                                   "MonteCarloSimulationWorker",
                                   dataForThisWorker);
          }
          return workersList;
      }

      // Merges the partial results of all workers of the given master task.
      public String aggregate(Long masterTaskId)
      {
          String result = "";
          List<Long> workersIds = helper.getWorkers(masterTaskId);

          // Strings are immutable in Java, so processWorkerResult is assumed
          // to return the merged value, which is reassigned here.
          for (Long id : workersIds)
              result = processWorkerResult(result,
                                           helper.getWorkerResults(id));

          return result;
      }

  Simplified code of Monte Carlo worker task:

      // Runs the simulation steps and stops early when the time limit is near.
      public String execute(String data)
      {
          helper.startTimer();
          // numberOfSteps and result are assumed to be fields of the worker;
          // doWork() performs one simulation step and updates result.
          for (long i = 0; i < numberOfSteps; i++)
          {
              doWork();

              if (helper.shouldIFinish())
                  return WorkerTaskHelper.TIME_LIMIT_EXCEEDED;
          }
          return result;
      }
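  The division of labour between the master's execute() and aggregate() and the worker's execute() can also be tried out locally. The sketch below is a self-contained stand-in, not the framework itself: it uses no App Engine API, a plain list takes the place of the Task Queue and the datastore, the workers run one after another, and the problem chosen (estimating pi by Monte Carlo sampling) as well as all class and method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

// Local stand-in for the master-worker pattern; no App Engine services are used.
final class LocalMonteCarlo {

    /** Worker: counts random points that fall inside the unit quarter circle. */
    static long workerExecute(long steps, long seed) {
        Random rng = new Random(seed);
        long hits = 0;
        for (long i = 0; i < steps; i++) {
            double x = rng.nextDouble();
            double y = rng.nextDouble();
            if (x * x + y * y <= 1.0) {
                hits++;
            }
        }
        return hits;
    }

    /** Master execute(): gives each worker an equal share of steps and collects the partial results. */
    static List<Long> masterExecute(long stepsPerWorker, int workers) {
        List<Long> partialResults = new ArrayList<>();
        for (int i = 0; i < workers; i++) {
            // A different seed per worker keeps the samples independent.
            partialResults.add(workerExecute(stepsPerWorker, 42L + i));
        }
        return partialResults;
    }

    /** Master aggregate(): merges the partial hit counts into one estimate of pi. */
    static double aggregate(List<Long> partialResults, long stepsPerWorker) {
        long hits = 0;
        for (long h : partialResults) {
            hits += h;
        }
        return 4.0 * hits / ((double) stepsPerWorker * partialResults.size());
    }

    /** Whole job: any remainder of totalSteps / workers is dropped for simplicity. */
    static double estimatePi(long totalSteps, int workers) {
        long stepsPerWorker = totalSteps / workers;
        return aggregate(masterExecute(stepsPerWorker, workers), stepsPerWorker);
    }

    public static void main(String[] args) {
        System.out.println("pi is approximately " + estimatePi(1_000_000L, 10));
    }
}
```

  On App Engine the loop in masterExecute would enqueue tasks instead of calling the worker directly, and aggregate would read the partial results back from the datastore, as described in Section 4.1.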
  33. 33. The master-worker pattern enables an easy implementation of useful solutions to a wide range of compute-intensive problems. Although in many cases it is possible to run parameter study applications using the MapReduce model and vice versa, our framework has a slightly different design philosophy (task-driven vs. data-driven). The master task explicitly creates tasks for workers and enters them into the queue, while in MapReduce the tasks (mappers) are created implicitly, based on entries in the input dataset. MapReduce is better suited for large-scale data processing, while our framework is more convenient for compute-intensive parameter study applications, such as Monte Carlo simulations or various optimization problems. Our approach is also better suited for cases when tasks are dynamically added to the queue or when the aggregate phase needs to be run periodically or interactively, e.g. to assess the current work progress.

  One of the problems which has to be solved manually by the application programmer is the task partitioning granularity. When submitting tasks, the master class has to ensure that the run times of all the tasks fit into the 10-minute limit imposed by the framework. Moreover, as observed in our experiments, higher granularity yields better performance due to the underlying auto-scaling and load-balancing algorithms of App Engine. The task granularity can be determined using simple benchmarking, by running tasks locally and on the infrastructure. The situation is very similar to computing clusters with batch processing systems such as Condor, where users have to adjust their task sizes to the limits imposed on queues for scheduling purposes.
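  The benchmarking step can be turned into a simple calculation: given a measured throughput, find the smallest number of worker tasks such that each one stays within the 10-minute limit. The following is a minimal sketch under stated assumptions; the class name, the method name and the safety factor are hypothetical, and the throughput is whatever the programmer measured in a benchmark run.

```java
// Sketch of choosing the task granularity from a benchmark result.
// The names and the safety factor are illustrative assumptions.
final class TaskGranularity {
    static final long TASK_LIMIT_SECONDS = 10 * 60; // the 10-minute task limit

    /**
     * Smallest number of worker tasks so that each task is expected to finish
     * within safetyFactor * TASK_LIMIT_SECONDS at the measured throughput.
     */
    static long minWorkers(long totalSteps, double stepsPerSecond, double safetyFactor) {
        if (totalSteps < 0 || stepsPerSecond <= 0 || safetyFactor <= 0 || safetyFactor > 1) {
            throw new IllegalArgumentException("invalid benchmark parameters");
        }
        double budgetSeconds = TASK_LIMIT_SECONDS * safetyFactor;
        double maxStepsPerTask = stepsPerSecond * budgetSeconds;
        long workers = (long) Math.ceil(totalSteps / maxStepsPerTask);
        return Math.max(1, workers); // at least one task, even for tiny jobs
    }

    public static void main(String[] args) {
        // Example: 3 billion steps, 1 million steps/s measured, use half the limit.
        System.out.println(minWorkers(3_000_000_000L, 1_000_000.0, 0.5));
    }
}
```

  The result is only a lower bound. Since finer-grained partitioning was observed to perform better, a programmer would typically start from this value and increase the number of tasks.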
  35. 35. Chapter 5 DISCUSSION

  5.1 Advantages

  Google App Engine enables you to build web applications on the same scalable systems that power Google applications. App Engine applications are easy to build, easy to maintain, and easy to scale as your traffic and data storage needs grow. With App Engine, there are no servers to maintain: you just upload your application, and it’s ready to serve your users. Find out why App Engine may be right for your business.
  • Easy to get started
  • Automatic scalability
  • The reliability, performance and security of Google’s infrastructure
  • Cost-efficient hosting
  • Risk-free trial period

  5.2 Disadvantages

  Does Google give service guarantees of any kind? Google is prone to occasionally change its algorithms and mechanisms in a very opaque and downright secretive manner. Everyone in the field of search engine optimization will know the story. Google tweaks its page ranking algorithms and suddenly your site appears below the fold or (gasp!) not even on the first page any more. Your business literally might evaporate in an instant. What if something similar happens with App Engine? What if they change the service levels your site receives? What if they suddenly decide that your site actually doesn’t need that great request latency you
  36. 36. have been enjoying, and they change their scaling methods in a way that suddenly impacts your users’ experience negatively? Also, do you know what Google does with your data? Do you want Google to know? What if you have a good idea that competes with some of Google’s many activities? Do you want to be dependent on their infrastructure and, worse, their APIs? Changing hosts is always painful. But with something like Amazon’s VM hosting service (EC2), you at least know that you can deploy your app as it is on another VM hosting environment (as long as you didn’t start to rely on Amazon’s S3, at least). But once your code is bound to Google’s APIs, your cost of switching becomes much higher, since you will need to make changes to your source code.

  5.3 Restrictions
  • Developers have read-only access to the file system on App Engine. Your application can use only virtual file systems, like gae-filestore.
  • App Engine can only execute code called from an HTTP request (scheduled background tasks allow for self-calling HTTP requests).
  • Users may upload arbitrary Python modules, but only if they are pure Python; C and Pyrex modules are not supported.
  • Java applications may only use a subset (the JRE Class White List) of the classes from the JRE standard edition.
  • Java applications cannot create new threads.
  • ’Naked’ domains (without www) are not supported. The required alias is implemented with a DNS CNAME record so that changes in Google server IP addresses do not impact the service. This record cannot be used with other DNS records, including the required Start of Authority for the DNS zone. The suggested workaround is to use the domain registrar’s HTTP redirection to a subdomain.
  • SSL/HTTPS is only available via * domains and not via Google Apps Domains.
  • Datastore cannot use inequality filters on more than one entity property per query.
  • A process started on the server to answer a request can’t last more than 30 seconds.
  • Sticky sessions are not supported; only replicated sessions are.
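  The datastore restriction on inequality filters is easy to misread, so a small check may help: several inequality filters are fine as long as they all refer to the same property. The sketch below only illustrates the rule as stated in the list above. The Filter and Op types are hypothetical stand-ins written for this example and are not the App Engine datastore API.

```java
import java.util.HashSet;
import java.util.Set;

// Illustrates the "inequality filters on at most one property" rule.
// Filter and Op are hypothetical types, not App Engine classes.
final class InequalityFilterCheck {
    enum Op { EQUAL, LESS_THAN, LESS_THAN_OR_EQUAL, GREATER_THAN, GREATER_THAN_OR_EQUAL, NOT_EQUAL }

    static final class Filter {
        final String property;
        final Op op;

        Filter(String property, Op op) {
            this.property = property;
            this.op = op;
        }
    }

    /** Returns true if all inequality filters refer to a single property. */
    static boolean isAllowed(Filter... filters) {
        Set<String> inequalityProperties = new HashSet<>();
        for (Filter f : filters) {
            if (f.op != Op.EQUAL) {
                inequalityProperties.add(f.property);
            }
        }
        return inequalityProperties.size() <= 1;
    }

    public static void main(String[] args) {
        // A range on one property plus an equality filter on another: allowed.
        System.out.println(isAllowed(
                new Filter("age", Op.GREATER_THAN),
                new Filter("age", Op.LESS_THAN),
                new Filter("city", Op.EQUAL)));
        // Inequalities on two different properties: not allowed.
        System.out.println(isAllowed(
                new Filter("age", Op.GREATER_THAN),
                new Filter("salary", Op.LESS_THAN)));
    }
}
```

  A query that needs ranges on two properties therefore has to filter on one of them in the datastore and apply the second condition in application code.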
  37. 37. 5.4 Comparative Study

  Feature                   Google App Engine              Amazon Web Services
  Cloud Services            PaaS                           PaaS, IaaS
  Platforms Supported       Linux, Windows Server 2008     Linux, Windows Server 2003
  Virtualization Platform   Application container          OS level running
  Storage                   BigTable, Megastore            SimpleDB
  Control Interface         API                            API, command line
  Languages Supported       Java, Python                   Java, PHP, Python, Ruby
  Load Balancing            Auto                           Round-robin
  SLA Availability          100% uptime                    EC2 99.95% uptime
  Data after Termination    No action taken for 90 days    No action taken for 30 days
                            after termination              after termination
  Table 5.1: Comparison Between Google App Engine and Amazon Web Services
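  The SLA figures in Table 5.1 are easier to compare when converted into permitted downtime. The following sketch performs that conversion; the class and method names are hypothetical, and a 365-day year is assumed.

```java
// Converts an uptime percentage into the downtime it still permits.
// Names are illustrative; a 365-day year is assumed.
final class SlaDowntime {
    static final double HOURS_PER_YEAR = 365 * 24;

    /** Maximum downtime per year, in hours, compatible with the given uptime. */
    static double allowedDowntimeHoursPerYear(double uptimePercent) {
        if (uptimePercent < 0 || uptimePercent > 100) {
            throw new IllegalArgumentException("uptime must be between 0 and 100");
        }
        return (100.0 - uptimePercent) / 100.0 * HOURS_PER_YEAR;
    }

    public static void main(String[] args) {
        System.out.printf("99.95%% uptime: %.2f hours/year%n", allowedDowntimeHoursPerYear(99.95));
        System.out.printf("100%% uptime: %.2f hours/year%n", allowedDowntimeHoursPerYear(100.0));
    }
}
```

  A 99.95% guarantee thus still allows a little over four hours of downtime per year, whereas a 100% figure allows none.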
  39. 39. Chapter 6 CONCLUSION

  Cloud computing has dramatically changed how business applications are built and run. At its core, cloud computing eliminates the costs and complexity of evaluating, buying, configuring, and managing all the hardware and software needed for enterprise applications. Instead, these applications are delivered as a service over the Internet. Cloud computing is a powerful new abstraction for large-scale data processing systems that is scalable, reliable and available; it still needs to be extended to support negotiation of QoS based on Service Level Agreements (SLAs). Cloud computing is particularly valuable to small and medium businesses, where effective and affordable IT tools are critical to helping them become more productive without spending lots of money on in-house resources and technical equipment. It is also an emerging architecture needed to expand the Internet into the computing platform of the future.
  41. 41. Chapter 7 REFERENCES

  A. Research Papers
  1. Rabi Prasad Padhy, Manas Ranjan Patra and Suresh Chandra Satapathy, ”X-as-a-Service: Cloud Computing with Google App Engine, Amazon Web Services, Microsoft Azure and”, International Journal of Computer Science and Telecommunications, Volume 2, Issue 9, December 2011.
  2. Maciej Malawski, Maciej Kuzniar, Piotr Wójcik, Marian Bubak, ”How to Use Google App Engine for Free Computing”, 1089-7801, 2011 IEEE.
  3. Alexander Zahariev, ”Google App Engine”, Helsinki University of Technology, 2010.