1. Jawaharlal Institute of Technology (Borawan)

Introduction to Cloud Computing

Cloud computing provides Internet-based services, computing, and storage for users in all markets, including financial, healthcare, and government. This approach to computing allows users to avoid upfront hardware and software investments, gain flexibility, collaborate with others, and take advantage of the sophisticated services that cloud providers offer. However, security is a major concern for cloud users. Cloud providers have recognized this concern and are working hard to address it. In fact, cloud security is becoming a key differentiator and competitive edge between cloud providers. By applying the strongest security techniques and practices, cloud security may soon be raised far above the level that IT departments achieve using their own hardware and software.

NIST Definition of Cloud Computing

For the record, here is the definition of cloud computing offered by the National Institute of Standards and Technology (NIST):

Cloud computing is a model for enabling convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.

Enrollment No: 0805IT101047 Page No: 1
2. Google App Engine

In April 2008, Google released the beta version of Google App Engine, which allows developers to build applications in Python. Developers can also use Google’s infrastructure to manage their development process (maximum 500 MB storage space).

Google App Engine is an application hosting and development platform that powers everything from enterprise web applications to mobile games, using the same infrastructure that powers Google’s global-scale web applications. Time-to-market is critical to success, and with Google App Engine’s simple development, robust APIs, and worry-free hosting, you can accelerate application development and take advantage of simple scalability as the application grows.

Google App Engine makes it easy to take your app ideas to the next level:

· Quick to start: with no software or hardware to buy and maintain, you can prototype and deploy applications to your users in a matter of hours.
· Simple to use: Google App Engine includes the tools you need to create, test, launch, and update your apps.
· Rich set of APIs: build feature-rich services faster with Google App Engine’s easy-to-use APIs.
· Immediate scalability: there is almost no limit to how high or how quickly your app can scale.
· Pay for what you use: get started without any upfront costs with App Engine’s free tier, and pay only for the resources you use as your application grows.
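To make the idea of a hosted Python web application more concrete, here is a minimal sketch of a generic WSGI application, the standard interface that Python hosting platforms commonly use to call application code. It is not taken from the App Engine documentation: the function names `application` and `call_app` and the response text are hypothetical, and no App Engine-specific configuration or API is shown.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hypothetical WSGI app: answers every request with a plain-text greeting."""
    body = b"Hello from a cloud-hosted app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process (no network) and return (status, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fill in a minimal, valid WSGI environ
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, headers, exc_info=None):
        # Record what the app sends so the caller can inspect it.
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], body
```

Calling `call_app(application)` exercises the application locally without a server; on a hosting platform, the platform's own web server would play the role of `call_app`.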
3. Amazon Web Service

Amazon’s cloud was launched in 2002 and named Amazon Web Service. It is a collection of web-based remote computing services built on four key services: Simple Storage Service (S3), Elastic Compute Cloud (EC2), Simple Queue Service (SQS), and SimpleDB. In other words, Amazon now provides storage, computing, queuing, and database access services through the Internet. Other services include Amazon Associates Web Services (A2S), Amazon AWS Authentication, and Amazon Virtual Private Cloud (VPC).

Comparison of Google App Engine and Amazon Web Service

The comparison of Amazon Web Service and Google App Engine is shown in Figure 1. The main difference between them is that Amazon Web Service is Infrastructure as a Service (IaaS), while Google App Engine is Platform as a Service (PaaS). The next chapter will analyze the performance of Google App Engine and Amazon Web Service in two different aspects.
4. Hadoop

Hadoop is a powerful framework that allows for automatic parallelization of computing tasks. Unfortunately, programming for it poses certain challenges: Hadoop programs are hard to understand and debug. One way to ease things a little is to have a simplified version of the Hadoop cluster that runs locally on the developer's machine. This tutorial describes how to set up such a cluster on a computer running Microsoft Windows, and how to integrate this cluster with the Eclipse development environment. Eclipse is a prime environment for Java development.

Hadoop challenges

With all large environments, deployment of the servers and software is an important consideration. Dell provides best practices for the deployment of Hadoop solutions. These best practices are implemented through a set of tools that automate the configuration of the hardware, installation of the operating system (OS), and installation of the Hadoop software stack from Cloudera.

As with many other types of information technology (IT) solutions, change management and systems monitoring are a primary consideration within Hadoop. The IT operations team needs to ensure tools are in place to properly track and implement changes, and to notify staff when unexpected events occur within the Hadoop environment.

Hadoop is a constantly growing, complex ecosystem of software and provides no guidance on the best platform for it to run on. The Hadoop community leaves the platform decisions to end users, most of whom do not have a background in hardware or the necessary lab environment to benchmark all possible design solutions.

Hadoop is a complex set of software with more than 200 tunable parameters. Each parameter affects others as a Hadoop environment is tuned, and the best settings will change over time as job structure changes, data layout evolves, and data volume grows.

As data centers have grown and the number of servers under management for a given organization has expanded, users are more conscious of the impact new hardware will have on existing data centers and equipment.
5. Hadoop uses

Compute: A common use of Hadoop is as a distributed compute platform for analyzing or processing large amounts of data. The compute use is characterized by the need for large numbers of CPUs and large amounts of memory to store in-process data. The Hadoop ecosystem provides the application programming interfaces (APIs) necessary to distribute and track workloads as they are run on large numbers of individual machines.

Storage: One primary component of the Hadoop ecosystem is HDFS, the Hadoop Distributed File System. HDFS gives users a single addressable namespace spread across many hundreds or thousands of servers, creating a single large file system. HDFS manages the replication of the data on this file system to ensure hardware failures do not lead to data loss. Many users use this scalable file system as a place to store large amounts of data that is then accessed within jobs run in Hadoop or by external systems.

Database: The Hadoop ecosystem contains components that allow the data within HDFS to be presented through a SQL-like interface. This allows standard tools to INSERT, SELECT, and UPDATE data within the Hadoop environment, with minimal code changes to existing applications. Users commonly employ this method to present data in a SQL format for easy integration with existing systems and streamlined access by users.
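The compute use described above is usually expressed in Hadoop through the MapReduce programming model: a map step emits key/value pairs, the framework groups them by key, and a reduce step combines each group. The following is a minimal single-process sketch of that model in plain Python, not Hadoop's own API. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and everything runs on one machine purely for illustration.

```python
from collections import defaultdict


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all counts emitted for one word."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce in sequence over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())
```

In a real cluster, each document (or block of a file) would be mapped on a different machine and the grouped keys would be reduced in parallel. The sketch keeps the three stages separate to make that division of work visible.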
6. Microsoft Azure

Cloud computing is here. Running applications on machines in an Internet-accessible data center can bring plenty of advantages. Yet wherever they run, applications are built on some kind of platform. For on-premises applications, this platform usually includes an operating system, some way to store data, and perhaps more. Applications running in the cloud need a similar foundation. The goal of Microsoft’s Windows Azure is to provide this. Part of the larger Azure Services Platform, Windows Azure is a platform for running Windows applications and storing data in the cloud.

Figure 1: Windows Azure applications run in Microsoft data centers and are accessed via the Internet.
7. As the figure shows, Windows Azure runs on machines in Microsoft data centers. Rather than providing software that Microsoft customers can install and run themselves on their own computers, Windows Azure is a service: customers use it to run applications and store data on Internet-accessible machines owned by Microsoft. Those applications might provide services to businesses, to consumers, or both.

The Compute Service

The Windows Azure Compute service can run many different kinds of applications. A primary goal of this platform, however, is to support applications that have a very large number of simultaneous users. (In fact, Microsoft has said that it will build its own SaaS applications on Windows Azure, which sets the bar high.) Reaching this goal by scaling up, that is, by running on bigger and bigger machines, isn’t possible. Instead, Windows Azure is designed to support applications that scale out, running multiple copies of the same code across many commodity servers.
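The scale-out idea can be sketched in a few lines: several identical copies of the same code sit behind a dispatcher that spreads incoming requests across them. The example below is an illustrative toy, not Windows Azure's API. The class names `Instance` and `RoundRobinBalancer` are hypothetical, and round-robin dispatch is an assumed policy chosen only for simplicity.

```python
from itertools import cycle


class Instance:
    """One copy of the application code running on one (simulated) server."""

    def __init__(self, name):
        self.name = name
        self.handled = 0

    def handle(self, request):
        self.handled += 1
        return f"{self.name} handled {request}"


class RoundRobinBalancer:
    """Spread requests evenly over identical instances, one after another."""

    def __init__(self, instance_count):
        self.instances = [Instance(f"instance-{i}") for i in range(instance_count)]
        self._next = cycle(self.instances)  # endless rotation over the instances

    def dispatch(self, request):
        return next(self._next).handle(request)
```

With three instances, six requests are handled two per instance. Serving more users in this model means creating more instances rather than moving to a larger machine.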
8. Aneka

Aneka is a market-oriented Cloud development and management platform with rapid application development and workload distribution capabilities. Aneka is an integrated middleware package that allows you to seamlessly build and manage an interconnected network, and to accelerate the development, deployment, and management of distributed applications using the Microsoft .NET framework on these networks. It is market-oriented because it allows you to build, schedule, provision, and monitor results using pricing, accounting, and QoS/SLA services in private and/or public (leased) network environments.

Aneka is a workload distribution and management platform that accelerates applications in Microsoft .NET framework environments. Some of the key advantages of Aneka over other Grid or Cluster based workload distribution solutions include:

· rapid deployment tools and framework
· ability to harness multiple virtual and/or physical machines for accelerating application results
· provisioning based on QoS/SLA
· support of multiple programming and application environments
· simultaneous support of multiple run-time environments
· built on top of the Microsoft .NET framework, with support for Linux environments through Mono.