Introduction to Cloud Computing - CCGRID 2009

Editor's Notes

  • #16 Both VMware and Citrix give away enterprise-class VM platforms.
  • #18 Multiple lightly or moderately loaded VMs can be moved to a single physical host; the idea is that they hopefully won’t all be under heavy load simultaneously. Agile deployment means I can expand and contract my deployment trivially on a cluster. This removes the difficulty inherent in distributed process migration.
  • #19 GFS and HDFS split large files into chunks (typically around 64MB each) and store multiple copies of each chunk for performance and redundancy reasons. XenServer Storage Pool is an abstraction layer that can include physical storage, iSCSI, NFS and other storage sources.
  • #20 WS-* refers to the huge number of Web service specifications that exist. WS-* covers things like Security Specifications, Privacy, Reliable Messaging Specifications, Resource Specifications, Business Process Specifications & Transaction Specifications
  • #28 With increasing popularity and usage, large Grid installations are facing new problems, such as excessive spikes in demand for resources coupled with strategic and adversarial behaviour by users.
  • #32 Mosso Cloud Servers is a separate Rackspace product offering that works as a utility service with hourly billing.
  • #39 SmugMug - Autoscaling software called “SkyNet”. EC2 instances deployed as needed. Animoto - launched Facebook app to integrate their service, popularity soared. Peaked at 5000 EC2 instances.
  • #50 Each storage provider outlines their own cost structure for data transferred in and out of their service, as well as charging for persistent storage. In each of these cases, the costs are in the order of cents per gigabyte. Pricing scales downward based on higher usage for all providers. There is no minimum data usage requirement and no contracts - you only pay for what you store and transfer.
  • #51 The providers themselves have very similar core functionality, but there are some key differences, for example, the largest allowable file size, the coverage footprint or specific features.
  • #52 Amazon CloudFront offers a CDN-like service that is significantly cheaper than traditional CDNs. Amazon charges different rates depending on where the data is delivered from, to reflect the cost of data transfer and operations in different locations.
  • #54 An App Engine application cannot write to the filesystem (you must use Datastore), open a socket, or access another host directly (you must use the URL fetch service). A Java application cannot create a new Thread either.
  • #55 You can run a very substantial application using just the free services.
  • #62 Distributed Systems Architecture Research Group at Universidad Complutense de Madrid.
  • #63 Computer Science Department at the University of California, Santa Barbara. Now maintained by Eucalyptus Systems.
  • #64 Kate Keahey - Argonne National Laboratory and a Computation Institute fellow at the University of Chicago. WSRF: WS-Resource Framework.
  • #66 MetaCDN is a system that leverages several existing ‘storage clouds’, creating an integrated overlay network that provides a low cost, high performance content delivery network for content creators.
  • #68 Content Delivery Networks (CDNs) such as Akamai and Mirror Image place web server clusters in numerous geographical locations to improve the responsiveness and locality of the content they host for end-users. However, their services are priced out of reach for all but the largest enterprise customers.
  • #69 Major CDN providers are notoriously cagey about revealing their prices. Most will only reveal their prices if you are a serious customer and are willing to commit to a contract and minimum data usage (as detailed in the previous slide). As such, Dan Rayburn @ StreamingMedia.com (a blog run for streaming media and CDN professionals) undertakes an informal sampling of pricing (taken from CDN customers) every few months.
  • #70 Numerous ‘storage cloud’ providers (or ‘Storage as a Service’) have emerged that can provide data storage and delivery in several continents, offering S.L.A. backed performance and uptime promises for their services.
  • #71 It is easy to see why storage clouds provide a compelling alternative to traditional CDNs for content producers that transfer significant amounts of data to their customers.
  • #72 1. MetaCDN is more likely to meet the needs of content creators than a single provider could. 2. There is no ‘unified’ or familiar interface for all storage clouds. Consider Amazon S3, Nirvanix SDN, Mosso Cloud Files and Microsoft Azure Storage Service. These four cloud storage providers have four separate access APIs that a developer would need to learn to access these services. 3. If a content creator attempted to utilise these providers themselves, they would essentially need to perform the load balancing and redirection themselves at their origin sites (complex!)
  • #73 The service is presented to end-users as a web portal for small or ad-hoc deployments or as Web Services (currently under development) for integration of customers with more complex and frequently changing content delivery needs. The web portal was developed using Java Enterprise and Java Server Faces (JSF) technologies, with a MySQL back-end to store user accounts, deployments, and the capabilities and pricing of service providers. Introduce connectors, and major components.
  • #74 The MetaCDN system integrates with each storage provider via connectors that provide an abstraction to hide the complexity arising from the differing ways each provider allows access to their systems. The connectors encapsulate basic operations like creation, deletion and renaming of files and folders. If an operation is not supported on a particular service, then the connector for that service should throw a FeatureNotSupportedException.
  • #75 1. MetaCDN deploys as many replicas as possible to all available locations. 2. A user nominates regions and MetaCDN matches the requested regions with providers that service those areas. 3. MetaCDN deploys as many replicas in the locations requested by the user as their storage and transfer budget will allow, keeping them active until that budget is exhausted. 4. MetaCDN deploys to providers that match specific QoS targets that a user specifies, such as average throughput or response time from a particular location, which is tracked by persistent probing from the MetaCDN QoS monitor.
  • #78 The MetaCDN database tracks all pertinent information such as users of the system, credentials for various providers, details about the providers' capabilities, pricing and footprint, and details of replicas deployed.
  • #79 Using the web portal, users can sign up for an account on the MetaCDN system, and enter credentials for any cloud storage or other provider they have an account with. Once this simple step has been performed, they can utilise the MetaCDN system to intelligently deploy content onto storage providers according to their performance requirements and budget limitations.
  • #81 A MetaCDN user is required to register the credentials of providers they have accounts with. Once this step is done they do not need to worry about how to interact with each of the providers. Eventually we would like MetaCDN users to not require accounts with specific providers - rather MetaCDN would provide consolidated billing of users for storage and transfer of content.
  • #82 The MetaCDN "Control Panel" gives easy access to the core features of the service. You can deploy content, view existing deployments (via a high level content view or a detailed replica view), and view a deployment map overlaid onto Google Maps or Google Earth.
  • #83 Here we can see an example of geographical-based deployment. A user nominates regions and MetaCDN matches the requested regions with providers that service those areas. The user also specifies the desired lifetime of the deployment, after which the replicas will be removed.
  • #84 We can view details of our past deployments. We store and track information such as the origin id (i.e. the original source of the content), a unique GUID, the MetaCDN URL that represents the deployment, the number of times this content has been downloaded, the last time this content was downloaded and how many replicas were generated from this deployment.
  • #85 Here we can see the specific replicas that have been generated from our various deployments. For each replica, we can see which provider and location was utilised, the public URL of the replica, the number of times the replica has been downloaded, the last access time of a specific replica and options to modify, delete, or view the replica if we wish to fine tune our deployment.
  • #86 We can get a bird's-eye view of where our replicas are stored, and how many are stored in each location. MetaCDN generates a KML file for each user that is used to overlay on Google Maps (shown here) or we can view our deployments in Google Earth. We expect to overlay more useful information in these views in the near future, such as the cost expenditure at each location and the location of client (i.e. file consumer) hotspots.
  • #87 A web service interface is under development that will make all the functionality of the web portal available in a programmatic fashion. Obviously it's not feasible to deploy thousands of files manually via the web portal, so we need to provide the facility for advanced customers to scale out easily and rapidly.
  • #88 With multiple sources (and multiple URLs), the complexity of load balancing is imposed on the origin / content provider. With a single namespace we can have coarse and fine-grained control via DNS redirection and layer 4/7 load balancing: http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1 http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1&policy=RAN http://www.metacdn.org:8080/MetaCDN/FileMapper?itemid=1&policy=GEO
  • #90 During the development phase, only one copy of the portal/redirector is running, in Melbourne, Australia. The plan is to deploy portals in several locations across the US, Asia and Europe. We will see from the next slide why this is necessary.
  • #91 Let's assume that a consumer in the USA was accessing a replica directly (i.e. it magically knew the best replica to select), or via MetaCDN. Here we can see that there is around 0.4 seconds of overhead, which is predominantly the round-trip time to access the gateway in Australia.
  • #92 When a consumer has a gateway that is close to it (in this case, a consumer in Australia is utilising a local gateway) the overhead is significantly smaller, in the order of 0.05 seconds per request. It is obvious that local gateways are needed in key areas to maximise performance.
  • #93 In the second half of 2008 we evaluated the two major cloud storage providers at the time, Amazon S3 and Nirvanix SDN. We ran the test over 24 hours from a variety of client and replica locations to see whether the providers demonstrated sufficient performance (i.e. throughput and response times) to act as a "poor man's" CDN.
  • #94 In 5 out of 6 client locations there were at least 2 cloud-hosted replicas each that delivered throughput that is consistent with what we would expect from a traditional CDN service. (Throughput is in kilobytes per second.)
  • #95 In 4 out of 6 client locations there were 1-3 cloud-hosted replicas that delivered response times consistent with what we would expect from a traditional CDN service. It is worth noting here that these times represent the end-to-end latency and connection time (i.e. an HTTP connection is made); they are not simply ping measurements.
  • #97 Deploy EC2 instances
  • #98 UltraDNS or Dynect are DNS providers that allow you to update your ADNS entries via WS. FTP/WebDAV support will be useful in locations that cloud providers do not (and are unlikely to) service.
  • #99 There is a lot of demand from customers to move away from YouTube and Vimeo Flash video hosts and host their own streaming content directly on their origin site. This way they control the look and feel and ad monetization of their content.
  • #104 Many open questions - these are just a few of them. I don’t have the answers - many exciting research opportunities here.
  • #105 Currently there are no well-defined standards for the different cloud offerings, so it is not trivial to switch between them, resulting in a degree of vendor lock-in. From a cloud consumer’s perspective it should be easy to switch from one provider to another, if you are dissatisfied with the service, reliability or value for money you are receiving. I certainly believe that standards are needed for cloud computing and cloud storage providers but we need to be careful not to impose them too early and stifle all the great innovation that is happening in the cloud space right now. We may find that the dominant provider ends up setting the standards by default.
  • #106 Hundreds of premade AMIs are available through Amazon. VMware has the VMware Virtual Appliance Marketplace (over 1000). VBDs can be based on physical disk partitions, raw image files, QCOW images, and logical volume management (LVM) volumes. Many independent websites host hundreds of pre-made Xen images.
  • #107 OVF has been defined as "open, secure, portable, efficient and extensible format for the packaging and distribution of software to be run in virtual machines". OVF is not tied to any particular hypervisor or processor architecture.
  • #120 It remains unclear whether cloud computing providers can meet regulatory compliance requirements for data storage, security, segregation and availability for industries like healthcare and banking, although there have been some success stories in the US, where a HIPAA (Health Insurance Portability and Accountability Act) compliant health insurance claim processing system was deployed on Amazon's cloud services. Care needs to be taken in these instances to ensure sensitive data is protected when hosted on external cloud computing providers.
  • #121 Innovative pricing models for Cloud Computing Services - Currently the pricing for Cloud Computing services is very static. Something that can be derived from traditional utilities like electricity is the notion of peak and off peak pricing. This spreads the demand, making capacity planning more predictable for the cloud computing providers, and can save money for cloud consumers that are flexible enough to use the services in off peak times.
  • #122 Innovative service models - Allow cloud consumers to be cloud providers. In the same fashion that end-users and businesses can sell back electricity generated by solar energy to the power grid, allow them to contribute to computational and storage clouds. It would be great for companies to sell back their under-utilised resources using the same technology cloud providers use. This could be a win-win situation as companies like Amazon, who only have a presence in the US and Europe, could resell these resources and expand their footprint into places like Australia, Asia and Africa.
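
The connector abstraction described in note #74 can be illustrated with a minimal sketch. This is not MetaCDN's actual code (the notes say the portal was built with Java technologies); it is a Python illustration under stated assumptions. Only the exception name FeatureNotSupportedException comes from the notes; the class names Connector, InMemoryConnector and NoRenameConnector and all method names are hypothetical.

```python
from abc import ABC, abstractmethod


class FeatureNotSupportedException(Exception):
    """Raised when a provider cannot perform the requested operation."""


class Connector(ABC):
    """Hypothetical common interface that hides one storage provider's API."""

    @abstractmethod
    def create_file(self, path, data):
        """Store data under path on the provider."""

    @abstractmethod
    def delete_file(self, path):
        """Remove path from the provider."""

    def rename_file(self, old_path, new_path):
        # Default behaviour: a provider without rename says so explicitly.
        raise FeatureNotSupportedException(
            f"{type(self).__name__} does not support rename"
        )


class InMemoryConnector(Connector):
    """Stand-in provider backed by a dict; supports every operation."""

    def __init__(self):
        self.files = {}

    def create_file(self, path, data):
        self.files[path] = data

    def delete_file(self, path):
        del self.files[path]

    def rename_file(self, old_path, new_path):
        self.files[new_path] = self.files.pop(old_path)


class NoRenameConnector(InMemoryConnector):
    """Stand-in provider that lacks rename and falls back to the base default."""

    def rename_file(self, old_path, new_path):
        return Connector.rename_file(self, old_path, new_path)
```

Callers program against Connector only, and catch FeatureNotSupportedException where an operation is optional, so adding a new provider means adding one subclass rather than touching the deployment logic.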
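
The single-namespace redirection in note #88 can be sketched in the same spirit. The notes show only the URL shapes (itemid, and an optional policy of RAN or GEO); the sketch below assumes RAN means a random replica and GEO means the geographically closest one, and it assumes RAN as the default when no policy is given. The function names, the replica table layout and the example hosts are all hypothetical.

```python
import math
import random
from urllib.parse import urlparse, parse_qs


def haversine_km(a, b):
    """Great-circle distance in km between two (lat, lon) pairs in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (*a, *b))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def choose_replica(url, replicas, client_location, rng=random):
    """Pick a replica URL for a FileMapper-style request.

    replicas maps an item id to a list of (replica_url, (lat, lon)) pairs.
    """
    query = parse_qs(urlparse(url).query)
    item_id = query["itemid"][0]
    # Assumed default: the notes do not say what happens without a policy.
    policy = query.get("policy", ["RAN"])[0]
    candidates = replicas[item_id]
    if policy == "GEO":
        # Assumed meaning: the replica closest to the client.
        nearest = min(candidates,
                      key=lambda r: haversine_km(client_location, r[1]))
        return nearest[0]
    if policy == "RAN":
        # Assumed meaning: a uniformly random replica.
        return rng.choice(candidates)[0]
    raise ValueError(f"unknown policy: {policy}")


# Hypothetical replica table: one copy in the eastern US, one in Ireland.
REPLICAS = {
    "1": [
        ("http://us.example/file", (39.0, -77.5)),
        ("http://eu.example/file", (53.3, -6.3)),
    ]
}
```

A client in London (51.5, -0.1) requesting `...FileMapper?itemid=1&policy=GEO` would be sent to the Irish replica under these assumptions, while the same request with policy=RAN could be sent to either.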