Introduction to Cloud Computing
Published in Technology, Business
Transcript

  • 1. Introduction to Cloud Computing Marin Dimitrov (technology watch #3) Apr 2010
  • 2. Contents • Introduction • Cloud Computing platforms • Programming for the Cloud • Semantic Web on the Cloud Cloud Computing Apr 2010 #2
  • 3. Contents, Part I: Introduction
  • 4. Cloud Computing - NIST definition
      • “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction.”
      • Delivery models
          – IaaS (Infrastructure as a Service) - the consumer uses "fundamental resources" such as processing power, storage, networking components or middleware. The consumer can control the operating system, storage, applications and possibly networking
          – PaaS (Platform as a Service) - the consumer uses a hosting environment for their applications and has control over the applications (and some control over the hosting environment), but does not control the infrastructure on which they are running
          – SaaS (Software as a Service) - the consumer uses an application, but does not control the infrastructure on which it is running (OS, hardware)
  • 5. XaaS spectrum – Google, Amazon, Microsoft (services laid out along a SaaS / PaaS / IaaS axis)
      • Google: Gmail, Google Apps, App Engine, BigTable / MegaStore, Google Storage
      • Amazon: Elastic MapReduce, SimpleDB, Relational Database Service, Flexible Payment Service, EC2, Simple Queue Service, Simple Notification Service, Elastic Block Storage, S3 / RRS, CloudWatch / Auto Scaling, Elastic Load Balancer
      • Microsoft: SQL Azure, Blob storage, Azure Computing, Queues, Load Balancer
  • 6. Cloud Computing - Essential characteristics (NIST)
      • Rapid elasticity – the ability to scale resources both up and down as needed. To the consumer, the Cloud appears to be infinite, and the consumer can purchase as much or as little computing power as they need
      • Measured service – aspects of the Cloud service are controlled and monitored by the Cloud provider. This is crucial for billing, access control, resource optimization & capacity planning
      • On-demand self service – a consumer can use cloud services as needed without any human interaction with the cloud provider
      • Ubiquitous network access – the Cloud provider’s capabilities are available over the network and can be accessed through standard mechanisms
      • Resource pooling – allows a Cloud provider to serve its consumers via a multi-tenant model; resources are (re)assigned according to consumer demand
  • 7. Cloud Computing - deployment models (NIST)
      • Public cloud
          – Infrastructure owned by some organisation but sold to 3rd parties
          – E.g. Amazon Web Services, Google AppEngine, Windows Azure
      • Private cloud
          – Internal infrastructure for a single organisation (on or off-premise)
          – E.g. VMware vCloud, IBM Cloudburst, Microsoft Hyper-V
      • Community cloud
          – Infrastructure shared by several organisations, targeting a specific community
          – E.g. OpenCirrus (HP, Intel, Yahoo, KIT, CMU, …)
      • Hybrid cloud
          – Composition of the above
          – E.g. AWS Virtual Private Cloud
  • 8. Cloud computing – business drivers
      1. Business agility
          – Faster time to market
              • No major upfront commitment & investment in infrastructure
          – Scalability & elasticity
              • Instant on-demand provisioning
              • Shifting the risk of over-/under-provisioning to the cloud provider
      2. Focus
          – Outsource non-core tasks to the cloud provider
      3. Pay-as-you-go
          – Speed up new project launching & rollout (start small, add resources when needed)
          – No need for complex planning ahead
          – Turn fixed costs (CapEx) into variable costs (OpEx)
  • 9. Some cloud use cases
      • Overflow buffer
          – Provision only for the average load rather than for peak loads; the cloud absorbs the overflow
      • Seasonal business
          – E.g. Walmart has a 4:1 peak-to-average ratio (source?)
      • Small startups time-to-market
          – Less upfront investment, more focus on core competencies
      • Experimental playground
          – Roll out experimental projects without major equipment purchases
      • Speedup of large scale batch operations
          – 1000 servers for 1 hour cost the same as 1 server for 1000 hours
          – More cost-efficient computing (off-peak tariffs & time zones)
      • Unforeseeable events
          – E.g. sudden traffic spikes to web sites (volcanoes, anyone?)
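The batch speedup claim on this slide is plain arithmetic under per-instance-hour billing. A minimal sketch, assuming the $0.085/h on-demand rate for a small EC2 instance quoted later in this deck and assuming whole-hour billing; the function name `batch_cost` is made up for illustration:

```python
from decimal import Decimal

# Assumed rate: on-demand price of a small EC2 instance as quoted later in
# this deck (USD per instance-hour, Apr 2010).
HOURLY_RATE = Decimal("0.085")


def batch_cost(servers: int, hours: int, rate: Decimal = HOURLY_RATE) -> Decimal:
    """Cost of running `servers` instances for `hours` whole hours each."""
    return servers * hours * rate


wide = batch_cost(servers=1000, hours=1)    # job finishes after one hour
narrow = batch_cost(servers=1, hours=1000)  # same work on one machine, about six weeks
print(wide, narrow, wide == narrow)
```

The bill depends only on the product of servers and hours, so a job that parallelises well can finish much sooner at the same price.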
  • 10. Cloud-able applications
      • Typical characteristics
          – Non mission critical
          – Need >99% uptime
          – Low bandwidth / higher latency tolerance
          – Relaxed security requirements
          – Few integration points
          – E.g.
              • Batch operations (speedup at the same price!)
              • One-time large scale processing
      • Barriers to cloud migration
          – Security & trust
          – Lack of SLA
          – Lack of standardization (vendor lock-in)
  • 11. Cloud Computing – pros & cons (C) Dion Hinchcliffe
  • 12. Contents, Part II: Cloud Computing Platforms (AWS, Google AppEngine, Windows Azure)
  • 13. XaaS spectrum – Google, Amazon, Microsoft (again; services laid out along a SaaS / PaaS / IaaS axis)
      • Google: Gmail, Google Apps, App Engine, BigTable / MegaStore, Google Storage
      • Amazon: Elastic MapReduce, SimpleDB, Relational Database Service, Flexible Payment Service, EC2, Simple Queue Service, Simple Notification Service, Elastic Block Storage, S3 / RRS, CloudWatch / Auto Scaling, Elastic Load Balancer, Virtual Private Cloud
      • Microsoft: SQL Azure, Blob storage, Azure Computing, Queues, Load Balancer
  • 14. Amazon Web Services
      • http://aws.amazon.com/
      • Xen VMs, 1 ECU = 1.2GHz AMD Opteron, US/EU prices

        EC2 instance   RAM (GB)  CU* (cores)  HDD (GB)  Bit  $/h on demand  $/h spot  $/h reserved
        S                   1.7  1 (1)             160   32          0.085      0.03          0.03
        L                   7.5  4 (2)             850   64          0.34       0.13          0.12
        XL                 15    8 (4)            1690   64          0.68       0.24          0.24
        High-mem XL        17.1  6.5 (2)           420   64          0.50       0.18          0.17
        High-mem 2XL       34.2  13 (4)            850   64          1.20       0.43          0.42
        High-mem 4XL       68.4  26 (8)           1690   64          2.40       0.82          0.84
        High-CPU M          1.7  5 (2)             350   32          0.17       0.06          0.06
        High-CPU XL         7    20 (8)           1690   64          0.68       0.24          0.24
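To put the hourly prices on this slide into perspective, the sketch below converts them into a cost for a month of continuous uptime. It uses three rows of the price table and assumes a 720-hour (30-day) month. It covers hourly rates only, so any one-time reservation fee is not modelled. The names `PRICES` and `monthly_cost` are illustrative.

```python
from decimal import Decimal

# Hourly prices (USD) copied from the slide: (on-demand, spot, reserved).
PRICES = {
    "S": ("0.085", "0.03", "0.03"),
    "L": ("0.34", "0.13", "0.12"),
    "XL": ("0.68", "0.24", "0.24"),
}
MODELS = ("on_demand", "spot", "reserved")


def monthly_cost(instance: str, hours: int = 720) -> dict:
    """Cost per pricing model for `hours` of uptime (720 h is an assumed 30-day month)."""
    return {model: Decimal(price) * hours
            for model, price in zip(MODELS, PRICES[instance])}


for name in PRICES:
    print(name, monthly_cost(name))
```

Spot prices fluctuate with demand, so the spot column is best read as a snapshot rather than a guaranteed rate.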
  • 15. Amazon Web Services (2)
      • Simple Storage Service (S3)
          – Eventually consistent blob storage (SLA available)
          – Max 5GB per object, REST+SOAP API
          – Storage $0.15/GB/mo, transfer $0.15/GB, $0.10 per 100K API calls
      • Elastic Compute Cloud (EC2)
          – Xen VM, Amazon Machine Image (AMI), no SLA
      • Elastic Block Storage (EBS)
          – Up to 1TB storage to be used by EC2 instances (attached devices)
          – Raw/unformatted block devices (create your own filesystem on top)
          – Replicated
          – $0.10/GB/mo, $0.10 per 1 million I/O ops (iostat)
  • 16. Amazon Web Services (3)
      • Simple Queue Service
          – Persistent, reliable, secure, distributed queue (no SLA)
          – Message size 8KB, autodelete 4 days
          – Duplicate and out-of-order delivery may occur
          – Price: $0.15/GB transfer, $0.10 per 100K API calls
      • Simple Notification Service
          – Reliable, secure & scalable pub/sub service (no SLA)
          – Protocols: HTTP, e-mail, SQS
          – Price: $0.15/GB transfer, $0.06 per 100K API calls, price per 100K notifications: $0.06 (HTTP), $2.00 (e-mail), free (SQS)
      • SimpleDB
          – Distributed column store (built on Erlang)
          – Consistent or eventually consistent reads, flexible schema
          – $0.14/hour consumed, $0.15/GB transfer, $0.25/GB/mo storage
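Because the queue may deliver a message twice or out of order, consumers are usually written to tolerate both. A minimal, library-free sketch of one common approach follows. The fields `message_id` and `seq` are hypothetical values that the producer is assumed to put into each message body; they are not SQS features, and no AWS API is called here.

```python
from dataclasses import dataclass, field


@dataclass
class IdempotentConsumer:
    """Applies each message at most once, keyed by an application-level ID."""

    seen: set = field(default_factory=set)
    applied: list = field(default_factory=list)

    def handle(self, message: dict) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message["message_id"] in self.seen:
            return False  # duplicate delivery: ignore it
        self.seen.add(message["message_id"])
        self.applied.append(message)
        return True

    def in_order(self) -> list:
        """Applied messages sorted by the producer-assigned sequence number."""
        return sorted(self.applied, key=lambda m: m["seq"])


consumer = IdempotentConsumer()
deliveries = [
    {"message_id": "b", "seq": 2},
    {"message_id": "a", "seq": 1},
    {"message_id": "b", "seq": 2},  # redelivered
]
results = [consumer.handle(m) for m in deliveries]
ordered_ids = [m["message_id"] for m in consumer.in_order()]
```

In a real deployment the set of seen IDs would need to live in shared, durable storage so that several consumers can agree on it; the in-memory set only illustrates the idea.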
  • 17. Amazon Web Services (4)
      • Relational Database Service
          – MySQL (no SLA)
          – Automated backup and scaling
          – $0.11 to $3.10 per hour (instance type), $0.10/GB/mo storage, $0.10 per million I/O ops, $0.15/GB transfer
      • Elastic MapReduce
          – Based on Hadoop
          – Price: EC2 instance price + premium ($0.01 - $0.42/hour)
      • CloudWatch, Auto Scaling, Elastic Load Balancer
          – Monitoring, auto scaling & load balancing for EC2
      • Virtual Private Cloud
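Monitoring and auto scaling work together: a monitored metric drives a rule that grows or shrinks the fleet. The sketch below shows such a rule in its simplest threshold form. It is not the AWS Auto Scaling API, and the thresholds, fleet bounds and the name `desired_instances` are all assumptions chosen for illustration.

```python
def desired_instances(current: int, avg_cpu: float, *,
                      low: float = 0.30, high: float = 0.70,
                      minimum: int = 1, maximum: int = 20) -> int:
    """Return the fleet size after one evaluation of a threshold rule.

    `avg_cpu` is the average CPU utilisation of the fleet as a fraction (0..1).
    """
    if avg_cpu > high:
        target = current + 1   # overloaded: add one instance
    elif avg_cpu < low:
        target = current - 1   # mostly idle: remove one instance
    else:
        target = current       # within the comfort band: no change
    return max(minimum, min(maximum, target))
```

Changing the fleet by one instance per evaluation and keeping a gap between the two thresholds are simple ways to avoid oscillating between scale-up and scale-down.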
  • 18. Google AppEngine
      • http://code.google.com/appengine/
      • Features
          – Custom JVM (lots of limitations)
          – Servlet container, JSP
          – Datastore based on BigTable (column store, consistent, C+P)
          – JDO/JPA
          – Google infrastructure services: URL fetch, mail
          – Memcache (in-memory distributed key/value cache)
          – Task queues & scheduler
          – Development: local dev server, Eclipse plugins, administration
      • Pricing
          – traffic/GB $0.10 ($0.12); CPU/h $0.10; storage/GB/mo $0.15; e-mail $1 per 10K
  • 19. Google AppEngine (2) (C) Dan Sanderson / O’Reilly
  • 20. Google AppEngine (3)
      • Restrictions
          – Applications run in a restricted JVM sandbox
              • No threads, no System calls, limited reflection
          – No sub-process forking
          – Connections
              • Outbound – only URL fetch & mail
              • Inbound – only HTTP(S)
          – No filesystem writes (limited read access), use datastore instead
          – Limits
              • Request duration – 30 sec
              • Request/response size – 10 MB (datastore request/response – 1 MB)
              • File size – 10 MB, number of files – 3,000
              • Datastore: entity size – 1 MB, property values – 1000, entities per batch – 500
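Limits such as 500 entities per batch are typically handled by splitting the work into chunks on the client side. A minimal sketch of that chunking step, assuming the limit quoted on this slide; the datastore call that would consume each chunk is deliberately not shown, and the helper name `batches` is illustrative.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Limit quoted on the slide (Apr 2010).
MAX_ENTITIES_PER_BATCH = 500


def batches(items: Iterable[T],
            size: int = MAX_ENTITIES_PER_BATCH) -> Iterator[List[T]]:
    """Yield consecutive chunks of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


sizes = [len(chunk) for chunk in batches(range(1200))]
```

Each chunk would then be handed to one batch write. The 1 MB request size limit still applies separately, so large entities may call for a smaller chunk size.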
  • 21. Google AppEngine (4)
      • Datastore
          – Based on BigTable, distributed column-store
              • Entities and multi-valued properties
              • Entities have a unique key & a type (kind)
              • Flexible schema
          – Transactional, consistent
          – JDO/JPA interface
      • Queries
          – JDOQL: entity kind + property value restrictions + sort order
          – Example: select from Person where lastName = … && height < … order by height desc
          – Cursors can be specified (query range)
          – Query result set is materialised in a predefined index
              • Query execution only fetches data from the existing index
              • Queries with the same kind + property restriction operators (but different values) + same sort order share the same index
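The index-sharing rule on this slide can be modelled by reducing each query to a "shape" that ignores the concrete filter values. The sketch below is only an illustrative model of that rule, not how App Engine computes or names its indexes; `Filter` and `index_signature` are made-up names.

```python
from typing import NamedTuple, Tuple


class Filter(NamedTuple):
    prop: str      # property name, e.g. "lastName"
    op: str        # restriction operator, e.g. "==" or "<"
    value: object  # concrete value; ignored when matching indexes


def index_signature(kind: str, filters: Tuple[Filter, ...],
                    order: Tuple[str, ...]) -> tuple:
    """Key identifying the index a query needs; filter values are left out."""
    return (kind, tuple((f.prop, f.op) for f in filters), order)


q1 = index_signature(
    "Person",
    (Filter("lastName", "==", "Smith"), Filter("height", "<", 180)),
    ("height desc",),
)
q2 = index_signature(
    "Person",
    (Filter("lastName", "==", "Jones"), Filter("height", "<", 165)),
    ("height desc",),
)
q3 = index_signature(
    "Person",
    (Filter("lastName", "==", "Smith"), Filter("height", "<", 180)),
    ("height asc",),
)
```

Here `q1` and `q2` differ only in their values and therefore map to the same signature, while `q3` changes the sort order and would need a different index.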
  • 22. Windows Azure
      • http://www.microsoft.com/windowsazure/
      • Components
          – Windows Azure
              • Fabric – management & monitoring of cloud services (Hyper-V)
              • Compute – hosted applications (.NET, C++, Java, …)
              • Storage – blob storage, tables, queues (REST interface)
          – SQL Azure
              • Cloud based MS SQL Server
          – AppFabric
              • Infrastructure services, Service registry
              • Access control
      • Pricing
          – CPU/h $0.12; storage $0.15/GB/mo, transfer $0.10 ($0.15), storage transactions – $1 per 1 million
  • 23. Windows Azure (2) (C) David Chappell
  • 24. Contents, Part III: Programming for the Cloud (Tools & APIs)
  • 25. Programming for the Cloud
      • Amazon
          – REST API
          – AWS Java SDK (http://aws.amazon.com/sdkforjava/)
          – AWS Toolkit for Eclipse (http://aws.amazon.com/eclipse)
          – Typica (http://code.google.com/p/typica/)
          – JetS3t (S3 only) http://jets3t.s3.amazonaws.com/index.html
      • Google AppEngine
          – AppEngine SDK (dev server, admin tools, Eclipse plugins)
          – Datastore: JDO, JPA, low-level Java API
          – Memcache: JCache + low-level Java API
          – URL fetch: java.net + low-level Java API
          – Mail: javax.mail + low-level Java API
          – Task queue, blob store, accounts: low-level APIs
  • 26. Programming for the Cloud (2)
      • jClouds
          – http://code.google.com/p/jclouds/
          – Cloud interoperability framework (AWS, Google AppEngine*, Windows Azure, GoGrid)
          – Mostly storage oriented functionality
      • Eucalyptus (C) Eucalyptus Inc.
          – http://www.eucalyptus.com/
          – Open source private cloud infrastructure
          – AWS compatible (EC2, EBS, S3)
          – Cross-hypervisor support
  • 27. Don’t forget…
      • Deploying on EC2 requires minimal to no modifications of existing software
      • EC2 has some big machines: 70GB RAM / 8 CPU cores
      • 1,000 servers for 1hr cost the same as 1 server for 1,000hrs
      • Data traffic (in/out) of the Cloud can be expensive
      • Storage is relatively cheap
      • Internal cloud traffic is free (AWS), e.g. accessing other applications/datasets on the Cloud
      • CPU price: uptime (EC2) vs. computing cycles (AppEngine)
      • EC2 spot instances (off-peak hours) are very, very cheap!
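The "uptime vs. computing cycles" point can be made concrete with the prices quoted earlier in the deck: $0.085 per hour of uptime for a small EC2 instance and $0.10 per CPU-hour consumed on AppEngine. The sketch below is a rough comparison only. It assumes a 720-hour month, treats a CPU-hour on both platforms as equivalent (which they are not, strictly), and ignores traffic, storage and any free quota; the function names are illustrative.

```python
from decimal import Decimal

EC2_SMALL_PER_UPTIME_HOUR = Decimal("0.085")  # billed while the VM is up
APPENGINE_PER_CPU_HOUR = Decimal("0.10")      # billed for CPU time consumed


def monthly_costs(utilisation: Decimal, hours: int = 720) -> tuple:
    """Return (EC2 cost, AppEngine cost) for a month at the given CPU utilisation.

    `utilisation` is the fraction of wall-clock time spent doing CPU work.
    """
    ec2 = EC2_SMALL_PER_UPTIME_HOUR * hours
    appengine = APPENGINE_PER_CPU_HOUR * hours * utilisation
    return ec2, appengine


def break_even_utilisation() -> Decimal:
    """Utilisation at which both billing models cost the same."""
    return EC2_SMALL_PER_UPTIME_HOUR / APPENGINE_PER_CPU_HOUR


print(monthly_costs(Decimal("0.1")), break_even_utilisation())
```

Under these assumptions a mostly idle application is cheaper when billed per computing cycle, while one that keeps the CPU busy most of the time is cheaper when billed per hour of uptime.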
  • 28. Contents, Part IV: Semantic Web on the Cloud
  • 29. Semantic Web on the Cloud
      • Public Data Sets on AWS
          – A lot of datasets hosted for free by Amazon
              • Freebase, UniGene, US Census, …
          – New data sets can be submitted too (after approval)
          – Full LOD cloud still not available (due to licensing issues)
      • SaaS
          – Virtuoso (AWS hosted), OpenCalais, …
      • “Semantic Cloud” initiatives (cloud interoperability & data integration)
          – E.g. fluidOps - Management & provisioning of semantic applications (SaaS) and datasources (DaaS) on the Cloud
              • Semantic Web apps as virtual appliances on the Cloud
              • LOD data sources as virtual resources on the Cloud (“Self-service” paradigm)
  • 30. Unified Cloud Computing
      • http://code.google.com/p/unifiedcloud/
      • Uses RDF for cloud data interoperability
  • 31. Useful and useless links
      • http://groups.google.com/group/cloud-computing
      • “An Essential Guide to Possibilities and Risks of Cloud Computing”
      • “Talking To Your CFO About Cloud Computing”
      • Nick Carr @ Atmosphere’2009
      • Introducing the Windows Azure platform
  • 32. Q&A: Questions?