An Introduction to Cloud Computing (2009)


This is a talk I gave in 2009 introducing cloud computing. It is a bit dated now.

Published in: Technology

  1. 1. An Introduction to Cloud Computing<br />Robert Grossman<br />December 8, 2009<br />
  2. 2. Part 1<br />Introduction<br />
  3. 3. What is a Cloud?<br />Clouds provide elastic, on-demand resources or services over a network, often the Internet, with the scale and reliability of a data center.<br />The NIST definition has become standard.<br />Cloud architectures are not new.<br />What is new:<br />Scale<br />Ease of use<br />Pricing model<br />
  4. 4. Scale is new.<br />
  5. 5. Elastic, Usage Based Pricing Is New<br />1 computer in a rack for 120 hours costs the same as 120 computers in three racks for 1 hour.<br />Elastic, usage-based pricing turns capex into opex.<br />Clouds can manage surges in computing needs.<br />
  6. 6. Simplicity Offered By the Cloud is New<br />... and you have a computer ready to work.<br />A new programmer can develop a program to process a container full of data with less than a day of training using MapReduce.<br />
  7. 7. Two Types of Clouds<br />On-demand resources & services over a network at the scale of a data center<br />On-demand, elastic computing instances (IaaS)<br />IaaS: Amazon EC2, S3, etc.; Eucalyptus<br />Supports many Web 2.0 applications/users<br />Large data clouds (Large Data PaaS)<br />GFS/MapReduce/Bigtable, Hadoop, Sector, …<br />Manage and compute with large data (say 100+ TB)<br />
  8. 8. Ease of use – With Google’s GFS & MapReduce, it is simple to compute with 10 terabytes of data over 100 nodes. With Amazon’s AMIs, it is simple to respond to a surge of 100 additional web servers.<br />
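The MapReduce programming model mentioned above can be sketched in a few lines of Python. This is a minimal single-process illustration, not Google's or Hadoop's API: the names `map_words`, `reduce_counts` and `run_mapreduce` are hypothetical, and a real system would spread the map and reduce phases over many nodes and a distributed file system.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(record: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def reduce_counts(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Reduce phase: sum all the values that were emitted for one key."""
    return key, sum(values)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], Tuple[str, int]],
) -> Dict[str, int]:
    """Hypothetical in-memory driver: map, group by key (shuffle), then reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: collect values per key
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


# Two tiny records stand in for the terabytes a real cluster would process.
word_counts = run_mapreduce(
    ["the cloud is elastic", "the cloud scales"], map_words, reduce_counts
)
```

The programmer writes only the two small functions; in a real deployment the framework, not the programmer, handles distribution and grouping, which is the ease-of-use point the slide makes.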
  9. 9. Cloud Architectures – How Do You Fill a Data Center?<br />Diagram labels: on-demand computing instances (Apps); on-demand computing capacity (Apps); Cloud Data Services (BigTable, etc.); Quasi-relational Data Services; Cloud Compute Services (MapReduce & Generalizations); Cloud Storage Services<br />
  10. 10. Varieties of Clouds<br />Architectural Model<br />Computing Instances vs Computing Capacity<br />Economic Model<br />Elastic, usage-based pricing, lease/own, …<br />Management Model<br />Private vs Public; Single vs Multiple Tenant; …<br />Programming Model<br />Queue Service, MPI, MapReduce, Distributed UDF<br />Computing instances vs computing capacity<br />Private internal vs public external<br />Elastic, usage-based pricing or not<br />All combinations occur.<br />
  11. 11. Payment Models<br />Buying racks, containers and data centers<br />Leasing racks, containers and data centers<br />Utility-based computing (pay as you go)<br />Moves capex to opex<br />Handles surge requirements (use 1000 servers for 1 hour vs 1 server for 1000 hours)<br />
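The surge arithmetic on this slide can be checked with a short calculation. This is only a sketch: the function name `usage_cost_cents` and the rate of 10 cents per server-hour are assumptions made for illustration, not prices quoted in the talk.

```python
def usage_cost_cents(servers: int, hours: int, rate_cents_per_server_hour: int) -> int:
    """Pay-as-you-go cost: the bill depends only on total server-hours used."""
    return servers * hours * rate_cents_per_server_hour


# Hypothetical rate, chosen only to make the arithmetic concrete.
RATE_CENTS = 10

surge_cost = usage_cost_cents(servers=1000, hours=1, rate_cents_per_server_hour=RATE_CENTS)
steady_cost = usage_cost_cents(servers=1, hours=1000, rate_cents_per_server_hour=RATE_CENTS)
```

Both cases consume 1000 server-hours, so under purely usage-based pricing they cost the same, while the surge case finishes in one hour instead of 1000.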
  12. 12. Management Models<br />Public, private and hybrid models<br />Single tenant vs multiple tenant (shared vs non-shared hardware)<br />Owned vs leased<br />Manage yourself vs outsource management<br />All combinations are possible<br />
  13. 13. Programming Model<br />on-demand computing instances<br />on-demand computing capacity<br />Amazon’s Simple Queue Service<br />MPI, sockets, FIFO<br />DryadLINQ<br />Azure services<br />MapReduce<br />Distributed UDF<br />
  14. 14. Applications<br />Apps<br />Compute Services<br />Data Services<br />Metadata Services<br />PaaS<br />Storage Services<br />Identity Manager<br />Virtual Machine Manager<br />Virtual Network Manager<br />IaaS<br />Network Transport<br />
  17. 17. Instances, Services & Frameworks<br />Hadoop DFS & MapReduce<br />Google AppEngine<br />Microsoft Azure<br />VMware<br />vMotion…<br />many instances<br />Amazon’s SQS<br />Azure Services<br />Amazon’s EC2<br />single instance<br />S3<br />instance<br />(IaaS)<br />service<br />framework<br />(PaaS)<br />operating system<br />
  18. 18. Part 2. Cloud Computing Industry<br />“Cloud computing has become the center of investment and innovation.” Nicholas Carr, 2009 IDC Directions<br />Cloud computing is approaching the top of the Gartner hype cycle.<br />
  19. 19. Cloud Computing Eco-System<br />No agreed-upon terminology<br />Vendors supporting data centers<br />Vendors providing cloud apps & services to end users<br />Vendors supporting the industry, i.e., those developing cloud applications and services for themselves or to sell to end users<br />Communities developing software, standards, benchmarks, etc.<br />
  20. 20. Cloud Computing Ecosystem<br />Consumers of Software as a Service<br />Providers of Software as a Service<br />Data Centers<br />Consumers of Cloud Services<br />Providers of Cloud Services<br />The Berkeley RAD Report on cloud computing divides the industry into these layers.<br />
  21. 21. Transition Taking Place<br />A handful of players are building multiple data centers a year and improving with each one.<br />This includes Google, Microsoft, Yahoo, …<br />A data center today costs $200 M – $400+ M<br />The Berkeley RAD Report points out the analogy with the semiconductor industry, where companies stopped building their own fabs and started leasing fabs from others as the cost of a fab approached $1B<br />
  22. 22. Data Center Operating Systems<br />Diagram labels: VM 1 … VM 5 and VM 1 … VM 50,000; workstation; Data Center Operating System<br />Data center services include: VM management services, business continuity services, security services, power management services, etc.<br />
  23. 23. Building Data Centers<br />Sun’s Modular Data Center (MD)<br />Formerly Project Blackbox<br />Containers used by Google, Microsoft & others<br />Data center consists of 10-60+ containers.<br />
  24. 24. Mindmeister Map of Cloud Computing<br />Dupont’s Mindmeister Map divides the industry:<br />IaaS, PaaS, Management, Community<br />
  25. 25. Part 3<br />Virtualization<br />
  26. 26. Virtualization<br />Virtualization separates logical infrastructure from the underlying physical resources to decrease the time needed to make changes, improve flexibility, improve utilization and reduce costs<br />Example: server virtualization. Use one physical server to support multiple logical virtual machines (VMs), which are sometimes called logical partitions (LPARs)<br />Technology pioneered by IBM in the 1960s to better utilize mainframes<br />
  27. 27. Idea Dates Back to the 1960s<br />App<br />App<br />App<br />CMS<br />CMS<br />MVS<br />IBM VM/370<br />IBM Mainframe<br />Native (Full) Virtualization<br />Examples: VMware ESX<br />
  28. 28. Two Types of Virtualization<br />Native (Full) Virtualization: Apps on Unmodified Guest OS 1 and Unmodified Guest OS 2, over a Hypervisor, over Physical Hardware. Examples: VMware ESX<br />Paravirtualization: Apps on Modified Guest OS 1 and Modified Guest OS 2, over a Hypervisor, over Physical Hardware. Examples: Xen<br />Using the hypervisor, each guest OS sees its own independent copy of the CPU, memory, IO, etc.<br />
  29. 29. Four Key Properties<br />Partitioning: run multiple VMs on one physical server; one VM doesn’t know about the others<br />Isolation: security isolation is at the hardware level<br />Encapsulation: entire state of the machine can be copied to files and moved around<br />Hardware abstraction: provision and migrate VM to another server<br />
  30. 30. Managing Virtual Machines<br />Provision VM<br />Schedule VM<br />Monitor VM<br />Self-service portal for VM<br />
  31. 31. Part 4<br />Technical differences between clouds for data intensive computing, databases and supercomputers<br />
  32. 32. Supercomputer Center Model<br />or<br />Data Center Model<br />
  33. 33. What Resource is Managed?<br />Supercomputer Center Model (local) and HPC Grid (distributed): scarce processors wait for data<br />Manage cycles: wait for an opening in the queue, scatter the data to the processors and gather the results<br />Data Center 2.0 Model and Distributed 2.0 Data Centers: persistent data waits for queries<br />Manage data: computation done locally, results returned<br />
  34. 34. Databases<br />vs<br />Data Clouds<br />Trading functionality for scalability.<br />
  35. 35. Trading Functionality for Scalability<br />
  36. 36. Not Everyone Agrees<br />David J. DeWitt and Michael Stonebraker, MapReduce: A Major Step Backwards, Database Column, January 17, 2008<br />
  37. 37. Part 5. Standards Efforts<br />Train gauge in Russia is 1520 mm<br />Train gauge in China is 1435 mm<br />Change of gauge at Ussuriisk (near Vladivostok) at the Chinese–Russian border<br />How can a cloud application move from one cloud storage service to another?<br />
  38. 38. Standards Efforts for Clouds<br />Distributed Management Task Force (DMTF)<br />Storage Networking Industry Association (SNIA)<br />Cloud Computing Interoperability Forum (CCIF)<br />Open Cloud Consortium (OCC)<br />Open Grid Forum (OGF)<br />Plus several others…<br />