My Other Computer is a Data Center (2010 v21)


This is a talk I gave recently about cloud computing from the perspective of thinking of a data center as your "other computer."


  1. An Overview of Cloud Computing: My Other Computer is a Data Center
     Robert Grossman, Open Cloud Consortium
     January 7, 2010
  2. Part 1. What is a Cloud?
  3. What is a Cloud?
     Software as a Service (SaaS)
  4. What Else is a Cloud?
     Platform as a Service (PaaS)
  5. Is Anything Else a Cloud?
     Infrastructure as a Service (IaaS)
  6. Are There Other Types of Clouds?
     Large Data Cloud Services (example: ad targeting)
  7. What is Virtualization?
  8. Idea Dates Back to the 1960s
     [Diagram: applications running on guest operating systems (CMS, CMS,
     MVS) hosted by IBM VM/370 on an IBM mainframe]
     Native (full) virtualization. Example: VMware ESX.
     Virtualization was first widely deployed with IBM VM/370.
  9. What Do You Optimize?
     Supercomputer goal: minimize latency and control heat.
     Data center goal: maximize data (with matching compute) and control cost.
  10. Scale Is New
  11. Elastic, Usage-Based Pricing Is New
      1 computer in a rack for 120 hours costs the same as
      120 computers in three racks for 1 hour.
      • Elastic, usage-based pricing turns capex into opex.
      • Clouds can be used to manage surges in computing needs.
  12. Simplicity Offered by the Cloud Is New
      ... and you have a computer ready to work.
      A new programmer can develop a program to process a container full
      of data with less than a day of training using MapReduce.
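The pricing claim on slide 11 is just the linearity of usage-based billing: the bill depends on machine-hours, not on how they are arranged. A minimal sketch, with a hypothetical hourly rate (the deck quotes no prices):

```python
def usage_cost(num_machines, hours, rate_per_machine_hour):
    # Usage-based pricing: the bill depends only on machine-hours consumed.
    return num_machines * hours * rate_per_machine_hour

# 1 machine for 120 hours and 120 machines for 1 hour consume the same
# 120 machine-hours, so they cost the same; the wide job finishes
# roughly 120x sooner. (The $0.10/hour rate is illustrative only.)
small_and_slow = usage_cost(1, 120, 0.10)
wide_and_fast = usage_cost(120, 1, 0.10)
```

This equivalence is what lets a surge in computing needs be absorbed by renting width instead of waiting.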
  13. [Figure slide: no extractable text]
  14. What Resource Is Managed?
      Supercomputer center model: manage cycles. Scarce processors wait
      for data; jobs wait for an opening in the queue; data is scattered
      to the processors and the results are gathered.
      Data center model: manage data. Persistent data waits for queries;
      computation is done locally and results are returned.
  15. Part 2. Data Centers as the Unit of Computing
      Cloud computing is at the top of the Gartner hype cycle.
      "Cloud computing has become the center of investment and innovation."
      (Nicholas Carr, 2009 IDC Directions)
  16. [Timeline figure: 1609, 30x (experimental science); 1670, 250x
      (simulation science); 1976 and 2004, 10x-100x (data science)]
  17. Requirements for Clouds
  18. Transition Taking Place
      A handful of players are building multiple data centers a year and
      improving with each one. This includes Google, Microsoft, Yahoo, ...
      A data center today costs $200M to $400+M.
      The Berkeley RAD Lab report points out an analogy with the
      semiconductor industry: companies stopped building their own fabs
      and started leasing fabs from others as fab costs approached $1B.
  19. Which Is the Operating System?
      Workstation: a hypervisor hosting VM 1 ... VM 5.
      Data center: a data center operating system hosting VM 1 ... VM 50,000.
  20. How Do You Program a Data Center?
  21. Some Programming Models for Data Centers
      Operations over a data center of disks:
      • MapReduce ("string-based")
      • User-defined functions (UDFs) over the data center
      • SQL and quasi-SQL over the data center
      • Data analysis / statistics over the data center
      Operations over a data center of memory:
      • Grep over distributed memory
      • UDFs over distributed memory
      • SQL and quasi-SQL over distributed memory
      • Data analysis / statistics over distributed memory
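The UDF model listed above amounts to shipping an arbitrary function to each partition of a distributed dataset and collecting the per-partition results. A minimal single-process sketch, not the API of any system named in the deck; `apply_udf` and `count_over_100` are hypothetical names:

```python
def apply_udf(partitions, udf):
    # In a real large data cloud each partition would live on a different
    # node; here a partition is just an in-memory list of records.
    return [udf(p) for p in partitions]

# Hypothetical UDF: count the records in a partition above a threshold.
def count_over_100(records):
    return sum(1 for r in records if r > 100)

per_partition_counts = apply_udf([[50, 150], [200, 300]], count_over_100)
```

MapReduce, quasi-SQL, and distributed statistics can all be seen as restricted or composed forms of this pattern.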
  22. Part 3. Open Cloud Consortium
  23. A U.S. 501(c)(3) not-for-profit corporation.
      • Supports the development of standards and interoperability frameworks.
      • Supports reference implementations for cloud computing.
      • Manages testbeds: Open Cloud Testbed, Intercloud Testbed, Open
        Science Data Cloud.
      • Develops benchmarks.
  24. OCC Members
      Companies: Aerospace, Booz Allen Hamilton, Cisco, InfoBlox, Open
      Data Group, Raytheon, Yahoo
      Universities: CalIT2, Johns Hopkins, Northwestern, University of
      Illinois at Chicago, University of Chicago
      Government agencies: NASA
      Organizations: Sector Project
  25. Open Cloud Testbed
      Networks: C-Wave, CENIC, Dragon, MREN
      Phase 2: 9 racks, 250+ nodes, 1000+ cores, 10+ Gb/s
      Software: Hadoop, Sector/Sphere, Thrift, KVM VMs, Eucalyptus VMs
  30. Intercloud Testbed
      Platform as a Service:
      • Cloud compute services; data & storage as a service
      • Large Data Cloud Interoperability Framework (working with the
        Infrastructure 2.0 Working Group; SNIA Cloud Data Management
        Interface, CDMI)
      • Dynamic infrastructure service linking IaaS and DaaS
      Infrastructure as a Service:
      • Virtual Data Centers (VDC), Virtual Networks (VN), Virtual
        Machines (VM), physical resources
      • Dynamic infrastructure service naming and linking entities in
        the IaaS layers
      • Open Cloud Computing Interface (OCCI), Open Virtualization
        Format (OVF)
  33. Open Science Data Cloud
      Includes a sky cloud and a bio cloud.
      Planning to work with 5 international partners (all connected with
      10 Gbps networks).
  34. MalStone (OCC-Developed Benchmark)
      Sector/Sphere 1.20 vs. Hadoop 0.18.3 with no replication, on Phase
      1 of the Open Cloud Testbed in a single rack. The data spanned 20
      nodes with 500 million 100-byte records per node.
  35. Some Lessons Learned (So Far)
      • Python over the Hadoop Distributed File System is surprisingly
        powerful.
      • Tuning Hadoop can be a large (unacknowledged) cost.
      • The performance of a cloud computation can be significantly
        degraded by just 1 or 2 nodes that are a bit slower.
      • Wide area clouds can be practical in some cases.
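The straggler lesson above follows from a simple property of bulk-synchronous jobs: the job is done only when the slowest node is. A toy sketch (illustrative numbers, not measurements from the deck):

```python
def job_time(node_times):
    # A bulk-synchronous cloud job finishes only when its slowest node
    # (the "straggler") does.
    return max(node_times)

# 20 nodes: 19 finish in 100 s, one straggler takes 150 s.
# The whole job is 50% slower even though 95% of the nodes were on time.
node_times = [100] * 19 + [150]
elapsed = job_time(node_times)
```

This is why systems in this space resort to speculative re-execution of slow tasks.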
  36. Part 4. Sector
  37. Sector Overview
      • Sector is fast, as measured by MalStone and Terasort.
      • Sector is easy to program: it supports UDFs, MapReduce, and
        Python over streams.
      • Sector does not require extensive tuning.
      • Sector is secure: a HIPAA-compliant Sector cloud is being set up.
      • Sector is reliable: Sector v1.24 supports multiple master node
        servers.
  38. Google's Large Data Cloud
      Google's stack (top to bottom): applications; compute services
      (Google's MapReduce); data services (Google's BigTable); storage
      services (Google File System, GFS).
  39. Hadoop's Large Data Cloud
      Hadoop's stack (top to bottom): applications; compute services
      (Hadoop's MapReduce); data services; storage services (Hadoop
      Distributed File System, HDFS).
  40. Sector's Large Data Cloud
      Sector's stack (top to bottom): applications; compute services
      (Sphere's UDFs); data services; storage services (Sector's
      Distributed File System, SDFS); routing & transport services
      (UDP-based Data Transport protocol, UDT).
  41. Generalization: Apply User-Defined Functions (UDFs) to Files in a
      Storage Cloud
      [Diagram: the map/shuffle step and the reduce step of MapReduce
      each shown as a UDF applied to files]
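The generalization on slide 41 can be sketched by writing word count as two UDFs over "files": one that maps and shuffles, one that reduces. This is a minimal single-process illustration of the idea, not Sphere's actual API; the function names are hypothetical:

```python
from collections import defaultdict

def map_shuffle_udf(files):
    # First UDF: map each record to (word, 1) pairs and shuffle them
    # into buckets keyed by word. Here a "file" is a list of records.
    buckets = defaultdict(list)
    for records in files:
        for record in records:
            for word in record.split():
                buckets[word].append(1)
    return buckets

def reduce_udf(buckets):
    # Second UDF: reduce each bucket to a single count.
    return {word: sum(ones) for word, ones in buckets.items()}

counts = reduce_udf(map_shuffle_udf([["a b a"], ["b"]]))
```

Seen this way, MapReduce is just one fixed composition of UDFs, and a system that applies arbitrary UDFs to files in a storage cloud strictly generalizes it.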
  42. Hadoop vs. Sector
      Source: Gu and Grossman, "Sector and Sphere," Phil. Trans. Royal
      Society A, 2009.
  43. Terasort: Sector vs. Hadoop Performance
      Sector/Sphere 1.24a vs. Hadoop 0.20.1 with no replication, on
      Phase 2 of the Open Cloud Testbed with co-located racks.
  44. Sector Applications
      • Distributing the 15 TB Sloan Digital Sky Survey to astronomers
        around the world (joint with JHU, 2005).
      • Managing and analyzing high-throughput sequence data (Cistrack,
        University of Chicago, 2007).
      • Detecting emergent behavior in distributed network data (Angle,
        won the SC 07 Analytics Challenge).
      • Image processing for high-throughput sequencing.
      • Wide area clouds (won the SC 09 Bandwidth Challenge with a 100
        Gbps wide area computation).
      • New ensemble-based algorithms for trees.
      • Graph processing.
  45. [Diagram of Cistrack components: web portal & widgets, elastic
      cloud services, database, analysis pipelines & re-analysis
      services, large data cloud services, ingestion services]
  46. Thank You
      For more information, please see