Above the Clouds: A View From Academia

The closing keynote by Armando Fox at the Eduserv Symposium 2011 - Virtualisation and the Cloud.

  • You can watch a video of Armando Fox giving his presentation at the Eduserv Symposium 2011:

    http://www.eduserv.org.uk/newsandevents/events/eduserv-symposium-2011/closing-keynote---above-the-clouds---a-view-from-academia

Above the Clouds: A View From Academia

  1. Above the Clouds: A View From Academia
     Armando Fox, UC Berkeley
     EDUSERV Symposium, 12 May 2011
     Presentation slides licensed under Creative Commons Attribution-ShareAlike 3.0 Unported License.
     Image: John Curley http://www.flickr.com/photos/jay_que/1834540/
  2. Who Am I?
     • Research: Internet-scale systems; productive parallel programming
     • Teaching: software engineering
     • Writing: co-author of the Above the Clouds tech report
     • Disclaimer 1: I don’t speak for UC
     • Disclaimer 2: Relationship with Amazon
  3. How We Got Into the Cloud: RAD Lab’s 5-year Mission
     • Enable 1 entrepreneur to prototype a great Web app over a 3-day weekend, then deploy at scale
     • Key technology: statistical machine learning
     • Early critiques: “Demonstrate your ideas at scale!”
     • Moved from Sun Blackbox to EC2 in mid-2008
     • Feb. 2009: Above the Clouds tech report*
         • Over 50K downloads, influenced high-profile IT companies
     * abovetheclouds.cs.berkeley.edu, or CACM April 2010
  4. Outline: Two Themes
     • Academic clouds: public or private?
         • Theme 1: save money or improve research?
         • Theme 2: cloud user or cloud provider?
     • Assumption: familiar with cloud basics
         • Public/pay-as-you-go
         • Private/closed/“condo”
     • Non-goal: regulatory thickets around cloudifying “sensitive” information
  5. Public Cloud: CS Research
     • Over $350,000 spent on AWS since 2008
         • PhD student ~ US$75k/year => cloud ~ 1/3 student/mo.
     • Experiments: 100-300 nodes common, 900 max
         • large-scale storage, cloud programming, MapReduce
         • results at scale now required for top-tier conferences
         • most experiments last 0-4 hours
         • “Small” experiments also in cloud for convenience
     • Comparison: Sun BlackBox at Berkeley
         • $200k to acquire & install
         • $300k+ in hardware donations
         • staff: ≥0.5 FTE
  6. Public Cloud: CS Education
     • Great Ideas in Computer Architecture (reinvented Fall 2010): 190 students
     • Software Engineering for Software-as-a-Service: 70 students
     • Operating Systems: 70 students
     • Intro. Data Science: 30 students
     • Adv. topics in HCI: 20 students
     • Natural language processing: 20 students
     • Large-scale programming abstractions for the cloud: ~20 students (Fall 2011)
     • Administration, provisioning, sizing much easier on public cloud than on UC instructional computing
  7. Cloud Economics
     • “Private should be cheaper if you have stable utilization”
     [Figure: two plots of demand and capacity over time]
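The two demand/capacity plots on this slide did not survive extraction. As a rough illustration of the trade-off they usually depict, here is a minimal Python sketch. The demand trace and the function name are hypothetical and not taken from the talk. It compares a fixed capacity sized for peak demand (low utilization) with one sized for mean demand (unmet demand).

```python
def fixed_capacity_outcome(demand, capacity):
    """Return (utilization, unmet_demand) for a fixed capacity.

    demand   -- machines requested in each time slot
    capacity -- machines owned, constant across all slots
    """
    served = sum(min(d, capacity) for d in demand)
    unmet = sum(max(d - capacity, 0) for d in demand)
    utilization = served / (capacity * len(demand))
    return utilization, unmet


# Hypothetical bursty trace: mostly 10 machines, with a short burst of 100.
demand = [10, 10, 10, 100, 100, 10, 10, 10]

# Provision for the peak: every request is served, but most capacity idles.
peak_util, peak_unmet = fixed_capacity_outcome(demand, max(demand))

# Provision for the mean: better utilization, but the burst is not served.
mean_capacity = sum(demand) / len(demand)
mean_util, mean_unmet = fixed_capacity_outcome(demand, mean_capacity)

print(f"peak-sized: utilization {peak_util:.1%}, unmet {peak_unmet}")
print(f"mean-sized: utilization {mean_util:.1%}, unmet {mean_unmet}")
```

With a flat demand trace the two capacities coincide, which is the "stable utilization" case the quoted claim refers to.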
  8. $Private < $Public?
     • Capital: hardware, networking, power 5-7x cheaper at 100K’s scale (Hamilton 2007)
     • Operations: heavy automation => 1000’s of machines per FTE admin
     • R&D: cloud providers had to serve internal business need
     • Services: “Scale makes availability affordable”: wide-area disaster recovery facilities
     • Hidden/shared costs: power, cooling, staff, ...
  9. Hard to Compete on Cost
     • Zero-touch metering/billing infrastructure
     • Optimized for low margin
         • $0.08/hr: virtual CPU on EC2
         • $0.02-0.08/hr: as-available “spot instances”
         • Reserved (prepay 1-3 years, save if utilization > 25%)
         • $0.00/hr: 1 year free usage tier for all services
         • Private: ≥ $0.075/hr ($2000 private server amortized over 3 years with no indirect costs)
     • “Moving to EC2 would cost about a factor of 2”
         • Highly placed colleague at major social site
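The private-server figure on this slide can be checked with simple arithmetic. The sketch below amortizes the slide's $2000 server over 3 years of continuous operation. It also divides by utilization to get a cost per hour actually used. That second step is an illustrative extension, not something stated on the slide, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 365 * 24


def amortized_hourly_cost(purchase_price, years):
    """Hardware cost per wall-clock hour, ignoring all indirect costs."""
    return purchase_price / (years * HOURS_PER_YEAR)


def cost_per_used_hour(hourly_cost, utilization):
    """Cost per hour of useful work when the machine is busy only part of the time."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return hourly_cost / utilization


# Figures from the slide: $2000 server, 3-year amortization.
private_hourly = amortized_hourly_cost(2000, 3)

# Assumption for illustration: the server is busy half of the time.
private_at_half = cost_per_used_hour(private_hourly, 0.5)

print(f"amortized: ${private_hourly:.3f}/hr")
print(f"per used hour at 50% utilization: ${private_at_half:.3f}/hr")
```

The first figure comes out at roughly $0.076/hr, in line with the slide's "≥ $0.075/hr". Idle time and indirect costs only push the effective rate higher.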
  10. Try to smooth out peaks?
      • Not waiting in queues accelerates research!
          • Run several experiments simultaneously, each using 100’s of machines for 1-2 hours, without queueing up
          • Basic queueing theory: trade utilization vs. service time
          • Better performance isolation than private cloud (!)
          • N.B. for long jobs, some queueing may be OK
      • Corollary 1: cost-associative billing encourages research spontaneity
      • Corollary 2: incentive to stop using is important!
      • Effective metering & billing is key to on-demand usage model
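The "trade utilization vs. service time" point can be illustrated with the textbook M/M/1 queue, where the mean time a job waits before starting is ρ·S / (1 − ρ), for mean service time S and utilization ρ. This is a deliberately simplified single-server model chosen for illustration. A real shared cluster has many servers and a scheduler, and the talk does not specify a model.

```python
def mm1_mean_wait(service_time, utilization):
    """Mean time spent waiting in queue for an M/M/1 system.

    service_time -- mean job run time (any time unit)
    utilization  -- fraction of time the server is busy, in [0, 1)
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return utilization * service_time / (1 - utilization)


# Mean wait, in hours, for a 1-hour job at increasing utilization.
waits = {rho: mm1_mean_wait(1.0, rho) for rho in (0.5, 0.8, 0.9, 0.95)}

for rho, wait in waits.items():
    print(f"utilization {rho:.0%}: mean wait {wait:.1f} h")
```

The wait grows without bound as utilization approaches 100%, which is why a cluster run "hot" to maximize utilization tends to slow down the researchers queued behind it.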
  11. Example: wait times on UC Berkeley “Mako” cluster
      • Mako has 272 dual-socket (quad-core per socket) nodes with 24 GB RAM each
      • Source: ShaRCS (Shared Research Computing Services), presentation by UC Office of the President at the UC Cloud Summit, April 2011
  12. On the other hand... Big Data
      • Challenge: long-haul networking is the most expensive cloud resource, and is improving most slowly
      • Copy 8 TB to EC2 at ~20 Mbps*: ~35 days, ~$800
      • Ship four 2 TB drives to Amazon: 1 day, ~$150
      • Can private/shared networking resources be combined with public cloud to get best of both?

      Application                               Data generated per day
      DNA sequencing (Illumina HiSeq machine)   1 TB
      Large Synoptic Survey Telescope           30 TB; 400 Mbps sustained data rate between Chile and NCSA
      Large Hadron Collider                     60 TB

      * Simson L. Garfinkel, An Evaluation of Amazon’s Grid Computing Services: EC2, S3 and SQS, Technical Report TR-08-07, School of Engineering & Applied Sciences, Harvard University, 2008.
      Source: Ed Lazowska, eScience 2010, Microsoft Cloud Futures Workshop, lazowska.cs.washington.edu/cloud2010.pdf
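The upload estimate on this slide is straightforward to reproduce. The sketch below assumes decimal units (1 TB = 10^12 bytes) and a perfectly sustained 20 Mbps link. The $0.10 per GB inbound transfer price is an assumption chosen because it reproduces the slide's ~$800 figure. It is not stated in the talk.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def transfer_days(size_bytes, rate_bits_per_second):
    """Days needed to move size_bytes over a link sustained at the given bit rate."""
    return size_bytes * 8 / rate_bits_per_second / SECONDS_PER_DAY


def transfer_cost(size_bytes, dollars_per_gb):
    """Transfer fee at a flat per-GB price (1 GB = 10**9 bytes)."""
    return size_bytes / 1e9 * dollars_per_gb


data_bytes = 8e12       # 8 TB, decimal units (assumption)
link_bps = 20e6         # ~20 Mbps, the rate quoted on the slide
price_per_gb = 0.10     # assumed inbound price, not given in the talk

days = transfer_days(data_bytes, link_bps)
cost = transfer_cost(data_bytes, price_per_gb)

print(f"upload time: {days:.0f} days, transfer fee: ${cost:.0f}")
```

Under these assumptions the upload takes about 37 days, the same order as the slide's "~35 days", against 1 day for shipping drives.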
  13. On the other hand... Cloud Provider
      • Research that is hard to do on a public cloud:
          • scheduling/provisioning research
          • security: honeypots, malware containment, epidemic modeling
          • energy efficiency or other physical monitoring
          • experimenting with networking fabric, multicast, etc.
      • N.B., cloud provider research needs cloud users!
          • Example: Microsoft Research Silicon Valley “Sherwood” cluster (~240 nodes)
      • Demanding customers drove cloud research
  14. Nonprofit/Academic clouds
      • PlanetLab & Emulab
          • highly successful from their customers’ point of view
          • lots of great research, some of which might have been impossible on today’s public cloud
      • Academic/research clusters
          • Yahoo/IBM/M45 cluster, Google/IBM cluster, TeraGrid: primarily application-level research
          • OpenCirrus (HP/Intel/Yahoo/UIUC/IDA Singapore/Karlsruhe): bare-metal, federated, 1K+ cores/site
      • Access model: write proposal; closed community
      • Saving money is a non-goal (in fact, a subsidized investment by universities & industrial partners)
  15. OpenCirrus
      • Infrastructure costs increase with # sites
      • Claim: even at ~50% utilization, owning your infrastructure pays for itself in ~3 years
      • Source: R. Campbell et al., OpenCirrus..., Proc. 2011 Workshop on Hot Topics in Cloud Computing (HotCloud’09), June 2011 (to appear)
  16. Public & private clouds don’t see same benefits

      Benefit                                     Public   Private
      “Infinite” resources on-demand              Yes      No
      Instantaneous provisioning                  Yes      Varies
      Better hardware                             Yes      No
      Zero-commitment pay-as-you-go*              Yes      No
      Reduced costs from economy of scale         Yes      No
      Can do “cloud provider” research            No       Yes
      Can trust co-tenants                        No       Yes
      Better utilization through virtualization   Yes      Yes
      Quickly & inexpensively move big data       No       Yes
      Address data-custody regulatory issues      Varies   Yes

      * Implies ability to meter, and incentive to release idle resources
  17. So You Want to Build a Cloud...
      • Single point of failure?
      • Zero touch?
      • Hidden costs?
  18. Single point of failure
      • 30+ hour EBS outage on 21 April 2011
          • triggered by human error (network config change)
      • Georedundant services (Netflix) largely unaffected
          • At least, georedundancy was an available option!
      • Non-redundant services had catastrophic outages
      • Question: would “more” operational expertise have resolved outage faster?
  19. Metering and Billing
      • Billing is policy. Metering is mechanism.
          • Pay-as-you-go policy allows cost associativity
          • Any policy is only as flexible as its mechanism
          • Amazon’s mechanism: “zero touch” metering
          • So, Virtual Private Cloud ≠ your private cloud
      • Which of these need human intervention:
          • Signing up? Provisioning? Deploying? Billing?
          • Academic/nonprofit clouds don’t even try this
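One way to picture the policy/mechanism split is a metering layer that only records usage, with interchangeable billing policies computed from those records. The sketch below is purely illustrative: the record fields, function names, users, and flat fee are hypothetical and do not describe any provider's real system. The $0.08 rate is the EC2 virtual-CPU price quoted earlier in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UsageRecord:
    """Mechanism: what the metering layer records, with no pricing attached."""
    user: str
    machines: int
    hours: float


def pay_as_you_go(records, rate_per_machine_hour):
    """Policy A: charge each user for the machine-hours actually consumed."""
    bill = {}
    for r in records:
        charge = r.machines * r.hours * rate_per_machine_hour
        bill[r.user] = bill.get(r.user, 0.0) + charge
    return bill


def flat_fee(records, fee_per_user):
    """Policy B: every user who appears in the records pays the same fee."""
    return {r.user: fee_per_user for r in records}


# Hypothetical metered usage: same machine-hours, very different shapes.
records = [
    UsageRecord("alice", machines=100, hours=1.0),   # wide and short
    UsageRecord("bob", machines=1, hours=100.0),     # narrow and long
]

usage_bill = pay_as_you_go(records, 0.08)
flat_bill = flat_fee(records, 50.0)
print(usage_bill, flat_bill)
```

Under the pay-as-you-go policy, 100 machines for 1 hour cost the same as 1 machine for 100 hours, which is the cost associativity the slide refers to. That policy is only possible because the mechanism meters at machine-hour granularity.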
  20. Hidden Costs
      • Single billing scheme captures all costs, or must some costs be billed/accounted separately?
          • shared expenses: power, networking
          • general employment benefits/overhead for staff
      • Cost of keeping up with innovation
          • On average, AWS has deployed 1 new service every 2 months since EC2 beta launch*
      • Competition from new providers will exacerbate this
          • Microsoft Azure, VMware CloudFoundry, ...
      * 21 Web service APIs as of April 2011
  21. Two themes
      • Academic clouds: public or private?
          • Theme 1: save money or improve research?
          • Theme 2: cloud user or cloud provider?
      • Capability
          • Cloud accelerates and enables new research
          • Scale that can’t be achieved any other way
      • Cost
          • Will private cloud cost less? Is that the main goal?
          • Have hidden costs been accounted for?
          • Cost-associativity allows bursty use, encourages spontaneity, but needs fine-grained metering
  22. Two themes
      • Academic clouds: public or private?
          • Theme 1: save money or improve research?
          • Theme 2: cloud user or cloud provider?
      • Cloud provider research may require private cloud
          • Security, energy, bare-metal, cloud provisioning, ...
          • But, still need cloud users (customers) to drive/validate
          • Need public-cloud-level APIs, service reliability
      • Cloud user
          • Big data may impede some public-cloud-ready apps
          • Exotic architectures (SSD, in-memory DB, ...)
          • Regulatory issues...
  23. Summary
      • Public cloud shows how to “move slider” between insourcing & outsourcing
      • Unlikely to compete on cost with very large scale public clouds
      • So, how much can/should you outsource...
          • ...for technical reasons (types of research possible)?
          • ...for regulatory reasons (data privacy, etc.)?
      • Remember the non-obvious costs
          • Metering & billing, esp. for shared overheads
          • Keeping up with the ecosystem
  24. Thanks!
      • UC Berkeley Reliable Adaptive Distributed Systems Lab & Affiliates
      • UC Cloud Computing Task Force
      • Andy Powell & Eduserv
      [Image: RAD Lab Team in 2009]