Svccg 2011-05-12


Published on

Slides for my talk at the Silicon Valley Cloud Computing Group Meetup on May 12 2011 at Yahoo's campus in Sunnyvale.

Published in: Technology, Business
  • No downloads for now - sorry. I may want to tweak the slides to add in a few attributions.

    Don't get too hung up on the 'Knowledge as a Service' tag. It's just an internal term referring to certain shared aspects of the SOA architecture for Yahoo applications.
    Are you sure you want to  Yes  No
    Your message goes here
  • Geoff Arnold presented the slides to Silicon Valley Cloud Computing Group May 12, 2011. Great meeting with 200+ folks at Yahoo last night.

    Visit and join us at

    Video recording of the talk is available at

    For other presentations from SVCCG at slideshare
    Are you sure you want to  Yes  No
    Your message goes here
  • Cool,

    Should the download is enabled, that's even better.

    Thanks & Good w/e

    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Total views
On SlideShare
From Embeds
Number of Embeds
Embeds 0
No embeds

No notes for slide

Svccg 2011-05-12

  1. 1. Cloud Computing at Yahoo!Lessons, Challenges and Futures<br />Geoff Arnold<br />May 12,2011<br />
  2. 2. Abstract<br />Yahoo operates a wide range of global and regional web services, from email to photo sharing, from sports and business news to ecommerce, from social networks to education. Our properties range in size from global behemoths like Yahoo Mail to hyper-local startups. The numbers are huge. Yahoo sites attract more than 680 million users; our Superbowl coverage generated 37 million clicks, and there were 1.2 billion page views of our Oscars site. And this scale carries through to our technology: we analyze 100 billion events every day, and we’re rolling out 135 websites on our new media technology platform this year.<br />
All of this is focused on one end: to deliver the premier digital experience to our users. And this leads to an interesting challenge: how can we apply all of our resources - computation, networking, user data, media, and science – to deliver a consistent and profitable experience, with agility and efficiency? For Yahoo, the answer has been to embrace cloud computing. In this talk, I’ll discuss what this really means for software developers within Yahoo, in terms of technologies and engineering practices, and how we intend to transform the development and delivery of Yahoo applications.<br />5/12/11<br />Yahoo! Presentation<br />2<br />
  3. 3. Agenda<br />Why this is different from the last 99 cloud computing presentations you’ve sat through<br />Motivation<br />Déjà vu<br />What we’ve learned<br />Where we’re going<br />5/12/11<br />Yahoo! Presentation<br />3<br />
  4. 4. Why this is different from the last 99 cloud computing talks you’ve sat through<br />No definition of cloud<br />If you don’t know it by now….<br />I’m not selling anything<br />Except maybe a job opportunity or two….<br />No cool new technologies<br />It’s mostly about operational and systems refactoring<br />More questions than answers<br />Progress report<br />5/12/11<br />Yahoo! Presentation<br />4<br />
  5. 5. Motivation<br />Yahoo! Presentation<br />5<br />5/12/11<br />
  6. 6. OurMission<br />Create a global, scalable platform built on science that enables rapidinnovation and delivery of personalized, monetizable experiences across devices.<br />5/12/11<br />6<br />Yahoo! Presentation<br />
  7. 7. Today’s Architecture<br />5/12/11<br />Yahoo! Presentation<br />7<br />
  8. 8. INTERNET<br />EDGE<br />MEDIA <br />ADVERTISING <br />DATA HWY<br />PLATFORMS<br />PIPELINES<br />BT<br />COKE<br />KEYSTONE<br />YELLOWSTONE<br />MAIL/ANTI-SPAM<br />GD STONE<br />CONTENT AGILITY<br />User Data<br />Media & Files<br />UPS / UDS / UDB / <br />SHERPA <br />MOBSTOR<br />HADOOP<br />ODS<br />STORAGE<br />MAIL<br />FRONTPAGE<br />SEARCH<br />8<br />
  9. 9. Space Yahoo is big. You just won't believe how vastly, hugely, mind- bogglingly big it is. I mean, you may think it's a long way down the road to the chemist's, but that's just peanuts to space Yahoo.<br />Douglas Adams, The Hitchhiker's Guide to the Galaxy<br />
  10. 10. Unfortunately, I can’t tell you just how big<br />What can I say?<br />680+M users <br />200PB of data, adding ~50TB of data per day<br />100B events a day captured, collected, transported and processed <br />43k Hadoop servers running 5M+Hadoop jobs every month <br />Let’s just say that life is simpler for other folks…<br />5/12/11<br />Yahoo! Presentation<br />10<br />
  11. 11. Why cloud – specifically “private cloud”?<br />After all, many people think that “private cloud” is an oxymoron<br />OH: “You should either be using a public cloud or operating one.”<br />5/12/11<br />Yahoo! Presentation<br />11<br />
  12. 12. Anticipated benefits<br />Business agility<br />Operational consistency<br />Interoperability<br />Quality<br />Tech deduplication<br />Efficiency<br />Risk reduction<br />Cost transparency<br />5/12/11<br />Yahoo! Presentation<br />12<br />
  13. 13. Why is agility #1?<br />March 11, 2011 – the Japanese earthquake and tsunami<br />Yahoo News spiked to 20.6 million unique visitors<br />We served 371 million page views<br />We added a “Donate Now” button which raised $7M<br />Frequently need to spin up new sites with high traffic and short lifetime:<br />Royal Wedding<br />Canadian Election<br />5/12/11<br />Yahoo! Presentation<br />13<br />
  14. 14. But why private cloud?<br />Our business is digital media:“People want to be informed, they want to be entertained, they want to be educated, and they want to communicate. And that's what Yahoo is all about.”<br />There are no public cloud operators that have the scale, geographic presence, and functionality that we would need<br />So we have to do it ourselves<br />5/12/11<br />Yahoo! Presentation<br />14<br />
  15. 15. Déjà vu<br />Yahoo! Presentation<br />15<br />5/12/11<br />
  16. 16. Hasn’t Yahoo been here before?<br />There have been many public presentations on Yahoo’s cloud ambitions and technology over the last three years<br />Architectural details<br />IETF submissions<br />Open source plans<br />So what happened?<br />And is this talk going to be just another one in the series?<br />5/12/11<br />Yahoo! Presentation<br />16<br />
  17. 17. Progress report<br />In some areas, we’ve made great progress<br />Grid<br />Cloud storage<br />Transforming complex middleware into hosted services<br />OH: “Cloud apps are SOA apps”<br />In others, we got ahead of ourselves<br />And we’ve learned some useful lessons<br />5/12/11<br />Yahoo! Presentation<br />17<br />
  18. 18. What we’ve learned<br />Yahoo! Presentation<br />18<br />5/12/11<br />
  19. 19. The cloud compute API story is a mess<br />We have procedural primitives to manipulate instances<br />AWS EC2, RackSpace, OpenStack<br />We have “virtual data center” declarative schemes<br />VMware vCloud, Orancle/Sun, Yahoo CSE<br />We have PaaS models which abstract away instances and data centers<br />And there’s no coherent way to tie them all together<br />5/12/11<br />Yahoo! Presentation<br />19<br />
  20. 20. Consequences<br />How do I describe alternative roll-out/roll-back policies for my VDC?<br />It seems natural to describe – or specify – the semantics of my declarative API in terms of the primitives in my procedural API<br />How can I deploy a multi-tier application in which one tier was created using a PaaS framework<br />One or more!<br />Who’s on first?<br />Does the VDC deployment description control the PaaS system?<br />Does the PaaS application configuration trigger the deployment of other tiers? <br />5/12/11<br />Yahoo! Presentation<br />20<br />
  21. 21. Forget “long tail” – what about the head?<br />Yahoo has a number of “mega-properties”<br />Mail, Front Page, Sports, News, Flickr<br />They all want to take advantage of the benefits of cloud<br />Potential for huge gains in efficiency, predictabilty<br />However because of their size, they’ve had to radically optimize their technology and operations<br />They have stringent performance (latency) requirements and use highly customized technologies<br />This means that they aren’t necessarily a good fit for generic, multi-tenant services<br />
  22. 22. Cranking up the rate of change<br />Historically, Yahoo properties have acquired new hardware through traditional committee-driven processes<br />So there are long lead times, and plenty of time to provision systems like asset management, DNS, monitoring, access control, and so forth<br />Many of these systems were not designed to support real-time updates<br />And even if they were, they were rarely stressed<br />The effects of introducing on-demand provisioning tend to cascade though many operational systems<br />5/12/11<br />Yahoo! Presentation<br />22<br />
  23. 23. Security policies<br />Pre-cloud security policies tend to assume a single point of responsibility: the property which “owns” the box and the software stack that’s running on it<br />Threat analysis tends to focus on the box, and on mitigating consequences of attack<br />With the cloud, there are many more attack vectors:<br />The application VM on the physical box<br />Any other VM running on the box<br />The hypervisor managing the box<br />The VM management system<br />Creating new security policies takes time<br />5/12/11<br />Yahoo! Presentation<br />23<br />
  24. 24. Invalidating traditional assumptions<br />In a slowly-changing environment, it’s easy to assume that some things are constant<br />For example, Yahoo has a flat IP network with fixed IP addresses per-box and per-rack<br />So it’s reasonable to use an IP address as a key for many things<br />It violates current policy, but there are many legacy systems…<br />So when a system that uses IP addresses for app instance identifiers is dynamically [re]deployed in a cloud fabric, things break<br />E.g. Reprocessing historic Apache log records<br />5/12/11<br />Yahoo! Presentation<br />24<br />
  25. 25. Legacy workloads<br />Most people agree that the future of cloud computing lies with PaaS systems which provide a programming, development and operational model that is optimized for the cloud<br />CloudFoundry and OpenShift are promising developments<br />However it is difficult to make the case that the benefits of cloud computing should only be available to new applications<br />So we have IaaS fabrics on which people run traditional workloads<br />Sometimes unchanged, more often with minor cloud adaptations<br />
  26. 26. Legacy workloads at very large scale<br />At Yahoo, we run into the problem that many of our legacy property workloads run on large fleets in many data centers around the world<br />Their “cloud adoption strategy” involves deploying their current stack on a small number of cloud instances and running them in parallel with their existing fleet<br />This makes it hard to adapt the application configuration to work well in the cloud<br />It also means that the property’s existing operational practices have to be made to work with cloud instances<br />This is particularly challenging for Edge configuration <br />5/12/11<br />Yahoo! Presentation<br />26<br />
  27. 27. The reality of “on demand”<br />“On demand” capacity presumes that each individual customer request is very small compared with the size of the resource pool<br />Capacity management involves careful over-provisioning of the pool<br />But we have many large properties…<br />Today we overprovision to their planned size/QPS, which:<br />Can waste a lot of resources<br />Has long lead times<br />We need to be able to capture the time dimension in our capacity planning and provisioning requests<br />Define the envelope, allow on-demand within that envelope<br />5/12/11<br />Yahoo! Presentation<br />27<br />
  28. 28. Integrating Platform as a Service<br />We’ve already seen that PaaS presents us with a problem in terms of cloud compute APIs<br />We’re working on a number of outstanding issues with PaaS:<br />Accommodating PaaS in our existing software development and test methodologies, including Continuous Integration<br />Deploying PaaS in different network security zones<br />Managing the various kinds of access to the PaaS infrastructure<br />Integrating various network and request handling mechanisms – throttling, traffic shaping, A/B testing, etc. – into PaaS frameworks <br />5/12/11<br />Yahoo! Presentation<br />28<br />
  29. 29. Automation<br />One of the key benefits of cloud computing is improved consistency and predictability through automation<br />Scale affects automation in interesting ways<br />Potential interactions between automated deployment and self-healing<br />Very long-running automation, such as the initial population of a new storage farm in a new location<br />Issues also arise when automation is designed to exploit an API that was created for a different use-case<br />E.g. an API intended to support a portal GUI with a human operator, which has never been stressed<br />5/12/11<br />Yahoo! Presentation<br />29<br />
  30. 30. Where we’re going<br />Yahoo! Presentation<br />30<br />5/12/11<br />
  31. 31. Architectural Vision<br />Software as a Service<br />Knowledge as a Service<br />Platform as a Service<br />Infrastructure as a Service<br />Hardware<br />5/12/11<br />Yahoo! Presentation<br />31<br />
  32. 32. Architectural Vision: IaaS<br /><ul><li>Planet scale Cloud Fabric that abstracts away all the underlying hardware
  33. 33. Fundamental Cloud Services that can be assembled to build higher level services</li></ul>5/12/11<br />Yahoo! Presentation<br />32<br />
  34. 34. Architectural Vision: PaaS<br /><ul><li>On demand, higher level Platform services that support:
  35. 35. Interactive onstage applications
  36. 36. Offline batch applications
  37. 37. Sophisticated programming environments that are offered as a holistic Platforms
  38. 38. Hosts the development and operation of Yahoo applications</li></ul>5/12/11<br />Yahoo! Presentation<br />33<br />
  39. 39. Takeaways<br />The Yahoo private cloud project is on track<br />Many successes already, but much more work to do<br />End-to-end architecture is a key element<br />Open source is important: we’re committed to a collaborative approach<br />Founder member of Open Networking Foundation<br />Studying other open source projects<br />No more premature announcements<br />And of course we’re hiring…. ☺<br />5/12/11<br />Yahoo! Presentation<br />34<br />
  40. 40. geoff<br />arnold<br />Cloud services wrangler<br /><br />Twitter: @geoffarnold<br />Y! IM:<br /><br />
  41. 41. Our Open Source Philosophy<br />Open Source Benefits<br /><ul><li>Higher quality software
  42. 42. Avoid technological dead ends
  43. 43. Leverage community contributions
  44. 44. Workforce already trained
  45. 45. More widespread usage
  46. 46. “Platform/ecosystem” effect
  47. 47. Talent recruiting
  48. 48. Partner / M&A acquisitions</li></ul>Ongoing Contributions<br />Adoption of Open Source<br />Probable Future Engagement<br />IaaS; PaaS; Cloud storage<br />5/12/11<br />Yahoo! Presentation<br />36<br />