High Availability Clouds-Cloud Computing Expo
Published in: Technology
Transcript

  • 1. High Availability Clouds
      • “Moving mission critical applications to the cloud.”
      • Jeremy Hitchcock, CEO, Dynamic Network Services
      • Cloud Computing Expo 2009
  • 2. Who cares? Why Relevant?
      • Enterprises and service providers: “now what”?
      • Desire to move business or mission critical apps
          – That’s most of them
      • Clouds have an “unstable” feel
  • 3. Who cares? Why Relevant?
      • Still, benefits to virtualizing computing resources
      • Most don’t care about raw hardware
      • Becoming more software/resource integrators
          – Less concerned with software/hardware integration
      • Better use of hardware resources
          – Most systems are pretty idle all the time
      • Hardware is getting expensive (well, power is)
  • 4. Where are Clouds? You Are Here
  • 5. Where we are going (or like to be)
      • Cloud adoption going to be like this?
          – Limited to spiky demand or distributed processing
      • Will more services move to cloud environments?
      • Even between clouds and traditional hosting?
      • No hardware?
          – Someone has to worry about infrastructure though
  • 6. Background on me
      • Internet infrastructure: DNS for other people
          – DynDNS.com, Dynect Platform
      • Do traffic management, dynamic “routing” for clouds
      • Work with a lot of cloud providers to get domain.com to node-19334 but not node-49291
      • Background in networking, software engineering
      • Use all unmanaged hosting (but do have a consumer VPS offering, which was a dev project)
  • 7. Terms
      • Unmanaged hosting
          – Corporate/outsourced datacenter, your own everything
      • Managed hosting
          – Hardware is provided with ping, port, and power
      • Cloud hosting
          – Using virtual resources to accomplish the same as the above two items
  • 8. Goals with High Availability
      • Availability: Users do not see outages
      • Scaling: Not impossible or easy
          – Does not mean more resources available
          – Important when you think “on demand”
      • Efficient use of resources (more on that)
      • Institutionalized operations practices
          – Monitoring, security regimes
  • 9. Highly Available What?
      • Well, anything?
      • Applications
      • File systems
      • CPU, I/O, and network
          – I/O is both storage space and retrieval
  • 10. HA Availability
  • 11. Early Days of Hosting
      • Been here before: mainframes to 1U servers
      • Copy over redundancy in larger systems
          – “That’s how larger systems were so accessible”
      • Expensive 1Us lead to commodity hardware
      • “We just take our application and move it over here”
      • And that was when things took a turn…
  • 12. (no slide text)
  • 13. Ouch!
      • Lots of cheap hardware, gained efficiency
          – Most of the time anyway
      • Applications were not available
          – Up and down all of the time
      • DB admins, network admins, system admins all pointing fingers
  • 14. Ouch!
      • Needed more 1Us to do the job
      • 1U equipment quality was not as good
      • More people, more operations issues
      • Security concerns, DB admins having system access
      • Failures and scaling became a problem until…
  • 15. Ah Ha Moment! It’s ok if a 1U fails. It happens all the time!
  • 16. Ah Ha Moment!
      • Make the system more redundant, fault-tolerant
      • Break apart units to create working spaces
          – N+1 redundancy, whatever your risk tolerance is
      • Specialized hardware to maintain efficiency
      • Monitor the units of work
          – Ping, port, power separately
  • 17. Ah Ha Moment!
      • Separate DB/app/file into clusters
          – That makes scaling and failover easy
      • Filers for DB and large scale storage
      • Demand SLAs for network transit
      • Get the NOC to work on cross system outages
  • 18. Still Some Lingering Issues
      • Architectures grew to match applications
          – Tightly coupled, is that good?
          – Makes it hard to move around
          – Specialized hardware pieces
      • Do you look like Flickr?
          – If you do, their hosting platform will work for your app
  • 19. (no slide text)
  • 20. Still Some Lingering Issues
      • Systems are more complicated
          – Yahoo 9/11 Memorial site cascade failures
              • Fix was a load balancer/DNS tweak
      • Lots of “glue” to make sure everything works
      • Each architecture is [slightly] different
  • 21. Finally: Some Lingering Issues
      • Therefore:
          – Failure handling works if the application is sharded
          – Scaling is application specific, with different bottlenecks
          – Reasonable efficiency, limited specialized hardware
          – More people to maintain “the system”, but secure
  • 22. Now Onto Clouds…
      • Promise:
          – On demand resources (true if you can use it)
          – Greater computer efficiency (all costs are internalized)
          – More flexibility for development and peak usage
          – Greater availability
      • Reality:
          – Your responsibility to throw in more hardware
          – Trade specialization for generalization (bottlenecks)
          – Limited by tools provided and consumed
          – Maybe
  • 23. Availability
  • 24. Availability is Defined by Outages
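
One way to make "availability is defined by outages" concrete is to measure the fraction of a period not spent in an outage. The sketch below assumes that definition; the function name and the sample numbers are hypothetical and do not come from the slides.

```python
from datetime import timedelta


def availability(period: timedelta, outages: list[timedelta]) -> float:
    """Return the fraction of `period` during which the service was up."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    downtime = sum(outages, timedelta())
    if downtime > period:
        raise ValueError("downtime exceeds the measurement period")
    # Dividing one timedelta by another yields a plain float.
    return 1 - downtime / period


# Hypothetical numbers: a 30-day month with two outages.
month = timedelta(days=30)
sample_outages = [timedelta(hours=2), timedelta(minutes=36)]
print(f"{availability(month, sample_outages):.4%}")
```

The same function works for any window, so it can be run per provider or per region to compare outage histories once they are known.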
  • 25. Amazon/Cloud Outages?
      • Not clear:
          – “There was this one in July 2008”
          – “Some DNS issues yesterday”
      • How often? How regular?
      • Out of 500,000 hard drives, x will fail in 3.243 years
      • Out of 1 cloud provider? (or maybe 5)
          – We don’t know.
  • 26. Cloud Realities
      • “Best effort” to provide services
      • Ever ask for an SLA?
          – I’m sure it’s coming but not soon enough for some
      • Remember, Amazon is providing a service
          – Unmanaged environment
      • Relax, that’s the Internet, we’ll figure it out
  • 27. Cloud Realities
      • No physical access to systems
      • No guarantee that systems will be available
      • No guarantee that new systems will be available
      • No continuity guarantee
          – Great performance one moment, maybe not the next
          – Shared resources
      • Everything is local, security is a lot different
  • 28. But Clouds are Virtualized 1Us!
      • Well, they are, but not really
      • Used to be:
          – Ping, port, power: raw access
          – Hybrids: corporate datacenter, managed, unmanaged
      • Now:
          – Ping, port, power, file I/O: virtual access
      • Outsourcing network, hardware, and OS
  • 29. Why is it different?
      • Hardware becomes a service
          – Depending on the application, that may matter
      • More vendors in the mix
          – Network, hardware, OS much more packaged
      • Simpler presentation but complicated behind the scenes
      • Library issues, security issues, OS upgrades?
  • 30. Availability
      • Goal: Eliminate single points of failure
          – Clouds are consolidations of services
          – Solution is to split it apart
      • Achieve true diversity
          – Business continuity diversity
          – Geographic diversity
          – Network diversity
          – OS diversity
      • More layers make interactions hard to predict
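
A rough way to see why splitting a service across diverse providers helps: if failures are independent (a strong assumption, and the reason the slide asks for true diversity), the combined service is down only when every copy is down at once. The function name and the figures below are illustrative, not taken from the talk.

```python
def combined_availability(availabilities: list[float]) -> float:
    """Availability of a service that is up when at least one copy is up.

    Assumes the copies fail independently of each other.
    """
    all_down = 1.0
    for a in availabilities:
        if not 0.0 <= a <= 1.0:
            raise ValueError("each availability must be between 0 and 1")
        all_down *= 1.0 - a  # probability that this copy is also down
    return 1.0 - all_down


# Hypothetical example: two providers, each available 99% of the time.
print(combined_availability([0.99, 0.99]))
```

If the copies share a network, an operating system image, or a geography, their failures are correlated and the real figure will be lower than this estimate.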
  • 31. Eliminate Points of Failure
      • Cloud diversity
      • Cloud outages are typically binary
      • Interoperability needed to make it easier
          – That will come in several ways
  • 32. Failover Events
      • Failure events happen (more frequently in clouds?)
      • Trick is detecting and redirecting
          – “Once is a mistake, twice is jazz” (Miles Davis)
      • Needs to be seamless and automatic
      • Good provisioning and monitoring in place
          – Server builds, revisioning, server configurations
          – Everything more modular
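
The "detecting and redirecting" step can be sketched as a small selector that probes targets in priority order and fails over only after repeated failures, so that one missed check does not trigger a redirect. Everything here is an assumption for illustration: the class name, the threshold of two consecutive failures, and the host names. The health check is passed in as a callable, so any probe can be plugged in.

```python
from typing import Callable, Optional, Sequence


class FailoverSelector:
    """Pick the first target that has not failed `max_failures` checks in a row."""

    def __init__(self, targets: Sequence[str],
                 is_healthy: Callable[[str], bool],
                 max_failures: int = 2) -> None:
        self.targets = list(targets)      # ordered by preference
        self.is_healthy = is_healthy      # hypothetical probe supplied by the caller
        self.max_failures = max_failures
        self.failures = {t: 0 for t in self.targets}

    def probe(self) -> None:
        """Run one round of health checks and update failure counters."""
        for target in self.targets:
            if self.is_healthy(target):
                self.failures[target] = 0  # recovery resets the counter
            else:
                self.failures[target] += 1

    def active(self) -> Optional[str]:
        """Return the preferred target still considered up, or None."""
        for target in self.targets:
            if self.failures[target] < self.max_failures:
                return target
        return None


# Hypothetical usage with a fake health table instead of real probes.
status = {"primary.example": False, "plan-b.example": True}
selector = FailoverSelector(list(status), lambda t: status[t])
selector.probe()
selector.probe()
print(selector.active())
```

In practice the result of `active()` would drive whatever does the redirecting, such as a DNS record or a load balancer pool.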
  • 33. Scaling
      • Go from 1 to 2 to 4 to 10,000 units
      • Split apart work units
      • Have to do it sooner than later
      • More sharding, less efficient
      • Not all units are going to be equal nor constant
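
One common way to split work into units is to hash a key and take the result modulo the number of shards. This is a minimal sketch of that idea and not necessarily what the speaker had in mind; the function name and key format are hypothetical.

```python
import hashlib


def shard_for(key: str, shard_count: int) -> int:
    """Map a key to a shard index in the range [0, shard_count)."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # A cryptographic hash gives a stable, well-spread value across runs,
    # unlike Python's built-in hash(), which is randomized per process.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


# Hypothetical usage: route a user's data to one of 4 units.
print(shard_for("user-42", 4))
```

Note that changing `shard_count` remaps most keys under this scheme, which is one reason the slide's advice to split "sooner than later" matters.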
  • 34. Provisioning
      • Everything needs to be automatic (or at least close)
      • As you grow, this hurts more and more
      • Provisioning means lab, dev, and production
      • This becomes a critical system
          – Monitoring and backups should work with provisioning
  • 35. Hardware Considerations
      • Hardware optimized software packages may change
      • Security patches
          – Default images vs. custom images
      • Physical access not granted to you but to others
          – Physical access means all access
          – Encrypted data on disk
          – Fewer recovery options
      • Do you really have access to your data?
          – See backups
  • 36. Host Issues
      • Host system security vulnerabilities
      • Everything is local
          – VLANing becoming more available
      • Underlying systems need maintenance
          – Live migrations
  • 37. Monitoring
      • System related outages because units will fail
      • Normal tools are based on physical limitations
      • In cloud environments it is not always clear where the failure is
      • Test from the last mile
      • Performance testing important too
      • System testing and transactions
      • May not pinpoint problems but it does send pages
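
A last-mile check can be as simple as opening a TCP connection from outside the cloud and timing it, which covers both the "is it reachable" and the "is it slow" questions. This is a minimal sketch using Python's standard `socket` module; the function name and the default timeout are assumptions.

```python
import socket
import time


def check_port(host: str, port: int, timeout: float = 2.0) -> tuple[bool, float]:
    """Try to open a TCP connection; return (reachable, seconds elapsed)."""
    start = time.monotonic()
    try:
        # create_connection resolves the host and connects, raising OSError
        # (including timeouts and refusals) if the connection cannot be made.
        with socket.create_connection((host, port), timeout=timeout):
            return True, time.monotonic() - start
    except OSError:
        return False, time.monotonic() - start
```

A call such as `check_port("domain.com", 80)` run from a machine outside the provider tells you what users see, which may differ from what the cloud's own dashboards report. As the slide says, a failed check may not pinpoint the problem, but it is enough to send a page.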
  • 38. Backups
      • Incremental backups much more important
      • Backup within the same cloud?
          – Probably not, but where?
      • Data files, application files, configuration files
          – Version everything
          – Document how they all go together
      • But you already do that so it’s ok
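
An incremental backup needs to know which files changed since the last run. One simple approach, sketched here under the assumption that a manifest of content hashes was saved by the previous backup, is to hash every file and compare. The function names and the manifest format (relative path to SHA-256 hex digest) are hypothetical.

```python
import hashlib
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()


def changed_files(root: Path, previous: dict[str, str]) -> tuple[list[str], dict[str, str]]:
    """Compare files under `root` with the previous manifest.

    Returns (paths that are new or modified, the new manifest).
    """
    current: dict[str, str] = {}
    for p in sorted(root.rglob("*")):
        if p.is_file():
            current[p.relative_to(root).as_posix()] = file_digest(p)
    changed = [name for name, digest in current.items()
               if previous.get(name) != digest]
    return changed, current
```

Only the files in the first return value need to be copied to the backup location; the second value is stored as the manifest for the next run. Deleted files are not reported by this sketch and would need separate handling.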
  • 39. Migrations
      • Be able to take your data (server image)
          – Server import and export
      • Live migration, underlying software provides it
      • This is all interoperability needs
  • 40. Disaster Planning
      • When things go really wrong:
          – Need to communicate using other means
              • Social networking like Twitter (are they affected as well?)
          – Have a plan B, diversity of cloud providers
          – Seek SLAs?
  • 41. Some Things External
      • DNS
          – Point domain.com to your plan B
      • Backups and files
          – When you want to publish content at plan B
      • Customer communications
          – Tell customers and users what’s going on
      • Last-mile monitoring
          – Everything might look ok in the cloud
      • Want options if there is an outage
  • 42. Key Points
      • Clouds are great for applications, even mission critical ones
      • Best practices for server farms aren’t always best practices for clouds
      • Need to rely on software to make hardware assumptions work right
      • Constant trade off of cost and availability, what’s the risk tolerance
  • 43. Questions
      • Jeremy Hitchcock
      • jeremy@dyn-inc.com
      • http://dyn-inc.com/
