
Cloud Architecture Concepts



This is an internal talk I gave within Datalynx in May 2013.
It’s an introduction to the ideas and concepts involved in building cloud systems and applications for technical people who are new to the cloud. It also compares and contrasts these ideas to those we’re used to in “traditional” enterprise IT systems.

Notes on the “Hippo” slide:
The inclusion of persistent storage here may make it sound as though a hippo and a pet are essentially the same. To my mind, however, there is a key difference between them.

A pet’s persistent data is not transferable to other pets without some sort of intervention, be it a restore from backup, a manual copy, or similar. Typically it would be stored on disks which are not immediately and automatically accessible if the pet is offline or not functioning.
A hippo’s persistent data, by contrast, is automatically transferred to the hippo’s successor instance if the original hippo dies or otherwise ceases to function. Typically the data would exist on a storage mechanism such as AWS EBS or OpenStack Cinder volumes, whose lifecycle is separate from that of the hippo and which are immediately and automatically accessible by other instances if the hippo dies.
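
The difference can be illustrated with a small toy model. This is only a sketch: it is an in-memory simulation, not a call into the AWS EBS or OpenStack Cinder APIs, and every name in it (`Volume`, `Instance`, `attach`, `replace_hippo`, `orders-data`, `hippo-001`) is hypothetical and invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Volume:
    """Persistent storage whose lifecycle is independent of any instance."""
    name: str
    data: dict = field(default_factory=dict)
    attached_to: Optional[str] = None


@dataclass
class Instance:
    """A numbered, replaceable instance (a cow that becomes a hippo once a volume is attached)."""
    instance_id: str
    alive: bool = True


def attach(volume: Volume, instance: Instance) -> None:
    """Attach the volume to a running instance."""
    if not instance.alive:
        raise RuntimeError("cannot attach a volume to a dead instance")
    volume.attached_to = instance.instance_id


def replace_hippo(dead: Instance, volume: Volume, new_id: str) -> Instance:
    """Start a successor and hand it the same volume: no restore, no manual copy."""
    dead.alive = False
    successor = Instance(new_id)
    attach(volume, successor)
    return successor


# The original hippo writes some state to its volume, then dies.
volume = Volume("orders-data")
hippo = Instance("hippo-001")
attach(volume, hippo)
volume.data["orders"] = 42

# The successor picks up the same volume with the data intact.
successor = replace_hippo(hippo, volume, "hippo-002")
print(volume.attached_to, volume.data)
```

In this model a pet would correspond to an `Instance` that holds its `data` directly, so that the data becomes unreachable the moment `alive` is set to `False`.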

Notes on the “Design for Failure” section:
This section provides a few rough-and-ready calculations of failure rates for hard drives. The calculations are quite simplistic and assume an even distribution of failures over time. Obviously this isn’t the case in reality; the idea is to provide a rough illustration of the differences in how we experience failure between enterprise IT systems and cloud systems.

The “Enterprise Failures” slide is based on a hypothetical application where 10 servers are involved in its delivery and each server has 10 drives (including SAN/NAS/backup systems etc.). It also assumes that “enterprise class” drives with high MTBFs have been used.

The “Cloud Failures” slide is based on numbers for Microsoft’s Windows Azure data centre in Dublin, which houses around 600’000 servers, and again assumes 10 drives average per server. It also assumes that consumer drives with low MTBFs have been used.
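
The arithmetic behind both slides can be roughly reproduced as follows. This is a sketch under stated assumptions: it uses a constant-failure-rate (exponential) model for the annualised failure rate, which is my assumption rather than something the slides state, although it does give the same AFR figures. The function names are hypothetical.

```python
import math

HOURS_PER_YEAR = 8760  # 365 days * 24 hours


def annualised_failure_rate(mtbf_hours: float) -> float:
    """Chance that a single drive fails within one year (assumed exponential model)."""
    return 1 - math.exp(-HOURS_PER_YEAR / mtbf_hours)


def hours_between_failures(drive_count: int, mtbf_hours: float) -> float:
    """Average gap between failures across the whole fleet, assuming evenly spread failures."""
    return mtbf_hours / drive_count


# Enterprise: 10 servers * 10 drives, MTBF of 1'200'000 hours.
enterprise_afr = annualised_failure_rate(1_200_000)
enterprise_gap_hours = hours_between_failures(100, 1_200_000)

# Cloud: 600'000 servers * 10 drives, MTBF of 300'000 hours.
cloud_afr = annualised_failure_rate(300_000)
cloud_gap_minutes = hours_between_failures(6_000_000, 300_000) * 60

print(f"Enterprise AFR: {enterprise_afr:.2%}, one failure every {enterprise_gap_hours:.0f} hours")
print(f"Cloud AFR: {cloud_afr:.2%}, one failure every {cloud_gap_minutes:.1f} minutes")
```

With these inputs the enterprise fleet sees one failure roughly every 12'000 hours (about 500 days, in the same region as the slide's rough figure of 15 months), while the cloud fleet sees one roughly every 3 minutes.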

My overriding aim here was to explain, to technical people who are not used to truly large-scale systems, why they need to assume that anything can fail at any time, and to make clear that the implicit assumption of hardware reliability that is often applied in enterprise IT doesn’t map onto the cloud.

Published in: Technology, Business

Cloud Architecture Concepts

  2. THE SERVER ZOO: Model of server types; Applicable beyond the cloud; Courtesy of Tim Bell from CERN. Photo by rbrwr via Flickr
  3. PETS. Photo by chris friese via Flickr
  4. UNIQUE & CONFIGURED BY HAND. Photo by picto:graphic via Flickr
  5. NAMED & STATEFUL. Photo by captainsubtle via Flickr
  6. FIXED WHEN BROKEN. Photo by Ruud Hein via Flickr
  7. CATTLE. Photo by twicepix via Flickr
  8. IDENTICAL & AUTOMATED. Photo by cwasteson via Flickr
  9. NUMBERED & STATELESS. Photo by vonguard via Flickr
  10. REPLACED WHEN BROKEN. Photo by blmurch via Flickr
  11. COW + PERSISTENT STORAGE = HIPPO. Photo by Gusjer via Flickr
  12. COW + EXPERIMENTAL CONFIG = CANARY. Photo by chriswsn via Flickr
  13. INSTANCES VS. SERVERS: Pets = Servers; Cattle = Instances; Cattle ≠ Pets ∴ Instances ≠ Servers. Photo by wstryder via Flickr
  14. MINIMISE PETS, MAXIMISE CATTLE: More time for must-have pets; Better service; Do more with less. Photo by aWorldTourer via Flickr
  15. REGULATORS (SHOULD) LOVE CATTLE: Highly consistent; Highly testable; Highly change controllable; Highly monitorable; Instant remediation. Photo by gordonplant via Flickr
  16. ANATOMY OF A COW: Bootstrapped; Stateless; Usually Linux. Image by Pearson Scott Foresman via Wikimedia
  17. BOOTSTRAPPING: Config OS; Install software; Write config files; Initialise services; At boot time; Without human input. Photo by neoroma via Flickr
  18. BOOTSTRAPPING TOOLS: Puppet; Chef; Ansible; CFEngine; AWS CloudFormation; OpenStack Heat; Group Policy/System Center; etc. etc. … Photo by Velo Steve via Flickr
  19. STATELESS: No persistent data; Collects state / job data on boot; Ephemeral storage; Exception: Hippos. Photo by Numinosity (Gary J Wood) via Flickr
  20. USUALLY LINUX: Fewer licensing considerations; Easier to automate; Easier to image; Smaller footprint; More common at large scale. Photo by brian.gratwicke via Flickr
  21. ELASTICITY & SCALABILITY: Loose coupling; Horizontal scaling; Parallel processing; Monitoring. Photo by rwkvisual via Flickr
  22. LOOSE COUPLING: Tiered architectures; No hostname dependencies; Asynchronous communication; Message queuing
  23. HORIZONTAL SCALING: More servers, not bigger servers; Distributed workload; Scale tiers independently
  24. PARALLEL PROCESSING: Break workload into many chunks; Process many chunks at once; Accelerates processing. Photo by °Florian via Flickr
  25. MONITORING: Identify key metrics; Automate watching; Log continually; Automate responses
  26. MONITORING TOOLS: Nagios; Cacti; Ganglia; AWS CloudWatch; System Center. Photo by C G-K via Flickr
  27. DESIGN FOR FAILURE: This is the most important concept of all! Embrace failure!
  28. ENTERPRISE FAILURES: 100 drives; MTBF = 1’200’000 hours; AFR ≈ 0.73%; 1 failure in ≈15 months
  29. CLOUD FAILURES: 6’000’000 drives; MTBF = 300’000 hours; AFR ≈ 2.88%; 1 failure in ≈3 minutes; ≈215’000 failures in 15 months
  30. DESIGN FOR FAILURE: Instances have no SLA; Assume anything can fail at any time; Backup persistent data; Duplicate everything
  31. TEST EVERYTHING: Create your own disasters; Unleash the last animal in the zoo…
  32. CONTACT: E-mail: chris.bingham@datalynx.ch; LinkedIn: