Ten^H^H^H Many Cloud Application Design Patterns
Shlomo Swidler, Founder, Orchestratus
  • Strategic and technical IT consulting
  • Customers include: [logos omitted]
  • Cloud Developer Tips blog: http://shlomoswidler.com/
  • Among the top community-ranked contributors to the Amazon Web Services discussion forums
What is a Design Pattern?
A reusable recipe for building (software) systems that solve a particular problem. AKA Architectural Pattern.
  • Goal: meets affirmative requirements
  • Constraints: does not violate negative requirements
  • Available resources: can be implemented
Challenges Faced by Apps in the Cloud
  • Application Scalability: Cloud promises rapid (de)provisioning of resources. How do you tap into that to create scalable systems?
  • Application Availability: Underlying resource failures happen, usually more frequently than in traditional data centers. How do you overcome that to create highly available systems?
The Scalability Challenge
  • Scalability: handle more (or fewer) requests
  • It’s not Performance (handle requests faster)
  • It’s not Availability (tolerate failures)
  • But improving Scalability often improves Availability
  • Two different components to scale: State (inputs, data store, output) and Behavior (business logic)
  • Any non-trivial application has both.
  • Scaling one component means scaling the other, too.
App Scalability Patterns for State
  • Data Grids
  • Distributed Caching
  • HTTP Caching: Reverse Proxy, CDN
  • Concurrency: Message-Passing, Dataflow, Software Transactional Memory, Shared-State
  • Partitioning
  • CAP theorem: Data Consistency (Eventually Consistent, Atomic Data)
  • DB Strategies: RDBMS (Denormalization, Sharding); NOSQL (Key-Value store, Document store, Data Structure store, Graph database)
App Scalability Patterns for Behavior
  • Compute Grids
  • Event-Driven Architecture: Messaging, Actors, Enterprise Service Bus, Domain Events, Event Stream Processing, Event Sourcing, Command & Query Responsibility Segregation (CQRS)
  • Load Balancing: Round-robin, Random, Weighted, Dynamic
  • Parallel Computing: Master/Worker, Fork/Join, MapReduce, SPMD, Loop Parallelism
The Availability Challenge
  • Availability: tolerate failures
  • Traditional IT focuses on increasing MTTF (Mean Time to Failure)
  • Cloud IT focuses on reducing MTTR (Mean Time to Recovery)
What follows are four availability scenarios: [low, high] × [MTTF, MTTR]
Availability and MTTF, MTTR
[Figure: four uptime scenarios, one per combination of low/high MTTF and low/high MTTR, with uptimes of 53%, 86%, 69%, and 30%. The scenarios are labeled Traditional IT, Cloud, and Cloud done wrong.]
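The link between MTTF, MTTR, and uptime is commonly written as availability = MTTF / (MTTF + MTTR). The following is a minimal Python sketch of that formula. The hour values are made up for illustration and are not the numbers behind the percentages on the slide.

```python
def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


# Hypothetical scenarios in hours (MTTF, MTTR); chosen only for illustration.
scenarios = {
    "high MTTF, high MTTR": (1000.0, 100.0),
    "high MTTF, low MTTR": (1000.0, 1.0),
    "low MTTF, high MTTR": (10.0, 100.0),
    "low MTTF, low MTTR": (10.0, 1.0),
}

for name, (mttf, mttr) in scenarios.items():
    print(f"{name}: {availability(mttf, mttr):.1%}")
```

With these assumed values, failing often but recovering fast (10 h / 1 h) yields the same ratio as failing rarely but recovering slowly (1000 h / 100 h), which is one way to read the cloud's focus on MTTR.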
Design Patterns for Availability
  • Pattern: Replication
  • Pattern: Fail-Over
  • Often used together.
Availability Pattern: Fail-Over
[Figure: fail-over diagrams. Source: Michael Nygard]
In practice, fail-over is not this simple.
Availability Pattern: Fail-Over with Fail-Back
[Figure: fail-over with fail-back diagram. Source: Michael Nygard]
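A rough sketch of fail-over with fail-back, written in Python under simplifying assumptions: backends are plain callables, a failure shows up as a ConnectionError, and there is no replicated state to keep in sync. The class names FailoverClient and Backend are hypothetical, and this is not taken from the source diagrams. As the slide warns, real fail-over involves much more (health checks, state replication, avoiding split-brain).

```python
from typing import Callable, Sequence


class FailoverClient:
    """Sends each request to the first backend that answers, in priority order."""

    def __init__(self, backends: Sequence[Callable[[str], str]]):
        if not backends:
            raise ValueError("at least one backend is required")
        self.backends = list(backends)

    def call(self, request: str) -> str:
        errors = []
        # Every call starts at the primary again, so traffic fails back
        # to it as soon as it is reachable.
        for backend in self.backends:
            try:
                return backend(request)
            except ConnectionError as exc:
                errors.append(str(exc))
        raise ConnectionError(f"all backends failed: {errors}")


class Backend:
    """Toy backend whose health can be toggled for the demonstration."""

    def __init__(self, name: str):
        self.name = name
        self.up = True

    def __call__(self, request: str) -> str:
        if not self.up:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name} handled {request}"


primary, secondary = Backend("primary"), Backend("secondary")
client = FailoverClient([primary, secondary])

print(client.call("r1"))   # served by the primary
primary.up = False
print(client.call("r2"))   # fail-over to the secondary
primary.up = True
print(client.call("r3"))   # fail-back to the primary
```

Retrying the primary on every call keeps the sketch short. A production client would more likely track backend health separately so that a dead primary does not add latency to every request.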
Availability’s Nemesis: Single Points of Failure
Spot the SPOF (SPOF: Single Point of Failure)
Spot the SPOF: 1 and 1b
[Diagram: Internet; Cloud; a single App instance.]
Spot the SPOF: 2
[Diagram: Internet; Elastic IP Address; Cloud containing two App instances with fail-over between them.]
Might work… until you need more App instances, or until another SPOF fails…
Spot the SPOF: 2a
[Diagram: Internet; one Load Balancer (LB) instance; Cloud containing two Apps.]
Spot the SPOF: 3
[Diagram: Internet; Elastic IP Address; two LBs with replicated configuration and fail-over; two Apps; one Availability Zone in the Cloud.]
Spot the SPOF: 4
[Diagram: Internet; Elastic Load Balancer (ELB), labeled "Magic"; two Apps; one Availability Zone in the Cloud.]
Spot the SPOF: 5
[Diagram: Internet; Elastic IP Address; two LBs with replicated configuration and fail-over; four Apps; two Availability Zones in one Region.]
Spot the SPOF: 6
[Diagram: Internet; Elastic Load Balancer (ELB), labeled "Magic"; four Apps; two Availability Zones in one Region.]
Spot the SPOF: 7
[Diagram: Internet; four LBs; eight Apps; four Availability Zones in two Regions.]
Or…
Spot the SPOF: 7a
[Diagram: same elements as 7.]
7 and 7a: Elastic IPs are single-region only.
Spot the SPOF: 7b
[Diagram: Internet; one ELB; eight Apps; four Availability Zones in two Regions.]
ELB is single-region only.
Spot the SPOF: 7c
[Diagram: Internet; DNS; two ELBs; eight Apps; four Availability Zones in two Regions.]
ELB can’t do that: multiple CNAMEs violate RFC 2181.
Spot the SPOF: 7d
[Diagram: Internet; DNS; four LBs; eight Apps; four Availability Zones in two Regions; the final build adds the label "Cloud Provider".]
Spot the SPOF: 8
[Diagram: Internet; DNS; AWS and Rackspace; five LBs and ten Apps in total; four Availability Zones in two Regions.]
And… the fail-over mechanism. And… Ops staff.
Availability: Ensure Redundancies
  • Physical
  • Virtual resource (instance, disk, etc.)
  • Availability zone
  • Region
  • Provider
  • Human (ops staff)
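Part of this checklist can be sketched as a simple check over a deployment inventory: any level with fewer than two distinct values is a candidate SPOF. The Python below is an illustration only. The function name find_spofs, the field names, and the sample identifiers are hypothetical, and the check does not cover the DNS, fail-over mechanism, or human levels.

```python
from collections import defaultdict
from typing import Dict, List

# Levels checked from the outermost inward; the field names are assumptions.
LEVELS = ("provider", "region", "zone", "instance")


def find_spofs(instances: List[Dict[str, str]]) -> List[str]:
    """Return the levels at which the deployment has fewer than two distinct values."""
    distinct = defaultdict(set)
    for inst in instances:
        for level in LEVELS:
            distinct[level].add(inst[level])
    return [level for level in LEVELS if len(distinct[level]) < 2]


# Hypothetical inventory: two instances in two zones of a single region.
deployment = [
    {"provider": "aws", "region": "us-east-1", "zone": "us-east-1a", "instance": "i-1"},
    {"provider": "aws", "region": "us-east-1", "zone": "us-east-1b", "instance": "i-2"},
]

print(find_spofs(deployment))
```

For this sample inventory the instance and zone levels are redundant, while the region and provider levels are not, which mirrors the progression of the Spot the SPOF slides.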
Availability Best Practice: Chaos Monkey
  • AKA Error Injection Testing
  • Forcibly create fault conditions in your cloud components: kill instances, detach disks, screw up DNS, etc.
  • Automate recovery from the errors.
  • The team gets really good at reducing MTTR, increasing availability!
  • Popularized by Netflix, who run it on their live environment.
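The inject-then-recover loop can be sketched against a toy in-memory fleet. This Python example is a simulation under stated assumptions, not Netflix's tool and not a real cloud API: Fleet, heal, and inject_failure are hypothetical names, and "killing" an instance only removes an id from a set.

```python
import random


class Fleet:
    """Toy in-memory stand-in for a group of cloud instances."""

    def __init__(self, desired: int):
        self.desired = desired
        self.next_id = 0
        self.running = set()
        self.heal()

    def heal(self) -> int:
        """Automated recovery: launch replacements until the desired size is met."""
        launched = 0
        while len(self.running) < self.desired:
            self.running.add(self.next_id)
            self.next_id += 1
            launched += 1
        return launched


def inject_failure(fleet: Fleet, rng: random.Random) -> int:
    """Kill one randomly chosen running instance and return its id."""
    victim = rng.choice(sorted(fleet.running))
    fleet.running.remove(victim)
    return victim


fleet = Fleet(desired=3)
rng = random.Random(42)  # seeded so that runs are repeatable

for _ in range(5):
    victim = inject_failure(fleet, rng)
    replaced = fleet.heal()
    print(f"killed instance {victim}, launched {replaced} replacement(s)")
```

In a real environment the interesting measurement is how long heal takes and whether anything outside the fleet noticed, since that recovery time is the MTTR the practice aims to shrink.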
For more on Designing for Availability, Scalability
  • Jonas Bonér, Scalability, Availability, Stability Patterns: http://slidesha.re/cK3NJv
  • George Reese, The AWS Outage: The Cloud’s Shining Moment: http://oreil.ly/eKCGG9
  • John Ciancutti of Netflix, 5 Lessons We’ve Learned Using AWS: http://bit.ly/h8rU8b
Ten^H^H^H Many Cloud Application Design Patterns
Thank you!
Shlomo Swidler, Founder, Orchestratus
shlomo@orchestratus.com, @ShlomoSwidler

Editor's Notes

  • #28 Nemesis by Alfred Rethel, 1837.
  • #34 More on other SPOFs here in a minute. Now, let’s see what you do if you want that scalability.