AWS Sydney Summit 2013 - Architecting for High Availability

Session 3, Presentation 6 from the AWS Sydney Summit

Published in: Technology

AWS Sydney Summit 2013 - Architecting for High Availability

  1. 1. Architecting for High Availability. Joseph Ziegler, AWS Technical Evangelist, @jiyosub. Guest presenter: Alexander Courtis, Solutions Architect, SilverQuest Consulting
  2. 2. High Availability Principles: Design for reliable, affordable, fault-tolerant systems that operate with a minimal amount of human interaction from day one
  3. 3. Agenda • Objective – Review services and approaches to build a highly available architecture on AWS • Sections – High Availability Overview – Relevant AWS Features and Services – Principles in Practice • Customer Case Study – Carsguide
  4. 4. Agenda • Objective – Review services and approaches to build a highly available architecture on AWS • Sections – High Availability Overview – Relevant AWS Features and Services – Principles in Practice • Customer Case Study – Carsguide
  5. 5. What is High Availability (HA)? • Availability: Percentage of time an application operates during its work cycle. • Loss of availability is known as an outage or downtime. – App is offline, unreachable or partially available. – App is slow to use. – Planned and unplanned. • Goal – No downtime. – Always available.
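The availability definition above can be turned into simple arithmetic. The following is a minimal sketch, not part of the original deck; the class and method names are hypothetical, and the 30-day period in the usage note is only an illustrative assumption.

```java
// Hypothetical helper illustrating availability as a percentage of the work cycle.
final class AvailabilityMath {

    /** Percentage of the work cycle during which the application operated. */
    static double availabilityPercent(double uptimeMinutes, double workCycleMinutes) {
        if (workCycleMinutes <= 0) {
            throw new IllegalArgumentException("work cycle must be positive");
        }
        return 100.0 * uptimeMinutes / workCycleMinutes;
    }

    /** Minutes of downtime that a target availability allows over a period of the given length. */
    static double downtimeBudgetMinutes(double targetPercent, int days) {
        double totalMinutes = days * 24.0 * 60.0;
        return totalMinutes * (1.0 - targetPercent / 100.0);
    }
}
```

For example, `AvailabilityMath.downtimeBudgetMinutes(99.9, 30)` comes to roughly 43.2 minutes, which shows how little room a "three nines" target leaves for planned and unplanned outages combined.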
  6. 6. HA is related to … • Scalability – Ability of an application to accommodate growth without changing design. – If app cannot scale, then availability will be impacted. – Scalability doesn’t guarantee availability. • Fault Tolerance – Built-in redundancy so apps can continue functioning when components fail. – FT is crucial to HA. • Disaster Recovery – The process, policies and procedures related to restoring service after a catastrophic event.
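Why built-in redundancy matters to availability can be shown with the standard probability formulas for components in parallel and in series. This is a rough sketch that assumes component failures are independent, which real systems only approximate; the class and method names are hypothetical.

```java
// Hypothetical helper: combined availability, assuming independent failures.
final class RedundancyMath {

    /** Availability of redundant copies where any single healthy copy is enough. */
    static double parallel(double singleAvailability, int copies) {
        return 1.0 - Math.pow(1.0 - singleAvailability, copies);
    }

    /** Availability of a chain in which every component must be up. */
    static double serial(double... availabilities) {
        double product = 1.0;
        for (double a : availabilities) {
            product *= a;
        }
        return product;
    }
}
```

Under that independence assumption, two redundant components at 99% each give about 99.99% (`parallel(0.99, 2)`), while two 99% components that both must work give about 98.01% (`serial(0.99, 0.99)`).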
  7. 7. Automation • “Everything is an API” philosophy enables automation of AWS resources. • AWS is literally a programmable data center. • Provisioning resources is a web service call away. • Many different ways to automate: – AWS CloudFormation – Numerous SDKs: Java, .NET, Python, Ruby, PHP – Command line tools • Automation is one of the key differentiators between AWS and traditional infrastructure. • Automation assists with HA.
  8. 8. Agenda • Objective – Review services and approaches to build a highly available architecture on AWS • Sections – High Availability Overview – Relevant AWS Features and Services – Principles in Practice • Customer Case Study – Carsguide
  9. 9. AWS GLOBAL INFRASTRUCTURE
  10. 10. US-WEST (Oregon), EU-WEST (Ireland), ASIA PAC (Tokyo), ASIA PAC (Singapore), US-WEST (N. California), SOUTH AMERICA (Sao Paulo), US-EAST (Virginia), GOV CLOUD, ASIA PAC (Sydney)
  11. 11. US-WEST (Oregon), EU-WEST (Ireland), ASIA PAC (Tokyo), ASIA PAC (Singapore), US-WEST (N. California), SOUTH AMERICA (Sao Paulo), US-EAST (Virginia), GOV CLOUD, ASIA PAC (Sydney)
  12. 12. AWS BUILDING BLOCKS • Inherently Highly Available and Fault Tolerant Services: Amazon S3, Amazon DynamoDB, Amazon CloudFront, Amazon Route 53, Elastic Load Balancing, Amazon SQS, Amazon SNS, Amazon SES, Amazon SWF, … • Highly Available with the right architecture: Amazon EC2, Amazon EBS, Amazon RDS, Amazon VPC
  13. 13. Relevant Features of AWS • Leverage FT services whenever possible. • Use multiple AZs. • Use abstract machine and system representations – Build images from recipes, stacks from CloudFormation • Implement elasticity – Bootstrapping, load balancing, Auto Scaling, etc. – Instance asks: “Who am I and what is my role?”
  14. 14. Agenda • Objective – Review services and approaches to build a highly available architecture on AWS • Sections – High Availability Overview – Relevant AWS Features and Services – Principles in Practice • Customer Case Study – Carsguide
  15. 15. Principles of HA: 1. DESIGN FOR FAILURE 2. MULTIPLE AVAILABILITY ZONES 3. SCALING 4. SELF-HEALING 5. LOOSE COUPLING
  16. 16. LET’S BUILD A HIGHLY AVAILABLE SYSTEM
  17. 17. #1 DESIGN FOR FAILURE
  18. 18. « Everything fails all the time » Werner Vogels, CTO of Amazon
  19. 19. AVOID SINGLE POINTS OF FAILURE
  20. 20. AVOID SINGLE POINTS OF FAILURE: ASSUME EVERYTHING FAILS, AND WORK BACKWARDS
  21. 21. YOUR GOAL: Applications should continue to function
  22. 22. AMAZON EBS: ELASTIC BLOCK STORE
  23. 23. AMAZON ELB: ELASTIC LOAD BALANCING
  24. 24. HEALTH CHECKS
  25. 25. #2 MULTIPLE AVAILABILITY ZONES
  26. 26. AMAZON RDS MULTI-AZ
  27. 27. AMAZON ELB AND MULTIPLE AZs
  28. 28. #3 SCALING
  29. 29. AUTO SCALING: SCALE UP/DOWN EC2 CAPACITY
  30. 30. #4 SELF-HEALING
  31. 31. HEALTH CHECKS + AUTO SCALING
  32. 32. HEALTH CHECKS + AUTO SCALING = SELF-HEALING
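The "health checks + Auto Scaling = self-healing" idea can be illustrated with an in-memory reconciliation loop. This sketch does not call any AWS API; `SelfHealingGroup`, its launcher and its health check are hypothetical stand-ins for an Auto Scaling group, an instance launch and a health check.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical in-memory model of a group that keeps a desired number of healthy instances.
final class SelfHealingGroup<T> {
    private final int desiredCapacity;
    private final Supplier<T> launcher;      // assumption: starts one replacement instance
    private final Predicate<T> healthCheck;  // assumption: true while the instance is healthy
    private final List<T> instances = new ArrayList<>();

    SelfHealingGroup(int desiredCapacity, Supplier<T> launcher, Predicate<T> healthCheck) {
        this.desiredCapacity = desiredCapacity;
        this.launcher = launcher;
        this.healthCheck = healthCheck;
    }

    /** One pass: drop unhealthy instances, then launch until capacity is restored. */
    int reconcile() {
        instances.removeIf(healthCheck.negate());
        int launched = 0;
        while (instances.size() < desiredCapacity) {
            instances.add(launcher.get());
            launched++;
        }
        return launched;
    }

    /** Snapshot of the instances currently in the group. */
    List<T> instances() {
        return new ArrayList<>(instances);
    }
}
```

Running `reconcile()` on a schedule means a failed instance is replaced without human interaction, which is the property the slides call self-healing.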
  33. 33. #5 LOOSE COUPLING
  34. 34. BUILD LOOSELY COUPLED SYSTEMS: The looser they are coupled, the bigger they scale, the more fault tolerant they get…
  35. 35. AMAZON SQS: SIMPLE QUEUE SERVICE
  36. 36. RECEIVE → TRANSCODE → PUBLISH & NOTIFY
  37. 37. RECEIVE → TRANSCODE → PUBLISH & NOTIFY
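A queue between stages is what loosens the coupling in the receive, transcode, publish chain. The sketch below uses in-memory `ConcurrentLinkedQueue` objects as a stand-in for Amazon SQS queues; it is an illustration under that assumption, and the class, method names and the ".mp4" suffix are hypothetical.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical pipeline: each stage only knows its input and output queue, not the other stages.
final class TranscodePipeline {
    // In-memory stand-ins for message queues such as Amazon SQS.
    private final Queue<String> toTranscode = new ConcurrentLinkedQueue<>();
    private final Queue<String> toPublish = new ConcurrentLinkedQueue<>();

    /** Receive stage: accept a video and enqueue it for transcoding. */
    void receive(String videoId) {
        toTranscode.add(videoId);
    }

    /** Transcode stage: process one queued video; returns false when nothing is waiting. */
    boolean transcodeOne() {
        String videoId = toTranscode.poll();
        if (videoId == null) {
            return false;
        }
        toPublish.add(videoId + ".mp4");
        return true;
    }

    /** Publish stage: take one transcoded video, or null when none is ready. */
    String publishOne() {
        return toPublish.poll();
    }

    /** Number of videos still waiting for a transcoder. */
    int backlog() {
        return toTranscode.size();
    }
}
```

If the transcoder is down, `receive` keeps working and the backlog simply grows until a transcoder returns, so a failure in one stage does not take the others offline.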
  38. 38. CLOUDWATCH METRICS FOR AMAZON SQS + AUTO SCALING
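Scaling workers from a queue metric boils down to a small calculation. The sketch below assumes the input is a queue-depth reading such as the SQS `ApproximateNumberOfMessagesVisible` CloudWatch metric; the per-worker throughput and the min/max bounds are hypothetical tuning parameters, and real Auto Scaling policies are configured rather than hand-coded like this.

```java
// Hypothetical sizing rule: enough workers to drain the queue, within fixed bounds.
final class QueueDepthScaling {

    static int desiredWorkers(int visibleMessages, int messagesPerWorker,
                              int minWorkers, int maxWorkers) {
        if (messagesPerWorker <= 0 || minWorkers > maxWorkers) {
            throw new IllegalArgumentException("invalid scaling parameters");
        }
        int depth = Math.max(0, visibleMessages);
        // Ceiling division: a partial batch still needs a whole worker.
        int needed = (depth + messagesPerWorker - 1) / messagesPerWorker;
        return Math.max(minWorkers, Math.min(maxWorkers, needed));
    }
}
```

With a minimum of 1 and a maximum of 10 workers at 100 messages each, a depth of 250 yields 3 workers, an empty queue stays at 1, and a depth of 5,000 is capped at 10.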
  39. 39. Simple Workflow (SWF)
  40. 40. Keeps track of: State, Executed tasks, Timeouts, Errors
  41. 41. WORKFLOW ACTORS
  42. 42. DECIDERS: COORDINATION LOGIC 1. Poll for work on a decision list (Long polling: 60 seconds) 2. Evaluate workflow execution history (SWF sends full history in JSON format) 3. Return decision to Amazon SWF (Usually scheduling another task)
  43. 43. Workers: COORDINATION LOGIC 1. Poll for work on a specific task list (Long polling: 60 seconds) 2. Execute work, send heartbeats (SWF sends input data from deciders) 3. Return success / failure (Detailed data can be provided to deciders)
  44. 44. NO NEW LANGUAGE TO LEARN: YOUR CODE IS YOUR WORKFLOW LANGUAGE; SWF MAINTAINS STATE
  45. 45. AWS FLOW FRAMEWORK: Java Library • Entire workflow can be expressed in sequential code • Integrated with Java Utils API
  46. 46. CHAINED TASKS WITHOUT DECISIONS? use AMAZON SQS: RECEIVE → TRANSCODE → NOTIFY
  47. 47. TASK GRAPH WITH DECISIONS? use AMAZON SWF. Tasks: RECEIVE VIDEO, SPAM CHECK, CHECK LENGTH, SHORTEN VIDEO, TRANSCODE, PUBLISH & NOTIFY, REJECT. Branch labels: SPAM, GOOD, LONG, OK
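A decider for a task graph like this can be written as a pure function from execution history to the next step, which is why it can stay stateless. The sketch below is plain Java and does not use the SWF API; the edges between tasks are an assumption reconstructed from the slide's labels (SPAM leads to REJECT, GOOD to CHECK LENGTH, LONG to SHORTEN VIDEO, OK to TRANSCODE), and all type names are hypothetical.

```java
import java.util.List;

// Hypothetical decider: looks only at the history it is given and names the next step.
final class VideoDecider {
    enum Step { RECEIVE_VIDEO, SPAM_CHECK, CHECK_LENGTH, SHORTEN_VIDEO,
                TRANSCODE, PUBLISH_AND_NOTIFY, REJECT, DONE }

    /** A completed step and the outcome label it produced. */
    static final class Event {
        final Step step;
        final String result;

        Event(Step step, String result) {
            this.step = step;
            this.result = result;
        }
    }

    /** Evaluate the full history and return the next step to schedule. */
    static Step decide(List<Event> history) {
        if (history.isEmpty()) {
            return Step.RECEIVE_VIDEO;
        }
        Event last = history.get(history.size() - 1);
        switch (last.step) {
            case RECEIVE_VIDEO: return Step.SPAM_CHECK;
            case SPAM_CHECK:    return "SPAM".equals(last.result) ? Step.REJECT : Step.CHECK_LENGTH;
            case CHECK_LENGTH:  return "LONG".equals(last.result) ? Step.SHORTEN_VIDEO : Step.TRANSCODE;
            case SHORTEN_VIDEO: return Step.TRANSCODE;
            case TRANSCODE:     return Step.PUBLISH_AND_NOTIFY;
            default:            return Step.DONE; // REJECT and PUBLISH_AND_NOTIFY end the workflow
        }
    }
}
```

Because `decide` keeps no fields, any decider instance in any Availability Zone can pick up the next decision task, as long as the service supplies the history.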
  48. 48. Principles of HA: 1. DESIGN FOR FAILURE 2. MULTIPLE AVAILABILITY ZONES 3. SCALING 4. SELF-HEALING 5. LOOSE COUPLING
  49. 49. YOUR GOAL: Applications should continue to function
  50. 50. IT’S ALL ABOUT CHOICE: BALANCE COST & HIGH AVAILABILITY
  51. 51. Agenda • Objective – Review services and approaches to build a highly available architecture on AWS • Sections – High Availability Overview – Relevant AWS Features and Services – Principles in Practice • Customer Case Study – Carsguide
  52. 52. Alexander Courtis, Solutions Architect
  53. 53. carsguide.com.au – Lead Tracker • Requirements • Architecture • Development Approach • Technologies
  54. 54. carsguide.com.au – Lead Tracker • Requirements • Architecture • Development Approach • Technologies
  55. 55. Lead Tracking Process: Persist, Audit, B2B, Notify
  56. 56. Non-Functional Requirements • Meet B2B SLAs – Fault Tolerant – Scalable – Fully Auditable • Partial Manual Recovery • Parallel Execution
  57. 57. Alex On Software Engineering: Principle #4 • The Best Developers Are The Laziest • Avoid Inventing Octagonal Wheels • Work Very Hard Avoiding Future Work – Automate Testing – Production Requires Little To No Maintenance • Break Into Small, Independent Chunks
  58. 58. carsguide.com.au – Lead Tracker • Requirements • Architecture • Development Approach • Technologies
  59. 59. And The Winner Is… Amazon SWF + Spring Framework
  60. 60. Availability Zone #1: Decider, Decider, Worker, Worker, Worker, Worker; RDS DB Instance; DynamoDB; Amazon SES; Amazon SNS; RDS DB Instance Standby (Multi-AZ); Availability Zone #2: Decider, Decider, Worker, Worker, Worker, Worker; DynamoDB; Amazon SNS
  61. 61. carsguide.com.au – Lead Tracker • Requirements • Architecture • Development Approach • Technologies
  62. 62. Development • Don’t Start With SWF • Build Stateless, Standalone Services • Unit / Integration Test Services • Wrap Services As SWF Workers • Build SWF Deciders For Repeatable Workflows • Build A Single “Master” Decider
  63. 63. Artifacts • 2 Artifacts – Client JAR, used by external application servers to start the process – Master JAR, containing SWF deciders/workers and services • Why have a single Master JAR? – To make bootstrapping as simple as possible: each server instance is identical, you just select a “flavour” i.e. Decider or Worker
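Selecting a "flavour" from one identical artifact could look roughly like the sketch below. It is not the presenters' code: the class name, the string values and the idea of passing the flavour as a plain string (for example from a command-line argument or instance metadata) are assumptions for illustration.

```java
import java.util.Locale;

// Hypothetical bootstrap for a single Master JAR that can start as a decider or a worker.
final class MasterJarBootstrap {
    enum Flavour { DECIDER, WORKER }

    /** Parse the flavour an instance was told to run as; rejects anything else. */
    static Flavour parseFlavour(String value) {
        if (value == null) {
            throw new IllegalArgumentException("flavour is required: decider or worker");
        }
        switch (value.trim().toLowerCase(Locale.ROOT)) {
            case "decider": return Flavour.DECIDER;
            case "worker":  return Flavour.WORKER;
            default: throw new IllegalArgumentException("unknown flavour: " + value);
        }
    }

    /** Start exactly one role; both roles ship in the same JAR. */
    static void start(Flavour flavour, Runnable decider, Runnable worker) {
        (flavour == Flavour.DECIDER ? decider : worker).run();
    }
}
```

A `main` method would then call something like `start(parseFlavour(args[0]), deciderLoop, workerLoop)`, so every server image stays identical and only the startup parameter differs.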
  64. 64. carsguide.com.au – Lead Tracker • Requirements • Architecture • Development Approach • Technologies
  65. 65. B2B Services: Spring Web Services, Apache JAXB, Apache, Amazon SES, GSON
  66. 66. Lead Persistence • Well Structured, Fixed Schema Data • Transactional – Relational Database • Spring Data JPA + Amazon RDS
  67. 67. Audit Persistence • Important • Variable Format, Unstructured Data • Write Often, Read Rarely – NoSQL – Document Data Store • Spring Data + Amazon DynamoDB
  68. 68. Invoking SWF • SWF is invoked via a simple JSON web service call – Roll your own – Java SDK client • Suit yourself • We used the Java SDK client
  69. 69. Workers • Wrap your services as an SWF Worker or Activity • AspectJ generated classes
  70. 70. Worker Example
        @Activities(version = "1.0")
        @ActivityRegistrationOptions(
                defaultTaskHeartbeatTimeoutSeconds = FlowConstants.NONE,
                defaultTaskScheduleToCloseTimeoutSeconds = 180,
                defaultTaskScheduleToStartTimeoutSeconds = 60,
                defaultTaskStartToCloseTimeoutSeconds = 60)
        public interface MyFancyActivities {
            /**
             * Post something that is worthy
             *
             * @param wowFancy mandatory; must be fancy
             * @return populated log indicating success or failure
             */
            FancyLog postFancy(FancyThing wowFancy);
            ...
  71. 71. Deciders • No GUI or unmanageable “code” • Synchronous-style code, using Promises • Orchestrates workers and other decider workflows • Executes many times – Stateless
  72. 72. Decider Implementation Example
        public class RogerDeciderImpl {
            ...
            @Override
            public void decide(final Stuff bigStuff) {
                Promise<StanDecision> stan = stanClient.decide(bigStuff);
                Promise<FranDecision> fran = franClient.decide(bigStuff);
                Promise<EarthDestroDecision> decision = rogerClient.decide(stan, fran);
                klausClient.audit(decision);
                mothershipClient.blowUp(decision);
            }
        }
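The way a third decision waits on two earlier promises can be imitated with the JDK's `CompletableFuture`. This is only an analogy: `CompletableFuture` is not the AWS Flow Framework `Promise` type used on the slide, and the class name and the string results here are hypothetical.

```java
import java.util.concurrent.CompletableFuture;

// Analogy only: JDK futures standing in for Flow Framework promises.
final class PromiseStyleDecider {

    /** Combine two pending results; the combined result completes only when both inputs have. */
    static CompletableFuture<String> decide(CompletableFuture<String> stan,
                                            CompletableFuture<String> fran) {
        return stan.thenCombine(fran, (s, f) -> s + "+" + f);
    }
}
```

The calling code reads top to bottom like ordinary sequential Java, yet nothing downstream runs until both inputs are complete, which is the style the decider slide describes.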
  73. 73. Deployment • EC2 instances managed via Puppet • Apache Maven does everything from source code management to running the processes • Is there a better way to bootstrap? • Amazon Elastic Beanstalk + pom.xml = Alex’s Amazing Elastic Mavenstalk™
  74. 74. Architecting for High Availability
