
Scaling the Platform for Your Startup - Startup Talks June 2015


Join AWS at this session to understand how to architect an infrastructure to handle going from zero to millions of users. From leveraging highly scalable AWS services to making smart decisions on building out your application, you'll learn a number of best practices for scaling your infrastructure in the cloud.

Published in: Technology

  1. 1. Scaling the Platform for Your Startup. Dean Bryen, AWS Solutions Architecture; Peter Mounce, Senior Software Developer at JUST EAT
  2. 2. Why  are  you  here? • Building  the  technology  platform  for  your  startup • You  want  to  prepare  for  success • Learn  about  design  patterns  &  scalability • A  pragmatic  approach  for  startups
  3. 3. Priorities for startups • Racing within a window of opportunity • Small team with no legacy • Focus on solving a problem • Avoid over-engineering & re-engineering • Reduce risk of failure when you go viral
  4. 4. A scalable architecture • Can support growth in users, traffic, data size • Without practical limits • Without a drop in performance • Seamlessly: just by adding more resources • Efficiently: in terms of cost per user
  5. 5. Day  1  – Dev  &  private  beta
  6. 6. Single  host THE server (e.g. Apache, MySQL) Elastic IP www.example.com Amazon Route 53 DNS service Server Image (AMI)
  7. 7. Day 2 – Public beta
  8. 8. We need a bigger server • Add larger & faster storage (EBS) • Use the right instance type • Easy to change instance sizes • Not our long-term strategy • Will hit a ceiling eventually (there is a largest instance size) • No fault tolerance
  9. 9. Separating web and DB • More capacity • Scale each tier individually • Tailor the instance for each tier (instance type, storage) • Security (security groups, DB in a private VPC subnet)
  10. 10. But  how  do  I  choose  what   DB  technology  I  need?   SQL?  NoSQL?
  11. 11. Why start with a Relational DB? • SQL is versatile & feature-rich • Lots of existing code, tools, knowledge • Clear patterns to scalability (for read-heavy apps) • Reality: eventually you will have a polyglot data layer – There will be workloads where NoSQL is a better fit – Combination of both Relational and NoSQL – Use the right tool for each workload
  12. 12. Key Insight: Relational Databases are Complex • Our experience running Amazon.com taught us that relational databases can be a pain to manage and operate with high availability • Poorly managed relational databases are a leading cause of lost sleep and downtime in the IT world! • Especially for startups with small teams
  13. 13. Relational Databases • MySQL, Aurora, PostgreSQL, Oracle, SQL Server • Fully managed; zero admin • Services shown: Amazon RDS, Aurora
  14. 14. Improving  efficiency
  15. 15. Offload  static  content • Amazon  S3:  highly  available  hosting  that  scales – Static  files  (JavaScript,  CSS,  images) – User  uploads • S3  URLs  – serve  directly  from  S3 • Let  the  web  server  focus  on  dynamic  content
  16. 16. Amazon CloudFront • Worldwide network of edge locations • Cache on the edge – Reduce latency – Reduce load on origin servers – Static and dynamic content – Even a few seconds of caching for popular content can have a huge impact • Connection optimizations – Optimize transfer route – Reuse connections – Benefits even non-cacheable content
  17. 17. CloudFront for static & dynamic content (diagram): Amazon Route 53 → CloudFront distribution. Path patterns css/*, js/* and Images/* → S3 bucket (static content); Default (*) → EC2 instance(s) (dynamic content)
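The path patterns on the CloudFront slide can be read as a first-match routing table. The sketch below only illustrates that idea and is not how CloudFront is implemented; the origin labels `s3-static` and `ec2-dynamic` and the function name `pick_origin` are made up for the example.

```python
from fnmatch import fnmatchcase

# Ordered path patterns taken from the slide; origin labels are illustrative.
BEHAVIORS = [
    ("css/*", "s3-static"),
    ("js/*", "s3-static"),
    ("Images/*", "s3-static"),
]
DEFAULT_ORIGIN = "ec2-dynamic"  # stands in for the Default (*) behavior


def pick_origin(path: str) -> str:
    """Return the origin for a request path; the first matching pattern wins."""
    path = path.lstrip("/")
    for pattern, origin in BEHAVIORS:
        # Case-sensitive match, so "images/x.png" does not match "Images/*".
        if fnmatchcase(path, pattern):
            return origin
    return DEFAULT_ORIGIN
```

Anything that does not match a static pattern falls through to the dynamic origin, which is what lets the web servers focus on dynamic content.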
  18. 18. Database caching • Faster response from RAM • Reduce load on database • Flow between application server, Amazon ElastiCache and RDS database: 1. If data in cache, return result 2. If not in cache, read from DB 3. And store in cache
  19. 19. Amazon  ElastiCache:  in-­memory  cache • Simple  to  Deploy   • Managed – Automatically  replaces  failed  nodes – Patch  management • Elastic • Compatible ElastiCache
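The three-step flow on the database caching slide (return from cache, otherwise read from the DB, then store in cache) can be sketched as below. This is a minimal sketch: a plain dictionary stands in for ElastiCache, and `load_from_db`, `ttl_seconds` and the class name `CacheAside` are assumptions made for the example rather than anything from the talk.

```python
import time


class CacheAside:
    """Check the cache first, fall back to the database, then populate the cache."""

    def __init__(self, load_from_db, ttl_seconds=60.0, clock=time.monotonic):
        self._load = load_from_db      # callable that reads one key from the DB
        self._ttl = ttl_seconds        # how long a cached value stays fresh
        self._clock = clock
        self._entries = {}             # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is not None and entry[0] > self._clock():
            return entry[1]                                       # 1. in cache: return result
        value = self._load(key)                                   # 2. not in cache: read from DB
        self._entries[key] = (self._clock() + self._ttl, value)   # 3. store in cache
        return value

    def invalidate(self, key):
        """Drop a key, for example after the application writes new data."""
        self._entries.pop(key, None)
```

A short TTL keeps stale reads bounded, and calling `invalidate` after a write is one simple way to keep the cache and the database from drifting apart.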
  20. 20. Day  3  – Paying  customers
  21. 21. High  Availability Availability Zone a RDS DB instance Web server S3 bucket for static assets www.example.com Amazon Route 53 DNS service Amazon CloudFront ElastiCache node 1
  22. 22. High  Availability Availability Zone a RDS DB instance Availability Zone b Web server Web server S3 bucket for static assets www.example.com Amazon Route 53 DNS service Amazon CloudFront ElastiCache node 1
  23. 23. High  Availability Availability Zone a RDS DB instance Availability Zone b www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server S3 bucket for static assets Amazon CloudFront ElastiCache node 1
  24. 24. Elastic  Load  Balancing • Managed  Load  Balancing  Service • Fault  tolerant • Health  Checks • Distributes  traffic  across  AZs • Elastic  – automatically  scales  its  capacity
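To make the "health checks" and "distributes traffic" bullets concrete, here is a small sketch of a balancer that only routes to targets currently reported healthy. The slides do not say which algorithm Elastic Load Balancing uses, so the round-robin choice, the class name and the target names are all assumptions for illustration.

```python
from itertools import count


class LoadBalancer:
    """Round-robin over the targets that currently pass health checks."""

    def __init__(self, targets):
        self._targets = list(targets)
        self._healthy = set(self._targets)
        self._counter = count()

    def report_health(self, target, healthy):
        """Record the outcome of a health check for one target."""
        if healthy:
            self._healthy.add(target)
        else:
            self._healthy.discard(target)

    def route(self):
        """Pick the next healthy target, keeping the configured order."""
        candidates = [t for t in self._targets if t in self._healthy]
        if not candidates:
            raise RuntimeError("no healthy targets")
        return candidates[next(self._counter) % len(candidates)]
```

With one web server per Availability Zone, a failed health check in one zone simply shifts all traffic to the other until the target recovers.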
  25. 25. High  Availability Availability Zone a RDS DB instance Availability Zone b www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server S3 bucket for static assets ElastiCache node 1 Amazon CloudFront
  26. 26. High  Availability Availability Zone a RDS DB instance Availability Zone b www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby S3 bucket for static assets ElastiCache node 1 Amazon CloudFront
  27. 27. Data  layer  HA Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby
  28. 28. Data  layer  HA Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby ElastiCache node 2
  29. 29. User sessions • Problem: Often stored on local disk (not shared) • Quick fix: ELB session stickiness • Solution: DynamoDB (diagram: Elastic Load Balancing in front of two web servers, one showing "Logged in" and the other "Logged out")
  30. 30. Amazon DynamoDB • Managed document and key-value store • Simple to launch and scale – To millions of IOPS – Both reads and writes • Consistent, fast performance • Durable: perfect for storage of session data • https://github.com/aws/aws-dynamodb-session-tomcat • http://docs.aws.amazon.com/aws-sdk-php/guide/latest/feature-dynamodb-session-handler.html
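The session problem and its fix can be sketched as follows: once session data lives in a shared store instead of on local disk, any web server behind the load balancer can serve any request. The `SessionStore` class is an in-memory stand-in for a shared table such as DynamoDB, not the session handlers linked above, and every name in it is hypothetical.

```python
import time
import uuid


class SessionStore:
    """In-memory stand-in for a shared key-value table keyed by session id."""

    def __init__(self, ttl_seconds=1800.0, clock=time.time):
        self._items = {}
        self._ttl = ttl_seconds
        self._clock = clock

    def create(self, data):
        session_id = uuid.uuid4().hex
        self._items[session_id] = {
            "data": dict(data),
            "expires": self._clock() + self._ttl,
        }
        return session_id

    def load(self, session_id):
        item = self._items.get(session_id)
        if item is None or item["expires"] <= self._clock():
            self._items.pop(session_id, None)  # drop expired sessions lazily
            return None
        return item["data"]


class WebServer:
    """A stateless web server: all session state goes through the shared store."""

    def __init__(self, name, sessions):
        self.name = name
        self._sessions = sessions

    def login(self, user):
        return self._sessions.create({"user": user})

    def whoami(self, session_id):
        data = self._sessions.load(session_id)
        return data["user"] if data else None
```

A user who logs in on one server is recognised by the other, so sticky sessions are no longer needed and instances can be added or removed freely.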
  31. 31. Day  4  – Let’s  go  viral!
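  32. 32. Replace guesswork with elastic IT (two charts of capacity vs. demand) • Startups pre-AWS: traditional capacity is fixed, giving unhappy customers when demand exceeds it and wasted $$$ when it does not • AWS Cloud: capacity follows demand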
  33. 33. Scaling  the  web  tier Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby ElastiCache node 2
  34. 34. Scaling  the  web  tier Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby ElastiCache node 2 Web server Web server
  35. 35. Scaling  the  web  tier Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing Web server Web server RDS DB standby ElastiCache node 2 Web server Web server
  36. 36. Auto Scaling: automatic resizing of compute clusters based on demand
      Feature | Details
      Control | Define minimum and maximum instance pool sizes and when scaling and cool down occurs.
      Integrated with Amazon CloudWatch | Use metrics gathered by CloudWatch to drive scaling.
      Instance types | Run Auto Scaling for On-Demand and Spot Instances. Compatible with VPC.
      Diagram: Amazon CloudWatch triggers the auto-scaling policy in Auto Scaling.
      Command: aws autoscaling create-auto-scaling-group --auto-scaling-group-name MyGroup --launch-configuration-name MyConfig --min-size 4 --max-size 200 --availability-zones us-west-2c us-west-2b
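The controls named on the Auto Scaling slide (minimum and maximum pool size, a metric that drives scaling, and a cool-down) can be sketched as a simple step policy. The group bounds of 4 and 200 come from the command on the slide; the CPU thresholds, step size and cool-down length are assumptions chosen for the example, and this is not the Auto Scaling service's actual algorithm.

```python
from dataclasses import dataclass


@dataclass
class ScalingGroup:
    min_size: int = 4                 # --min-size from the slide
    max_size: int = 200               # --max-size from the slide
    desired: int = 4
    cooldown_seconds: float = 300.0   # assumed cool-down period
    last_scaled_at: float = float("-inf")

    def evaluate(self, avg_cpu_percent, now, high=70.0, low=30.0, step=2):
        """Return the desired capacity after applying a simple step policy."""
        if now - self.last_scaled_at < self.cooldown_seconds:
            return self.desired       # still cooling down: ignore the metric
        if avg_cpu_percent > high:
            target = self.desired + step
        elif avg_cpu_percent < low:
            target = self.desired - step
        else:
            return self.desired
        # Never leave the configured pool bounds.
        target = max(self.min_size, min(self.max_size, target))
        if target != self.desired:
            self.desired = target
            self.last_scaled_at = now
        return self.desired
```

The cool-down keeps the group from reacting again before newly launched instances have had a chance to absorb load.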
  37. 37. Prerequisite: decompose into small, loosely coupled, stateless building blocks
  38. 38. What does this mean in practice? • Only store transient data on local disk • Needs to persist beyond a single HTTP request? – Then store it elsewhere: user uploads → Amazon S3; user sessions → Amazon DynamoDB; application data → Amazon RDS
  39. 39. Having done that… (having decomposed into small, loosely coupled, stateless building blocks) you can now scale out with ease
  40. 40. Having done that… (having decomposed into small, loosely coupled, stateless building blocks) we can also scale back with ease
  41. 41. Take the shortcut • While this architecture is simple, you still need to deal with: – Configuration details – Deploying code to multiple instances – Maintaining multiple environments (Dev, Test, Prod) – Maintaining different versions of the application • Solution: Use AWS Elastic Beanstalk
  42. 42. AWS Elastic Beanstalk (EB) • Easily deploy, monitor, and scale three-tier web applications and services. • Infrastructure provisioned and managed by EB • You maintain control. • Preconfigured application containers • Easily customizable. • Support for these platforms:
  43. 43. Loose coupling with SQS (instead of tight coupling) • Place asynchronous tasks into Amazon SQS • SQS: a buffer that protects backend systems • Process at own pace • Respond quickly to end users • Diagram: front-end EC2 instance → put message → SQS → get message → back-end EC2 instance
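The put-message / get-message diagram can be sketched with a queue between a front end and a pool of back-end workers. Here Python's `queue.Queue` stands in for SQS purely to show the decoupling; the real service has different delivery semantics, and the function name `run_demo` and the message contents are invented for the example.

```python
import queue
import threading


def run_demo(messages, worker_count=2):
    """Front end enqueues tasks and returns at once; workers drain at their own pace."""
    tasks = queue.Queue()          # stand-in for the SQS queue
    results = []
    results_lock = threading.Lock()

    def back_end():
        while True:
            message = tasks.get()  # "get message"
            if message is None:    # sentinel: no more work for this worker
                tasks.task_done()
                break
            processed = message.upper()  # placeholder for slow background work
            with results_lock:
                results.append(processed)
            tasks.task_done()

    workers = [threading.Thread(target=back_end) for _ in range(worker_count)]
    for worker in workers:
        worker.start()

    for message in messages:       # "put message": the front end does not wait
        tasks.put(message)
    for _ in workers:
        tasks.put(None)

    tasks.join()
    for worker in workers:
        worker.join()
    return sorted(results)
```

Because the front end only enqueues, a slow or briefly unavailable back end delays processing without slowing down responses to end users.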
  44. 44. Day  5  – Add  more  features
  45. 45. (Diagram of AWS services) • Your Applications • Mobile: Push Notifications, Mobile Analytics, Cognito, Cognito Sync • Analytics: Kinesis, Data Pipeline, Redshift, EMR • Enterprise Applications: WorkSpaces, WorkMail, WorkDocs • Application: SQS, SWF, AppStream, Elastic Transcoder, SES, CloudSearch, SNS • Deployment & Management: Elastic Beanstalk, OpsWorks, CloudFormation, CodeDeploy, CodePipeline, CodeCommit • Security & Administration: CloudWatch, Config, CloudTrail, IAM, Directory, KMS • Compute: EC2, ELB, Auto Scaling, Lambda, ECS • Storage: EBS, S3, Glacier, CloudFront • Database: DynamoDB, RDS, ElastiCache • Network: VPC, Direct Connect, Route 53 • AWS Global Infrastructure
  46. 46. AWS building blocks • Inherently scalable & highly available (automated): Elastic Load Balancing, Amazon CloudFront, Amazon Route 53, Amazon S3, Amazon SQS, Amazon SES, Amazon CloudSearch, AWS Lambda, … • Scalable & highly available (configurable): Amazon DynamoDB, Amazon Redshift, Amazon RDS, Amazon ElastiCache, … • Scalable & highly available (with the right architecture): Amazon EC2, Amazon VPC
  47. 47. Stay focused as you scale your team • On-premise infrastructure: 30% your business, 70% managing all of the “undifferentiated heavy lifting” • AWS cloud-based infrastructure: 70% more time to focus on your business, 30% configuring your cloud assets
  48. 48. Day  6  – Growing  fast
  49. 49. Scaling  Relational  DBs • Increase  RDS  instance  specs – Larger  instance  type – More  storage  /  more  PIOPS • Read  Replicas  (Master  – Slave) – Scale  out  beyond  capacity  of  single  DB  instance – Available  in  Amazon  RDS  for  MySQL,  PostgreSQL  and  Amazon  Aurora – Replication  lag – Writes  =>  master – Reads  with  tolerance  to  stale  data  =>  read  replica  (slave) – Reads  with  need  for  most  recent  data  =>  master
  50. 50. Scaling  the  DB Web server Web server Web server Web server Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing RDS DB standby ElastiCache node 2
  51. 51. Scaling  the  DB Web server Web server Web server Web server Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing RDS DB standby ElastiCache node 2 RDS read replica
  52. 52. Scaling  the  DB Web server Web server Web server Web server Availability Zone a RDS DB instance ElastiCache node 1 Availability Zone b S3 bucket for static assets www.example.com Amazon Route 53 DNS service Elastic Load Balancing RDS DB standby ElastiCache node 2 RDS read replica RDS read replica
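The routing rules from the "Scaling Relational DBs" slide (writes go to the master, stale-tolerant reads go to a read replica, reads that need the most recent data go to the master) can be sketched as a small router. The endpoint names and the class `ReplicaRouter` are hypothetical; a real application would hand back connections rather than strings.

```python
import itertools


class ReplicaRouter:
    """Choose a database endpoint according to the read/write rules on the slide."""

    def __init__(self, master, replicas):
        self._master = master
        # Rotate through replicas so read load is spread across them.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def endpoint_for(self, is_write, tolerates_stale=False):
        # Writes, and reads that need the latest data, must go to the master
        # because replicas can lag behind it.
        if is_write or not tolerates_stale or self._replicas is None:
            return self._master
        return next(self._replicas)
```

Defaulting `tolerates_stale` to `False` means a caller has to opt in to replica reads, which avoids accidentally showing a user data that is older than what they just wrote.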
  53. 53. What if your app is write-heavy? Challenge: You will eventually hit the write throughput or storage limit of the master node. Solutions: • Federation (splitting into multiple DBs based on function) • Sharding (splitting one data set up across multiple hosts)
  54. 54. Database federation • Split up tables into smaller autonomous databases (diagram: Forums DB, Users DB, Products DB) • Harder to do cross-function queries • Essentially delaying the need for sharding • Won’t help with single huge functions/tables
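The "harder to do cross-function queries" point can be illustrated with a join that has to happen in the application once users and forum posts live in separate databases. The two in-memory structures below merely stand in for the Users DB and Forums DB from the diagram; the field names and sample rows are invented.

```python
# Two autonomous stores standing in for the Users DB and the Forums DB.
USERS_DB = {1: {"name": "alice"}, 2: {"name": "bob"}}
FORUMS_DB = [
    {"post_id": 10, "author_id": 1, "title": "Hello"},
    {"post_id": 11, "author_id": 2, "title": "Scaling tips"},
]


def posts_with_authors(posts, users):
    """Application-side join: what a single SQL JOIN did before federation."""
    joined = []
    for post in posts:
        author = users.get(post["author_id"])
        joined.append({
            "title": post["title"],
            # The author may be missing: there is no cross-database foreign key.
            "author": author["name"] if author else None,
        })
    return joined
```

The extra lookup and the missing-author case are the price of federation: referential integrity across functions becomes the application's job.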
  55. 55. Sharded horizontal scaling • Each partition hosts a portion of the rows of a table • More complex at the application layer • ORM support can help • No practical limit on scalability • Operational complexity • Shard by key space • RDBMS or NoSQL
      User | ShardID
      002345 | A
      002346 | B
      002347 | C
      002348 | B
      002349 | A
      Shards: A, B, C
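The User-to-ShardID table on the sharding slide amounts to a lookup directory. The sketch below uses exactly those five rows; the hash-based placement for users not yet in the directory is an assumption added for illustration, since the slide only says "shard by key space".

```python
import hashlib

SHARDS = ["A", "B", "C"]

# Directory rows copied from the slide's User / ShardID table.
DIRECTORY = {
    "002345": "A",
    "002346": "B",
    "002347": "C",
    "002348": "B",
    "002349": "A",
}


def shard_for(user_id):
    """Look up a user's shard; unknown users get a deterministic hash placement."""
    if user_id in DIRECTORY:
        return DIRECTORY[user_id]
    # Assumed fallback: a stable hash, so the result is the same in every process.
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return SHARDS[int.from_bytes(digest[:8], "big") % len(SHARDS)]


def assign(user_id):
    """Pin a user to a shard so later lookups stay stable even if SHARDS grows."""
    return DIRECTORY.setdefault(user_id, shard_for(user_id))
```

Recording the assignment in the directory is what keeps existing users in place when a new shard is added, at the cost of one more piece of state to operate.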
  56. 56. NoSQL data stores • Trade query & integrity features of Relational DBs for – More flexible data model – Horizontal scalability & predictable performance • DynamoDB: provisioned read/write performance per table
  57. 57. Massive and Seamless Scale • Distributed system that can scale both reads and writes – Sharding + Replicas • Automatic & transparent partitioning of a table in response to: – Data set size growth – Provisioned capacity increases
  58. 58. Summary
  59. 59. No limit (diagram): www.example.com • Amazon Route 53 DNS service • Elastic Load Balancing • Availability Zones a and b • S3 bucket for static assets • RDS DB instance, RDS DB standby, four RDS read replicas • ElastiCache nodes 1–4 • DynamoDB • CloudSearch, Lambda, SES, SQS
  60. 60. A quick review • Keep it simple and stateless • Make use of managed self-scaling services • Multi-AZ and Auto Scaling for your EC2 infrastructure • Use the right DB for each workload • Cache data at multiple levels • Simplify operations with deployment tools
  61. 61. Next steps? READ! • aws.amazon.com/documentation • aws.amazon.com/architecture • aws.amazon.com/start-ups ASK FOR HELP! • forums.aws.amazon.com • aws.amazon.com/support
  62. 62. Performance testing @ JUST EAT (Or: DoS yourself every night in production to prove you can take it) @justeat_tech + @petemounce http://tech.just-eat.com
  63. 63. Please wait while I start my DoS attack... (Demo: start fake load, show dashboards)
  64. 64. The problem with performance tests & continuous delivery ● Don’t want to sacrifice continuous delivery & decoupled teams ● Don’t want performance to suffer All the usual problems: ● Bottleneck through single environment ● Individual tests take too long
  65. 65. Why? Continuously test ● performance ● capacity If we find a problem Thursday night: 1. don’t run fake load over the weekend 2. enjoy weekend as normal 3. fix it next week at leisure
  66. 66. Gamble! OH: “We deploy tens of small changes a day. I bet we won’t break production...” OH: “Let’s just do it in production with fake traffic at the same time as customers!”
  67. 67. Not that much of a gamble, really. We have tight feedback loops at this point. Engineers being on call ... highly invested in not regressing performance.
  68. 68. How? Pick scenarios we care about. Pick data variations to exercise. Add header(s) to discriminate fake load vs customer load. And then: ● Run it every night during peak time ● If no alerts fire, we’re good
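The "add header(s) to discriminate fake load vs customer load" step could look something like the sketch below. The talk does not name the header, so `X-Synthetic-Load` and all function names here are hypothetical; the point is only that tagged requests can be counted and handled separately from customer traffic.

```python
FAKE_LOAD_HEADER = "X-Synthetic-Load"  # hypothetical name; the talk does not give one


def tag_as_fake_load(headers, scenario):
    """Return a copy of the request headers marked as synthetic traffic."""
    tagged = dict(headers)
    tagged[FAKE_LOAD_HEADER] = scenario
    return tagged


def is_fake_load(headers):
    """HTTP header names are case-insensitive, so compare in lower case."""
    wanted = FAKE_LOAD_HEADER.lower()
    return any(name.lower() == wanted for name in headers)


def split_metrics(requests):
    """Count customer and fake requests separately, as a dashboard might."""
    counts = {"customer": 0, "fake": 0}
    for headers in requests:
        counts["fake" if is_fake_load(headers) else "customer"] += 1
    return counts
```

Keeping the two streams distinguishable lets alerts fire on total load while business metrics ignore the synthetic share.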
  69. 69. What did we gain? Continuous confidence in capacity
  70. 70. What did we gain? Continuous confidence in dealing with spikes
  71. 71. What did we gain? Performance as a 1st-class concern
  72. 72. What did we gain? Tests become independent of environments’ data
  73. 73. (Remind me to stop my DoS attack now) (Demo: stop fake load, show dashboards)
  74. 74. Thank You @justeat_tech + @petemounce http://tech.just-eat.com Yes, we’re recruiting too. http://tech.just-eat.com/jobs
