
Scaling micro-services Architecture on AWS

In this talk we explore how Hailo evolved a monolithic LAMP stack into a micro-services platform based on Go. We share the challenges we faced and some of the design patterns that helped us scale our system. We also take a peek into our internal orchestration architecture and the tooling we built to help us automate and manage our platform.

Published in: Technology


  1. 1. Scaling micro-services architecture on AWS. Boyan Dimitrov, Senior Systems Engineer at Hailo (@nathariel)
  2. 2. Outline • Intro to the Hailo world • Our cloud journey and architecture evolution • Platform design patterns and challenges • Tooling (AWS User Group UK 2014)
  3. 3. AWS User Group UK 2014
  4. 4. • The world’s highest-rated taxi app, with almost 20,000 five-star reviews • To date, Hailo has carried more than 11 million passengers • Hailo has over 50,000 registered taxi drivers worldwide
  5. 5. AWS User Group UK 2014
  6. 6. November 2011: Hailo 1.0 launch • Users: 1 • Regions: eu-west-1
  7. 7. Stack (eu-west-1): Java, PHP, MySQL. Architecture specifics: • Monolithic PHP and Java applications • Built and supported by 3-4 backend engineers • City-specific environments • MySQL master-master replication for resilience • Multi-AZ since day 1. AWS specifics: Route 53, ELB, S3
  8. 8. Challenges • Hard to develop new features • Painful to push code changes and to support many independent city-specific environments • Adding new instances and more capacity is a very slow and expensive process • Unreliable and slow failover procedures • Single points of failure (SPOF)
  9. 9. December 2013: Hailo 2.0 • Users: 1,000,000+ • Regions: eu-west-1, us-east-1, ap-northeast-1
  10. 10. Architecture specifics: • Micro-services architecture based on Go and Java • Seamless service discovery, service-to-service communication, monitoring and instrumentation • Everything is automated • Ability to scale services up and down based on demand. AWS specifics: Route 53, ELB, S3, Auto Scaling, CloudFront, Redshift
  11. 11. [Architecture diagram: each of the three regions (eu-west-1, us-east-1, ap-northeast-1) runs a proxy layer, Go services, Java services, a message bus, a distributed queue and C* (Cassandra)]
  12. 12. Challenges • Hard to develop new features → New features completed in days, not months • Painful to push code changes → Seamless service deployment and the ability to run multiple versions of a service • Adding new instances and more capacity is slow → Our servers scale up and down based on demand • Unreliable and slow failover procedures → Automated reaping of misbehaving services and AZ failover • SPOF → Fault-tolerant distributed services architecture
  13. 13. Infrastructure operating cost: a very important KPI
  14. 14. Platform design patterns and challenges
  15. 15. Orchestration Layer Overview • External orchestration services responsible for all environments • Internal orchestration services responsible for the local environment only
  16. 16. External Orchestration Layer under the hood • The external orchestration layer is built on the same platform and shares the same distribution, scalability and resiliency characteristics • Each external orchestration service instance has a “global” view of our infrastructure • Relies heavily on AWS STS (Security Token Service) to operate across different accounts and regions
  17. 17. Inside an environment: Auto Scaling and service provisioning
  18. 18. Challenges • Increased operational and deployment complexity: requires constant monitoring of service resource utilization and manual shuffling • Risk of performance impact due to “noisy neighbours” • Suboptimal resource management
  19. 19. Micro-services + Containers + Scheduling
  20. 20. Challenges • Increased operational and deployment complexity (constant monitoring of service resource utilization and manual shuffling) → On-demand provisioning of infrastructure resources and services based on SLA • Risk of performance impact due to “noisy neighbours” → Each service is isolated from the rest • Suboptimal resource management → Services are grouped together optimally. We expect a reduction of up to 30% in the operational cost of our worker services once we roll out this solution. Micro-services + Containers + Scheduling on AWS will be a dominant architecture pattern in the next few years
  21. 21. Tooling: because all resources are ephemeral and will fail…
  22. 22. A holistic view of the platform
  23. 23. Service level health checks
  24. 24. Reliable and repeatable service provisioning
  25. 25. Everything is an event stream
  26. 26. Platform events count as well!
  27. 27. Still, “things” will fail in mysterious ways
  28. 28. Circuit breakers and graceful degradation when things go wrong
  29. 29. Thank you, any questions? @nathariel boyan@hailocab.com
