
How Mapbox Scales over 9 AWS Regions


Slides from Devops Madrid in November 2015
http://www.meetup.com/madrid-devops/events/225555911/


  1. 1. How Mapbox Scales Over 9 Data Centers Johan @freenerd johan@mapbox.com Madrid DevOps November 2015 1
  2. 2. • What is Mapbox? • 9 data centers? • Tracing a map request 2
  3. 3. 3
  4. 4. 4
  5. 5. 5
  6. 6. 6
  7. 7. • What is Mapbox? • 9 Data Centers? • Tracing a map request 7
  8. 8. 9 Data Centers? 8
  9. 9. 9
  10. 10. 10
  11. 11. 9 Data Centers? • Why run this in many regions? • One region = cheaper, less complex, easier to build and maintain 11
  12. 12. 9 Data Centers? • Global high availability • Global low latency 12
  13. 13. 9 Data Centers? Global high availability 13
  14. 14. 9 Data Centers? Global high availability • Mapbox is critical infrastructure for our customers • Mapbox SLA: 99.9% • Problems for high availability • AWS problems • Mapbox software or configuration problems • Critical deploys 14
  15. 15. 9 Data Centers? • Global high availability • Global low latency 15
  16. 16. 9 Data Centers? Global low latency 16
  17. 17. 9 Data Centers? Global low latency • Can't beat the speed of light • Latency is critical for using a map • Bring our data closer to our users 17
  18. 18. 9 Data Centers? • Global high availability • Global low latency 18
  19. 19. Let's trace a request 19
  20. 20. 20
  21. 21. 21
  22. 22. What is a map? 22
  23. 23. 23
  24. 24. • Grid over the world • Every cell of the grid is a tile • Different zoom levels • Zoom level 0 is the whole world • Zoom level 13 is a city • Every tile is identified by map ID, coordinates and zoom level 24
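
To make the grid concrete, here is a minimal sketch of how a longitude/latitude pair maps to tile coordinates. It assumes the common Web Mercator XYZ tiling scheme, which the slides do not spell out; the function name `lonlat_to_tile` is made up for illustration.

```python
import math

def lonlat_to_tile(lon, lat, zoom):
    """Return the (x, y) tile containing a point, assuming Web Mercator XYZ tiling."""
    n = 2 ** zoom  # the grid is n by n tiles at this zoom level
    x = int((lon + 180.0) / 360.0 * n)
    lat_rad = math.radians(lat)
    y = int((1.0 - math.asinh(math.tan(lat_rad)) / math.pi) / 2.0 * n)
    # Clamp so points on the far edge still fall inside the grid.
    x = min(max(x, 0), n - 1)
    y = min(max(y, 0), n - 1)
    return x, y
```

At zoom level 0 every point falls into the single tile (0, 0); each additional zoom level doubles the number of tiles along each axis.
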
  25. 25. 25
  26. 26. Client • Browser loads JavaScript • Mapbox.js lets you customize a map with very few lines of code • JavaScript • Determines the viewport • Requests each individual tile 26
  27. 27. Client https://tiles.mapbox.com/v4/map.id/17/70428/42997.png?access_token=pk.xxx 27
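
The client step (determine the viewport, then request each tile) can be sketched roughly as follows. The URL layout follows the example on the slide above; the helper names `tile_url` and `viewport_tile_urls`, and the idea of passing the viewport as inclusive tile ranges, are assumptions for illustration and not Mapbox.js internals.

```python
from urllib.parse import urlencode

BASE = "https://tiles.mapbox.com/v4"  # host and version taken from the slide

def tile_url(map_id, zoom, x, y, access_token):
    """Build one tile URL: map ID, zoom level, x, y, plus the access token."""
    query = urlencode({"access_token": access_token})
    return f"{BASE}/{map_id}/{zoom}/{x}/{y}.png?{query}"

def viewport_tile_urls(map_id, zoom, x_range, y_range, access_token):
    """List the URLs for every tile in an inclusive (min, max) x and y range."""
    return [
        tile_url(map_id, zoom, x, y, access_token)
        for y in range(y_range[0], y_range[1] + 1)
        for x in range(x_range[0], x_range[1] + 1)
    ]
```

A viewport three tiles wide and two tiles high therefore produces six independent requests, each of which can be cached separately.
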
  28. 28. 28
  29. 29. CDN • Content Distribution Network • Physical cache close to users • AWS: CloudFront • Others: Akamai, Fastly, CloudFlare 29
  30. 30. CDN 30
  31. 31. CDN • When a request comes in: • Find nearest edge location • Terminate TLS • Match request to behaviour • Look in cache (based on URL & Query String) • If object is there: return 31
  32. 32. CDN • Your CDN works best if it can serve everything from cache • How to remove stale data? • Trade-off: high cache hit rate vs. update delay • Time-To-Live (TTL): how long until a cached object expires • We use 5 minutes • 35% cache hit rate 32
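
The cache lookup and the TTL trade-off can be illustrated with a small sketch. This is not how CloudFront is implemented; it is a toy in-memory cache keyed on the full URL including the query string, with the 5-minute TTL from the slide as the default. `TTLCache`, `fetch`, and the `origin` callable are hypothetical names.

```python
import time

class TTLCache:
    """Toy edge cache: entries expire ttl_seconds after they were stored."""

    def __init__(self, ttl_seconds=300.0, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._store = {}

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            del self._store[key]  # stale: drop it and report a miss
            return None
        return value

    def put(self, key, value):
        self._store[key] = (value, self.clock() + self.ttl)

def fetch(cache, url, origin):
    """Return (body, was_cache_hit); on a miss, ask the origin and cache the result."""
    cached = cache.get(url)
    if cached is not None:
        return cached, True
    body = origin(url)
    cache.put(url, body)
    return body, False
```

A longer TTL raises the hit rate but delays how quickly an updated tile becomes visible, which is the trade-off named on the slide.
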
  33. 33. 33
  34. 34. DNS • Originally: Resolve domain names to IP addresses • Also: Route requests to the nearest data center • Best region for a request, based on historic latency • AWS: Route 53 • Others: Dyn, easyDNS, Akamai 34
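
As a rough illustration of latency-based routing, the sketch below picks the region with the lowest median of previously observed latencies. The real decision is made inside the DNS provider and its algorithm is not described in the slides; `pick_region` and the sample data shape are assumptions.

```python
from statistics import median

def pick_region(latency_samples_ms):
    """Pick the region whose historic latency samples have the lowest median.

    latency_samples_ms maps a region name to a list of past measurements
    (in milliseconds) for the network the request comes from.
    """
    medians = {
        region: median(samples)
        for region, samples in latency_samples_ms.items()
        if samples  # ignore regions without any measurements
    }
    if not medians:
        raise ValueError("no latency data for any region")
    return min(medians, key=medians.get)
```

For example, a client whose past measurements show about 35 ms to one region and about 95 ms to another would be routed to the first.
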
  35. 35. DNS 35
  36. 36. 36
  37. 37. Load Balancer • Route requests to application servers • Entry point to a region • AWS: Elastic Load Balancer (ELB) • Others: HAProxy, nginx, F5 37
  38. 38. Load Balancer • Terminate TLS • Determine which application server to route to • Healthy server • ELB: Server with least outstanding requests • Wait for results and return 38
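
The selection rule on this slide (a healthy server with the fewest outstanding requests) can be sketched as below. This only models the selection step, not ELB itself; `Backend` and `choose_backend` are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Backend:
    name: str
    healthy: bool = True
    outstanding: int = 0  # requests sent to this server that have not returned yet

def choose_backend(backends):
    """Return the healthy backend with the fewest outstanding requests."""
    healthy = [b for b in backends if b.healthy]
    if not healthy:
        raise RuntimeError("no healthy backends available")
    return min(healthy, key=lambda b: b.outstanding)
```

A caller would increment `outstanding` when forwarding a request and decrement it when the response comes back, so slow servers naturally receive less traffic.
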
  39. 39. 39
  40. 40. Application Servers • Virtual Machines • AWS: Elastic Compute Cloud (EC2) • Others: Google Compute Engine, Rackspace, DigitalOcean 40
  41. 41. Application Servers • c3.xlarge instances • Ubuntu Linux • Node.js/Express 41
  42. 42. Application Servers • Authenticate • Load map data • Fetch tile and return 42
  43. 43. 43
  44. 44. DynamoDB • Primary/Replica • Reads go to replicas, writes only to the primary • Replicas only in 2 regions • Reads from non-replica regions need to go over the Internet • In-instance caching of authentication/map information • See https://www.mapbox.com/blog/scaling-the-mapbox-infrastructure-with-dynamodb-streams/ 44
  45. 45. 45
  46. 46. Application Servers Fetch tiles • Check simultaneously in the cache (Redis) and the object store (S3) • Return from wherever the tile is found first • If only found in the object store, update the local cache 46
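
The tile fetch described on this slide can be sketched with asyncio. The talk's servers run Node.js, so this is only an illustration of the control flow in Python; `cache_get`, `store_get`, and `cache_put` are hypothetical async callables standing in for the Redis and S3 clients, and a missing tile is represented as `None`.

```python
import asyncio

async def fetch_tile(key, cache_get, store_get, cache_put):
    """Query cache and object store concurrently; return the first hit, or None."""
    cache_task = asyncio.create_task(cache_get(key))
    store_task = asyncio.create_task(store_get(key))
    pending = {cache_task, store_task}
    try:
        while pending:
            done, pending = await asyncio.wait(
                pending, return_when=asyncio.FIRST_COMPLETED
            )
            # If both finished together, look at the cache result first.
            for task in sorted(done, key=lambda t: t is store_task):
                tile = task.result()
                if tile is None:
                    continue  # a miss in this backend; keep waiting for the other
                if task is store_task:
                    # The hit came from the object store: warm the local cache.
                    await cache_put(key, tile)
                return tile
        return None  # neither backend had the tile
    finally:
        # Stop whichever lookup is still running; a no-op for finished tasks.
        cache_task.cancel()
        store_task.cancel()
```

Racing both lookups means a cache miss does not add its latency to the object-store read, at the cost of issuing an object-store request that is often not needed.
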
  47. 47. Application Servers • Redis is used as a least-recently-used (LRU) cache, so popular tiles for a region are usually cached • S3 is slow because the data lives only in a us-east-1 bucket • Stats: • 80% cache hits • r3.4xlarge with 122 GB of memory 47
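
To show why an LRU policy keeps a region's popular tiles in memory, here is a minimal exact-LRU sketch. Redis itself approximates LRU by sampling keys rather than tracking exact order, so this illustrates the policy and not Redis's implementation; `LRUCache` is a hypothetical name.

```python
from collections import OrderedDict

class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # a read counts as a use
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        while len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used
```

Tiles that are requested often keep moving to the fresh end of the order, so only rarely used tiles are evicted and have to be fetched from the object store again.
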
  48. 48. 48
  49. 49. 49
  50. 50. From 2 to 9 regions 50
  51. 51. From 2 to 9 regions 51
  52. 52. From 2 to 9 regions 52
  53. 53. Thanks • What is Mapbox? • 9 Data Centers? • Tracing a Map Request @freenerd johan@mapbox.com 53
  54. 54. http://geodevelopers.org/ 54
  55. 55. Elasticity • EC2 instances are provisioned via an Auto Scaling group • Auto Scaling is based on instance CPU load • Scale up if CPU load is over 55% for 2 minutes • Scale down if CPU load is under 20% for 2 minutes 55
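
The scaling rule on this slide can be expressed as a small decision function. In practice the rule would be configured as alarms and scaling policies rather than written by hand; `scaling_decision`, and the assumption that the caller passes exactly the CPU samples covering the last 2 minutes, are illustrative only.

```python
def scaling_decision(cpu_samples, scale_up_above=55.0, scale_down_below=20.0):
    """Return +1 to scale up, -1 to scale down, 0 to do nothing.

    cpu_samples holds CPU load percentages covering the last 2 minutes
    (how many samples that is depends on the metric period, assumed here).
    """
    if not cpu_samples:
        return 0  # no data: leave the group alone
    if all(sample > scale_up_above for sample in cpu_samples):
        return 1  # sustained high load for the whole window
    if all(sample < scale_down_below for sample in cpu_samples):
        return -1  # sustained low load for the whole window
    return 0
```

Requiring every sample in the window to cross the threshold keeps a short spike or dip from adding or removing instances.
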
  56. 56. 56
