How Mapbox Scales over 9 AWS Regions

Johan

Slides from DevOps Madrid, November 2015: http://www.meetup.com/madrid-devops/events/225555911/

How Mapbox Scales Over 9 Data Centers
Johan
@freenerd
johan@mapbox.com
Madrid DevOps November 2015
1

• What is Mapbox?
• 9 data centers?
• Tracing a map request
2

9 Data Centers?
• Why run this in many regions?
• One region = cheaper, less complex, easier to build and maintain
11

9 Data Centers?
• Global high availability
• Global low latency
12

9 Data Centers?
Global high availability
• Mapbox is critical infrastructure for our customers
• Mapbox SLA: 99.9%
• Problems for high availability:
  • AWS problems
  • Mapbox software or configuration problems
  • Critical deploys
14
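
To make the 99.9% figure concrete, the arithmetic below converts an availability target into a downtime budget. It is a minimal sketch: the function name and the 30-day and 365-day periods are illustrative assumptions, not terms taken from the Mapbox SLA.

```python
def allowed_downtime_minutes(sla: float, period_days: float) -> float:
    """Minutes of downtime an availability target permits over a period."""
    total_minutes = period_days * 24 * 60
    return (1.0 - sla) * total_minutes


if __name__ == "__main__":
    # 99.9% leaves a 0.1% budget: 43.2 minutes in 30 days, 8.76 hours in a year.
    print(f"{allowed_downtime_minutes(0.999, 30):.1f} minutes per 30 days")
    print(f"{allowed_downtime_minutes(0.999, 365) / 60:.2f} hours per 365 days")
```

A budget this small is easily consumed by a single regional outage or a bad deploy, which is the motivation for being able to serve from several regions.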

9 Data Centers?
Global low latency
• Can't beat the speed of light
• Latency is critical for using a map
• Bring our data closer to our users
17
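
The speed-of-light point can be put into numbers. The sketch below computes a lower bound on round-trip time over a given distance, assuming light in optical fiber travels at roughly two thirds of its vacuum speed. The distances are hypothetical examples, and real round trips are slower because of routing, queuing, and TLS handshakes.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # speed of light in vacuum
FIBER_FACTOR = 2 / 3               # assumption: typical slowdown inside optical fiber


def min_round_trip_ms(distance_km: float) -> float:
    """Lower bound on round-trip time for a straight fiber path of this length."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_seconds * 1000


if __name__ == "__main__":
    for km in (500, 5_000, 10_000):  # hypothetical user-to-region distances
        print(f"{km:>6} km -> at least {min_round_trip_ms(km):.1f} ms round trip")
```

At 10,000 km the bound is already about 100 ms per round trip, before any server work, so the only way to reduce it is to shorten the distance.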

• Grid over the world
• Every cell of the grid is a tile
• Different zoom levels
• Zoom level 0 is the world
• Zoom level 13 is a city
• Every tile is identified by map ID, coordinates, and zoom level
24
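
The grid described above can be sketched in code. The slides do not name the projection, so this example assumes the common Web Mercator "slippy map" tiling, in which zoom level z has 2^z by 2^z tiles. The function names are illustrative.

```python
import math
from typing import Tuple


def tiles_at_zoom(zoom: int) -> int:
    """Number of tiles in the grid at a zoom level (2^z columns by 2^z rows)."""
    return 4 ** zoom


def lonlat_to_tile(lon: float, lat: float, zoom: int) -> Tuple[int, int]:
    """Column and row of the tile containing a point (Web Mercator assumed)."""
    n = 2 ** zoom
    x = int((lon + 180.0) / 360.0 * n)
    y = int((1.0 - math.asinh(math.tan(math.radians(lat))) / math.pi) / 2.0 * n)
    # Clamp so the antimeridian and the poles stay inside the grid.
    return min(max(x, 0), n - 1), min(max(y, 0), n - 1)


def tile_to_lonlat(x: int, y: int, zoom: int) -> Tuple[float, float]:
    """Longitude and latitude of a tile's north-west corner."""
    n = 2 ** zoom
    lon = x / n * 360.0 - 180.0
    lat = math.degrees(math.atan(math.sinh(math.pi * (1 - 2 * y / n))))
    return lon, lat


if __name__ == "__main__":
    print(tiles_at_zoom(0), tiles_at_zoom(13))
    print(lonlat_to_tile(0.0, 0.0, 1))
```

Under this scheme zoom level 0 is a single tile covering the world, and each additional zoom level quadruples the number of tiles.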

Client
• Browser loads JavaScript
• Mapbox.js allows customizing a map with very few lines
• The JavaScript then:
  • Determines the viewport
  • Requests each individual tile
26

Client
https://tiles.mapbox.com/v4/map.id/17/70428/42997.png?access_token=pk.xxx
27
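
The URL above follows a map ID / zoom / x / y pattern. As a rough illustration of what a client does once it knows which tiles cover the viewport, the sketch below builds one URL per tile. This is not the Mapbox.js implementation: the helper names are hypothetical, and the tile range is assumed to have been computed already.

```python
from typing import Iterator

TILE_BASE = "https://tiles.mapbox.com/v4"  # base path taken from the slide


def tile_url(map_id: str, zoom: int, x: int, y: int, access_token: str) -> str:
    """Build the URL for a single tile."""
    return f"{TILE_BASE}/{map_id}/{zoom}/{x}/{y}.png?access_token={access_token}"


def viewport_tile_urls(map_id: str, zoom: int, x_min: int, x_max: int,
                       y_min: int, y_max: int, access_token: str) -> Iterator[str]:
    """Yield one URL per tile in an inclusive rectangle of tile coordinates."""
    for y in range(y_min, y_max + 1):
        for x in range(x_min, x_max + 1):
            yield tile_url(map_id, zoom, x, y, access_token)


if __name__ == "__main__":
    # A hypothetical viewport three tiles wide and two tiles high.
    for url in viewport_tile_urls("map.id", 17, 70428, 70430, 42997, 42998, "pk.xxx"):
        print(url)
```

Even a small viewport produces several independent requests, which is why per-tile latency matters so much for how fast a map feels.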

CDN
• Content Distribution Network
• Physical cache close to users
• AWS: CloudFront
• Others: Akamai, Fastly, CloudFlare
29

CDN
• When a request comes in:
• Find nearest edge location
• Terminate TLS
• Match request to behaviour
• Look in cache (based on URL & Query String)
• If object is there: return
31
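
The cache lookup step can be sketched as follows. This is an illustrative model of keying a cache on URL path plus query string, not CloudFront's actual implementation; the `EdgeCache` class and `cache_key` function are hypothetical names.

```python
from typing import Dict, Optional
from urllib.parse import urlsplit


def cache_key(url: str) -> str:
    """Key a cached object by URL path and query string."""
    parts = urlsplit(url)
    return f"{parts.path}?{parts.query}"


class EdgeCache:
    """Toy model of one edge location's object cache."""

    def __init__(self) -> None:
        self._objects: Dict[str, bytes] = {}

    def get(self, url: str) -> Optional[bytes]:
        """Return the cached body, or None on a cache miss."""
        return self._objects.get(cache_key(url))

    def put(self, url: str, body: bytes) -> None:
        self._objects[cache_key(url)] = body


if __name__ == "__main__":
    cache = EdgeCache()
    url = "https://tiles.mapbox.com/v4/map.id/17/70428/42997.png?access_token=pk.xxx"
    print(cache.get(url))          # miss: the request would go on to the origin
    cache.put(url, b"tile bytes")
    print(cache.get(url))          # hit: served from the edge
```

Because the query string is part of the key in this model, the same tile requested with two different query strings is stored as two separate objects.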

CDN
• Your CDN works best if it can serve everything from cache
• How to remove stale data?
• Trade-off: high cache hit rate vs. update delay
• Time-To-Live (TTL) determines when a cached object expires
• We use 5 minutes
• 35% cache hit rate
32
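
The trade-off between hit rate and update delay comes down to how expiry works. Below is a minimal sketch of a TTL cache using the 5-minute value from the slide. The class name and the injectable clock are assumptions made so the behavior can be shown without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class TTLCache:
    """Cache whose entries expire a fixed number of seconds after being stored."""

    def __init__(self, ttl_seconds: float = 300.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}

    def put(self, key: str, value: bytes) -> None:
        self._entries[key] = (self._clock() + self._ttl, value)

    def get(self, key: str) -> Optional[bytes]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            # Expired: drop it so the next request goes back to the origin.
            del self._entries[key]
            return None
        return value


if __name__ == "__main__":
    now = [0.0]
    cache = TTLCache(ttl_seconds=300.0, clock=lambda: now[0])
    cache.put("tile", b"v1")
    now[0] = 299.0
    print(cache.get("tile"))  # still cached
    now[0] = 300.0
    print(cache.get("tile"))  # expired
```

A longer TTL would raise the hit rate, but an updated tile could then stay stale at the edge for up to that long.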

DNS
• Originally: Resolve domain names to IP addresses
• Also: Route request to nearest data center
• Best region for a request, based on historic latency
• Amazon: Route 53
• Others: Dyn, easyDNS, Akamai
34
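
Latency-based routing can be pictured as picking the region with the lowest historic latency for a given client network. The sketch below is a simplified model and not how Route 53 is implemented internally. The region names and latency samples are hypothetical, and the median is just one reasonable summary statistic.

```python
from statistics import median
from typing import Dict, List


def best_region(latency_samples_ms: Dict[str, List[float]]) -> str:
    """Return the region with the lowest median historic latency."""
    candidates = {region: samples
                  for region, samples in latency_samples_ms.items() if samples}
    if not candidates:
        raise ValueError("no latency history available")
    return min(candidates, key=lambda region: median(candidates[region]))


if __name__ == "__main__":
    # Hypothetical measurements for a client network in Madrid.
    history = {
        "eu-west-1": [32.0, 35.0, 30.0],
        "eu-central-1": [41.0, 39.0, 44.0],
        "us-east-1": [95.0, 101.0, 98.0],
    }
    print(best_region(history))
```

The DNS answer then points the client at that region's load balancer.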

Load Balancer
• Route requests to application servers
• Entry point to a region
• AWS: Elastic Load Balancer (ELB)
• Others: HAProxy, nginx, F5
37

Load Balancer
• Terminate TLS
• Determine which application server to route to
• Healthy server
• ELB: Server with least outstanding requests
• Wait for results and return
38
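
The routing decision above can be sketched as a selection over the healthy servers. This is an illustrative model of the "least outstanding requests" rule named on the slide, not ELB's actual code; the `Server` type and its fields are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Server:
    name: str
    healthy: bool
    outstanding_requests: int


def pick_server(servers: Iterable[Server]) -> Optional[Server]:
    """Pick the healthy server with the fewest in-flight requests, if any."""
    candidates = [server for server in servers if server.healthy]
    if not candidates:
        return None
    return min(candidates, key=lambda server: server.outstanding_requests)


if __name__ == "__main__":
    fleet = [
        Server("app-1", healthy=True, outstanding_requests=12),
        Server("app-2", healthy=False, outstanding_requests=0),
        Server("app-3", healthy=True, outstanding_requests=4),
    ]
    print(pick_server(fleet))
```

The unhealthy server is skipped even though it has the fewest outstanding requests, so a failing instance cannot attract traffic simply by being idle.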

Application Servers
• Virtual Machines
• AWS: Elastic Compute Cloud (EC2)
• Others: Google Compute Engine, Rackspace, DigitalOcean
40

Application Servers
• c3.xlarge instances
• Ubuntu Linux
• Node.js/Express
41

Application Servers
• Authenticate
• Load map data
• Fetch tile and return
42

DynamoDB
• Primary/Replica
• Reads to replicas, writes only to primary
• Replicas only in 2 regions
• Reads for non-replica regions need to go over the Internet
• In-instance caching of authentication/map information
[1] https://www.mapbox.com/blog/scaling-the-mapbox-infrastructure-with-dynamodb-streams/
44
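
In-instance caching means a server only pays the cross-region read cost the first time it needs a record. The sketch below is a minimal read-through cache under that assumption. `fetch_remote` stands in for a DynamoDB read and is hypothetical. There is no expiry here, whereas a real cache would need some so that changes on the primary become visible.

```python
from typing import Callable, Dict


class ReadThroughCache:
    """Keep records in process memory; go to the remote table only on a miss."""

    def __init__(self, fetch_remote: Callable[[str], dict]) -> None:
        self._fetch_remote = fetch_remote
        self._local: Dict[str, dict] = {}
        self.remote_reads = 0

    def get(self, key: str) -> dict:
        if key not in self._local:
            self.remote_reads += 1
            self._local[key] = self._fetch_remote(key)
        return self._local[key]


if __name__ == "__main__":
    def fetch_remote(map_id: str) -> dict:
        # Stand-in for a read that may cross regions over the Internet.
        return {"id": map_id, "owner": "example"}

    cache = ReadThroughCache(fetch_remote)
    cache.get("map.id")
    cache.get("map.id")
    print(cache.remote_reads)
```

Only the first lookup reaches the remote table. The second is answered from memory, which matters most in regions that have no local replica.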

Application Servers
Fetch tiles:
• Check simultaneously in the cache (Redis) and the object store (S3)
• Return the tile from wherever it is found first
• If it is only found in the object store, update the local cache
46
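
The fetch strategy above can be sketched with two concurrent lookups. This is a hedged illustration, not Mapbox's code (their servers run Node.js). The `cache_get`, `store_get`, and `cache_put` callables are hypothetical stand-ins for Redis and S3 clients.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Lookup = Callable[[str], Awaitable[Optional[bytes]]]
Store = Callable[[str, bytes], Awaitable[None]]


async def fetch_tile(key: str, cache_get: Lookup, store_get: Lookup,
                     cache_put: Store) -> Optional[bytes]:
    """Query cache and object store at once; return the first tile found."""
    cache_task = asyncio.ensure_future(cache_get(key))
    store_task = asyncio.ensure_future(store_get(key))
    pending = {cache_task, store_task}
    while pending:
        done, pending = await asyncio.wait(
            pending, return_when=asyncio.FIRST_COMPLETED)
        for task in done:
            tile = task.result()
            if tile is None:
                continue  # this source missed; keep waiting for the other
            if task is store_task:
                # Found in the object store: fill the cache if it missed.
                if await cache_task is None:
                    await cache_put(key, tile)
            else:
                store_task.cancel()  # the cache answered; drop the slow lookup
            return tile
    return None


if __name__ == "__main__":
    cache: Dict[str, bytes] = {}
    store: Dict[str, bytes] = {"17/70428/42997": b"png bytes"}

    async def cache_get(k: str) -> Optional[bytes]:
        return cache.get(k)

    async def store_get(k: str) -> Optional[bytes]:
        await asyncio.sleep(0.01)  # the object store is the slower path
        return store.get(k)

    async def cache_put(k: str, v: bytes) -> None:
        cache[k] = v

    print(asyncio.run(fetch_tile("17/70428/42997", cache_get, store_get, cache_put)))
    print(cache)  # the tile is now cached for the next request
```

Issuing both lookups at once means a cache miss costs no extra round trip before the object store is asked, at the price of some redundant object store requests.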

Application Servers
• Redis is used as a least-recently-used (LRU) cache, so popular tiles for a region are usually cached
• S3 is slow because the data is in a us-east-1 bucket only
• Stats:
• 80% cache hits
• r3.4xlarge with 122 GB of memory
47
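
The least-recently-used policy is what keeps a region's popular tiles in memory. The sketch below shows an exact LRU cache for illustration. Redis itself approximates LRU by sampling keys, and the capacity here is counted in entries rather than bytes, so this is a simplification.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity: int) -> None:
        self._capacity = capacity
        self._items: "OrderedDict[str, bytes]" = OrderedDict()

    def get(self, key: str) -> Optional[bytes]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: bytes) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # evict the least recently used


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    cache.put("popular", b"a")
    cache.put("rare", b"b")
    cache.get("popular")         # touching a tile keeps it warm
    cache.put("new", b"c")       # evicts "rare", the least recently used
    print(cache.get("popular"), cache.get("rare"), cache.get("new"))
```

Tiles that are requested often keep moving to the recent end and survive, while rarely requested tiles fall out and are fetched from S3 again when needed.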

Thanks
• What is Mapbox?
• 9 Data Centers?
• Tracing a Map Request
@freenerd
johan@mapbox.com
53

Elasticity
• EC2 instances are provisioned via Auto Scaling Group
• Auto Scaling is based on instance CPU load
• Scale up if CPU load is over 55% for 2 minutes; scale down if it is under 20% for 2 minutes
55
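
The scaling rule can be written as a small decision function. This is a sketch of the thresholds stated on the slide, not an Auto Scaling API call. The sampling interval, the function name, and the return values are assumptions.

```python
from typing import Sequence

SCALE_UP_THRESHOLD = 55.0    # percent CPU, from the slide
SCALE_DOWN_THRESHOLD = 20.0  # percent CPU, from the slide


def scaling_decision(cpu_samples: Sequence[float]) -> str:
    """Decide from CPU load samples (percent) covering the last 2 minutes."""
    if not cpu_samples:
        return "hold"
    if all(sample > SCALE_UP_THRESHOLD for sample in cpu_samples):
        return "scale_up"
    if all(sample < SCALE_DOWN_THRESHOLD for sample in cpu_samples):
        return "scale_down"
    return "hold"


if __name__ == "__main__":
    # Hypothetical samples taken every 30 seconds.
    print(scaling_decision([61.0, 58.5, 70.2, 66.0]))
    print(scaling_decision([61.0, 40.0, 70.2, 66.0]))
    print(scaling_decision([12.0, 9.5, 15.0, 18.0]))
```

Requiring every sample in the window to cross the threshold avoids reacting to short spikes. The gap between 20% and 55% keeps the group from oscillating between adding and removing instances.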
