Scaling to 30,000 Requests Per Second and Beyond with MongoDB

Mike Chesnut
Director of Operations Engineering
Crittercism

MongoDB World
June 23-25
world.mongodb.com
Code: 25GN for 25% off

What I’ll Talk About
● Crittercism - Overview
● Router (mongos) Architecture
● Sharding Considerations
● The Balancer and Me
● Q&A

How a Startup Gets Started
● Pick something and go with it
● Make mistakes along the way
● Correct the mistakes you can
● Work around the ones you can’t

Critter-What?
A Brief History...

Architecture
[Diagram labels: API; Feedback; App Loads; Crashes; Handled Exceptions]

Architecture
[Diagram labels: API; DynamoDB; Feedback; App Loads; Crashes; Handled Exceptions; Metadata]

Architecture
[Diagram labels: API (x2); DynamoDB; Feedback; App Loads; Crashes; Handled Exceptions; Metadata; Performance Data; Geo Data]

Critter-What?
… Which brings us to today.

Critter-What?
● feedback widget
● crash reporting
● live stats
● crash grouping
● app performance management
● geo data
● user analytics
● executive dashboard

Architecture
[Diagram labels: API (x2); DynamoDB; Feedback; App Loads; Crashes; Handled Exceptions; Metadata; Performance Data; Geo Data]
40,000+ req/s

Router Architecture
[Diagram: "Client Application(s)" shows two application servers, each with a client process and a mongos. "MongoDB Cluster" shows three replica sets, each with three mongod servers.]

Router Architecture
Single mongos per client problems we encountered:
● thousands of connections to config servers
● config server CPU load
● configdb propagation delays

Router Architecture
[Diagram: three regions labeled "Client Application(s)", "Router Tier", and "MongoDB Cluster". Labels: application server (x2); client process (x2); mongos (x2); replica set (x3); mongod server (x9).]

Router Architecture
Separate mongos tier advantages:
● greatly reduced number of connections to each mongod
● far fewer hosts talking to the config servers
● much faster configdb propagation
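
To put rough numbers on that fan-in, here is a back-of-the-envelope sketch. The host counts, the pool size, and the helper name `fanIn` are hypothetical and not figures from the talk. It also assumes a fixed connection pool per mongos, which ignores that routers in a shared tier each carry more traffic.

```javascript
// Rough model: every mongos talks to the config servers and keeps a pool of
// connections to each mongod. All numbers below are illustrative assumptions.
function fanIn(mongosCount, poolPerMongos) {
  return {
    // hosts the config servers have to serve
    configServerClients: mongosCount,
    // connections arriving at any one mongod
    connectionsPerMongod: mongosCount * poolPerMongos,
  };
}

// One mongos per application server (500 app servers, hypothetical).
const perAppServer = fanIn(500, 20);
// A dedicated tier of 10 mongos hosts (hypothetical).
const routerTier = fanIn(10, 20);

console.log(perAppServer, routerTier);
```

The absolute values mean little; the point is that both quantities scale with the number of mongos processes, which a router tier keeps small and fixed.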

Router Architecture
Disadvantages:
● additional network hop
● fewer points of failure, but each mongos failure now affects many clients

Sharding Considerations
Pick something you want to live with.

The Balancer and Me
Why wouldn’t you run the balancer in the first place?
● great question
● for us, it’s because we deleted a ton of data at one point, and left a bunch of holes
○ we turned it off while deleting this data
○ and then were unable to turn it back on
● but maybe you start without it
● or maybe you need to turn it off for maintenance and forget to turn it back on
Obviously, don’t do this. But if you do, here’s what happens...
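
One cheap guard against the "forgot to turn it back on" case is a periodic check of the balancer state. The sketch below is a minimal illustration, not something from the talk. It assumes an object with a `getBalancerState()` method, which the mongo shell's `sh` helper provides; the function name `balancerWarning` and the stub are hypothetical.

```javascript
// `shell` is any object exposing getBalancerState(), as the mongo shell's
// `sh` helper does. Passing it in lets the check run outside the shell too.
function balancerWarning(shell) {
  const enabled = shell.getBalancerState();
  return enabled ? null : "balancer is disabled; re-enable it after maintenance";
}

// Stand-in for `sh` so the sketch runs anywhere (hypothetical stub).
const fakeSh = { getBalancerState: () => false };
console.log(balancerWarning(fakeSh));
```

Inside a shell connected to a mongos, the call would be `balancerWarning(sh)`.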

The Balancer and Me
Fresh, new, empty cluster… But no balancer running.

The Balancer and Me
Now we’re pretty full, so let’s add another shard...

The Balancer and Me
And keep inserting...

The Balancer and Me
Suddenly we find ourselves with a very unbalanced cluster.

The Balancer and Me
But if we enable the balancer, it will DoS the 5th shard!
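
A small sketch shows why the new shard takes the brunt. The chunk counts are made up for illustration, and `inboundMigrations` is a hypothetical helper. It only computes how many chunks each shard would need to receive to reach an even split, which is a simplification of what the real balancer does.

```javascript
// Chunks per shard; the fifth shard was added late and holds nothing yet.
// These counts are invented for illustration.
const chunkCounts = { shard1: 1000, shard2: 1000, shard3: 1000, shard4: 1000, shard5: 0 };

// Number of chunks each shard would have to receive to reach an even split.
function inboundMigrations(counts) {
  const shards = Object.keys(counts);
  const total = shards.reduce((sum, s) => sum + counts[s], 0);
  const target = Math.floor(total / shards.length);
  const inbound = {};
  for (const s of shards) {
    inbound[s] = Math.max(0, target - counts[s]);
  }
  return inbound;
}

const inbound = inboundMigrations(chunkCounts);
console.log(inbound);
```

Every migration in this example lands on the fifth shard, while the other four only give chunks away.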

The Balancer and Me
The approximate effect looks something like this:

The Balancer and Me
So what can we do?
1. add IOPS
2. make sure your config servers have plenty of CPU (and IOPS)
3. slowly move chunks manually
4. approach a balanced state
5. hold your breath
6. try re-enabling the balancer

The Balancer and Me
How to manually balance:
1. determine a chunk on a hot shard
2. monitor effects on both the source and target shards
3. move the chunk
4. allow the system to settle
5. repeat
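
The loop above can be sketched as a planning function. This is a simplified illustration, not the procedure used in the talk. It treats "hot" as "holds the most chunks", which is an assumption, since in practice load is judged from monitoring. The names `pickNextMove` and `planMoves` and the sample counts are hypothetical.

```javascript
// Pick the shard with the most chunks as the source and the one with the
// fewest as the target. Returns null once they are within `tolerance`.
function pickNextMove(counts, tolerance) {
  const shards = Object.keys(counts);
  let hot = shards[0];
  let cold = shards[0];
  for (const s of shards) {
    if (counts[s] > counts[hot]) hot = s;
    if (counts[s] < counts[cold]) cold = s;
  }
  if (counts[hot] - counts[cold] <= tolerance) return null;
  return { from: hot, to: cold };
}

// Plan one-chunk-at-a-time moves on a copy of the counts, capped at maxMoves.
function planMoves(counts, tolerance, maxMoves) {
  const state = Object.assign({}, counts);
  const moves = [];
  while (moves.length < maxMoves) {
    const move = pickNextMove(state, tolerance);
    if (move === null) break;
    state[move.from] -= 1;
    state[move.to] += 1;
    moves.push(move);
  }
  return { moves: moves, finalCounts: state };
}

const startingCounts = { a: 5, b: 5, c: 0 };
const plan = planMoves(startingCounts, 1, 100);
console.log(plan.moves.length, plan.finalCounts);
```

Planning against a copy keeps the real counts untouched; each planned move would still be executed and observed one at a time, with a pause between moves.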

The Balancer and Me
1. determine a chunk on a hot shard
mongos> use config
mongos> db.chunks.find({"shard":"<shard_name>", "ns":"<db_name>.<collection>"}).limit(1).pretty()
The result is a single chunk document with its min and max bounds; note the shard key value and ObjectId in min.

"min" : {
"unsymbolized_hash" :
"1572663b72e87[...]",
"_id" : ObjectId("50b97db98238[...]")
},
The Balancer and Me
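
To decide which shard is worth draining first, it helps to tally chunks per shard. The sketch below does this in plain JavaScript over an array of chunk documents. It assumes each document has at least the `ns` and `shard` fields used in the query above. The sample data and the name `chunksPerShard` are hypothetical.

```javascript
// Tally chunk documents by shard for one namespace. Each document is assumed
// to look like an entry from config.chunks, with `ns` and `shard` fields.
function chunksPerShard(chunkDocs, ns) {
  const counts = {};
  for (const doc of chunkDocs) {
    if (doc.ns !== ns) continue;
    counts[doc.shard] = (counts[doc.shard] || 0) + 1;
  }
  return counts;
}

// Hypothetical sample standing in for db.chunks.find().toArray().
const sampleChunks = [
  { ns: "app.crashes", shard: "shard0000" },
  { ns: "app.crashes", shard: "shard0000" },
  { ns: "app.crashes", shard: "shard0001" },
  { ns: "app.other", shard: "shard0001" },
];

const perShard = chunksPerShard(sampleChunks, "app.crashes");
console.log(perShard);
```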

The Balancer and Me
2. monitor effects on both the source and target shards
iostat -xhm 1
mongostat

3. move the chunk
mongos> sh.moveChunk("<db_name>.<collection>", {
"unsymbolized_hash" : "1572663b72e87[...]",
"_id" : ObjectId("50b97db98238[...]") },
"<target_shard>")
The Balancer and Me
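
Repeating that command by hand gets tedious, so a small wrapper can move a short list of chunks one at a time. This is a sketch under stated assumptions, not the tooling from the talk. It takes an object with a `moveChunk(ns, find, to)` method like the shell's `sh`, and assumes the call returns a document with `ok: 1` on success, as the legacy mongo shell does. The `settle` callback, the function name, and the stub are hypothetical.

```javascript
// Move chunks one at a time, stopping at the first failure.
// `settle` is a caller-supplied pause or health check run after each move.
function moveChunksOneByOne(shell, ns, chunkMins, targetShard, settle) {
  let moved = 0;
  for (const min of chunkMins) {
    const result = shell.moveChunk(ns, min, targetShard);
    if (!result || result.ok !== 1) break; // stop and investigate on any error
    moved += 1;
    settle();
  }
  return moved;
}

// Hypothetical stub standing in for `sh`: the third call fails.
const calls = [];
const stubSh = {
  moveChunk: function (ns, find, to) {
    calls.push({ ns: ns, find: find, to: to });
    return { ok: calls.length < 3 ? 1 : 0 };
  },
};

let settles = 0;
const movedCount = moveChunksOneByOne(
  stubSh,
  "db.coll",
  [{ k: 1 }, { k: 2 }, { k: 3 }, { k: 4 }],
  "shard5",
  function () { settles += 1; }
);
console.log(movedCount, settles, calls.length);
```

Stopping on the first failure keeps a struggling target shard from being hit with further migrations before someone has looked at it.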

Conclusion here:
Run the balancer.
The Balancer and Me

Summary
● Design ahead of time
○ “NoSQL” lets you play it by ear
○ but some of these decisions will bite you later
● Be willing to correct past mistakes
○ dedicate time and resources to adapting
○ learn how to live with the mistakes you can’t correct

References
● MongoDB Blog post: http://blog.mongodb.org/post/77278906988/crittercism-scaling-to-billions-of-requests-per-day-on
● MongoDB Documentation on mongos routers: http://docs.mongodb.org/master/core/sharded-cluster-query-routing/
● MongoDB Documentation on the balancer: http://docs.mongodb.org/manual/tutorial/manage-sharded-cluster-balancer/
● MongoDB Documentation on shard keys: http://docs.mongodb.org/manual/core/sharding-shard-key/
● Crittercism: http://www.crittercism.com/
