You fire up your brand-new MongoDB Atlas cluster, and the application that used to run like a charm slows to a crawl. Welcome to space, time, and the laws of computing without warp drives. If you want to know how to maintain thousands of operations per second across thousands of miles, this talk is for you.
7. In theory and in practice...
[Figure: map of the AWS regions us-west-2, us-west-1, us-east-1, and us-east-2 with the minimum fiber round-trip time between them, in milliseconds. Values shown: 5, 32, 37, 8]
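The theoretical floor can be estimated from distance alone. A minimal sketch, assuming that light in fiber travels at roughly two thirds of its vacuum speed (a common rule of thumb, not a figure from the talk) and that the cable follows the straight-line distance; the function name is illustrative. For the 2,300 mi between us-east-1 and us-west-1 quoted in the setup, it gives about 37 ms.

```python
# Rough lower bound on round-trip time over optical fiber.
# Assumption: signal speed in fiber is about 2/3 of the speed of light.
SPEED_OF_LIGHT_KM_S = 299_792.458
FIBER_FACTOR = 2 / 3
KM_PER_MILE = 1.609344


def min_fiber_rtt_ms(distance_miles: float) -> float:
    """Return the minimum round-trip time in milliseconds."""
    distance_km = distance_miles * KM_PER_MILE
    one_way_s = distance_km / (SPEED_OF_LIGHT_KM_S * FIBER_FACTOR)
    return 2 * one_way_s * 1000


if __name__ == "__main__":
    print(f"{min_fiber_rtt_ms(2300):.0f} ms")
```

Real cables do not run in straight lines and every router adds delay, so measured times sit well above this bound.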
8. ... and what we actually get
[Figure: the same map of us-west-2, us-west-1, us-east-1, and us-east-2 with measured ICMP ping times, in milliseconds. Values shown: 11, 52, 62, 21]
9. I Don’t Have That Problem
I’ll co-locate my app and my data. One region is good enough
• Some of us don’t have that luxury
• Our users are not where our app is
I wish I got even close to 100ms round trip
• Let’s take a closer look at that app
• 100ms always matters
10. Latency Matters
• Google — increase page load by 500ms, 25% fewer searches
• Amazon — for each 100ms, lose 1% of sales
• Facebook — pages 500ms slower, 1% drop-off in traffic
• A one-second delay in page response decreases customer satisfaction by 16%
— Campbell / Majors, Database Reliability Engineering
11. What can we do?
• Latency is significant – and it won't go away
• Avoid, Ignore, Embrace
• Get the most out of every round-trip (batching)
• Do something else during round-trip (async)
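Why batching and async help can be seen with back-of-the-envelope arithmetic: if every request costs at least one round trip, a client that waits for each reply completes at most 1/RTT requests per second. Packing several documents into one request, or keeping several requests open at once, multiplies that ceiling. A minimal sketch; the function and parameter names are illustrative, and the bound ignores server time, replication, and bandwidth, so real numbers will be lower.

```python
def max_docs_per_second(rtt_ms: float, batch_size: int = 1, in_flight: int = 1) -> float:
    """Upper bound on documents per second when each request costs one round trip.

    batch_size: documents carried by a single request (batching).
    in_flight:  requests kept open concurrently (async).
    """
    round_trips_per_second = 1000.0 / rtt_ms
    return round_trips_per_second * batch_size * in_flight


if __name__ == "__main__":
    for label, kwargs in [
        ("one at a time", {}),
        ("batches of 1,000", {"batch_size": 1000}),
        ("100 requests in flight", {"in_flight": 100}),
    ]:
        print(f"{label}: {max_docs_per_second(62, **kwargs):,.0f} docs/s")
```

At a 62 ms round trip, one document per request caps out at about 16 documents per second no matter how fast the server is.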
13. Insert documents into MongoDB, with long-range replication across transcontinental links. Some inserts will fail due to duplicate keys. Catch those and report the offending documents.
14. Setup
• us-east-1 to us-west-1 (2,300 mi, 62 ms)
• 2-member replica set on m4.16xlarge
• MongoDB 4.0.10
• write concern w:2, j:false
• clients in Python 3.7.3 / PyMongo 3.8.0 / Motor 2.0.0
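17. sync / single
    from random import random
    from pymongo.errors import DuplicateKeyError

    # coll is a pymongo Collection; num_docs is the number of documents to insert.
    errors = []
    for i in range(num_docs):
        doc = {"_id": i, "a": random()}
        try:
            coll.insert_one(doc)  # one request, and one wait, per document
        except DuplicateKeyError:
            errors.append(doc)  # keep the offending document for the report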
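20. sync / bulk
    from random import random
    from pymongo import InsertOne

    # Send batch_size documents per request instead of one.
    for i in range(0, num_docs, batch_size):
        batch = [
            InsertOne({"_id": j, "a": random()})
            for j in range(i, min(i + batch_size, num_docs))  # do not overshoot num_docs
        ]
        coll.bulk_write(batch)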
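21. sync / bulk, unordered
    from pymongo.errors import BulkWriteError

    errors = []
    for i in range(0, num_docs, batch_size):
        batch = [
            InsertOne({"_id": j, "a": random()})
            for j in range(i, min(i + batch_size, num_docs))
        ]
        try:
            # ordered=False: keep inserting the rest of the batch after a failure.
            coll.bulk_write(batch, ordered=False)
        except BulkWriteError as e:
            for x in e.details["writeErrors"]:
                error_id = x["op"]["_id"]
                # get_document is a helper that is not shown on the slides.
                errors.append(get_document(batch, error_id))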
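For insert operations, each entry in `BulkWriteError.details["writeErrors"]` carries the failed document under `op`, together with an error `code` (11000 for duplicate keys) and the `index` of the operation within the batch. A minimal sketch of a helper that collects the duplicates from such a dictionary; the function name is hypothetical and this is not necessarily how the talk's `get_document` works. It is exercised here on a hand-built dictionary of the same shape, so no server is needed.

```python
DUPLICATE_KEY = 11000  # MongoDB server error code for duplicate keys


def duplicate_documents(details: dict) -> list:
    """Return the documents that were rejected with a duplicate key error."""
    return [
        err["op"]
        for err in details.get("writeErrors", [])
        if err.get("code") == DUPLICATE_KEY
    ]


if __name__ == "__main__":
    # Hand-built stand-in for BulkWriteError.details (assumed shape).
    sample_details = {
        "nInserted": 2,
        "writeErrors": [
            {"index": 1, "code": 11000, "errmsg": "E11000 duplicate key error",
             "op": {"_id": 7, "a": 0.25}},
        ],
    }
    print(duplicate_documents(sample_details))
```

Filtering on the error code keeps other write failures, such as validation errors, out of the duplicate report.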
31.
                          sync / single   sync / bulk        async / single   async / bulk
east-1 / west-1 (62 ms)   8               52,700 / 100,000   490              140,000 / 8,000
east-1 / east-2 (11 ms)   39              70,000 / 30,000    1,400            140,000 / 2,500
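The async variants are built on Motor, but their code is not part of this transcript. The idea behind them, keeping many requests in flight so that the waits overlap, can be illustrated without a database. A minimal sketch that simulates each insert with `asyncio.sleep`; the latency constant, function names, and the semaphore-based limit are all assumptions for illustration, not the talk's implementation.

```python
import asyncio
import time

SIMULATED_RTT_S = 0.02  # stand-in for network latency, not a measured value


async def fake_insert(doc: dict) -> dict:
    """Pretend to insert a document: just wait one simulated round trip."""
    await asyncio.sleep(SIMULATED_RTT_S)
    return doc


async def insert_all(docs: list, max_in_flight: int) -> int:
    """Insert all docs, allowing at most max_in_flight concurrent requests."""
    semaphore = asyncio.Semaphore(max_in_flight)

    async def guarded(doc: dict) -> None:
        async with semaphore:
            await fake_insert(doc)

    await asyncio.gather(*(guarded(doc) for doc in docs))
    return len(docs)


def timed_run(num_docs: int, max_in_flight: int) -> float:
    """Return the wall-clock seconds needed to insert num_docs documents."""
    docs = [{"_id": i} for i in range(num_docs)]
    start = time.perf_counter()
    asyncio.run(insert_all(docs, max_in_flight))
    return time.perf_counter() - start


if __name__ == "__main__":
    print(f"one at a time: {timed_run(20, 1):.2f} s")
    print(f"20 in flight:  {timed_run(20, 20):.2f} s")
```

With one request in flight the waits add up; with twenty in flight they overlap, so the total approaches a single round trip.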
32. Summary
• Computing becomes an intercontinental Game of Chess
• ... and Einstein is on the table
• Understand what your latencies are – they won't go away
• Avoid, Ignore, Embrace
• Batching and asynchronous programming