Mike Kania
Production Engineer @ Parse
Benchmarking, Load
Testing, and Preventing
Terrible Disasters
What Parse Does
We have 500k+ apps running on Parse.
Provide services to —
•Store user data
•Run server side JavaScript
•Send push notifications
•Handle crash reporting
•Generate analytics
Parse + MongoDB
• Use much of MongoDB’s feature set
• Support almost every type of workload you can imagine
• Millions of collections and indexes
• New ones being created every minute
• Run MongoDB exclusively on AWS
• We do crazy things with MongoDB
Why Should You Listen
to Me?
• Parse has one of the most complex MongoDB infrastructures (in the world?)
• Started using MongoDB at 1.8
• Upgraded to 2.6 everywhere 6 months ago
• We have some battle wounds from upgrading MongoDB to pass on to you
Why Shouldn’t You
Listen to Me?
MongoDB is a jack of all trades, and there are certain features that we haven’t touched.
•Sharding — We built our own way
to shard data
•Aggregation/Map Reduce — We
don’t touch this at all
History of MongoDB
Upgrades at Parse
1.8  2.0  2.2  2.4  2.6  3.0
(1.8 through 2.6: “do it live”)
Cowboy Upgrade
1. Review the “Upgrade Requirements” and known bugs in JIRA
2. Run integration/unit tests against the new version
3. Spin up a hidden secondary. Watch for problems
4. Unhide the SECONDARY. Watch for problems
5. Promote to PRIMARY
6. Declare success! Oh wait, I mean watch for problems.
What Went Wrong
• 60% perf reduction
• geo index queries held the global lock until the first document was found
• unindexable writes were suddenly refused
• the definition of scan limits changed
A New Approach
1.8  2.0  2.2  2.4  2.6  3.0
(1.8 through 2.6: “do it live”; for 3.0: do it with production workloads in a test environment)
Flashback
• Open sourced benchmarking
tool specifically for MongoDB
• Captures production
workloads
• Replay those workloads
over and over again with
configurable speeds
• Recently merged a pull request to support load testing with MongoDB sharding
Record
Get the config set up:
•oplog_server: A secondary that will be used to
tail the oplog for write operations
•profiler_server: The primary in the target replica
set to capture profiling data
•duration_sec: Defines how long you want to
record
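A minimal sketch of what those three settings might look like as a config fragment; the hostnames are placeholders and the exact file layout is an assumption (check the flashback README for the real format):

```python
# Hypothetical flashback record config; flashback's actual format may differ.
oplog_server = {"host": "secondary-1.example.com", "port": 27017}   # tails the oplog for writes
profiler_server = {"host": "primary-1.example.com", "port": 27017}  # reads profiling data
duration_sec = 3600  # record one hour of production traffic
```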
Enable Profiling
• Keep in mind, it does an additional write for every
operation.
•./set_mongo_profiling.py -a enable -n
$PRIMARY_HOSTNAME
Moar Better Recording
• What about just capturing it over the wire?
• Maybe use mongosniff
• MongoDB has a built-in pcap library
• Enter mongocaputils
• Also open source
• Still a little buggy
Running the Record
./record.py
Creating a Consistent
Snapshot
Need a way to quickly capture a consistent
snapshot of your dataset
We use EBS snapshots:
• lock mongod
• create an EBS snapshot of all the RAIDed volumes on /var/lib/mongodb
• unlock mongod
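The lock/snapshot/unlock cycle can be sketched as a shell function. This is a dry-run sketch that only echoes the commands; the volume IDs are placeholders, and you would replace each `echo` with a real invocation:

```shell
#!/bin/sh
# Dry-run sketch of the consistent-snapshot cycle; replace `echo` with
# real invocations. Volume IDs are placeholders.
snapshot_mongo() {
  # Flush dirty pages and block writes so the snapshot is consistent
  echo "mongo --eval 'db.fsyncLock()'"
  # Snapshot every EBS volume in the RAID under /var/lib/mongodb
  for vol in vol-raid0 vol-raid1 vol-raid2 vol-raid3; do
    echo "aws ec2 create-snapshot --volume-id $vol --description mongodb-raid"
  done
  # Writes resume once every member volume has a snapshot started
  echo "mongo --eval 'db.fsyncUnlock()'"
}

snapshot_mongo
```

The lock must cover all RAID members; snapshotting only some of them would give a torn stripe.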
Quickly Replaying
Workloads
• Pre-warming EBS snapshots after each run is slow
• Pulling the blocks down from S3 takes hours or days if you have terabytes of data
•We decided to use LVM on top of EBS
•Does incur I/O overhead
•Allows us to do LVM snapshots!
How we used LVM
Define a restore point before benchmarking:
• lvcreate -l 10%VG -s -n restore_point /dev/mongovg/mongoraid
Merge the copy-on-write logical volume to roll back:
• Stop MongoDB
• Unmount the filesystem
• lvconvert --merge /dev/mongovg/restore_point
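Put together, one benchmark iteration looks roughly like this. It is a dry-run sketch that echoes the commands; the device paths follow the slide, while the mongod stop and umount invocations are assumptions about the host setup:

```shell
#!/bin/sh
# Dry-run sketch of one benchmark iteration with LVM snapshots.
# Device paths are from the talk; service/mount commands are assumptions.
benchmark_iteration() {
  # 1. Take a cheap copy-on-write restore point (10% of the VG reserved for CoW)
  echo "lvcreate -l 10%VG -s -n restore_point /dev/mongovg/mongoraid"
  # 2. Run the replay against this dataset
  echo "flashback -ops_filename=OUTPUT -style=real -url=\$MONGO_HOST:27017 -workers=50"
  # 3. Roll back: stop mongod, unmount, merge the CoW volume into the origin
  echo "service mongod stop"
  echo "umount /var/lib/mongodb"
  echo "lvconvert --merge /dev/mongovg/restore_point"
}

benchmark_iteration
```

After the merge completes, the filesystem is back at the restore point and the next iteration can start from identical data.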
Creating the Test
Environment
• Spin up a new EC2 instance and restore the EBS volumes from snapshot
• New EBS volumes need to be pre-warmed; blocks are lazily loaded from S3
• A benchmark server runs Flashback and has the recorded workload on disk
• Nothing special needs to happen here
Benchmarking New
Shiny Storage Engines
In MongoDB 3.0, each storage engine has a different on-disk format. So we also need to run an initial sync of each new storage engine against our restored MMAPv1 backup, and then run benchmarks on each format.
[Diagram: MMAPv1 (restored from snapshot) → initial sync → WiredTiger; MMAPv1 → initial sync → RocksDB]
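A sketch of how each engine node might be stood up so it initial-syncs from the restored MMAPv1 member. This is a dry-run that echoes the commands; the replica set name and dbpaths are placeholders (`--storageEngine` is the real 3.0 flag, and `rocksdb` is only available in the mongodb-partners build):

```shell
#!/bin/sh
# Dry-run sketch: start one empty mongod per storage engine and let each
# initial-sync from the restored MMAPv1 replica set member.
# Replica set name and dbpaths are placeholders.
start_engine_node() {
  engine="$1"
  dbpath="/var/lib/mongodb-$engine"
  echo "mongod --replSet bench --storageEngine $engine --dbpath $dbpath"
}

start_engine_node wiredTiger
start_engine_node rocksdb
```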
Side Note: The Storage
Efficiency of the RocksDB/
WiredTiger is Amazing*
*You should totally check out the “Storage Engine Wars” talk
by Charity Majors and Igor Canadi
[Bar chart: on-disk storage size — MMAPv1 at 3,245GB vs. roughly 283–318GB for WiredTiger and RocksDB]
Running the Replay
• Two styles to replay: real and stress

flashback \
  -ops_filename=OUTPUT \
  -style=real \
  -url=$MONGO_HOST:27017 \
  -workers=50
[Diagram: Flashback replaying against MongoDB 2.6 MMAPv1, MongoDB 3.0 MMAPv1, and MongoDB 3.0 RocksDB]
Metrics Gathering
• Flashback reports percentile latencies broken down by operation type.
• Useful from a high level
• Not so useful when diving into query regressions
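Those per-operation percentiles can be computed with a simple nearest-rank method; a minimal sketch on toy data (Flashback's exact percentile method may differ):

```python
import math
from collections import defaultdict

def percentile(samples, p):
    """Nearest-rank percentile: the ceil(p/100 * n)-th smallest sample."""
    xs = sorted(samples)
    k = max(1, math.ceil(p / 100 * len(xs)))
    return xs[k - 1]

# Toy latencies (ms) bucketed by op type, standing in for a replay's results
latencies = defaultdict(list)
latencies["query"] = [2.8, 2.9, 3.0, 3.0, 3.1, 3.1, 3.2, 4.4, 180.0, 600.0]

for op, samples in latencies.items():
    print(op, "p50:", percentile(samples, 50), "p99:", percentile(samples, 99))
```

This also shows why p99 is the interesting number here: the median barely moves while a couple of slow outliers dominate the tail.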
Logging Pipeline
• Mongo logs are hard to parse.
• Thankfully you don’t need to worry about it
• Just use our open source PEG parser, mongologtools
• Ship JSON via Scribe to an internal Facebook data diving tool
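To show the shape of the problem, here is a toy parser for one classic 2.6-style query log line using a regex. This is only an illustration of what gets extracted; the real mongologtools PEG parser handles many more log shapes and versions, and the sample line and field names here are made up:

```python
import re

# A made-up mongod 2.6-style query log line for illustration.
LINE = ("2015-06-01T12:00:00.000+0000 [conn123] query appdata.users "
        "query: { _id: 1 } ntoreturn:1 nscanned:1 nreturned:1 104ms")

# Toy pattern: connection context, op type, namespace, docs scanned, duration.
pattern = re.compile(
    r"\[(?P<ctx>\w+)\] (?P<op>\w+) (?P<ns>\S+) .*?"
    r"nscanned:(?P<nscanned>\d+) .*?(?P<duration>\d+)ms$"
)

def parse_line(line):
    """Return a dict of extracted fields, or None if the line doesn't match."""
    m = pattern.search(line)
    return m.groupdict() if m else None

print(parse_line(LINE))
```

Once every line is a dict like this, shipping it as JSON to a downstream analysis tool is trivial.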
First Results

p50 Query Latency

Op    | 2.6 MMAPv1 | 3.0 MMAPv1 | 3.0 RocksDB
query | 2.93ms     | 4.43ms     | 3.04ms

p99 Query Latency

Op    | 2.6 MMAPv1 | 3.0 MMAPv1  | 3.0 RocksDB
query | 177.41ms   | 619471.47ms | 1441442.26ms
First Regression
• Regression in $nearSphere queries, just in 3.0
• SERVER-17469, patched in 3.0.2
• After the fix, average latency for $nearSphere went from 2354 ms to 35 ms
More Ad-Hoc Analysis
[Scatter plots: query duration (ms) vs. # documents scanned, for MMAPv1 and RocksDB]
P99 Latency
[Bar chart: p99 latency (0ms–40ms) per operation type (query, insert, remove, update, findandmodify, count) for 2.6 MMAPv1, 3.0 MMAPv1, and 3.0 RocksDB]
Some time later…
Benchmarks Won’t
Find Everything
• [RocksDB] Prefix collision could happen between restarts
  https://github.com/mongodb-partners/mongo/commit/da8a90b3b71bf291684ffc5a6d2fd32118ce1a7b
• [MongoDB] Secondary reads block replication
  https://jira.mongodb.org/browse/SERVER-18190
Where are we now with
testing 3.0?
• MongoDB 3.0 with RocksDB is serving some
production traffic and it looks amazing.
[Chart: API request latency in milliseconds]
Linkage
• Flashback
• https://github.com/ParsePlatform/flashback
• Mongologtools
• https://github.com/tmc/mongologtools
• MongoDB 3.0 Benchmarking Results
• http://blog.parse.com/learn/engineering/mongodb-rocksdb-writing-so-fast-it-makes-your-head-spin/
• nearSphere regression
• https://jira.mongodb.org/browse/SERVER-17469
• WT/RocksDB secondary crash
• https://jira.mongodb.org/browse/SERVER-17882
