Scaling MongoDB for real time analytics
    Scaling MongoDB for real time analytics: Presentation Transcript

    • What we do: We want to make digital advertising an amazing user experience. There is more to metrics than clicks.
    • Ads
    • Data
    • Assembling sessions (diagram: exposure, ping and event records ➔ session)
    • Information
    • Crunching (diagram: many sessions ➔ 42)
    • Metrics
    • Reports
    • What we do: Track ads, make pretty reports.
    • That doesn’t sound so hard: We don’t know when sessions end. There’s a lot of data. It’s all done in (close to) real time.
    • Numbers: 200 Gb logs; 100 million data points per day; ~300 metrics per data point; = 6000 updates/s at peak.
    • How we use(d) MongoDB: “Virtual memory” to offload data while we wait for sessions to finish. Short time storage (<48 hours) for batch jobs, replays and manual analysis. Metrics storage.
    • Why we use MongoDB: Schemalessness makes things so much easier, the data we collect changes as we come up with new ideas. Sharding makes it possible to scale writes. Secondary indexes and rich query language are great features (for the metrics store). It’s just… nice.
    • Btw. We use JRuby, it’s awesome.
    • A story in 9 iterations
    • 1st iteration: secondary indexes and updates. One document per session, update as new data comes along. Outcome: 1000% write lock.
    • #1 Everything is about working around the GLOBAL WRITE LOCK
    • MongoDB 1.8.1
      db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
      db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
    • MongoDB 2.0.0
      db.coll.update({_id: "xyz"}, {$inc: {x: 1}}, true)
      db.coll.update({_id: "abc"}, {$push: {x: "..."}}, true)
    • 2nd iteration: using scans for two step assembling. Instead of updating, save each fragment, then scan over _id to assemble sessions.
    • 2nd iteration: using scans for two step assembling. Outcome: not as much lock, but still not great performance. We also realised we couldn’t remove data fast enough.
    • #2 Everything is about working around the GLOBAL WRITE LOCK
    • #3 Give a lot of thought to your PRIMARY KEY
    • 3rd iteration: partitioning. Partitioning the data by writing to a new collection every hour. Outcome: complicated, fragmented database.
    • #4 Make sure you can REMOVE OLD DATA
    • 4th iteration: sharding. To get around the global write lock and get higher write performance we moved to a sharded cluster. Outcome: higher write performance, lots of problems, lots of ops time spent debugging.
    • #5 Everything is about working around the GLOBAL WRITE LOCK
    • #6 SHARDING IS NOT A SILVER BULLET and it’s complex, if you can, avoid it
    • #7 IT WILL FAIL, design for it
    • 5th iteration: moving things to separate clusters. We saw very different loads on the shards and realised we had databases with very different usage patterns, some that made autosharding not work. We moved these off the cluster. Outcome: a more balanced and stable cluster.
    • #8 Everything is about working around the GLOBAL WRITE LOCK
    • #9 ONE DATABASE with one usage pattern PER CLUSTER
    • #10 MONITOR EVERYTHING, look at your health graphs daily
    • 6th iteration: monster machines. We got new problems removing data and needed some room to breathe and think. Solution: upgraded the servers to High-Memory Quadruple Extra Large (with cheese). I ♥
    • #11 Don’t try to scale up, SCALE OUT
    • #12 When you’re out of ideas CALL THE EXPERTS
    • 7th iteration: partitioning (again) and pre-chunking. We rewrote the database layer to write to a new database each day, and we created all chunks in advance. We also decreased the size of our documents by a lot. Outcome: no more problems removing data.
    • #13 Smaller objects mean a smaller database, and a smaller database means LESS RAM NEEDED
    • #14 Give a lot of thought to your PRIMARY KEY
    • #15 Everything is about working around the GLOBAL WRITE LOCK
    • 8th iteration: realize when you have the wrong tool. Transient data might not need all the bells and whistles. Outcome: Redis gave us 100x performance in the assembling step.
    • #16 When all you have is a HAMMER everything looks like a NAIL
    • 9th iteration: rinse and repeat. We now have the same scaling issues later in the chain. Outcome: Upcoming rewrite to make writes/updates more effective. Redis was actually slower.
    • #17 Everything is about working around the GLOBAL WRITE LOCK
    • Thank you @effata
    • Since we got time…
    • Tips: EC2. You have three copies of your data, do you really need EBS? Instance store disks are included in the price and they have predictable performance. m1.xlarge comes with 1.7 TB of storage.
    • Tips: Avoid bulk inserts. Very dangerous if there’s a possibility of duplicate key errors. It’s not fixed in 2.0 even though the driver has a flag for it.
    • Tips: Safe mode. Run every Nth insert in safe mode. This will give you warnings when bad things happen, like failovers.
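
The safe mode tip can be sketched roughly as below. This is not the code from the talk, only a minimal illustration of "every Nth insert in safe mode". The names `makeSampledInserter` and `insertFn`, and the `{ safe: ... }` options shape, are assumptions made for the example; map the flag onto whatever acknowledged-write option your driver actually provides.

```javascript
// Sketch only: wraps an insert function so that every Nth call asks for an
// acknowledged ("safe") write. `insertFn` and the `safe` option are
// hypothetical stand-ins for the real driver call and its write option.
function makeSampledInserter(insertFn, n) {
  if (!Number.isInteger(n) || n < 1) {
    throw new RangeError("n must be a positive integer");
  }
  let count = 0;
  return function insert(doc) {
    count += 1;
    const safe = count % n === 0; // true on the Nth, 2Nth, 3Nth, ... call
    return insertFn(doc, { safe: safe });
  };
}

// Usage with a fake insert function that only records the flag it was given.
const seen = [];
const insert = makeSampledInserter(function (doc, opts) {
  seen.push(opts.safe);
}, 3);
for (let i = 0; i < 6; i++) {
  insert({ i: i });
}
// seen is now [false, false, true, false, false, true]
```

Most writes stay fire-and-forget and cheap, while the sampled acknowledged writes surface errors such as a failover within at most N inserts.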