Mongo Sharding: Case Study


Tips, tricks and lessons learned from sharding a 200GB production database.

  • MongoDB, like almost all other servers, performs best when the data resides in memory, eliminating or reducing the need for the expensive trip to disk to retrieve data. As datasets grow in size, increasing available memory becomes resource and cost prohibitive. Sharding allows the data to be distributed across multiple servers, each one containing part of the collection. This reduces the memory requirements per server and can result in better performance overall.
  • A production-ready sharded environment might look something like this. You have multiple replica sets; in this case, there are 3. You have 3 config servers as well. In development environments you can use a single config server, but in production you should always have 3 for redundancy.
  • A replica set is a group of mongod servers that maintain the same data set. It provides redundancy and high availability, and should be your standard for all production environments. It is *NOT* a substitute or alternative for a good backup strategy.
  • Consistently high page faults combined with a database lock rate remaining above 50% indicate the servers are struggling to keep up with demand.
  • Changing a shard key is not possible or practical, so it’s impossible to put enough emphasis on choosing the right shard key, especially if you wait until late in the game to shard (like we did). There’s no rule I can give you for how to choose the shard key; it really depends on the environment and the data set (as much as I hate answers that contain the phrase “it depends”). Use the system.profile collection to see which queries are commonly slow. Work with your developers to understand the data. Make sure your developers understand sharding in MongoDB.
  • As nice as it looks now, this has the potential to be a horrible shard key design. Consider a write-heavy scenario where new documents are being inserted: if the _id values only ever increase, every new document lands in the last range, so one shard takes all the writes. But consider a static collection where new documents aren’t commonly inserted. All of a sudden this looks pretty darn sweet!
  • Expanding production involved building new replica sets that would accept the sharded collections. I use Amazon EC2 for this environment, so expanding was pretty easy: create an EC2 image, clone instances from it, update /etc/mongod.conf to reflect the new replica set, then initiate the new replica set. Sure, there are existing community EC2 images provided by MongoDB, Inc., but I chose to use my own. Does dev match prod? For me: no. To accommodate this expansion (I was sharding two different databases into 3 shards each), sharding production alone increased my server overhead by 12 servers. To make dev match would have taken another 12 servers, for a total of 24 servers. Not a good fit for our financial model at this time. End result? The data in dev mimics prod, as does the database structure, but the hardware does not.
  • We’ve done all our homework. We know we need to shard, we’ve figured out what we should shard, identified our shard keys, and we’ve built the hardware to do it. The shard key must be indexed. In this case, we’re using a hash of the _id as our shard key, so we need to create that index since it doesn’t exist.
  • Now that the sharding commands have been issued, how do you tell what’s going on? How can you tell if things are going wrong?
  • Once sharding is complete and the balancer has moved the chunks, you’ll see a nice, even distribution of chunks. This wasn’t the case for us.
  • Initially, we weren’t seeing any chunk distribution. Checking the logs showed that one of the config servers was out of sync on server time with the other two. Troubleshooting revealed that this server couldn’t reach the NTP servers to update its clock. After fixing that, we made some progress, but not much. Checking the logs for the balancer showed many “aborted” chunk migrations. This indicates that the chunk was being moved, but a write operation occurred on that chunk. When the chunk changes during migration, mongod has to abort the migration. To get around this, you have to either reduce the write operations to the collection or dedicate separate windows of operation to writes and to balancing.
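To make the shard key trade-off from the notes above concrete, here is a small simulation in plain JavaScript (Node.js). It is a sketch under assumptions, not MongoDB's routing code: the three shards and the 0 to 600 key range come from the "Choosing A Shard Key" slide, while the even split into ranges, the function names, the 3000 new inserts, and the use of MD5 as a stand-in hash are all made up for illustration. It routes new, ever-increasing _id values first by range and then by hash.

```javascript
// Illustrative simulation only: this is NOT how MongoDB routes chunks internally.
const crypto = require('crypto');

const SHARDS = 3;       // three shards, as on the slide
const KEY_SPACE = 600;  // existing _id values 0..599, as on the slide

// Ranged sharding (assumed even split): each shard owns a contiguous slice.
// Keys past the current maximum fall into the last, open-ended range.
function rangedShard(id) {
  const width = KEY_SPACE / SHARDS;
  return Math.min(Math.floor(id / width), SHARDS - 1);
}

// Hashed sharding: route on a hash of the key instead of the key itself.
// MD5 is only a stand-in hash function for this sketch.
function hashedShard(id) {
  const digest = crypto.createHash('md5').update(String(id)).digest();
  return digest.readUInt32BE(0) % SHARDS;
}

// Count how many of `n` new, monotonically increasing ids land on each shard.
function distribute(route, firstId, n) {
  const counts = new Array(SHARDS).fill(0);
  for (let id = firstId; id < firstId + n; id++) {
    counts[route(id)] += 1;
  }
  return counts;
}

const ranged = distribute(rangedShard, KEY_SPACE, 3000);
const hashed = distribute(hashedShard, KEY_SPACE, 3000);

console.log('ranged:', ranged); // every insert lands on the last shard
console.log('hashed:', hashed); // inserts are spread over all three shards
```

Under these assumptions the ranged key sends every new insert to the last shard, while the hashed key spreads them out. For a mostly static collection, as the notes say, the ranged key looks much better.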

    1. Phoenix MUG Sharding: A Case Study @wfbutton
    2. Overview • What is sharding • How we knew it was time to shard • What to shard • Choosing a shard key • Building servers • Integrating sharding into a production environment • Monitoring for success/failure • Lessons learned • Things you can do today
    3. About Me • DevOps/IT/DBA for • Extensive background in both development and ops, specifically in scalability and sustainability Will Button | @wfbutton |
    4. What is sharding?
    5. Mongos • Routing service for mongo shards • Likely to run on application server • Apps talk to mongos in sharded environment. Config Servers • Stores metadata for the cluster • Not a replica set • Metadata consists of: collections, shards, chunks, mongos instances
    6. When Should I Shard?
    7. When Should I Shard?
        { "ts" : ISODate("2013-11-01T01:34:30.683Z"), "op" : "query", "ns" : "MyListContent.RepoThing", "query" : { "query" : { "provId" : "cae56942-5c9c-776c-0506-2c9f4092e107", "provUnifiedId" : "40233411" }, "$readPreference" : { "mode" : "primary" } }, "ntoreturn" : 0, "ntoskip" : 0, "nscanned" : 58, "keyUpdates" : 0, "numYield" : 16, "lockStats" : { "timeLockedMicros" : { "r" : NumberLong(1260461), "w" : NumberLong(0) }, "timeAcquiringMicros" : { "r" : NumberLong(1275073), "w" : NumberLong(2369) } }, "nreturned" : 57, "responseLength" : 91643, "millis" : 1200, "client" : "", "user" : "" }
        • 58 records scanned • 57 documents returned • 1200 milliseconds • YIKES!
    8. What To Shard
        socialcatalog03:SECONDARY> db.system.profile.aggregate( { $group: { _id: "$ns", count: { $sum: 1 } } } )
        { "result" : [ { "_id" : "admin.$cmd", "count" : 1 }, { "_id" : "MyListContent.BrandPage", "count" : 97 }, { "_id" : "MyListContent.RepoThingUpdate", "count" : 50 }, { "_id" : "MyListContent.RepoThing", "count" : 1824 } ], "ok" : 1 }
        system.profile collection is your friend!
    9. Choosing A Shard Key • Next to getting married, the most important decision you’ll ever make
    10. Choosing A Shard Key • Collection: stuff • Shard key: _id • [Diagram: _id values from 0 to 600 split into contiguous ranges across Shard 1, Shard 2, Shard 3]
    11. Adding New Servers • Expanding production • Using Amazon EC2 • Updating production • Does dev match prod? Steps: Build EC2 Image, Clone Instances, Update conf, rs.initiate()
    12. Shard: Actual Steps
        mongos> db.BrandPage.ensureIndex( { "_id": "hashed" } )
        mongos> sh.shardCollection("MyListContent.BrandPage", { "_id": "hashed" })
    13. Monitoring Shard Status
    14. Monitoring Shard Status
        mongos> db.BrandPage.getShardDistribution()
        Shard socialcatalog03 at socialcatalog03/,,
          data : 1.26GiB docs : 3334394 chunks : 41
          estimated data per chunk : 31.49MiB
          estimated docs per chunk : 81326
        Totals
          data : 1.26GiB docs : 3334394 chunks : 41
          Shard socialcatalog03 contains 100% data, 100% docs in cluster, avg obj size on shard : 406B
    15. mongos> db.BrandPage.getShardDistribution()
        Shard rs210 at rs210/,,
          data : 54.48MiB docs : 122774 chunks : 7
          estimated data per chunk : 7.78MiB
          estimated docs per chunk : 17539
        Shard rs220 at rs220/,,
          data : 54.09MiB docs : 122151 chunks : 7
          estimated data per chunk : 7.72MiB
          estimated docs per chunk : 17450
        Shard rs310 at rs310/,,
          data : 54.65MiB docs : 123138 chunks : 7
          estimated data per chunk : 7.8MiB
          estimated docs per chunk : 17591
        Shard rs320 at rs320/,,
          data : 54.63MiB docs : 123163 chunks : 7
          estimated data per chunk : 7.8MiB
          estimated docs per chunk : 17594
        Shard socialcatalog02 at socialcatalog02/,,
          data : 46.54MiB docs : 105031 chunks : 6
          estimated data per chunk : 7.75MiB
          estimated docs per chunk : 17505
        Shard socialcatalog03 at socialcatalog03/,,
          data : 99.9MiB docs : 242755 chunks : 7
          estimated data per chunk : 14.27MiB
          estimated docs per chunk : 34679
        Totals
          data : 364.31MiB docs : 839012 chunks : 41
          Shard rs210 contains 14.95% data, 14.63% docs in cluster, avg obj size on shard : 465B
          Shard rs220 contains 14.84% data, 14.55% docs in cluster, avg obj size on shard : 464B
          Shard rs310 contains 15% data, 14.67% docs in cluster, avg obj size on shard : 465B
          Shard rs320 contains 14.99% data, 14.67% docs in cluster, avg obj size on shard : 465B
          Shard socialcatalog02 contains 12.77% data, 12.51% docs in cluster, avg obj size on shard : 464B
          Shard socialcatalog03 contains 27.42% data, 28.93% docs in cluster, avg obj size on shard : 431B
    16. Sharding takes time… • But check the logs
        Thu Nov 21 00:19:35.964 [Balancer] caught exception while doing balance: error checking clock skew of cluster,, :: caused by :: 13650 clock skew of the cluster,, is too far out of bounds to allow distributed locking.
        Thu Nov 21 21:25:16.249 [conn84709] about to log metadata event: { _id: "aws-prod-mongo301-2013-11-21T21:25:16528e7a3c374ed2e78b6298e4", server: "aws-prod-mongo301", clientAddr: "", time: new Date(1385069116248), what: "moveChunk.from", ns: "MyListContent.BrandPage", details: { min: { _id: -7394546541005003026 }, max: { _id: -6937685518831975781 }, step1 of 6: 0, note: "aborted" } }
    17. Tips, Tricks, Gotchas • Always use 3 config servers • Always use NTP • Always use CNAMES • Always specify configdb servers in the same order • Shard early, shard often
    18. Things You Can Do Today • Enable/analyze system.profile • Identify long running queries • Review indexes, queries and performance • Verify replica sets are in sync • Setup alerting for replica set sync • Replica sets are not backups • Schedule a data review with the devs to plan sharding strategies
    19. Sharding MongoDB: A Case Study Say hi! @wfbutton
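
The "enable/analyze system.profile" and "identify long running queries" advice can be prototyped offline on exported profiler documents. The following is a minimal sketch in plain JavaScript (Node.js), not a MongoDB API: the field names (ns, millis, nscanned, nreturned) match the profile entry on slide 7, but the helper name summarizeProfile, the 100 ms threshold, and the second sample entry are assumptions made up for illustration. It groups entries by namespace, as the aggregate on slide 8 does, and counts the slow ones.

```javascript
// Minimal sketch: summarize profiler documents that were already fetched
// from system.profile. Helper name and threshold are hypothetical.
function summarizeProfile(entries, slowMillis) {
  const byNamespace = {};
  for (const entry of entries) {
    if (!byNamespace[entry.ns]) {
      byNamespace[entry.ns] = { count: 0, slow: 0, maxMillis: 0 };
    }
    const stats = byNamespace[entry.ns];
    stats.count += 1;                                   // like { $sum: 1 } per ns
    if (entry.millis >= slowMillis) stats.slow += 1;    // slow operation counter
    stats.maxMillis = Math.max(stats.maxMillis, entry.millis);
  }
  return byNamespace;
}

// The first entry uses the values from slide 7.
// The second entry is invented purely to show the grouping.
const sample = [
  { ns: 'MyListContent.RepoThing', millis: 1200, nscanned: 58, nreturned: 57 },
  { ns: 'MyListContent.BrandPage', millis: 12, nscanned: 1, nreturned: 1 },
];

const summary = summarizeProfile(sample, 100);
console.log(summary);
```

Namespaces with a high count and a high share of slow operations are the first candidates to review for indexes and, if that is not enough, for sharding.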