Sharding Methods for MongoDB

Learn about various sharding methods for MongoDB.

Published in: Technology

  1. Sharding Methods for MongoDB. Jay Runkel, jay.runkel@mongodb.com, @jayrunkel, #MongoDB
  2. Agenda • Customer Stories • Sharding for Performance/Scale – When to shard? – How many shards do I need? • Types of Sharding • How to Pick a Shard Key • Sharding for Other Reasons
  3. Customer Stories
  4. [Slide has no text]
  5. Foursquare • 50M users • 6B check-ins to date (6M per day growth) • 55M points of interest / venues • 1.7M merchants using the platform for marketing • Operations per second: 300,000 • Documents: 5.5B
  6. Foursquare clusters • 11 MongoDB clusters – 8 are sharded • Largest cluster has 15 shards (check-ins) – Sharded on user id
  7. CarFax • Large data set
  8. CarFax Shards • 13 billion+ documents – 1.5 billion documents added every year • 1 vehicle history report is > 200 documents • 12 shards • 9-node replica sets • Replicas distributed across 3 data centers
  9. [Slide has no text]
  10. What is Sharding?
  11. Sharding Overview • Diagram: Application and Driver connect to multiple Query Routers, which sit in front of Shard 1, Shard 2, Shard 3, … Shard N; each shard is a replica set with one Primary and two Secondaries
  12. Scaling: Sharding • Read/Write Scalability • Diagram: 1 mongod, Key Range 0..100
  13. Scaling: Sharding • Read/Write Scalability • Diagram: 2 mongods, Key Range 0..50 and Key Range 51..100
  14. Scaling: Sharding • Read/Write Scalability • Diagram: 4 mongods, Key Range 0..25, Key Range 26..50, Key Range 51..75, and Key Range 76..100
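The key-range splits drawn on the three Scaling slides can be sketched as a lookup from shard key value to shard. This is only an illustration of the idea, not MongoDB's router: `route_by_range` and the bound lists are hypothetical names, and the inclusive upper bounds are read off the slide diagrams.

```python
from bisect import bisect_left

# Inclusive upper bound of each shard's key range, as drawn on the slides.
ONE_SHARD = [100]                 # 0..100
TWO_SHARDS = [50, 100]            # 0..50, 51..100
FOUR_SHARDS = [25, 50, 75, 100]   # 0..25, 26..50, 51..75, 76..100


def route_by_range(key, upper_bounds):
    """Return the index of the shard whose key range contains `key`."""
    if not 0 <= key <= upper_bounds[-1]:
        raise ValueError(f"key {key} is outside the sharded key space")
    # The first bound that is >= key identifies the owning shard.
    return bisect_left(upper_bounds, key)


for key in (0, 25, 26, 60, 100):
    print(key, "-> shard", route_by_range(key, FOUR_SHARDS))
```

Adding shards only changes the list of bounds; the lookup stays the same, which is why each split halves the key range a single mongod has to serve.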
  15. How do I know I need to shard?
  16. Does one server/replica… • Have enough disk space to store all my data? • Handle my query throughput (operations per second)? • Respond to queries fast enough (latency)?
  17. Does one server/replica set… • Have enough disk space to store all my data? • Handle my query throughput (operations per second)? • Respond to queries fast enough (latency)? • Server Specs: Disk Capacity, Disk IOPS, RAM, Network
  18. How many shards do I need?
  19. Disk Space: How Many Shards Do I Need? • Sum of disk space across shards > required storage size
  20. Disk Space: How Many Shards Do I Need? • Sum of disk space across shards > required storage size • Example: Storage size = 3 TB, Server disk capacity = 2 TB, 2 Shards Required
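The disk-space example works out as a simple round-up. The sketch below assumes every shard uses a server with the same disk capacity; `shards_for_disk` is a hypothetical helper name. It treats a combined capacity exactly equal to the storage size as sufficient, so leave extra headroom in practice.

```python
import math


def shards_for_disk(storage_tb, disk_per_server_tb):
    """Smallest shard count whose combined disk capacity holds `storage_tb`."""
    return math.ceil(storage_tb / disk_per_server_tb)


# Slide example: 3 TB of data on servers with 2 TB of disk each.
print(shards_for_disk(storage_tb=3, disk_per_server_tb=2))
```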
  21. RAM: How Many Shards Do I Need? • Working set should fit in RAM – Sum of RAM across shards > Working Set • WorkSet = Indexes plus the set of documents accessed frequently • WorkSet in RAM: – Shorter latency – Higher throughput
  22. RAM: How Many Shards Do I Need? • Measuring Index Size and Working Set – db.stats(): index size of each collection – db.serverStatus({ workingSet: 1}): working set size estimate
  23. RAM: How Many Shards Do I Need? • Measuring Index Size and Working Set – db.stats(): index size of each collection – db.serverStatus({ workingSet: 1}): working set size estimate • Example: Working Set = 428 GB, Server RAM = 128 GB, 428/128 = 3.34, 4 Shards Required
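The RAM rule ("sum of RAM across shards > working set") can be checked shard count by shard count. This is a minimal sketch with hypothetical function names; it assumes identical servers and ignores memory used by the operating system and other processes.

```python
def ram_covers_working_set(shards, ram_per_server_gb, working_set_gb):
    """True when the combined RAM of all shards exceeds the working set."""
    return shards * ram_per_server_gb > working_set_gb


def shards_for_ram(working_set_gb, ram_per_server_gb):
    """Smallest shard count that satisfies the RAM rule."""
    shards = 1
    while not ram_covers_working_set(shards, ram_per_server_gb, working_set_gb):
        shards += 1
    return shards


# Slide example: 428 GB working set, 128 GB of RAM per server.
for n in (3, 4):
    print(n, "shards:", n * 128, "GB total, enough:", ram_covers_working_set(n, 128, 428))
print("shards required:", shards_for_ram(428, 128))
```

Three shards give 384 GB, which is below the 428 GB working set, so the fourth shard is what brings the total (512 GB) above it.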
  24. Disk Throughput: How Many Shards Do I Need? • Sum of IOPS across shards > required IOPS • IOPS are difficult to estimate – Update doc – Update indexes – Append to journal – Log entry? • Best approach: build a prototype and measure
  25. Disk Throughput: How Many Shards Do I Need? • Sum of IOPS across shards > required IOPS • IOPS are difficult to estimate – Update doc – Update indexes – Append to journal – Log entry? • Best approach: build a prototype and measure • Example: Required IOPS = 11000, Server disk IOPS = 5000, 3 Shards Required
  26. OPS: How Many Shards Do I Need? • S = ops/sec of a single server • G = required ops/sec • N = # of shards • G = N × S × 0.7, so N = G / (0.7 × S)
  27. OPS: How Many Shards Do I Need? • S = ops/sec of a single server • G = required ops/sec • N = # of shards • G = N × S × 0.7, so N = G / (0.7 × S) • Sharding overhead: the 0.7 factor
  28. OPS: How Many Shards Do I Need? • S = ops/sec of a single server • G = required ops/sec • N = # of shards • G = N × S × 0.7, so N = G / (0.7 × S) • Example: S = 4000, G = 10000, N = 3.57, 4 Shards
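The throughput formula can be evaluated directly. In this sketch `shards_for_ops` is a hypothetical helper, and the 0.7 default is the sharding-overhead factor from the slides; the right value for a real cluster would have to be measured.

```python
import math


def shards_for_ops(required_ops, single_server_ops, efficiency=0.7):
    """Return (exact N, N rounded up) for N = G / (efficiency * S)."""
    exact = required_ops / (efficiency * single_server_ops)
    return exact, math.ceil(exact)


# Slide example: S = 4000 ops/sec per server, G = 10000 ops/sec required.
exact, shards = shards_for_ops(required_ops=10_000, single_server_ops=4_000)
print(f"N = {exact:.2f}, so {shards} shards")
```

Without the overhead factor the same inputs would suggest 3 shards (10000 / 4000 = 2.5), so the factor is what pushes this example to 4.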
  29. Types of Sharding
  30. Sharding Types • Range • Tag-Aware • Hashed
  31. Range Sharding • Read/Write Scalability • Diagram: 4 mongods, Key Range 0..25, Key Range 26..50, Key Range 51..75, and Key Range 76..100
  32. Tag-Aware Sharding • Shard Tags: 4 mongods tagged Winter, Spring, Summer, Fall • Tag Ranges (Shard Tag: Start to End) – Winter: 23 Dec to 21 Mar – Spring: 22 Mar to 21 Jun – Summer: 21 Jun to 23 Sep – Fall: 24 Sep to 22 Dec
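The tag ranges on the Tag-Aware Sharding slide amount to a date-to-tag lookup, with each tag pinned to a shard. The sketch below is an illustration under assumptions: the shard names are hypothetical, both range ends are treated as inclusive, and because the slide lists 21 Jun in both Spring and Summer, the first matching range wins.

```python
from datetime import date

# (tag, (start month, start day), (end month, end day)), both ends inclusive,
# copied from the Tag Ranges table on the slide.
TAG_RANGES = [
    ("Winter", (12, 23), (3, 21)),
    ("Spring", (3, 22), (6, 21)),
    ("Summer", (6, 21), (9, 23)),
    ("Fall", (9, 24), (12, 22)),
]

# Hypothetical shard names; the slide shows one mongod per tag.
SHARD_FOR_TAG = {"Winter": "shard0", "Spring": "shard1",
                 "Summer": "shard2", "Fall": "shard3"}


def tag_for(day):
    """Return the first tag whose date range contains `day`."""
    month_day = (day.month, day.day)
    for tag, start, end in TAG_RANGES:
        if start <= end:
            inside = start <= month_day <= end
        else:  # the range wraps around the end of the year (Winter)
            inside = month_day >= start or month_day <= end
        if inside:
            return tag
    raise ValueError(f"no tag range covers {day}")


for day in (date(2024, 1, 15), date(2024, 7, 4), date(2024, 12, 22)):
    tag = tag_for(day)
    print(day.isoformat(), "->", tag, "->", SHARD_FOR_TAG[tag])
```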
  33. Hash-Sharding • Diagram: 4 mongods, Hash Range 0000..4444, Hash Range 4445..8000, Hash Range 8001..aaaa, and Hash Range aaab..ffff
  34. Hashed shard key • Pros: – Evenly distributed writes • Cons: – Random data (and index) updates can be IO intensive – Range-based queries turn into scatter gather • Diagram: mongos in front of Shard 1, Shard 2, Shard 3, Shard N
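Both the main advantage and the main drawback of a hashed shard key show up in a small simulation. This is a sketch under assumptions: SHA-256 is a stand-in and not a claim about MongoDB's internal hash function, the four hash ranges are equal quarters of 0x0000..0xffff rather than the uneven ranges on the slide, and `shard_for_hashed_key` is a hypothetical name.

```python
import hashlib
from collections import Counter

NUM_SHARDS = 4  # matches the four mongods on the Hash-Sharding slide


def shard_for_hashed_key(key):
    """Hash the key into 0x0000..0xffff, then pick one of four equal hash ranges."""
    digest = hashlib.sha256(str(key).encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:2], "big")  # 16-bit hash value
    return bucket * NUM_SHARDS // 0x10000


# Pro: monotonically increasing keys still spread across all shards.
writes_per_shard = Counter(shard_for_hashed_key(k) for k in range(10_000))
print("writes per shard:", sorted(writes_per_shard.items()))

# Con: a range query over adjacent keys has to visit many shards (scatter gather).
shards_hit = {shard_for_hashed_key(k) for k in range(100, 200)}
print("shards touched by keys 100..199:", sorted(shards_hit))
```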
  35. Range sharding document distribution
  36. Hashed sharding document distribution
  37. How Do I Pick a Shard Key?
  38. Shard Key characteristics • A good shard key has: – sufficient cardinality – distributed writes – targeted reads ("query isolation") • Shard key should be in every query if possible – scatter gather otherwise • Choosing a good shard key is important! – affects performance and scalability – changing it later is expensive
  39. Low cardinality shard key • Induces "jumbo chunks" • Examples: boolean field • Diagram: mongos in front of Shard 1, Shard 2, Shard 3, Shard N; chunk [ a, b )
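A rough way to see why a low-cardinality key induces jumbo chunks: all documents with the same shard key value must live in the same chunk, so a chunk can never be split below one key value. The sketch below counts documents per key value and flags values that exceed a chunk limit. The limit and the helper name are hypothetical, and real chunk limits are expressed in bytes rather than document counts.

```python
from collections import Counter

MAX_DOCS_PER_CHUNK = 10_000  # hypothetical limit, for illustration only


def unsplittable_chunks(key_values, max_docs=MAX_DOCS_PER_CHUNK):
    """Map each shard key value to its document count when it exceeds `max_docs`."""
    counts = Counter(key_values)
    return {value: n for value, n in counts.items() if n > max_docs}


# 100,000 documents sharded on a boolean field: only two distinct key values.
boolean_keys = [i % 2 == 0 for i in range(100_000)]
print(unsplittable_chunks(boolean_keys))

# The same documents sharded on a unique id: no value is oversized.
print(unsplittable_chunks(range(100_000)))
```

With a boolean key, at most two chunks can ever hold data, so at most two shards do useful work no matter how many are added.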
  40. Ascending shard key • Monotonically increasing shard key values cause "hot spots" on inserts • Examples: timestamps, _id • Diagram: mongos in front of Shard 1, Shard 2, Shard 3, Shard N; chunk [ ISODate(…), $maxKe
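The insert hot spot follows from how range chunks are bounded: every new value of an ascending key is larger than all existing split points, so it falls into the last chunk, the one whose upper bound is the maximum key. A minimal sketch, with hypothetical split points and integer keys standing in for timestamps:

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical split points giving 4 chunks:
# (min, 1000), [1000, 2000), [2000, 3000), [3000, max)
SPLIT_POINTS = [1000, 2000, 3000]


def chunk_for(key):
    """Index of the chunk containing `key` (lower bounds inclusive)."""
    return bisect_right(SPLIT_POINTS, key)


# New documents arrive with ever-increasing keys, as timestamps would.
new_keys = range(3000, 3500)
inserts_per_chunk = Counter(chunk_for(k) for k in new_keys)
print(dict(inserts_per_chunk))
```

All 500 inserts land in chunk 3, so the shard that owns that chunk takes the entire write load while the others sit idle.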
  41. Reasons to Shard
  42. Reasons to shard • Scale – Data volume – Query volume • Global deployment with local writes – Geography-aware sharding • Tiered storage • Fast backup restore
  43. Global Deployment/Local Writes • Diagram: Primary:NYC, Primary:LON, and Primary:SYD, plus two Secondaries in each of NYC, LON, and SYD
  44. Tiered Storage • Save hardware costs • Put frequently accessed documents on fast servers – Infrequently accessed documents on less capable servers • Use tag-aware sharding • Diagram: 4 mongods; two tagged Current on SSD, two tagged Archive on HDD
  45. Fast Restore • 40 TB database • 2 shards of 20 TB each • Challenge – Cannot meet restore SLA after data loss • Diagram: 2 mongods, 20 TB each
  46. Fast Restore • 40 TB database • 4 shards of 10 TB each • Solution – Reduce the restore time by 50% • Diagram: 4 mongods, 10 TB each
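The 50% figure on the Fast Restore slides follows if shards are restored in parallel and restore time scales with the amount of data per shard. The sketch below makes those assumptions explicit; the restore rate of 1 TB per hour per shard is a hypothetical number chosen only to make the arithmetic visible.

```python
def restore_hours(total_tb, shards, tb_per_hour_per_shard):
    """Wall-clock restore time when all shards restore in parallel.

    Assumes data is spread evenly and shards do not share a bottleneck.
    """
    return (total_tb / shards) / tb_per_hour_per_shard


RATE = 1.0  # hypothetical restore rate, TB per hour per shard

before = restore_hours(total_tb=40, shards=2, tb_per_hour_per_shard=RATE)
after = restore_hours(total_tb=40, shards=4, tb_per_hour_per_shard=RATE)
print(f"2 shards: {before} h, 4 shards: {after} h, "
      f"reduction: {1 - after / before:.0%}")
```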
  47. Summary
  48. Determining the # of shards • To determine the required # of shards, determine – Storage requirements – Latency requirements – Throughput requirements • Derive total – Disk capacity – Disk throughput – RAM • Calculate # of shards based upon individual server specs
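One way to combine the per-resource calculations is to size each resource separately and take the largest result. That "take the maximum" step is an assumption of this sketch rather than something the slides state, and the class and function names are hypothetical. The inputs reuse the numbers from the individual examples earlier in the deck.

```python
import math
from dataclasses import dataclass


@dataclass
class ServerSpec:
    disk_tb: float
    ram_gb: float
    disk_iops: float
    ops_per_sec: float


@dataclass
class Requirements:
    storage_tb: float
    working_set_gb: float
    disk_iops: float
    ops_per_sec: float


def shards_needed(req, server, sharding_efficiency=0.7):
    """Return (overall shard count, per-resource shard counts)."""
    per_resource = {
        "disk": math.ceil(req.storage_tb / server.disk_tb),
        "ram": math.ceil(req.working_set_gb / server.ram_gb),
        "disk_iops": math.ceil(req.disk_iops / server.disk_iops),
        # Throughput is discounted by the sharding-overhead factor.
        "ops": math.ceil(req.ops_per_sec / (sharding_efficiency * server.ops_per_sec)),
    }
    return max(per_resource.values()), per_resource


server = ServerSpec(disk_tb=2, ram_gb=128, disk_iops=5_000, ops_per_sec=4_000)
req = Requirements(storage_tb=3, working_set_gb=428, disk_iops=11_000, ops_per_sec=10_000)
total, breakdown = shards_needed(req, server)
print(total, breakdown)
```

The breakdown shows which resource drives the shard count; here RAM and operations per second each call for 4 shards, while disk space alone would need only 2.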
  49. Leverage Sharding For • Scalability • Geo-aware clusters • Tiered storage • Reduced backup restore times
  50. Sharding: Where to go from here… • MongoDB Manual: http://docs.mongodb.org/manual/sharding/ • Other Webinars: – How to Achieve Scale With MongoDB • White Papers – MongoDB Performance Best Practices – MongoDB Architecture Guide
  51. Get Expert Advice on Scaling. For Free. For a limited time, if you're considering a commercial relationship with MongoDB, you can sign up for a free one-hour consult about scaling with one of our MongoDB Engineers. Sign Up: http://bit.ly/1rkXcfN
  52. Webinar Q&A • jay.runkel@mongodb.com • @jayrunkel • Stay tuned after the webinar and take our survey for your chance to win MongoDB swag.
  53. Thank You
