MongoDB sharding

A basic introduction to MongoDB sharding

  • Mongodb sharding

    1. MongoDB Sharding, by Fan, Xiangrong
    2. Agenda
       • Why Sharding
       • Sharding Architecture
       • What is Sharding
       • Sharding Balancer
       • Write/Reads with Sharding
       • Sharding Limitation
       • Demo
    3. Why Sharding
       • All writes go to the master
       • Latency-sensitive queries still go to the master
       • A single replica set is limited to 12 nodes
       • Memory can't be large enough when the active dataset is big
       • Local disk is not big enough
       • Vertical upgrades are too expensive
    4. Architecture
    5. Config Servers
       • We run three config servers (mongod) in the production cluster, or one in a test environment
       • Changes are made using two-phase commit to provide strong consistency among all three config servers
       • If any one of them is down, the metadata becomes read-only
       • The system stays online as long as one of the three is up
    6. Shards
       • Each shard can be a single master, a master/slave pair, or a replica set
       • A replica set provides automatic failover for the sharding cluster
       • Shards are regular mongod processes
    7. Mongos
       • The sharding router
       • Acts just like a mongod to clients; it makes the cluster "invisible" to them
       • You can have as many as you want
       • Running it on the application server is recommended
       • It caches metadata from the config servers
    8. [Diagram: clients, mongos, mongod]
    9. What is Sharding
       • It's range based
       • Automatic balancing for changes in load and data distribution
       • Convert from a single replica set to a sharding cluster without downtime
       • Easy addition of new shards without downtime
       • Scales to one thousand nodes
       • No single points of failure
       • Automatic failover
    10. Shard key
        • It can be one or more fields
        • Every document needs a shard key (null is OK)
        • The shard key can't be updated
        • MongoDB's sharding is order-preserving. You can define each shard key field as ascending or descending, like { tag : 1, timestamp : -1 }
        • Values of different types sort as: null < numbers < strings < objects < arrays < binary data < ObjectIds < booleans < dates < regular expressions
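To make the ordering of a compound shard key concrete, here is a minimal sketch in plain JavaScript of how a key pattern such as { tag : 1, timestamp : -1 } orders documents. The helper name compareByKeyPattern is hypothetical, and it only handles fields whose values are of the same primitive type; it is an illustration, not MongoDB's internal comparison code.

```javascript
// Hypothetical helper: compares two documents under a key pattern such as
// { tag: 1, timestamp: -1 }. Only same-typed numbers or strings are handled.
function compareByKeyPattern(pattern, a, b) {
  for (const [field, direction] of Object.entries(pattern)) {
    if (a[field] < b[field]) return -1 * direction;
    if (a[field] > b[field]) return 1 * direction;
  }
  return 0; // all key fields are equal
}

const pattern = { tag: 1, timestamp: -1 };
const docs = [
  { tag: "b", timestamp: 10 },
  { tag: "a", timestamp: 10 },
  { tag: "a", timestamp: 20 },
];

// Ascending by tag, then descending by timestamp within each tag.
const sorted = docs.slice().sort((x, y) => compareByKeyPattern(pattern, x, y));
console.log(sorted);
```

Within the same tag, the newer timestamp sorts first because that field is declared with -1.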
    11. Chunk
        • A chunk is a contiguous range of data from a particular collection
        • A collection is broken into chunks by range
        • A chunk is a logical concept, not a physical reality: $minKey <= key < $maxKey
        • Each document must belong to one and only one chunk
        • The default size is 64 MB; it can be specified with --chunkSize
    12. Chunk [diagram]
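The "one and only one chunk" rule follows from chunks being half-open ranges. The sketch below models that in plain JavaScript, assuming a numeric shard key; -Infinity and Infinity stand in for $minKey and $maxKey, the shard names are borrowed from the demo later in the deck, and findChunk is a hypothetical helper rather than a MongoDB API.

```javascript
// Chunks as half-open ranges [min, max) over a numeric shard key.
// -Infinity and Infinity stand in for $minKey and $maxKey.
const chunks = [
  { min: -Infinity, max: 0, shard: "shard0000" },
  { min: 0, max: 500, shard: "shard0001" },
  { min: 500, max: Infinity, shard: "shard0000" },
];

// Hypothetical lookup: returns the single chunk whose range contains the key.
function findChunk(chunkList, key) {
  return chunkList.find((c) => c.min <= key && key < c.max);
}

console.log(findChunk(chunks, 0).shard);   // lower bound is inclusive
console.log(findChunk(chunks, 499).shard); // upper bound is exclusive
```

Because the lower bound is inclusive and the upper bound exclusive, a boundary value such as 0 or 500 belongs to exactly one chunk.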
    13. Chunk Split
        • When a chunk reaches its size limit, a split happens
        • A split is an inexpensive metadata operation
        • You can also split chunks manually
    14. Chunk Split [diagram]
    15. Chunk Split: (-∞, +∞)
    16. Chunk Split: (-∞, 0) [0, +∞)
    17. Chunk Split: (-∞, -500) [-500, 0) [0, +∞)
    18. Chunk Split: (-∞, -500) [-500, 0) [0, 500) [500, +∞)
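The split sequence on these slides can be replayed with a small sketch. It assumes a numeric shard key and uses a hypothetical splitAt helper; the split points 0, -500 and 500 are the ones shown on the slides, and only range metadata is touched, which is why a split is cheap.

```javascript
// Hypothetical helper: splits the chunk containing `point` into two
// half-open ranges [min, point) and [point, max). Only metadata changes.
function splitAt(chunkList, point) {
  return chunkList.flatMap((c) =>
    c.min < point && point < c.max
      ? [{ min: c.min, max: point }, { min: point, max: c.max }]
      : [c]
  );
}

// Start with one chunk covering the whole key space, then split three times.
let chunks = [{ min: -Infinity, max: Infinity }];
for (const point of [0, -500, 500]) {
  chunks = splitAt(chunks, point);
}

const rendered = chunks.map((c) => `[${c.min}, ${c.max})`).join(" ");
console.log(rendered);
```

The ranges stay contiguous after every split, so each key still maps to exactly one chunk.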
    19. Chunk Migration
        • Chunk migration is an expensive operation
        • Only one chunk migration happens at any time
        • It is based on the overall size of the shard
        • The balancer automatically migrates chunks between shards
        • You can also move chunks manually
    20. Sharding Balancer
        • Keeps data evenly distributed across all shards
        • Minimizes the amount of data transferred
        • For a balancing round to occur, a shard must have at least nine more chunks than the least-populous shard
        • It can be turned off (from the config database):
          • db.settings.update({"_id" : "balancer"}, {"$set" : {"stopped" : true }}, true)
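The balancing rule above can be sketched as a small decision function. The threshold of nine chunks comes from the slide; the function name pickMigration and the shape of its input (a map from shard name to chunk count) are assumptions made for illustration, not the balancer's real implementation.

```javascript
// Threshold taken from the slide: a balancing round needs a shard with at
// least nine more chunks than the least-populous shard.
const MIGRATION_THRESHOLD = 9;

// Hypothetical helper: chunkCounts maps shard name -> number of chunks.
// Returns { from, to } for one chunk move, or null if the cluster is balanced.
function pickMigration(chunkCounts) {
  const entries = Object.entries(chunkCounts);
  if (entries.length < 2) return null;
  entries.sort((a, b) => a[1] - b[1]); // fewest chunks first
  const [toShard, minCount] = entries[0];
  const [fromShard, maxCount] = entries[entries.length - 1];
  if (maxCount - minCount < MIGRATION_THRESHOLD) return null;
  return { from: fromShard, to: toShard };
}

console.log(pickMigration({ shard0000: 12, shard0001: 3 }));
console.log(pickMigration({ shard0000: 10, shard0001: 3 }));
```

Returning a single move per round mirrors the deck's point that only one chunk migration happens at a time.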
    21. Sharding Balancer [diagram sequence, slides 21 to 24: one mongos and seven mongod processes]
    25. Choosing Shard Key
        • A good shard key distributes reads and writes while keeping the data you use together
        • Don't use an ascending shard key like an ID
        • Don't use a low-cardinality shard key like continent
        • Don't use a random shard key like an MD5 hash
        • Good example: a coarsely ascending key plus a search key
    26. Reads/Writes with Sharding
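Why a purely ascending key is a poor choice can be seen with a rough simulation. The sketch below assumes four range chunks over a numeric key (the boundaries are made up for illustration) and counts which chunk receives each write; countWrites and chunkIndex are hypothetical helpers.

```javascript
// Range chunks over a numeric key; chunk i covers [boundaries[i], boundaries[i+1]),
// and the last chunk is open-ended. The boundaries are illustrative only.
const boundaries = [0, 1000, 2000, 3000];

function chunkIndex(key) {
  let i = 0;
  while (i + 1 < boundaries.length && key >= boundaries[i + 1]) i++;
  return i;
}

function countWrites(keys) {
  const counts = new Array(boundaries.length).fill(0);
  for (const k of keys) counts[chunkIndex(k)]++;
  return counts;
}

// Ascending keys (like an auto-increment ID): every new write is larger
// than the existing keys, so all of them land in the last chunk.
const ascending = Array.from({ length: 100 }, (_, n) => 3000 + n);

// Keys spread over the key range: a deterministic stand-in for a search key.
const spread = Array.from({ length: 100 }, (_, n) => (n * 37) % 4000);

console.log(countWrites(ascending));
console.log(countWrites(spread));
```

With ascending keys a single chunk, and therefore a single shard, takes every insert; a key with a coarse ascending part and a search part aims to keep recent data together without that hotspot.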
    27. Sharding Limitation
        • A unique index can't be created unless the shard key is its prefix
        • You can't update the shard key
        • Only one chunk moves in the cluster at a time
        • Sharding does not yet support data center awareness
        • Adding new shards brings more traffic to the existing cluster
        • 20 PB size limit
    28. Demo
        • Start up shards
        • Start up config servers
        • Start up mongos
        • Configure shards
        • Shard data
        • Look at config data on the mongo config server
    29. Startup Shards
        • mkdir /data/db/a /data/db/b
        • ./mongod --shardsvr --dbpath /data/db/a --port 10000 > /tmp/sharda.log &
        • cat /tmp/sharda.log
        • ./mongod --shardsvr --dbpath /data/db/b --port 10001 > /tmp/shardb.log &
        • cat /tmp/shardb.log
    30. Startup Config Server
        • mkdir /data/db/config
        • ./mongod --configsvr --dbpath /data/db/config --port 20000 > /tmp/configdb.log &
        • cat /tmp/configdb.log
    31. Startup Mongos
        • ./mongos --configdb localhost:20000 > /tmp/mongos.log &
        • cat /tmp/mongos.log
    32. Config Shards
        • $ ./mongo
        • MongoDB shell version: 1.6.0
        • connecting to: test
        • > use admin
          • switched to db admin
        • > db.runCommand( { addshard : "localhost:10000" } )
          • { "shardadded" : "shard0000", "ok" : 1 }
        • > db.runCommand( { addshard : "localhost:10001" } )
          • { "shardadded" : "shard0001", "ok" : 1 }
    33. Shard Data
        • > db.runCommand( { enablesharding : "test" } )
          • {"ok" : 1}
        • > db.runCommand( { shardcollection : "test.people", key : {name : 1} } )
          • {"ok" : 1}
    34. Add Shards
        • db.runCommand( { addshard : "foo/<serverhostname>[:<port>]" } );
          • {"ok" : 1 , "added" : "foo"}
    35. Look at config data
        • Connect to the config database
        • db.shards.find()
        • db.databases.find()
        • db.chunks.find()
        • db.printShardingStatus(true)
    36. Recommended Reading
        • MongoDB documentation
          • http://www.mongodb.org/display/DOCS/Sharding
        • Book: "Scaling MongoDB"
          • You can find it on www.safaribooksonline.com
    37. Q&A