MongoDB Sharding

Basic MongoDB sharding introduction

  • Transcript of "Mongodb sharding"

    1. MongoDB Sharding, by Fan, Xiangrong
    2. Agenda
        • Why Sharding
        • Sharding Architecture
        • What is Sharding
        • Sharding Balancer
        • Writes/Reads with Sharding
        • Sharding Limitations
        • Demo
    3. Why Sharding
        • All writes go to the master
        • Latency-sensitive queries still go to the master
        • A single replica set is limited to 12 nodes
        • Memory can’t be large enough when the active dataset is big
        • Local disk is not big enough
        • Vertical upgrades are too expensive
    4. Architecture
    5. Config Servers
        • We run three config servers in the production cluster, or one in a test environment
        • Changes are made using two-phase commit to provide strong consistency among all three config servers
        • If any one of them is down, the metadata becomes read-only
        • The system stays online as long as one of the three is up
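The two availability rules on the config server slide can be written down as a small check. This is only a sketch of the rules as stated there; the function name `configState` and its return shape are made up for illustration and are not part of MongoDB.

```javascript
// Sketch of the config server rules from the slide (names are hypothetical):
//  - metadata is writable only while every config server is up
//  - the system stays online while at least one config server is up
function configState(up, total = 3) {
  if (!Number.isInteger(up) || up < 0 || up > total) {
    throw new RangeError("up must be an integer between 0 and total");
  }
  return {
    online: up >= 1,
    metadataWritable: up === total,
  };
}

console.log(configState(3)); // all three up
console.log(configState(2)); // one down: still online, metadata read-only
console.log(configState(0)); // none up
```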
    6. Shards
        • Each shard can be a single master, a master/slave pair, or a replica set
        • A replica set provides automatic failover for the sharding cluster
        • Shards are regular mongod processes
    7. Mongos
        • Sharding router
        • Acts just like a mongod to clients; it makes the cluster “invisible” to them
        • You can run as many as you want
        • Running it on the app server is suggested
        • It caches metadata from the config servers
    8. Diagram: clients, mongos, mongod
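To make the router role concrete, here is a rough sketch of how a router could use cached metadata to decide which shards a query must visit. It is not mongos code: the `cachedChunks` table, the numeric shard key `x`, and the function `targetShards` are assumptions for illustration. The behaviour it models is that a query containing the shard key can go to one shard, while a query without it has to be sent to all of them.

```javascript
// Hypothetical cached metadata: each entry maps a half-open key range
// [min, max) on the shard key to the shard that owns it.
const cachedChunks = [
  { min: -Infinity, max: 0, shard: "shard0000" },
  { min: 0, max: Infinity, shard: "shard0001" },
];

// Return the list of shards a query must be sent to.
function targetShards(query, shardKeyField, chunks) {
  const allShards = [...new Set(chunks.map((c) => c.shard))];
  // No shard key in the query: every shard has to be asked.
  if (!(shardKeyField in query)) return allShards;
  const value = query[shardKeyField];
  const owner = chunks.find((c) => value >= c.min && value < c.max);
  // Fall back to all shards if the cached table has no matching range.
  return owner ? [owner.shard] : allShards;
}

console.log(targetShards({ x: 5 }, "x", cachedChunks));   // targeted query
console.log(targetShards({ age: 30 }, "x", cachedChunks)); // broadcast query
```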
    9. What is Sharding
        • It’s range-based
        • Automatic balancing for changes in load and data distribution
        • Conversion from a single replica set to a sharding cluster without downtime
        • Easy addition of new shards without downtime
        • Scaling to one thousand nodes
        • No single point of failure
        • Automatic failover
    10. Shard key
        • It can be one or more fields
        • Every document needs a shard key (null is OK)
        • The shard key can’t be updated
        • MongoDB’s sharding is order-preserving. You can define each field of the shard key as ascending or descending, like { tag : 1, timestamp : -1 }
        • Ordering across types: null < numbers < strings < objects < arrays < binary data < ObjectIds < booleans < dates < regular expressions
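The ordering rules on the shard key slide can be sketched as a comparator. This is a simplified model, not MongoDB's implementation: `TYPE_RANK`, `typeOf`, `compareValues`, and `compareKeys` are illustrative names, only types with a plain JavaScript equivalent are detected, values of the unmodeled types compare as equal, and strings are compared the JavaScript way.

```javascript
// Cross-type order as listed on the slide, lowest first.
const TYPE_RANK = ["null", "number", "string", "object", "array",
  "binData", "objectId", "boolean", "date", "regex"];

// Map a plain JavaScript value to one of the type names above.
function typeOf(v) {
  if (v === null || v === undefined) return "null";
  if (typeof v === "number") return "number";
  if (typeof v === "string") return "string";
  if (typeof v === "boolean") return "boolean";
  if (Array.isArray(v)) return "array";
  if (v instanceof Date) return "date";
  if (v instanceof RegExp) return "regex";
  return "object";
}

// Compare two values: first by type rank, then by value within the type.
function compareValues(a, b) {
  const ta = typeOf(a);
  const tb = typeOf(b);
  if (ta !== tb) return TYPE_RANK.indexOf(ta) < TYPE_RANK.indexOf(tb) ? -1 : 1;
  // Same type: only numbers, strings, booleans, and dates are modeled.
  if (!["number", "string", "boolean", "date"].includes(ta)) return 0;
  const x = ta === "date" ? a.getTime() : a;
  const y = ta === "date" ? b.getTime() : b;
  if (x < y) return -1;
  if (x > y) return 1;
  return 0;
}

// Compare two documents on a compound key pattern such as
// { tag: 1, timestamp: -1 }, where -1 reverses the order of that field.
function compareKeys(a, b, pattern) {
  for (const [field, direction] of Object.entries(pattern)) {
    const c = compareValues(a[field], b[field]);
    if (c !== 0) return c * direction;
  }
  return 0;
}

const docs = [
  { tag: "a", timestamp: 1 },
  { tag: "b", timestamp: 1 },
  { tag: "a", timestamp: 2 },
];
docs.sort((p, q) => compareKeys(p, q, { tag: 1, timestamp: -1 }));
console.log(docs);
```

With the pattern { tag : 1, timestamp : -1 }, documents are grouped by ascending tag, and within one tag the newest timestamp sorts first.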
    11. Chunk
        • A chunk is a contiguous range of data from a particular collection
        • A collection is broken into chunks by range
        • A chunk is a logical concept, not a physical reality: $minKey <= key < $maxKey
        • Each document must belong to one and only one chunk
        • The default size is 64 MB; it can be set with --chunkSize
    12. Chunk (diagram)
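The "one and only one chunk" rule follows from chunks being half-open ranges that tile the whole key space. A minimal sketch, assuming a numeric shard key and using -Infinity and Infinity as stand-ins for $minKey and $maxKey; `isPartition` and `findChunk` are illustrative names, not MongoDB functions.

```javascript
// Check that a set of chunks [min, max) covers the whole key space
// with no gaps and no overlaps.
function isPartition(chunks) {
  if (chunks.length === 0) return false;
  const sorted = [...chunks].sort((a, b) =>
    a.min < b.min ? -1 : a.min > b.min ? 1 : 0);
  if (sorted[0].min !== -Infinity) return false;
  if (sorted[sorted.length - 1].max !== Infinity) return false;
  for (let i = 0; i < sorted.length; i++) {
    if (!(sorted[i].min < sorted[i].max)) return false;               // empty or inverted range
    if (i > 0 && sorted[i - 1].max !== sorted[i].min) return false;   // gap or overlap
  }
  return true;
}

// Find the single chunk whose range contains the key (min inclusive, max exclusive).
function findChunk(chunks, key) {
  return chunks.find((c) => key >= c.min && key < c.max);
}

const chunks = [
  { min: -Infinity, max: -500 },
  { min: -500, max: 0 },
  { min: 0, max: 500 },
  { min: 500, max: Infinity },
];
console.log(isPartition(chunks));
console.log(findChunk(chunks, -500));
```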
    13. Chunk Split
        • When a chunk reaches its size limit, a split happens
        • A split is an inexpensive metadata operation
        • You can split chunks manually
    14. Chunk Split
    15. Chunk Split: (-∞, +∞)
    16. Chunk Split: (-∞, 0) [0, +∞)
    17. Chunk Split: (-∞, -500) [-500, 0) [0, +∞)
    18. Chunk Split: (-∞, -500) [-500, 0) [0, 500) [500, +∞)
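The split sequence on these slides can be replayed with a few lines of code. This is a sketch on a numeric key; the split points 0, -500, and 500 are taken from the slides, while `splitChunk` and `formatChunks` are illustrative helpers. It only touches range metadata, which is why a split is cheap.

```javascript
// Split the chunk that contains `point` into [min, point) and [point, max).
// Only the range metadata changes; no documents are moved.
function splitChunk(chunks, point) {
  const i = chunks.findIndex((c) => point > c.min && point < c.max);
  if (i === -1) throw new Error("split point is already a chunk boundary");
  const c = chunks[i];
  return [
    ...chunks.slice(0, i),
    { min: c.min, max: point },
    { min: point, max: c.max },
    ...chunks.slice(i + 1),
  ];
}

// Render chunks in the notation used on the slides.
function formatChunks(chunks) {
  const show = (n) => (n === -Infinity ? "-∞" : n === Infinity ? "+∞" : String(n));
  return chunks
    .map((c) => (c.min === -Infinity ? "(" : "[") + show(c.min) + ", " + show(c.max) + ")")
    .join(" ");
}

let ranges = [{ min: -Infinity, max: Infinity }];
console.log(formatChunks(ranges));
for (const point of [0, -500, 500]) {
  ranges = splitChunk(ranges, point);
  console.log(formatChunks(ranges));
}
```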
    19. Chunk Migration
        • Chunk migration is an expensive operation
        • Only one chunk migration happens at any time
        • Migration is based on the overall size of the shard
        • The balancer automatically migrates chunks between shards
        • You can also move chunks manually
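Manual splits and moves are issued as admin commands. The sketch below only builds the command documents for `split` and `moveChunk`; the helper names, the split value "m", and the reuse of the demo's test.people collection and shard0001 are assumptions for illustration. In the mongo shell the resulting document would be passed to db.runCommand() while using the admin database.

```javascript
// Build a manual split command: split the chunk of `ns` at the key `middle`.
function splitCommand(ns, middle) {
  return { split: ns, middle: middle };
}

// Build a manual move command: move the chunk containing `find` to shard `to`.
function moveChunkCommand(ns, find, to) {
  return { moveChunk: ns, find: find, to: to };
}

// The command name must be the first field of the document;
// JavaScript object literals keep insertion order for these keys.
const split = splitCommand("test.people", { name: "m" });
const move = moveChunkCommand("test.people", { name: "m" }, "shard0001");

console.log(JSON.stringify(split));
console.log(JSON.stringify(move));
```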
    20. Sharding Balancer
        • Keeps data evenly distributed across all shards
        • Minimizes the amount of data transferred
        • For a balancing round to occur, a shard must have at least nine more chunks than the least-populous shard
        • It can be turned off (run against the config database):
            • db.settings.update({"_id" : "balancer"}, {"$set" : {"stopped" : true }}, true)
    21. Sharding Balancer (diagram: seven mongod, one mongos)
    22. Sharding Balancer (diagram: seven mongod, one mongos)
    23. Sharding Balancer (diagram: seven mongod, one mongos)
    24. Sharding Balancer (diagram: seven mongod, one mongos)
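A balancing round can be simulated on chunk counts alone. The threshold of nine comes from the slide; everything else here is an assumption for illustration, in particular the `stopGap` parameter (when a round is considered finished) and the rule of always moving one chunk from the fullest shard to the emptiest. It is a sketch of the idea, not the balancer's actual algorithm.

```javascript
const THRESHOLD = 9; // from the slide: minimum chunk-count difference to start a round

// Difference between the fullest and the emptiest shard.
function gap(counts) {
  const values = Object.values(counts);
  return Math.max(...values) - Math.min(...values);
}

function needsBalancing(counts, threshold = THRESHOLD) {
  return gap(counts) >= threshold;
}

// One migration: move a single chunk from the fullest shard to the emptiest.
function migrateOne(counts) {
  const names = Object.keys(counts);
  let from = names[0];
  let to = names[0];
  for (const n of names) {
    if (counts[n] > counts[from]) from = n;
    if (counts[n] < counts[to]) to = n;
  }
  return { ...counts, [from]: counts[from] - 1, [to]: counts[to] + 1 };
}

// Run a round: do nothing below the threshold, otherwise migrate one chunk
// at a time until the gap is at most stopGap (hypothetical stop condition).
function balance(counts, threshold = THRESHOLD, stopGap = 2) {
  if (stopGap < 1) throw new RangeError("stopGap must be at least 1");
  let current = { ...counts };
  let moves = 0;
  if (!needsBalancing(current, threshold)) return { counts: current, moves };
  while (gap(current) > stopGap) {
    current = migrateOne(current);
    moves += 1;
  }
  return { counts: current, moves };
}

console.log(balance({ shard0000: 20, shard0001: 2, shard0002: 8 }));
console.log(balance({ shard0000: 10, shard0001: 5 })); // below threshold: no moves
```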
    25. Choosing Shard Key
        • A good shard key distributes reads and writes while keeping the data you’re using together
        • Don’t use an ascending shard key like an ID
        • Don’t use a low-cardinality shard key like continent
        • Don’t use a random shard key like an MD5 hash
        • Good example: coarsely ascending key + search key
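Why an ascending key is a bad choice shows up when counting which chunk each insert lands in. The sketch below uses made-up split points and keys; `chunkIndex` and `writesPerChunk` are illustrative helpers, and the compound key is modeled as a "month/user" string, which is a simplification of a real two-field shard key.

```javascript
// Index of the chunk a key falls into, given sorted split points.
// Chunk 0 is everything below the first split point, and so on.
function chunkIndex(splitPoints, key) {
  let i = 0;
  while (i < splitPoints.length && key >= splitPoints[i]) i++;
  return i;
}

// Count how many of the given inserts land in each chunk.
function writesPerChunk(splitPoints, keys) {
  const counts = new Array(splitPoints.length + 1).fill(0);
  for (const k of keys) counts[chunkIndex(splitPoints, k)] += 1;
  return counts;
}

// Ascending key: every new ID is larger than all split points,
// so all inserts hit the last chunk (and therefore one shard).
const ascendingIds = Array.from({ length: 100 }, (_, n) => 301 + n);
console.log(writesPerChunk([100, 200, 300], ascendingIds));

// Coarsely ascending key + search key: the current month keeps recent data
// together, and the user part spreads writes over several chunks.
const users = ["alice", "bob", "henry", "mary", "peter", "zoe"];
const compoundKeys = users.map((u) => "2010-08/" + u);
console.log(writesPerChunk(["2010-08/h", "2010-08/p"], compoundKeys));
```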
    26. Reads/Writes with Sharding
    27. Sharding Limitations
        • A unique index can’t be created without the shard key as a prefix
        • You can’t update the shard key
        • Only one chunk moves in the cluster at a time
        • Sharding does not yet support data center awareness
        • Adding new shards brings more traffic to the existing cluster
        • 20 PB size limit
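The unique index limitation can be checked mechanically: the index's leading fields must be the shard key's fields, in the same order. A minimal sketch that compares field names only; `hasShardKeyPrefix` is an illustrative helper and the email field is made up.

```javascript
// True when the fields of shardKey are the leading fields of indexKey,
// in the same order. Only field names are compared in this sketch.
function hasShardKeyPrefix(shardKey, indexKey) {
  const shardFields = Object.keys(shardKey);
  const indexFields = Object.keys(indexKey);
  if (shardFields.length > indexFields.length) return false;
  return shardFields.every((field, n) => indexFields[n] === field);
}

const shardKey = { name: 1 }; // the key used for test.people in the demo
console.log(hasShardKeyPrefix(shardKey, { name: 1, email: 1 })); // shard key leads the index
console.log(hasShardKeyPrefix(shardKey, { email: 1 }));          // shard key missing
console.log(hasShardKeyPrefix(shardKey, { email: 1, name: 1 })); // present but not a prefix
```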
    28. Demo
        • Start up shards
        • Start up config servers
        • Start up mongos
        • Configure shards
        • Shard data
        • Look at config data on the mongo config server
    29. Startup Shards
        • mkdir /data/db/a /data/db/b
        • ./mongod --shardsvr --dbpath /data/db/a --port 10000 > /tmp/sharda.log &
        • cat /tmp/sharda.log
        • ./mongod --shardsvr --dbpath /data/db/b --port 10001 > /tmp/shardb.log &
        • cat /tmp/shardb.log
    30. Startup Config Server
        • mkdir /data/db/config
        • ./mongod --configsvr --dbpath /data/db/config --port 20000 > /tmp/configdb.log &
        • cat /tmp/configdb.log
    31. Startup Mongos
        • ./mongos --configdb localhost:20000 > /tmp/mongos.log &
        • cat /tmp/mongos.log
    32. Config Shards
        • $ ./mongo
        • MongoDB shell version: 1.6.0
        • connecting to: test
        • > use admin
            • switched to db admin
        • > db.runCommand( { addshard : "localhost:10000" } )
            • { "shardadded" : "shard0000", "ok" : 1 }
        • > db.runCommand( { addshard : "localhost:10001" } )
            • { "shardadded" : "shard0001", "ok" : 1 }
    33. Shard Data
        • > db.runCommand( { enablesharding : "test" } )
            • {"ok" : 1}
        • > db.runCommand( { shardcollection : "test.people", key : {name : 1} } )
            • {"ok" : 1}
    34. Add Shards
        • db.runCommand( { addshard : "foo/<serverhostname>[:<port>]" } );
            • {"ok" : 1 , "added" : "foo"}
    35. Look at config data
        • Log in to the config database
        • db.shards.find()
        • db.databases.find()
        • db.chunks.find()
        • db.printShardingStatus(true)
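Once db.chunks.find() returns documents, a quick per-shard tally shows how evenly a collection is spread. The sketch below runs on an in-memory array rather than a live config database. The sample documents are invented, and the field names `ns`, `min`, `max`, and `shard` are an assumption about the chunk document layout in releases of that era.

```javascript
// Invented sample of chunk metadata for the demo collection test.people.
const sampleChunks = [
  { ns: "test.people", min: { name: "a" }, max: { name: "h" }, shard: "shard0000" },
  { ns: "test.people", min: { name: "h" }, max: { name: "p" }, shard: "shard0001" },
  { ns: "test.people", min: { name: "p" }, max: { name: "z" }, shard: "shard0000" },
  { ns: "test.other", min: { _id: 0 }, max: { _id: 100 }, shard: "shard0001" },
];

// Count the chunks of one namespace on each shard.
function chunksPerShard(chunks, ns) {
  const counts = {};
  for (const c of chunks) {
    if (c.ns !== ns) continue;
    counts[c.shard] = (counts[c.shard] || 0) + 1;
  }
  return counts;
}

console.log(chunksPerShard(sampleChunks, "test.people"));
```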
    36. Recommended Reading
        • MongoDB documentation
            • http://www.mongodb.org/display/DOCS/Sharding
        • Book: “Scaling MongoDB”
            • You can find it on www.safaribooksonline.com
    37. Q&A