Mongodb sharding
basic mongodb sharding introduction

Mongodb sharding Presentation Transcript

  • Mongodb Sharding by Fan, Xiangrong
  • Agenda
      • Why Sharding
      • Sharding Architecture
      • What is Sharding
      • Sharding Balancer
      • Write/Reads with Sharding
      • Sharding Limitation
      • Demo
  • Why Sharding
      • All writes go to the master
      • Latency-sensitive queries still go to the master
      • A single replica set is limited to 12 nodes
      • Memory can't be large enough when the active dataset is big
      • Local disk is not big enough
      • Vertical upgrades are too expensive
  • Architecture
  • Config Servers
      • We have three config servers (mongod) in the prod cluster, or one in a test environment
      • Changes are made using two-phase commit to provide strong consistency among all 3 config servers
      • If any one of them is down, the metadata becomes read-only
      • The system stays online as long as one of the three is up
  • Shards
      • Each shard can be a master, a master/slave pair, or a replica set
      • A replica set provides auto-failover capability for the sharding cluster
      • Shards are regular mongod processes
  • Mongos
      • The sharding router
      • Acts just like a mongod to clients; it makes the cluster "invisible" to clients
      • You can have as many as you want
      • It is suggested to run it on the app server
      • It caches metadata from the config servers
  • (Diagram: Clients, Mongos, Mongod)
  • What is Sharding
      • It's range based
      • Automatic balancing for changes in load and data distribution
      • Convert from a single replica set to a sharding cluster without downtime
      • Easy addition of new shards without downtime
      • Scaling to one thousand nodes
      • No single points of failure
      • Automatic failover
  • Shard key
      • It can be one or more fields
      • Every document needs a shard key (null is ok)
      • The shard key can't be updated
      • MongoDB's sharding is order-preserving. You can define the shard key in ascending or descending order, like { tag : 1, timestamp : -1 }
      • null < numbers < strings < objects < arrays < binary data < ObjectIds < booleans < dates < regular expressions
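The type ordering on this slide can be sketched as a comparator. This is a minimal illustration in plain JavaScript, not MongoDB's actual comparison code: `TYPE_ORDER`, `typeName`, and `compareKeys` are hypothetical names, `Uint8Array` merely stands in for binary data, ObjectIds have no stand-in here, and ordering within objects or arrays is not modeled.

```javascript
// Type ranks, in the order listed on the slide.
const TYPE_ORDER = ["null", "number", "string", "object", "array",
                    "binary", "objectid", "boolean", "date", "regex"];

// Map a JavaScript value onto one of the slide's type names (assumed mapping).
function typeName(v) {
  if (v === null || v === undefined) return "null";
  if (typeof v === "number") return "number";
  if (typeof v === "string") return "string";
  if (typeof v === "boolean") return "boolean";
  if (Array.isArray(v)) return "array";
  if (v instanceof Date) return "date";
  if (v instanceof RegExp) return "regex";
  if (v instanceof Uint8Array) return "binary"; // stand-in for binary data
  return "object";
}

// Compare by type rank first; within one type, fall back to < and >.
function compareKeys(a, b) {
  const rankA = TYPE_ORDER.indexOf(typeName(a));
  const rankB = TYPE_ORDER.indexOf(typeName(b));
  if (rankA !== rankB) return rankA - rankB;
  if (a < b) return -1;
  if (a > b) return 1;
  return 0;
}

const sortedKeys = [new Date(0), "abc", null, 42, true, [1]].sort(compareKeys);
console.log(sortedKeys.map(typeName).join(" < "));
```

Because chunks are ranges over this ordering, a shard key whose values mix types would still fall into exactly one chunk.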
  • Chunk
      • A chunk is a contiguous range of data from a particular collection
      • A collection is broken into chunks by range
      • A chunk is a logical concept, not a physical reality: $minKey <= key < $maxKey
      • Each document must belong to one and only one chunk
      • The default size is 64M; it can be specified with --chunkSize
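The "min <= key < max" rule can be illustrated with a small lookup over a chunk table. This is a sketch under stated assumptions: the `chunkTable` contents, the shard names, and the `findChunk` helper are hypothetical, and numeric keys with `-Infinity`/`Infinity` stand in for $minKey and $maxKey.

```javascript
// Hypothetical chunk table for one collection, ordered by range.
const chunkTable = [
  { min: -Infinity, max: -500, shard: "shard0000" },
  { min: -500, max: 0, shard: "shard0001" },
  { min: 0, max: 500, shard: "shard0000" },
  { min: 500, max: Infinity, shard: "shard0001" },
];

// Return the single chunk whose half-open range [min, max) contains the key.
function findChunk(table, key) {
  return table.find((c) => c.min <= key && key < c.max) || null;
}

console.log(findChunk(chunkTable, -500)); // lower bound is inclusive
console.log(findChunk(chunkTable, 499));  // upper bound is exclusive
```

The half-open ranges are what guarantee that each document belongs to one and only one chunk.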
  • Chunk Split
      • When a chunk reaches its size limit, a split happens
      • A split is an inexpensive metadata operation
      • You can manually split chunks
  • Chunk Split (diagram: successive splits of one chunk)
      • (-∞, +∞)
      • (-∞, 0) [0, +∞)
      • (-∞, -500) [-500, 0) [0, +∞)
      • (-∞, -500) [-500, 0) [0, 500) [500, +∞)
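The split sequence in these frames can be reproduced with a few lines of JavaScript. This is only a sketch of the idea that a split rewrites range metadata without moving data; `splitChunk` is a hypothetical helper, and the split points 0, -500, and 500 are taken from the frames.

```javascript
// Split the chunk containing splitPoint into [min, splitPoint) and [splitPoint, max).
// Only range metadata changes; no documents are touched.
function splitChunk(chunkList, splitPoint) {
  const i = chunkList.findIndex((c) => c.min <= splitPoint && splitPoint < c.max);
  if (i === -1 || chunkList[i].min === splitPoint) return chunkList; // nothing to split
  const chunk = chunkList[i];
  const left = { ...chunk, max: splitPoint };
  const right = { ...chunk, min: splitPoint };
  return [...chunkList.slice(0, i), left, right, ...chunkList.slice(i + 1)];
}

let ranges = [{ min: -Infinity, max: Infinity }];
for (const point of [0, -500, 500]) {
  ranges = splitChunk(ranges, point);
}
const rendered = ranges.map((c) => `[${c.min}, ${c.max})`).join(" ");
console.log(rendered);
```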
  • Chunk Migration
      • Chunk migration is an expensive operation
      • Only one chunk migration happens at any time
      • Based on the overall size of the shard
      • The balancer will automatically migrate chunks between shards
      • You can also manually move chunks
  • Sharding Balancer
      • Keeps data evenly distributed on all shards
      • Minimizes the amount of data transferred
      • For a balancing round to occur, a shard must have at least nine more chunks than the least-populous shard
      • It can be turned off
          • db.settings.update({"_id" : "balancer"}, {"$set" : {"stopped" : true }}, true)
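The nine-chunk threshold from this slide can be sketched as a simple check. This is an illustration only, not the balancer's real algorithm: `MIGRATION_THRESHOLD`, `pickMigration`, and the shard names are hypothetical, and choosing which chunk to move is not modeled.

```javascript
// Threshold quoted on the slide: the fullest shard needs at least this many
// more chunks than the least-populous shard before a balancing round occurs.
const MIGRATION_THRESHOLD = 9;

// Given { shardName: chunkCount }, return a suggested source and target shard,
// or null when the cluster counts as balanced.
function pickMigration(chunkCounts) {
  const entries = Object.entries(chunkCounts);
  if (entries.length < 2) return null;
  let most = entries[0];
  let least = entries[0];
  for (const entry of entries) {
    if (entry[1] > most[1]) most = entry;
    if (entry[1] < least[1]) least = entry;
  }
  if (most[1] - least[1] < MIGRATION_THRESHOLD) return null;
  return { from: most[0], to: least[0] };
}

console.log(pickMigration({ shard0000: 20, shard0001: 12 })); // difference of 8
console.log(pickMigration({ shard0000: 21, shard0001: 12 })); // difference of 9
```

A threshold this coarse fits the goal of minimizing the amount of data transferred: small imbalances are tolerated rather than corrected.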
  • Sharding Balancer (diagram: mongos with mongod processes)
  • Choosing Shard Key
      • A good shard key distributes reads and writes while keeping the data you're using together
      • Don't use an ascending shard key like ID
      • Don't use a low-cardinality shard key like continent
      • Don't use a random shard key like MD5
      • Good example: coarsely ascending key + search key
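One possible reading of "coarsely ascending key + search key" is a month bucket followed by a field the application queries on. The sketch below assumes a hypothetical document shape with `createdAt` and `userId` fields; neither the field names nor the `shardKeyFor` helper come from the slides.

```javascript
// Derive a compound key: a coarse, slowly ascending part (year and month)
// plus a search field (userId). Field names are assumptions for illustration.
function shardKeyFor(doc) {
  const created = doc.createdAt;
  const month = created.getUTCFullYear() * 100 + (created.getUTCMonth() + 1);
  return { month: month, userId: doc.userId };
}

const exampleKey = shardKeyFor({
  userId: "u42",
  createdAt: new Date(Date.UTC(2012, 2, 15)), // 15 March 2012
});
console.log(exampleKey);
```

The month part keeps recent data together, while the search field spreads writes within a month across many ranges instead of a single ever-growing one.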
  • Reads/Writes with Sharding
  • Sharding Limitation
      • A unique index can't be created without the shard key as a prefix
      • You can't update the shard key
      • Only one chunk move in the cluster at a time
      • Sharding does not yet support data center awareness
      • Adding new shards brings more traffic to the existing cluster
      • 20Pb size limit
  • Demo
      • Startup Shards
      • Startup config servers
      • Startup mongos
      • Configure Shards
      • Shard Data
      • Look at config data @ mongo config server
  • Startup Shards
      • mkdir /data/db/a /data/db/b
      • ./mongod --shardsvr --dbpath /data/db/a --port 10000 > /tmp/sharda.log &
      • cat /tmp/sharda.log
      • ./mongod --shardsvr --dbpath /data/db/b --port 10001 > /tmp/shardb.log &
      • cat /tmp/shardb.log
  • Startup Config Server
      • mkdir /data/db/config
      • ./mongod --configsvr --dbpath /data/db/config --port 20000 > /tmp/configdb.log &
      • cat /tmp/configdb.log
  • Startup Mongos
      • ./mongos --configdb localhost:20000 > /tmp/mongos.log &
      • cat /tmp/mongos.log
  • Config Shards
      • $ ./mongo
      • MongoDB shell version: 1.6.0
      • connecting to: test
      • > use admin
          • switched to db admin
      • > db.runCommand( { addshard : "localhost:10000" } )
          • { "shardadded" : "shard0000", "ok" : 1 }
      • > db.runCommand( { addshard : "localhost:10001" } )
          • { "shardadded" : "shard0001", "ok" : 1 }
  • Shard Data
      • > db.runCommand( { enablesharding : "test" } )
          • {"ok" : 1}
      • > db.runCommand( { shardcollection : "test.people", key : {name : 1} } )
          • {"ok" : 1}
  • Add Shards
      • db.runCommand( { addshard : "foo/<serverhostname>[:<port>]" } );
          • {"ok" : 1 , "added" : "foo"}
  • Look at config data
      • Log in to the config database
      • db.shards.find()
      • db.databases.find()
      • db.chunks.find()
      • db.printShardingStatus(true)
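Once the chunk documents are in hand, a per-shard tally gives a quick view of how evenly a collection is spread. The sketch below runs on a hypothetical in-memory sample; it assumes each chunk document carries `ns` and `shard` fields, and the `chunksPerShard` helper and the sample values are illustrative only.

```javascript
// Hypothetical sample shaped like chunk documents from the config database
// (assumed fields: ns = namespace, shard = owning shard, min/max = range bounds).
const sampleChunkDocs = [
  { ns: "test.people", min: { name: "a" }, max: { name: "h" }, shard: "shard0000" },
  { ns: "test.people", min: { name: "h" }, max: { name: "p" }, shard: "shard0001" },
  { ns: "test.people", min: { name: "p" }, max: { name: "z" }, shard: "shard0000" },
  { ns: "test.other", min: { id: 0 }, max: { id: 100 }, shard: "shard0001" },
];

// Count how many chunks of one namespace live on each shard.
function chunksPerShard(chunkDocs, ns) {
  const counts = {};
  for (const doc of chunkDocs) {
    if (doc.ns !== ns) continue;
    counts[doc.shard] = (counts[doc.shard] || 0) + 1;
  }
  return counts;
}

const peopleCounts = chunksPerShard(sampleChunkDocs, "test.people");
console.log(peopleCounts);
```

In a live shell session, the array would presumably come from something like db.chunks.find().toArray() while connected to the config database, rather than from a hard-coded sample.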
  • Recommended Reads
      • Mongodb Documentation
          • http://www.mongodb.org/display/DOCS/Sharding
      • Book "Scaling Mongodb"
          • You can find it on www.safaribooksonline.com
  • Q&A