Tag-based sharding
Distribute your data as you need
MongoDB User Group
Madrid, October 13th 2015
Juan Antonio Roy Couto
Who am I?
Juan Antonio Roy Couto
Financial Software Developer
Twitter: @juanroycouto
Linkedin: https://www.linkedin.com/in/juanroycouto
Personal blog: http://www.juanroy.es
Contributor at: http://www.mongodbspain.com
Charrosfera member: http://www.charrosfera.com
Email: juanroycouto@gmail.com
MongoDB User Group
Tag-based sharding
❏ Cluster overview
❏ Definitions
❏ Steps for balancing
❏ Steps to split a chunk
❏ Migration steps
❏ Normal MongoDB operation
❏ Pre-splitting
❏ Commands to split a chunk
❏ Tag-based sharding overview
❏ Tag your shards
❏ Tag your chunk ranges
Table of Contents
MongoDB User Group
Tag-based sharding
❏ Replica set
❏ Shards
❏ config servers
❏ config database
❏ mongos
Cluster overview
MongoDB User Group
Tag-based sharding
Cluster overview
Replica Set
● High availability
● Data safety
● Disaster recovery
MongoDB User Group
Tag-based sharding
Replica Set
Secondary
Secondary
Primary
Scale out
Even data distribution across all of the
shards based on a shardkey
A shardkey range belongs to only one
shard
More efficient queries
Cluster overview
Shards
MongoDB User Group
Tag-based sharding
Cluster
Shard 0 Shard 2Shard 1
A-I J-Q R-Z
Cluster overview
MongoDB User Group
Tag-based sharding
Cluster overview
Config servers
MongoDB User Group
Tag-based sharding
● config database
● Identical information (consistency check).
● Metadata:
○ Cluster shards list
○ Data per shard (chunk ranges)
○ ...
● Don’t sync from each other.
● Default Config server (All mongos read it)
Cluster overview
config database
Collections:
● changelog: splits and migration information
● chunks *
● collections * (only sharded)
● databases *
● lockpings
● locks
● mongos
● settings
● shards *
● system.indexes
● tags
● version *
MongoDB User Group
Tag-based sharding
● Receives client requests and returns results.
● Reads the metadata and sends the query to the necessary
shard/shards
● Does not store data
● Keeps a cache version of the metadata. We can refresh it by:
○ mongos>db.runCommand( { flushRouterConfig : 1 } )
○ or restarting the server
Cluster overview
mongos
MongoDB User Group
Tag-based sharding
MongoDB User Group
Tag-based shardingDefinitions
● Range: Data division based on the values of the shardkey.
● Chunk: They are not physical data. Chunks are just a logical
grouping of data into ranges (64MB by default).
● Split: Chunk division. No data is moved.
● Migration: Chunk movements between shards in order to get an
even distribution. Only one chunk is moved at a time.
● Balanced system: The same number of chunks per shard.
● Balancer: Checks if a migration is needed and starts it.
MongoDB User Group
Tag-based sharding
Split
Migration
Steps for balancing
Steps to split a chunk
MongoDB User Group
Tag-based sharding
Shard 0
Final
Shard 0
Chunk 1 Chunk 1
Chunk 2
Chunk 2
Chunk 3
Config
server
mongos
1. Needs chunk 2 to be
splitted?
2. Split points
list
3. Update
metadata
4. Refresh
cache
Migration steps
MongoDB User Group
Tag-based sharding
Shard 1 (To)
Shard 0 (From)
Chunk 4
Chunk 1
Chunk 2
Chunk 3
Config
server
mongos
1. Is balancer running?
2. Is the system imbalance?
3. Pick chunk 3
4. Split chunk 3?
5. Begin (1)
6. Deletes finished?(2)
7. Read this chunk(3)
8. Transfer(4)
9. Update metadata(5)
10. Remove chunk 3 from
shard 0(6)
11. Refresh mongos cache Chunk 3
1
4
7
8
9
1
1
Normal MongoDB operation
MongoDB User Group
Tag-based sharding
Shard 0 Shard 1 Shard 2 Shard 3
mongos
Client
Migrations
Useful for storing data directly
in the shards (massive
data loads).
Avoid bottlenecks.
MongoDB does not need to
split or migrate chunks.
After the split, the migration
must be finished before
data loading.
Pre-splitting
MongoDB User Group
Tag-based sharding
Cluster
Shard 0 Shard 2Shard 1
Chunk 1
Chunk 5
Chunk 3
Chunk 4
Chunk 2
Splitting a chunk:
mongos>for (var i=0; i<20, i++) {
sh.splitAt(“testdb.presplit”, { x : 1000*i } );
}
Querying existing chunks:
mongos>use config
mongos>db.chunks.find( { ns : “testdb.presplit” } )
Commands to split a chunk
MongoDB User Group
Tag-based sharding
Tags are used when you want to pin ranges to a specific shard.
Tag-based sharding overview
MongoDB User Group
Tag-based sharding
shard0
EMEA
shard1
APAC
shard2
LATAM
shard3
NORAM
mongos>sh.addShardTag(“shard0”, “EMEA”)
mongos>sh.addShardTag(“shard1”, “APAC”)
mongos>sh.addShardTag(“shard2”, “LATAM”)
mongos>sh.addShardTag(“shard3”, “NORAM”)
Tag your shards
MongoDB User Group
Tag-based sharding
mongos>sh.addTagRange( namespace, minimum, maximum, tag )
mongos>sh.addTagRange( “testdb.tagrange”,
{ “x” : 0 },
{ “x” : 1000 },
“EMEA” )
minimum: the minimum value (inclusive) of the shard key range to include in the tag.
maximum: the maximum value (exclusive) of the shard key range to include in the tag.
Tag your chunk ranges
MongoDB User Group
Tag-based sharding
Questions?
Questions?
MongoDB User Group
Tag-based sharding
We are looking for writers
mongodbspain.com
MongoDB User Group
Tag-based sharding
Thank you for your attention!
MongoDB User Group
Tag-based sharding
Madrid, October 13th 2015
Juan Antonio Roy Couto

Tag based sharding presentation

  • 1.
    Tag-based sharding Distribute yourdata as you need MongoDB User Group Madrid, October 13th 2015 Juan Antonio Roy Couto
  • 2.
    Who am I? JuanAntonio Roy Couto Financial Software Developer Twitter: @juanroycouto Linkedin: https://www.linkedin.com/in/juanroycouto Personal blog: http://www.juanroy.es Contributor at: http://www.mongodbspain.com Charrosfera member: http://www.charrosfera.com Email: juanroycouto@gmail.com MongoDB User Group Tag-based sharding
  • 3.
    ❏ Cluster overview ❏Definitions ❏ Steps for balancing ❏ Steps to split a chunk ❏ Migration steps ❏ Normal MongoDB operation ❏ Pre-splitting ❏ Commands to split a chunk ❏ Tag-based sharding overview ❏ Tag your shards ❏ Tag your chunk ranges Table of Contents MongoDB User Group Tag-based sharding
  • 4.
    ❏ Replica set ❏Shards ❏ config servers ❏ config database ❏ mongos Cluster overview MongoDB User Group Tag-based sharding
  • 5.
    Cluster overview Replica Set ●High availability ● Data safety ● Disaster recovery MongoDB User Group Tag-based sharding Replica Set Secondary Secondary Primary
  • 6.
    Scale out Even datadistribution across all of the shards based on a shardkey A shardkey range belongs to only one shard More efficient queries Cluster overview Shards MongoDB User Group Tag-based sharding Cluster Shard 0 Shard 2Shard 1 A-I J-Q R-Z
  • 7.
    Cluster overview MongoDB UserGroup Tag-based sharding
  • 8.
    Cluster overview Config servers MongoDBUser Group Tag-based sharding ● config database ● Identical information (consistency check). ● Metadata: ○ Cluster shards list ○ Data per shard (chunk ranges) ○ ... ● Don’t sync from each other. ● Default Config server (All mongos read it)
  • 9.
    Cluster overview config database Collections: ●changelog: splits and migration information ● chunks * ● collections * (only sharded) ● databases * ● lockpings ● locks ● mongos ● settings ● shards * ● system.indexes ● tags ● version * MongoDB User Group Tag-based sharding
  • 10.
    ● Receives clientrequests and returns results. ● Reads the metadata and sends the query to the necessary shard/shards ● Does not store data ● Keeps a cache version of the metadata. We can refresh it by: ○ mongos>db.runCommand( { flushRouterConfig : 1 } ) ○ or restarting the server Cluster overview mongos MongoDB User Group Tag-based sharding
  • 11.
    MongoDB User Group Tag-basedshardingDefinitions ● Range: Data division based on the values of the shardkey. ● Chunk: They are not physical data. Chunks are just a logical grouping of data into ranges (64MB by default). ● Split: Chunk division. No data is moved. ● Migration: Chunk movements between shards in order to get an even distribution. Only one chunk is moved at a time. ● Balanced system: The same number of chunks per shard. ● Balancer: Checks if a migration is needed and starts it.
  • 12.
    MongoDB User Group Tag-basedsharding Split Migration Steps for balancing
  • 13.
    Steps to splita chunk MongoDB User Group Tag-based sharding Shard 0 Final Shard 0 Chunk 1 Chunk 1 Chunk 2 Chunk 2 Chunk 3 Config server mongos 1. Needs chunk 2 to be splitted? 2. Split points list 3. Update metadata 4. Refresh cache
  • 14.
    Migration steps MongoDB UserGroup Tag-based sharding Shard 1 (To) Shard 0 (From) Chunk 4 Chunk 1 Chunk 2 Chunk 3 Config server mongos 1. Is balancer running? 2. Is the system imbalance? 3. Pick chunk 3 4. Split chunk 3? 5. Begin (1) 6. Deletes finished?(2) 7. Read this chunk(3) 8. Transfer(4) 9. Update metadata(5) 10. Remove chunk 3 from shard 0(6) 11. Refresh mongos cache Chunk 3 1 4 7 8 9 1 1
  • 15.
    Normal MongoDB operation MongoDBUser Group Tag-based sharding Shard 0 Shard 1 Shard 2 Shard 3 mongos Client Migrations
  • 16.
    Useful for storingdata directly in the shards (massive data loads). Avoid bottlenecks. MongoDB does not need to split or migrate chunks. After the split, the migration must be finished before data loading. Pre-splitting MongoDB User Group Tag-based sharding Cluster Shard 0 Shard 2Shard 1 Chunk 1 Chunk 5 Chunk 3 Chunk 4 Chunk 2
  • 17.
    Splitting a chunk: mongos>for(var i=0; i<20, i++) { sh.splitAt(“testdb.presplit”, { x : 1000*i } ); } Querying existing chunks: mongos>use config mongos>db.chunks.find( { ns : “testdb.presplit” } ) Commands to split a chunk MongoDB User Group Tag-based sharding
  • 18.
    Tags are usedwhen you want to pin ranges to a specific shard. Tag-based sharding overview MongoDB User Group Tag-based sharding shard0 EMEA shard1 APAC shard2 LATAM shard3 NORAM
  • 19.
    mongos>sh.addShardTag(“shard0”, “EMEA”) mongos>sh.addShardTag(“shard1”, “APAC”) mongos>sh.addShardTag(“shard2”,“LATAM”) mongos>sh.addShardTag(“shard3”, “NORAM”) Tag your shards MongoDB User Group Tag-based sharding
  • 20.
    mongos>sh.addTagRange( namespace, minimum,maximum, tag ) mongos>sh.addTagRange( “testdb.tagrange”, { “x” : 0 }, { “x” : 1000 }, “EMEA” ) minimum: the minimum value (inclusive) of the shard key range to include in the tag. maximum: the maximum value (exclusive) of the shard key range to include in the tag. Tag your chunk ranges MongoDB User Group Tag-based sharding
  • 21.
  • 22.
    We are lookingfor writers mongodbspain.com MongoDB User Group Tag-based sharding
  • 23.
    Thank you foryour attention! MongoDB User Group Tag-based sharding Madrid, October 13th 2015 Juan Antonio Roy Couto

Editor's Notes

  • #15 Shard TO steps: 1 Creación de índices 2 Borrado docs en rango 3 Transfer start 4 Transfer starts 5 Transfer ends