Tag based sharding presentation

Distribute your data as you need

Published in: Data & Analytics

  1. Tag-based sharding: Distribute your data as you need. MongoDB User Group Madrid, October 13th 2015. Juan Antonio Roy Couto
  2. Who am I? Juan Antonio Roy Couto, Financial Software Developer
     ● Twitter: @juanroycouto
     ● LinkedIn: https://www.linkedin.com/in/juanroycouto
     ● Personal blog: http://www.juanroy.es
     ● Contributor at: http://www.mongodbspain.com
     ● Charrosfera member: http://www.charrosfera.com
     ● Email: juanroycouto@gmail.com
  3. Table of Contents
     ❏ Cluster overview
     ❏ Definitions
     ❏ Steps for balancing
     ❏ Steps to split a chunk
     ❏ Migration steps
     ❏ Normal MongoDB operation
     ❏ Pre-splitting
     ❏ Commands to split a chunk
     ❏ Tag-based sharding overview
     ❏ Tag your shards
     ❏ Tag your chunk ranges
  4. Cluster overview
     ❏ Replica set
     ❏ Shards
     ❏ Config servers
     ❏ config database
     ❏ mongos
  5. Cluster overview: Replica Set
     ● High availability
     ● Data safety
     ● Disaster recovery
     Diagram: a replica set with one primary and two secondaries.
  6. Cluster overview: Shards
     ● Scale out
     ● Even data distribution across all of the shards, based on a shard key
     ● A shard key range belongs to only one shard
     ● More efficient queries
     Diagram: a cluster of three shards (Shard 0, Shard 1, Shard 2) holding the ranges A-I, J-Q and R-Z.
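The rule that a shard key range belongs to only one shard can be sketched in plain JavaScript. This is only an illustration, not how mongos is implemented: the `ranges` table, the `shardFor` function and the assignment of each letter range to a particular shard are assumptions made for the example.

```javascript
// Hypothetical range table: which shard holds which range is assumed here.
const ranges = [
  { min: "A", max: "I", shard: "Shard 0" },
  { min: "J", max: "Q", shard: "Shard 1" },
  { min: "R", max: "Z", shard: "Shard 2" },
];

// Return the single shard whose range contains the first letter of the key,
// or null when no range covers it.
function shardFor(key) {
  const letter = key[0].toUpperCase();
  const hit = ranges.find((r) => letter >= r.min && letter <= r.max);
  return hit ? hit.shard : null;
}

console.log(shardFor("Madrid")); // "M" falls in J-Q
```

Because the ranges do not overlap, a query that includes the shard key can be sent to one shard instead of all of them.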
  7. Cluster overview
  8. Cluster overview: Config servers
     ● config database
     ● Identical information (consistency check).
     ● Metadata:
        ○ Cluster shards list
        ○ Data per shard (chunk ranges)
        ○ ...
     ● Don’t sync from each other.
     ● Default Config server (all mongos read it)
  9. Cluster overview: config database
     Collections:
     ● changelog: splits and migration information
     ● chunks *
     ● collections * (only sharded)
     ● databases *
     ● lockpings
     ● locks
     ● mongos
     ● settings
     ● shards *
     ● system.indexes
     ● tags
     ● version *
  10. Cluster overview: mongos
     ● Receives client requests and returns results.
     ● Reads the metadata and sends the query to the necessary shard or shards.
     ● Does not store data.
     ● Keeps a cached version of the metadata. We can refresh it by:
        ○ mongos> db.adminCommand( { flushRouterConfig : 1 } )
        ○ or restarting the server
  11. Definitions
     ● Range: Data division based on the values of the shard key.
     ● Chunk: Not physical data. Chunks are just a logical grouping of data into ranges (64MB by default).
     ● Split: Chunk division. No data is moved.
     ● Migration: Chunk movement between shards in order to get an even distribution. Only one chunk is moved at a time.
     ● Balanced system: The same number of chunks per shard.
     ● Balancer: Checks if a migration is needed and starts it.
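The "balanced system" definition above can be expressed as a small check over chunk metadata. This is a minimal sketch in plain JavaScript, not the balancer's real logic: the `chunks` array, the function names and the `tolerance` parameter are all assumptions introduced for illustration.

```javascript
// Hypothetical chunk metadata: each chunk records the shard that owns it.
const chunks = [
  { id: 1, shard: "shard0" },
  { id: 2, shard: "shard0" },
  { id: 3, shard: "shard0" },
  { id: 4, shard: "shard1" },
];

// Count chunks per shard, including shards that own no chunks.
function chunkCounts(shards, chunkList) {
  const counts = Object.fromEntries(shards.map((s) => [s, 0]));
  for (const c of chunkList) counts[c.shard] += 1;
  return counts;
}

// Balanced means the per-shard counts differ by at most `tolerance`.
// tolerance = 0 is the strict definition from the slide; other values are an assumption.
function isBalanced(counts, tolerance = 0) {
  const values = Object.values(counts);
  return Math.max(...values) - Math.min(...values) <= tolerance;
}

const counts = chunkCounts(["shard0", "shard1"], chunks);
console.log(counts, isBalanced(counts));
```

With three chunks on one shard and one on the other, the check fails, which is the situation in which a balancer would start a migration.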
  12. Steps for balancing
     ● Split
     ● Migration
  13. Steps to split a chunk
     1. Does chunk 2 need to be split?
     2. Split points list
     3. Update metadata
     4. Refresh cache
     Diagram: Shard 0 holds chunks 1 and 2 before the split and chunks 1, 2 and 3 after it; the config server and mongos are also shown.
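Since a split only changes metadata, it can be modelled as replacing one range with two on the same shard. The sketch below is an illustration in plain JavaScript; `splitChunkAt` and the numeric ranges are hypothetical and do not reproduce MongoDB's internal split procedure.

```javascript
// Split the chunk that contains `point` into two chunks on the same shard.
// Ranges use min inclusive, max exclusive. No documents are touched.
function splitChunkAt(chunkList, point) {
  return chunkList.flatMap((c) => {
    const inside = point > c.min && point < c.max;
    if (!inside) return [c]; // other chunks are left as they are
    return [
      { min: c.min, max: point, shard: c.shard },
      { min: point, max: c.max, shard: c.shard },
    ];
  });
}

// Hypothetical starting state: two chunks on shard0.
const before = [
  { min: 0, max: 100, shard: "shard0" },
  { min: 100, max: 200, shard: "shard0" },
];
const after = splitChunkAt(before, 150);
console.log(after.length); // 3 chunks, all still on shard0
```

The input list is not modified; a new metadata list is returned, which mirrors the "update metadata" step rather than any data movement.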
  14. Migration steps
     1. Is the balancer running?
     2. Is the system imbalanced?
     3. Pick chunk 3
     4. Split chunk 3?
     5. Begin (1)
     6. Deletes finished? (2)
     7. Read this chunk (3)
     8. Transfer (4)
     9. Update metadata (5)
     10. Remove chunk 3 from shard 0 (6)
     11. Refresh mongos cache
     Diagram: chunk 3 moves from Shard 0 (From) to Shard 1 (To); the config server and mongos are also shown.
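The overall effect of these steps, one chunk moving at a time until the shards are even, can be simulated in a few lines of JavaScript. This is a simplified sketch: `migrateOne`, the `owners` map and the stopping rule (stop when the counts differ by less than two) are assumptions, and the real process involves the copy, delete and cache refresh steps listed above.

```javascript
// Move one chunk from the most loaded shard to the least loaded one.
// Returns false when the counts already differ by less than two chunks.
function migrateOne(owners) {
  const shards = Object.keys(owners);
  shards.sort((a, b) => owners[b].length - owners[a].length);
  const from = shards[0];
  const to = shards[shards.length - 1];
  if (owners[from].length - owners[to].length < 2) return false;
  owners[to].push(owners[from].pop()); // metadata update: the chunk changes owner
  return true;
}

// Hypothetical starting state: chunks 1-3 on shard0, chunk 4 on shard1.
const owners = { shard0: [1, 2, 3], shard1: [4] };
let migrations = 0;
while (migrateOne(owners)) migrations += 1;
console.log(migrations, owners);
```

In this starting state a single migration of chunk 3 is enough to leave two chunks on each shard.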
  15. Normal MongoDB operation
     Diagram labels: Client, mongos, Shard 0, Shard 1, Shard 2, Shard 3, Migrations.
  16. Pre-splitting
     ● Useful for storing data directly in the shards (massive data loads).
     ● Avoids bottlenecks.
     ● MongoDB does not need to split or migrate chunks.
     ● After the split, the migration must be finished before data loading.
     Diagram: chunks 1 to 5 spread across Shard 0, Shard 1 and Shard 2.
  17. Commands to split a chunk
     Splitting a chunk:
        mongos> for (var i = 0; i < 20; i++) { sh.splitAt("testdb.presplit", { x : 1000 * i }); }
     Querying existing chunks:
        mongos> use config
        mongos> db.chunks.find( { ns : "testdb.presplit" } )
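To see what the split loop is meant to produce, the resulting chunk boundaries can be computed without a cluster. The sketch below is plain JavaScript and assumes the collection starts as a single chunk covering the whole key space; `rangesFrom` is a hypothetical helper, and `-Infinity` and `Infinity` stand in for MongoDB's MinKey and MaxKey.

```javascript
// Split points intended by the loop on the slide: x = 0, 1000, ..., 19000.
const splitPoints = Array.from({ length: 20 }, (_, i) => 1000 * i);

// Build [min, max) ranges from an ordered list of split points.
function rangesFrom(points) {
  const bounds = [-Infinity, ...points, Infinity];
  return bounds.slice(0, -1).map((min, i) => ({ min, max: bounds[i + 1] }));
}

const presplit = rangesFrom(splitPoints);
console.log(presplit.length); // 21
```

Twenty split points give twenty-one chunks, which is the number of documents the `db.chunks.find` query would be expected to return under the same assumption.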
  18. Tag-based sharding overview
     Tags are used when you want to pin ranges to a specific shard.
     Diagram: shard0 is tagged EMEA, shard1 APAC, shard2 LATAM and shard3 NORAM.
  19. Tag your shards
        mongos> sh.addShardTag("shard0", "EMEA")
        mongos> sh.addShardTag("shard1", "APAC")
        mongos> sh.addShardTag("shard2", "LATAM")
        mongos> sh.addShardTag("shard3", "NORAM")
  20. Tag your chunk ranges
        mongos> sh.addTagRange( namespace, minimum, maximum, tag )
        mongos> sh.addTagRange( "testdb.tagrange", { "x" : 0 }, { "x" : 1000 }, "EMEA" )
     ● minimum: the minimum value (inclusive) of the shard key range to include in the tag.
     ● maximum: the maximum value (exclusive) of the shard key range to include in the tag.
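The inclusive minimum and exclusive maximum are easy to get wrong at the boundaries, so a small model may help. The following plain JavaScript sketch is not MongoDB code: `shardTags`, `tagRanges`, `tagFor` and `shardsFor` are hypothetical names that mirror the tags and the single EMEA range from the slides.

```javascript
// Tags per shard, mirroring the sh.addShardTag calls on the slides.
const shardTags = {
  shard0: ["EMEA"],
  shard1: ["APAC"],
  shard2: ["LATAM"],
  shard3: ["NORAM"],
};

// One tag range, mirroring the sh.addTagRange example: x in [0, 1000) is EMEA.
const tagRanges = [{ min: 0, max: 1000, tag: "EMEA" }];

// Find the tag for a shard key value: min is inclusive, max is exclusive.
function tagFor(x) {
  const r = tagRanges.find((t) => x >= t.min && x < t.max);
  return r ? r.tag : null;
}

// Shards allowed to hold the value; untagged values may live on any shard.
function shardsFor(x) {
  const tag = tagFor(x);
  if (tag === null) return Object.keys(shardTags);
  return Object.keys(shardTags).filter((s) => shardTags[s].includes(tag));
}

console.log(shardsFor(0), tagFor(1000));
```

A value of 0 is pinned to the EMEA shard, while 1000 falls outside the range and is not tagged.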
  21. Questions?
  22. We are looking for writers: mongodbspain.com
  23. Thank you for your attention! Madrid, October 13th 2015. Juan Antonio Roy Couto