Migrate from a MongoDB replica-set to a sharded cluster


When deploying a MongoDB instance, there are a couple of important decisions to make. One of them is whether the instance is going to be a Replica-set or a Sharded Cluster. It’s not uncommon to start with a Replica-set, as it’s easier to deploy and simpler to operate. For some workloads, though, a Replica-set may not be the best option, mainly performance-wise, and migrating to a Sharded cluster is the only way forward.

This presentation reviews the challenges of migrating from a Replica-set to a Sharded cluster. We will demonstrate real-world issues that users have encountered during such migrations, and list best practices for the migration along with the changes that may be required when moving to a Sharded cluster.

Published in: Software

Migrate from a MongoDB replica-set to a sharded cluster

  1. 1. Migrate from a MongoDB Replica-Set to a Sharded Cluster. Percona Live Online, May 2020. Antonios Giannopoulos, Database Administrator; Jason Terpko, Database Administrator
  2. 2. About us: Antonios Giannopoulos and Jason Terpko
       • 20 years of DB experience combined
       • 30 years in the IT industry combined
       • Members of Rackspace since 2014
       • Regular speakers at Percona Live & Europe
       • Jason loves MongoDB; Antonios loves to hate MongoDB
  3. 3. Agenda
       • Definitions
       • Reasons to migrate
       • Prepare for the migration
       • Migration
       • Welcome to the Sharding era
       • Scaling
       • Q&A
  4. 4. Replica-Set: A replica set in MongoDB is a group of mongod processes that maintain the same data set.
  5. 5. Sharded Cluster: A sharded cluster in MongoDB is a group of replica set(s) that are accessible through one or many mongos processes.
  6. 6. Why migrate? He who has a why to live can bear almost any how -Friedrich Nietzsche
  7. 7. Scalability – Replica Sets. Replica sets support vertical scaling.
       [Chart: 32G vs 64G of RAM]
       • Only the Primary can serve writes
       • More secondaries can scale reads
       • Increase in performance is not linear
       • May start hitting kernel hard limits
       • May start hitting storage engine hard limits
       • May start hitting hardware limitations
  8. 8. Scalability – Sharded Clusters. Sharded clusters support horizontal scaling.
       • Add as many shards/mongos as needed for scaling
       • Shards can be the same or different sizes
       • Scales both reads and writes
       • Shard key operations see a close to linear performance increase
  9. 9. Geographically distributed clusters – Replica Sets
  10. 10. Geographically distributed clusters – Sharded clusters
        Requirements:
        • Implement Zones
        • Shard key prefix must be {region:1}
  11. 11. Hot/Cold partition architecture
        • The implementation is very similar to geo-distributed clusters
        • Storage takes the place of the region, and shard key(s) must be prefixed with it
  12. 12. Workload isolation
        • On a replica set it requires application changes
        • On a sharded cluster the change is transparent
  13. 13. Manageability & Costs
        Certain administrative actions run faster on a sharded cluster. Some examples:
        • Build an index
        • Perform an initial sync
        • Backup/Restore
        Even if sharded clusters are more complex, they can reduce costs:
        • It’s cheaper to have a lot of small servers than a few large servers
        • 1 x (T2 Double XL) vs 3 x (T2 Medium)
        • Mongos and config servers can run on cheap hardware
  14. 14. Prepare for the migration By failing to prepare, you are preparing to fail ― Benjamin Franklin
  15. 15. Prepare the additional component – Config servers
        • Requires at least three mongod processes (same version as your RS)
        • Arbiters and delayed members are not allowed
        • Doesn’t have high storage/IOPS/CPU/RAM requirements
        • For HA, use different servers/VMs/containers for each process
        • Stores two databases:
          - Admin: Authentication & Authorization
          - Config: Sharded cluster metadata
        Note: We demonstrate a basic config. Your organization’s config may differ.
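The config-server requirements above can be checked before deployment. The following is a minimal sketch, assuming the replica set configuration has already been fetched (for example with `rs.conf()`) and is available as a plain dictionary; the function name `config_rs_problems` is hypothetical, and the delay field is `slaveDelay` on the versions discussed in this deck (`secondaryDelaySecs` on newer ones).

```python
def config_rs_problems(rs_config):
    """Return reasons why a replica set config does not fit the slide's
    config-server rules (three or more members, no arbiters, no delays)."""
    problems = []
    members = rs_config.get("members", [])
    if len(members) < 3:
        problems.append("fewer than three members")
    for member in members:
        host = member.get("host", "<unknown>")
        if member.get("arbiterOnly", False):
            problems.append(host + " is an arbiter")
        # slaveDelay is the older field name, secondaryDelaySecs the newer one.
        delay = member.get("slaveDelay", 0) or member.get("secondaryDelaySecs", 0)
        if delay > 0:
            problems.append(host + " is a delayed member")
    return problems


# Hypothetical config: host names are made up for illustration.
example = {
    "_id": "cfg",
    "members": [
        {"_id": 0, "host": "cfg-0.local:27019"},
        {"_id": 1, "host": "cfg-1.local:27019", "slaveDelay": 3600},
        {"_id": 2, "host": "cfg-2.local:27019", "arbiterOnly": True},
    ],
}
for problem in config_rs_problems(example):
    print(problem)
```

An empty list means the config passes these three checks only; it says nothing about versions or hardware.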
  16. 16. Prepare the additional component – mongos
        • Requires at least one mongos process (same version as your RS)
        • For HA purposes deploy at least two mongos – 3 recommended
        • Multiple mongos on the same HW
        • Truth is, the actual number depends on the application
        • You can scale in/out based on your needs
        • Doesn’t store any data locally
        • HW specs:
          - Disk: A volume for logging
          - RAM: Depends on the number of mongos/type of operations
          - CPU: Depends on the number of mongos/type of operations
  17. 17. Some annoying work…
  18. 18. Congratulations, it’s a Shell. All this trouble for two empty databases (not completely empty).
  19. 19. Configure Networking
        Intra-cluster communication:
        • Open the required rules (an iptables example)
        • An isolation layer (VCN/Subnet) between mongos and mongod is a good practice
        Application access:
        • Post-transition, applications must connect to the mongos tier ONLY
        • Exceptions: monitoring & backup agents and oplog readers*
        *replace oplog access with change streams
  20. 20. Authentication & Authorization
        Authentication will be handled by the mongos tier. All users/roles from the replica-set must be copied to the mongos:
        • Create users/roles from scratch
        • mongoimport/export users/roles
        • Use a script to copy the users/roles
        Spoiler alert: at a later stage, application users must be dropped from the PRIMARY
  21. 21. Connection String
        The connection string on the driver must be changed and tested.
        Connection string options:
  22. 22. Replica-set changes
        On your replica-set mongod you must add the following section into the config file.
        Next, a rolling restart is necessary:
        1) Restart the Secondaries one at a time
        2) Step down the Primary
        3) Restart the ex-Primary node
        If you skip that step, during the sh.addShard command (covered in the next section)
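The rolling restart order on this slide can be derived from the member list. This is a rough sketch, assuming the input is the `members` array of an `rs.status()` result reduced to its `name` and `stateStr` fields; the function name `rolling_restart_plan` and the step labels are hypothetical, and the code only plans the order, it does not restart anything.

```python
def rolling_restart_plan(members):
    """Order the steps: secondaries first, then step down and restart the primary."""
    secondaries = [m["name"] for m in members if m["stateStr"] == "SECONDARY"]
    primaries = [m["name"] for m in members if m["stateStr"] == "PRIMARY"]
    if len(primaries) != 1:
        raise ValueError("expected exactly one PRIMARY, found %d" % len(primaries))
    primary = primaries[0]
    plan = [("restart", host) for host in secondaries]  # 1) one at a time
    plan.append(("stepdown", primary))                  # 2) hand over the primary role
    plan.append(("restart", primary))                   # 3) restart the ex-Primary
    return plan


# Host names reuse the rs0 members that appear later in the deck.
members = [
    {"name": "rs0-0.mongod.local:30000", "stateStr": "PRIMARY"},
    {"name": "rs0-1.mongod.local:31000", "stateStr": "SECONDARY"},
    {"name": "rs0-2.mongod.local:32000", "stateStr": "SECONDARY"},
]
for action, host in rolling_restart_plan(members):
    print(action, host)
```

In practice each restart should be followed by a check that the member is back in SECONDARY state before moving on; that check is left out here.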
  23. 23. Don’t leave unfinished items
        Make sure you haven’t left any upgrade halfway:
        • authSchemaUpgrade (pre-4.0 clusters): you want your numbers to match. Version 3 on the authSchema requires running db.adminCommand({authSchemaUpgrade: 1});
        • setFeatureCompatibilityVersion
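Both checks can be run against the documents stored in `admin.system.version`. The sketch below assumes those documents were already read into a list of dictionaries, that the `authSchema` document carries a `currentVersion` field and the `featureCompatibilityVersion` document a `version` field, and that 5 is the target auth schema version; the function name `unfinished_upgrades` is hypothetical.

```python
def unfinished_upgrades(version_docs, expected_fcv, expected_auth_schema=5):
    """Report half-finished upgrades found in admin.system.version documents."""
    findings = []
    by_id = {doc["_id"]: doc for doc in version_docs}

    # The authSchema document can be absent when no users were ever created.
    auth = by_id.get("authSchema")
    if auth is not None and auth.get("currentVersion") != expected_auth_schema:
        findings.append(
            "authSchema is %s, expected %s"
            % (auth.get("currentVersion"), expected_auth_schema)
        )

    fcv = by_id.get("featureCompatibilityVersion")
    current_fcv = fcv.get("version") if fcv else None
    if current_fcv != expected_fcv:
        findings.append(
            "featureCompatibilityVersion is %s, expected %s"
            % (current_fcv, expected_fcv)
        )
    return findings


docs = [
    {"_id": "authSchema", "currentVersion": 3},
    {"_id": "featureCompatibilityVersion", "version": "3.6"},
]
print(unfinished_upgrades(docs, expected_fcv="4.0"))
```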
  24. 24. Middleware preparation
        Make sure the shell is properly monitored:
        • Basic OS level checks
        • MongoDB basic checks
        • Prometheus monitoring
        Plan to add the shell tier to the backup policy:
        • Mongos: keep a copy of the config file (repo/deployment script)
        • Config servers: data directory (balancer must be stopped) and config file
  25. 25. Inspect your code
        • First blood: always “sacrifice” DEV/QA/UAT instances
        • 4.0 specific: Transactions(?)
        • Inspect your codebase for incompatibilities
        For 4.0+ clusters and developers who follow best practices:
        • $where does not permit references to the db object from the $where function. This is uncommon in un-sharded collections.
        • The geoSearch command is not supported in sharded environments.
        Pre-4.0:
        • The group command does not work with sharding. (Deprecated)
        • db.eval() is incompatible with sharded collections. (Deprecated)
        • The $isolated update modifier does not work in sharded environments. (Deprecated/Removed)
        • $snapshot queries do not work in sharded environments. (Deprecated)
  26. 26. Migration That's one small step for a man, one giant leap for mankind ― Neil Armstrong
  27. 27. Let’s Add The Shard
            sh.addShard("rs0/rs0-0.mongod.local:30000");
  28. 28. What did “addShard” do?
        • Expected result:
            {"shardAdded" : "rs0", "ok" : 1}
        • Populates config.shards:
            { "_id" : "rs0", "host" : "rs0/rs0-0.mongod.local:30000,rs0-1.mongod.local:31000,rs0-2.mongod.local:32000", "state" : 1 }
        • Initiates a Replica Set Monitor:
            NETWORK [conn11] Starting new replica set monitor for rs0/rs0-0.mongod.local:30000,rs0-1.mongod.local:31000,rs0-2.mongod.local:32000
        • Populates config.databases:
            { "_id" : "production", "primary" : "rs0", "partitioned" : false, "version" : { "uuid" : UUID("331d084e-ccfe-4072-a2fa-dad614a6b18f"), "lastMod" : 1 } }
        • Sets up and shards config.system.sessions:
            Refresh for collection config.system.sessions to version 1|0||5eb70c73880167c73b042a5d took 3 ms
        • Time to test
  29. 29. Updating Your URI
        Replica Set:
            from pymongo import MongoClient
            connection = MongoClient(
                'mongodb://myuser:mypass@rs0-0.mongod.local:30000,rs0-1.mongod.local:31000/production?replicaSet=rs0'
            )
        Sharded Cluster:
            from pymongo import MongoClient
            connection = MongoClient(
                'mongodb://myuser:mypass@mongos-0.mongos.local:50000,mongos-1.mongos.local:51000/production'
            )
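Since the only difference between the two URIs is the host list and the `replicaSet` option, one way to keep the switch (and a rollback) simple is to build both strings from the same helper. This is a small sketch using only the standard library; the helper name `mongo_uri` is hypothetical, and percent-escaping credentials with `quote_plus` follows the usual PyMongo guidance for reserved characters.

```python
from urllib.parse import quote_plus


def mongo_uri(user, password, hosts, database, replica_set=None):
    """Build a mongodb:// URI; pass replica_set only for the replica-set form."""
    credentials = quote_plus(user) + ":" + quote_plus(password)
    uri = "mongodb://" + credentials + "@" + ",".join(hosts) + "/" + database
    if replica_set:
        uri += "?replicaSet=" + replica_set
    return uri


replica_set_uri = mongo_uri(
    "myuser", "mypass",
    ["rs0-0.mongod.local:30000", "rs0-1.mongod.local:31000"],
    "production", replica_set="rs0",
)
sharded_uri = mongo_uri(
    "myuser", "mypass",
    ["mongos-0.mongos.local:50000", "mongos-1.mongos.local:51000"],
    "production",
)
print(replica_set_uri)
print(sharded_uri)
```

Either string can then be passed to `MongoClient(...)` as in the slide.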
  30. 30. Application Deploy
        Services: [Diagram: 25%, 100%]
        Monitor:
        • Application Logs
        • Mongos Logs
        • Resources
        • Recommended: PMM2, Ops Manager
  31. 31. Rollback – Partial or Full
        Services: [Diagram: 100%, 25%, 75%]
        Monitor:
        • Application Logs
        • Mongos Logs
        • Resources
        • Recommended: PMM2, Ops Manager
  32. 32. Testing Resiliency
        Planned Maintenance:
        • Graceful Process Termination
        • Election (Step Down)
        • Eventually: Removing Members, Removing Shards
        Unplanned Maintenance:
        • Non-Graceful Process Termination: OOM, Segmentation Fault or Assertion
        • Election due to loss of primary
  33. 33. Connection Management
        Client Connections:
        • Connection Pools
        • Cursor Management
        • Driver Settings
        Mongos Connections:
        • Connection Pools
        • Cursor Management
        • Mongos Settings*
        *via setParameter:
  34. 34. Welcome to Sharding era. So What Now? Do one thing every day that scares you – Eleanor Roosevelt
  35. 35. Post-Flight
        Backups:
        • Repeatable Deploys
        • Data Balance (Timing)
        • Oplog Length
        • Consistent State
        Upgrading and Downgrading:
        • Order Matters
        • Confirm per Major Version
        • Feature Compatibility Version
        Monitoring Metrics:
        • Mongos Layer
  36. 36. Monitoring Write Operations
        Replica Set, Potential Legacy Code:
            from mongotriggers import MongoTrigger
            …
            triggers = MongoTrigger(client)
            triggers.register_insert_trigger(update_last_login, 'app', 'sessions')
            triggers.tail_oplog()
        Sharded Cluster with ChangeStreams:
            import datetime
            from pymongo import MongoClient
            …
            # Watch inserts on app.sessions through the mongos
            with client.app.sessions.watch(
                [{'$match': {'operationType': 'insert'}},
                 {'$addFields': {'fullDocument.LastLogin': datetime.datetime.utcnow()}}]) as stream:
                for doc in stream:
                    resume_token = doc.get("_id")  # keep it to resume after a failure
                    update_last_login(doc)
  37. 37. Why Change Streams?
        • Target Flexibility
        • Resumable
        • Data Manipulation
        • Supported Feature
        • Changes with Topology
        • Single Authentication and Authorization Source
        • Transaction Compliant
  38. 38. Sharding A Collection
        config.chunks:
            {
              "_id" : "mydb.mycoll-uuid_\"00005cf6-1217-4414-935b-cf1bde09cc77\"",
              "lastmod" : Timestamp(1, 5),
              "lastmodEpoch" : ObjectId("570733145f2bf94777a62155"),
              "ns" : "mydb.mycoll",
              "min" : { "uuid" : "00005cf6-1217-4414-935b-cf1bde09cc77" },
              "max" : { "uuid" : "7fe55637-74c0-4e51-8eed-ab6b411d2b6e" },
              "shard" : "r1"
            }
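A chunk document like the one above describes a half-open range: `min` is inclusive and `max` is exclusive. The sketch below illustrates how a shard key value maps to a chunk and therefore to a shard. It assumes a single-field string shard key compared byte-wise, and the second chunk (its upper bound and the shard name `r2`) is invented for illustration; real chunk tables start at MinKey and end at MaxKey.

```python
def owning_chunk(chunks, key_value, field="uuid"):
    """Return the chunk whose [min, max) range contains key_value, else None."""
    for chunk in chunks:
        if chunk["min"][field] <= key_value < chunk["max"][field]:
            return chunk
    return None


chunks = [
    {   # taken from the slide
        "ns": "mydb.mycoll",
        "min": {"uuid": "00005cf6-1217-4414-935b-cf1bde09cc77"},
        "max": {"uuid": "7fe55637-74c0-4e51-8eed-ab6b411d2b6e"},
        "shard": "r1",
    },
    {   # hypothetical neighbour chunk on a hypothetical second shard
        "ns": "mydb.mycoll",
        "min": {"uuid": "7fe55637-74c0-4e51-8eed-ab6b411d2b6e"},
        "max": {"uuid": "ffffffff-ffff-ffff-ffff-ffffffffffff"},
        "shard": "r2",
    },
]

chunk = owning_chunk(chunks, "3a000000-0000-4000-8000-000000000000")
print(chunk["shard"] if chunk else "no chunk covers this key")
```

A value equal to a chunk's `max` belongs to the next chunk, which is why the boundary UUID resolves to `r2` here.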
  39. 39. Understand Your Workload: Profiling
        Profiling will help you identify your workload. Enable statement profiling on level 2 (collects profiling data for all database operations):
            db.getSiblingDB(<database>).setProfilingLevel(2);
        To collect a representative sample you might need to increase the profiler size*:
            db.getSiblingDB(<database>).setProfilingLevel(0);
            db.getSiblingDB(<database>).system.profile.drop();
            db.getSiblingDB(<database>).createCollection( "system.profile", { capped: true, size: <size>} );
            db.getSiblingDB(<database>).setProfilingLevel(2);
        *size is in bytes
  40. 40. Shard Key Candidates: Profiling
        Using the data you have collected, create the following report per collection:
            Collection <Collection Name> - Profiling period <Start time> , <End time> - Total statements: <num>
            Number of Inserts: <num>
            Number of Queries: <num>
            Query patterns: {pattern1}: <num> , {pattern2}: <num>, {pattern3}: <num>
            Number of Updates: <num>
            Update patterns: {pattern1}: <num> , {pattern2}: <num>, {pattern3}: <num>
            Number of Removes: <num>
            Remove patterns: {pattern1}: <num> , {pattern2}: <num>, {pattern3}: <num>
            Number of FindAndModify: <num>
            FindAndModify patterns: {pattern1}: <num> , {pattern2}: <num>, {pattern3}: <num>
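The per-collection report above could be produced along these lines. This is a sketch under assumptions, not the presenters' tooling: it expects `system.profile` documents already loaded as dictionaries, and it assumes the profiler shape of recent versions, where the filter sits in `command.filter` for queries, `command.q` for updates and removes, and `command.query` for findAndModify. A "pattern" here is simply the sorted set of top-level filter fields; all function names are hypothetical.

```python
from collections import Counter, defaultdict


def classify(entry):
    """Map a profiler entry to (statement kind, filter document or None)."""
    command = entry.get("command", {})
    op = entry.get("op")
    if op == "insert":
        return "insert", None
    if op == "query":
        return "query", command.get("filter", {})
    if op in ("update", "remove"):
        return op, command.get("q", {})
    if op == "command" and "findAndModify" in command:
        return "findAndModify", command.get("query", {})
    return None, None


def pattern(filter_doc):
    """Reduce a filter to its field names, e.g. {email, status}."""
    return "{" + ", ".join(sorted(filter_doc)) + "}"


def profile_report(entries, namespace):
    """Count statements and filter patterns for one namespace (db.collection)."""
    totals = Counter()
    patterns = defaultdict(Counter)
    for entry in entries:
        if entry.get("ns") != namespace:
            continue
        kind, filter_doc = classify(entry)
        if kind is None:
            continue
        totals[kind] += 1
        if filter_doc is not None:
            patterns[kind][pattern(filter_doc)] += 1
    return dict(totals), {kind: dict(c) for kind, c in patterns.items()}


# Made-up sample entries for illustration.
entries = [
    {"op": "query", "ns": "production.users",
     "command": {"find": "users", "filter": {"email": "a@example.com"}}},
    {"op": "query", "ns": "production.users",
     "command": {"find": "users", "filter": {"email": "b@example.com"}}},
    {"op": "update", "ns": "production.users",
     "command": {"q": {"_id": 1}, "u": {"$set": {"name": "x"}}}},
    {"op": "insert", "ns": "production.users", "command": {"insert": "users"}},
    {"op": "query", "ns": "production.orders",
     "command": {"find": "orders", "filter": {}}},
]
totals, patterns = profile_report(entries, "production.users")
print(totals)
print(patterns)
```

Patterns that dominate the counts are the natural starting point when shortlisting shard key candidates.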
  41. 41. Considerations for Candidates
        • High Cardinality
        • Unique Key
        • Targeted Operations: Read, Write
        • High Percentage of Operations
        • Potential Hotspots
        • Tiny documents combined with mediocre cardinality
        • Monotonically Increasing Fields
        • Low cardinality*
        • Data Pruning
        • Multiple Unique Indexes
        • Null Values
        • Modifications to Key(s)**
        • findAndModify or update not using key
  42. 42. Reverting a Shard Key Choice
        Evaluate your state and choose a path:
        • Single shard and able to revert to replica set
        • Workload can tolerate extended downtime
        • Workload tolerates brief or no downtime
            mongodump -h mongos-1.mongos.local:51000 --authenticationDatabase admin -u <user> -p <pass> -d production -c users
            mongorestore -h mongos-1.mongos.local:51000 --authenticationDatabase admin -u ubuntu -p ubuntu -d production -c users --drop dump/production/users.bson
            db.locks.find({"_id" : "production.users"})
            db.collections.find({"_id" : "production.users"})
            db.chunks.find({"ns" : "production.users"})
        *combined with mongos restarts and rs.stepDown()
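Several of the considerations above (cardinality, null values, hotspots, monotonically increasing fields) can be estimated from a sample of documents before committing to a key. The following is a minimal sketch, assuming the sample is a list of dictionaries in insertion order and that the field's values are hashable and mutually comparable; the function name `key_stats` and the returned field names are hypothetical.

```python
from collections import Counter


def key_stats(docs, field):
    """Summarise how a candidate shard key field behaves in a sample."""
    values = [doc.get(field) for doc in docs]
    total = len(values)
    non_null = [value for value in values if value is not None]
    counts = Counter(non_null)
    # Share of the sample taken by the most frequent value (hotspot hint).
    top_share = max(counts.values()) / total if counts else 0.0
    # True when values never decrease in insertion order.
    increasing = all(a <= b for a, b in zip(non_null, non_null[1:]))
    return {
        "cardinality": len(counts),
        "nulls": total - len(non_null),
        "top_value_share": top_share,
        "monotonic": len(non_null) > 1 and increasing,
    }


# Made-up sample for illustration.
docs = [
    {"user_id": 1, "country": "US"},
    {"user_id": 2, "country": "US"},
    {"user_id": 3, "country": "GR"},
    {"user_id": 4},
]
print(key_stats(docs, "user_id"))
print(key_stats(docs, "country"))
```

In this sample `user_id` has high cardinality but only ever increases, while `country` has low cardinality, a null, and one value covering half the documents; both would raise flags from the list above.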
  43. 43. Scaling. When you scale a mountain, you have to leave your ego at home. ― Anthony T. Hincks
  44. 44. Adding shards: Add a new shard. Start the Balancer. Chunk migrations will begin.
  45. 45. Adding shards: Migrated chunks are marked for deletion.
  46. 46. Adding shards
  47. 47. Parallel migrations
        • Add one new shard: no parallelism, 1 migration at a time
        • Add two new shards: parallelism, 2 migrations at a time
        If Existing shards > New shards: Number of parallel migrations = New shards
        Else: Number of parallel migrations = Existing shards
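The rule on this slide amounts to taking the smaller of the two shard counts, since each shard takes part in at most one migration at a time. A direct transcription, with a hypothetical function name:

```python
def parallel_migrations(existing_shards, new_shards):
    """Number of concurrent chunk migrations, per the slide's rule."""
    if existing_shards > new_shards:
        return new_shards
    return existing_shards


# Three existing shards: adding one shard gives 1 migration at a time,
# adding two gives 2.
print(parallel_migrations(3, 1))
print(parallel_migrations(3, 2))
```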
  48. 48. Balancing Considerations
        Balancing adds overhead. Minimize the impact by considering a:
        • Balancing window
        • _secondaryThrottle
        • _waitForDelete
        Be aware that documents in transit may be visible from the secondaries (Read Concern “available”).
        Documents in the source shard may also be visible before the RangeDeleter runs.
        Any interference during migration may lead to orphaned documents.
  49. 49. Unsharded collections
        • Unsharded collections are located on each database’s Primary Shard
        • Use the movePrimary command to distribute the primaries
        • Requires write downtime to guarantee consistency
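A balancing window is expressed as a start and a stop time in `HH:MM` form and may wrap past midnight. The sketch below shows one way to reason about whether a given time of day falls inside such a window, for example when scheduling maintenance around it. It is an illustration of the window logic only, with hypothetical function names, and does not read or change any balancer settings.

```python
from datetime import time


def parse_hhmm(text):
    """Turn 'H:MM' or 'HH:MM' into a datetime.time."""
    hours, minutes = text.split(":")
    return time(int(hours), int(minutes))


def in_balancing_window(now, start, stop):
    """True if `now` falls in [start, stop), allowing windows that cross midnight."""
    start_t, stop_t = parse_hhmm(start), parse_hhmm(stop)
    if start_t <= stop_t:
        return start_t <= now < stop_t
    # Window wraps around midnight, e.g. 23:00 to 6:00.
    return now >= start_t or now < stop_t


print(in_balancing_window(time(2, 0), "23:00", "6:00"))
print(in_balancing_window(time(12, 0), "23:00", "6:00"))
```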
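Distributing the primaries can be planned before any command is run. The sketch below assumes a mapping of database name to current primary shard (as recorded in `config.databases`) and spreads databases across shards round-robin, emitting `movePrimary` command documents only where the primary would change. Round-robin is an arbitrary choice made for illustration and ignores database size and load; the function name `move_primary_plan` is hypothetical, and nothing here is executed against a cluster.

```python
from itertools import cycle


def move_primary_plan(databases, shards):
    """Return movePrimary command documents for a round-robin spread."""
    plan = []
    targets = cycle(shards)
    for db_name in sorted(databases):
        target = next(targets)
        if databases[db_name] != target:
            plan.append({"movePrimary": db_name, "to": target})
    return plan


# Database and shard names besides "production" and "rs0" are made up.
databases = {"production": "rs0", "analytics": "rs0", "audit": "rs0"}
for command in move_primary_plan(databases, ["rs0", "rs1", "rs2"]):
    print(command)
```

Each resulting document would be passed to `adminCommand` through a mongos, during a window in which writes to that database are stopped, as the slide notes.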
  50. 50. Summary (columns: Phase, Checklist, Actions, Considerations)
        • Why Migrate: Performance; Locality; POC; Build expertise; Complexity
        • Prepare: Build shell tier; Replica-set config; Connection string; Middleware; QA/DEV/UAT migration; PROD preparation; Varies on the datasize
        • Execute: Add the first shard; sh.addShard(); Connection string change; Rollback scenario
        • Sharding: Profiling; Associated index; sh.enableSharding; sh.shardCollection; Rollback scenario; Pick optimal shard keys; Sharding limitations
        • Scale: Build more shards; Add shards; Balancing; Unsharded collections
  51. 51. Q & A Feel free to hit us with any questions at
  52. 52. Rackspace 1 Fanatical Pl. San Antonio, TX 78218 US sales: 1-800-961-2888 US support: 1-800-961-4454 Copyright © 2019 Rackspace | Rackspace® Fanatical Support® and other Rackspace marks are either registered service marks or service marks of Rackspace US, Inc. in the United States and other countries. Features, benefits and pricing presented depend on system configuration and are subject to change without notice. Rackspace disclaims any representation, warranty or other legal commitment regarding its services except for those expressly stated in a Rackspace services agreement. All other trademarks, service marks, images, products and brands remain the sole property of their respective holders and do not imply endorsement or sponsorship. Thank you